How Europe is trying to regulate the rise of AI
By Arnoud Engelfriet
Have you heard the one about the exhausted programmer found in the bathroom? The shampoo bottle said: “Apply, rinse, repeat” – and he did. But you will never find an exhausted lawyer in that situation; lawyers apply reason to such instructions. And this joke shows exactly why things so often go wrong in discussions about Artificial Intelligence, machine learning and the regulation of algorithms, as is currently the case with the AI Act that is underway in Europe.
The fear of algorithms
Are those instructions on the shampoo bottle an algorithm? By the usual definition, an algorithm is a “step-by-step plan” or “recipe for solving a mathematical or computer problem”. The name derives from the Persian mathematician Al-Khwarizmi, whose work gave rise to the general concept. And yes, those definitions are so broad that the shampoo bottle certainly qualifies. In society (and therefore also among lawyers), the term algorithm is mainly reserved for complex step-by-step plans or recipes – typically things one would not want to work out without a computer.
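Read literally – as the programmer in the joke does – the shampoo instruction is a perfectly valid but non-terminating algorithm. A minimal sketch in Python (the stop condition is my own addition, purely so the example terminates):

```python
def lather_rinse_repeat(max_washes=3):
    """The shampoo bottle's 'algorithm', taken literally.

    'Apply, rinse, repeat' has no stop condition, so a literal
    reading loops forever; the max_washes bound is added here so
    the sketch can actually run.
    """
    washes = 0
    while True:  # the bottle never says when to stop
        washes += 1  # apply and rinse
        if washes >= max_washes:  # our artificial stop condition
            break
    return washes

print(lather_rinse_repeat())  # 3 - a lawyer would have stopped after 1
```

The missing termination condition is exactly what the lawyer supplies with common sense, and what a computer will not.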
Algorithms increasingly make decisions that have a major impact on people, and that is a major concern for lawyers. Since 2018, for example, there have been calls for supervision of algorithm use in order to curb this “fourth power” (alongside the legislative, executive and judicial powers). That fourth power is a black box: it can apply unchecked bias, deprive people of rights and impose sanctions without any transparency or accountability. In other words: we have no hold on it and cannot see how it works, but we do see its use increasing, and the concerns are growing with it.
The answer from legislators, regulators and lawyers is, of course, “regulation”. New trends and developments must be steered in the right direction, with the additional aim of minimizing adverse effects or harmful side effects. The only problem is that existing regulatory frameworks are usually taken as the starting point – and those are often not well suited to the task.
The concern about data processing
Concerns about AI and decision-making can be traced back, indirectly, to the birth of the field of data processing. That begins somewhere around the second Industrial Revolution, when large groups of people moved to the cities. Making numerical statements based on registered data was one way of getting a grip on this major change. The rapidly growing complexity of this data processing has been a source of technological innovation, with the counting machines of Herman Hollerith (whose company later became IBM) as the best-known exponent.
Automatic – albeit mechanical – data processing became extremely popular with governments and large companies in the early twentieth century. A common complaint at the time was that the human dimension disappeared: calculating machines were used to make abstract determinations on a large scale, such as who could or could not receive benefits, or who had to move because of new development plans. Anyone whose case was exceptional, or simply misregistered, quickly had a huge problem.
An important insight came from Leavitt and Whisler in 1958: information technology is the field in which large amounts of information are processed for the purpose of information-based decision-making and for the simulation of higher-order thinking. This drew attention to the importance of information in business processes, which fitted in well with a new technological innovation: the database, given a boost in the 1970s by the relational model and the SQL query language.
Relational databases and SQL (and related techniques) meant that the number of data collections – and of companies and governments using them – could grow rapidly. That in turn fed a greater hunger for data: when so much is possible, a need for more quickly arises. This gave rise to further concerns and protests about the protection of citizens, who could not, for example, defend themselves against errors in a database. One simple reason: they did not know they were in it. But at least as persistent was – and is – the belief that “the computer is right”; data in a database has an aura of correctness.
In the 1960s and 1970s we saw large-scale data collection grow. First of all in government: censuses, health care, public safety. But also in the private sector. Credit agencies, for example, automated their data processing, so that many more companies could check whether a consumer was financially reliable. Automated systems also allowed companies to send mass mailings, especially advertising. All of this met with considerable concern and protest, resulting in calls for new legislation. It also put the subject of data protection on the agenda of lawyers.
In Germany, the use of electronic databases became very popular among police forces, in response to the terrorist threats and attacks that gripped the country. They tried to locate perpetrators using huge databases and search queries. This caused a great deal of debate: the police knew just about everything about everyone, without any legal check on it. The result was the Bundesdatenschutzgesetz of 1977, the first national law to explicitly restrict data processing. There was also great resistance in the Netherlands during that period: the 1971 census – which was to be processed automatically – caused a great deal of commotion, which led to postponement, a postponement made permanent when the Census Act was abolished in 1991.
As a result of the protests and legal concerns described above, from the 1980s onwards we saw more legislation appear restricting the handling of personal data. Of great importance was the Convention for the Protection of Individuals with regard to Automatic Processing of Personal Data, opened for signature in 1981 by the Council of Europe and also known as the Strasbourg Convention or Convention 108. Convention 108 was, and remains, the only legally binding international data protection instrument. The treaty regulates the handling of personal data by both the private sector and government. Many elements of the General Data Protection Regulation (applicable since 2018) can be traced directly to this treaty.
The 1983 census in Germany – computerized, of course – led to a case before the Federal Constitutional Court, which declared the law in question unconstitutional: the citizen has a right of informational self-determination, a right of control over information concerning him. This has become the core of the European legal vision on personal data. It is about self-determination, about control – not necessarily about private space or privacy.
That right of control also echoes in the discussion about AIs that decide about people: it is a form of dehumanization when a computer determines where you stand legally, especially if you are not even able to respond. And here is one piece of imagery that never fails to wind me up: publications about legal AI invariably show a robot wearing a wig or wielding a judge’s gavel. But a robot does not reason like a lawyer.
Machines that learn to reason
The discussion about the regulation of AI has been greatly clouded by the view that computers have started to think for themselves, interfering in our human business and social processes like a kind of pseudo-human. “The question of whether computers can think is about as relevant as the question of whether submarines can swim”, as the Dutch computer scientist Edsger W. Dijkstra once put it – computers do not think at all, they calculate.
The creators of the concept of “Artificial Intelligence” knew this very well in 1956: “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it”. It was all about simulating, about imitating – the aim was never to actually realize a new form of intelligence, if that were even possible. Alan Turing’s Turing test and John Searle’s Chinese Room argument have likewise always been about whether a computer’s simulation of intelligence can be distinguished from a human being.
In the early AI research world, most attention went to formal logic and expert systems that reason on the basis of pre-programmed decision rules and databases of knowledge. All humans are mortal; Socrates is human, so Socrates is mortal. By formulating enough such decision rules and supplying enough databases to draw knowledge from (“the following items are plants”, “these items are animals”), the computer could decide about anything – or so the thinking went. After several successes in the 1970s and 1980s, however, the AI research world went quiet, because building truly generic and advanced expert systems turned out to be far more complicated than expected.
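The core of such an expert system fits in a few lines: a fact base plus hand-written rules, applied until nothing new follows (forward chaining). This is an illustrative toy, not any particular historical system:

```python
# Toy forward-chaining expert system: pre-programmed rules + a fact base.
facts = {("human", "Socrates")}

# Each rule reads: if (predicate, X) holds, conclude (conclusion, X).
rules = [("human", "mortal")]  # "all humans are mortal"

def infer(facts, rules):
    """Apply the rules repeatedly until no new facts can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for predicate, subject in list(derived):
                if predicate == premise and (conclusion, subject) not in derived:
                    derived.add((conclusion, subject))
                    changed = True
    return derived

print(infer(facts, rules))
# Derives ("mortal", "Socrates"): Socrates is human, so Socrates is mortal.
```

The crucial property for the rest of this article: every conclusion can be traced back to an explicit, human-written rule.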
An alternative research direction, long overlooked, was based on pattern recognition: machine learning (ML). The foundations for this work were also laid in the 1950s. Pattern recognition works best with large amounts of data, which was exactly what was lacking in the 1970s and 1980s. Processing data also took a lot of storage and computation time. In addition, the outcomes were not certainties but probabilities, whereas rule-based expert systems could offer certainty. ML therefore long remained a neglected stepchild, until the growth of big data processing in the early 1990s suddenly made it feasible to use ML in practice. And that happened en masse.
ML systems are technically very clever, but also very opaque. The fundamental point, after all, is that the machine itself looks for a pattern or dividing line in the data and classifies new entries on that basis, preferably even without a starter set of human-applied labels. And that leads to an essential aspect that is often misunderstood: an ML statement cannot be reduced to the decision rules that people would use to arrive at a similar statement. According to a classical expert system, that animal is a rabbit because of its fluffy tail and short ears. According to an ML system, it is a rabbit because certain neurons attached a lot of weight to that label.
There is a cartoon going around of a scientist who takes a statistical plot, puts a picture frame around it and calls it AI. There is a kernel of truth in this: many AI applications are no more than dressed-up statistics, whose output flows unchecked into the follow-up process. And that is quite alarming if it is used in a government or business process that does something with people.
Machines that decide on people
There have been many incidents in which AI – both classical expert systems and ML-driven systems – produced painful or legally incorrect decisions. These range from ML-driven systems with large but biased datasets (such as the American COMPAS system, which calculates the risk of recidivism) to the Dutch Fraud Scorecard, a huge Excel sheet with which municipalities profiled welfare recipients for fraud risk – based on factors that were never validated. Many stories are known from China about social credit scoring, in which citizens are deprived of social rights on the basis of algorithmically determined pluses and minuses. And from Terminator 2 we know that autonomous weapon systems can destroy the world. No, that is not a joke: media imagery has a huge impact on risk-based thinking among lawyers and legislators. Our legislation against computer crime, for example, can be traced directly to the 1983 film WarGames.
Concerns about this kind of decision-making are as old as the field of data protection, but the first major milestone came in 1995: the European Data Protection Directive, the predecessor of the GDPR, stipulated for the first time that people may not be subjected to automated decision-making based on a profile. A profile here means a collection of personal data that says something about a person and is deemed representative of that person. The provision was rather vaguely worded and – like the rest of the Directive – widely ignored by the tech sector, barring some fancy words in privacy statements. It was not until the introduction of the GDPR, with its hefty fines for violations, that companies began to worry about compliance around automated decision-making and discrimination based on data profiles.
The Artificial Intelligence Act
The legislator’s most recent attempt to curb these types of systems is called the AI Act, the short name for the Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). The approach: risk mitigation for European citizens and protection of fundamental rights, aimed in particular at problems arising from the opacity, complexity, data dependence and autonomous behavior of AI systems. The list of examples ranges from worker protection to public surveillance, subliminal influencing and self-driving cars. That breadth stems from the rather broad definition of AI, which caused quite a stir.
AI, expert systems, rule-based algorithms, machine learning systems, ML: quite a few terms are in circulation that mean roughly the same thing in slightly different ways. For legal practice, the 2019 definition of the EU expert group is the most important: a system that extracts information from received data, on that basis decides which actions can best achieve a set goal, and then carries out those actions. Such actions are, for example, a prediction, a recommendation or a decision, but can also be self-generated output such as text or images. A system is therefore not AI only when it runs autonomously or imposes decisions itself; a recommendation faithfully followed by humans also makes a system AI.
This seems extremely broad, because applying an Excel filter to your customer base (or welfare recipients) means that you will soon fall under the AI Act. But that's exactly the point.
ML systems, and AI systems more generally, draw conclusions from input and attach an action to them. A car that parks itself neatly, a camera that matches a face against a list of authorized visitors, an algorithm that predicts the answers to questions before the ECtHR – the list is in principle endless. This flexibility and scalability has made ML-based decision systems enormously popular.
However, there is one crucial aspect of ML, and that is how such systems arrive at their conclusions. Human decision-makers, like expert systems, work with rules of reasoning. For example: if two people live together, their joint income is more than 12,000 euros and the gas bill is more than 150 euros per month, then they should be investigated for benefit fraud. We call this deduction: deriving conclusions from general rules. In this example, the general rule is that the three factors are each indicators of fraud, and all three together are enough to warrant further investigation.
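Such a deductive rule translates directly into code: a human writes down the general rule first, and each case is then checked against it. The thresholds below are the hypothetical ones from the example:

```python
def investigate_for_fraud(cohabiting, joint_income, monthly_gas_bill):
    """Deduction: a human-written general rule applied to a case.

    Hypothetical rule from the example: cohabiting AND joint income
    above 12,000 euros AND gas bill above 150 euros per month.
    """
    return cohabiting and joint_income > 12_000 and monthly_gas_bill > 150

print(investigate_for_fraud(True, 15_000, 200))  # True: all three indicators met
print(investigate_for_fraud(True, 15_000, 100))  # False: gas bill too low
```

Note that every outcome can be motivated by pointing at a specific condition – the “why” is explicit in the rule itself.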
ML systems work inductively: they derive rules of reasoning from the data and then apply them to new situations. For example: the data shows that a gas bill of more than 150 euros is often associated with fraud, as is being between 18 and 32 years old. The reasoning rule then becomes: if the gas bill is above 150 euros or the age is 18–32, investigate further for benefit fraud. In this example, the age category is a coincidental correlation with fraud, but the system has built it into the rule anyway, simply because the data shows it.
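Induction can be sketched the same way: instead of writing the rule, we derive it from data. The dataset below is entirely invented, and the threshold-finding is a crude stand-in for real ML training, but it shows the mechanism: because the young ages in this data happen to co-occur with fraud, the induced rule picks age up as an indicator – correlation, not a motivated reason:

```python
# Toy induction: derive a decision rule from labelled examples.
# Each case: (gas_bill, age, committed_fraud) - invented data in which
# youth coincidentally correlates with fraud.
cases = [
    (180, 25, True), (200, 30, True), (160, 20, True),
    (90, 55, False), (110, 60, False), (100, 45, False),
]

def induce_rule(cases):
    """Find, per feature, a condition separating fraud from non-fraud
    in the data, then combine them into a decision rule."""
    fraud = [c for c in cases if c[2]]
    clean = [c for c in cases if not c[2]]
    # Gas threshold: halfway between the two groups (here 135).
    gas_cut = (min(f[0] for f in fraud) + max(c[0] for c in clean)) / 2
    # Age band: everything up to the oldest fraudster (here 30).
    age_hi = max(f[1] for f in fraud)
    return lambda gas, age: gas > gas_cut or age <= age_hi

rule = induce_rule(cases)
print(rule(160, 70))  # True: high gas bill
print(rule(80, 25))   # True: age alone triggers it - the spurious correlation
print(rule(80, 70))   # False
```

The second call is the problematic one: a 25-year-old with a modest gas bill is flagged purely because of a pattern in historical data.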
This difference touches on the explainability of AI statements and decisions. The reasoning rules of human decision-makers can be motivated, because the general rules are available. Rules obtained by induction cannot: the algorithm knows that the age category 18–32 is relevant, but cannot attach a why. Although legislation (such as Article 13(2)(f) GDPR) requires an explanation of the underlying logic of such automated decisions, in practice this is virtually impossible. That makes the use of AI fundamentally problematic.
A major problem in dataset analysis is bias: the phenomenon that an ML system finds patterns in the data that work out negatively for certain groups – what many people call discrimination, although strictly speaking there is no intention to disadvantage anyone. An algorithm acts purely on the basis of the data and looks for the division that best fits the set goal. To ML, there is no difference between the value “none” in the field “work experience” and the value “women’s tennis” in the field “hobbies” correlating with “rejected applicant”. For humans, the distinction between these two pieces of information is huge: we are not supposed to reject people because they are women.
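The point that ML treats all correlating features alike can be made with a few lines of counting. In the invented records below, both features end up with exactly the same statistical weight toward “rejected”, even though for a human one is a legitimate criterion and the other is discrimination:

```python
from collections import Counter

# Invented applicant records: (feature, value, outcome).
records = [
    ("work_experience", "none", "rejected"),
    ("work_experience", "none", "rejected"),
    ("hobby", "women's tennis", "rejected"),
    ("hobby", "women's tennis", "rejected"),
    ("work_experience", "5 years", "hired"),
    ("hobby", "chess", "hired"),
]

def rejection_rate(records):
    """How often each (feature, value) pair co-occurs with rejection.
    A purely statistical view: no feature is 'off limits'."""
    seen, rejected = Counter(), Counter()
    for feature, value, outcome in records:
        seen[(feature, value)] += 1
        if outcome == "rejected":
            rejected[(feature, value)] += 1
    return {pair: rejected[pair] / seen[pair] for pair in seen}

rates = rejection_rate(records)
print(rates[("work_experience", "none")])   # 1.0
print(rates[("hobby", "women's tennis")])   # 1.0 - statistically identical
```

Nothing in the numbers distinguishes the two; only a human (or an explicit legal constraint) can.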
Because of these problems, the approach of the AI Act is purely risk-based: it does not matter whether you work with ML, with expert systems or with a filter in Excel; what matters is whether risks arise for people. AI with unacceptable risks (such as social credit scoring) is prohibited; for high risks (such as the screening of job applicants) the supplier must take far-reaching measures to limit them. This approach has the advantage of true technology neutrality: it does not matter how the system works, as long as it is safe.
Regulation of machines
The proposed AI Act is structured on three levels. At the top are the unacceptable AIs, whose use conflicts with fundamental values in the EU: social credit scoring, for example, or the use of subliminal techniques to harm people, such as analyzing whether someone is unhappy and then selling them snake-oil products. These systems may not be used anywhere in the EU.
The middle level consists of the high-risk AIs: systems whose uncontrolled deployment poses serious risks to people. Think of biometrics in public spaces, infrastructure management, selection and recruitment of personnel, law enforcement or border control. Such systems are only allowed subject to strict rules, in particular an AI impact assessment to map the risks in advance, a documented design and improvement process, and transparency about how the system works. All other AIs count as low-risk and may in principle run their course. If things nevertheless go wrong, the European Commission can still declare them high-risk.
The AI Act has yet to be considered by the European Parliament, so it will be some time before (any version of) this law is passed. But the signal is clear enough: it is not about what AI is, it is about what risks people run. And we are going to limit those risks.
About the Author
Arnoud has been working as an IT lawyer since 1993. After a career at Royal Philips as IP counsel, he became a partner at ICTRecht Legal Services, which has grown from a two-man firm in 2008 into an 80+ person legal consultancy.
Read more from thought leader and industry expert Arnoud Engelfriet in the series Legaltech Beyond the Myths