Background
The EU AI Act comes into effect soon (beginning on 2024-08-01). It is the first regulation of its kind in the world, and many other countries are expected to follow soon.
As is typical with pioneering initiatives and EU regulations, the boundaries are often ambiguous, allowing those opposed to regulation to exploit the uncertainty to advance their narrative. While interacting with other community members, I often encounter the narrative pushed by profit-oriented corporations, so I felt it necessary to write this post to explain why blindly following corporations is ultimately bad for end users.
In the ongoing debate, two major players, Meta (which I commended in a previous article: Local LLM resource: Llama3.1) and Apple, are refraining from releasing products in the EU, attributing their decisions to the AI Act.
Regulation break-down
Before digging into the current conflict, let’s review the AI Act. It classifies uses of AI into different categories rather than regulating machine learning itself, and it aims to govern how end users interact with AI systems instead of the technical parameters themselves.
Using AI carries risks, and the EU expects organizations that develop and/or deploy AI systems to mitigate them.
Category 1: Prohibited uses of AI
Use of AI is illegal in the EU (with some exceptions, of course) when it is capable of manipulating human behaviour, identifying people remotely in real time, or social scoring.
Some concrete examples:
Chatbots exploiting vulnerabilities, e.g. by knowing the user’s age, socio-economic status or other properties
Subliminal, manipulative or deceptive communication aimed to impair decision making, e.g. in advertising, or misinformation campaigns
Social scoring, like the one in use in China
Assessing the risk of an individual committing criminal offenses solely based on profiling or personality traits, except when used to augment human assessments (+ other factors).
Compiling facial recognition databases by untargeted scraping of facial images from the internet or CCTV footage.
Inferring emotions in workplaces or educational institutions, except for medical or safety reasons.
Real-time biometric identification is allowed only when it is used to search for missing people, prevent a threat to life, or identify suspects.
This one is detailed in Article 5: Prohibited Artificial Intelligence Practices
Category 2: High Risk AI
This is the main focus of the AI Act.
High-risk AI systems are those that pose significant risks to health, safety, or fundamental rights. They are classified based on their intended use in critical areas such as biometrics, critical infrastructure, education, employment, public services, law enforcement, migration, and justice.
Some examples:
AI managing safety components in digital infrastructure, road traffic, and utilities like water, gas, heating, and electricity.
AI determining access to educational institutions, evaluating learning outcomes, and monitoring student behaviour during tests.
AI used for recruitment, job ads, evaluating candidates, and monitoring employee performance.
AI assessing crime risk, evaluating evidence reliability, and profiling during criminal investigations.
These are generally accepted, but steps must be taken to mitigate risks:
Data governance, risk management, and quality management processes should be in place.
Technical documentation should be drawn up to demonstrate compliance.
The AI system should be designed for record-keeping, enabling it to automatically record events throughout the system's lifecycle (a minimal logging sketch follows this list).
Instructions for use should be provided to downstream deployers to enable the latter's compliance.
The AI system should be designed to allow deployers to implement human oversight.
The AI system should be designed to achieve appropriate levels of accuracy, robustness, and cybersecurity.
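To make the record-keeping point above more concrete, here is a minimal sketch of structured event logging for a high-risk AI system. It is only an illustration under my own assumptions: the event names, fields and file name are made up, and the Act does not prescribe any particular schema.

```python
import json
import logging
from datetime import datetime, timezone

# Minimal structured audit logger for an AI system's lifecycle events.
# Event names and fields are illustrative; the AI Act does not prescribe a schema.
logging.basicConfig(filename="ai_system_audit.log", level=logging.INFO, format="%(message)s")
logger = logging.getLogger("ai_system_audit")

def log_event(event_type: str, **details) -> None:
    """Append a timestamped, machine-readable event record to the audit log."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        **details,
    }
    logger.info(json.dumps(record))

# Example usage: record an automated decision and a later human override.
log_event("prediction", model_version="1.4.2", input_id="app-1042",
          output="reject", confidence=0.81)
log_event("human_override", input_id="app-1042", reviewer="case_worker_17",
          final_decision="accept")
```

Keeping the records machine-readable makes it easier to hand them over to a deployer or an auditor later, which is the whole point of the obligation.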
Now you might think this is a lot to ask of an AI system, but please remember which organizations will use systems designated as high risk: infrastructure providers, authorities, and organizations that decide whether you get employed or admitted to a school.
Generative AI models may belong to this category. Much of the text below comes from High-level summary of the AI Act.
GPAI model means an AI model, including when trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable of competently performing a wide range of distinct tasks regardless of the way the model is placed on the market, and that can be integrated into a variety of downstream systems or applications. This does not cover AI models that are used before release on the market for research, development and prototyping activities.
GPAI system means an AI system based on a general-purpose AI model that has the capability to serve a variety of purposes, both for direct use and for integration in other AI systems.
GPAI compliance requirements:
Draw up technical documentation, including the training and testing process and evaluation results. (Required for closed-source models)
Draw up information and documentation to supply to downstream providers that intend to integrate the GPAI model into their own AI system, so that the latter understand its capabilities and limitations and are enabled to comply. (Required for closed-source models)
Establish a policy to respect the Copyright Directive. (Required for both closed- and open-source models)
Publish a sufficiently detailed summary about the content used for training the GPAI model. (Required for both closed- and open-source models; a hypothetical machine-readable sketch follows this list)
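To illustrate what a “sufficiently detailed summary about the content used for training” could look like in machine-readable form, here is a hypothetical sketch. The field names, model name and numbers are invented for illustration; the AI Act does not prescribe this format (the AI Office is expected to provide a template).

```python
# Hypothetical machine-readable training-data summary for a GPAI model.
# Field names, model name and numbers are invented for illustration only.
training_data_summary = {
    "model_name": "example-gpai-7b",
    "data_cutoff": "2024-03",
    "approx_total_tokens": 2_000_000_000_000,
    "sources": [
        {"type": "web_crawl", "share": 0.80, "notes": "publicly available pages, opt-outs honoured"},
        {"type": "licensed_corpora", "share": 0.15, "notes": "commercially licensed text"},
        {"type": "synthetic", "share": 0.05, "notes": "generated in-house"},
    ],
    "copyright_policy": "policy in place to respect the EU Copyright Directive",
}

# Sanity check: the listed source shares should cover the whole training mix.
assert abs(sum(s["share"] for s in training_data_summary["sources"]) - 1.0) < 1e-9
print(training_data_summary["model_name"], "sources:", len(training_data_summary["sources"]))
```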
Models posing a systemic risk
This is a weird category, and I do not personally agree with the terminology used, but let’s dig in a bit more:
GPAI models present systemic risks when the cumulative amount of compute used for their training is greater than 10^25 FLOPs. The first LLM that falls into this category is Llama 3.1 405B. Researchers/engineers (the model provider) who create such models are obligated to report to the EU AI Office within 2 weeks and evaluate whether the model poses risks. The goal here appears to be a dialogue between the provider and the AI Office, but the final decision ultimately rests with the AI Office.
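As a rough illustration of why Llama 3.1 405B crosses this threshold, here is a back-of-the-envelope estimate using the common approximation of about 6 FLOPs per parameter per training token. The parameter and token counts are approximate public figures, not official compliance numbers.

```python
# Back-of-the-envelope training-compute estimate using the common ~6 * N * D rule of thumb.
# Parameter and token counts are approximate public figures, not official numbers.
THRESHOLD_FLOPS = 1e25      # systemic-risk threshold in the AI Act

params = 405e9              # Llama 3.1 405B parameters
tokens = 15e12              # roughly 15T training tokens

training_flops = 6 * params * tokens
print(f"Estimated training compute: {training_flops:.2e} FLOPs")    # ~3.6e25
print(f"Exceeds the 10^25 threshold: {training_flops > THRESHOLD_FLOPS}")
```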
Providers of GPAI models with systemic risk must also:
Document the model, and evaluate and mitigate risks.
Assess and mitigate possible systemic risks, including their sources.
Track, document and report serious incidents and possible corrective measures to the AI Office.
Ensure an adequate level of cybersecurity protection.
Now, the Act does not explicitly say what the exact procedures must be; they are to be created and adopted by the teams deploying the models themselves.
This is what I understand based on Article 56: Codes of Practice:
Examples of Systemic Risks and Their Application:
Financial Sector:
Risk: AI-driven trading algorithms could lead to systemic financial instability if they malfunction or are manipulated.
Identification: Codes of practice should identify how these algorithms interact and the potential for cascading failures.
Management: Procedures might include stringent testing, real-time monitoring, and collaboration with financial regulators to mitigate these risks.
Healthcare Systems:
Risk: Misdiagnoses or improper treatments by AI systems could lead to a public health crisis.
Identification: Codes of practice should map out potential failure points in AI diagnostic tools and treatment recommendation systems.
Management: Measures could include rigorous validation processes, continuous performance monitoring, and protocols for immediate intervention if systemic issues are detected (a toy monitoring sketch follows below).
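As a toy illustration of what continuous performance monitoring with intervention protocols could look like, here is a minimal sketch that flags an incident when a deployed model's rolling error rate drifts above a threshold. The class, window size and threshold are my own invention, not anything mandated by the Act.

```python
from collections import deque

# Toy monitor: raise an incident when the rolling error rate of a deployed model
# drifts above a threshold. Thresholds and the reporting hook are illustrative only.
class DriftMonitor:
    def __init__(self, window: int = 1000, max_error_rate: float = 0.05):
        self.outcomes = deque(maxlen=window)   # True = error, False = correct
        self.max_error_rate = max_error_rate

    def record(self, is_error: bool) -> None:
        self.outcomes.append(is_error)
        if len(self.outcomes) == self.outcomes.maxlen and self.error_rate() > self.max_error_rate:
            self.raise_incident()

    def error_rate(self) -> float:
        return sum(self.outcomes) / len(self.outcomes)

    def raise_incident(self) -> None:
        # In a real deployment this would trigger the provider's internal incident
        # process (and, where required, reporting to the AI Office).
        print(f"ALERT: error rate {self.error_rate():.1%} exceeds {self.max_error_rate:.1%}")
```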
Category 3: Limited Risk AI
Limited-risk AI has transparency obligations. Examples are chatbots or AI-generated images, where it must be obvious to end users that they are not interacting with a human.
For example, if you are talking to an agent on an insurance portal or to a call centre, it must be stated whether you are interacting with an AI system or a human.
The same applies to computer-generated images: it must be obvious that they are generated. Remember when an AI-generated image of the Pope (made with Midjourney) circulated around the internet, fooling many people?
This regulation aims to prevent such situations.
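As a trivial sketch of the chatbot disclosure obligation, here is a wrapper that always tells the user they are talking to an AI before returning the model's reply. The fake_model_reply function is a stand-in for a real model call, not an actual API.

```python
# Minimal sketch of a limited-risk transparency measure: always disclose that
# the user is chatting with an AI. fake_model_reply is a placeholder, not a real API.

DISCLOSURE = "Note: you are chatting with an AI assistant, not a human."

def fake_model_reply(user_message: str) -> str:
    # Placeholder for an actual model call.
    return f"Echo: {user_message}"

def chat(user_message: str) -> str:
    """Return the model reply prefixed with an explicit AI disclosure."""
    return f"{DISCLOSURE}\n{fake_model_reply(user_message)}"

print(chat("Can you explain my insurance policy?"))
```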
Category 4: Minimal Risk AI
Minimal risk AI is a simple category, and requires no declaration. These are uses with little to no risk, such as spam filters, image upscalers in video games or inventory management systems.
Requirements, rephrased
Here are the most important points, worded again (taken from an official press release):
Obligations for high-risk systems
Clear obligations are also foreseen for other high-risk AI systems (due to their significant potential harm to health, safety, fundamental rights, environment, democracy and the rule of law). Examples of high-risk AI uses include critical infrastructure, education and vocational training, employment, essential private and public services (e.g. healthcare, banking), certain systems in law enforcement, migration and border management, justice and democratic processes (e.g. influencing elections). Such systems must assess and reduce risks, maintain use logs, be transparent and accurate, and ensure human oversight. Citizens will have a right to submit complaints about AI systems and receive explanations about decisions based on high-risk AI systems that affect their rights.
Transparency requirements
General-purpose AI systems, and the GPAI models they are based on, must meet certain transparency requirements, including compliance with EU copyright law and publishing detailed summaries of the content used for training. The more powerful GPAI models that could pose systemic risks will face additional requirements, including performing model evaluations, assessing and mitigating systemic risks, and reporting on incidents.
Additionally, artificial or manipulated images, audio or video content (“deepfakes”) need to be clearly labelled as such.
Meta’s use case
Meta says their multimodal models (meaning the same model can understand and generate text, audio and images, without the need for separate models) will not be available in the EU, their reasoning being that they cannot comply with these requirements. Some articles mentioned that they cannot comply with the GPAI transparency requirements, probably the disclosure of where they sourced the training data from. We know that Llama 3 was trained on 15T tokens “that were all collected from publicly available sources.” That obviously yielded excellent results, but for a company whose revenue rivals entire smaller countries’ GDPs, I agree with the EU that they must put more effort into disclosing their data.
My personal opinion is that Llama 3.1 is excellent and a huge leap. I also understand that this is unfair, as other similarly powerful companies are even worse (I’m looking at you, OpenAI and Google), disclosing even less information. But other organizations doing something unethical does not give Meta permission to do something shady.
They have world-class engineers and probably a world-class compliance/legal department. I am sure they will find a way to comply and not miss out on the EU market, no matter how uncomfortable it is to meet stricter regulatory standards.
Past conflicts between EU and tech giants
It is not the first and likely not the last time that tech giants clash with EU regulations. It is also not the only time they have delayed products in, or threatened to leave, the union. Some examples:
Google vs. GDPR: When the EU introduced GDPR, Google complained that the regulations were too burdensome and would stifle innovation. However, the real issue was that GDPR forced Google to be more transparent about its data collection practices and gave users more control over their personal data. (Source: The CNIL’s restricted committee imposes a financial penalty of 50 Million euros against GOOGLE LLC)
Apple vs. EU's Tax Rulings: In 2016, the EU ordered Apple to pay €13 billion in back taxes to Ireland, citing illegal state aid. Apple claimed that the ruling was unfair and would harm economic growth, but the EU argued that Apple had been exploiting loopholes to avoid paying its fair share of taxes. (Source: Apple's EU tax dispute)
Amazon vs. EU's Antitrust Probes: In 2015, the EU launched an antitrust investigation into Amazon's e-book distribution practices, alleging that the company was abusing its dominant market position. Amazon claimed that the probe was unwarranted and would harm authors and publishers, but the EU argued that Amazon was stifling competition and limiting consumer choice. (Source: Antitrust: Commission opens formal investigation into Amazon's e-book distribution arrangements)
Microsoft vs. EU's Browser Choice: In 2009, the EU ordered Microsoft to offer users a choice of web browsers, rather than forcing them to use Internet Explorer. Microsoft complained that the ruling was unnecessary and would confuse users, but the EU argued that the company was abusing its dominant market position to stifle competition. (Source: Microsoft Corp. v. Commission)
In each instance, the EU has represented average end users against corporate exploitation. Although my research did not extend beyond this article, drawing from past precedents and my understanding of the regulation, this appears to be yet another instance of a corporate power struggle against regulation, which will likely end with Meta and Apple throwing in the towel and complying so as not to miss out on the EU market.
FAQ
Are hobbyists affected by the AI Act?
Hobbyists fine-tuning AI models (LLMs or text-to-image models) for personal use are not the primary target. However, if the models are deployed in a professional capacity, compliance with the Act is required.
How can I check if I am affected as a business?
Read the high level overview and then use the EU AI Act Compliance Checker to assess whether you need to take action.
Where does it apply?
It applies if the AI system is hosted inside the EU and/or if it has users from the EU.
Will this make the EU lag behind?
I personally do not think so. These regulations do not prevent innovation, and research use is mostly exempt. Mistral is a good example of an EU-based firm providing foundation models, both open and closed, and Wolters Kluwer is an excellent example of a traditional company bringing AI capabilities to existing products.
🖖