Opinion

Zooming in on AI – #12: spotlight on GPAI Models

Published Date

Nov 25 2024

When the AI Act was first proposed by the European Commission in 2021, the concept of “general-purpose AI” was nowhere to be found. The rules on general-purpose AI (GPAI) models were introduced during the legislative process to align the AI Act with the ever more prevalent GPAI models such as GPT-4.

This publication in our “Zooming in on AI” series focuses on GPAI models. Specifically, we look into (i) the definition of a GPAI model, (ii) the main obligations that apply to GPAI model providers and (iii) the timing for complying with these obligations.

Definition: so what is a GPAI?

As already explained in a previous “Zooming in on AI” article, article 3(63) of the AI Act defines a GPAI model as an “AI model, including where such an AI model is trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable of competently performing a wide range of distinct tasks regardless of the way the model is placed on the market and that can be integrated into a variety of downstream systems or applications, except AI models that are used for research, development or prototyping activities before they are placed on the market”.

Picking apart this definition, for an AI model to be qualified as a GPAI model, the following criteria must be cumulatively met:

It must be an AI model, including where it is trained with a large amount of data using self-supervision at scale;
It must display significant generality;
It must be capable of competently performing a wide range of distinct tasks;
It must be able to be integrated into a variety of downstream systems or applications; and
It must not be an AI model that is used for research, development, or prototyping activities before it is placed on the market.

The key criteria are that the AI model must display significant generality and be able to perform a wide range of tasks. Recital 98 clarifies that “models with at least a billion of parameters and trained with a large amount of data using self-supervision at scale should be considered to display significant generality and to competently perform a wide range of distinctive tasks”. Whereas an AI model with one specific functionality will therefore not qualify as a GPAI model, LLMs such as GPT-4 undoubtedly do as they can be used for a near immeasurable number of tasks.
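By way of illustration, the five cumulative criteria listed above can be read as a simple conjunction: if any one of them fails, the model falls outside the definition. The sketch below is only a reading aid under that assumption. The class and field names are hypothetical, and deciding whether each criterion is actually met remains a legal and factual assessment that no boolean flag can replace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical summary of the outcome of a legal assessment of a model."""
    is_ai_model: bool
    displays_significant_generality: bool
    performs_wide_range_of_tasks: bool
    integrable_in_downstream_systems: bool
    pre_market_research_only: bool  # used only for R&D/prototyping before market placement


def qualifies_as_gpai(model: ModelProfile) -> bool:
    """Return True only if all criteria of the article 3(63) definition are met."""
    return (
        model.is_ai_model
        and model.displays_significant_generality
        and model.performs_wide_range_of_tasks
        and model.integrable_in_downstream_systems
        and not model.pre_market_research_only
    )


# Two illustrative (hypothetical) profiles.
large_language_model = ModelProfile(True, True, True, True, False)
single_task_classifier = ModelProfile(True, False, False, True, False)
```

In this sketch the large language model profile satisfies every criterion, whereas the single-task classifier fails on generality and range of tasks, mirroring the contrast drawn in the paragraph above.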

Some GPAI models qualify as a GPAI model “with systemic risk” – these models are subject to additional obligations (see below). Currently, a GPAI model is presumed to be a systemic risk model if the cumulative amount of computation used for its training is greater than 10^25 floating point operations (FLOPs). The precise definition of what constitutes a GPAI model with systemic risk is, however, intended to change over time as technology evolves (and the European Commission can take decisions to update this definition).
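To give a feel for the order of magnitude, the compute threshold can be checked with a few lines of arithmetic. The sketch below assumes the widely used rule of thumb that training a dense model takes roughly 6 floating point operations per parameter per training token. That approximation is not part of the AI Act, and the model sizes used are purely hypothetical.

```python
# Threshold referred to in the text above: more than 10^25 FLOPs of training compute.
SYSTEMIC_RISK_FLOP_THRESHOLD = 1e25


def estimate_training_flops(n_parameters: float, n_training_tokens: float) -> float:
    """Rough estimate: ~6 FLOPs per parameter per token (an assumption, not a legal rule)."""
    return 6.0 * n_parameters * n_training_tokens


def exceeds_systemic_risk_threshold(training_flops: float) -> bool:
    """True if the cumulative training compute is strictly greater than the threshold."""
    return training_flops > SYSTEMIC_RISK_FLOP_THRESHOLD


# Hypothetical models, for illustration only.
smaller_model_flops = estimate_training_flops(7e9, 2e12)   # 7B parameters, 2T tokens
larger_model_flops = estimate_training_flops(1e12, 1e13)   # 1T parameters, 10T tokens

print(f"{smaller_model_flops:.1e}", exceeds_systemic_risk_threshold(smaller_model_flops))
print(f"{larger_model_flops:.1e}", exceeds_systemic_risk_threshold(larger_model_flops))
```

Under these assumptions the smaller hypothetical model stays well below the threshold, while the larger one exceeds it. In practice, the relevant figure is the actual cumulative training compute, not an estimate of this kind.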

Obligations: so what does it mean?

GPAI model providers (i.e. those entities that developed such models) must comply with the obligations in article 53 of the AI Act. These obligations can mainly be classified in two categories:

Transparency obligations. GPAI model providers must draft and keep updated technical documentation on the GPAI model, including on how it has been trained and tested. They must also provide information and documentation to providers of AI systems that intend to integrate the GPAI model into their AI systems. Annex XII of the AI Act sets out a detailed list of information that must be provided, which includes, for instance, information on any acceptable use policies, the licensing model and the technical means to integrate the GPAI model into an AI system.

Copyright protections. As GPAI models are trained on large amounts of data, the AI Act includes two obligations specifically intended to protect copyright owners.

First, GPAI model providers must provide a sufficiently detailed summary of the content used to train the GPAI model. Recital 107 clarifies that this summary must be generally comprehensive so that it allows copyright holders to exercise and enforce their rights, for example by listing the main data collections used for training the model and by providing a narrative explanation about any other data sources used. The AI Office is expected to publish a template of the summary.

Second, GPAI model providers must put in place a policy to comply with EU copyright law. Not much detail is provided on what such a policy should cover, except that it must allow the GPAI model provider to identify (through state-of-the-art technologies) any rights that have been reserved by right holders under the text and data mining copyright exception set out in Directive 2019/790. It will be particularly interesting to see how this obligation is expected to be met in practice, as many GPAI model providers are currently struggling with how to implement it.

In addition to these general obligations that apply to all GPAI model providers, providers of GPAI models with systemic risk (see above) are subject to certain additional obligations. These are mainly intended to mitigate the risks that such large-scale GPAI models may pose to society, such as disruptions of critical sectors, consequences for public safety, impact on democratic processes and security, or the dissemination of false or discriminatory content. Specifically, providers of GPAI models with systemic risk must:

Perform an AI model evaluation, including by conducting adversarial testing of the AI model to identify and mitigate systemic risks;
Assess and mitigate potential systemic risks;
Report any serious incident (and any corrective measures to address such incident) to the AI Office and relevant national authorities; and
Ensure an adequate level of cybersecurity protection for the GPAI model as well as for the physical infrastructure of the model.

To help GPAI model providers comply with these obligations, the AI Office will draw up codes of practice that providers can rely on to demonstrate compliance. On November 14, 2024, the European Commission published a first draft Code of Practice on GPAI – this Code of Practice will now be subject to an iterative drafting process and is expected to be finalised in April 2025.

Timing: so when do I have to comply with these obligations?

As set out in a previous “Zooming in on AI” article, the GPAI model obligations will apply as of August 2, 2025. This gives GPAI model providers a good 8 months (as of the date of publication of this article) to prepare for such compliance.
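For readers who want to check the runway themselves, the calculation is simple date arithmetic. The sketch below uses only the two dates mentioned in this article (the publication date of November 25, 2024 and the application date of August 2, 2025); the variable names are our own.

```python
from datetime import date

publication_date = date(2024, 11, 25)  # publication date of this article
application_date = date(2025, 8, 2)    # date from which the GPAI model obligations apply

# Total number of calendar days between the two dates.
days_remaining = (application_date - publication_date).days

# Number of whole calendar months between the two dates.
whole_months = (
    (application_date.year - publication_date.year) * 12
    + (application_date.month - publication_date.month)
)
if application_date.day < publication_date.day:
    whole_months -= 1  # the final month is not yet complete

print(days_remaining, whole_months)
```

This yields 250 days, or eight whole months plus a few days.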

A&O Shearman will of course continue to monitor any developments in relation to GPAI models and keep you posted as we get closer to the date of application.