2024 has seen behavioural advertising and cookies continue to dominate the agenda of data protection authorities (DPAs) in the EU and of the ICO in the UK. Alongside that, DPAs have started to consult on guidance on Artificial Intelligence (AI), particularly Generative AI and Large Language Models (LLMs). After the initial flush of AI enforcement actions in 2023, we have not yet seen clarifying precedent emerge from the DPAs in 2024. We have also seen an increasing focus from DPAs on employee surveillance, particularly the use of biometric technologies, and a sting in the tail of enforcement related to EU-US data transfers, with a major fine issued for breaching the GDPR.
This blog focuses on AI.
Full clarity yet to emerge on lawful basis for training and whether Large Language Models (LLMs) constitute personal data
Many in the data protection community had expected a significant amount of AI-related enforcement action in 2024, given the early flurry of actions related to Generative AI in 2023, including the action by the Italian DPA, the Garante, against OpenAI and the ChatGPT service. In the event, there has been less enforcement activity than expected and a greater focus on consultation and guidance from DPAs.
A number of DPAs have focused on consultations about Generative AI. A key question has been what GDPR lawful basis can be used when controllers acquire personal data for the purposes of AI model training, including when data is scraped or re-used from sources such as websites or social media services. The most practical solution is to consider the use of legitimate interests under Article 6(1)(f) GDPR.
In October 2024 the Court of Justice of the EU (CJEU) issued its judgment in Koninklijke Nederlandse Lawn Tennisbond v Autoriteit Persoonsgegevens (C-621/22). Building on previous judgments, the CJEU clarified the meaning of legitimate interest under Article 6(1)(f) GDPR. It found that the commercial interest of a controller which discloses personal data to third parties undertaking direct marketing can be a legitimate interest, provided it is not contrary to the law (to be assessed by the referring court, taking into account the applicable legal framework and all the circumstances of the case). This overturned the position taken by the Dutch DPA and was a welcome confirmation for controllers of the scope of the term legitimate interest.
The CJEU’s judgment also reiterated the importance of the necessity test and of considering whether there are means of processing that are less restrictive of data subjects’ rights. It also highlighted the need to consider the reasonable expectations of the data subject when conducting the balancing test under Article 6(1)(f).
The ICO and the CNIL have both issued guidance consultations recognising that it may be possible to rely on legitimate interests for AI training, subject to all elements of the three-part test being met: (i) the pursuit of a legitimate interest by the controller; (ii) the necessity of the processing in order to achieve the legitimate interest pursued; and (iii) that the fundamental rights and freedoms of the data subject do not prevail. The Dutch DPA has issued the most strident statement on web scraping and the GDPR, stating in guidance that web scraping to acquire personal data for AI models is “almost always a violation of the GDPR”. These differing messages highlight that there is work to be done within the EDPB to present a clear EU position.
The EDPB has also just released a consultation on draft guidance on legitimate interest (not focused on AI), which is open until 20 November 2024. Watch out for a forthcoming A&O Shearman blog covering the key points of the consultation.
In a positive recognition of the validity of legitimate interests as a lawful basis for using personal data for AI training purposes, the Belgian DPA issued a decision concluding that a bank could rely on this basis for using payment transaction data to train an AI model for a personalised discounts service. The bank distinguished between the personalised information it provided, which was based on the data subject’s consent (which could be withdrawn), and the building of the model, which was based on legitimate interests (with the data subject retaining the right to object). The Belgian DPA found that the processing to train the model met the three-part legitimate interest test.
It is also notable that the Irish DPA, the DPC, has intervened over the use of social media data for Generative AI model training purposes. In August, the DPC recognised X’s agreement to suspend its processing of the personal data contained in the public posts of X’s EU/EEA users, processed between 7 May 2024 and 1 August 2024, for the purpose of training its AI tool ‘Grok’. Before this statement, the DPC had used Section 134 of the Data Protection Act 2018 to make an application to the High Court for an order requiring the data controller to suspend, restrict or prohibit the processing of personal data. This highlights that the most impactful actions from DPAs may not always be fines: interventions focused on pausing or stopping the processing can also be major challenges to the business operation of AI models.
Both of the above cases pivoted on the issue of legitimate interests, whether there had been effective transparency to users, and the effectiveness of the right to object and opt out. Some NGOs, such as NOYB, also continue to argue that consent is the only valid basis for reusing social media data for AI model training.
Ultimately the EDPB will need to clarify the position on the use of legitimate interests for AI model training. The May 2024 report of the EDPB’s ChatGPT taskforce did not provide the detailed guidance some had been hoping for: it was an interim position, restating the provisions of the GDPR without explaining in detail how they apply. However, the signals in 2024 indicate that a nuanced position on legitimate interests is possible, under which this basis can be used for AI model training in appropriate circumstances.
While the inputs and outputs of generative AI systems clearly involve the processing of personal data, the status of the models themselves appears to remain open. The Hamburg DPA issued guidance in July stating that the mere storage of an LLM does not constitute processing within the meaning of the GDPR and that data subject rights under the GDPR cannot apply to the model itself. Practical questions also remain if models are to be treated as personal data, such as how the right to erasure can apply to the tokenised form of generative AI models, where personal data is not held in records as in a database structure.
The ICO’s published decision on Snap’s ‘My AI’ tool (May 2024), and whether its Data Protection Impact Assessment (DPIA) complied with the requirements of Article 35 GDPR, has also provided important guidance. The ICO’s investigation led to a Preliminary Enforcement Notice to Snap in October 2023 on the basis that it had not undertaken a valid DPIA. This resulted in Snap taking significant steps to carry out a more thorough review of the risks posed by ‘My AI’ and to demonstrate to the ICO that it had implemented appropriate mitigations. The ICO’s decision in May 2024 recorded that it was satisfied that Snap had undertaken a risk assessment relating to ‘My AI’ that is compliant with the GDPR.
Key learning points about DPIAs and AI can be drawn from the ICO’s decision:
- The importance of documenting a detailed breakdown of the processing operations in the DPIA.
- Ensuring that mitigating steps are clearly matched to the relevant risks in the DPIA.
- The ICO expects a detailed consideration of why new technologies differ from solutions previously used and how this informs the necessity and proportionality assessment, including whether there is an increased risk of processing more special category data than previously.
- Including evidence of alternative measures considered in the DPIA.
- Specific consideration of the risks to children, in this case the risks of targeting under-18s with advertising.
The ICO’s decision also illustrates that DPAs are prepared to intervene early in the rollout of AI systems to address concerns about risk.
Look out for our roundup on employee surveillance and biometrics in the workplace tomorrow.