The UK government is clear that changes need to be made to clarify how copyright law applies in the context of AI training. Currently, creators are finding it difficult to control the use of their content in AI training and want to be better remunerated for its use. In turn, AI developers are having difficulty navigating UK copyright law, and this legal uncertainty is undermining investment in, and adoption of, AI technology.
To resolve this tension, the government is proposing the introduction of an exception to copyright infringement for commercial text and data mining (TDM). However, right holders will have the ability to reserve their rights (to “opt out” of the general exception), enabling them to license and be paid for the use of their work in AI training. The aim of this approach is to deliver a copyright and AI framework that will reward human creativity, incentivise innovation and provide the legal certainty required for long-term growth in both the AI sector and the creative industries.
Practically, the proposed exception is premised on the adoption of effective and accessible standards and requirements for rights reservation and transparency.
The consultation also asks for views on a number of other issues regarding copyright in the context of AI, which are discussed below.
The consultation will remain open for responses until 25 February 2025.
Proposed TDM Exception
The proposed TDM exception is a compromise that aims to give right holders the ability to control and seek remuneration for use of their copyrighted content, while supporting the development of world-leading AI models in the UK by ensuring wide and lawful access to high-quality data.
The proposed exception to copyright infringement will cover anyone conducting text and data mining for any purpose, including commercial purposes, provided that:
- the user has lawful access to the relevant works. This is stated to include works that have been made available on the internet, and those made available under contractual terms, such as via a subscription. More consideration may be needed as to whether this should include works that are freely available on the internet without permission; and
- the right holder has not reserved its rights in the work. If a right holder has reserved its rights through an agreed mechanism, a licence would be required for the data mining.
This exception will be underpinned by greater transparency from AI models about the sources of training material, to ensure compliance with the law and to build trust between right holders and developers.
Other options have been considered but are not preferred. Having no TDM exception would require AI developers to obtain an express licence to train their models on copyright works in the UK. Whilst this would provide legal certainty and a clear route to remuneration for creators, this is likely to make the UK significantly less competitive compared to other jurisdictions (e.g. the EU and US), which do not have such restrictive laws, and could reduce investment in AI in the UK. On the other hand, permitting commercial TDM without a reservation of rights in the same way as other countries (e.g. Singapore) would improve AI developers’ access to training material, and could increase investment in the UK AI sector, but would not allow copyright owners to control or seek remuneration for use of their works. A further option, allowing data mining where this meets a “fair use” standard (as in the US), often requires expensive litigation and would involve a radical change to the UK copyright framework, which could constrain the growth of the creative and media sectors.
The proposed compromise broadly aligns the UK with the EU’s general purpose TDM exception (Article 4 Digital Single Market Directive).
The UK’s existing exception for TDM for research purposes (s29A CDPA 1988) is also broadly similar to the EU’s research exception (Article 3 Digital Single Market Directive). However, the consultation notes some differences. The UK’s exception applies to both research institutions and researchers themselves. This means that individual researchers can benefit from the UK exception, whereas the EU’s exception applies to specific research organisations and cultural heritage institutions. The EU exception also permits commercial research, and extends to databases as well as copyright works, whereas the UK’s does not. The UK government asks for views on this as part of the consultation.
Right holders will be able to reserve their rights in works made available online using effective and accessible machine-readable formats. AI firms and right holders are already familiar with this approach, as it reflects the current EU position. However, it is noted that there are significant limitations to the current technology and insufficient standardisation and adoption of the current tools. There are also uncertainties about some of the EU requirements, e.g. the practical application of a “machine-readable” opt-out. The government therefore envisages further standardisation so that content owners can easily reserve their rights, and AI developers can easily respect these decisions. In particular, it wishes to explore several implementation options and is seeking views on how it can support work to improve these tools and drive their adoption:
- the robots.txt standard: this is already used by many news publishers to block the main generative AI web-crawlers. Currently, however, this standard cannot provide the granular control over the use of works that many right holders seek (it allows blocking of works at the site level but does not recognise reservations associated with individual works). It also does not enable right holders to distinguish between uses of works (e.g. allowing use for search indexing or language training, but not for generative AI);
- within the work’s metadata: it may be possible to flag within the metadata of a file whether the copyright work is, or is not, available for training. AI developers have not, to date, adopted a consistent standard to support this; and
- simplifying notification of individual works: e.g. via “Do Not Train” registries.
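The site-level blocking described in the first option can be illustrated with a short robots.txt file. The user-agent tokens below are those published by the respective crawler operators (OpenAI’s GPTBot, Google’s Google-Extended, Common Crawl’s CCBot); this is an illustrative sketch rather than a comprehensive list:

```
# Block common generative-AI training crawlers site-wide
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

# All other crawlers (e.g. search indexing) remain unrestricted
User-agent: *
Disallow:
```

As the consultation observes, this operates only at the site (or path) level. Metadata-based approaches, such as the W3C community draft TDM Reservation Protocol with its machine-readable “tdm-reservation” flag, aim to offer more granular, work-level control, but none has yet achieved the standardisation and adoption the government envisages.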
The UK government is clear that it wants to engage with industry and international standards initiatives to ensure a collaborative approach and flags that regulation may be needed to support the adoption of and compliance with standards. There could also be a possible dispute resolution mechanism outside the existing copyright framework.
It is noted that, currently, creators find copyright enforcement difficult because of a lack of transparency from AI developers about what content is used and how it is acquired. The UK government therefore seeks views on how best to deliver greater transparency from AI developers, in order to strengthen trust between the two sectors. This could include requirements for AI firms to disclose:
- the use of specific works and datasets;
- details of web crawlers and the purpose of their use; and/or
- information on request to evidence compliance with rights reservations.
Again, the government will consider assisting in the development of new technical tools supporting transparency and intends to engage with the EU and other international partners on international interoperability. It notes the requirement in Article 53(1)(d) of the EU AI Act for AI providers to make publicly available a “sufficiently detailed summary” of training content and in California’s Assembly Bill 2013 (AB 2013) to disclose a high-level summary of the datasets used in the development of generative-AI models. Similar legislation is an option to provide legal certainty in the UK, but this will be considered when the approach is agreed.
There is a real emphasis throughout the consultation on the importance of licensing in securing right holder remuneration and providing AI developers with access to high-quality training material. Accordingly, there is a call for AI companies and the creative industries to work together to create new technical systems to deliver greater licensing of IP.
The government seeks views on how it should support good licensing practice and whether current practices support right holder control. There is recognition that right holders may face pressure to agree standard terms and conditions requiring broad licensing of their works, given expectations that certain tools and services will be used.
New structures may be put in place to support the aggregation and licensing of data for AI training purposes. There is a clear indication that collective licensing and aggregation/brokering services are possible ways to provide easy access for AI developers to licensed material and remuneration for rights holders. Collective management organisations could also work to reserve the rights of their members.
The government encourages clear labelling of AI outputs but also acknowledges the technical challenges involved. Again, it seeks views on how it can support the development of emerging tools and standards and acknowledges that regulation may be needed to ensure consistent labelling.
The government also notes that the EU AI Act requires AI outputs to be machine-readable and detectable as AI-generated or manipulated, and that the EU’s AI Office is working on guidelines and codes of practice to ensure effective implementation of these obligations. The government again highlights areas for improvement (e.g. ensuring labels are resilient to manipulation or removal) and will consider whether and how it could support research and development into labelling tools.
The UK currently provides copyright protection for purely computer-generated works (i.e. those without a human creator). However, it is noted that this protection does not function properly within the broader copyright framework, leading to legal uncertainty (see our previous article for further details on this). Economically, the protection is not widely used, it results in unjustified costs to third parties, and it does not incentivise content production or AI development. It is also often said to be morally unnecessary because only human-created works deserve protection. Accordingly, the UK government does not see any real justification for maintaining the current form of copyright protection for computer-generated works.
Views are sought on two potential options for reform:
- clarify the existing protection for computer-generated works: clarifying the originality threshold could make the protection more valuable, either by removing the originality requirement altogether or by defining it in some way (e.g. a work would be deemed “original” if an identical human-authored work would be considered original); and
- remove copyright protection for computer-generated works: “AI-assisted” works that exhibit human creativity would continue to be protected (e.g. where a human provides the creative essence but uses an AI tool like an advanced photo editor). AI-generated music and video could continue to be protected as sound recordings, films and broadcasts, but text or images generated by AI without a human creator would not.
The government’s preference is to remove protection because most leading AI nations (e.g., the US and EU) do not provide this protection for computer-generated works. However, it takes no definitive view at this stage and welcomes up-to-date evidence to help it to assess the current situation. There is no suggestion to change the similar protection that is currently available for computer-generated designs.
It will be interesting to see the responses on this issue. The consultation takes a very clear stance that AI developers are unlikely to benefit from this protection because they are unlikely to own copyright in any computer-generated content. It specifically states that the user would usually own any copyright in output generated by general-purpose AI from a simple prompt, on the basis that the user undertakes the arrangements necessary to create the work. However, this seems to ignore the possibility of a significant contribution arising from, for example, the creation of the algorithms, the selection of training data, the fine-tuning of the model, or the use of meta-prompts or system prompts.
The consultation seeks to gather evidence on the challenges posed by “digital replicas” (or deepfakes), being images, videos and audio recordings created by digital technology that realistically replicate a person’s voice, image, or personal likeness. It notes that there are existing mechanisms that can assist in tackling the increasing number and quality of deepfakes. However, they all have their limitations:
- passing off: requires a misrepresentation and is generally limited to the false endorsement of well-known individuals with goodwill;
- copyright in sound recordings: if a substantial part of a sound recording is copied, a singer’s voice may be protected as contained in those recordings;
- data protection laws: if there is improper processing of personal data; and
- performance rights: if the AI produces an unauthorised reproduction of a performance (but not an imitation).
The UK government welcomes views on whether the current legal framework allows individuals to control the use of their likeness or voice sufficiently or whether further intervention is required. This may trigger calls for the introduction of UK “personality rights” following similar developments in the US (including the July 2024 US Copyright Office report on digital replicas, the proposal for a Federal Digital Replica Law and the California Assembly Bills 2602 and 1836, which seek to provide protection to performers in the context of AI-generated digital replicas).
Other issues
The consultation completes its wide-ranging analysis of copyright law and how it applies in the AI context by encouraging views on an even broader range of issues:
- Models trained in other jurisdictions: the government wants the UK’s copyright provisions to be internationally interoperable and workable for global AI providers. Therefore, it intends to engage with international partners, including the EU and US, to align approaches where appropriate. It also wants to create a level playing field between models that are trained in the UK and those trained outside the UK but made available for use in the UK market. There is a clear intention to encourage investment from the major AI developers and to allow UK-based SMEs to compete with those who train overseas under clearer and more permissive rules.
- The “temporary copies” exception: it is currently not clear whether this exception applies to the training of generative-AI models, so there may be a case for clarifying the scope of this exception.
- Infringement and liability relating to AI-generated content: the government considers the current copyright framework for infringing AI outputs to be reasonably clear and adequate. It is acknowledged that it will not always be easy to determine liability, but that is also true of other technologies. Views are encouraged on any specific areas where copyright law may be deficient, if there are barriers to enforcement, or if there are practical measures that may help enforcement such as keyword filtering.
- Emerging issues: the government wants to stay on the front foot and keep track of emerging issues. It therefore asks about new AI developments that might raise novel copyright issues. By way of example, it suggests: the interaction of copyright works at inference (the process by which a trained AI system generates outputs using new data); the interface with live data sources; the use of retrieval augmented generation (RAG); and how the use of synthetic data to train AI models may affect the ecosystem.
Key thoughts
Overall, the UK government’s consultation should be welcomed in its attempt to create legal certainty and to provide a workable compromise between the tech and creative industries on the important topic of AI training. There is a clear emphasis on working together to create technically effective solutions and to work pragmatically towards a solution that benefits overall growth in the UK.
The proposal broadly follows the EU approach, but the UK government wants to resolve some of the problems encountered with the EU framework and, to achieve this, it is helpfully open to supporting the development of emerging tools and standards. Stakeholders should therefore engage with the process and help to identify new and improved technologies to help create a transparent system with easily accessible opt-out and licensing mechanisms.
We are particularly interested to see how stakeholders respond to the potential removal of protection for computer-generated works. This is not something that is offered by other major AI countries and is rarely relied upon, particularly if it is possible to identify some level of human creativity. However, the provision for the protection of computer-generated works in the CDPA 1988 was introduced with a view to providing protection for AI works in the UK and its removal would be a clear change of policy. As noted in the consultation, stakeholders may consider the protection to be more valuable if its scope is clarified.
The consultation, however, does not stop there and goes on to encourage views and suggestions on a very broad range of copyright issues that have arisen with the recent AI revolution. This approach is warranted considering the rapid advances in generative-AI technology since the last consultation. Nothing concrete is promised at this stage, but stakeholders should take the opportunity to consider and submit their views.