01 March 2024
Legal action against AI companies for IP infringement continues, with Stability AI arguing that it did not infringe copyright by training its models on images owned by Getty Images.
Stability AI’s defence, filed with the UK’s High Court, argued that Stability did not “derive any output from the whole or any substantial part” of the copyrighted images. Stability also denied that Getty Images’ pictures were “highly desirable” for training its models, on the grounds that they were poor quality, low resolution and watermarked. The defence also noted that many other images were available from other sources, and that the models were trained outside the UK.
The case raises interesting questions about the location at which models are trained, and the extent to which IP claims rest on the use of copyrighted materials in AI outputs (rather than in training models during early development). As is often the case, details of how the technology functions will be key to the High Court’s judgment, and it will be interesting to see how existing precedents are applied to foundation models.
Other notable AI copyright cases currently ongoing include:
🔹 The New York Times against Microsoft and OpenAI (copyrighted content, New York)
🔹 Authors Guild class-action suit against OpenAI (copyrighted content, New York)
🔹 Universal Music against Anthropic (song lyrics, Tennessee)
Solutions to the issues caused by existing models trained on copyrighted material are still at an early stage. At Davos, OpenAI CEO Sam Altman said he hoped to resolve the current lawsuits with a system for paying content providers for their material.
The UK’s Department for Science, Innovation and Technology is also working on a code of practice on copyright and AI, including licensing arrangements for data mining to protect rights holders.
The use of data to train AI models also creates issues around data protection and privacy, as well as the potential for biased results where particular data sets have been used, or where insufficient data has been taken into account in developing the underlying models.
There don’t appear to be any simple solutions to the problems posed by existing AI, such as the large language models (LLMs) that underpin ChatGPT. However, global AI regulation is demanding increased transparency from developers and greater oversight of the data used to train models, and perhaps these measures will go some way towards addressing the issues.
It will be very interesting to see the extent to which global IP lawsuits shape AI regulation…