AI is a trend that nearly every startup wants to build into its product. It can shorten delivery times and enable features that would otherwise be complicated to develop. But unresolved legal issues, especially around intellectual property (IP) rights, still worry investors. S&P Global has reported a slowdown in the mergers and acquisitions market and suggests that the primary reasons are connected to IP rights issues. So how can this struggle be avoided? The simple answer is to ensure your IP rights are protected and demonstrable.
In our previous article, we discussed the protection of rights when using AI tools during software development. This time, we'll look at AI development itself, in particular the need for a large volume of data to train an AI model. We’ll cover both the EU and US perspectives.
Data is essential. When building an AI product, you'll soon realize how much data you need to train your AI model. The question then becomes: how do you get it? A common answer is scraping the web. Web scraping, or text and data mining, uses software to collect and extract data from web pages automatically. But if you plan to mine text and data, or to use that data to train your artificial intelligence model, will it be free of intellectual property issues? Not necessarily.
When web scraping in the European Union, you need to be aware of the rules arising from the Directive on Copyright and Related Rights in the Digital Single Market (Directive (EU) 2019/790). Under this directive, rightsholders such as website owners can expressly reserve their rights and so opt out of having their content scraped and mined. If you ignore such a reservation, you could face legal claims and harm your business. To avoid problems, make sure your scraping tools can recognize and respect website owners' restrictions, for example by checking for machine-readable 'no scraping' signals such as robots.txt files and meta tags.
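The robots.txt and meta tag checks mentioned above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a compliance tool: the function names, the `ExampleBot` user agent and the `ASSUMED_OPT_OUT_TOKENS` set are assumptions made for this example, and which signals actually count as a valid rights reservation is a legal question that the code does not answer.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Assumption: meta "robots" tokens this sketch treats as a scraping opt-out.
# This is NOT a standardized or legally recognized list; adjust per legal advice.
ASSUMED_OPT_OUT_TOKENS = frozenset({"noai", "noimageai"})


def robots_allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def meta_restrictions(html: str, blocked: frozenset = ASSUMED_OPT_OUT_TOKENS) -> set:
    """Return the blocked directives found in the page's robots meta tags."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    return parser.directives & blocked


def may_scrape(robots_txt: str, html: str, user_agent: str, url: str) -> bool:
    """Combine both checks: allowed by robots.txt and no opt-out meta token."""
    return robots_allows(robots_txt, user_agent, url) and not meta_restrictions(html)


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    page = '<html><head><meta name="robots" content="noai"></head></html>'
    print(may_scrape(robots, page, "ExampleBot", "https://example.com/articles/1"))
```

In practice, a scraper would download `robots.txt` and the page before calling these functions, and would log each decision so that compliance can later be demonstrated to an investor or in due diligence.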
Specifically, your software should, at a minimum, be able to:
Restrictions on text and data mining (TDM) are just one hurdle in the European Union. Many other rules, including EU directives and national laws, can complicate intellectual property matters. Even if you comply with TDM restrictions, you might still infringe the intellectual property rights of third parties.
Text and data mining in the US comes with its own challenges. The debate about using copyrighted materials to train machine learning models has recently been brought into focus by the Ross case (Thomson Reuters v. Ross Intelligence). The case offers guidance on how fair use is interpreted, particularly in the context of developing artificial intelligence.
The core question is whether using copyrighted materials to train AI models is fair use or copyright infringement. If a developer's use qualifies as fair use, the developer can use copyrighted materials without getting permission from, or paying compensation to, the copyright owner. But if the final work is considered a derivative work, meaning a new work created by modifying or adapting existing copyrighted material, the AI developer would need permission from the original copyright holder to lawfully distribute, display or perform it.
Although the Ross case is ongoing, it touches on the question of when AI training creates a derivative work and when it is an exercise of fair use. A derivative work would mimic the creative expression of the original, while fair use might involve transforming the material for a new purpose, such as analyzing language patterns rather than copying the creative design. How these issues are resolved will have far-reaching implications for AI development and for the legal frameworks governing intellectual property.
In the US, fair use permits the use of copyrighted material without a license, based on a four-factor test. The factors are the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect of the use on the potential market for or value of the original work.
The EU, on the other hand, has no equivalent to US fair use. It relies instead on narrower copyright exceptions for specific uses, such as teaching and research, which offer less flexibility.
Making derivative works in the US usually requires the copyright holder's permission, unless it's considered fair use. The EU also requires permission for derivative works, but what counts as a derivative work can vary by country due to different national laws.
To strengthen the company's investment position despite challenges with fair use, derivative works and text and data mining, it’s essential to:
Are you looking for legal advice on your new AI tool? Get in touch with us here.
Training your AI model
Restrictions on text and data mining in the EU
Training an artificial intelligence model in the US
The EU and the US perspectives compared
Practical recommendations