Press archives may be transferred to companies that intend to use them to train their AI systems only if the transfer complies with data protection regulations.
The archives of publishing houses are a valuable asset that can also be exploited commercially within the artificial intelligence supply chain, for example by transferring them to companies that use them to train machine learning systems.
Because these archives also contain “personal data”, such operations must be planned in compliance with current data protection legislation.
In this regard, the measure adopted in 2024 by the Italian Data Protection Authority (“Garante Privacy”) is of great importance. It was addressed to a well-known publishing group that had signed an agreement with OpenAI authorising it to use the group's archive data and future editorial content to train its AI systems.
In that context, the Authority highlighted several critical aspects that need to be taken into account when designing such activities.
First, the appropriate ‘legal basis’ for such processing must be clearly identified, in particular with regard to sensitive and judicial data, whose processing could not be considered authorised either by the ‘legitimate interest’ legal basis or by the more favourable rules governing processing for journalistic purposes.
The need to respect the principle of transparency was also emphasised: privacy statements must be comprehensive and intelligible.
Data subjects must also be able to exercise their rights effectively, especially the right to object to processing.