Personal Data Processed by Platforms

The quantitative and qualitative analyses presented in the Observatory focus on privacy policy clauses describing the categories of personal data processed.

Privacy Policies and Level of Comprehensiveness of Information

The chart illustrates the level of comprehensiveness of information provided in the privacy policies included in the PRIMA dataset. For each privacy policy analysed, the graph compares the number of clauses that were classified as sufficiently informative with those considered insufficiently informative according to the evaluation criteria defined in the PRIMA annotation guidelines and the project’s gold standard.

Overall, the results show that insufficiently informative clauses significantly outnumber sufficiently informative ones across most privacy policies in the dataset. This suggests that many policies provide incomplete or unclear information to data subjects regarding the processing of personal data.

The analysis also reveals considerable variation among companies in the level of informational quality. While some privacy policies contain a relatively higher number of sufficiently informative clauses, most exhibit a substantial proportion of clauses that fail to meet the required level of transparency and completeness.

These findings highlight systematic shortcomings in current privacy policy drafting practices, particularly with respect to the clarity and completeness of the information provided to users about the processing of their personal data. The results therefore confirm the relevance of the PRIMA project’s objective of identifying recurring deficiencies and promoting improved drafting practices aligned with the legal requirements of transparency and fairness under the GDPR.

Distribution of Sufficiently Informative Clauses by Market Sector

The chart presents the distribution of sufficiently informative clauses across the market sectors in the PRIMA dataset, highlighting sector-specific patterns in the quality of information provided in privacy policies. The largest share is observed in the Health and Well-being sector (39%), followed by Social Networks (21%), eCommerce (18%), and Gaming and Entertainment (17%). Smaller shares are observed for Productivity and Business-Management Tools (3%) and Finance (2%), while Travel and Service Intermediaries show no sufficiently informative clauses in the analysed dataset. These results indicate that some sectors, particularly those dealing with sensitive or highly regulated data such as health-related services, tend to provide relatively more complete information regarding the categories of personal data processed.

Distribution of Insufficiently Informative Clauses by Market Sector

This chart presents the distribution of insufficiently informative clauses across the same market sectors in the PRIMA dataset. In this case, the largest proportions are found in Social Networks (32%) and Health and Well-being (29%), followed by Gaming and Entertainment (19%) and eCommerce (14%). Smaller shares are observed in Productivity and Business-Management Tools (5%) and Travel and Service Intermediaries (1%), while no insufficiently informative clauses were identified in the Finance sector within the analysed sample.

These results reveal significant sectoral differences in the level of transparency and completeness of privacy policy clauses. While some sectors show relatively higher shares of sufficiently informative clauses, they may simultaneously exhibit high proportions of insufficiently informative ones, suggesting considerable variability in drafting practices within the same sector. Overall, the findings confirm that deficiencies in the clarity and completeness of information remain widespread across market sectors, reinforcing the need for clearer drafting practices aligned with the transparency requirements established by the GDPR.
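To make the comparison between the two charts easier to follow, the percentages reported above can be placed side by side. The short Python sketch below is illustrative only: the variable names are assumptions, and the figures are simply the sector shares quoted in the text. Because each chart describes how one class of clauses is distributed across sectors, the difference is expressed in percentage points and indicates only whether a sector is over- or under-represented among sufficiently informative clauses. It is not a per-sector quality rate.

```python
# Percentage shares by market sector, as quoted for the two charts above.
sufficient = {
    "Health and Well-being": 39,
    "Social Networks": 21,
    "eCommerce": 18,
    "Gaming and Entertainment": 17,
    "Productivity and Business-Management Tools": 3,
    "Finance": 2,
    "Travel and Service Intermediaries": 0,
}
insufficient = {
    "Social Networks": 32,
    "Health and Well-being": 29,
    "Gaming and Entertainment": 19,
    "eCommerce": 14,
    "Productivity and Business-Management Tools": 5,
    "Travel and Service Intermediaries": 1,
    "Finance": 0,
}

# Each chart is a full distribution, so the shares should total 100%.
total_sufficient = sum(sufficient.values())
total_insufficient = sum(insufficient.values())

# Gap in percentage points: positive means the sector holds a larger share
# of the sufficiently informative clauses than of the insufficient ones.
gap = {sector: sufficient[sector] - insufficient[sector] for sector in sufficient}

for sector, diff in sorted(gap.items(), key=lambda item: item[1], reverse=True):
    print(f"{sector:<45} {sufficient[sector]:>3}% {insufficient[sector]:>3}% {diff:+d} pp")
```

Read this way, Health and Well-being holds a larger share of the sufficiently informative clauses than of the insufficiently informative ones, whereas the opposite holds for Social Networks.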

Genus of Personal Data and Level of Comprehensiveness of Information

The chart illustrates the distribution of sufficiently informative and insufficiently informative clauses across different categories of personal data identified in the PRIMA dataset. The x-axis represents the various types of personal data referenced in privacy policy clauses, while the y-axis indicates the number of clauses classified as sufficiently informative or insufficiently informative according to the PRIMA annotation criteria.

Overall, the results show a marked predominance of insufficiently informative clauses across many categories of personal data, suggesting that privacy policies often fail to provide complete or sufficiently clear information regarding the data being collected and processed.

The highest numbers of insufficiently informative clauses are observed for Device Information (361 clauses), Generic data categories (317), Usage Data (272), and User-Generated Content (201), categories that mostly concern technical, behavioural, and user-generated data. Significant shortcomings also appear in categories such as Health and Fitness data (118), Demographic data (98), User Profile Information (90), and Metadata (86), indicating that information about these types of personal data is frequently incomplete or insufficiently detailed.

In contrast, some categories display a relatively higher number of sufficiently informative clauses, particularly those related to transactional or account-related data. For example, Basic Account Information (127 clauses), Payment data (120), Purchase information (92), Contact information (82), and Geolocation data (70) show comparatively higher levels of informational completeness. These categories often correspond to data that are more directly linked to core service functionalities and therefore tend to be described more explicitly in privacy policies.

However, even within these categories, insufficiently informative clauses remain present, indicating variability in drafting practices across companies. Overall, the analysis highlights systematic deficiencies in the transparency of privacy policies, particularly with respect to the description of technical, behavioural, and user-generated data. These findings reinforce the need for improved drafting practices to ensure that data subjects receive clear and comprehensive information about the types of personal data collected and processed.

A further chart illustrates the distribution of the different categories (genus) of personal data across the market sectors represented in the PRIMA dataset. The x-axis reports the types of personal data identified in privacy policy clauses, and the data are grouped by the market sector in which the analysed services operate: Social Networks, Health and Well-being, Gaming and Entertainment, Productivity and Business-Management Tools, Travel and Service Intermediaries, eCommerce, and Finance.

The purpose of this analysis is to examine how different categories of personal data are distributed across sectors, thereby highlighting sector-specific patterns in data collection and disclosure practices. The results show that certain categories of personal data tend to be strongly associated with particular sectors, reflecting the functional characteristics of the services offered.

For example, social networking services tend to include a higher number of clauses referring to categories such as user profile information, social interaction data, contact lists, user-generated content, and images, which are directly linked to the social and communicative features of these platforms. Similarly, services in the Gaming and Entertainment sector frequently refer to performance data, usage data, device information, and user-generated content, reflecting the interactive and behavioural nature of these applications.

In the Health and Well-being sector, the analysis reveals a higher presence of clauses referring to health and fitness data, demographic information, and identity verification information, which are typically required to provide personalised health-related services. In contrast, the Finance and eCommerce sectors tend to show a stronger concentration of categories such as payment information, financial data, purchase history, and basic account information, which are essential for managing transactions and commercial activities.

Other sectors exhibit different patterns. For instance, Productivity and Business-Management Tools frequently refer to communication data, metadata, and account-related information, while Travel and Service Intermediaries often process contact information, geolocation data, and booking or transaction-related information.

Overall, the chart highlights how the categories of personal data referenced in privacy policies vary significantly across market sectors, reflecting both the operational requirements of the services and the different types of data processing activities involved. This sectoral distribution provides important insights into the structural logic of data collection practices and the informational patterns emerging from privacy policies in different areas of the digital economy.

Sensitive Personal Data and Level of Informativeness

The chart illustrates the frequency with which clauses referring to personal data and to special categories of personal data are classified as sufficiently informative or insufficiently informative within the PRIMA dataset.

The analysis distinguishes between general personal data and special categories of personal data within the meaning of Article 9 GDPR, in order to assess whether privacy policies provide an adequate level of information when particularly sensitive data are involved.

The results show a clear predominance of insufficiently informative clauses in both categories. For general personal data, 704 clauses were classified as sufficiently informative, while 2,082 were considered insufficiently informative, indicating that in the majority of cases privacy policies do not provide sufficiently detailed or clear information about the processing of personal data.

The imbalance is even more pronounced in relation to special categories of personal data. In this case, only 6 clauses were classified as sufficiently informative, whereas 154 were considered insufficiently informative. This finding is particularly significant given the higher level of protection required under the GDPR for sensitive data such as health information, biometric data, or data revealing racial or ethnic origin, political opinions, religious beliefs, or sexual orientation.
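The scale of this imbalance is easier to compare when the counts reported above are converted into shares. The following Python sketch is illustrative only: the function and variable names are assumptions, and the inputs are the clause counts quoted in the text.

```python
# Clause counts as quoted in the text above.
counts = {
    "general personal data": {"sufficient": 704, "insufficient": 2082},
    "special categories (Art. 9 GDPR)": {"sufficient": 6, "insufficient": 154},
}


def sufficient_share(sufficient: int, insufficient: int) -> float:
    """Return the percentage of clauses classified as sufficiently informative."""
    total = sufficient + insufficient
    return 100 * sufficient / total


# Share of sufficiently informative clauses for each type of data.
shares = {name: sufficient_share(**c) for name, c in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share:.2f}% sufficiently informative")
```

On these figures, roughly one clause in four is sufficiently informative for general personal data, against fewer than one in twenty-five for special categories of personal data.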

Overall, the analysis suggests that privacy policies frequently fail to provide adequate information when referring to sensitive categories of personal data, despite the stricter transparency and accountability requirements imposed by the GDPR. These results highlight a critical gap between regulatory expectations and current drafting practices, particularly in contexts involving the processing of special categories of personal data.