20 July 2020

GDPR and Google

'Neoliberal business-as-usual or post-surveillance capitalism with European characteristics? The EU’s General Data Protection Regulation in a multipolar Internet' by Angela Daly comments
Since the Global Financial Crisis of 2008/2009 and the decade which followed it, the tenets of neoclassical economic theory and its application through neoliberalism have been questioned in western economies including the European Union (EU). The Snowden revelations of US-driven global digital surveillance prompted law and policy changes in various jurisdictions including the EU to counter privacy infringements by both state and corporate actors. This all takes place against a backdrop of an increasingly multipolar world, including in matters of Internet governance, where the US’s hegemony is weakened and the EU and BRICS emerge as powerful actors, including in the Internet sphere. 
One highly prominent event in the last ten years thrusting the EU into global prominence is its data protection legislation update, with the introduction of the General Data Protection Regulation (GDPR), widely believed to be the international ‘gold standard’ in privacy and data protection standards, holding actors in the digital economy to account for their (ab)use of personal data and empowering users, thereby seemingly placing some limitations on unfettered digital capitalism and surveillance. 
Can the GDPR developments be understood as an attempt by the EU to forge a different political economy of digital technology to neoliberal capitalism whereby large companies are constrained in their actions and EU citizens’ rights upheld? Or can these developments be seen through another lens, that of varieties of capitalism, whereby the European Union, in the context of growing geopolitical multipolarity, is leveraging its large and attractive market and the ‘Brussels Effect’ to become a ‘regulatory superpower’ after having lost the tech superpower battle to the US and China? 
The GDPR exposes the nature of contemporary EU internet regulation as a contested and hybrid site, containing both capitalist impulses and overtures to protect aspects of individuals’ privacy and data protection. This latter aspect of protecting privacy and data protection is diminished to the extent that the GDPR facilitates much data gathering and use, and seems to have the effect in practice of consolidating US tech giant Google’s corporate power. Thus, the GDPR seems more an example of EU regulatory capitalism, ‘constraining and encouraging the spread of neoliberal norms’, rather than an attempt to move to digital postcapitalism. Nevertheless, the external-reaching aspects of the GDPR make it an example of the ‘Brussels effect’ and contribute to shoring up the EU’s regulatory power in a multipolar Internet by proactively setting de facto standards. In this way, the EU ‘remains relevant as a global economic power’ through its regulation, rather than through its technological innovation like the US and China. The GDPR’s aspects which promote human rights and social goals may also give rise to more privacy protective innovation, albeit innovation which may come from non-EU Big Tech companies.

'Google Data Collection' (Digital Content Next, 2018) comments 

1. Google is the world’s largest digital advertising company. It also provides the #1 web browser, the #1 mobile platform, and the #1 search engine worldwide. Google’s video platform, email service, and map application have over 1 billion monthly active users each. Google utilizes the tremendous reach of its products to collect detailed information about people’s online and real-world behaviors, which it then uses to target them with paid advertising. Google’s revenues increase significantly as the targeting technology and data are refined. 

2. Google collects user data in a variety of ways. The most obvious are “active,” with the user directly and consciously communicating information to Google, as for example by signing in to any of its widely used applications such as YouTube, Gmail, Search etc. Less obvious ways for Google to collect data are “passive” means, whereby an application is instrumented to gather information while it’s running, possibly without the user’s knowledge. Google’s passive data gathering methods arise from platforms (e.g. Android and Chrome), applications (e.g. Search, YouTube, Maps), publisher tools (e.g. Google Analytics, AdSense) and advertiser tools (e.g. AdMob, AdWords). The extent and magnitude of Google’s passive data collection has largely been overlooked by past studies on this topic. 

3. To understand what data Google collects, this study draws on four key sources:

  • Google’s My Activity and Takeout tools, which describe information collected during the use of Google’s user-facing products; 

  • Data intercepted as it is sent to Google server domains while Google or 3rd-party products are used; 

  • Google’s privacy policies (both general and product-specific); and 

  • Other 3rd-party research that has examined Google’s data collection efforts. 

4. Through the combined use of the above resources this study provides a unique and comprehensive view of Google’s data collection approaches and delves deeper into specific types of information it collects from users. This study highlights the following key findings: 

a. Google learns a great deal about a user’s personal interests during even a single day of typical internet usage. In an example “day in the life” scenario, where a real user with a new Google account and an Android phone (with new SIM card) goes through her daily routine, Google collected data at numerous activity touchpoints, such as user location, routes taken, items purchased, and music listened to. Surprisingly, Google collected or inferred over two-thirds of the information through passive means. At the end of the day, Google identified user interests with remarkable accuracy. 

b. Android is a key enabler of data collection for Google, with over 2 billion monthly active users worldwide. While the Android OS is used by Original Equipment Manufacturers (OEMs) around the world, it is tightly connected with Google’s ecosystem through Google Play Services. Android helps Google collect personal user information (e.g. name, mobile phone number, birthdate, zip code, and in many cases, credit card number), activity on the mobile phone (e.g. apps used, websites visited), and location coordinates. In the background, Android frequently sends Google user location and device-related information, such as apps usage, crash reports, device configuration, backups, and various device-related identifiers. 

c. The Chrome browser helps Google collect user data from both mobile and desktop devices, with over 2 billion active installs worldwide. The Chrome browser collects personal information (e.g. when a user completes online forms) and sends it to Google as part of the data synchronization process. It also tracks webpage visits and sends user location coordinates to Google. 

d. Both Android and Chrome send data to Google even in the absence of any user interaction. Our experiments show that a dormant, stationary Android phone (with Chrome active in the background) communicated location information to Google 340 times during a 24-hour period, or at an average of 14 data communications per hour. In fact, location information constituted 35% of all the data samples sent to Google. In contrast, a similar experiment showed that on an iOS Apple device with Safari (where neither Android nor Chrome were used), Google could not collect any appreciable data (location or otherwise) in the absence of a user interaction with the device. 

e. After a user starts interacting with an Android phone (e.g. moves around, visits webpages, uses apps), passive communications to Google server domains increase significantly, even in cases where the user did not use any prominent Google applications (i.e. no Google Search, no YouTube, no Gmail, and no Google Maps). This increase is driven largely by data activity from Google’s publisher and advertiser products (e.g. Google Analytics, DoubleClick, AdWords). Such data constituted 46% of all requests to Google servers from the Android phone. Google collected location at a 1.4x higher rate compared to the stationary phone experiment with no user interaction. Magnitude wise, Google’s servers communicated 11.6 MB of data per day (or 0.35 GB/month) with the Android device. This experiment suggests that even if a user does not interact with any key Google applications, Google is still able to collect considerable information through its advertiser and publisher products. 

f. While using an iOS device, if a user decides to forgo the use of any Google product (i.e. no Android, no Chrome, no Google applications), and visits only non-Google webpages, the number of times data is communicated to Google servers still remains surprisingly high. This communication is driven purely by advertiser/publisher services. The number of times such Google services are called from an iOS device is similar to an Android device. In this experiment, the total magnitude of data communicated to Google servers from an iOS device is found to be approximately half of that from the Android device. 

g. Advertising identifiers (which are purportedly “user anonymous” and collect activity data on apps and 3rd-party webpage visits) can get connected with a user’s Google identity. This happens via passing of device-level identification information to Google servers by an Android device. Likewise, the DoubleClick cookie ID (which tracks a user’s activity on the 3rd-party webpages) is another purportedly “user anonymous” identifier that Google can connect to a user’s Google Account if a user accesses a Google application in the same browser in which a 3rd-party webpage was previously accessed. Overall, our findings indicate that Google has the ability to connect the anonymous data collected through passive means with the personal information of the user.
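
The rounded figures quoted in findings d and e can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the report: the variable names are made up, a 30-day month and decimal units (1 GB = 1000 MB) are assumptions, and the implied total of data samples is an inference from the quoted 35% share rather than a number the report states.

```python
# Figures quoted in the findings above.
location_reports_per_day = 340   # finding d: dormant Android phone over 24 hours
location_share = 0.35            # finding d: share of all data samples that were location
data_mb_per_day = 11.6           # finding e: active Android phone, data exchanged per day

# Assumptions (not stated in the report).
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30
MB_PER_GB = 1000

# Average location communications per hour for the dormant phone.
reports_per_hour = location_reports_per_day / HOURS_PER_DAY

# Monthly data volume for the active phone.
gb_per_month = data_mb_per_day * DAYS_PER_MONTH / MB_PER_GB

# Inferred total number of data samples per day for the dormant phone.
implied_total_samples = location_reports_per_day / location_share

print(f"Location reports per hour: {reports_per_hour:.1f}")
print(f"Data per month: {gb_per_month:.2f} GB")
print(f"Implied total samples per day: {implied_total_samples:.0f}")
```

Under these assumptions the results come to roughly 14 location reports per hour and 0.35 GB per month, which agrees with the report's rounded numbers.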