Cambridge Analytica: Give me your data, and I will tell you how to vote
The Internet is now a place where every action and every habit is monitored. As the Cambridge Analytica case shakes public opinion, some are calling on users to delete their Facebook accounts under the hashtag #DeleteFacebook. Yet deleting your Facebook account will not put an end to your profiling on the Internet. A solution must therefore come from the European legislator. The General Data Protection Regulation (GDPR), which will apply from May 25th, 2018, provides a first answer. However, the text arrives too late: the harm is already done. Beyond data theft and misuse, more than a decade of commercial practices must be questioned and, in some cases, could be prohibited.
On March 17th, 2018, The Guardian and The New York Times published a joint investigation into Cambridge Analytica, a British company specialized in “political profiling”. The newspapers claimed that the personal data of more than 50 million Facebook users had been collected for political purposes. A few weeks after this revelation, the number of affected users stands at 87 million, and it is likely to increase in the coming weeks. Most of the accounts concerned belong to people living in the United States. However, more than 3.1 million European citizens could also be affected. The scandal is all the more significant because Cambridge Analytica could have played an important role in the 2016 US presidential election as well as in the British referendum that led to Brexit.
Public opinion is outraged, and the European Commission has already asked Facebook to testify. Nevertheless, the techniques used have long been known in commercial advertising and marketing. Data mining, cyber-profiling and the resale of these profiles for commercial purposes (targeted advertising, for example) have been common practices for years. In recent years, digital profiling and data collection have become systematic. What happened with the data collected on Facebook is in no way an isolated case. The main problem in the Cambridge Analytica case is rather that most of the users’ data was collected and processed without their consent.
What is peculiar about Cambridge Analytica?
The company bases its activity on a technique initially developed by researchers at the Psychometrics Centre at the University of Cambridge in 2014. Psychometrics helps to determine the particular characteristics of an individual by reference to a norm (a reference population). It evaluates general behaviour (personality, motivation, etc.) and basic skills such as reasoning, communication, leadership or emotional intelligence. This technique makes it possible to determine a person’s psychological profile on the sole basis of their likes on Facebook.
In the case behind the recent scandal, Aleksandr Kogan, a Russian-American researcher at the Psychometrics Centre, developed his own application, “ThisIsYourDigitalLife”, presented as a personality test on Facebook. More than 270,000 people downloaded the application and answered its questions. At that time, however, the application’s terms of use and the permissions granted to it were much wider than they are today. Aleksandr Kogan was thus able to collect data on the participants’ geolocation, but also on their Facebook friends and the content they liked. This database was sold to Cambridge Analytica, which describes its activity as behavioural research and communication. It relies on data mining and data analysis to identify audience groups. In other words, it offers its customers the opportunity to determine the behaviour of groups of individuals and to influence their choices through communication campaigns.
This might sound scary, but psychometric predictions are common practice. Researchers can predict users’ personality from Instagram pictures[i] or Twitter profiles[ii]. For example, International Business Machines (IBM) developed an algorithm that infers personality[iii] from unstructured text (such as tweets, emails or blog posts). The start-up Crystal Knows gives customers access to personality reports on the people in their Google contact list, and offers real-time suggestions on how to personalize emails or messages based on each contact’s profile[iv].
In political campaigns, it is important to collect as much information as possible in order to maximize the effectiveness of the campaign: where to hold conventions, which regions to focus on, how to communicate with supporters and, most importantly, with undecided voters. This is what makes firms like Cambridge Analytica attractive: they profile individuals and use these profiles to personalize political communication. Determining a psychometric profile means obtaining information about an individual that could previously be obtained only through specifically designed tests and questionnaires: how neurotic you are, whether you are open to new experiences, or whether you are litigious (for example, whether you would claim a promotion that was not applied at the supermarket).
IBM Watson’s personality test
Every single action on the Internet is recorded
To establish users’ profiles, it is necessary to gather a substantial amount of data, or to have access to it. While some companies (such as Cambridge Analytica) collect the data themselves via personality tests on Facebook, others simply buy it from data brokers and online marketing specialists. They collect personal data (your browsing history, your location data, your friends, how often you charge your battery, etc.) and then use it to derive additional, previously unknown information: what you will buy next, the probability that you are a woman, your chances of being conservative, your current emotional state, your reliability, whether you are heterosexual, etc.
This personal data can be collected in two main ways. First, directly: it is explicitly provided by users during registration or when they accept the terms of service. Second, indirectly: it is collected through trackers installed on most websites. The second way is more problematic because Internet users are generally unaware that these trackers exist and that data is being collected. A single interaction, such as a visit to a website, can trigger a wide range of data flows and a chain of hidden events. The data spreads across several interlinked services to build an ever more accurate profile of the consumer.
These trackers (cookies, tags, Flash cookies, etc.) collect data when you visit a website. The number of trackers on a site depends on which ones the owner of the website has decided to include. For example, some websites have more than 60 trackers belonging to a multitude of companies, while others have only one, perhaps to count visitors, to see where these visitors come from, or to activate a certain feature. Not all trackers are linked to companies that follow browsing habits, but when you “accept cookies”, you accept all trackers, including those that transmit information to businesses. For example, the following trackers are installed on the EU-Logos website:
Trackers on our website
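To make the idea of third-party trackers more concrete, here is a minimal sketch in Python that lists the external hosts from which a page loads scripts, images or frames. It is only an illustration under stated assumptions: the sample page and all domain names are hypothetical, and an external host is not necessarily a tracker (blocking tools generally rely on curated lists of known tracking services, which this sketch does not attempt to reproduce).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ThirdPartyCollector(HTMLParser):
    """Collects the hosts of external resources embedded in a page."""

    def __init__(self, first_party_host):
        super().__init__()
        self.first_party_host = first_party_host
        self.third_party_hosts = set()

    def handle_starttag(self, tag, attrs):
        # Only look at tags that commonly load remote resources.
        if tag not in ("script", "img", "iframe"):
            return
        src = dict(attrs).get("src")
        if not src:
            return
        host = urlparse(src).hostname
        # Relative URLs have no host: they are served by the site itself.
        if host is None:
            return
        is_first_party = (
            host == self.first_party_host
            or host.endswith("." + self.first_party_host)
        )
        if not is_first_party:
            self.third_party_hosts.add(host)


def third_party_hosts(html, first_party_host):
    """Return the sorted list of external hosts referenced by the page."""
    collector = ThirdPartyCollector(first_party_host)
    collector.feed(html)
    return sorted(collector.third_party_hosts)


# Hypothetical page: every domain below is made up for the example.
SAMPLE_PAGE = """
<html><body>
<script src="https://analytics.example.net/tag.js"></script>
<img src="https://pixel.example.org/1x1.gif">
<img src="/images/logo.png">
<script src="https://static.news-site.example/app.js"></script>
</body></html>
"""

hosts = third_party_hosts(SAMPLE_PAGE, "news-site.example")
print(len(hosts), "external hosts:", hosts)
```

In this made-up page, two of the four resources come from hosts that do not belong to the site itself, which is the kind of count that tracker-blocking tools display to the user.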
Some tools are very useful for avoiding this kind of tracking. They prevent trackers from collecting data and allow you to choose which trackers you accept. The “Ghostery” add-on[v] can be installed in your Internet browser (Mozilla Firefox, Google Chrome, Opera, etc.); it blocks intrusive advertisements as well as trackers. It also lets you see how much data certain sites collect without your knowledge.
It might be worse on mobile phones
Trackers are also present on smartphones, in particular in downloaded mobile applications. A 2015 study[vi] by Australian researchers analysed the trackers in the 100 most downloaded free applications and the 100 most downloaded paid applications. The study used the Top-100 lists of four countries in four different geographical regions: Australia, Germany, the United States and Brazil. In total, 275 unique applications were tested, including Facebook, Clash of Clans and Skype among the free applications, and Minecraft, Tasker and Poweramp among the paid ones.
The study found that 60% of paid applications included trackers. This is nothing, however, compared to the 85% to 95% of free applications that include trackers. It goes further: by combining these results with data on the applications installed by more than 300 smartphone owners, the study reveals that more than half of these individuals are exposed to at least 25 trackers through their smartphones. These trackers have access to your geolocation data, messages, contacts and photos, or to the time you spend on different applications.
Example: A user with 11 apps installed is linked to 26 trackers
Figure from the Australian study (see note [vi])
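The kind of figure quoted above (a number of trackers per user) can be obtained by taking the union of the trackers embedded in each installed application. The sketch below illustrates that calculation only; the application names, tracker names and users are hypothetical and are not taken from the study.

```python
from statistics import median

# Hypothetical mapping from an application to the trackers it embeds.
APP_TRACKERS = {
    "game_a": {"ads.example", "metrics.example"},
    "chat_b": {"metrics.example", "crash.example"},
    "news_c": {"ads.example", "social.example", "metrics.example"},
}

# Hypothetical users and the applications installed on their phones.
USERS = {
    "user_1": ["game_a", "chat_b"],
    "user_2": ["news_c"],
    "user_3": ["game_a", "chat_b", "news_c"],
}


def trackers_for_user(installed_apps, app_trackers):
    """Return the set of distinct trackers reachable through a user's apps."""
    exposed = set()
    for app in installed_apps:
        # A tracker shared by several apps is counted only once.
        exposed |= app_trackers.get(app, set())
    return exposed


# Number of distinct trackers each user is exposed to.
exposure = {
    user: len(trackers_for_user(apps, APP_TRACKERS))
    for user, apps in USERS.items()
}
median_exposure = median(exposure.values())

print(exposure)
print("median exposure:", median_exposure)
```

Because the same tracker is often embedded in several applications, the number of distinct trackers grows more slowly than the number of applications, yet a handful of applications is enough to reach a wide range of tracking companies.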
The potential abuses of such practices
Data mining and profiling on the Internet may bring some advantages, for example by allowing companies to reduce the cost of advertising campaigns and to better meet consumer expectations. The method is sometimes presented as a solution to mass consumption and mass production, since it would avoid, in the long term, the excessive production of certain products. In addition, faced with increasingly demanding consumers, companies can put quality before quantity.
However, profiling is frightening. It allows anyone with access to enough personal data to learn very intimate details about you, most of which you have never disclosed. Someone can use your data to find out whether you are gay, even if you have never shared this information. There is a further difficulty: this derived information is often strangely accurate (which makes profiling a privacy nightmare), but because it is predictive, it can sometimes be wrong. When the stakes of a prediction are high, systematic classification errors can have real consequences. Profiling and similar techniques are increasingly used not only to classify and understand people, but also to make decisions with serious consequences, from credit to housing, well-being and employment. Intelligent video surveillance software automatically flags “suspicious behaviour”, and the justice system claims to be able to predict future criminals.
The European Union and the GDPR – better late than never
With such potential risks, it is obvious that the situation cannot continue. Moreover, in view of Cambridge Analytica’s own statements, public opinion and elections can be influenced solely by profiling individuals through the data collected on the Internet. The European Commission and the European Parliament have called Facebook to account in this case. However, the events took place between 2014 and 2015, before the Court of Justice of the European Union (CJEU) delivered its judgment in the Schrems case, which concerned Facebook’s data transfers. It is now no longer possible for applications to collect data from a user’s Facebook contacts without their consent. Yet the harm has been done, and Europe must prevent such cases from happening in the future.
The first step was taken with the adoption of the General Data Protection Regulation (GDPR), which will apply from May 25, 2018. This is the most protective text in the world on personal data, since not only companies in the European Union (EU) must comply, but all entities that process the personal data of EU citizens. It requires organizations to appoint a data protection officer if they process sensitive data on a large scale. It also limits the possibility of using data for purposes other than those stated (as happened in the case of Cambridge Analytica) and requires companies to communicate the objectives of the data collection in a clear and precise manner.
Previously, data protection agencies in the EU Member States could impose financial penalties for violations of existing data laws, but these fines were relatively small, especially compared to the revenues of the private entities sanctioned. In France, for example, the sanctions imposed by the National Commission on Informatics and Liberty (CNIL) under the law of January 6th, 1978 could not exceed 150,000 euros for a first breach. In the event of a repeated breach within five years of the date on which the sanction became final, the amount could not exceed 300,000 euros or, in the case of a company, 5% of its turnover excluding taxes for the last closed financial year, within the limit of 300,000 euros. In the United Kingdom, the Information Commissioner’s Office (ICO) can currently impose a fine of up to £500,000. For comparison, Alphabet (Google) has annual revenues of about $90 billion. From now on, the maximum fine for the most serious infringements of the Regulation is €20 million or, for a company, 4% of its total worldwide annual turnover, whichever is higher[vii].
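The change of scale can be checked with a short calculation. The sketch below applies the “whichever is higher” rule for the most serious infringements to two hypothetical companies. The turnover figures are assumptions chosen for illustration: the larger one simply reuses the order of magnitude quoted above and treats it as euros, ignoring the exchange rate.

```python
# Ceiling for the most serious infringements: a fixed amount in euros or a
# share of annual turnover, whichever is higher.
FIXED_CEILING_EUR = 20_000_000
TURNOVER_SHARE_PERCENT = 4

# Former French ceiling for a first breach, as quoted in the text.
OLD_FRENCH_CEILING_EUR = 150_000


def gdpr_max_fine_eur(annual_turnover_eur):
    """Return the upper limit of the fine for a given annual turnover."""
    turnover_based = annual_turnover_eur * TURNOVER_SHARE_PERCENT / 100
    return max(turnover_based, FIXED_CEILING_EUR)


# Hypothetical turnovers, expressed in euros for simplicity.
large_company_turnover = 90_000_000_000
small_company_turnover = 100_000_000

large_ceiling = gdpr_max_fine_eur(large_company_turnover)
small_ceiling = gdpr_max_fine_eur(small_company_turnover)

print(f"Large company: up to {large_ceiling:,.0f} euros")
print(f"Small company: up to {small_ceiling:,.0f} euros")
print(f"Ratio to the former French ceiling: {large_ceiling / OLD_FRENCH_CEILING_EUR:,.0f}x")
```

For the smaller company, 4% of turnover is below the fixed amount, so the €20 million ceiling applies; for the larger one, the turnover-based ceiling is several thousand times higher than the former national limits.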
In addition, the GDPR increases users’ control over their own data. The 28 Member States implemented the 1995 rules in different ways; the GDPR now imposes uniform legislation, putting an end to this fragmentation. Finally, it helps to strengthen consumers’ confidence in online services, which is vital for boosting growth, jobs and innovation in Europe.
This text also extends data protection beyond the European Union by requiring non-EU companies to comply with its protection standards. States to which data is transferred will have to provide similar guarantees in order to receive it. In addition, some companies such as Facebook say they are willing to apply the protections offered by the GDPR worldwide[viii], probably in an effort to regain users’ trust. Finally, the Canadian Parliament has expressed its desire to adopt reforms identical to those introduced by the GDPR[ix].
The EU needs to go further to protect its citizens
The Cambridge Analytica case already points to the limits of the GDPR. In the opinion of some, the text does not go far enough[x]. In October 2017, MEP Marju Lauristin (S&D) drafted and presented a text to ensure high standards of privacy, confidentiality and security in electronic communications across the EU. The LIBE Committee thus adopted the draft legislative resolution on the proposal for a Regulation of the European Parliament and of the Council concerning the respect for private life and the protection of personal data in electronic communications and repealing Directive 2002/58/EC (Regulation on Privacy and Electronic Communications)[xi].
This proposal includes important changes such as:
- No Internet user could be denied access to a site simply because they refused to be tracked there[xii];
- The default configuration of web browsers should work like some ad blockers and prevent the display of intrusive content and tracking attempts[xiii];
- Communication providers should guarantee the confidentiality of communications (for example by using end-to-end encryption), without any national law being able to force them to use a weaker method or to introduce a backdoor allowing access to communications[xiv].
Interview with MEP Marju Lauristin (S&D)
Facebook recently announced that it will fight election meddling[xv]. The social network plans to label all political ads and display the person or organization that paid for them. In addition, anyone wishing to run a political advertisement will have to undergo an identity and location check. It is particularly dangerous to let private companies control political targeting in this way. The problem lies not so much in the veracity of targeted political advertising as in the very essence of profiling for political ends. It seems particularly dangerous that uncontrolled trackers collect data and establish such accurate profiles. Citizens’ lack of interest in politics is a real problem in our democracies. Nevertheless, political profiling and targeting should in no way be the answer to this problem.
This system is particularly dangerous, as are its potential abuses. In our opinion, it is not a lasting solution to the problem of political mobilization. It therefore seems appropriate to opt for a radical but effective solution: a total ban on the use of personal data for the purposes of political propaganda. As a less far-reaching alternative, political targeting on the Internet and social networks could be banned for a period of time before elections, in the same way that polls are banned during a given period before elections.
The Internet and the digital society are constantly evolving. It is nevertheless now that we should determine how we will be protected tomorrow, what the digital economy will be made of and how it will develop. While they wait for the necessary protections, citizens and Internet users have no choice but to sacrifice their data in order to access the benefits of digital services. As Maciej Ceglowski says, “opting out of surveillance capitalism is like opting out of electricity, or cooked foods – you are free to do it in theory. In practice, it will upend your life”[xvi].
For further information:
CHRISTL Wolfie, “Corporate surveillance in everyday life. How companies collect, combine, analyse, trade, and use personal data on billions”, available at: http://crackedlabs.org/en/corporate-surveillance
MIGEON Jean-Hugues, “Introduction to data protection: state of the art of EU law and basic advices on how to protect personal data”, EU-Logos, available at: http://www.eu-logos.org/?p=22057
LIBE Committee report on the proposal for a regulation of the European Parliament and of the Council concerning the respect for private life and the protection of personal data in electronic communications and repealing Directive 2002/58/EC (Regulation on Privacy and Electronic Communications)
[i] Ferwerda B., Schedl M., Tkalcic M. (2016) Using Instagram Picture Features to Predict Users’ Personality. In: Tian Q., Sebe N., Qi GJ., Huet B., Hong R., Liu X. (eds) MultiMedia Modeling. Lecture Notes in Computer Science, vol 9516. Springer, Cham, available at: <https://link.springer.com/chapter/10.1007%2F978-3-319-27671-7_71>
[ii] Available at: <http://ieeexplore.ieee.org/abstract/document/6113111/authors>
[vi] Available at: <https://www.researchgate.net/publication/282356703_A_measurement_study_of_tracking_in_paid_mobile_applications?enrichId=rgreq-76f4ccccd045296ab02b6e3022572c5b-XXX&enrichSource=Y292ZXJQYWdlOzI4MjM1NjcwMztBUzoyODAwMDQ2ODU1MTY4MDdAMTQ0Mzc2OTcyNzcwNA%3D%3D&el=1_x_3&_esc=publicationCoverPdf>
[vii] Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) (Text with EEA relevance), Article 83, para. 5, available at: <https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32016R0679>.
[xii] Ibid., article 8
[xiii] Ibid., article 10, para. 1.
[xiv] Ibid., article 17.