Nova SBE

Tell me what you do, and I will tell you who you are

May 16, 2019 at 12:46 PM by Paulo Almeida

When shopping online or "liking" something on Facebook, we give away our identity and privacy to gigantic data banks. This collective data offers great potential for humanity, but it also raises numerous threats to which we must stay alert. This is the second in a series of ten articles on the risks of the "Digital Revolution".

Author: Paulo Almeida | Reading time: 6 minutes

Photo: Antoine Beauvillain (Unsplash)

"- When he begins to work, he will no longer be free.
- Bound, yes, by an obligation to the order that he himself instituted, and therefore free.
- Yes, the dialectic of freedom is unfathomable."
Thomas Mann, Doctor Faustus

In 2006, Google bought YouTube for the extraordinary sum of 1.65 billion dollars. At the time, this price was considered excessive: an online video platform could hardly be worth that much [1]. Soon enough, however, it became clear that Google’s interest lay not so much in the content as in what it revealed about the users. The big technology companies – Google, Facebook, Microsoft, etc. – have simultaneous access to data from numerous and diverse platforms, such as social networks, video-sharing websites and direct messaging. Google, besides the products associated with its name (Chrome, Gmail, Photos, Maps) and YouTube, also owns reCAPTCHA, which forces us to prove to robots that we are not robots. Those who avoid Facebook but use Instagram or WhatsApp might not know that these belong to the same company. Microsoft owns not only Skype, but also LinkedIn and Hotmail. Each of these companies recurrently buys others that either collect data from their users or develop artificial intelligence technologies. This is how they can offer us products that recognize our voice (such as Siri or Alexa), identify who is who in a photograph and predict what we will buy next week.

Why do they invest so much in platforms we don’t even pay for [2]?

The business model rests on monetizing our data, to be used in targeted advertising or in recommendations driven by machine-learning algorithms. Think of the supermarket loyalty card you may have in your wallet. Why are discounts “offered” to those who use it? What companies are really buying is your data, so that they can understand customers’ shopping patterns. With well-trained algorithms, they can manage stock better, or gain additional knowledge of – and influence over – your choices.

These algorithms work in a conceptually simple way:

  1. they analyse thousands (or millions) of profiles of people who shop daily in their supermarkets;

  2. they detect “shopping patterns” – whoever buys size-0 diapers in January can be expected to buy the first baby porridge from April onward;

  3. they try to identify similarities and predict behavior, “knowing” not only who will buy baby food but also which brands they will prefer.
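
The three steps above can be sketched as a toy co-occurrence recommender. Everything here is hypothetical – the baskets, product names and the `recommend` function are illustrative; real systems use vastly larger datasets and more sophisticated models:

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories: each set is one customer's basket.
baskets = [
    {"diapers_0", "wipes", "formula"},
    {"diapers_0", "wipes", "baby_porridge"},
    {"diapers_0", "baby_porridge", "formula"},
    {"bread", "milk", "coffee"},
]

# Steps 1-2: scan all profiles and count which products are bought together.
pair_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1

def recommend(product, top_n=3):
    """Step 3: rank the products most often co-purchased with `product`."""
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if a == product:
            scores[b] += count
        elif b == product:
            scores[a] += count
    return [item for item, _ in scores.most_common(top_n)]

print(recommend("diapers_0"))  # baby products, never bread or coffee
```

The same counting idea, scaled up and combined with per-customer profiles, is what lets a chain predict the April porridge purchase from the January diapers.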

These algorithms are getting better and better. It is already possible to extract information and patterns from structured data (e.g. grocery purchases), free text, images and even videos. It is enough to show an algorithm a set of images labeled “cat” or “not cat” for it to learn to distinguish cats. But who labeled the images that train these algorithms? Millions of humans who share pictures of their pets, “tag” their friends and describe their vacations. We all know that our data is being used, but perhaps we don’t know to what extent, nor the problems that might arise from it.
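
A minimal sketch of how human-labeled examples become a classifier, assuming images have already been reduced to small numeric feature vectors (real systems work on raw pixels with neural networks, but the labels still come from people). The data and the nearest-centroid approach here are purely illustrative:

```python
def centroid(vectors):
    """Average the feature vectors that share a label."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def train(labeled_examples):
    """Build one centroid per label from human-labeled (features, label) pairs."""
    by_label = {}
    for features, label in labeled_examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(vs) for label, vs in by_label.items()}

def classify(model, features):
    """Assign the label whose centroid is closest (squared distance)."""
    def dist(c):
        return sum((f - x) ** 2 for f, x in zip(features, c))
    return min(model, key=lambda label: dist(model[label]))

# Hypothetical training set: features extracted from images people uploaded
# and tagged themselves – the free labeling work described above.
training = [
    ([0.9, 0.8], "cat"), ([0.8, 0.9], "cat"),
    ([0.1, 0.2], "not-cat"), ([0.2, 0.1], "not-cat"),
]
model = train(training)
print(classify(model, [0.85, 0.75]))  # -> cat
```

The algorithm never “sees” a cat; it only generalizes from the labels we supplied for free.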

Firstly, although these procedures imply consent, data sharing is not always voluntary – but we will save this topic for the coming weeks. Even when it is voluntary, it can have unexpected implications, and shared data can be used in unpredictable ways. Two examples: in 2012, The New York Times reported a case in which a US supermarket chain managed to predict when its customers were pregnant by analyzing their shopping habits [3]. One day, one of the stores received a visit from a furious father whose teenage daughter had been receiving promotions for baby clothes and cribs; later, when the store manager called to apologize, the embarrassed man told him that his daughter was in fact pregnant. The supermarket chain knew before her own family did. Other data companies, such as hiQ, use information publicly available on the web to predict when an employee is about to quit, and sell this information to the employer.

Even when we are careful about what we publish online, being careful may itself be informative, as Shoshana Zuboff summarizes in The Age of Surveillance Capitalism [4]:

“It is not what is in your sentences but in their length and complexity (...) not where you make plans to see your friends but how you do so: a casual "later" or a precise time and place? Exclamation marks and adverb choices operate as revelatory and potentially damaging signals of your self.”
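
Zuboff’s point – that how we write can be as revealing as what we write – is easy to demonstrate. A hypothetical sketch of the kind of stylistic features such systems might extract (the function and feature names are illustrative, not any real product’s API):

```python
import re

def style_features(message):
    """Extract 'how you write' signals, ignoring what the message says."""
    words = message.split()
    return {
        "word_count": len(words),
        "avg_word_length": sum(len(w) for w in words) / max(len(words), 1),
        "exclamation_marks": message.count("!"),
        # A casual "later" vs. a precise time and place, as in the quote.
        "has_precise_time": bool(re.search(r"\b\d{1,2}[:h]\d{2}\b", message)),
    }

print(style_features("See you later!!"))
print(style_features("Meet me at 18:30 at the main entrance."))
```

Neither message mentions anything sensitive, yet the two feature dictionaries already distinguish an impulsive writer from a methodical one.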

It is natural that this level of control should have social consequences, and they can be profound: from data extraction to behavior prediction for commercial and political purposes, these technological giants are reducing uncertainty, and the next step will be taking control of action itself. Games, subtle stimuli, content designed to elicit a response: free will is being reduced in favor of security, efficiency and profit – we will return to this matter in later articles of this series.

These tensions between freedom of choice and external imposition are not new. They are a constant throughout history, and they have become increasingly present in our interaction with modern technology. We have given away control over our privacy and decisions in exchange for convenience and productivity: when we search for a book on Amazon and receive recommendations based on our profile and history; when we make a Google search and, not long after, see related ads on Facebook, during the period when they predict we will be most receptive.

It is important to clarify that we are not condemned to a false dichotomy between the noble savage and permanent intrusive surveillance. There are alternative technologies, such as distributed and federated systems, that remain, as far as possible, under each person’s control and that respect autonomy and privacy. The social network Mastodon, with millions of users, might be the best example, but others are taking shape [5]. Nevertheless, the issue is mostly political, and society as a whole should be called upon to decide, democratically and duly informed, how to regulate this sector. This will be the topic of the next article.

10 months and 10 articles by the Data Science and Policy Research Group at Nova SBE

Over the next few months we will describe and discuss some of the possible darker sides of this revolution. We begin by explaining the so-called recommendation systems (or what supermarket points are for) and then discuss how current legislation (does not) protect us. In the weeks that follow, we will see whether we should cover our phone's camera, how to deal with health data, and how to identify fake news. We will offer information and practical tips, while also addressing questions of principle and ethical values. The goal is to help us think about the world not as it exists today, but as we would like it to be. Because the future is decided now.

Next Month: “Legal vs. Ethical"
Subscribe to our Blog's monthly newsletter here to receive our articles.


Originally published at Jornal Público

Topics: Digital & IT, Opinion Articles


Published by: Paulo Almeida

Researcher @ Nova SBE
