The datafied ‘us’

DATA once meant tabulated or graphed information about people and events obtained by ploughing through a huge volume of mathematical representations. This meaning of data has morphed into imperceptible, machine-breeding statistical categorisations that govern us and the world. Today, data about individuals are where tech companies and the state put their money because such data have become the lever with which they can control and manipulate the actions of individuals.

Data are free-floating and open to confiscation. They are not veils; they are transparent gateways to infinite facades. The transparency of data ensures that we are being watched, monitored, governed, followed, traced, tracked and disciplined. This is the central thesis advanced by the book We Are Data: Algorithms and the Making of our Digital Selves by John Cheney-Lippold.

The book dilates on how algorithms make and compute us. The computed “us” is what algorithms make us out to be, which is a far cry from what we are. Cheney-Lippold (assistant professor of Digital Studies and American Culture, University of Michigan) has laid out lucidly his arguments on the maniacal proportions of information embezzlement by digital empires. The book is also a trailblazer for the new-found interest in algorithm studies in academia. The book is filled with theoretical concepts and draws on the studies of contemporary scholars and digital media philosophers. The theoretical concepts have been largely used as annotative, nice-to-know information.

Although the concepts are not a prerequisite to comprehend the contents of the book, the reader will be enthused to read them before proceeding further. At its core, the book, in its four chapters—categorisation, control, subjectivity and privacy—deals with how algorithmic logic organises and orders our lives.

The everyday life of people has become deeply entrenched in digital networks as they wade through a host of apps and social media sites. The expansive, networked ecosystem that they are part of holds them hostage to invisible traps. We seemingly take comfort in the possibilities of the networked world, which at best excites people with information. But, that is the pretentious saving grace at the superficial level. We Are Data is a scary and grisly account of how human beings cease to be individuals. Not because individuals cease to exist but because their ubiquitous and fragmented presence makes them susceptible to surveillance and control beyond one’s imagination.

Data make us, but we cannot see them. So, what is not seen is what determines our life and regulates and modulates our self. If Rene Descartes argued “I think, therefore I am”, the unseeable data do not necessarily mean that we do not exist. But, as Cheney-Lippold says, data in themselves do not carry meaning. It is the aggregation of data derived from our engagement with a wide variety of apps and activities that lends itself to meaningful information that we as human beings cannot compute.

The first chapter is a chilling revelation of how algorithms create categories in terms of profiling, racialising and marginalising us. Our data and the metadata that algorithms create by drawing information about what we do on social media and in the digital space give rise to constructions of gender, race, wealth and identity, among other things. But, is it our gender, race and identity or something that has been constructed through statistical patterning and correlations? Cheney-Lippold explains in the introduction: “Algorithmic identity doesn’t declare that you are ‘male’ or ‘female’.… Rather, you are likely to be 92 per cent confidently male and 32 per cent female confidently.” Gender as we know it is different from “gender” formulated by algorithms. This chapter brings to the fore the unobtrusive algorithms and their mechanics in carving out identities.
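The idea that one can be confidently “male” and “female” at once is easier to see in a toy calculation. What follows is only a minimal sketch, not the method of the book or of any company: the site names, the weights and the `algorithmic_gender` function are all invented for illustration. Each category is scored independently from the same browsing traces, so the two confidences need not add up to 100 per cent.

```python
from math import exp

# Hypothetical weights, invented for illustration only.
# Each category keeps its own scoring of the same behavioural traces.
WEIGHTS = {
    "male": {"sports_site": 1.2, "car_forum": 0.9, "fashion_blog": -0.4},
    "female": {"sports_site": -0.3, "car_forum": -0.5, "fashion_blog": 1.1},
}


def confidence(category, visits):
    """Squash a weighted sum of site visits into a 0..1 confidence."""
    score = sum(WEIGHTS[category].get(site, 0.0) * count
                for site, count in visits.items())
    return 1 / (1 + exp(-score))


def algorithmic_gender(visits):
    """Return an independent confidence for every category."""
    return {category: confidence(category, visits) for category in WEIGHTS}


profile = algorithmic_gender({"sports_site": 2, "fashion_blog": 1})
for category, value in profile.items():
    print(f"{category}: {value:.0%} confident")
```

Because the scores are computed separately, the same user can rate highly on both categories, and a different set of weights (a different data company) would produce a different “gender” from identical behaviour.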

The author also explains that data in themselves are not meaningful and their aggregation is so generative that one can hardly locate data. There is neither needle (particularities) nor haystack (mass of data) in the data world. We are producing actual perceivable data from the virtual possibilities that are imperceptible.

‘As if’ category

The intangible techno-digital materiality of metadata (the bits and our coded inputs) is processed by algorithms to produce a categorisation that does not resemble what one really is. As the author puts it, we are denied the freedom to be evaluated on the basis of our own self. Instead, the linking of different types of data, strung together by algorithms, yields an “as if” category: a “self” and not our own self. Blending Rosi Braidotti’s concept of nomadism with Hans Vaihinger’s “as if”, Cheney-Lippold explains that “as if” is different from “as” or “as is”.

Vaihinger argues that sensations and feelings are real, while knowledge produced otherwise is fictitious. We believe as if what we do were true. Yet, it is not real. Vaihinger uses mathematical assumptions such as “Let x be the amount that Hari had”. Such an assumption advances our thought towards solving the problem.

Cheney-Lippold has extrapolated this concept of “as if” to underline how algorithms assume that a woman = x, y, z. A reference to one particular category such as a woman = x would mean that the identity of a woman is dependent on static and unitary dimensions of gender. However, algorithms neither construct nor conform to such unitary definitions of gender. Thus, there are multiplicities of gender; there are many “as if” categories. There are no “as” or “it is” categories since they are “as if” real and not real. Thus, a woman could be an x, y, z according to different data companies. Each of them would make a gender out of the datafied performances.

Since the “as if” defies fixity, Cheney-Lippold has connected with Rosi Braidotti’s concept of “nomadism” where the subject position flows across a spectrum of indices. As it is defying a static representational category, Cheney-Lippold argues that the “as if” category is relational, that is, a citizen can be understood in relation to the multiple indices that data produce. There is the possibility that an “as if” terrorist could stem from metadata.

Metadata, thus, frame individuals into what the author calls the measurable-type. The measurable-type is the quantified version of the self, the “as if” self.

While acknowledging the threat of terrorism and terrorists in the world, Cheney-Lippold points out that data patterns could be used to target “terrorists” or “anti-social” elements owing to their statistically derived profile resembling the state’s concept of a “terrorist” or an “antisocial”. Let me provide an imaginative analogy: a techno-savvy wife or husband tracks the spouse’s data behaviour such as parking a vehicle in front of a mall (he/she would have, for want of space, parked it in front of the hospital nearby) and buying two tickets for a movie (he/she might have got it for friends or cousins). This can possibly create the measurable type of the spouse being categorised as two-timing.

Cheney-Lippold explains the same vividly: If Bill and Ted live at the same address, rent a truck, visit a sensitive location and buy some ammonium nitrate (used as fertilizer in growing potatoes and also as a vital component in making country bombs), they would exhibit a behavioural pattern that corresponds to that of a terrorist signature, and the algorithm designed to process this data would then sound the alarm. The measurable type thus produced is rerouted through technological computation, while the real values of an individual, the ideal type, are transcoded into data and numbers. We become data objects.
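The Bill and Ted scenario can be restated as a simple pattern match. This is a minimal sketch under stated assumptions: the behaviour labels, the addresses and the `flagged_addresses` function are hypothetical, and no real surveillance system is being described. It pools the behaviours recorded at one address and raises a flag when the pooled set covers a predefined “signature”.

```python
# Hypothetical "signature": the behaviours named in the Bill and Ted example.
SIGNATURE = {
    "rented_truck",
    "visited_sensitive_location",
    "bought_ammonium_nitrate",
}


def flagged_addresses(events):
    """Return the addresses whose pooled behaviours contain the signature.

    events is an iterable of (person, address, behaviour) tuples.
    """
    behaviours_by_address = {}
    for _person, address, behaviour in events:
        behaviours_by_address.setdefault(address, set()).add(behaviour)
    # An address is flagged when every signature behaviour was seen there,
    # regardless of who did what or why.
    return sorted(address
                  for address, seen in behaviours_by_address.items()
                  if SIGNATURE <= seen)


events = [
    ("Bill", "12 Elm Street", "rented_truck"),
    ("Ted", "12 Elm Street", "bought_ammonium_nitrate"),
    ("Bill", "12 Elm Street", "visited_sensitive_location"),
    ("Farmer", "3 Field Lane", "bought_ammonium_nitrate"),
]
print(flagged_addresses(events))
```

The match says nothing about intent. The potato grower who also rents a truck and drives past a sensitive site would be flagged in exactly the same way, which is the reviewer's point about the measurable type standing in for the person.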

In the second chapter, Cheney-Lippold weaves his expositions of algorithmic control over the human population around Michel Foucault’s concept of biopolitics and governmentality.

Biopolitics is about how the state controls its population and imposes its hegemonic sway over the mass to ensure that its conduct is in conformity with the state’s intentions and diktats. The author points to data’s pervasive powers to control and determine the conduct of the population. The algorithmic control is not exercised over the population or the mass but over dividuals. Dividuals are micro traits culled out of individuals: my gender is a dividual as much as my age, my political affiliation, religion, liking for coffee, activism, crusading for the rights of Dalits, and the number of children I have, to name a few. These dividualities will not be tied together in a pattern that is foreseeable.

Dividuals are like Lego blocks, and what machine-learning and algorithmic programmes produce out of them is anyone’s guess. Unlike the traditional form of biopolitical control in which the state seeks to have power over the lives of the population through coercive and cajoling techniques, in the data world control is a generative phenomenon. While the traditional biopolitical control sweeps individuals into a homogenised population, stripping off markers of individuality, datafied control breaks individuals into parts called dividuals.

This chapter spells out the ways in which the state controls and disciplines the population. The author has cited examples from the political spectrum and technology empires. For instance, Cheney-Lippold underlines the point that hospitals and doctors were once the sole preserve of our medical data. But digital technologies and companies have wrenched this right from hospitals, which represent enclosed spaces, and made medical information a commodity as well as a discursive product. As a commodity, our data are marketed to brands that are keen to target dividuals. Imagine using an app that monitors our heartbeat, obesity, exercise regimen and calories burnt, or imagine Googling to get a grasp of the side effects of drugs, paying the drug store through Google Tez or playing a game that counsels us on our fitness. In fact, this is the practice of the Quantified Self, or QS, community, the author argues.

All our data are siphoned off by the tech companies and they are put through diverse non-linear assembly-line productions to generate unpredictable and speculative information about our future. On the one hand, an app that is developed to alert us about our health is well-intentioned, yet the hidden truth is that we are disciplined to conform to some standards of health. All of a sudden, there is an increasing craving for health-conscious measures and this craving leads to technological fetishism, which, in turn, results in mining and milking of data for profit.

The author has generously imported case studies from digital corporations such as Google and from the Pentagon in the United States. Cheney-Lippold says that the T2 Mood Tracker mobile app funded by the Pentagon creates a statistical mood and health index that defines one’s anxiety and depression on the basis of the data captured. Anxiety and emotions thus quantified become parameters that can combine with other quantified characteristics to define a “person”. For instance, Cheney-Lippold argues, statistically composed anxiety and depression are ascribed to poor diet and an inadequate exercise regimen. What is worrisome is that even before their data are captured, users offer themselves at the altar of technology giants, leaving traces and trails of life to be exploited for profit.

Cheney-Lippold discusses in detail Web-analytic companies such as Quantcast and their intent to monitor which sites we visit, what sneakers we purchase, which movie we watch, how much we spend on snacks and the itinerary of our travel from home to hospital: all our activities are analysed. What the author underlines here is that “biopolitics is not the politics of life itself but the politics of data itself which has also become a new index for life itself”.

Soft biopolitics

Cheney-Lippold cites Antoinette Rouvroy’s “algorithmic governmentality”, a biopolitical concept that best describes how soft biopolitics is getting naturalised. Soft biopolitics is stewarded by tech companies, which have control over us, and the control is less physical. Cheney-Lippold says that “hard biopolitics engages life on a social level” while “soft biopolitics engages life on a techno-social level”. He cites Google Flu to describe soft biopolitics. Google Flu became the digital doctor diagnosing flu trends in the world.

The application did what Professor Stephen D.N. Graham calls “software sorting” to compile search queries in different places, provinces and during different time periods. On the basis of 160 search queries from users, Google Flu determined whether the “flu” (different from flu) infected the town or not. In fact, it made doctors redundant.

The author laments that Google Flu flu-ed us and age-d us. Even when doctors advise us not to travel, the suitability of travel is determined by Google Flu. Examples such as this from the chapter indicate how algorithms nudge a subject to conform to datafied standards. In the process, we bring about self-discipline without knowing consciously that we are subjected to it.

The third chapter is a comprehensive take on subjectivity and follows closely on the first two chapters. Predicating his arguments on the leak of classified surveillance information of the U.S. National Security Agency by Edward Snowden, the author outlines the emergence of an assemblage of identities from Facebook posts, Google Chat, WhatsApp, GPS usage and images uploaded on Instagram, among others. These user positions, which are outside the control of the users themselves, are algorithmic identities. These identities are produced in the back end, brought to life through the aggregation of a combination of personalities from data dispersed all across the space. Cheney-Lippold says that Google might use the data for “knowledge, order and advertising”, while the NSA might exploit them for knowledge, order and exercising state power. The author presents the dreadful surveillance machinations of many agencies.

The tentacles of surveillance facilitated by algorithms go beyond national boundaries. Cheney-Lippold says that a citizen is not defined by jus soli, citizenship acquired by the right of being born on a soil or in a territory, or by jus sanguinis, citizenship acquired by descent from one’s parents and ancestors. He argues that digitally acquired citizenship is jus algorithmi, thereby taking the body beyond the nation state. The difference lies in the fact that the nation state’s conferment of citizenship deals with identifiable individuals.

The datafied world deals with unidentifiable subjects that are an assemblage of diverse behavioural patterns. For agencies such as the NSA, tracking an individual is not as simple as following someone in the street. The NSA would follow the data footprints captured from a cross section of data and behaviours in multiple contexts. But jus algorithmi identities are transient and continue to evolve with the tracking and aggregation of different strands of data. The subjectivity, or jus algorithmi identity, is thus not stable. Just as the anthropologist Arjun Appadurai terms a nation as curatorial, subjectivities are also curatorial. The author, at this juncture, locates his jus algorithmi subjectivity in Deleuzian new materialism, which proposes that subjectivity is always constituted in the network of relations or an assemblage. This point subverts the notion of individuals as stable categories.

Transcoded identities

Subsequently, subjectivities in the network of relations in the digital context would mean how identities are co-constructed through their interface with technologies. Farmers are not handling the stable material entity called seeds; rather, they deal with the material-informational entity called genetically modified seeds constructed out of codes. Our materiality, and therefore our subjectivity, in this context is posthuman, in that algorithms interact with our behaviour, producing transcoded identities. The algorithmic function of generating identities pronounces that datafied subjects are nomads living outside any rigid organised form of life, and outside centre and state interference. Our behaviour as captured by data will never align itself with a “particular” category; algorithms compute one for us. That could be gay, troublemaker, terrorist, man or woman. The assemblage of fluid identities constructed by the data defeats any singular subjectivity.

Extending this line of thought, the author explains how and whether data can make a subject out of individuals. Statistical data are used to predict and estimate behavioural patterns of individuals in the future. Algorithms are developed to evaluate existing data sets to arrive at predictions. Such predictions can reek of algorithmic bias, and in one such instance, when the data were regressed, an Afro-American arrested for stealing an object in public space was statistically found to be twice as likely to commit crimes as a white American. Interestingly, the New York Corporation took this up and drafted the Algorithm Accountability Bill following the exposé by ProPublica in 2018. This chapter digs into such racialised performative actions by examining Google Translate. The algorithmic logic that drives Google Translate at times appears to exhibit mala fide intentions when data sets connect two words as synonyms even though the words are not close in their semantic or semiotic references.

But the mathematical terms and the number of times the two words have incidentally been found together could result in producing a racialised or a cybertype subjectivity. Words such as “strong” and “negro”, if frequently found correlated, may become synonymous, and Google Translate may err by giving out the meaning of strong as negro. This happens not out of prejudice but because of statistical cues.

Algorithmic identities also move beyond particularities of individuals, beyond enclosures, prison walls and onto cyberspace. The noise of data, in plenty, ruptures fathomable identities and what we have at hand are not individuals and mass. They have turned into dividuals and market or data.

These arguments, which run a chill down the spine, have been reinforced throughout the chapter to make a profound statement about the distinction between I and “I”, the latter emanating as an “as if” Indian and U.S. citizen, an “as if” terrorist that dovetails with the state’s conceptualisation. We become data, erasable and rewritable, Cheney-Lippold argues. We write into algorithms, speaking to them. When algorithms begin coalescing our traces, they start speaking for us.

Notion of privacy

The fourth and last chapter sheds light on the pivotal issue of privacy. The notion of privacy in digital spaces has been subjected to vagaries of contexts. Privacy at home is different from privacy at the workplace and at recreational spaces.

Algorithms and technologies, in general, have become informationalised state apparatuses to control and monitor dividuals. With every fragment of our self in the digital space brought under the watchful eyes of the state, the question of privacy has been tossed to the wind. Privacy is untenable and the same has been vociferously echoed by technology companies such as Facebook and Google.

Extrapolating this further, Cheney-Lippold mentions that privacy in digital spaces is not a fixed concept but one that is characterised by fluidity. He articulates that privacy in its traditional form has lost its legitimacy as it is not assigned to individuals. The traditional meaning of privacy undergirding the right to be left alone has assumed imprecise connotations.

The individual-centric privacy has been debunked by the negation of individuals. The author supplements his argument with a case study which shows that a disabled patient dialling emergency services does not get his requirement addressed. He is greeted with questions, and the answers given by the patient are used to determine whether his request for an ambulance deserves to be heeded. Eventually, the patient is found lying unconscious and dies at the hospital. The algorithmic logic at play in dredging up data about his health through a set of questions did him in.

Thus, privacy exposes the power dynamics of algorithms; they produce a statistical categorisation of the datafied “us”, necessarily the unidentifiable “us”. Besides, algorithms carry out an act of surveillance that detaches the subject from its corpus of data generated through space and time. Cheney-Lippold locates a dividual subject (my gender+my age+my colour+my caste+my income+brands I like+drug store from where I get medicines+my political affiliation = I) in an assemblage of myriad shards of one’s own data.

The author appeals to readers to understand privacy in “dividual terms”. Dividuality draws its sustenance from how the fragmented, incoherent data about us, or our “self”, are stitched together in different contexts and made useful. As Arjun Appadurai points out, our individuality is put in a blender and chopped into multiple fragments as dividuals. The algorithm is the blender machine splicing up our individuality through datafication.

The growing concerns over privacy have resulted in debates that have now tapered off to conceding the futility of privacy or the right to be left alone. From the edifice of the personal sphere, privacy’s import has mutated to what the author says, “I have nothing to hide”. It is an act of defence couched in a denial mode. The dividual privacy operates in obscure and inscrutable methods under the diktats of algorithms. This has a close parallel with Aadhaar in India and the debates it has triggered.

The author maps out privacy through the theoretical frameworks of Foucault’s Panopticon, which situates surveillance on the (presumed) visuality or on the knowledge of visuality of being monitored. It is in the knowing of the virtual preying eyes that one cares to bring self-discipline. But, dividual privacy is statistical, born out of data, and non-visual. The dataveillance—surveillance functioning in despatialised digital networks—affords targeted individuals no capability to track and self-discipline themselves. The author quotes Matthew Fuller (professor of cultural studies, University of London), who says that “surveillance is a socioalgorithmic process”.

Data profiling

According to Cheney-Lippold, data lack sensibilities to assess the deservingness of an “identity” that they assign to individuals. Citing the app Gaydar, the author explains that identities assigned arbitrarily, by placing us into categories of sexuality that the data deem fit, do not reflect the lived experiences of individuals. The power dynamics involved in profiling us through data help the state to administer us easily.

Much as the tech giants would like to write off privacy, the asymmetrical algorithmic functioning reveals that the privacy debate is alive and that it needs to be discussed in “dividual terms”. Cheney-Lippold argues that dividual privacy could be practised to obfuscate the algorithm’s effort to normalise us into measurable types. To counter the onslaught of dataveillance, the author proposes that users should become proactive to appropriate dividual privacy from algorithms. The awareness of how we are treated and exploited as dividuals could help us in ordering our lives. Digital literacy or algorithm capital (similar to the French sociologist Pierre Bourdieu’s concept of capital) could turn the situation around but the proposition appears to be far too utopian.

The book lays out its arguments with adequate case studies. One never feels lost engaging with it. The terse and easy writing style is a huge draw. The book is flush with theories that are pointers to the emergent algorithmic and data studies, wherein the key highlights are the dangers lurking in data traps and the subjectivities produced in a controlled digital environment.

The well-designed book is equally academic material and an activist project. Placing it in the trajectory of the new-modern paradigm of technology-induced development does plunge the book into some minor criticism of it being biased against the left out. One may rescue the book by saying that its scope is limited and that its aim is to prepare us to take cognisance of the potential of algorithms. What it conspicuously lacks is resistance and emancipatory accounts of users in countering the imperialistic control of tech companies and the state and their larger spying mission.

Besides, it does not focus much on the discriminatory practices of algorithms that impact society’s most vulnerable communities. However, with the growing penetration and footprint of technologies, the book lays bare technocracy and statism in precise terms.

M. Shuaib Mohamed Haneef is Assistant Professor & Head in-charge, Department of Electronic Media & Mass Communication, School of Media and Communication, Pondicherry University.