Review of Stephen Baker’s The Numerati
Stephen Baker’s The Numerati, published in 2008, tells the story of our modern world’s “binarization”: how every individual is reduced to ones and zeroes through the trails of data we leave behind, which are subsequently gathered, analyzed and categorized by number crunchers, better known as data miners, in order to predict behavior. The group involved in this business is, however, no “illuminati.” Although in some cases their earnings may mark them as an elite, the Numerati are in fact the only group of people capable of dealing with the immense bulk of information we produce daily. The Numerati are computer scientists, mathematicians and engineers who structure this data through algorithms to “find similarities and patterns.” Ideologically, the Numerati’s interests go beyond mere mapping and behavioral forecasting; they also want to alter our behavior: make us work harder, buy more and become healthier. The Numerati even experiment on us with different ads, presented as “helpful suggestions, prescriptions, or marching orders,” to see how we respond to them. Baker tries to prove the existence of his Numerati by exploring their presence in the fields of economics, politics, ideology, medicine and finally even love, in which the internet of course serves as the cross-disciplinary, correlating source of information.
In the introduction Baker familiarizes us with Dave Morgan, who traces the patterns of internet users. Morgan’s whole business revolves around data; his second company, Tacoda, uses cookies to trace our surfing behavior from one site to the next. Tacoda gathers 20 billion leads that indicate what certain people are in the market for… every day.
“He’s taking analysis that once ran through an advertiser’s gut, and replacing it with science. We’re his guinea pigs—or groundhogs—and we never stop working for him.”
With the internet, the “prolific data” we produce is available to the commercial world. Through the character of Dave Morgan, Baker introduces us to his primary understanding of the Numerati and shows the historical shift from the 1980s onwards, when the computer chip became increasingly cheap and powerful. From then on, chips were able to record everything we do with our electronic devices; they are “fastidious note takers.” Combined into global streams of data, these bits of information unveil our routines and thus our behavior from day to day. That some of the data in the examples Baker gives is somewhat dated, given that the book was written in 2008, of course does not come as a surprise. It does however showcase Baker’s optimistic technocratic philosophy as he describes how computers changed society in 50 years from a “command-and-control economy [of mass production] to one driven by consumers.” He refers to an experiment which started in the 1950s to replace loan officers with computers. The loan officers had enough data to write entire “sociological monographs,” whereas “the computerized approach zeroed in on only a small set of numbers concerning bank balances, debts, and payment history.” On this data they based a “risk score” which was far more precise than the “gut” of the bankers. Surprisingly, a lot of people turned out to be more suitable for credit than expected, and so “the market for credit expanded.” However, we all know how this expansion of credit ended and how its detrimental consequences continue to reshape our financial world today.
In the first chapter, “Worker,” Baker emphasizes how much the computer has infiltrated our workplace. Former distractions are now being passed on by the computer to our bosses; it rats on us, “exposing our online secrets.” As in every chapter, Baker meets up with a Numerati relevant to the chapter’s subject. This time it is Samer Takriti, an Iraqi whose modeling of human workers inspired Baker to write about the Numerati. Takriti is among the top mathematicians of the world and focuses on the positive side of modeling through “stochastic calculus”: with enough information, your boss can let you thrive on tasks you yourself did not even know you could do. Baker argues that today, when a company is doing well, workers can get jobs and their value as workers rises, but when a company is not so fortunate, the workers who were hired last get kicked out first, privileging seniority over value. The Numerati have a different approach to this; Takriti wants to manage IBM personnel like “financial investments” by reducing IBM’s workforce to a coherent portfolio of skills, something only a computer would understand. Baker does wonder whether these numbers reflect reality, but he does not pursue the question. Baker ends the chapter by summarizing George Dantzig’s philosophy of how jobs and workers are now broken down into minutes: “they are broken into thoughts of the brain as labor is defined by knowledge and ideas.” This he calls job sharing, which allows workers to do jobs all over the world.
In the second chapter, “Shopper,” Baker elaborates on our consumption behavior. Shoppers’ data is also collected in stores through membership cards, although it is not analyzed in the same way as online, where even the movement of the mouse is monitored, memorized and reported:
“They can study our patterns of consumption, anticipate our appetites, and entice us to spend money.”
The chapter expert is Rayid Ghani, “a personal tutor for the idiot servants we know as computers,” who works on solving the complications of the shopper’s profile. “Loyalty cards” can increase profits by tracking our previous shopping patterns and suggesting the fastest route. This way, the 11 percent of items that we intended to buy but on average forget while shopping now get bought. Moreover, the supermarket manager can manipulate shoppers by lowering prices and promoting certain products through coupons, thereby steering a particular category of customers towards the targeted product. Worth mentioning is the restriction imposed on people on a budget, who probably should not receive promotions for a discounted product: every dollar they spend on the discounted product can no longer be spent on a full-price product, which hurts profits. Of course this greatly disadvantages the consumer. In the second part of the second chapter Baker finally manages to provide some in-depth analysis instead of only giving straightforward examples of the same point he already made in the introduction. Kumar sorts shoppers into buckets, barnacles and butterflies. The individual, the “me,” is hardly traceable in the data. We as consumers are grouped with other consumers who show the same shopping behavior; there is no “customized marketing.” It is not so much about who we are as about how we behave: our “consumer DNA,” composed of all the different variations of buckets. Our age, gender and ethnicity do not even matter in all this; what counts is simply our consumption behavior. Baker predicts that the Numerati, through the growing importance of behavioral data, will be able to solve cross-disciplinary problems, because a solution found in one discipline is often applicable to others. David Heckerman, for example, worked on an anti-spam system for email but was also able to apply it in the battle against HIV/AIDS.
Computers can be taught how to categorize by people who fill out forms indicating whether, and to what degree, something is “sporty” or “business.” This way, the computer learns an additional information tag. This is thus another task handed to the Numerati: teaching computers how to learn. Throughout the book Baker stresses the complexity of these analyses, which are programmed into algorithms.
In the third chapter, “Voters,” Josh Gotbaum enlightens us on how, within politics, those who are “consumed” by politics are also the ones who draw up the party program, under the assumption that everyone takes as much interest in the political domain as they do. The US Republicans have therefore started microtargeting voters, refreshing the issues and focusing on “desires closer to the heart than to the head.” This way, in a US election where 1 or 2 points can be decisive, identifying “swing voters” can ensure victory. This is all the more interesting now that the big categories that used to determine votes, such as race or neighborhood, are falling apart; a more precise collection of data must therefore be sought and found by mathematicians to ensure a correct prediction. The Numerati are not so much about accuracy or truth as about mapping categories, which will become more specific as more data becomes available in the future. They already “triumph if they come up with better, quicker, or cheaper answers than the status quo.”
“As the Numerati develop tools to model voters and measure the effectiveness of campaign spending – its yield in economic terms – political parties will be able to look at each election as a marketplace.” The votes that can make the difference are of course worth the most.
In chapter four, “Blogger,” Baker elaborates on the “blogosphere” as a source of our most personal data. Everyone is usable: whether they run blogs or read them, they all give off information. “Now that people […] are publishing their feelings about a host of products, it’s as if a universe of focus groups is forming online.” However, manually scanning all this information is impossible for us as human beings, so marketers leave this to computers. Chapter expert Howard Kaushansky, president of Umbria, a company that has automated the reading of blogs, millions a day, comments: “We turn the world of blogs into math.” (Thank you very much Stephen Baker, this is the 83rd time that you have made this point.) He goes on to say how consumers are sorted into simple categories such as “like” or “dislike,” which is very useful in finding out whether a commercial campaign has actually worked in launching a new product, something that used to be immeasurable in the good old analog days. The striking thing about blogging is that, as a result of its importance, the Numerati are now teaching their computers “to digest our words automatically,” marking “a new stage in market intelligence.” Baker displays his genius to us once more after meeting Nicolas Nicolov, when he finds out the importance of this thing called “context,” which of course adds to the difficulty of the algorithm. “If a laptop is big, it’s negative, but if a hard drive is big, it’s good.” Simply brilliant Stephen, thank you for your wisdom. Spam blogs, or “splogs,” are however an interesting issue. These are trash blogs created by machines in order to “entice Google’s robots to drop advertisements onto the page.” Today these kinds of forgers are countered by key generators etc. The problem is solved by looking at how blogs connect to each other: each topic forms an interconnected “constellation,” and blogs that are not connected to other related blogs are marked as spam.
For consumers and bloggers alike, it is not about the individual; it is about the opinion voiced in the post, which then acts as a survey for data miners.
In chapter five, “The Terrorist,” Baker meets up with Jeff Jonas in order to find out how a society can monitor itself yet remain free and uninhibited, even sinful, by using the tools of the Numerati. This chapter’s Numerati is James Schatz, the NSA’s chief mathematician. Because the US lacked accurate on-the-ground intelligence, it fought the terrorists electronically, through the internet. One billion dollars gave “data miners” the opportunity to explore one giant database of combined CIA and FBI intelligence in order to filter out the terrorists among US citizens. However, in terrorist mapping the dangers are much more severe than in marketing: a wrong accusation of terrorism is, especially in patriotic America, a very serious matter. But because national fear was still greatly fueled by the aftermath of 9/11 when Baker wrote the book, “terrorist tracking” does take place through data mining. Baker distinguishes three problems in this process:
1. There is a lack of historical background: little can be said when current activity cannot be compared against past behavior.
2. Suspected terrorists do their best to wipe out their internet trails, leaving as little digital trace as possible.
3. Finally, “failure in this realm of data mining can destroy lives.”
For the Numerati to figure out who the “outliers” are, they first have to establish what is “normal”; the NSA, especially after the fall of the Berlin Wall, had to “figure out humans.”
This is actually the first chapter in which Baker gets critical, reflecting on the failure of the government to filter out the terrorists of 9/11. He justly argues that the terrorists involved in 9/11 were “hiding in plain sight”: they booked under their own names, were already on a government watch list, and were listed in address and phone books. Nevertheless, the battle to close the intelligence gap continues: “In essence, we compensate for our shortcomings in languages and on the ground intelligence with a heavy dose of unproven technology.” As in every chapter, Baker ends with an extremely vague perspective on where “all this” is going, this time by quoting Jonas: “We technologists had better spend a little more time thinking about what we’re creating.”
In the sixth chapter, “Patient,” Baker explains the practice of Eric Dishman, who “is working feverishly to replace the fog, forgetfulness, and wishful thinking of human memory with minute by minute updates pouring in from electronic sensors.” Dishman develops machines that monitor people’s medication intake, pulse and so on, so that we can spy on ourselves in order to live “healthier, happier and longer lives.” Because most doctors unfortunately have busy schedules, it is up to the Numerati to filter all the data through math. Personal algorithms need to be developed for the system to work properly. Dishman’s mission is to use constant monitoring to change the very nature of health care into continuous health surveillance, going from crisis response to prediction. As in the previous chapter, Baker comments on the future of medical data modeling as well. He suggests that, in an age of exploding medical data and analysis, an individual should consider himself lucky to be able to pay for “the privilege of remaining to one degree or another in the dark.”
In the seventh and final chapter, “Lover,” Baker embarks on a social experiment with himself and his wife to see whether they can be matched through a dating site. Within the conditions of data mining, Baker then asks how scientists break down love into something “that can be fitted into a statistical hierarchy?” He explains how the love industry fits the conditions of love into algorithms. The Numerati are even active in the field of love. Baker does notice, however, that a lot of profiling within the love sector is mere self-promotion in which the Numerati are “only collaborators.”
The Numerati was in fact merely a story Baker wrote for Business Week, which developed into a cover story that inspired Baker to expand it into a book. Despite the various interesting insights given through a number of interviews with renowned experts, the book is not really more than an extended 216-page article. His material is constant, continuous; throughout the book you feel like you are constantly on the same page as the author. After an extensive explanation of the Numerati phenomenon in the introduction, the rest of the book fails to complement the thesis and only slightly engages the reader’s wit through a series of somewhat critical questions on morality, privacy and future perspectives. In short: the book sheds very little new light on the topic; all interviews are with people who, according to Baker, already are Numerati or are in the data mining business, so logically they all tell the same story.
What Baker does throughout the book is apply one and the same trick: he looks at the areas of society where data mining is most effective, applicable and already in progress, and then tests his theory of the Numerati on people who fit the profile of a Numerati. In this way, Baker’s theory is never proven wrong, nor does it have any critical edge to it, as all questions are posed to Numerati-like characters who defend their own position as data miners.
For someone who is able to write an entire book on the collection, processing and analysis of data, it is worrying that, despite his interviews with intriguing and prominent figures of the Numerati circle, the author himself fails to put these valuable resources to good use. In every chapter Baker poses the same insignificant questions that eventually lead up to a very obvious conjecture about the Numerati’s influence, repeated constantly and lacking any imaginative strength. Nor does he succeed in thinking beyond the manipulation of data by blog spam. For example, if the Numerati really do base everything on numbers, but these numbers are forged, then society or even an individual’s position can be manipulated for someone’s own profit, and reality will shift. Furthermore, considering the theory of the butterfly effect, in which the flapping of a butterfly’s wing in Brazil could cause a tornado in the US, it seems highly doubtful that there are computers that can map this entire process and take all of this into account.
All in all, the magazine writing style makes it a gentle, light and easy read. For any academic who wants to start their research in the field of data mining, this book will serve as a fine introduction.