Data-Dumped: Understanding Data (and Being Understanding) in the Time of Ashley Madison
Ever since the hacker group, self-dubbed the ‘Impact Team,’ published 9.7 gigabytes of information from an estimated 37 million Ashley Madison users on the dark web, the extramarital dating site has been an unavoidable topic of discussion. It seems everyone, from Reddit users to security experts, has added to the conversation, weighing in on the privacy, liability and moral issues behind the leak on August 21st, 2015 (Wired.com).
In a statement placing moral blame for the leak on Avid Life Media (ALM), Ashley Madison’s owner, the hackers cautioned us to keep in mind that the cheating site is “a scam with thousands of fake profiles” and that any data not signed with Impact Team’s PGP signature is fake. Beyond that, it was left to the web at large to decipher and make sense of the sensitive information exposed. As the names, emails and transaction information of users were pulled from the raw data and made available for download on torrent-sharing sites, the ensuing media storm raised the question: whose responsibility and right is it to analyze and aggregate data when it is so delicately linked to the personal lives of the people it exposes? And is it fair that the hackers’ moral vigilantism extends to releasing the data, but not to offering any real clues on how to understand it?
Making the Illegible… More Illegible?
In the hours and days following the data dump, a series of websites sprang up allowing users to check whether their account – or, more dauntingly, the account of a loved one – had been compromised in the breach. Rather than download and scour the files on their own, users can enter email addresses into a search bar to quickly find out if their privacy has been compromised.
While some of these sites, including Trustify, ask users to verify that it is their own email address being searched, other sites immediately inform curious visitors if an email address is listed in their files.
As pointed out by The Awl’s John Herrman, Ashley Madison doesn’t verify all of its account information – meaning an email address could be linked to an account without the knowledge of its owner. Perusing the data dump in the quest for clarity may just bring forward more turbulence and confusion.
As addressed in Impact Team’s statement, just because someone made an account doesn’t necessarily mean they were sexually unfaithful – but on a website whose motto reads “Life is short. Have an Affair,” the mere illusion of an account may be enough to incriminate someone both personally and professionally.
This is, perhaps, the most problematic part of our response to the Ashley Madison data dump: that in making users’ email addresses more accessible to the public, we demonstrate a reckless irreverence for the details of the stories and incentives behind each account. We are experiencing, in a sense, the paradox articulated by the scholar Erik Davis that “our society has come to place an enormous value on information, even though information itself can tell us nothing about value” (Davis 82).
The issue is, in part, one of context. By framing the security breach as the release of ‘raw data,’ we suggest that the information provided is unfiltered or unaltered by a mediating force. But as Lisa Gitelman and Virginia Jackson state in their introduction to “Raw Data” Is an Oxymoron, data is hardly neutral. Rather, “data need to be imagined as data to exist and function, and the imagination of data entails an interpretive base” (Gitelman 3). To understand the sensitivity and precariousness of the Ashley Madison data dump, we need an “enriched information literacy” and the analytical toolkit to properly understand and contextualize aggregated data within such a sensationalized news story (Van Dijck 586).
Combining Accessibility with Ethics
Troy Hunt, the security expert behind https://haveibeenpwned.com/, a site that allows users to check whether their privacy has been compromised in any major security breach, states that with the Ashley Madison scandal there’s “no escaping the human impact” of the data released. As such, he put special measures in place to protect victims from further exposure – current verified users of his site will get a direct email notification if their data was exposed, and new members will need to verify their account before checking. As for anonymous users of his site, Hunt introduces a new data classification, which he calls “the concept of the sensitive breach.” In an effort to diminish the public lynching possible in this type of news story, “sensitive data” will not be publicly accessible to anonymous users of his site in the same way that less personal information has been in the past.
Following Hunt’s lead, hopefully we can remember the subjective, sensitive nature of the information we are dealing with, and better equip ourselves to discuss and understand the social, emotional and technological challenges of an internet where our secrets will never feel quite as safe again.
Sources Cited and Consulted
Davis, Erik. 2004. TechGnosis: Myth, Magic + Mysticism in the Age of Information. New York: Serpent’s Tail, pp. 81-92.
Gitelman, Lisa (ed.). 2013. “Raw Data” Is an Oxymoron. Cambridge: MIT Press. Introduction, p. 3.
Herrman, John. “Early Notes on the Ashley Madison Hack.” Michael Macher, 18 Aug. 2015. Web. 14 Sept. 2015.
Hunt, Troy. “Here is How I’m Going to Handle the Ashley Madison Data.” Troy Hunt, 29 July 2015. Web. 14 Sept. 2015.
Van Dijck, José. 2010. “Search Engines and the Production of Academic Knowledge.” International Journal of Cultural Studies 13 (6): 574–92. doi:10.1177/1367877910376582.
Zetter, Kim. “Hackers Finally Post Stolen Ashley Madison Data.” Wired.com. Conde Nast Digital, 8 Aug. 2015. Web. 10 Sept. 2015.