On: October 3, 2021
About Issaka Adams

Big Data is not the same as Good Data. Credit: https://www.youtube.com/watch?v=SrTncz6qCKU


Using data from digital platforms to produce knowledge is booming in academic circles (Boyd & Crawford, 2012). It is an advantage afforded by the internet and the various media on it. However, with the opportunity come risks. One such risk is fake accounts. While platform owners have promised to remove such accounts, there is no denying that the danger still lurks. Knowing this allows researchers who rely on platform data to minimise the danger while maximising the advantage.

This article thus draws attention to the risk that fake accounts on digital platforms pose to researchers who rely on data from these platforms to produce knowledge. Using Lisa Gitelman’s “Raw Data” Is an Oxymoron and Danah Boyd and Kate Crawford’s Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon as base texts, the article first introduces a short conceptual framework on the theory of affordance before summarising the two texts. It then uses fake accounts on Facebook to contextualise the problem and ends with a brief remark.

What Do Digital Platforms Afford Researchers?

The theory of affordance has been used in the humanities, the social sciences, and science and technology studies (Evans, Pearce, Vitak, & Treem, 2017; Nagy & Neff, 2015). This popularity has generated theoretical and philosophical debate about the meaning of the term (Davis & Chouinard, 2016). For the purpose of this article, affordance is “what things furnish, for good or ill” (Gibson, 1966, p. 285). Gibson elaborates using the environment and the animal, stating “The affordances of the environment are what it offers the animal, what it provides or furnishes, either for good or ill” (Gibson, 1979, p. 127). Affordance thus works as a relationship between a subject and an object. The subject acts, but not independently of the object. What the subject gets from the action is what the object permits. This limits the subject: it can either accept what is provided or refuse, and refusal means the goal is not achieved. Digital platforms have owners, who build them first to satisfy their own interests. This means researchers using data from these platforms get only what is made available. They are not privy to the complex processes that happen before the data is generated. The underlying point of conceptualising the problem this way is for researchers to be aware of the limitation: like a driver on the highway, they can steer only in the direction the road permits.

Why Raw Data Is an Oxymoron and Why We Should Ask Critical Questions about Big Data

As we have seen from the brief explication of affordance, there are constraints on researchers using online data. Researchers only get what is available, and they have little control over the processes that generate the data. This section brings to the fore the nature of these constraints, relying on the works of Lisa Gitelman (2013) and of Danah Boyd and Kate Crawford (2012). Gitelman critiques the dominant and uncritical narrative, especially among quantitative digital researchers, that data “are transparent, that information is self-evident, the fundamental stuff of truth itself” (Gitelman, 2013, p. 3). She echoes Lev Manovich’s (2002) assertion that “Data [do] not just exist” and that they have to be “generated.” This is an ontological critique. If data is generated through complex interaction between humans and machines, as is the case in digital research, it is difficult to hold the simplistic position that it is without weakness.

Boyd and Crawford question the frenzied belief that Big Data is the “holy grail” of modern society by asking “Will large-scale search data help us create better tools, services, and public goods?” (Boyd & Crawford, 2012, p. 662). Indeed, this is a complex socio-technical problem that cannot be answered easily. Using Twitter as an example, Boyd and Crawford reveal how Big Data can be flawed before it even gets into the hands of researchers. They write “Some users [on Twitter] have multiple accounts, while some accounts are used by multiple people. Some accounts are ‘bots’ that produce automated content without directly involving a person” (Boyd & Crawford, 2012, p. 669). However trivial this revelation may seem, it is a crucial factor in determining the reliability of research produced from such data. It means some of the data may come from accounts that are not the intended respondents at all, which gives us flawed results in the end. In the next section we will see this problem in detail on Facebook.

Billions of Fake Accounts on Facebook

In its report for the second quarter of 2021, Facebook said its daily users had reached “1.91 billion on average for June 2021, an increase of 7% year-over-year” (Facebook, 2021). The numbers look impressive, but the crucial question is whether all these users meet the ideal of one user, one account. Facebook estimates that 5% of its monthly users are fake, but some sources say the figure could be as high as 20% (Nicas, 2020). This is exactly the contention of Boyd and Crawford (2012) when they mention “‘bots’ that produce automated content without directly involving a person”.
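To make the stakes of these percentages concrete, the following is a minimal sketch of how fake accounts could distort a simple proportion measured on platform data. Only the 5% and 20% fake-account figures come from the sources cited above. The function name, the 30% and 90% behaviour rates, and the assumption that a research sample contains fake accounts at the platform-wide rate are all hypothetical and chosen purely for illustration.

```python
def observed_share(true_share: float, fake_fraction: float, fake_share: float) -> float:
    """Share of sampled accounts showing a trait when some accounts are fake.

    true_share:    share of genuine users with the trait (hypothetical)
    fake_fraction: fraction of sampled accounts that are fake
    fake_share:    share of fake accounts showing the trait (hypothetical)
    """
    return (1 - fake_fraction) * true_share + fake_fraction * fake_share


# Hypothetical scenario: 30% of genuine users share a given link,
# while 90% of fake accounts do.
TRUE_SHARE = 0.30
FAKE_SHARE = 0.90

# 5% is Facebook's own estimate; 20% is the upper figure in Nicas (2020).
for fake_fraction in (0.05, 0.20):
    observed = observed_share(TRUE_SHARE, fake_fraction, FAKE_SHARE)
    bias = observed - TRUE_SHARE
    print(f"fake={fake_fraction:.0%}  observed={observed:.1%}  bias={bias:+.1%}")
```

Under these assumed numbers, a researcher would measure 33% instead of 30% if 5% of sampled accounts were fake, and 42% if 20% were fake. The size of the distortion depends entirely on how differently fake accounts behave from genuine users, which is usually unknown to the researcher.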

In 2019, Facebook’s Transparency Report revealed that it took down 5.4 billion fake accounts, and in 2018 it deleted 3.3 billion accounts (Davis, 2019). In its second quarter report for 2021, Facebook said it deleted 1.7 billion fake accounts, admitting that the problem is becoming more complex due to the sophistication of the adversaries behind them (Facebook, 2021). This is the real problem Facebook faces: detecting fake accounts. It is not a problem that can be solved easily.

Writing in The Conversation, Jeanna Matthews, professor of Computer Science at Clarkson University, explains that even for her as an expert, it is difficult to detect some of these fake accounts on Facebook, as they look as real as genuine users (Matthews, 2020). It is clear, then, that the issue is not trivial. It is a pervasive problem that creeps silently onto digital platforms with masses of users, such as Facebook. Of course, this is an issue Facebook would not be comfortable talking about. But for researchers to whom Facebook affords a platform for collecting data to produce knowledge, this is an important issue to keep a close eye on.

Concluding Remarks

So far, we have seen what digital platforms offer researchers: only what is available. Lisa Gitelman’s “Raw Data” Is an Oxymoron and Danah Boyd and Kate Crawford’s Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon showed that data from digital platforms have to be scrutinised before being used in knowledge production. The number of fake accounts on Facebook, together with an expert’s view of their sophistication, revealed an elusive situation that poses a great challenge to data from digital platforms. This article thus encourages researchers to be critical and aware of the shortcomings of digital data before using them. Alternatively, researchers can scale digital data down to the point where it is possible to contact the individuals who produced them. For example, in exploring the affects of Facebook algorithms on users, Taina Bucher (2017) tracked the conversation on Twitter before contacting those involved individually for interviews. This ensured that data from real people were collected for the study.

Reference List

Boyd, D., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662-679. DOI: 10.1080/1369118X.2012.678878

Bucher, T. (2017). The algorithmic imaginary: Exploring the ordinary affects of Facebook algorithms. Information, Communication & Society, 20(1), 30-44. DOI: 10.1080/1369118X.2016.1154086

Davis, J. L., & Chouinard, J. B. (2016). Theorizing affordances: From request to refuse. Bulletin of Science, Technology & Society, 36(4), 241-248. DOI: 10.1177/0270467617714944

Evans, S. K., Pearce, K. E., Vitak, J., & Treem, J. (2017). Explicating affordances: A conceptual framework for understanding affordances in communication research. Journal of Computer-Mediated Communication, 22, 35-52.

Gibson, J. J. (1966). The senses considered as perceptual systems. Boston, MA: Houghton Mifflin.

Gibson, J. J. (1979). The ecological approach to visual perception. Boston, MA: Houghton Mifflin.

Gitelman, L. (Ed.). (2013). “Raw data” is an oxymoron. MIT Press.

Manovich, L. (2002). The language of new media. MIT Press.

Nagy, P., & Neff, G. (2015). Imagined affordance: Reconstructing a keyword for communication theory. Social Media + Society, 1(2). DOI: 10.1177/2056305115603385

Website Sources

Facebook. (2021, July 28). Facebook Reports Second Quarter 2021 Results. Investor.Facebook.Com. https://investor.fb.com/investor-news/press-release-details/2021/Facebook-Reports-Second-Quarter-2021-Results/default.aspx

Nicas, J. (2020, December 8). Why can’t the social networks stop fake accounts? The New York Times. https://www.nytimes.com/2020/12/08/technology/why-cant-the-social-networks-stop-fake-accounts.html

Davis, M. (2019, November 20). Billions of fake accounts: Who’s messaging you on Facebook? Big Think. https://bigthink.com/the-present/facebook-banned-accounts/

Matthews, J. (2020, June 24). How fake accounts constantly manipulate what you see on social media – and what you can do about it. The Conversation. https://theconversation.com/how-fake-accounts-constantly-manipulate-what-you-see-on-social-media-and-what-you-can-do-about-it-139610
