Facebook’s ‘Rosetta’ System: A Form of Safety Implementation or Data Exploitation?

By: Layal Boulos

On: September 23, 2018

By Privacy International

Facebook has announced its latest technological development, code-named Rosetta, which detects and recognizes text in images and videos. The company claims that Rosetta benefits the social network's community through the safety features it provides, but what about the negative implications that come with it, which the majority seems to be ignoring?

What is Rosetta and How Does it Work?
Rosetta retrieves text through Optical Character Recognition (OCR), a technique in which software recognizes textual information in images. Facebook claims that its machine learning system, Rosetta, is capable of extracting text from “over a billion images and video frames in a wide variety of languages” (Moon).

The text extraction occurs in two steps. The first step is text detection, in which the system identifies rectangular areas of the image that are likely to contain text.

Rosetta detecting texts on images

The second step is text recognition, in which a convolutional neural network (CNN), a type of machine learning model commonly used to analyze images, deciphers the words in those rectangular regions (Sivakumar et al.).
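To make the two-step structure concrete, here is a minimal toy sketch in Python. It is not Facebook's implementation: the "image" is assumed to be a grid of characters in which "." marks empty background, and the names `Box`, `detect_text_regions`, `recognize_text`, and `extract_text` are all hypothetical. A real system would use trained neural networks for both steps; the toy only shows how detection hands rectangular regions over to recognition.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """A rectangular region: one row, columns [start, end). Hypothetical type."""
    row: int
    start: int
    end: int


def detect_text_regions(image: List[str]) -> List[Box]:
    """Step 1 (detection): find regions that likely contain text.

    Toy assumption: any run of non-background cells in a row is a region.
    """
    boxes = []
    for r, line in enumerate(image):
        c = 0
        while c < len(line):
            if line[c] != ".":
                start = c
                # Extend the region until the background resumes.
                while c < len(line) and line[c] != ".":
                    c += 1
                boxes.append(Box(r, start, c))
            else:
                c += 1
    return boxes


def recognize_text(image: List[str], box: Box) -> str:
    """Step 2 (recognition): turn one detected region into a string.

    A real system would run a CNN here; the toy simply reads the cells.
    """
    return image[box.row][box.start:box.end]


def extract_text(image: List[str]) -> List[str]:
    """Run detection first, then recognition on each detected region."""
    return [recognize_text(image, box) for box in detect_text_regions(image)]


toy_image = [
    "..........",
    "..HELLO...",
    "..........",
    ".MEME..OK.",
]
print(extract_text(toy_image))  # ['HELLO', 'MEME', 'OK']
```

The point of the split is that the recognizer never sees the whole image, only the regions the detector proposed, so a region missed in the first step can never be read in the second.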

What Does Rosetta Offer to the Social Network?

Most articles reporting on Rosetta’s functions lead with the attention-grabbing headline that it can read memes. This makes it seem as though reading memes is Rosetta’s sole purpose, but is it?

Facebook declares that Rosetta has several other practical uses, such as enhancing users’ experiences when searching for relevant photos and making the platform more accessible to the visually impaired (Sivakumar et al.). Essentially, Rosetta would allow Facebook to flag inappropriate content and keep the network safe by identifying things like hate speech and preventing its dissemination (Matney). This is especially necessary given our generation’s heavy use of social media as a tool for political influence and opinion sharing.

What about the Negative Implications on our Society?

Most of us seem to be blinded by the positive aspects of Rosetta. But what about the other side of the coin? It’s best to put it as Neil Postman did: “Is it not possible that behind the noon-day brightness of technological ingenuity there lurks something dark and sinister, something that casts a terrible shadow over the better angels of our nature?” (Postman). With that said, since one of the most concerning aspects of machine learning is its interference with user privacy (Dickson), couldn’t Rosetta be just another way of extracting our data to enhance targeted advertising?

In 2016, Facebook introduced ‘DeepText’, a deep learning-based text understanding engine. It is capable of understanding with “near-human accuracy the content of several thousand posts per second” in 20 different languages. Ironically, Facebook claimed that it “helps improve people’s experiences” and removes unwanted content such as spam, hate speech, and fake accounts (Abdulkader et al.). Doesn’t that sound familiar?

Facebook has repeatedly stated that the benefit of using such advanced technology is to keep the community safe. So, you would expect tighter monitoring of such issues since DeepText was introduced two years ago, but this hasn’t been the case. One of the most recent examples is Facebook’s role in the outbreak of violence in Myanmar, which was heavily aided by hate speech, the leaking of private data, and the spread of fake news against the Rohingya Muslim minority through the platform (Safi).

We surely cannot ignore the fact that Facebook has been releasing transparency reports and acknowledging the removal of harmful content to a certain degree. However, other statistics indicate that the share of fake profiles stood at 7% in 2016 and doubled to 14% in 2017 (Hutchinson). Consequently, it doesn’t seem like Facebook’s main concern in creating such systems is to eradicate harmful content.

Instead, what we have seen since 2016 is a spike in Facebook’s advertising revenue. Between 2015 and 2016, revenue increased by $9,806 million, which is approximately equal to the combined revenue of the first four years illustrated (“Facebook’s”). While other factors might certainly have contributed to the significant rise, DeepText’s ability to understand user preferences and the context of the words users type could have led to better data analysis. This, in turn, would have enabled better personalized ads, contributing to an overall increase in advertising revenue. What does this tell us about the potential of DeepText and Rosetta combined to boost these numbers in the near future?

Year	Advertising revenue (USD millions)	Annual increase (USD millions)
2009	764	n/a
2010	1,868	1,104
2011	3,154	1,286
2012	4,279	1,125
2013	6,986	2,707
2014	11,492	4,506
2015	17,079	5,587
2016	26,885	9,806
2017	39,942	13,057

Facebook’s advertising revenue worldwide from 2009 to 2017 (in million U.S. dollars)
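The arithmetic behind the table can be checked with a short Python sketch. The revenue figures are copied from the table above; the variable names are only illustrative.

```python
# Facebook's worldwide advertising revenue in USD millions, from the table above.
ad_revenue = {
    2009: 764,
    2010: 1868,
    2011: 3154,
    2012: 4279,
    2013: 6986,
    2014: 11492,
    2015: 17079,
    2016: 26885,
    2017: 39942,
}

years = sorted(ad_revenue)

# Annual increase: each year's revenue minus the previous year's revenue.
annual_increase = {
    year: ad_revenue[year] - ad_revenue[prev]
    for prev, year in zip(years, years[1:])
}

# Combined revenue of the first four years illustrated (2009 to 2012).
first_four_total = sum(ad_revenue[year] for year in years[:4])

print(annual_increase[2016])  # 9806
print(first_four_total)       # 10065
```

The 2015 to 2016 increase of 9,806 comes to roughly 97% of the 10,065 earned across 2009 to 2012 combined, which is the comparison the paragraph above relies on.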

What Can We Expect from Rosetta in the Future?

Ultimately, since Facebook has failed to deliver on the safety promises it made for DeepText, I can’t help but question whether the same will apply to Rosetta. Only four months ago, Facebook’s vice president of product management stated that technologies like artificial intelligence are still “years away” from being able to effectively remove harmful content (Perez). Therefore, unveiling Rosetta and declaring its bodyguard-like functions so soon after such a statement makes me question how much we should rely on such a technology.

The description of Facebook as “a capitalist corporation focused on accumulating capital” highlights the use of social media to extract users’ data for economic benefit (Fuchs 148). Manufacturing Consent indicates that traditional mass media sold their readership to advertisers, turning readers into the product (Herman and Chomsky 303). Similarly, with the development of new media, the algorithmic culture allows Facebook to sell its users as products as well (Joler et al.).

Corporations have been known to use machine learning techniques, just like Rosetta or DeepText, to gain insight into the structure of data (Witten et al. 13). This allows an establishment like Facebook to maximize its profits, as it gains further insight into users’ behavior in order to create better targeted ads and increase ad rates (Yeung 3).

For years we’ve had software that violates ethical standards by constantly collecting data without users’ full knowledge, but the rapid advancement of technology in the 21st century amplifies the possibilities of using such systems for the benefit of an organization (Allen et al. 52). This increasingly raises ethical and privacy concerns about collecting personal data for marketing purposes (Mitchell 6). Will there ever come a time when we’ll be put first? With Rosetta in the picture, it seems as though the answer remains uncertain.

Works Cited

Abdulkader, Ahmad, et al. “Introducing DeepText: Facebook’s text understanding engine.” Facebook Code, 2016. Accessed 13 Sep. 2018. https://code.fb.com/core-data/introducing-deeptext-facebook-s-text-understanding-engine/

Allen, Colin, et al. “The Importance of Machine Ethics.” Machine Ethics. Eds. Michael Anderson and Susan Leigh Anderson. Cambridge: Cambridge Univ. Press, 2018. pp. 47–62. Google Books. Web. Accessed 17 Sep. 2018.

Dickson, Ben. “The darker side of machine learning.” TechCrunch, 2016. Accessed 19 Sep. 2018. https://techcrunch.com/2016/10/26/the-darker-side-of-machine-learning/

“Facebook’s Advertising Revenue Worldwide from 2009 to 2017 (in Million U.S. Dollars).” Statista.com, The Statistics Portal, Jan. 2018. Accessed 17 Sep. 2018. https://www.statista.com/statistics/271258/facebooks-advertising-revenue-worldwide/

Fuchs, Christian. “An Alternative View of Privacy on Facebook.” Information, vol. 2, no. 1, Sept. 2011, pp. 140–165. MDPI, doi:10.3390/info2010140.

Herman, Edward S., and Noam Chomsky. “Conclusions.” Manufacturing Consent: the Political Economy of the Mass Media. New York: Pantheon Books, 2002, pp. 297–307. Google Books. Web. Accessed 21 Sep. 2018

Hutchinson, Andrew. “Facebook Outlines the Number of Fake Accounts on Their Platform in New Report.” SocialMediaToday. 16 May 2018. Accessed 19 Sep. 2018. https://www.socialmediatoday.com/news/facebook-outlines-the-number-of-fake-accounts-on-their-platform-in-new-repo/523614/

Joler, Vladan, et al. “Quantified Lives on Discount.” SHARE LAB, Share Foundation, 17 Mar. 2017. Accessed 8 Sep. 2018. https://labs.rs/en/quantified-lives/

Matney, L. “Facebook’s Rosetta system helps the company understand memes.” TechCrunch. 11 Sep. 2018. Accessed 12 Sep. 2018. https://techcrunch.com/2018/09/11/facebooks-rosetta-system-helps-the-company-understand-memes/

Mitchell, Tom M. “The Discipline of Machine Learning.” Google Scholar. 2006, Carnegie Mellon University. Accessed 18 Sep. 2018. http://www.cs.cmu.edu/~tom/pubs/MachineLearning.pdf

Moon, Mariella. “Facebook’s Rosetta AI Can Extract Text From A Billion Images Daily.” Engadget, 2018. Accessed 14 Sep. 2018. https://www.engadget.com/2018/09/11/facebook-rosetta-ai-translation/?guccounter=1

Perez, Sarah. “Facebook’s new transparency report now includes data on takedowns of ‘bad’ content, including hate speech.” TechCrunch. May 2018. Accessed 19 Sep. 2018. https://techcrunch.com/2018/05/15/facebooks-new-transparency-report-now-includes-data-on-takedowns-of-bad-content-including-hate-speech/

Postman, Neil. “The Humanism of Media Ecology.” Proceedings of the Media Ecology Association, Volume 1. Inaugural Media Ecology Association Convention, 16 Sept. 2018, New York, Fordham University, www.mediaecology.org/publications/MEA_proceedings/v1/humanism_of_media_ecology.html

Privacy International. “Data Exploitation.” YouTube. 22 Feb. 2017. Accessed 17 Sep. 2018. https://www.youtube.com/watch?v=8CKJtfLV6HU

“Rosetta Detecting Texts on Images.” Facebook Code, 11 Sept. 2018, https://code.fb.com/ai-research/rosetta-understanding-text-in-images-and-videos-with-machine-learning/

Safi, Michael. “Revealed: Facebook Hate Speech Exploded in Myanmar during Rohingya Crisis.” The Guardian, Guardian News and Media, 3 Apr. 2018, https://www.theguardian.com/world/2018/apr/03/revealed-facebook-hate-speech-exploded-in-myanmar-during-rohingya-crisis

Sivakumar, Viswanath, et al. “Rosetta: Understanding Text In Images And Videos With Machine Learning.” Facebook Code, 2018. Accessed 13 Sep. 2018. https://code.fb.com/ai-research/rosetta-understanding-text-in-images-and-videos-with-machine-learning/

Witten, Ian H., et al. “Introduction to Data Mining.” Data Mining: Practical Machine Learning Tools and Techniques. 4th ed. Amsterdam: Morgan Kaufmann, 2017. 3-38. Google Books. Web. Accessed 18 Sep. 2018

Yeung, Karen. “Algorithmic Regulation: A Critical Interrogation.” Regulation & Governance, 2017, pp. 1–19. Wiley Online Library, Web. Accessed 17 Sep. 2018 doi:10.1111/rego.12158.

Tags: advertising, data privacy, facebook, Machine Learning
