Truthy: Policing Misinformation, One Meme-ing Tweet At a Time

On: October 18, 2010
About Ekaterina Yudin
A New Yorker. An entrepreneur. A New Media Master’s student at the Universiteit van Amsterdam. A media and film junkie, intrigued but apprehensive of our digital future. A curious explorer, visualizer, and wanderer of the ever-evolving and innovating world and web. A skier, scuba diver and lover of all outdoor adventures. A happy cyclist and supporter of good public transportation. A live music enthusiast. A sticky rice and mango addict.


“Swiftboaters beware!”

The battle to control Congress is on and this election year the truth is about to get Truthier. Twitter – the social media network, twenty-four-hour news site, conversation and blogging platform, wedding and death announcement site, gossip patrol, and the general online information center that connects all of us across the world — is also the new frontier in political campaigning.

Just in time for yet another season of the fabrications that usually beleaguer the polarized and partisan debates and elections in the United States, social media and science are attempting to come to the rescue – in real time. The internet has already become a massive organizing tool for people around the world on various issues, from politics to humanitarian fundraising, and this year it will also double as a real-time lie detector with the help of the Twitter API and the enormous flow of data it carries. With the mid-term elections less than three weeks away, the just-launched Truthy project (an infographic, a website and an algorithm) is here to weed out the astroturfing and flat-out lies that spread like viruses on the web, particularly memes that are propagated through Twitter (and eventually through Google search as well, because of the way rankings operate and because Google’s real-time results are often generated from tweets).

A Twitter-based research tool, Truthy is a combination of data mining, social network analysis and crowdsourcing. (The project gets its name from a term popularized by Stephen Colbert, “truthy” being lies calculated to deceive, because they take some nugget of truth and twist it into an unrecognizably muddy form [Cliff Kuang].) It was conceived by a group of researchers at Indiana University’s Center for Complex Networks and Systems Research to help uncover the deceptive tactics and misinformation that will surely crowd the world wide web leading up to the Nov 2nd elections. According to Filippo Menczer, who specializes in the modeling of meme explosions and is one of the researchers heading the project, the Truthy system will evaluate thousands of tweets an hour to identify new and emerging bursts of activity around memes of various flavors, with today’s flavor being politics. By using the Truthy system to analyze and map the diffusion of information on Twitter, researchers will gain data and statistics that aid the study of social epidemics, showing just how memes propagate through the Twittersphere and exactly what causes a burst of popularity.

Though the research is still in its early stages, Truthy will ideally be able, with this information in hand, to tell the difference between memes that arise organically and memes that are engineered as public relations campaigns across industries. But the Truthy algorithm cannot do this alone: to help identify suspicious memes, the system is being trained and “supplemented” with the power of the people, leveraging crowdsourcing by having users flag suspicious hashtags (which often become memes) with a simple click of the Truthy button.

How Truthy works:

Once a user reports a Twitter hashtag as ‘truthy’, web crawlers at Truthy begin following the hashtag, tracking who is retweeting it, and using language algorithms to gauge the sentiment (angry or depressed, for example). Other stats that become available are who the most common retweeter is, who the most influential retweeter is, and how long the retweeters have been active on Twitter. Conveniently enough, all of these stats (for each meme) are presented in a dashboard on the Truthy site (Cliff Kuang).
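For the curious, here is a rough idea of how per-hashtag stats like these could be computed. This is only a minimal sketch in Python, not Truthy's actual code: the `Retweet` record, the `meme_stats` helper and the sample accounts are all made up for illustration, and "most influential" is assumed here to mean the account whose tweets get retweeted most often. Sentiment analysis is left out entirely.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass(frozen=True)
class Retweet:
    """One retweet of a flagged hashtag (hypothetical record layout)."""
    retweeter: str          # account that passed the message on
    source: str             # account whose tweet was retweeted
    retweeter_joined: date  # when the retweeting account was created


def meme_stats(retweets, today):
    """Summarise who is spreading one hashtag (illustrative helper only)."""
    if not retweets:
        raise ValueError("need at least one retweet")
    # How often each account retweets, and how often each account is retweeted.
    by_retweeter = Counter(r.retweeter for r in retweets)
    by_source = Counter(r.source for r in retweets)
    # One join date per distinct retweeting account.
    joined = {r.retweeter: r.retweeter_joined for r in retweets}
    ages = [(today - d).days for d in joined.values()]
    return {
        "most_common_retweeter": by_retweeter.most_common(1)[0][0],
        # Assumption: "influence" = being retweeted the most.
        "most_retweeted_account": by_source.most_common(1)[0][0],
        "median_account_age_days": median(ages),
    }


# Invented sample data: two brand-new accounts amplifying one source.
sample = [
    Retweet("bot_a", "campaign_hq", date(2010, 10, 1)),
    Retweet("bot_a", "campaign_hq", date(2010, 10, 1)),
    Retweet("bot_b", "campaign_hq", date(2010, 10, 2)),
    Retweet("alice", "bot_a", date(2008, 3, 15)),
]
stats = meme_stats(sample, today=date(2010, 10, 18))
print(stats)
```

A hashtag pushed mostly by a handful of accounts that are only days old, all retweeting the same source, is the kind of pattern that would look more engineered than organic.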

The visualization of all this data should make it much easier to understand how memes spread on Twitter and to see a pattern in how a particular tweet evolves, how it starts in the first place and who is doing the distributing. That in turn makes it that much easier to track artificially created memes, which often come from fake accounts. (For example, see the visualization below for Top Conservatives on Twitter.)

Truthy could certainly have been useful earlier this year, when a Twitter bomb campaign was led by the American Future Fund, a known conservative group awash in money from hidden sources and also an active player in this coming fall’s elections. The group's target was Martha Coakley, the Democratic candidate, who very possibly lost the Senate seat as a result of the nine fake Twitter accounts that the organization set up in the early hours of election day. AFF managed to send out 929 smear tweets in two hours and reach 60,000 people before Twitter realized it wasn’t a grassroots campaign but spam (SocialTimes). But at least the incident inspired Menczer with the idea for a website that would track Twitter feeds and sniff out such lies.

A lie detector like Truthy, as it develops further and becomes more accurate in sniffing out lies, seems like an inevitable addition to the media revolution sweeping the world today. Much as television and radio changed the flow and pace of information to the people and influenced decision-making, especially around election time, social media tools like Twitter are in their prime today to serve the same industry-changing purpose.

As Clay Shirky points out in his TED talk, the internet has native support for groups and conversations at the same time, and as old media get digitized, the internet becomes the mode of carriage for all other media. The information we find on Twitter (and everywhere else across the web today) is still accumulated from multiple media sources (TV news clips, newspaper headlines, radio links, live-streams, etc.); it’s just that Twitter has become the new medium and site of coordination for all this information circling the web, where people have their platform for the exchange of ideas and conversation. It is great that we can now also be producers rather than only consumers of media, and that we have an environment for convening and supporting groups of different ideas (Enzensberger would surely be satisfied with this open and networked reality). However, now that we are all contributors, citizen-powered news that spreads through the internet like wildfire can be a danger to real life, skewing outcomes with fabrications that are often hard to keep track of. As more information is pumped into our social media networks (such as Twitter, and increasingly Facebook), the untrained and non-critical eye can easily mistake leading headlines and trending topics for the truth, just as millions of people believe what they read in tabloids, which are known to sensationalize crime, gossip and scandals.

As we are already keenly aware, and as danah boyd further points out, prejudice, intolerance, bigotry, and power are all baked into our [social] networks. In a world of networked media, it is easy to never encounter the views of people who think from a different perspective. I concur: you can go on Twitter and only be ‘streamed’ the information you have filtered to receive (in a political example, from Republican sites, party members and organizations) rather than the whole scope of opinions and news feeds. Beyond simple feeds of information is the aspect of trending topics: a topic that has gained traction in one segment of the network is pushed to broader awareness, very often out of context. And once a trending topic triggers a reaction, it is really hard to get meaningful dialogue going (boyd).

It will be really fascinating to see how Truthy evolves as it moves past filtering politics and becomes an application for judging the legitimacy of ANY information you find on the web. With Twitter as one of the main aggregators of real-time data and talk across the globe, a truth-tool layered with a visualizer for the massive amount of data it combs will become indispensable to all facets of life.

It is about time the Internet stepped up against web pollution, even if it is one tweet at a time, for now.



boyd, danah (UX Magazine, February 2010)

Gladwell, Malcolm (New Yorker, October 4, 2010)

NPR Science Friday (October 8, 2010)

Shirky, Clay (Ted@State, June 2009)

Kuang, Cliff (Fast Company Design, September 2010)
