Repurposing the Wikiscanner: Comparing Dutch Universities’ edits on Wikipedia
<update>This project got quite a few positive responses, and many universities were interested in their own ‘profiles’. But the best was an endorsement by Virgil, who created the WikiScanner. He’s now made the Wikiscanner mashable, making this kind of research easier to do! – Michael 19.10.07</update>
Under the banner of the Digital Methods Initiative, Erik and I have been working on a project called Repurposing the Wikiscanner. The following is an introduction to the project and the first of two case studies: this one deals with the presence of Dutch universities on Wikipedia, including how much they ‘anonymously’ contribute and the kinds of articles they edit. In the conclusion I suggest that the Wikiscanner, with some modifications, could prove a valuable tool for researching ‘local’ aspects of Wikipedia production.
The Wikiscanner ‘de-anonymizes’ edits on Wikipedia, linking IP addresses to the organizations and institutions where the edits were made. Released in August 2007, it was quickly taken up on the Web and in the media, and within days a number of high-profile cases of misconduct were revealed. These included unsavory edits by “the Al-Jazeera network, Fox News Channel, staffers of Democratic Senator Robert Byrd and the CIA” and, here in the Netherlands, a revelation that ‘the Royals’ were touching up their involvement in the Mabel affair.
As a tool, the scanner is skewed toward scandal research. Its question, ‘Who edits Wikipedia?’, comes with a suggestion: some of these edits will be “salacious”. The results are presented per edit rather than aggregated, meaning the focus is not on collaborative processes or article ‘evolution’, but on the single, incriminating edit. On the one hand, this draws on a core assumption about Wikipedia: that it is subject to manipulation and should be approached with caution. On the other, it is perfectly in line with the larger Wikipedia narrative, the power of the many over the few.
The Wikiscanner reapplies the Wisdom of Crowds at a meta-level. Meta-editors now lead the charge in exposing conflicts of interest. But will this result in a better encyclopedia, or simply a relocation of ongoing ‘edit-wars’ to more news-worthy portions of the Web? At what point do we need to know who queried a certain set of Wikiscanner results?
Taking a step back, we wonder whether the Wikiscanner can be repurposed as a (new) digital method. Tools for Web research, including the Wikiscanner but also those created by the DMI team, use exploits in Web services (Google, Wikipedia, etc.) to test them and make claims about the knowledge they produce or make available. However, the tools themselves come with methods ‘built-in’. Can research questions be tweaked without tool-modification? Perhaps we are better off aiming for tool-amalgamation – combining existing tools so as to reposition their individual limits. Can we get past scandal research with the Wikiscanner?
Anonymous Wikipedia Production by Dutch Universities
(for the full case study, with a smörgåsbord of tables and tag clouds, go here.)
Every year, the Dutch weekly Elsevier conducts a large survey among students and professors, asking them to ‘grade’ the universities. The results are always highly anticipated, and a source of (somewhat) friendly competition. In addition to the ratings given by students and staff, the magazine looks at indicators of universities’ relevance in terms of scientific publications. Taking a cue from Elsevier, one could query Wikipedia for the relative presence of Dutch universities.
Using the Wikiscanner, we aggregated and compared anonymous edits from 13 Dutch universities. The greatest number of edits came from the University of Groningen, followed by Twente and Utrecht. Given the University of Twente’s relatively small size, it is surprising to find it in the top three. However, friends have explained that this is probably because Twente has a lively campus, with students living close by, meaning much of their regular Web surfing will happen at the university.
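To make the aggregation step concrete, here is a minimal sketch in Python of how anonymous edits might be counted per institution once each edit's IP address is known. It is not the Wikiscanner's own code: the IP ranges are reserved documentation blocks standing in for real university networks, and the names UNIVERSITY_RANGES, university_for, count_edits and SAMPLE_EDITS are all made up for illustration.

```python
import ipaddress
from collections import Counter

# Hypothetical IP ranges: reserved documentation blocks, NOT real university networks.
UNIVERSITY_RANGES = {
    "University A": ipaddress.ip_network("192.0.2.0/24"),
    "University B": ipaddress.ip_network("198.51.100.0/24"),
}


def university_for(ip):
    """Return the name of the institution whose range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for name, network in UNIVERSITY_RANGES.items():
        if addr in network:
            return name
    return None


def count_edits(edits):
    """Count edits per institution from (ip, article) pairs, skipping unmatched IPs."""
    counts = Counter()
    for ip, _article in edits:
        name = university_for(ip)
        if name is not None:
            counts[name] += 1
    return counts


# Made-up sample edits, for illustration only.
SAMPLE_EDITS = [
    ("192.0.2.10", "Article 1"),
    ("192.0.2.11", "Article 2"),
    ("198.51.100.7", "Article 1"),
    ("203.0.113.5", "Article 3"),  # falls outside both ranges and is ignored
]

if __name__ == "__main__":
    for name, total in count_edits(SAMPLE_EDITS).most_common():
        print(name, total)
```

Ranking institutions by these totals gives the kind of comparison described above. Note that a raw count says nothing about institution size, which is why a small campus can still land near the top.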
Making Wikipedia Local
There were 639 articles edited by users from more than one university. Of these, 120 were related to Dutch culture, history and politics. The most active universities in this area were Utrecht, Groningen and Leiden. Interestingly, though perhaps unsurprisingly, these same three universities have the highest-rated Language and Culture programs, according to one recent national survey. Wageningen University and Research Centre is also very active on these topics, but a closer look reveals that its total has been ‘inflated’ by a great number of edits on just a few articles (especially ‘Ayaan Hirsi Ali’, ‘Wageningen’ and ‘Wageningen University’).
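Both checks in this section, finding articles edited from more than one university and spotting a total that is ‘inflated’ by a handful of articles, can be sketched in a few lines of Python. This is an illustration under assumptions, not the procedure actually used for the case study: the input is assumed to be a list of (university, article) pairs, and shared_articles, top_share and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def shared_articles(edits):
    """Map each article edited from more than one university to the set of universities."""
    by_article = defaultdict(set)
    for university, article in edits:
        by_article[article].add(university)
    return {a: unis for a, unis in by_article.items() if len(unis) > 1}


def top_share(edits, university, n=3):
    """Fraction of a university's edits that fall on its n most-edited articles."""
    counts = Counter(article for uni, article in edits if uni == university)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _article, c in counts.most_common(n)) / total


# Made-up sample data, for illustration only.
SAMPLE = [
    ("Uni A", "Article 1"),
    ("Uni B", "Article 1"),
    ("Uni B", "Article 1"),
    ("Uni B", "Article 1"),
    ("Uni B", "Article 2"),
    ("Uni C", "Article 3"),
]

if __name__ == "__main__":
    print(shared_articles(SAMPLE))
    print(top_share(SAMPLE, "Uni B", n=1))
```

A high top_share value for a small n is a hint that a university's activity on a topic rests on a few heavily edited articles rather than on broad involvement.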
Profiling Technical Universities
The three technical universities (Twente, Eindhoven and Delft) were ‘profiled’ based on the articles each edited. As expected, each contributes often to articles relating to mathematics, science and technology. In addition to this, each was found to conform somewhat to ‘Geek’ stereotypes, with a high proportion of edits on topics dealing with science fiction and fantasy games. (I hope it is clear that, as a new media student, I would never use ‘Geek’ in a negative way.) This was most pronounced in the results for the Technical University of Eindhoven.
Cloud: Wikipedia articles edited anonymously from the Technical University of Eindhoven (numbers indicate edits per article)
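A profile of this kind boils down to labelling articles by topic and looking at how a university's edits are distributed over those labels. The sketch below assumes the labels are assigned by hand, which may differ from how the case study grouped articles. The topic table, the topic_profile function and the edit counts are all invented for illustration.

```python
from collections import Counter

# Hypothetical hand-assigned topic labels; unlisted articles count as "other".
ARTICLE_TOPICS = {
    "Linear algebra": "science and technology",
    "Transistor": "science and technology",
    "Star Trek": "science fiction and fantasy",
    "Dungeons & Dragons": "science fiction and fantasy",
}


def topic_profile(edit_counts, topics=ARTICLE_TOPICS):
    """Turn {article: number of edits} into {topic: share of all edits}."""
    totals = Counter()
    for article, n_edits in edit_counts.items():
        totals[topics.get(article, "other")] += n_edits
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {topic: n / grand_total for topic, n in totals.items()}


# Made-up edit counts for one university, for illustration only.
SAMPLE_COUNTS = {
    "Linear algebra": 4,
    "Transistor": 2,
    "Star Trek": 3,
    "Amsterdam": 1,
}

if __name__ == "__main__":
    for topic, share in sorted(topic_profile(SAMPLE_COUNTS).items()):
        print(f"{topic}: {share:.0%}")
```

The same per-article counts that feed the profile are what a tag cloud visualizes, so the two views can be built from one table.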
Star-gazing at the University of Amsterdam
Last but not least is the University of Amsterdam, ‘home’ to the Masters of Media students. The most visible trend in the edits from our university could be termed a ‘Great Man’ view of Wikipedia. Half of the top thirty articles edited were biographies, and these tended to cluster into types, generally with more than one of each (e.g. artist, charismatic leader, University of Amsterdam professor). Emphasis has been added in the cloud below to show this trend in the articles edited by the UvA.
While the question has mostly been, ‘what can the Wikiscanner tell us about Dutch universities?’, the reverse is more interesting. What do the exercises carried out here say about the possible uses of the Wikiscanner for Wikipedia research?
The Wikiscanner, with some tweaking, makes it possible to ‘localize’ Wikipedia activity by linking edits to specific institutions or geographical areas. Such a move adds a dimension to studies of Wikipedia. Where these have had to hang on to notions of the ‘virtual community’ in describing the ins and outs of collaboration online, the kind of research hinted at here will make it possible to rethink this production as both a local and global operation. General assumptions about Wikipedia’s ‘U.S.-centrism’ should be tested empirically, and researchers should use location as a variable alongside article content. In the case of universities, presumably hubs for the production of knowledge, this ‘trick’ is all the more interesting and relevant.
But the Wikiscanner also comes with limitations. Only anonymous edits are indexed, meaning the samples are relatively small and, until one can prove otherwise, not representative of all edits. Also, despite any attempts here or elsewhere, it will be tough to disassociate ‘anonymous’ from ‘discreditable’. With the profiles of technical universities, there is some indication that such samples are representative, but this needs more work. Taking the Wikiscanner further will require adequately theorizing the ‘anonymous edit’.