Wikimedia Conference 2011: Cultural Heritage, Commons and lots of Data. Pt. 2
This is part 2 of my report on the Wikimedia Conference 2011. The first part can be found here.
Teun Lucassen – Wikipedia is reliable [Citation needed]
Almost all the sessions in the cultural heritage track were presentations about a particular project. These were interesting, but did not tell me much I had not heard before. I therefore chose to go to a presentation by Teun Lucassen (@tlucassen) about how users experience the reliability of Wikipedia. Lucassen is a PhD student at the faculty of Behavioral Sciences at the University of Twente. The first question Lucassen asks in his research is whether Wikipedia users need help deciding if an article is reliable. The problem with Wikipedia is that it is hard to find out who the authors of an article are and whether they can be considered a reliable source. Throughout the history of Wikipedia, several attempts have been made to help the user decide. Lucassen first showed WikiViz, a data visualization tool developed by IBM. The tool adds a bar to the article which shows a number of statistics about it, for example that it has been edited 87 times by 23 users. The problem with this kind of information is: what does it say about reliability? Especially when you realize that most of the edits are made by automated bots. Lucassen said that he always uses this tool as a bad example.

On this point, however, I do not entirely agree with him. His research reminded me of the research I did in the Digital Methods class about Wikipedia last year, in which I analyzed how different articles were built. This showed that while most articles have been created by many different users, the majority of the text was written by only one or two persons. The other edits made by human editors were mainly grammatical and linguistic improvements. This is a problem for an encyclopedia whose goal is to show a neutral point of view. Showing how many people are actually responsible for the text can therefore be a useful way to give an indication of the reliability of an article. My full report can be found on my blog.
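To illustrate what I mean: the kind of analysis I did can be sketched in a few lines of Python. The revision data here is invented for the example (the real numbers would come from an article's edit history), but it shows how a handful of character counts per edit already reveals that one or two editors wrote most of the text.

```python
from collections import Counter

# Hypothetical revision history: (editor, characters added) per edit.
# In a real analysis these tuples would be derived from the article's
# edit history, with bot accounts filtered out.
revisions = [
    ("Alice", 5200), ("Bob", 300), ("Alice", 1800),
    ("GrammarBot", 40), ("Carol", 120), ("Bob", 2600),
    ("GrammarBot", 15), ("Dave", 60),
]

def contribution_shares(revisions):
    """Return each editor's share of all text added, largest first."""
    added = Counter()
    for editor, chars in revisions:
        added[editor] += chars
    total = sum(added.values())
    return [(editor, chars / total) for editor, chars in added.most_common()]

for editor, share in contribution_shares(revisions):
    print(f"{editor}: {share:.0%}")
```

Even in this toy example, one editor turns out to be responsible for well over half of the text, despite the article having five distinct contributors.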
Lucassen studied three methods that could help the user decide whether an article is reliable. The first is a user-based rating system, which is currently implemented in the English-language Wikipedia. The second was a simple algorithm that computes a rating from the number of edits and editors an article has. The third is what Lucassen calls an ‘Adaptive Neural Network Rating system’, which supposedly uses an algorithm too complex for the user to understand; Lucassen did not tell his test group that this system was in fact complete nonsense. He gave the test group the same articles to read with different ratings, in order to see how this would influence their sense of trustworthiness. His results showed that people considered an article less reliable when the user-based rating system was used: they did not trust the opinion of other people, or thought that there were not enough votes. The simple algorithm made people more positive about the article, even though all test users agreed that the score it produces cannot say much that is useful about the article. The third, made-up, algorithm showed both positive and negative results. This can be explained by a phenomenon called ‘over-reliance’: people start making their own assumptions about what the algorithm means. It was funny to see how people had started to believe in an algorithm that was completely made up.
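Lucassen did not show the exact formula behind the simple algorithm, but the idea, a score that grows with the number of edits and distinct editors, can be sketched roughly like this (the logarithmic shape and the scaling constant are my own guesses, purely for illustration):

```python
import math

def naive_rating(num_edits, num_editors, max_score=10):
    """Hypothetical simple rating: more edits and more distinct editors
    push the score up, with diminishing returns via the logarithm."""
    activity = math.log1p(num_edits) + math.log1p(num_editors)
    # Scale so that a heavily edited article (~1000 edits, ~200 editors)
    # approaches the maximum score.
    return min(max_score, round(max_score * activity / 12.5, 1))

print(naive_rating(87, 23))  # e.g. the WikiViz example article
```

As the test results suggest, such a score is easy to grasp but says nothing about who actually wrote the text, which is exactly why it can still nudge readers toward trusting an article.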
Lucassen concludes from his research that, because of the uneven quality of Wikipedia, helping the users can be a good strategy for making Wikipedia more reliable, but that there are many pitfalls in how to achieve this. He proposes a user-based rating system in which the voter has to add a small piece of text explaining why he or she gave that grade. I found Lucassen’s presentation extremely interesting and I think this kind of research can definitely be combined with the research that is done in the Digital Methods course of the MA New Media at the University of Amsterdam. More information about Lucassen’s research can be found on his blog.
Ronald Beelaard – The lifecycle of Wikipedia
Ronald Beelaard did extensive research into the lifecycle of Wikipedia users. The reason for this was an article in the Wall Street Journal which concluded that editors were leaving Wikipedia on a larger scale than ever. Beelaard started his own research in order to find out how many Wikipedia editors ‘die’ each month and how many are ‘born’. He also took into account the phenomenon of the ‘Wikibreak’, where editors stop editing Wikipedia for a while, only to come back later. Beelaard showed a big bulk of numbers, which were not always easy to follow, and concluded that the dropout rate is only a fraction of the numbers mentioned in the Wall Street Journal. It is, however, true that fewer people start editing Wikipedia and that young editors ‘die’ earlier than the old ones. The total community is shrinking, but the seniors are more vital than ever.
Erik Zachte – Wikipedia, still a world to win
The last presentation of the day was given by Erik Zachte (@infodisiac), a data analyst. He examined Wikipedia’s mission to embrace the whole world. He showed in a graph (inspired by Hans Rosling’s Gapminder) that Wikipedia is growing in all languages, but that some editions are relatively small compared to the number of people who speak the language. The English Wikipedia is of course the biggest, but the Arabic and Hindi Wikipedias are still relatively small, despite the millions of people who speak these languages. This is partly due to internet penetration in these countries, which is not as high as in Western countries; it is also the reason why the Scandinavian Wikipedias are doing so extremely well. But this is not the only factor. Zachte showed, for example, that the English-language Wikipedia is edited by a very high number of people from India. He also showed that most edits come from Europe, which can be explained by the large number of languages spoken here. When a big disaster or worldwide event happens, a Wikipedia page about it appears in all the different languages.
There is a strong correlation between the number of inhabitants of a country and the size of the Wikipedia in that language. By overlaying a geographical map of population density with a map of the sizes of each Wikipedia, Zachte revealed interesting outliers. A nice piece of data visualization. Zachte ended his presentation by addressing the rise of mobile internet. In African countries, not many people own a desktop computer with an internet connection; there is, however, a big rise in the use of smartphones. Zachte therefore believes that the Wikimedia Foundation should make its site easier to edit from these devices in order to grow Wikipedia further.
In the end I can look back on a very well organized event with lots of interesting presentations. The cultural heritage track gave a nice overview of what is happening in the field at the moment and of how open content and standards can help spread that content. The Wiki-world track was, for me, the most fascinating, however. It reminded me of all the research that was done last year in the Digital Methods and data visualization classes of my MA New Media studies, and of the fact that Wikipedia and all its data are such an interesting object of study. I hereby want to thank the organization for a great day and I hope to be able to be part of it next year.
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Netherlands license.