Web 3.0: Against the Semantic Web

On: September 12, 2011
About Eelke Hermens
Hacker and cultural critic, passionate about information theory and semantics. Fed up with cyber utopians.

Website: http://twitter.com/#!/eelkeh

There seems to be an awful lot of speculation about where the web as a whole is heading. While it is possible to make predictions that treat the web as a homogeneous entity, it is questionable whether doing so is relevant. We like our predictions too much; the bigger the scope, the better.

The current debate hovers around the definition of the elusive web 3.0, which naturally is the next phase in the growing-up process of the internet (it’s still a baby!). The first iteration ended abruptly when the dot-com bubble burst. The resulting vacuum offered a playground for innovative business models, heavily leaning on ideas from the open source movement. We entered web 2.0, where “information sharing, interoperability, user-centered design, and collaboration” (Wikipedia) are the building blocks of the web applications that form the core of the user-centered internet we have grown accustomed to today. The World Wide Web became the ultimate playground. So what’s next? What happens after the emancipation of the user, which has been extensively applauded by what Morozov has dubbed the cyber utopians?

First off, web 2.0 is as much about the user-centered experience as it is about the corporations that disruptively changed the web as we know it, for better or worse. These corporations now control vastly centralized networks that constitute a dominant portion of our daily internet routines. In her paper “Accessing Truth: Marketplaces of Ideas in the Information Age,” Nima Darouian regards the big 2.0 websites as virtual marketplaces of ideas: “Virtual marketplaces are essentially open forums, but they are not public forums; rather, they are private ‘computerized information goods.'” The user-contributed information floating around in these vast networks is in no way free, and it only becomes exploitable once it is computerized. When thinking about where the web is heading, we need to keep these existing levels of control in web 2.0 applications and content marketplaces in mind.

Although there are many, widely varying, ideas floating around about what web 3.0 consists of, there seems to be a lot of momentum behind the narrative of web 3.0 as the semantic web. The recent Big Data hype (from Wikipedia: “Big data are datasets that grow so large that they become awkward to work with using on-hand database management tools”) is closely related to the semantic discourse. There is a lot of movement and investment in this field; the title of a McKinsey research report from last May triumphantly announces “Big Data: The next frontier for innovation, competition and productivity”. This is of course far from a new phenomenon; in fact, the internet itself is the biggest and messiest dataset we’re familiar with. Tim Berners-Lee saw the shortcomings of the messy WWW with regard to computerized indexing and interpretation early on. In 1999 he wrote:

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A ‘Semantic Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The ‘intelligent agents’ people have touted for ages will finally materialize.

The proposition is that, with a little extra effort from users, web content could be properly classified using a universal and unified markup, making the content interpretable by machines.
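To make this concrete: in the semantic web stack, a page’s subject matter is expressed as subject-predicate-object triples in RDF, using shared vocabularies such as FOAF. Below is a minimal sketch in Python with the rdflib library; the example.org URIs are hypothetical, invented purely for illustration.

```python
# A minimal RDF sketch using rdflib and the shared FOAF vocabulary.
# The example.org URIs are hypothetical placeholders.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import FOAF, RDF

g = Graph()
article = URIRef("http://example.org/articles/web-3-0")  # hypothetical
author = URIRef("http://example.org/people/eelke")       # hypothetical

# Machine-readable claims: "this is a document, made by a person named ..."
g.add((article, RDF.type, FOAF.Document))
g.add((article, FOAF.maker, author))
g.add((author, RDF.type, FOAF.Person))
g.add((author, FOAF.name, Literal("Eelke Hermens")))

# Any agent that speaks RDF and knows the FOAF vocabulary can now answer
# "who made this document?" without parsing prose.
print(g.serialize(format="turtle"))
```

Nothing in such a description requires human interpretation; that is the whole promise, and, as argued below, also the source of the trouble.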

Skip ahead twelve years and he is still at the forefront of promoting this dream, spreading the word about the importance of semantics and open data. The dream didn’t fail, its supporters say; it is just taking a really long time to take shape. But the techniques and ideology underlying the semantic web are problematic, and they are the reason why the semantic web (as proposed by Berners-Lee) will never see the light of day.

Here’s why:

  1. There is no one web. In order to have an effective Semantic Web, one must presume a controllable web environment, or at least an ecosystem in which the agents have a shared interest in web standards, so that the proposed innovations are broadly supported and adopted. The web that 2.0 left us consists of a lot of virtual marketplaces of content, which are in the business of exclusively exploiting their user-generated data sets. The web is fragmented and points of interest vary widely.
  2. There is no one ontology to rule them all. As Florian Cramer pointed out, the ontology that Berners-Lee is proposing is rather a cosmology. The semantic web is “based on a naive if not dangerous belief that the world can be described according to a single and universally valid viewpoint;”
  3. Semantic techniques only work effectively in closed environments. For example, in the medical world there is a clearly established ontology that is effective on its own. If you were to link this data to, let’s say, a newspaper archive, the terminologies wouldn’t overlap, even though the subjects themselves may be identical. The stronger the jargon in a particular field, the harder it becomes to meaningfully link its data to external environments.
  4. There is no clear incentive for users to participate in semantic tagging (and it requires a lot of participation and willingness). We have become so accustomed to navigating the web through invisible search algorithms that there is no apparent and acute problem with the overall categorization and relation of web content.
  5. There is no judge. What if spammers took up the opportunity to add semantic tags, fully automated of course? When not contained, this would set the web back tremendously, to the days before PageRank and other inventive search algorithms that take contextual information into account.
  6. It’s already here! But it’s not open and it’s not linked in any way. It’s called the Open Graph protocol (http://developers.facebook.com/docs/opengraph/) and it is widely (and willingly) adopted; a sketch of what it looks like in practice follows this list. This ontology is designed with advertising revenue as its main point of interest. Facebook’s Open Graph is the direct result of its appropriation and centralization of user-generated value. There is no further incentive for a company like Facebook (or Google, for that matter) to participate in an ontology that is not designed with its business model in mind.
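To illustrate point 6: Open Graph annotations are just meta tags in a page’s head, which makes them trivially easy to emit and to harvest. Below is a minimal sketch in Python using only the standard library; the sample HTML is invented for illustration, but the og:* names are the protocol’s required properties.

```python
# Minimal sketch: harvesting Open Graph properties from a page.
# The sample HTML is invented; og:title, og:type, og:url and
# og:image are the protocol's four required properties.
from html.parser import HTMLParser

SAMPLE_HTML = """
<head>
  <meta property="og:title" content="Against the Semantic Web" />
  <meta property="og:type" content="article" />
  <meta property="og:url" content="http://example.org/web-3-0" />
  <meta property="og:image" content="http://example.org/cover.jpg" />
</head>
"""

class OpenGraphParser(HTMLParser):
    """Collects every <meta property="og:..."> tag it encounters."""
    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property", "").startswith("og:"):
            self.properties[attrs["property"]] = attrs.get("content")

parser = OpenGraphParser()
parser.feed(SAMPLE_HTML)
print(parser.properties)
# {'og:title': 'Against the Semantic Web', 'og:type': 'article', ...}
```

Note how small the vocabulary is: a handful of flat properties chosen for what feeds and advertisers need, not a universal description of the world. That is exactly the point.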

The cultural problems underlying universal ontologies (more on this perspective here) and the layers of control operating in widely used web 2.0 networks undermine the idea of a Semantic Web. It’s time to rethink the possibilities of semantics and let go of seemingly pointless holistic ideas of the web. Instead, we should be looking for new, realistic common points of interest in which the current web ecosystem is willing to participate.
