Google and the Principles of the Semantic Web

By: Lauressa Ford

On: September 10, 2012

About Lauressa Ford
Hello! I am Lauressa Ford. I was born and raised in Scheveningen, the most difficult city name to pronounce when Dutch is not your first language. After I graduated from high school I went to Groningen (also difficult to pronounce) to study History. I also took classes in Modern Art, Rhetoric and Journalism, and my Bachelor thesis was about Marc Chagall and the development of Judaism in Eastern and Western Europe. After I finished my Bachelor I went to Amsterdam (finally a city whose name doesn't cause any pronunciation problems) and followed the short Bachelor program Media and Culture, specializing in New Media. My second Bachelor thesis was about Google and the emerging principles of the semantic web. At the moment I am following the master's program New Media and Digital Cultures at the University of Amsterdam. In Groningen I learned everything about the past, but now it is time to focus on the future....

One of the first goals of the Internet was to expand the knowledge of its users by connecting different data. However, the contemporary web is by no means comparable to this initially academic network. The web expanded at an unprecedented rate and became more complex, and the amount of available data grew enormously. Because this expansion made it harder to link data, several search engines tried to solve the problem. As a result, the techniques used by search engines have also become more complex over time. The history of Google makes these changes visible.

Initially Google had no commercial image. However, the company is now almost fully financed by the revenue from the ads displayed on its search result pages. The profitability of these ads resulted in major growth of the Google network and its various services. The main purpose of these services was to provide better search results to users. However, the more people started using the services Google offered, the more the users of the search engine themselves became an important object of research.

As the Internet expanded, it became more difficult to deliver the right information to users. Different search techniques were developed, but each became insufficient over time. An example is PageRank, the technique Google uses to sort search results. Even though the exact workings of the technique are a secret, webmasters found out that a web page that is linked to from more other websites is ranked higher on the Google search result page. Nowadays there are many handbooks that explain to webmasters the exact steps to follow to use PageRank in their favor. As a consequence, new ways had to be found to reach all the information on the web. The development of the semantic web is a good example of this.
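To make the link-counting idea concrete, here is a minimal sketch of the textbook version of PageRank, written in Python. It is not Google's production ranking, which is not public. The four-page toy web, the function name and the damping value of 0.85 are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of rank (illustrative only).

    `links` maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with the rank spread evenly over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical toy web: pages "a", "b" and "d" all link to "c".
toy_web = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
```

In this toy web, page "c" receives the most incoming links and ends up with the highest score. This is the effect webmasters try to exploit when they collect links to their own pages.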

The principles of the semantic web, which is based on ontologies and metadata, have become more relevant in recent years. In the future computers must be able to read the context of the data on the Internet to prevent irrelevant web pages from appearing among the search results. By means of meta-tagging, users can give meaning to their content on the web. This should eventually lead to an objective classification of the information on the web. However, the idea of an ambiguity-free classification of the web has drawn a lot of criticism. Universal categories don’t exist, because no one has complete knowledge of the universe itself. Google is also aware of the growing complexity of dividing the information on the web into categories, so the company started to focus more on the user than on the ordering of the Internet.
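The role of metadata can be illustrated with a small sketch in Python. It stores statements about web pages as (subject, predicate, object) triples, the basic shape that semantic web data models use, and uses them to filter out pages that mention a keyword in the wrong context. The URLs, the predicate names "mentions" and "topic", and the function name are hypothetical and only serve the example.

```python
# Each statement about a page is a (subject, predicate, object) triple.
# All URLs and predicate names below are made up for illustration.
triples = [
    ("http://example.org/pages/1", "mentions", "jaguar"),
    ("http://example.org/pages/1", "topic", "animal"),
    ("http://example.org/pages/2", "mentions", "jaguar"),
    ("http://example.org/pages/2", "topic", "car"),
]


def pages_matching(triples, keyword, topic):
    """Return the pages that mention `keyword` and are tagged with `topic`."""
    mentioning = {s for s, p, o in triples if p == "mentions" and o == keyword}
    on_topic = {s for s, p, o in triples if p == "topic" and o == topic}
    return sorted(mentioning & on_topic)


animal_pages = pages_matching(triples, "jaguar", "animal")
```

A plain keyword search for "jaguar" would return both pages. With the topic metadata, only the page about the animal remains. The sketch also shows where the criticism applies: somebody still has to decide which topics exist and which page belongs to which topic.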

These developments are clearly visible in the changes Google has made to its privacy policy since the company was founded. At first Google didn’t collect any personal information about its users. However, the current privacy statement announces that personal information is gathered in order to provide a better service to its customers, and the enormous growth of Google suggests that users are willing to share their personal information.

Due to its ad revenue, Google has created a huge network and has the ability to collect all kinds of information about its users. Not only are the search queries analyzed, but every step a user takes on the search website is also registered. Google therefore has an immense database of personal details, and this leads to a new search technique. Once a user visits the website of the search engine, Google immediately recognizes the user. With the user’s search history and all the other data Google has collected about this individual, Google can make a good estimate of the user’s needs. Google divides these needs into categories, and the search results are adjusted to the user’s requirements, just like the ads that appear on the page with the search results. This categorization is somewhat similar to the principles of the semantic web.
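How such a profile could influence the search results can be sketched in a few lines of Python. This is a toy model and not a description of Google's actual system. The categories, the scoring formula, the sample data and both function names are assumptions made for the example.

```python
from collections import Counter


def interest_profile(history, category_of):
    """Turn a search history into the share of queries per category."""
    counts = Counter(category_of[q] for q in history if q in category_of)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


def personalize(results, profile):
    """Re-rank (title, category, base_score) results using the profile.

    A result is boosted in proportion to the user's interest in its
    category. The formula is an assumption for illustration only.
    """
    def score(result):
        _title, category, base_score = result
        return base_score * (1.0 + profile.get(category, 0.0))

    return sorted(results, key=score, reverse=True)


# Hypothetical search history and a hand-made query-to-category mapping.
history = ["chagall paintings", "modern art museum", "cheap flights"]
category_of = {
    "chagall paintings": "art",
    "modern art museum": "art",
    "cheap flights": "travel",
}
profile = interest_profile(history, category_of)

results = [
    ("Flight deals", "travel", 1.0),
    ("Chagall exhibition", "art", 0.9),
    ("Stock news", "finance", 1.1),
]
ranked_titles = [title for title, _, _ in personalize(results, profile)]
```

Without the profile, "Stock news" has the highest base score. For this user, whose history is mostly about art, the Chagall exhibition moves to the top. The categories here describe the behavior of the user, not the documents themselves.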

Because dividing the web into categories seems almost impossible given the growth of available information, Google tries to apply the ideology of the semantic web to the behavior of its users. Users voluntarily provide metadata about themselves to Google, and this information is linked to their queries or to the ads that match their profile. Unfortunately, the current definition of the semantic web is rather vague, so it is difficult to say in which ways Google acts according to the principles of the semantic web. The semantic web can be seen as an upgrade of the contemporary web, as a metadata technology for software, as a social movement that prefers open-source data, or as a new generation of artificial intelligence. All these visions contain some truth, because the main goal of the semantic web is to improve the technology of the Internet as more and more websites appear with new possibilities and functions. These websites promote the solidarity that exists on the web, and they also create opportunities that did not exist before. The most basic way to describe the semantic web is that all the information available on the web is customized to the personal interests of the users. Semantic technology mixes and recycles data in a new way. Data is linked and intertwined, and the physical location of the data is irrelevant. This development is characterized by transparency. The semantic web can be seen as an extension of the current web in which computers and people work closely together.

The biggest difference between the semantic web and Google’s approach is that Google focuses on the behavior of users instead of on categorizing the data on the Internet. For the semantic web, the context of the documents on the web plays an important role. Google’s method, however, focuses on the context of user behavior. The criticism of classifying the data on the web into universal categories carries less weight when it comes to collecting and analyzing the behavior of the visitors of a search engine website. The users create their own reality, which is less connected to the reality of the universe. The queries take place in a defined area, the Internet. Avoiding subjectivity is less important because human behavior is always subjective. In short, analyzing the context of people’s behavior when they visit a search engine website is a less labor-intensive task. Sometimes without realizing it, users provide their personal information, and Google only has to process this data. Eventually, categorizing the conduct of users will lead to new problems, but Google will probably find a solution when needed.

Tags: advertising, database, google, open source, privacy, semantic web, web 3.0