App Review: Google Goggles
I have to admit that I do not own the app I’m about to review. In fact, I have never used it. Truth be told, I rarely install the latest apps on my mobile phone. It’s not that I don’t want to, but I own a mobile phone that is, by today’s standards, incredibly old-fashioned: the Nokia N95. The few apps that actually work on my phone include Google Maps, Gmail and the Facebook app (I use all of them very regularly). To demonstrate what a hit-and-miss affair installing apps on my Nokia N95 is: I downloaded Nokia’s very own sports tracker. It installed perfectly, but totally fails to pick up any satellites, making it as useful for tracking sports (running, in this case) as an analog watch. By contrast, the Google Maps app is able to pinpoint my location almost spot on. It’s an app thing, I guess.
For this “review”, I will focus on the app makers’ own description of their app and on reviews found on the web. I want to discuss this app here because I feel that, in the long run, it may have a huge impact on the ways we think about search, data and data sharing. The app I want to talk about here is Google Goggles.
The tagline of Google Goggles is simply “Use pictures to search the web”. In essence, that is indeed what the application does. By taking a picture with your Android-powered mobile phone or iPhone, you can initiate a search query on Google. The app tries to recognize what’s in the image and returns search results based on its findings. As Google shows in the demonstration video, the best results are achieved by taking pictures of books, paintings or landmarks. Pictures of cars, people, animals and food often result in misses. An interesting detail: Google actually disabled face recognition technologies in Google Goggles, so the fact that you can’t use the app to search for people is based on a decision rather than a lack of technical possibilities. Already we touch upon some fertile ground for a thought experiment here. But let’s continue for now. The last piece of Google’s demo video shows the option to take a picture of text in a foreign language and receive a translation with your search results. Logically, this is a small step for Google: combining OCR (optical character recognition) software with its Google Translate service. But the ease and ‘real-timeness’ of the translation is an important aspect of this technique.
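Conceptually, that translation feature is just the composition of two steps: recognize the text, then translate it. A minimal sketch of that pipeline, where `ocr` and `translate_text` are hypothetical stubs standing in for a real OCR engine and a real translation service (the lookup tables are invented for illustration):

```python
# Toy sketch of the OCR-then-translate flow described above.
# Both steps are stubs: a real app would call an OCR engine and a
# translation service instead of these hard-coded dictionaries.

def ocr(image_bytes: bytes) -> str:
    """Pretend to recognize text in a photo (hypothetical stub)."""
    fake_recognized = {b"menu.jpg": "carte des vins"}
    return fake_recognized.get(image_bytes, "")

def translate_text(text: str, target: str = "en") -> str:
    """Pretend to translate the recognized text (hypothetical stub)."""
    fake_dictionary = {"carte des vins": "wine list"}
    return fake_dictionary.get(text, text)

def picture_to_translation(image_bytes: bytes) -> str:
    """The whole 'snap a photo, get a translation' flow is composition."""
    return translate_text(ocr(image_bytes))

print(picture_to_translation(b"menu.jpg"))  # -> wine list
```

The point of the sketch is the small step itself: once both services exist, chaining them is trivial, and all the perceived magic sits in how quickly the chain runs.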
Data on demand, sharing queries and the possibility space of life
Now bear with me as I commence a little thought experiment. Using Google Goggles, you (the user) can walk around town until something triggers you (a poster, a book, a piece of architecture). No longer do you need a companion or a sign to tell you what this object is that has caught your attention. You simply take a picture of it and use Google Goggles’ “computer vision” to analyze the image. Within a few moments, search results appear on your screen, demystifying the object in the picture and returning its name and properties. You got the data you wanted at that moment. But, more precisely: you apparently didn’t need it before. Google Goggles enables a world of data on demand for everyone with a supported mobile phone.

Now you can go to a museum and, when you see an interesting painting, take a picture and have Google tell you all about it. In fact, Google was so eager to support art that it bought a then only four-months-old start-up called PlinkArt to accumulate its knowledge on the matter (PlinkArt’s old website now automatically forwards you to Google Goggles). The museum is a perfect example of the possible consequences of the techniques used by Goggles. When I go to a museum, I have to rely on signs next to the paintings telling me about the artist, the work and, hopefully, a short story behind it. If I want more information, I pay a small fee for the exhibition book to get something more in-depth. If I had Google Goggles, however, all this information would be in my hands and I wouldn’t need informational signs on the walls anymore. In fact, nobody would need them if everybody had Google Goggles. Let’s push things even further: how many signs would become totally superfluous if we could just take a picture of an object to know all about it? The signs on the walls of university buildings would no longer be needed (truth be told, here at the UvA they are already so small that it’s hard to spot them anyway).
Google Goggles even has a built-in map telling you what objects are around you (restaurants, etc.), so you wouldn’t need signs for navigation either. Just think of a botanical garden in which you would really feel like you’re walking through nature, because all the signs with the tree names on them have been removed. You can simply find out which tree is which by snapping a quick pic.
This use of pictures also shows the transformative power of Goggles: pictures become queries. A picture is now translatable one-to-one into a search query. Sharing pictures with your friends becomes sharing queries. Friend A: “Hey, look, I was in Australia and I took a picture of the…” Friend B: “Yes, the Twelve Apostles, I know, it says so right here.” How often have you shared a search result page with your friends? It’s not something I do regularly, but I do share pictures with my friends. Because of Google Goggles, there’s not that much difference between the two anymore.
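The “picture as query” idea can be made concrete with a toy sketch: fingerprint the image and look the fingerprint up in an index that maps known images to search queries. A real system would use robust visual features rather than a hash of the raw bytes, and the index here is entirely invented for illustration:

```python
import hashlib

def fingerprint(image_bytes: bytes) -> str:
    """Reduce a picture to a short identifier (a real system would use
    visual features that survive cropping and lighting changes)."""
    return hashlib.sha256(image_bytes).hexdigest()

# Hypothetical index built from previously analyzed photos.
QUERY_INDEX = {
    fingerprint(b"photo-of-twelve-apostles"): "Twelve Apostles, Victoria, Australia",
}

def picture_to_query(image_bytes: bytes) -> str:
    """Sharing this picture is, in effect, sharing this query."""
    return QUERY_INDEX.get(fingerprint(image_bytes), "no match")

print(picture_to_query(b"photo-of-twelve-apostles"))
```

Once the mapping from picture to query exists, a shared photo carries its search result with it, which is exactly the collapse between “sharing pictures” and “sharing queries” described above.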
But here’s the thing. Google Goggles is an application, from the hands of a development team. It uses various techniques from other developers. Furthermore, it relies on a search system that returns specific results based on a ranking system developed by yet another team of developers. There is a huge system of rules at play in the use of this application. To borrow a concept from Ian Bogost: using Google Goggles means engaging within a possibility space created by the rules of the application. The analyzing algorithms, the returned information – everything goes through a set of rules created by developers. But you are using Google Goggles in the real world, on real-world objects, and you probably have a real-world decision to make based on the information Goggles gives you. This is a very interesting problem and one that should not be overlooked. What would the world look like if you used the hypothetical apps Bing Binoculars, Facebook Vision or Soso Spectacles? Following Bogost, it’s important “to explore the possibility spaces in which we engage and then accept, challenge or reject them”. Bogost is talking about possibility spaces in video games and how we should negotiate them in our daily lives. Well, this is real life. This is your daily life, and if you are going to use an app like Google Goggles, you will have to engage in an active negotiation with the information it gives you.
Back to reality, but it’s changing
Google Goggles is a relatively new app and hasn’t yet reached its full potential (however mind-blowing and game-changing that potential may be). As of yet, judging by the reviews I read, more often than not Goggles fails to interpret an image correctly and returns search results that are in no way related to the subject. Even the scanning and analyzing of a simple business card (shown in the demo video as one of the things Goggles really can do) fails at times. However, we should embrace these still-existing flaws, as they give us time to think extensively about the impact of applications enabling “visual search” and to prepare a theoretical framework for talking about them, before they get the chance to integrate with that other big game changer that’s on its way: social search. Search is getting social and, if the predictions are anything to go by, early 2012 will see the real arrival of social search. Now consider the combination of visual search and social search and you’ll end up with huge implications for our daily lives. You took a picture of Wall Street? Well, according to the social sphere, the most important thing to tell you about it right now is that there are huge protests going on. Or not, depending on your sources. And exactly which sources are used for the information is part of the rules of the application. Will you realize how biased the information you receive is when the search results show you a picture that looks exactly like the object you photographed in the first place, but this time accompanied by a description that sounds logical and probable? I don’t know. When Google Maps tells me a certain street is somewhere, I believe it. When I reach the street and it actually appears to be somewhere else, I still doubt my own findings. Now, when I take a picture of an object I don’t know and there is a list of results describing the object in a certain way, it will surely take a lot of my attention to actively negotiate with that information.
So, in the end, maybe I should be happy with my Nokia N95 from back in the day, since it allows me to lag behind until my telephone contract ends in the summer of 2012, when I’ll be able to get the latest model and social and visual search will be firmly in place.
About Google buying PlinkArt
On Search getting Social