I am an avid googler and swear by Wikipedia, but a few days ago I was let down big time by Google. I was about to go to Ooty and was searching for suitable accommodation there. Can you believe Google gave me some 8,470 results? I read the first 20 but still couldn't find what I wanted.
Anyway, I decided that these kinds of queries should be answered in person, and no Google should be given the authority to dictate the beds I sleep in. However, all the while I was driving to Ooty, one thought kept troubling me. Why can't someone dig through all the wealth of information in the blogs and user reviews out there and provide me with a simple choice of no more than 10 hotels/homestays, each in a different price range and with the maximum number of favorable reviews in its range?
This question is the ultimate frontier for Internet searching. Providing relevant search results has been a nightmare for search algorithm writers for over two decades now. Since the birth of the Internet, people have been piling up data online. How nice it would be to go through this huge amount of data and come back with the most relevant piece of information.
All this and more spurred me to do some more googling on such searching and voila! I stumbled upon a new piece of jargon: "WEB 3.0". Well, not actually new. I had heard a lot about semantic and vertical search, and had dirtied my hands with web-page scraping, but had never given serious thought to this so-called "natural language searching".
Now, let's take the most basic question: what is, or rather what will be, WEB 3.0? Everybody has their own opinion about it; I have mine too. For me, WEB 3.0 will be a paradigm shift from what we know of the Internet today. I mean, 10 years down the line there won't be any websites as we see them today. What will exist instead is a huge repository of data, essentially user- and community-generated. We will be able to access that data in whatever format we like, and we won't need a conventional computer to do it (PCs will have shrunk to the size of a laser device mounted on our ears, projecting images onto any surface or playing sounds through ear buds). The data will be rendered according to our saved preferences, changeable whenever we want. Hmmm! Quite futuristic, huh? Wait another 10 years, sweetheart.
But before we leap 10 years through the time warp, let's think about whether we can do anything with what we have today. Maybe, maybe not. Let's break this huge, insurmountable problem into smaller, more manageable issues. As we know, most searches in the future will look like:
a. I want to go to a happy place.
b. I would like to read a sad, romantic story.
c. I would like to have delicious, Chinese, home cooked food.
Today, all these questions are answered by matching keywords against existing pages. However, we are talking about a search in which the engine crawls through blogs, forums and other community sites, gathers people's reactions to the various options, and then shows the results. Whew! That's a huge requirement in itself. Now let us break it into even smaller problems.
The biggest issue here is: how does the search engine understand the emotions portrayed by a web page? Most searches become easier if pages can be ranked according to their EQ. To answer this, let me ask another question: how do we gauge the emotional state of something we read? Simple! By looking at the keywords our parents and teachers taught us to associate with certain emotions. Similarly, if we keep a database of sad and happy keywords that the search engine can refer to, it can count how many of each appear on a page and gauge its overall emotional state.
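To make the keyword-database idea concrete, here is a minimal sketch in Python. The tiny lexicon and the +1/-1 scoring scheme are my own illustrative assumptions, not any real search engine's method; a production system would need a far larger lexicon and handling for negation, sarcasm, and context.

```python
# Keyword-lookup sentiment sketch: count "happy" vs "sad" words from a
# hypothetical lexicon and average their scores to gauge a page's mood.
import re
from collections import Counter

# Made-up emotion lexicon: word -> score (+1 happy, -1 sad).
LEXICON = {
    "happy": 1, "delightful": 1, "wonderful": 1, "cozy": 1, "great": 1,
    "sad": -1, "gloomy": -1, "terrible": -1, "dirty": -1, "awful": -1,
}

def emotional_score(text: str) -> float:
    """Average lexicon score over the emotional words found in the text.

    Returns a value in [-1.0, +1.0]; 0.0 means no emotional keyword
    (or a perfect balance of happy and sad words).
    """
    words = re.findall(r"[a-z']+", text.lower())
    hits = Counter(w for w in words if w in LEXICON)
    total = sum(hits.values())
    if total == 0:
        return 0.0
    return sum(LEXICON[w] * n for w, n in hits.items()) / total

review = "The homestay was cozy and the food was wonderful, though the road was terrible."
print(round(emotional_score(review), 3))  # 2 happy hits, 1 sad -> 0.333
```

A search engine could compute this score while crawling and store it alongside the page, so a query like "I want to go to a happy place" simply filters for high-scoring pages.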
So, one problem solved. I will try to tackle the other issues in the same way as and when I get time, and in the end we will have the model of a basic WEB 3.0 search engine.