
It might as well be Google itself, but there sure is a lot of room for search. The amount of data available online is just going to keep growing like crazy. Five of the six billion people are not yet online. And as they come online, companies are going to work to make it easier and easier for people to create data online: text, audio, video and more. So you are looking at more and more stuff, in more and more languages, in more and more formats. And there is the perennial demand: the quality of search. Can you find what you are looking for? How precise is the result? How user-friendly is the search experience?
Google's handicap for now is that it only knows text. And even text it does not handle as well as we might want, even though it handles it better than anyone else.
When you search for something you want the most relevant webpages and sites to show up first. And you want to be able to search within sites. How to rank pages? Google started out by saying the more sites that link to your site, the more valuable it is. Well, maybe, but not if many of them are link farms. But that was a great way to start. That still has to be the basic formula. But then each site has to be given a weight of its own based on many different criteria.
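As a rough illustration of the starting formula (more inbound links means more value), here is a minimal sketch. The site names and the `rank_by_inbound_links` helper are made up for the example, and this is not Google's actual algorithm. It also shows how a handful of link-farm sites can push a page to the top when every link counts the same.

```python
from collections import Counter


def inbound_link_counts(links):
    """Count distinct inbound links per site.

    links: iterable of (source_site, target_site) pairs (hypothetical data).
    Self-links and duplicate pairs are ignored.
    """
    seen = set()
    counts = Counter()
    for source, target in links:
        if source != target and (source, target) not in seen:
            seen.add((source, target))
            counts[target] += 1
    return counts


def rank_by_inbound_links(links):
    """Return linked-to sites, most-linked first (ties broken by name)."""
    counts = inbound_link_counts(links)
    return sorted(counts, key=lambda site: (-counts[site], site))


# Made-up example: two genuine links versus three links from a link farm.
links = [
    ("news.example", "blog.example"),
    ("wiki.example", "blog.example"),
    ("farm1.example", "spam.example"),
    ("farm2.example", "spam.example"),
    ("farm3.example", "spam.example"),
]

print(rank_by_inbound_links(links))
```

With plain counting, the page propped up by the link farm comes out ahead of the page with two genuine links, which is why each linking site needs a weight of its own.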
The language challenge is a big one, as is the format challenge. Can a search engine "read" audio and video like it reads words? Can a search engine search content regardless of what language or format it might be in, and then present the results in the language of the end user's choosing? You are talking real-time translation. That right there is a huge challenge. Major work will have to be done in speech software to make this possible.
The formula that a website's worth is how many other websites link to it is basically good. But the formula has to get more sophisticated than that. Not all links are equal. And each site should have a weight based on a few different things. So one link from a really good site should count for more than many links from so-so sites. And the search engine should be able to count the page hits for each site in real time. The activity level of a site should be a major factor in how important a site is. As important as links.
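One way to picture that more sophisticated formula is the sketch below. It assumes each linking site already has a quality weight between 0 and 1 and that recent page hits are known for every site. The `score_sites` function, the weights, the hit counts and the 50/50 split between links and activity are all illustrative assumptions, not a description of how any real search engine works.

```python
def score_sites(inbound, source_weight, hits, link_share=0.5):
    """Blend weighted inbound links with site activity.

    inbound:       dict of site -> list of sites linking to it
    source_weight: dict of linking site -> quality weight in [0, 1] (assumed given)
    hits:          dict of site -> recent page hits (assumed given)
    link_share:    fraction of the score that comes from links; the rest
                   comes from activity. 0.5 makes activity "as important as links".
    """
    # A link counts for as much as the weight of the site it comes from.
    raw_link = {
        site: sum(source_weight.get(src, 0.0) for src in sources)
        for site, sources in inbound.items()
    }
    # Normalize both parts to [0, 1] so they can be blended fairly.
    max_link = max(raw_link.values(), default=0.0) or 1.0
    max_hits = max((hits.get(site, 0) for site in inbound), default=0) or 1
    return {
        site: link_share * raw_link[site] / max_link
        + (1 - link_share) * hits.get(site, 0) / max_hits
        for site in inbound
    }


# Made-up data: one link from a strong site versus three from so-so sites.
inbound = {
    "blog.example": ["trusted.example"],
    "spam.example": ["farm1.example", "farm2.example", "farm3.example"],
}
source_weight = {
    "trusted.example": 0.9,
    "farm1.example": 0.1,
    "farm2.example": 0.1,
    "farm3.example": 0.1,
}
hits = {"blog.example": 500, "spam.example": 20}

scores = score_sites(inbound, source_weight, hits)
for site in sorted(scores, key=scores.get, reverse=True):
    print(site, round(scores[site], 3))
```

Here the single link from the well-regarded site, plus the higher activity, outweighs the three low-weight links. Where the weights themselves come from is the hard part and is left open in this sketch.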
And there is this real-time thing. Say I put out a new website, or just add a new page to an existing site. How long before the search engine finds it? Can it be an hour? A minute? Less? It should be less than a second. The reverse should also be true. If a site or a page is taken down, the search engine should know just as fast.
Language, content format, speech, website weightage based on links and page hits. These are some of the things that come to mind. This is enough homework for now.
Google might or might not deliver. There is plenty of room for others, especially for bold upstarts.
Eric Schmidt interview (39mins MP3)