Thursday, February 8, 2007

A good source of textbookish readings for IR material + Links for/from today's class (including why "miserable failure" didn't work)


First off, here is an NYT article explaining that as of the end of January 2007, Google decided to
quit acting all innocent and hand-removed the "miserable failure" Google bomb link to
the White House site:

http://www.nytimes.com/2007/01/29/technology/29google.html


On a more serious front, I added a new uber-link in the readings list to a draft textbook
on information retrieval that is freely available on the web:

http://www-csli.stanford.edu/%7Eschuetze/information-retrieval-book.html

You might consider consulting it as a good additional source of readings on topics discussed in class
(note: some chapters are more drafty than others).

You can find a discussion of how to statistically estimate the relative index sizes of search engines in chapter
19 (section 19.5). The specific link is:

http://nlp.stanford.edu/IR-book/pdf/chapter-webchar.pdf

That whole chapter is a quick overview of web search issues/challenges.
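
To make the index-size estimate mentioned above concrete, here is a minimal sketch of the overlap idea: sample pages from each engine, check what fraction of each sample the other engine also has, and take the ratio. If x is the fraction of A's sample found in B and y is the fraction of B's sample found in A, then x|A| and y|B| both estimate the overlap, so |A|/|B| is roughly y/x. The function names and the toy "indexes" (plain Python sets) are my own assumptions for illustration. Real engines do not let you sample their indexes uniformly, so treat this as a sketch of the arithmetic only, not of the sampling procedure.

```python
import random


def overlap_fraction(sample, other_index):
    """Fraction of sampled pages that also appear in the other index."""
    if not sample:
        raise ValueError("sample must be non-empty")
    hits = sum(1 for page in sample if page in other_index)
    return hits / len(sample)


def relative_size(index_a, index_b, sample_size, rng):
    """Estimate |A| / |B| from two uniform samples (hypothetical helper).

    Assumes we can draw uniform samples from each index, which is the
    hard part in practice and is simply taken for granted here.
    """
    sample_a = rng.sample(sorted(index_a), sample_size)
    sample_b = rng.sample(sorted(index_b), sample_size)
    x = overlap_fraction(sample_a, index_b)  # share of A's sample found in B
    y = overlap_fraction(sample_b, index_a)  # share of B's sample found in A
    if x == 0:
        raise ValueError("no overlap observed; the ratio cannot be estimated")
    # x * |A| and y * |B| both estimate the overlap, so |A| / |B| ~ y / x.
    return y / x


# Toy data: engine A indexes 2000 pages, engine B indexes 1000 of those.
toy_a = set(range(0, 2000))
toy_b = set(range(1000, 2000))
estimate = relative_size(toy_a, toy_b, 500, random.Random(0))
print(f"estimated |A| / |B| = {estimate:.2f}")
```

In the toy data A is exactly twice the size of B, so the printed estimate should come out close to 2.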

The discussion of distributed indexes and crawling can be found in the next chapter, ch. 20.

cheers
Rao







2 comments:

k.r.a.k.t.i.k said...

As a matter of fact, I distinctly remember that when you performed the search for "miserable failure" in class that day, the NYT article describing why/how Google took it off their search was among the top-10 search results (first page).

Sanjay said...

Just one more bit of info: "failure" has been a good search keyword for learning a lot of things. I initially did not know much about googlebombing; "failure" taught me that. Later I started checking the word often for more updates, and it was still giving more and more information. So it's not a bad search!