Tuesday, July 26, 2005

What SEARCH Won't Find

The Web Hits the Stacks
by Stephen Wildstrom

Edited excerpts appear below:
Popular wisdom holds that you can find anything on the Web. Not true. There is a vast body of knowledge that's hidden.

The bulk of human knowledge represented by printed material -- especially the portion that is more than 25 years old -- does not exist in digital form. In addition, most books and other printed matter published in the last century are still under copyright, and rights owners want to know they'll be compensated for the use of their material.

Google Print (print.google.com) is an attempt to scan the contents of the world's books. One part, developed with publishers, lets people search the contents of current books -- an effort similar to Amazon.com's Search Inside. The more ambitious piece, an outgrowth of the National Science Foundation's digital-libraries initiative, aims to put leading research collections online.

This project has a long way to go, not least because publishers are already up in arms over copyright (see BW Online, 6/22/05, "A New Page in Google's Books Fight"). So far, relatively few books have been digitized. Among those are many copyrighted works that are in libraries but out of print. Google lets you search the contents of these works but only serves up snippets of text surrounding the search terms.

As useful as the Web is, Google Print shows how much is missing. It's good to see that missing material gradually coming within clicking distance.

