Wednesday, May 18, 2005

Categorization versus Search


               

Back in 1994, a couple of guys at Stanford, David Filo and Jerry Yang, both Ph.D. candidates in Electrical Engineering, launched a service called Yahoo!. It began as a hobby, a way to keep track of their personal interests on the Internet. Eventually their home-brewed lists of favorite links became too long and unwieldy, so they broke them out into categories. When the categories became too full, they developed subcategories ... and the core concept behind Yahoo! was born. The name Yahoo! is an acronym for "Yet Another Hierarchical Officious Oracle."

Yahoo! became the first really significant attempt to bring order to the Web. As the Web expanded, Yahoo! hired experts to systematize the directory. But what a daunting task given that the Web is so huge and unstructured. How would you like to have the responsibility of organizing the world in advance, trying to fit everything into a classification hierarchy?

So along came Google, an alternative to the Yahoo! directory. One reason Google was adopted so quickly is that it didn't try to predict in advance how information should be structured.

Google's founders, Larry Page and Sergey Brin, were both grad students at Stanford University (just like Yahoo!'s founders Filo and Yang). In 1996 Page and Brin collaborated on a search engine called BackRub that analyzed the "back links" pointing to a given website. Their unique approach to link analysis eventually led them, in 1998, to put their Ph.D. plans on hold and start up Google. The name Google is a play on the word googol, which refers to the number represented by the numeral 1 followed by 100 zeros.

With the Google search paradigm, no one decides in advance how to categorize. In fact, with Google there is significant value in not categorizing.
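The contrast between the two approaches can be sketched in a few lines of Python. This is only a toy illustration, not a description of how Yahoo! or Google actually worked: the URLs, page text, category names, and the helpers `build_index` and `search` are all invented for the example.

```python
from collections import defaultdict

# Hypothetical pages (URL -> text), made up for this sketch.
PAGES = {
    "example.com/oracle-tips": "tuning an oracle database",
    "example.com/db2-intro": "getting started with db2 database",
    "example.com/hiking": "best hiking trails near stanford",
}

# Directory approach: a person decides the categories in advance
# and files each page under exactly one of them.
DIRECTORY = {
    "Computers": {
        "Databases": ["example.com/oracle-tips", "example.com/db2-intro"],
    },
    "Recreation": {
        "Outdoors": ["example.com/hiking"],
    },
}

def build_index(pages):
    """Search approach: map every word to the pages that contain it.

    Nobody chooses categories; the structure falls out of the content.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(PAGES)
database_pages = search(index, "database")
stanford_hiking = search(index, "stanford hiking")
```

In the directory, a page can only be found by someone who guesses the right category path. In the index, any word on the page leads to it, which is the sense in which there is value in not categorizing.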

But there are times when categorizing in advance really is important. That's when the cataloger needs to provide context. Such is the case with establishing and communicating IT standards. The whole point of standardizing is to make sure everyone is on the same page. It's incredibly wasteful when multiple project teams each go off and purchase their own relational DBMS. Besides redundant acquisition costs, each separate product carries its own ongoing annual maintenance fee. Worst of all, skill sets associated with each tool are not readily transferable. In other words, don't expect a DBA trained in Oracle to be able to shift over and work as a DBA for DB2, or SQL Server, or MySQL, or Sybase, or Ingres, or PostgreSQL, or Firebird ...

Someone inside IT with authority has to provide context. That means setting up the categories. It also means identifying which products to use as well as which ones not to.
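As a rough sketch of what "setting up the categories" might look like as data, here is a minimal Python example. Everything in it is an assumption made for illustration: the `Category` class, the "Relational DBMS" category name, and especially the choice of which products are marked approved or not approved, which is arbitrary and not a recommendation.

```python
from dataclasses import dataclass, field

@dataclass
class Category:
    """One product category in a hypothetical IT standards catalog."""
    name: str
    approved: set = field(default_factory=set)
    not_approved: set = field(default_factory=set)

    def status(self, product):
        """Report where a product stands within this category."""
        if product in self.approved:
            return "approved"
        if product in self.not_approved:
            return "not approved"
        return "unlisted"

# Arbitrary example choices; a real organization would decide these itself.
rdbms = Category(
    name="Relational DBMS",
    approved={"Oracle"},
    not_approved={"DB2", "Sybase"},
)

oracle_status = rdbms.status("Oracle")
db2_status = rdbms.status("DB2")
ingres_status = rdbms.status("Ingres")
```

The "unlisted" result matters as much as the other two: it flags a product that nobody with authority has ruled on yet, which is exactly the gap the catalog is meant to close.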

The biggest challenge in creating the environment for communicating IT standards is providing proper context. The problem is that IT itself is enormously complex. There are numerous product categories.
