Ride the Lightning

Cybersecurity and Future of Law Practice Blog
by Sharon D. Nelson Esq., President of Sensei Enterprises, Inc.


July 31, 2009

I am including below a response I received to the original post referenced above:


Google has solved search on the public Internet; however, applying the same technology to the enterprise is a different story.  In order to index the Internet, Google has built massive data centers to process the data.  In processing this data, they make a cached copy of all web pages (perform a Google search and you will see a "Cached" link next to each search result).  This process works for the web, where you can spread the cost among billions of users.  It does not work for the enterprise.  Caching (replicating) exabytes of enterprise data is not practical.  It would require doubling the entire enterprise storage environment, something no company would undertake.

In order to make exabytes of ESI discoverable, you need to build an affordable solution that is designed to be efficient (small index footprint), scalable (billions of objects per server), and fast (1 TB/hour/node processing speed).  This is not what Google has delivered.  It is a fine point solution for smaller projects, but when you talk terabytes or exabytes, you need to look elsewhere.

Jim McGann

Index Engines



By the way, John agrees that Google, for any number of reasons, is not an appropriate EDD tool. It simply is not EDD-specific. It is limited in the number of file formats it can handle, even though it supports more than 200. It is designed as an internal file indexer and doesn't appear to even deal with users' e-mail stores. At least Google Desktop will index your Outlook mailbox, even though it doesn't properly deal with attachments. The Google Search Appliance doesn't seem to deal with e-mail at all, period. That's a killer right there.

Phone: 703-359-0700

