Tuesday, January 26, 2010

Economics of Extraction

As I stated in my last post, managing unstructured data is increasingly crucial. A common estimate bandied about is that upwards of 80% of enterprise data is unstructured, including office documents, email, etc. The need to manage the information represented by these bits isn't just theoretical. It's painfully real. Real enough that people pay lots of money for third-party solutions to help them do this.

To give an idea of exactly how much money, here are prices for some of the top players in this roughly $2.5 billion market:

  • Autonomy IDOL Server: $220K bundled
  • SAS Enterprise Miner: $100K-$400K in 2001
  • Open Text Enterprise 2.0: $600K for 1,000 users
(Thanks to Naveen Garg, a fellow PM, for these data.)

Consider these numbers in the context of my last post. Not only is there a good conceptual argument for extraction in databases, there's also a clear customer need. If there weren't pain, vendors couldn't charge hundreds of thousands of dollars for a solution. And that's what customers are willing to pay for a solution that's not fully integrated into the database: a separate system to buy, maintain, and support.

Imagine what they'd think of true unstructured data management as a first-class database feature.

