On Mon, Jan 31, 2011 at 9:38 PM, Upayavira <u...@odoko.co.uk> wrote:

>
>
> On Mon, 31 Jan 2011 08:40 -0500, "Estrada Groups"
> <estrada.adam.gro...@gmail.com> wrote:
> > What are the advantages of using something like HBase over your standard
> > Lucene index with Solr? It would seem to me like you'd be losing a lot of
> > what Lucene has to offer!?!
>
> I think Steven is saying that he has an indexer app that reads from
> HBase and writes to a standard Solr by hitting its Rest API.
>
> So, nothing funky, just a little app that reads from HBase and posts to
> Solr.
>


We offer a relational-database-like experience (i.e. a schema language,
typed data instead of byte[]s, secondary indexing facilities) with some
content management features (versioning, blob storage), combined with Solr
as a search index (with a mapping between our schema and Solr's). The index
is maintained incrementally, and through map/reduce for reindexing. We can
keep multiple versions of the index if you want, with state management, and
we do text extraction with Tika. All of this runs fully distributed, so you
can play with different boxes serving as HBase datanode, index feeder, Solr
search node, etc.

All that sits behind a Java API that uses Avro underneath, and a REST
interface as well (searches go directly to Solr). In future versions, we
will integrate a recommendation engine and some analytics tools as well.
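
To make the schema-to-index mapping idea a bit more concrete, here is a
minimal sketch in Python of turning a typed record into a Solr document. It
is an illustration only, not our actual API: the field names, the
FIELD_MAPPING table and the helper functions are all hypothetical, and it
only builds the JSON "add" command without sending it anywhere.

```python
import json
from datetime import date

# Hypothetical mapping from typed record fields to Solr field names.
# A real mapping would be driven by the schema, not hard-coded.
FIELD_MAPPING = {
    "title": "title_t",
    "published": "published_dt",
    "pages": "pages_i",
}


def to_solr_value(value):
    """Convert a typed value into something Solr's JSON format accepts."""
    if isinstance(value, date):
        # Solr date fields expect ISO 8601 in UTC; this sketch keeps
        # day precision only.
        return value.strftime("%Y-%m-%dT00:00:00Z")
    return value


def record_to_solr_doc(record_id, record, mapping=FIELD_MAPPING):
    """Build a Solr document from a record, keeping only mapped fields."""
    doc = {"id": record_id}
    for field, solr_field in mapping.items():
        if field in record:
            doc[solr_field] = to_solr_value(record[field])
    return doc


def build_add_command(doc):
    """Serialize one document as a Solr JSON 'add' update command."""
    return json.dumps({"add": {"doc": doc}})


if __name__ == "__main__":
    record = {"title": "Hello", "published": date(2011, 1, 31), "pages": 3}
    print(build_add_command(record_to_solr_doc("rec-1", record)))
```

An index feeder along these lines would POST the resulting string to Solr's
update handler. Fields that are not in the mapping are simply left out of
the index, which is one way to keep the storage schema and the search
schema independent of each other.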

So yes, we do more (or rather: different things) than what Lucene/Solr do,
as we offer a full-featured data storage environment, stuffing your data in
HBase (which scales better than MySQL), and making it searchable through Solr.

The 'funky app' you're referring to now sits at about 3 man-years of
full-time development, BTW. ;-)

Steven.
-- 
Steven Noels
http://outerthought.org/
Scalable Smart Data
Makers of Kauri, Daisy CMS and Lily
