Re: Collating results from multiple indexes

Will Johnson Tue, 16 Feb 2010 18:28:15 -0800

Jan Høydahl / Otis,



First off, thanks for mentioning us.  We do use some utility functions from
Solr, but our index engine is built on top of Lucene only; there are no Solr
cores involved.  We do have a JOIN operator that allows us to perform
relational searches while still acting like a search engine in terms of
performance, ranking, faceting, etc.  Our CTO wrote a blog article about it
a month ago that does a pretty good job of explaining how it’s used:
http://www.attivio.com/blog/55-industry-insights/507-can-a-search-engine-replace-a-relational-database.html



The join functionality and most of our other higher-level features use
separate data structures and don’t really have much to do with Lucene/Solr
except where they integrate with query execution.  If you want to learn
more, feel free to check out www.attivio.com.



-              w...@attivio.com


On Fri, Feb 12, 2010 at 10:35 AM, Jan Høydahl / Cominvent <
jan....@cominvent.com> wrote:

> Really? The last time I looked at AIE, I am pretty sure there were Solr core
> messages in the logs, so I assumed it used EmbeddedSolr or something. But I
> may be mistaken. Anyone from Attivio here who can elaborate? Is the join
> stuff at the Lucene level or on top of multiple Solr cores or what?
>
> --
> Jan Høydahl  - search architect
> Cominvent AS - www.cominvent.com
>
>  On 11. feb. 2010, at 23.02, Otis Gospodnetic wrote:
>
> > Minor correction re Attivio - their stuff runs on top of Lucene, not
> > Solr.  I *think* they are trying to patent this.
> >
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Hadoop ecosystem search :: http://search-hadoop.com/
> >
> >
> >
> > ----- Original Message ----
> >> From: Jan Høydahl / Cominvent <jan....@cominvent.com>
> >> To: solr-user@lucene.apache.org
> >> Sent: Mon, February 8, 2010 3:33:41 PM
> >> Subject: Re: Collating results from multiple indexes
> >>
> >> Hi,
> >>
> >> There is no JOIN functionality in Solr. The common solution is either to
> >> accept the high volume update churn, or to add client side code to build a
> >> "join" layer on top of the two indices. I know that Attivio
> >> (www.attivio.com) have built some kind of JOIN functionality on top of
> >> Solr in their AIE product, but do not know the details or the actual
> >> performance.
> >>
> >> Why not open a JIRA issue, if there is no such already, to request this as
> >> a feature?
> >>
> >> --
> >> Jan Høydahl  - search architect
> >> Cominvent AS - www.cominvent.com
> >>
> >> On 25. jan. 2010, at 22.01, Aaron McKee wrote:
> >>
> >>>
> >>> Is there any somewhat convenient way to collate/integrate fields from
> >>> separate indices during result writing, if the indices use the same
> >>> unique keys? Basically, some sort of cross-index JOIN?
> >>>
> >>> As a bit of background, I have a rather heavyweight dataset of every US
> >>> business (~25m records, an on-disk index footprint of ~30g, and 5-10 hours
> >>> to fully index on a decent box). Given the size and relative stability of
> >>> the dataset, I generally only update this monthly. However, I have
> >>> separate advertising-related datasets that need to be updated either
> >>> hourly or daily (e.g. today's coupon, click revenue remaining, etc.).
> >>> These advertiser feeds reference the same keyspace that I use in the main
> >>> index, but are otherwise significantly lighter weight. Importing and
> >>> indexing them discretely only takes a couple minutes. Given that
> >>> Solr/Lucene doesn't support field updating, without having to drop and
> >>> re-add an entire document, it doesn't seem practical to integrate this
> >>> data into the main index (the system would be under a constant state of
> >>> churn, if we did document re-inserts, and the performance impact would
> >>> probably be debilitating). It may be nice if this data could participate
> >>> in filtering (e.g. only show advertisers), but it doesn't need to
> >>> participate in scoring/ranking.
> >>>
> >>> I'm guessing that someone else has had a similar need, at some point? I
> >>> can have our front-end query the smaller indices separately, using the
> >>> keys returned by the primary index, but would prefer to avoid the extra
> >>> sequential roundtrips. I'm hoping to also avoid a coding solution, if only
> >>> to avoid the maintenance overhead as we drop in new builds of Solr, but
> >>> that's also feasible.
> >>>
> >>> Thank you for your insight,
> >>> Aaron
> >>>
> >
>
>
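
For reference, the client-side "join" layer mentioned in the quoted thread
can be sketched roughly as below. This is only a minimal Python sketch under
stated assumptions, not Solr or Attivio code: client_side_join, fetch_ads,
the ADS table and the sample documents are all hypothetical names made up for
illustration. In a real setup, fetch_secondary would presumably issue one
batched query against the smaller index for the keys returned by the primary
index, which limits the extra cost to a single additional roundtrip per
result page.

```python
from typing import Callable, Dict, Iterable, List

Doc = Dict[str, object]


def client_side_join(
    primary_hits: List[Doc],
    fetch_secondary: Callable[[List[str]], Iterable[Doc]],
    key: str = "id",
) -> List[Doc]:
    """Merge fields from a secondary index into primary hits that share `key`."""
    keys = [str(doc[key]) for doc in primary_hits]
    # One batched lookup for the whole result page instead of one per hit.
    secondary_by_key = {str(doc[key]): doc for doc in fetch_secondary(keys)}
    joined = []
    for doc in primary_hits:
        extra = secondary_by_key.get(str(doc[key]), {})
        # Primary fields win on a name clash; unmatched hits are kept unchanged.
        joined.append({**extra, **doc})
    return joined


# Hypothetical stand-in for the small, frequently updated advertiser index.
ADS = {
    "b2": {"id": "b2", "coupon": "10% off"},
}


def fetch_ads(keys: List[str]) -> List[Doc]:
    """Return the advertiser records that exist for the requested keys."""
    return [ADS[k] for k in keys if k in ADS]


hits = [
    {"id": "b1", "name": "Acme Hardware"},
    {"id": "b2", "name": "Bob's Diner"},
]
merged = client_side_join(hits, fetch_ads)
```

The ranking order of the primary hits is preserved, so the secondary data
can decorate or filter results but does not take part in scoring.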
