RE: One item, multiple fields, and range queries

David Smiley (@MITRE.org) Mon, 29 Mar 2010 13:55:16 -0700

Steven,

The composite doc idea is an interesting avenue to a solution here that I 
didn't think of.  What's missing is code to group the hits by unique key and 
then intersect the groups in order to get boolean AND behavior between the 
address and primary documents, and then filter out the non-primary documents.  
Perhaps Solr's popular field-collapsing patch would be a starting point.
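
A minimal sketch of that missing group-and-intersect step, written in plain 
Python rather than against any Lucene/Solr API.  It assumes the address query 
and the primary-document query have already been run separately, and that each 
hit is a dict carrying the shared unique key; the names and_join, group_by_key 
and unique_key are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_by_key(hits):
    """Group hits by their shared unique key (the 'group by' step)."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit["unique_key"]].append(hit)
    return groups


def and_join(primary_hits, address_hits):
    """Return only primary docs whose unique key also matched the address query."""
    primary_by_key = group_by_key(primary_hits)
    address_by_key = group_by_key(address_hits)
    # Boolean AND across doc types: the key must appear in both result sets.
    common_keys = primary_by_key.keys() & address_by_key.keys()
    # Filter out the non-primary (address) docs by emitting primary docs only.
    return [doc for key in sorted(common_keys) for doc in primary_by_key[key]]


# Hypothetical hits, modelled on the composite-doc layout quoted below.
primary_hits = [
    {"unique_key": 1, "blob": "blobbedy-blah-blob"},
    {"unique_key": 2, "blob": "some other text"},
]
address_hits = [
    {"unique_key": 1, "street": "12 Main St", "city": "Somewheresville"},
    {"unique_key": 3, "street": "243 13th St", "city": "Bogdownton"},
]
result = and_join(primary_hits, address_hits)
```

Here only the primary document with unique key 1 survives, because it is the 
only key present in both result sets.  Doing this join outside the index means 
both hit lists have to be collected in full, which is the cost that a 
collapsing-style implementation inside Solr would be expected to avoid.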

I realize, of course, that Lucene/Solr isn't a database, but there is plenty of 
gray area in between.

Did you read my original message, where I suggested that a solution might lie 
in intersecting different queries based on common multi-value field offsets 
derived from matching term positions?  I have no idea how far the current 
codebase is from exposing enough information to make such an approach possible.
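
To make the offset-intersection idea concrete, here is a rough sketch in plain 
Python.  It assumes the hard part is already solved: that for each per-field 
query we somehow know, per document, which value offsets (0 for the first 
address, 1 for the second, and so on) produced the match.  Deriving those 
offsets from term positions is exactly the information that may not be exposed 
today, so the input dicts and the function name intersect_by_value_offset are 
purely hypothetical.

```python
def intersect_by_value_offset(*per_query_matches):
    """Keep docs in which every query matched at the same multi-value offset.

    Each argument maps doc id -> set of value offsets where that query matched.
    Returns doc id -> set of offsets common to all queries (non-empty only).
    """
    if not per_query_matches:
        return {}
    # A doc must match every query at all before offsets are compared.
    common_docs = set(per_query_matches[0])
    for matches in per_query_matches[1:]:
        common_docs &= set(matches)
    result = {}
    for doc_id in common_docs:
        offsets = set.intersection(*(set(m[doc_id]) for m in per_query_matches))
        if offsets:  # drop docs whose matches came from different values
            result[doc_id] = offsets
    return result


# Hypothetical match data: doc 1 matched the street query on its first address
# and the city query on its second; doc 2 matched both on its second address.
street_matches = {1: {0}, 2: {1}}
city_matches = {1: {1}, 2: {1}}
matched = intersect_by_value_offset(street_matches, city_matches)
```

Doc 1 is rejected even though both of its fields matched, because the matches 
came from different addresses; doc 2 is kept with offset 1.  That is the 
behavior a plain boolean AND over multi-valued fields cannot give.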

~ David Smiley

From: Steven A Rowe [via Lucene] 
[mailto:ml-node+684371-1863547009-13...@n3.nabble.com]
Sent: Monday, March 29, 2010 4:29 PM
To: Smiley, David W.
Subject: RE: One item, multiple fields, and range queries

Hi David,

On 03/29/2010 at 3:36 PM, David Smiley (@MITRE.org) wrote:
> I'm not sure what to make of "or index using a heterogeneous field
> schema, grouping the different doc type instances with a unique key
> (the one) to form a composite doc"

Lucene is schema-free - you can mix and match different document types in a 
single index.  You could emulate this in Solr by merging the two document types 
and leaving blank the parts that are inapplicable to a given instance.  E.g.:

Address-doc-type:
        Field: Unique-key
        Field: Street
        Field: City
        ...

Everything-else-doc-type:
        Field: Unique-key
        Field: Blob-o'-text
        ...

Doc1: Unique-key: 1; Blob-o'-text: blobbedy-blah-blob; ...
Doc2: Unique-key: 1; Street: 12 Main St; City: Somewheresville; ...
Doc3: Unique-key: 1; Street: 243 13th St; City: Bogdownton; ...
....

> I could use the scheme you mention provided with the spanNear query but
> it conflates different fields into one indexed field which will mess
> with the scoring and make queries like range queries if there are dates
> involved next to impossible.

I agree, dimensional reduction can be an issue, though I'm sure there are use 
cases where the attendant scoring distortion would be acceptable, e.g. 
non-scoring filters.  (Stuffing a variable number of addresses into a single 
document will also "mess with the scoring" unless you turn off norms, which is 
of course another form of scoring-messing.)

I've seen a couple of different mentions of private SpanRangeQuery 
implementations on the mailing lists, so range queries likely wouldn't be a 
problem for long, should it become a general issue.

> This "solution" is really a hack workaround to a limitation in
> Lucene/Solr.  I was hoping to start a conversation to a more
> truer resolution to this problem rather than these workarounds
> which aren't always satisfactory.

Limitation: Solr/Lucene is not a database.

"Solutions":
        1. Hack workaround
        2. Rewrite Solr/Lucene to be a database
        3. ? (fill in "more truer resolution" here)

Good luck,
Steve


-----
 Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
