Steven,

The composite doc idea is an interesting avenue to a solution that I hadn't thought of. What's missing is code to do the group-by and then an intersection, in order to get boolean AND behavior between the addresses and primary documents, and then to filter out the non-primary documents. Perhaps Solr's popular field-collapsing patch would be a starting point.
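
The group-by, intersect, and filter steps described above can be sketched in plain Python. This is only an illustration over in-memory dicts, not Lucene or Solr code, and it says nothing about how the field-collapsing patch works. The field names mirror the Doc1/Doc2/Doc3 example quoted below; the key-2 documents and the helper names `matching_keys` and `composite_and` are hypothetical.

```python
def matching_keys(docs, predicate):
    """Group step: unique keys of all sub-documents that satisfy predicate."""
    return {doc["unique_key"] for doc in docs if predicate(doc)}


def composite_and(docs, primary_pred, address_pred):
    """Boolean AND across doc types, evaluated per unique key.

    A key qualifies only if one primary doc matches primary_pred AND a
    single address doc matches address_pred. Only primary docs are returned.
    """
    # Assumption: a primary doc is one carrying a "blob" field,
    # and an address doc is one carrying a "street" field.
    primary_keys = matching_keys(
        docs, lambda d: "blob" in d and primary_pred(d))
    address_keys = matching_keys(
        docs, lambda d: "street" in d and address_pred(d))

    both = primary_keys & address_keys  # intersection gives AND behavior

    # Filter out the non-primary (address) documents.
    return [d for d in docs if d["unique_key"] in both and "blob" in d]


# Sample data: key 1 follows the quoted example; key 2 is made up.
docs = [
    {"unique_key": 1, "blob": "blobbedy-blah-blob"},
    {"unique_key": 1, "street": "12 Main St", "city": "Somewheresville"},
    {"unique_key": 1, "street": "243 13th St", "city": "Bogdownton"},
    {"unique_key": 2, "blob": "other blob"},
    {"unique_key": 2, "street": "9 Main St", "city": "Bogdownton"},
]

# "Main" street AND city "Bogdownton" must hold within the SAME address.
result = composite_and(
    docs,
    primary_pred=lambda d: "blob" in d["blob"],
    address_pred=lambda d: "Main" in d["street"] and d["city"] == "Bogdownton",
)
print(result)
```

Key 1 is excluded here even though it has a "Main" street and a "Bogdownton" city, because they belong to different addresses. Evaluating the address predicate per sub-document is what avoids the cross-address false match that stuffing every address into one document would produce.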

I realize of course that Lucene/Solr isn't a database, but there is plenty of gray area in between. Did you read my original message, where I suggested that a solution might lie in intersecting different queries based on common multi-value field offsets derived from matching term positions? I have no idea how far the current codebase is from exposing enough information to make such an approach possible.

~ David Smiley

From: Steven A Rowe [via Lucene] [mailto:ml-node+684371-1863547009-13...@n3.nabble.com]
Sent: Monday, March 29, 2010 4:29 PM
To: Smiley, David W.
Subject: RE: One item, multiple fields, and range queries

Hi David,

On 03/29/2010 at 3:36 PM, David Smiley (@MITRE.org) wrote:
> I'm not sure what to make of "or index using a heterogeneous field
> schema, grouping the different doc type instances with a unique key
> (the one) to form a composite doc"

Lucene is schema-free - you can mix and match different document types in a single index. You could emulate this in Solr by merging the two document types and leaving blank the parts that are inapplicable to a given instance. E.g.:

Address-doc-type:
  Field: Unique-key
  Field: Street
  Field: City
  ...

Everything-else-doc-type:
  Field: Unique-key
  Field: Blob-o'-text
  ...

Doc1: Unique-key: 1; Blob-o'-text: blobbedy-blah-blob; ...
Doc2: Unique-key: 1; Street: 12 Main St; City: Somewheresville; ...
Doc3: Unique-key: 1; Street: 243 13th St; City: Bogdownton; ...
....

> I could use the scheme you mention provided with the spanNear query but
> it conflates different fields into one indexed field which will mess
> with the scoring and make queries like range queries if there are dates
> involved next to impossible.

I agree, dimensional reduction can be an issue, though I'm sure there are use cases where the attendant scoring distortion would be acceptable, e.g. non-scoring filters.
(Stuffing a variable number of addresses into a single document will also "mess with the scoring" unless you turn off norms, which is of course another form of scoring-messing.)

I've seen a couple of different mentions of private SpanRangeQuery implementations on the mailing lists, so range queries likely wouldn't be a problem for long, should it become a general issue.

> This "solution" is really a hack workaround to a limitation in
> Lucene/Solr. I was hoping to start a conversation to a more
> truer resolution to this problem rather than these workarounds
> which aren't always satisfactory.

Limitation: Solr/Lucene is not a database.

"Solutions":

1. Hack workaround
2. Rewrite Solr/Lucene to be a database
3. ? (fill in "more truer resolution" here)

Good luck,
Steve

________________________________
View message @ http://n3.nabble.com/One-item-multiple-fields-and-range-queries-tp475030p684371.html

-----
Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
--
View this message in context: http://n3.nabble.com/One-item-multiple-fields-and-range-queries-tp475030p684415.html
Sent from the Solr - User mailing list archive at Nabble.com.