This approach works (I do something similar with Solr), but you have
to be careful: a BooleanQuery.TooManyClauses exception can be thrown
depending on where you put the wildcard. It should be fine in the
case you described, however. Anyway, there is a pretty interesting
discussion of this here:
http://www.usit.uio.no/it/vortex/arbeidsomrader/metadata/lucene/limitations.html
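For anyone wondering where the exception comes from: in Lucene
versions of this vintage, a prefix or wildcard query is rewritten
into a BooleanQuery with one clause per matching indexed term, and
BooleanQuery rejects anything over its maximum clause count (1024 by
default). Here is a rough plain-Java imitation of that expansion. It
uses no Lucene classes, and PrefixExpansionSketch, expand and
MAX_CLAUSES are names made up for illustration, so treat it as a
sketch of the idea rather than what Lucene actually runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for prefix-query expansion; not Lucene code.
class PrefixExpansionSketch {
    // Assumed limit, mirroring Lucene's default maximum clause count.
    static final int MAX_CLAUSES = 1024;

    // Collects every indexed term starting with the prefix, one "clause" per term.
    static List<String> expand(TreeSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // In sorted order, all terms sharing a prefix sit next to each other,
        // starting at the first term that is >= the prefix itself.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            if (clauses.size() >= maxClauses) {
                // Plays the role of BooleanQuery.TooManyClauses in this sketch.
                throw new IllegalStateException("too many clauses for prefix " + prefix);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>(Arrays.asList(
                "Products/Cases/CellPhones",
                "Products/Cases/Laptops",
                "Products/Computers/Desktops",
                "Products/Computers/Laptops"));
        // A narrow prefix expands to only a couple of clauses.
        System.out.println(expand(terms, "Products/Computers/", MAX_CLAUSES));
    }
}
```

The further left the wildcard sits, the more terms it matches, which
is why a prefix deep in the hierarchy is usually safe while a very
short one may not be.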
Brendan
On Dec 17, 2007, at 10:39 PM, George Everitt wrote:
On Dec 13, 2007, at 1:56 AM, Chris Hostetter wrote:
i.e., if this is your hierarchy...
Products/
Products/Computers/
Products/Computers/Laptops
Products/Computers/Desktops
Products/Cases
Products/Cases/Laptops
Products/Cases/CellPhones
Then this trick won't work (because Laptops appears twice), but if
you have numeric IDs that correspond to each of those categories (so
that the two instances of Laptops are unique)...
1/
1/2/
1/2/3
1/2/4
1/5/
1/5/6
1/5/7
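A minimal sketch of how such ID paths could be produced, assuming
IDs are simply handed out in the order each distinct full path is
first seen. CategoryIdPaths and toIdPath are hypothetical names, not
anything from Solr or Lucene, and this version omits the trailing
slashes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: turns name paths into numeric ID paths.
class CategoryIdPaths {
    // Maps each distinct full path (e.g. "Products/Cases") to its own ID,
    // so the two "Laptops" nodes get different numbers.
    private final Map<String, Integer> ids = new HashMap<>();

    String toIdPath(String namePath) {
        StringBuilder fullPath = new StringBuilder();
        StringBuilder idPath = new StringBuilder();
        for (String name : namePath.split("/")) {
            if (name.isEmpty()) {
                continue; // tolerate doubled or leading slashes
            }
            if (fullPath.length() > 0) {
                fullPath.append('/');
            }
            fullPath.append(name);
            String key = fullPath.toString();
            Integer id = ids.get(key);
            if (id == null) {
                id = ids.size() + 1; // next unused ID
                ids.put(key, id);
            }
            if (idPath.length() > 0) {
                idPath.append('/');
            }
            idPath.append(id);
        }
        return idPath.toString();
    }

    public static void main(String[] args) {
        CategoryIdPaths mapper = new CategoryIdPaths();
        String[] hierarchy = {
            "Products/", "Products/Computers/", "Products/Computers/Laptops",
            "Products/Computers/Desktops", "Products/Cases",
            "Products/Cases/Laptops", "Products/Cases/CellPhones"
        };
        for (String path : hierarchy) {
            System.out.println(path + " -> " + mapper.toIdPath(path));
        }
    }
}
```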
Why not just use the whole path as the unique identifying token for
a given node on the hierarchy? That way, you don't need to map
nodes to unique numbers, just use a prefix query.
taxonomy:Products/Computers/Laptops* or taxonomy:Products/Cases/Laptops*
Sorry - that may be bogus query syntax, but you get the idea.
Products/Computers/Laptops* and Products/Cases/Laptops* are two
unique identifiers. You just need to make sure they are tokenized
properly - which is beyond my current off-the-cuff expertise.
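To make the tokenization point concrete, here is a small plain-Java
sketch, assuming each path is kept as one unsplit token (in Solr
that would typically mean an untokenized field type such as
"string"). WholePathTokens, prefixSearch and lastSegment are
hypothetical names, and prefixSearch only imitates what a prefix
query would do.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a "taxonomy" field; not Solr code.
class WholePathTokens {
    // docId -> taxonomy value, stored as a single untokenized string.
    private final Map<String, String> taxonomyByDoc = new LinkedHashMap<>();

    void add(String docId, String path) {
        taxonomyByDoc.put(docId, path);
    }

    // Imitates a prefix query such as taxonomy:Products/Computers/Laptops*
    List<String> prefixSearch(String prefix) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : taxonomyByDoc.entrySet()) {
            if (entry.getValue().startsWith(prefix)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    // What is left of a path if a tokenizer splits on "/": just the bare name.
    static String lastSegment(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        WholePathTokens index = new WholePathTokens();
        index.add("doc1", "Products/Computers/Laptops");
        index.add("doc2", "Products/Cases/Laptops");
        // Whole-path tokens keep the two Laptops nodes apart.
        System.out.println(index.prefixSearch("Products/Computers/Laptops"));
        System.out.println(index.prefixSearch("Products/Cases/Laptops"));
    }
}
```

If the field were split on "/" instead, both documents would carry
the same bare token "Laptops" and the distinction would be lost.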
At least that is the way I've been doing it with IDOL lately. I
dearly hope I can do the same in Solr when the time comes.
I have a whole mess of Java code which parses out arbitrary
path-separated values into real tree structures. I think it would be a
useful addition to Solr, or maybe Solrj. It's been knocking around
my hard drives for the better part of a decade. If I get enough
interest, I'll clean it up and figure out how to offer it up as a
part of the code base. I'm pretty naive when it comes to FLOSS, so
any authoritative non-condescending hints on how to go about this
would be greatly appreciated.
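To give a flavor of what such a parser might look like (this is a
minimal illustration written for this thread, not the code described
above), the sketch below folds separator-delimited paths into a tree
of named nodes. PathTree, addPath and render are hypothetical names.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical tree node built from path-separated values.
class PathTree {
    final String name;
    // Children sorted by name so output order is stable.
    final Map<String, PathTree> children = new TreeMap<>();

    PathTree(String name) {
        this.name = name;
    }

    // Walks the path one segment at a time, creating nodes as needed.
    void addPath(String path, String separator) {
        PathTree node = this;
        // Pattern.quote keeps separators like "." or "|" from acting as regex.
        for (String part : path.split(Pattern.quote(separator))) {
            if (part.isEmpty()) {
                continue; // skip empty segments from doubled separators
            }
            PathTree child = node.children.get(part);
            if (child == null) {
                child = new PathTree(part);
                node.children.put(part, child);
            }
            node = child;
        }
    }

    // Renders the subtree below this node as an indented outline.
    String render() {
        StringBuilder out = new StringBuilder();
        render(out, 0);
        return out.toString();
    }

    private void render(StringBuilder out, int depth) {
        for (PathTree child : children.values()) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(child.name).append('\n');
            child.render(out, depth + 1);
        }
    }

    public static void main(String[] args) {
        PathTree root = new PathTree("");
        for (String path : Arrays.asList(
                "Products/Computers/Laptops", "Products/Computers/Desktops",
                "Products/Cases/Laptops", "Products/Cases/CellPhones")) {
            root.addPath(path, "/");
        }
        System.out.print(root.render());
    }
}
```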
Regards,
George