I designed and built the taxonomy and classification support in the Ultraseek search engine.

There are many kinds of taxonomies, even different "shapes": tree, DAG, facets, tree + links (e.g. ANSI/NISO Z39.19, LCSH, Yahoo directory), and even mixtures of those. It would be a serious limitation to support only one. It would be a mess to try to support all of them.

Luckily, it works very well to keep the classification separate and tag the documents with category membership. Building a long string is not hard.

One feature that would be very useful is the ability to update the category tag after the document has been indexed. We ran into that need again and again when implementing taxonomies at Verity. For example, you can build a nearest neighbor classifier using More Like This, but you need to index and commit the doc before you can run an MLT search and discover the category. Then you need to delete it and reindex it with the category tag.

wunder

On 12/12/08 5:16 AM, "Jana, Kumar Raja" <kj...@ptc.com> wrote:

> Thanks all. This workaround was very helpful for my case.
>
> However, it would be wonderful if there was a way to make Solr have a
> copy of my classification so that I need not create a big string at the
> client side every time I need to index a document. I am sure there are
> many others out there who do the following on the client side burdening
> their already overburdened client servers:
>
> Here's something I would love to see on Solr if possible:
> 1. A way to send my classification to Solr.
>    a. The classification has nodes which have their own fields (and
>       default values)
>    b. Images or static files of such sort associated with the nodes
> 2. Update the classification every time I want to change it.
>
> During indexing of documents:
> 1. Give the name of the node my document belongs to, say a special field
>    named <classificationNode>. Solr would index the document just as any
>    other document or do some optimized indexing to return results faster.
>
> During Search:
> 1. Solr matches my query and checks if any of the documents being
>    returned are classified.
> 2. If a classified document is found, return all the documents which are
>    classified with the child nodes (as well as with the document's node -
>    this can be optional)
>
>
> I feel taxonomy support would be a welcome feature in Solr. What do the
> developers say?
>
> -Kumar
>
> -----Original Message-----
> From: Alexander Ramos Jardim [mailto:alexander.ramos.jar...@gmail.com]
> Sent: Friday, December 12, 2008 12:04 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Taxonomy Support on Solr
>
> I use this workaround all the time.
>
> When I need to store the hierarchy to which a product belongs, I simply
> arrange all the nodes as: "a ^ b ^ c ^ d"
>
> 2008/12/11 Otis Gospodnetic <otis_gospodne...@yahoo.com>
>
>> This is what Hoss was hinting at yesterday (or was that on the Lucene
>> list?). You can do that if you encode the hierarchy in a field
>> properly, e.g. "/A /B /1" may be one doc's field. "/A /B /2" may be
>> another doc's field. Then you just have to figure out how to query
>> that to get a sub-tree.
>>
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>>
>> ----- Original Message ----
>>> From: "Jana, Kumar Raja" <kj...@ptc.com>
>>> To: solr-user@lucene.apache.org
>>> Sent: Thursday, December 11, 2008 5:03:02 AM
>>> Subject: Taxonomy Support on Solr
>>>
>>> Hi,
>>>
>>> Any plans of supporting user-defined classifications on Solr? Is
>>> there any component which returns all the children of a node (till
>>> the leaf node) when I search for any node?
>>>
>>> Maybe this would help:
>>>
>>> Say I have a few SolrDocuments classified as:
>>>
>>>                   A
>>>      B--------------------------C
>>> 1----2----3                 8------9
>>>
>>> (i.e. A has 2 child nodes, B and C. B has 3 child nodes, 1, 2 and 3,
>>> and C has 2 child nodes, 8 and 9.) When my search criteria matches B,
>>> my results should contain B as well as 1, 2 and 3 too.
>>> Search for A would return all the nodes mentioned above.
>>>
>>> -Kumar
>>
>>
>
>
> --
> Alexander Ramos Jardim
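
P.S. As a rough illustration of the "tag the documents with category membership" workaround discussed in this thread, here is a minimal Python sketch. It assumes each document lives at exactly one node of a tree and is tagged with its own path plus every ancestor path, so that a sub-tree lookup becomes an exact match on one tag. The names ancestor_paths and search_subtree are made up for this example, and the in-memory dict only stands in for an index; none of this is Solr API.

```python
def ancestor_paths(path, sep="/"):
    """Return the path of every ancestor plus the node itself, root first.

    "/A/B/1" -> ["/A", "/A/B", "/A/B/1"]
    """
    parts = [p for p in path.strip(sep).split(sep) if p]
    return [sep + sep.join(parts[:i]) for i in range(1, len(parts) + 1)]


def search_subtree(docs, node):
    """Return ids of all documents tagged with `node` (hypothetical helper).

    Because every document carries its ancestors' paths as tags, matching
    one tag is enough to retrieve the whole sub-tree below `node`.
    """
    return sorted(doc_id for doc_id, tags in docs.items() if node in tags)


# Stand-in for an index: doc id -> category tags (assumed multi-valued field).
docs = {
    "doc1": ancestor_paths("/A/B/1"),
    "doc2": ancestor_paths("/A/B/2"),
    "doc3": ancestor_paths("/A/C/8"),
}

if __name__ == "__main__":
    print(search_subtree(docs, "/A/B"))
    print(search_subtree(docs, "/A"))
```

With tags built this way, searching for B in Kumar's example would amount to an exact match on "/A/B", and searching for A to a match on "/A". For a DAG or faceted taxonomy the same idea should carry over by tagging a document with the paths of every node it belongs to, though that is an untested assumption here.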