That sounds like exactly what I'm looking for! However, I cannot find an example of how to use it. Could you help me, please? Also, regarding the id field: isn't it true that the id field shouldn't be analyzed, as suggested in http://wiki.apache.org/solr/UniqueKey#Text_field_in_the_document?
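To make the goal concrete, here is a minimal Python sketch of the URL components I would like to end up with as separate, queryable fields. It is not Solr code and says nothing about how URLClassifyProcessor works internally; the function name `domain_levels` and the key names are made up for illustration. It naively assumes that the last host label is the 1st-level domain and the last two labels are the 2nd-level domain, which is wrong for suffixes such as co.uk.

```python
from urllib.parse import urlsplit


def domain_levels(url):
    """Split a URL's host into the parts I would like to query on.

    Hypothetical helper for illustration only: the naive label counting
    below does not handle multi-part public suffixes (e.g. co.uk).
    """
    # hostname is None when the string has no network location part
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".") if host else []
    return {
        "host": host,                       # e.g. www.somesite.com
        "tld": labels[-1] if labels else "",  # 1st level, e.g. com
        "domain": ".".join(labels[-2:]),    # 2nd level, e.g. somesite.com
    }


if __name__ == "__main__":
    print(domain_levels("http://www.somesite.com/some/page.html"))
```

With fields like these stored next to the untouched url key, a query for *.it or *.somesite.com would become an exact match on the `tld` or `domain` value instead of a pattern match on the whole url.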

On Tue, Jun 25, 2013 at 2:47 PM, Jan Høydahl <jan....@cominvent.com> wrote:

> Sure you can query the url directly. Or if you choose you can split it up
> in multiple components, e.g. using
> http://lucene.apache.org/solr/4_3_0/solr-core/org/apache/solr/update/processor/URLClassifyProcessor.html
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> On 25 June 2013 at 14:10, Flavio Pompermaier <pomperma...@okkam.it> wrote:
>
> > Sorry, but maybe I'm missing something here. Could I declare url as the
> > key field and query it too?
> > At the moment, my schema.xml looks like:
> >
> > <fields>
> >   <field name="url" type="string" indexed="true" stored="true"
> >          required="true" multiValued="false" />
> >
> >   <field name="category" type="string" indexed="true" stored="true"/>
> >   <field name="language" type="string" indexed="true" stored="true"/>
> >   ...
> >   <field name="_version_" type="long" indexed="true" stored="true"/>
> >
> > </fields>
> > <uniqueKey>url</uniqueKey>
> >
> > Is it ok? Or should I add a "baseurl" field of some kind to be able to
> > query all urls coming from a certain domain (1st or 2nd level as well)?
> >
> > Best,
> > Flavio
> >
> > On Tue, Jun 25, 2013 at 12:28 PM, Jan Høydahl <jan....@cominvent.com> wrote:
> >
> >> Probably a good match for the RegExp feature of Solr (given that your
> >> url is not tokenized)
> >> e.g. q=url:/.*\.it$/
> >>
> >> On 25 June 2013 at 12:17, Flavio Pompermaier <pomperma...@okkam.it> wrote:
> >>
> >>> Hi everybody,
> >>> I'm quite new to Solr, so maybe my question is trivial for you.
> >>> In my use case I have to index content found at some URLs, so I use
> >>> the url as the key of my document and treat it as a string.
> >>>
> >>> However, I'd like to be able to query by domain name, like *.it or
> >>> *.somesite.com. What's the best strategy? I thought of doing a
> >>> URL-to-path transformation and indexing it with
> >>> solr.PathHierarchyTokenizerFactory, but maybe there's a simpler
> >>> solution, isn't there?
> >>>
> >>> Best,
> >>> Flavio
> >>>
> >>> --
> >>> Flavio Pompermaier
> >>> Development Department
> >>> OKKAM Srl - www.okkam.it
> >>>
> >>> Phone: +(39) 0461 283 702
> >>> Fax: + (39) 0461 186 6433
> >>> Email: f.pomperma...@okkam.it
> >>> Headquarters: Trento (Italy), fraz. Villazzano, Salita dei Molini 2
> >>> Registered office: Trento (Italy), via Segantini 23
> >>>
> >>> Confidentiality notice. This e-mail transmission may contain legally
> >>> privileged and/or confidential information. Please do not read it if
> >>> you are not the intended recipient(s). Any use, distribution,
> >>> reproduction or disclosure by any other person is strictly prohibited.
> >>> If you have received this e-mail in error, please notify the sender
> >>> and destroy the original transmission and its attachments without
> >>> reading or saving it in any manner.

--
Flavio Pompermaier