reusing docset to limit new query
I'm creating a custom handler where I have a base query and a resulting DocListAndSet. I need to do some extra queries to get top results per facet. There are 2 cases: 1. The sorting used for the top results for a particular facet is the same as the sorting used for the already returned DocListAndSet. This means that I can return a DocSlice of the DocList (contained in the DocListAndSet) after doing some intersections. This is quick and works well. 2. The sorting is different. In this case I need to do the query again (I think, please let me know if there's a better option), using SolrIndexSearcher.getDocList(...). I'm looking for a way to tell the SolrIndexSearcher that it can limit its query (including sorting) to the docset that I got in case 1 (original docset + some intersections), because I figured it must be quicker (is it?). I've found a method SolrIndexSearcher.cacheDocSet(..) but am not entirely sure what it does (side effects?). Can someone please elaborate on this? Britske -- View this message in context: http://www.nabble.com/reusing-docset-to-limit-new-query-tp16721670p16721670.html Sent from the Solr - User mailing list archive at Nabble.com.
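For case 2 above, a minimal sketch of what restricting the re-query to an already-computed docset might look like, assuming the Solr 1.2/1.3-era SolrIndexSearcher API (the getDocList overload that takes a filter DocSet); baseDocSet, facetFilter and facetSort are placeholder names for this example, not part of any existing handler:

// Sketch only -- assumes the getDocList(query, filter, sort, offset, len)
// overload of SolrIndexSearcher; adjust to whatever your handler already has.
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Sort;
import org.apache.solr.search.DocList;
import org.apache.solr.search.DocSet;
import org.apache.solr.search.SolrIndexSearcher;

public class FacetTopResults {
  public DocList topResultsForFacet(SolrIndexSearcher searcher,
                                    Query baseQuery,
                                    DocSet baseDocSet,   // docs matched by the base query
                                    DocSet facetFilter,  // docs belonging to this facet value
                                    Sort facetSort) throws java.io.IOException {
    // Case 2: a different sort order, so the search is redone, but it is
    // restricted to the documents already known to match.
    DocSet restriction = baseDocSet.intersection(facetFilter);

    // Only documents present in the filter DocSet are collected and sorted,
    // which should be cheaper than re-running against the whole index.
    return searcher.getDocList(baseQuery, restriction, facetSort, 0, 10);
  }
}

As for cacheDocSet(..), it appears to simply pre-populate the filter cache with the DocSet for a given query so that later getDocSet/getDocList calls can reuse it, but that is from a quick read of the source, so treat it as an assumption rather than a definitive answer.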
Re: too many queries?
So I counted the number of distinct values that I have for each field that I want a facet on. In total it's around 100,000. I tried with a filterCache of 120,000 but it seems like too much because the server went down. I will try with less, around 75,000 and let you know. How do you partition the data into a static set and a dynamic set, and then combine them at query time? Do you have a link to read about that? On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]> wrote: > On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote: > > > My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is 32 > > bits). > > It is optimized twice a day, it takes around 15 minutes to optimize. > > The index is updated (commits) every two minutes. There are between 10 > > and > > 100 inserts/updates every 2 minutes. > > > > Caching could help--you should definitely start there. > > The commit every 2 minutes could end up being an unsurmountable problem. > You may have to partition your data into a large, mostly static set and a > small dynamic set, combining the results at query time. > > -Mike >
Re: too many queries?
Jonathan Ariel wrote: How do you partition the data into a static set and a dynamic set, and then combine them at query time? Do you have a link to read about that? One way would be distributed search (SOLR-303), but distributed idf is not part of the current patch anymore, so you may have some issues combining documents from the two sets as the collection statistics for the two are likely to be different. It sounds like distributed idf may be added back in the near future, as there was some chatter about it again on the dev list. -Sean
Re: too many queries?
A commit every two minutes means that the Solr caches are flushed before they even start to stabilize. Two things to try: * commit less often, 5 minutes or 10 minutes * have enough RAM that your entire index can fit in OS file buffers wunder On 4/16/08 6:27 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: > So I counted the number if distinct values that I have for each field that I > want a facet on. In total it's around 100,000. I tried with a filterCache > of 120,000 but it seems like too much because the server went down. I will > try with less, around 75,000 and let you know. > > How do you to partition the data to a static set and a dynamic set, and then > combining them at query time? Do you have a link to read about that? > > > > On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]> wrote: > >> On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote: >> >>> My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is 32 >>> bits). >>> It is optimized twice a day, it takes around 15 minutes to optimize. >>> The index is updated (commits) every two minutes. There are between 10 >>> and >>> 100 inserts/updates every 2 minutes. >>> >> >> Caching could help--you should definitely start there. >> >> The commit every 2 minutes could end up being an unsurmountable problem. >> You may have to partition your data into a large, mostly static set and a >> small dynamic set, combining the results at query time. >> >> -Mike >>
Re: too many queries?
In order to do that I have to change to a 64-bit OS so I can have more than 4 GB of RAM. Is there any way to see how long it takes Solr to warm up the searcher? On Wed, Apr 16, 2008 at 11:40 AM, Walter Underwood <[EMAIL PROTECTED]> wrote: > A commit every two minutes means that the Solr caches are flushed > before they even start to stabilize. Two things to try: > > * commit less often, 5 minutes or 10 minutes > * have enough RAM that your entire index can fit in OS file buffers > > wunder > > On 4/16/08 6:27 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: > > > So I counted the number if distinct values that I have for each field > that I > > want a facet on. In total it's around 100,000. I tried with a > filterCache > > of 120,000 but it seems like too much because the server went down. I > will > > try with less, around 75,000 and let you know. > > > > How do you to partition the data to a static set and a dynamic set, and > then > > combining them at query time? Do you have a link to read about that? > > > > > > > > On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]> > wrote: > > > >> On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote: > >> > >>> My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is 32 > >>> bits). > >>> It is optimized twice a day, it takes around 15 minutes to optimize. > >>> The index is updated (commits) every two minutes. There are between 10 > >>> and > >>> 100 inserts/updates every 2 minutes. > >>> > >> > >> Caching could help--you should definitely start there. > >> > >> The commit every 2 minutes could end up being an unsurmountable > problem. > >> You may have to partition your data into a large, mostly static set > and a > >> small dynamic set, combining the results at query time. > >> > >> -Mike > >> > >
Re: too many queries?
Is there any way to know how much memory is being used in caches? On Wed, Apr 16, 2008 at 11:50 AM, Jonathan Ariel <[EMAIL PROTECTED]> wrote: > In order to do that I have to change to a 64 bits OS so I can have more > than 4 GB of RAM.Is there any way to see how long does it takes to Solr to > warmup the searcher? > > > On Wed, Apr 16, 2008 at 11:40 AM, Walter Underwood <[EMAIL PROTECTED]> > wrote: > > > A commit every two minutes means that the Solr caches are flushed > > before they even start to stabilize. Two things to try: > > > > * commit less often, 5 minutes or 10 minutes > > * have enough RAM that your entire index can fit in OS file buffers > > > > wunder > > > > On 4/16/08 6:27 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: > > > > > So I counted the number if distinct values that I have for each field > > that I > > > want a facet on. In total it's around 100,000. I tried with a > > filterCache > > > of 120,000 but it seems like too much because the server went down. I > > will > > > try with less, around 75,000 and let you know. > > > > > > How do you to partition the data to a static set and a dynamic set, > > and then > > > combining them at query time? Do you have a link to read about that? > > > > > > > > > > > > On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]> > > wrote: > > > > > >> On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote: > > >> > > >>> My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is > > 32 > > >>> bits). > > >>> It is optimized twice a day, it takes around 15 minutes to optimize. > > >>> The index is updated (commits) every two minutes. There are between > > 10 > > >>> and > > >>> 100 inserts/updates every 2 minutes. > > >>> > > >> > > >> Caching could help--you should definitely start there. > > >> > > >> The commit every 2 minutes could end up being an unsurmountable > > problem. > > >> You may have to partition your data into a large, mostly static set > > and a > > >> small dynamic set, combining the results at query time. > > >> > > >> -Mike > > >> > > > > >
Re: too many queries?
Do it. 32-bit OS's went out of style five years ago in server-land. I would start with 8GB of RAM. 4GB for your index, 2 for Solr, 1 for the OS and 1 for other processes. That might be tight. 12GB would be a lot better. wunder On 4/16/08 7:50 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: > In order to do that I have to change to a 64 bits OS so I can have more than > 4 GB of RAM.Is there any way to see how long does it takes to Solr to warmup > the searcher? > > On Wed, Apr 16, 2008 at 11:40 AM, Walter Underwood <[EMAIL PROTECTED]> > wrote: > >> A commit every two minutes means that the Solr caches are flushed >> before they even start to stabilize. Two things to try: >> >> * commit less often, 5 minutes or 10 minutes >> * have enough RAM that your entire index can fit in OS file buffers >> >> wunder >> >> On 4/16/08 6:27 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: >> >>> So I counted the number if distinct values that I have for each field >> that I >>> want a facet on. In total it's around 100,000. I tried with a >> filterCache >>> of 120,000 but it seems like too much because the server went down. I >> will >>> try with less, around 75,000 and let you know. >>> >>> How do you to partition the data to a static set and a dynamic set, and >> then >>> combining them at query time? Do you have a link to read about that? >>> >>> >>> >>> On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]> >> wrote: >>> On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote: > My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is 32 > bits). > It is optimized twice a day, it takes around 15 minutes to optimize. > The index is updated (commits) every two minutes. There are between 10 > and > 100 inserts/updates every 2 minutes. > Caching could help--you should definitely start there. The commit every 2 minutes could end up being an unsurmountable >> problem. You may have to partition your data into a large, mostly static set >> and a small dynamic set, combining the results at query time. -Mike >> >>
Re: Fuzzy queries in dismax specs?
It is working, but I disabled recursive field aliasing. Two questions: * Is it possible to do recursive field aliasing from solrconfig.xml? * If not, do we want to preserve this speculative feature? I think the answers are "no" and "no", but I'd like a second opinion. wunder On 4/15/08 10:23 AM, "Chris Hostetter" <[EMAIL PROTECTED]> wrote: > > : I've started implementing something to use fuzzy queries for selected fields > : in dismax. The request handler spec looks like this: > : > :exact~0.7^4.0 stemmed^2.0 > > that's a pretty cool idea ... usually when people talk about adding > support for other querytypes in dismax they mean to the query syntax, but > you are adding more info to the qf to specify how the field should be > handled in general -- i like it. > > i think if i had it to do over again (now that dismax supports multiple > param values, and per field overrides) i would have made qf and pf > multivalued params containing just the field names, and gotten the boost > value from a per field overridable fieldBoost param, so adding a > fuzzyDistance param would also be trivial (without needing to parse crazy > syntax) > > (hmmm... ps could be a per field overridable field too ... dismax v2.0 > maybe) > > > -Hoss
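A rough sketch of how a qf entry like exact~0.7^4.0 might be parsed and turned into a per-term clause, assuming the Lucene FuzzyQuery/TermQuery APIs of that era; the class and method names here are illustrative only, not the actual patch:

import org.apache.lucene.index.Term;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// Illustrative only: parses "field~minSim^boost", where both suffixes are optional.
class FuzzyQfEntry {
  final String field;
  final float minSimilarity; // NaN means no fuzziness requested for this field
  final float boost;

  FuzzyQfEntry(String spec) {
    float b = 1.0f, sim = Float.NaN;
    int caret = spec.indexOf('^');
    if (caret >= 0) {
      b = Float.parseFloat(spec.substring(caret + 1));
      spec = spec.substring(0, caret);
    }
    int tilde = spec.indexOf('~');
    if (tilde >= 0) {
      sim = Float.parseFloat(spec.substring(tilde + 1));
      spec = spec.substring(0, tilde);
    }
    field = spec;
    minSimilarity = sim;
    boost = b;
  }

  // The clause this field contributes for a single query term.
  Query clauseFor(String termText) {
    Term t = new Term(field, termText);
    Query q = Float.isNaN(minSimilarity) ? new TermQuery(t)
                                         : new FuzzyQuery(t, minSimilarity);
    q.setBoost(boost);
    return q;
  }
}

Each user term would then get one such clause per qf field, combined with a DisjunctionMaxQuery the same way dismax already combines plain term clauses.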
Re: too many queries?
Hello. I am having a similar problem as the OP. I see that you recommended setting 4GB for the index, and 2 for Solr. How do I allocate memory for the index? I was under the impression that Solr did not support a RAMIndex. Walter Underwood wrote: > > Do it. 32-bit OS's went out of style five years ago in server-land. > > I would start with 8GB of RAM. 4GB for your index, 2 for Solr, 1 for > the OS and 1 for other processes. That might be tight. 12GB would > be a lot better. > > wunder > > On 4/16/08 7:50 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: > >> In order to do that I have to change to a 64 bits OS so I can have more >> than >> 4 GB of RAM.Is there any way to see how long does it takes to Solr to >> warmup >> the searcher? >> >> On Wed, Apr 16, 2008 at 11:40 AM, Walter Underwood >> <[EMAIL PROTECTED]> >> wrote: >> >>> A commit every two minutes means that the Solr caches are flushed >>> before they even start to stabilize. Two things to try: >>> >>> * commit less often, 5 minutes or 10 minutes >>> * have enough RAM that your entire index can fit in OS file buffers >>> >>> wunder >>> >>> On 4/16/08 6:27 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: >>> So I counted the number if distinct values that I have for each field >>> that I want a facet on. In total it's around 100,000. I tried with a >>> filterCache of 120,000 but it seems like too much because the server went down. I >>> will try with less, around 75,000 and let you know. How do you to partition the data to a static set and a dynamic set, and >>> then combining them at query time? Do you have a link to read about that? On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]> >>> wrote: > On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote: > >> My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is >> 32 >> bits). >> It is optimized twice a day, it takes around 15 minutes to optimize. >> The index is updated (commits) every two minutes. There are between >> 10 >> and >> 100 inserts/updates every 2 minutes. >> > > Caching could help--you should definitely start there. > > The commit every 2 minutes could end up being an unsurmountable >>> problem. > You may have to partition your data into a large, mostly static set >>> and a > small dynamic set, combining the results at query time. > > -Mike > >>> >>> > > > -- View this message in context: http://www.nabble.com/too-many-queries--tp16690870p16727264.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: too many queries?
4GB for the operating system to use to buffer disk files. That is not a Solr setting. wunder On 4/16/08 11:05 AM, "oleg_gnatovskiy" <[EMAIL PROTECTED]> wrote: > > Hello. I am having a similar problem as the OP. I see that you recommended > setting 4GB for the index, and 2 for Solr. How do I allocate memory for the > index? I was under the impression that Solr did not support a RAMIndex. > > > Walter Underwood wrote: >> >> Do it. 32-bit OS's went out of style five years ago in server-land. >> >> I would start with 8GB of RAM. 4GB for your index, 2 for Solr, 1 for >> the OS and 1 for other processes. That might be tight. 12GB would >> be a lot better. >> >> wunder >> >> On 4/16/08 7:50 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: >> >>> In order to do that I have to change to a 64 bits OS so I can have more >>> than >>> 4 GB of RAM.Is there any way to see how long does it takes to Solr to >>> warmup >>> the searcher? >>> >>> On Wed, Apr 16, 2008 at 11:40 AM, Walter Underwood >>> <[EMAIL PROTECTED]> >>> wrote: >>> A commit every two minutes means that the Solr caches are flushed before they even start to stabilize. Two things to try: * commit less often, 5 minutes or 10 minutes * have enough RAM that your entire index can fit in OS file buffers wunder On 4/16/08 6:27 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: > So I counted the number if distinct values that I have for each field that I > want a facet on. In total it's around 100,000. I tried with a filterCache > of 120,000 but it seems like too much because the server went down. I will > try with less, around 75,000 and let you know. > > How do you to partition the data to a static set and a dynamic set, and then > combining them at query time? Do you have a link to read about that? > > > > On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]> wrote: > >> On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote: >> >>> My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is >>> 32 >>> bits). >>> It is optimized twice a day, it takes around 15 minutes to optimize. >>> The index is updated (commits) every two minutes. There are between >>> 10 >>> and >>> 100 inserts/updates every 2 minutes. >>> >> >> Caching could help--you should definitely start there. >> >> The commit every 2 minutes could end up being an unsurmountable problem. >> You may have to partition your data into a large, mostly static set and a >> small dynamic set, combining the results at query time. >> >> -Mike >> >> >> >>
Re: too many queries?
Oleg, you can't explicitly say "N GB for index". Wunder was just giving a rough idea of how much RAM each piece might need. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: oleg_gnatovskiy <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Wednesday, April 16, 2008 2:05:23 PM Subject: Re: too many queries? Hello. I am having a similar problem as the OP. I see that you recommended setting 4GB for the index, and 2 for Solr. How do I allocate memory for the index? I was under the impression that Solr did not support a RAMIndex. Walter Underwood wrote: > > Do it. 32-bit OS's went out of style five years ago in server-land. > > I would start with 8GB of RAM. 4GB for your index, 2 for Solr, 1 for > the OS and 1 for other processes. That might be tight. 12GB would > be a lot better. > > wunder > > On 4/16/08 7:50 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: > >> In order to do that I have to change to a 64 bits OS so I can have more >> than >> 4 GB of RAM.Is there any way to see how long does it takes to Solr to >> warmup >> the searcher? >> >> On Wed, Apr 16, 2008 at 11:40 AM, Walter Underwood >> <[EMAIL PROTECTED]> >> wrote: >> >>> A commit every two minutes means that the Solr caches are flushed >>> before they even start to stabilize. Two things to try: >>> >>> * commit less often, 5 minutes or 10 minutes >>> * have enough RAM that your entire index can fit in OS file buffers >>> >>> wunder >>> >>> On 4/16/08 6:27 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: >>> So I counted the number if distinct values that I have for each field >>> that I want a facet on. In total it's around 100,000. I tried with a >>> filterCache of 120,000 but it seems like too much because the server went down. I >>> will try with less, around 75,000 and let you know. How do you to partition the data to a static set and a dynamic set, and >>> then combining them at query time? Do you have a link to read about that? On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]> >>> wrote: > On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote: > >> My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is >> 32 >> bits). >> It is optimized twice a day, it takes around 15 minutes to optimize. >> The index is updated (commits) every two minutes. There are between >> 10 >> and >> 100 inserts/updates every 2 minutes. >> > > Caching could help--you should definitely start there. > > The commit every 2 minutes could end up being an unsurmountable >>> problem. > You may have to partition your data into a large, mostly static set >>> and a > small dynamic set, combining the results at query time. > > -Mike > >>> >>> > > > -- View this message in context: http://www.nabble.com/too-many-queries--tp16690870p16727264.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Searching for popular phrases or words
Thanks Chris. I had in mind "Occurs in a lot of documents". Please do point me to where I can pick up an example of using the LukeRequestHandler and the "shingles based tokenizer". Eric --- Chris Hostetter <[EMAIL PROTECTED]> wrote: > > it depends on your definition of "polular" if you > mean "occurs in a lot of > documents" then take a look at the > LukeRequestHandler ... if can give you > info on terms with high frequencies (and you can use > a Shingles based > tokenizer to index "phrase" as terms > > if by popular you mean "occurs in a lot of queries" > there isn't anything > in Solr that keeps track of what people search for > ... your application > would need to do that. > > : How can i search for popular phrases or words with > an > : option to include only, for example, technical > terms > : e.g "Oracle database" rather than common english > > You'll need a better definition of your goal to get > any meaningful answer > to the "an option to include only, for example, > technical terms" part of > that question ... the "for example" implies that > there are other examples > ... how would you (as a human person) decide when to > classify a phrase as > a "technical" phrase, vs an ... "other" phrase? if > you can't answer that > question, then neither can code. > > > -Hoss
Re: Searching for popular phrases or words
Eric, Look at LUCENE-400 or Lucene trunk/contrib/analyzers for the shingles stuff. Have you checked the Wiki for info about LukeRequestHandler? I bet it's there. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Edwin Koome <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Wednesday, April 16, 2008 4:07:24 PM Subject: Re: Searching for popular phrases or words Thanks Chris. I had in mind "Occurs in alot of documents". Please do point me where i can pick up an example of using the LukeRequestHandler and the "shingles based tokenizer". Eric --- Chris Hostetter <[EMAIL PROTECTED]> wrote: > > it depends on your definition of "polular" if you > mean "occurs in a lot of > documents" then take a look at the > LukeRequestHandler ... if can give you > info on terms with high frequencies (and you can use > a Shingles based > tokenizer to index "phrase" as terms > > if by popular you mean "occurs in a lot of queries" > there isn't anything > in Solr that keeps track of what people search for > ... your application > would need to do that. > > : How can i search for popular phrases or words with > an > : option to include only, for example, technical > terms > : e.g "Oracle database" rather than common english > > You'll need a better definition of your goal to get > any meaningful answer > to the "an option to include only, for example, > technical terms" part of > that question ... the "for example" implies that > there are other examples > ... how would you (as a human person) decide when to > classify a phrase as > a "technical" phrase, vs an ... "other" phrase? if > you can't answer that > question, then neither can code. > > > -Hoss > > Be a better friend, newshound, and know-it-all with Yahoo! Mobile. Try it now. http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ
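On the shingle side, a minimal sketch of what such an analyzer might look like, assuming the contrib ShingleFilter from LUCENE-400 / trunk/contrib/analyzers and the Lucene 2.x analyzer API; wiring it into a Solr fieldType and then asking LukeRequestHandler for the top terms of that field is the rest of the recipe:

import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.shingle.ShingleFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

// Sketch only -- indexes word pairs ("shingles") as single terms so that a
// phrase like "oracle database" shows up as one term with its own docFreq.
public class ShingleAnalyzer extends Analyzer {
  public TokenStream tokenStream(String fieldName, Reader reader) {
    TokenStream stream = new StandardTokenizer(reader);
    stream = new LowerCaseFilter(stream);
    return new ShingleFilter(stream, 2); // emit unigrams plus 2-word shingles
  }
}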
Re: too many queries?
Oh ok. That makes sense. Thanks. Otis Gospodnetic wrote: > > Oleg, you can't explicitly say "N GB for index". Wunder was just saying > how much you can imagine how much RAM each piece might need and be happy > with. > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > - Original Message > From: oleg_gnatovskiy <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Wednesday, April 16, 2008 2:05:23 PM > Subject: Re: too many queries? > > > Hello. I am having a similar problem as the OP. I see that you recommended > setting 4GB for the index, and 2 for Solr. How do I allocate memory for > the > index? I was under the impression that Solr did not support a RAMIndex. > > > Walter Underwood wrote: >> >> Do it. 32-bit OS's went out of style five years ago in server-land. >> >> I would start with 8GB of RAM. 4GB for your index, 2 for Solr, 1 for >> the OS and 1 for other processes. That might be tight. 12GB would >> be a lot better. >> >> wunder >> >> On 4/16/08 7:50 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: >> >>> In order to do that I have to change to a 64 bits OS so I can have more >>> than >>> 4 GB of RAM.Is there any way to see how long does it takes to Solr to >>> warmup >>> the searcher? >>> >>> On Wed, Apr 16, 2008 at 11:40 AM, Walter Underwood >>> <[EMAIL PROTECTED]> >>> wrote: >>> A commit every two minutes means that the Solr caches are flushed before they even start to stabilize. Two things to try: * commit less often, 5 minutes or 10 minutes * have enough RAM that your entire index can fit in OS file buffers wunder On 4/16/08 6:27 AM, "Jonathan Ariel" <[EMAIL PROTECTED]> wrote: > So I counted the number if distinct values that I have for each field that I > want a facet on. In total it's around 100,000. I tried with a filterCache > of 120,000 but it seems like too much because the server went down. I will > try with less, around 75,000 and let you know. > > How do you to partition the data to a static set and a dynamic set, > and then > combining them at query time? Do you have a link to read about that? > > > > On Tue, Apr 15, 2008 at 7:21 PM, Mike Klaas <[EMAIL PROTECTED]> wrote: > >> On 15-Apr-08, at 5:38 AM, Jonathan Ariel wrote: >> >>> My index is 4GB on disk. My servers has 8 GB of RAM each (the OS is >>> 32 >>> bits). >>> It is optimized twice a day, it takes around 15 minutes to optimize. >>> The index is updated (commits) every two minutes. There are between >>> 10 >>> and >>> 100 inserts/updates every 2 minutes. >>> >> >> Caching could help--you should definitely start there. >> >> The commit every 2 minutes could end up being an unsurmountable problem. >> You may have to partition your data into a large, mostly static set and a >> small dynamic set, combining the results at query time. >> >> -Mike >> >> >> >> > > -- > View this message in context: > http://www.nabble.com/too-many-queries--tp16690870p16727264.html > Sent from the Solr - User mailing list archive at Nabble.com. > > > > > > -- View this message in context: http://www.nabble.com/too-many-queries--tp16690870p16732932.html Sent from the Solr - User mailing list archive at Nabble.com.
Installation help
Hi all, I am trying to install Solr with Jetty (as part of another application) on a Linux server running Gentoo linux and JDK 1.6.0_05. When I try to start Jetty (and Solr), it doesn't open a port. I know you will need more info, but I'm not sure what you would need as I'm not clear on how this part works. Thanks, Shawn
POST interface to sending queries to SOLR?
Folks, I know there is a 'GET' interface for sending queries to Solr. But is there a POST interface for sending queries? If so, can someone point me in that direction? Thanks, Jim
Re: Installation help
What does the Jetty log output say in the console after you start it? It should mention the port # on one of the last lines. If it does, try using curl or wget to do a local request: curl http://localhost:8983/solr/ wget http://localhost:8983/solr/ Matt On Wed, Apr 16, 2008 at 5:08 PM, Shawn Carraway <[EMAIL PROTECTED]> wrote: > Hi all, > I am trying to install Solr with Jetty (as part of another application) > on a Linux server running Gentoo linux and JDK 1.6.0_05. > > When I try to start Jetty (and Solr), it doesn't open a port. > > I know you will need more info, but I'm not sure what you would need as > I'm not clear on how this part works. > > Thanks, > Shawn > >
XSLT transform before update?
Hey everyone, I'm experimenting with updating Solr from a remote XML source, using an XSLT transform to get it into the Solr XML syntax (and yes, I've looked into SOLR-469, but disregarded it as I need to do quite a bit using XSLT to get it into something I can index) to let me maintain an index. I'm looking at using stream.url, but I need to do the XSLT at some point in there. I would prefer to do the XSLT on the client (Solr) side of the transfer, for various reasons. Is there a way to implement a custom request handler or similar to get Solr to apply an XSLT transform to the content stream before it attempts to parse it? If not possible OOTB, where would be the right place to add said functionality? Thanks much for your help, Daniel
Re: XSLT transform before update?
: Is there a way to implement a custom request handler or similar to get : solr to apply an XSLT transform to the content stream before it attempts : to parse it? If not possible OOTB, where would be the right place to : add said functionality? take a look at SOLR-285 and SOLR-370 ... a RequestHandler is the right way to go, the biggest problems with the patch in SOLR-285 at the moment are: a) i wrote it and i don't know much about doing XSLT transformations in java efficiently. b) the existing XSLT Transformer "caching" code in Solr is really trivial and not suitable for any real volume ... if it were overhauled to take advantage of the standard SolrCache APIs it would be a lot more reusable by both the XsltResponseWriter and a new XsltUpdateHandler. -Hoss
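For reference, a bare-bones sketch of applying a stylesheet to an incoming content stream with plain JAXP before the update parser sees it; the compiled Templates object is the expensive, thread-safe piece, and the SolrCache-backed caching mentioned above would mainly be about reusing it (the class name and wiring here are placeholders, not SOLR-285 itself):

import java.io.InputStream;
import java.io.OutputStream;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltPreprocessor {
  // Compile the stylesheet once; Templates instances are thread-safe and
  // this is the expensive part worth caching (e.g. in a SolrCache).
  private final Templates templates;

  public XsltPreprocessor(InputStream stylesheet) throws Exception {
    templates = TransformerFactory.newInstance()
                                  .newTemplates(new StreamSource(stylesheet));
  }

  // Transforms the incoming XML stream into Solr <add> syntax on 'out'.
  public void transform(InputStream in, OutputStream out) throws Exception {
    // Transformer instances are cheap to create from a compiled Templates
    // object, but are not thread-safe, so make one per request.
    templates.newTransformer().transform(new StreamSource(in),
                                         new StreamResult(out));
  }
}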
Re: POST interface to sending queries to SOLR?
: I know there is a 'GET' to send queries to Solr. But is there a POST : interface to sending queries? If so, can someone point me in that : direction? POST using the standard application/x-www-form-urlencoded content-type (ie: the same way you would POST using any HTML form) -Hoss
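A self-contained sketch of such a POST in Java, assuming a default local Solr instance at http://localhost:8983/solr/select (adjust host, port and parameters to your setup); it sends the same q/rows parameters you would otherwise put on the GET query string:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class SolrPostQuery {
  public static void main(String[] args) throws Exception {
    // Form-encode the query parameters exactly as an HTML form would.
    String body = "q=" + URLEncoder.encode("solr rocks", "UTF-8")
                + "&rows=10&wt=standard";

    // Hypothetical local Solr URL; change host/port/handler as needed.
    URL url = new URL("http://localhost:8983/solr/select");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

    OutputStream out = conn.getOutputStream();
    out.write(body.getBytes("UTF-8"));
    out.close();

    // Read back the XML response.
    BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), "UTF-8"));
    for (String line; (line = in.readLine()) != null; ) {
      System.out.println(line);
    }
    in.close();
  }
}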
Re: XSLT transform before update?
Hi Daniel, Maybe if you can give us a sample of what your XML looks like, we can suggest how to use SOLR-469 (Data Import Handler) to index it. Most of the use cases we have encountered so far are solvable using the XPathEntityProcessor in DataImportHandler without using XSLT; for details, look at http://wiki.apache.org/solr/DataImportHandler#head-e68aa93c9ca7b8d261cede2bf1d6110ab1725476 If you're willing to write code, you can do almost anything with DataImportHandler. If this is a general need, I can look into adding XSLT support in Data Import Handler. On Thu, Apr 17, 2008 at 9:13 AM, Daniel Papasian < [EMAIL PROTECTED]> wrote: > Hey everyone, > > I'm experimenting with updating solr from a remote XML source, using an > XSLT transform to get it into the solr XML syntax (and yes, I've looked > into SOLR-469, but disregarded it as I need to do quite a bit using XSLT > to get it to what I can index) to let me maintain an index. > > I'm looking at using stream.url, but I need to do the XSLT at some point > in there. I would prefer to do the XSLT on the client (solr) side of > the transfer, for various reasons. > > Is there a way to implement a custom request handler or similar to get > solr to apply an XSLT transform to the content stream before it attempts > to parse it? If not possible OOTB, where would be the right place to > add said functionality? > > Thanks much for your help, > > Daniel > -- Regards, Shalin Shekhar Mangar.