XS DateTime format
Hi, I just have a small question regarding the output format of fields of type TrieDateField. If a document containing the date 0001-01-01T01:01:01Z is passed to Solr and I then try to search for that document, the output of the date field has the format Y-MM-DDThh:mm:ssZ. The first three zeros are missing. According to the XML Schema specification found on w3.org, the year in an xs:dateTime is a four-or-more digit, optionally negative-signed numeral. Is it intentional that Solr strips the leading zeros from the first four digits? Thanks Jens Jørgen Flaaris
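For comparison, the xs:dateTime behavior described above can be reproduced with the standard java.time API. This is only a sketch of the required padding, not Solr's own formatting code, and it treats the trailing Z as a literal rather than doing real zone handling:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class XsDateTimeDemo {
    // xs:dateTime requires a four-or-more digit year; the "uuuu"
    // pattern zero-pads the year to four digits.
    static final DateTimeFormatter XS =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss'Z'");

    static String format(LocalDateTime t) {
        return XS.format(t);
    }

    public static void main(String[] args) {
        // prints 0001-01-01T01:01:01Z, with the leading zeros kept
        System.out.println(format(LocalDateTime.of(1, 1, 1, 1, 1, 1)));
    }
}
```

An output of `1-01-01T01:01:01Z` would correspond to a single-letter year pattern, which is what the reporter appears to be seeing.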
fq parameter with partial value
Hello, I would like to know if there is a way to use the fq parameter with a partial value. For instance, if I have a request with fq=NAME:Joe, and I would like to retrieve all answers where NAME contains Joe, including those with NAME = Joe Smith. Thanks, Elisabeth
Re: fq parameter with partial value
Hi Elisabeth, that's not what FilterQueries are made for :) What speaks against using that criteria in the query? Perhaps you want to describe your use case and we'll see if there's another way to solve it? Regards Stefan On Thu, Apr 28, 2011 at 9:09 AM, elisabeth benoit wrote: > [...]
Spatial Search
Dear list :) I am new to Solr and am trying to use the spatial search feature which was added in 3.1. In my schema.xml I have 2 double fields for latitude and longitude. How can I get them into the location field type? I use SolrJ to fill the index with data. If I used a location field instead of two double fields, how could I fill it with SolrJ? I use annotations to link the data from my DTOs to the index fields... Hope you get my problem... best regards, Jonas
Re: fq parameter with partial value
Hi Stefan, Thanks for answering. In more detail, my problem is the following. I'm working on searching points of interest (POIs), which can be hotels, restaurants, plumbers, psychologists, etc. Those POIs can be identified among other things by categories or by brand, and a single POI might have different categories (no maximum number). A user might enter a query like McDonald's Paris or Restaurant Paris or many other possible queries. First I want to do a facet search on brand and categories, to find out which case is the current case: http://localhost:8080/solr/select?q=restaurant paris&facet=true&facet.field=BRAND&facet.field=CATEGORY and get an answer like 598 451 Then I want to send a request with fq=CATEGORY:Restaurant and still get answers with CATEGORY = Restaurant Hotel. One solution would be to modify the data to add a new document every time we have a new category, so a POI with three different categories would be indexed three times, each time with a different category. But I was wondering if there was another way around it. Thanks again, Elisabeth 2011/4/28 Stefan Matheis > [...]
how to update database record after indexing
Hello, I am using DataImportHandler to import data from a SQL Server database. My requirement is: when Solr has completed indexing a particular database record, I want to update that record in the database. Alternatively, after indexing all records, if I can get all the ids, I can update all the records at once. How can I achieve this? Thanks Vishal Parekh -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-update-database-record-after-indexing-tp2874171p2874171.html Sent from the Solr - User mailing list archive at Nabble.com.
manual background re-indexing
Hello list, I am planning to implement a setup, to be run from unix scripts, that should perform a full pull-and-reindex on a background server and then deploy that index. All should happen on the same machine. I thought the replication methods would help me, but they seem to solve the issues of distribution, while what I need is only the ability to: - suspend the queries - swap the directories with the new index - close all searchers - reload and warm up the searcher on the new index Is there a part of the replication utilities (http or unix) that I could use to perform the above tasks? I intend to do this only on occasion... maybe once a month or even less. Is "reload" the right term to use? paul
Re: Formatted date/time in long field and javabinRW exception
Any thoughts on this one? Why does Solr output a string in a long field with XMLResponseWriter but fail doing so (as it should) with the javabin format? On Tuesday 19 April 2011 10:52:33 Markus Jelsma wrote: > Hi, > > Nutch 1.3-dev seems to have changed its tstamp field from a long to a > properly formatted Solr-readable date/time, but the example Solr schema for > Nutch still configures the tstamp field as a long. This results in a > formatted date/time in a long field, which I think should not be allowed > in the first place by Solr. > > 2011-04-19T08:16:31.675Z > > While the above is strange enough, I only found out it's all wrong when > using the javabin format. The following query will throw an exception, > while using the XML response writer works fine and returns the tstamp as a long > but formatted as a proper date/time. > > javabin: > > curl "http://localhost:8983/solr/select?fl=id,boost,tstamp,digest&start=0&q=id:\[*+TO+*\]&wt=javabin&rows=2&version=1" > > Apr 19, 2011 10:34:50 AM org.apache.solr.request.BinaryResponseWriter$Resolver getDoc > WARNING: Error reading a field from document : SolrDocument[{digest=7ff92a31c58e43a34fd45bc6d87cda03}] > java.lang.NumberFormatException: For input string: "2011-04-19T08:16:31.675Z" > at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) > at java.lang.Long.parseLong(Long.java:419) > at java.lang.Long.valueOf(Long.java:525) > at org.apache.solr.schema.LongField.toObject(LongField.java:82) > at org.apache.solr.schema.LongField.toObject(LongField.java:33) > at org.apache.solr.request.BinaryResponseWriter$Resolver.getDoc(BinaryResponseWriter.java:148) > at org.apache.solr.request.BinaryResponseWriter$Resolver.writeDocList(BinaryResponseWriter.java:124) > at org.apache.solr.request.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:88) > at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:143) > at org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:133) > at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:221) > at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:138) > at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:87) > at org.apache.solr.request.BinaryResponseWriter.write(BinaryResponseWriter.java:48) > at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:322) > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254) > more trace from Jetty > > Here's wt=xml working fine and showing output for the tstamp field: > > markus@midas:~$ curl "http://localhost:8983/solr/select?fl=id,boost,tstamp,digest&start=0&q=id:\[*+TO+*\]&wt=xml&rows=2&version=1" > > [XML response, markup lost in the archive: responseHeader and echoed params, then two documents with id "idfield": digest 478e77f99f7005ae71aa92a879be2fd4 with tstamp 2011-04-19T08:16:31.689Z, and digest 7ff92a31c58e43a34fd45bc6d87cda03 with tstamp 2011-04-19T08:16:31.675Z] > > Cheers, -- Markus Jelsma - CTO - Openindex http://www.linkedin.com/in/markus17 050-8536620 / 06-50258350
Re: manual background re-indexing
Hi Paul Would a multi-core set up and the swap command do what you want it to do? http://wiki.apache.org/solr/CoreAdmin Shaun On 28 April 2011 12:49, Paul Libbrecht wrote: > [...]
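For reference, the swap described on that wiki page is a single CoreAdmin call. The core names here (live, rebuild) are illustrative, not anything from the original thread:

```
http://localhost:8983/solr/admin/cores?action=SWAP&core=live&other=rebuild
```

After the swap, queries hitting "live" are served by the freshly built index, while the previous index remains available under "rebuild" until the next rebuild cycle.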
Re: fq parameter with partial value
So, I assume your CATEGORY field is multiValued but each value is not broken up into tokens, right? If that's the case, would it work to have a second field CATEGORY_TOKENIZED and run your fq against that field instead? You could have this be a multiValued field with an increment gap if you wanted to prevent matches across separate entries, and have your fq do a proximity search where the proximity was less than the increment gap. Best Erick On Thu, Apr 28, 2011 at 6:03 AM, elisabeth benoit wrote: > [...]
Re: how to update database record after indexing
I don't think you can do this through DIH, you'll probably have to write a separate process that queries the Solr index and updates your table. You'll have to be a bit cautious that you coordinate the commits, that is wait for the DIH to complete and commit before running your separate db update process. Best Erick On Thu, Apr 28, 2011 at 6:59 AM, vrpar...@gmail.com wrote: > [...]
Re: Spatial Search
On Thu, Apr 28, 2011 at 5:15 AM, Jonas Lanzendörfer wrote: > [...] I've not used the annotation stuff in SolrJ, but since the value sent in must be of the form 10.3,20.4, I guess one would have to have a String field with this value on your object. -Yonik http://www.lucenerevolution.org -- Lucene/Solr User Conference, May 25-26, San Francisco
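Following Yonik's suggestion, building that "lat,lon" string from two doubles is straightforward. The class and method names below are made up for illustration; only the "lat,lon" value format comes from the thread:

```java
public class LatLonField {
    // Combine separate latitude/longitude values into the single
    // "lat,lon" string that a Solr location (LatLonType) field expects.
    static String toLocation(double lat, double lon) {
        // Double.toString always uses '.' as the decimal separator,
        // so this is safe regardless of the default locale.
        return lat + "," + lon;
    }

    public static void main(String[] args) {
        System.out.println(toLocation(10.3, 20.4)); // prints 10.3,20.4
    }
}
```

With SolrJ annotations, the bean field annotated for the location column would then be a String holding this combined value rather than two separate doubles.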
Re: manual background re-indexing
Just where do I put the new index data with such a command? Simply replacing the segment files appears dangerous to me. Also, what is the best practice to move from single-core to multi-core? My current set-up is single-core; do I simply need to add a solr.xml in my solr-home and one core1 directory with the data that was there previously? paul Le 28 avr. 2011 à 14:04, Shaun Campbell a écrit : > [...]
Re: manual background re-indexing
It would probably be safest just to set up a separate system as multi-core from the start, get the process working, and then either use the new machine or copy the whole setup to the production machine. Best Erick On Thu, Apr 28, 2011 at 8:49 AM, Paul Libbrecht wrote: > [...]
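A minimal solr.xml for the multi-core layout discussed here might look as follows. The core names and instance directories are illustrative, following the CoreAdmin wiki conventions of that era:

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- the core serving queries; its data dir holds the current index -->
    <core name="live" instanceDir="live" />
    <!-- the core the background reindex writes into, swapped in when done -->
    <core name="rebuild" instanceDir="rebuild" />
  </cores>
</solr>
```

Each instanceDir would contain its own conf/ (schema.xml, solrconfig.xml) and data/ directory, so migrating a single-core setup means moving the existing conf and data under one of these core directories.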
Re: fq parameter with partial value
Yes, the multivalued field is not broken up into tokens. So, if I understand what you mean, I could have a field CATEGORY with multiValued="true", a field CATEGORY_TOKENIZED with multiValued="true", and then some POI POI_Name ... Restaurant Hotel Restaurant Hotel, do faceting on CATEGORY and fq on CATEGORY_TOKENIZED. But then, wouldn't it be possible to do faceting on CATEGORY_TOKENIZED? Best regards Elisabeth 2011/4/28 Erick Erickson > [...]
Re: manual background re-indexing
I sure would need downtime to migrate from single-core to multi-core! The question is however whether there are typical steps for a migration. paul Le 28 avr. 2011 à 15:01, Erick Erickson a écrit : > [...]
RE: fq parameter with partial value
Yep, what you describe is what I do in similar situations; it works fine. It is certainly possible to facet on a tokenized field... but your individual facet values will be the _tokens_, not the complete values. And they'll be the post-analyzed tokens at that. Which is rarely what you want. Thus the use of two fields: one tokenized and analyzed, one not tokenized and minimally analyzed (for instance, not stemmed). From: elisabeth benoit [elisaelisael...@gmail.com] Sent: Thursday, April 28, 2011 9:03 AM To: solr-user@lucene.apache.org Subject: Re: fq parameter with partial value [...]
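The two-field setup this thread converges on could be sketched in schema.xml roughly like this. The type names string and text are assumptions borrowed from the stock example schema, not something stated in the thread:

```xml
<field name="CATEGORY" type="string" indexed="true" stored="true"
       multiValued="true"/>
<field name="CATEGORY_TOKENIZED" type="text" indexed="true" stored="false"
       multiValued="true"/>
<copyField source="CATEGORY" dest="CATEGORY_TOKENIZED"/>
```

Faceting then uses the untokenized field while filtering uses the tokenized copy, e.g. `facet.field=CATEGORY&fq=CATEGORY_TOKENIZED:Restaurant`, so a POI with CATEGORY "Restaurant Hotel" still matches.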
boost fields which have value
Hi, How can I achieve that documents which don't have field1 and field2 filled in are returned at the end of the search results? I have tried with the *bf* parameter, which seems to work, but just with one field. Is there any function query which I can use in the bf value to boost two fields? Thank you. Regards, Zoltan
Boost newer documents only if date is different from timestamp
I am trying to boost newer documents in Solr queries. The ms function http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents seems to be the right way to go, but I need to add an additional condition: I am using the last-Modified-Date from crawled web pages as the date to consider, and that does not always provide a meaningful date. Therefore I would like the function to only boost documents where the date (not time) found in the last-Modified-Date is different from the timestamp, eliminating results that just return the current date as the last-Modified-Date. Suggestions are appreciated!
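The recency boost from that FAQ is a reciprocal function of document age. One way to add the extra condition would be to compute, at index time, a boolean field that is true only when the last-Modified-Date's date part differs from the crawl timestamp's, and combine the two at query time. A sketch, where both field names (lastModified, dateDiffers) are hypothetical:

```
bf=recip(ms(NOW,lastModified),3.16e-11,1,1)
bq=dateDiffers:true^2.0
```

Using bq boosts the trustworthy dates without dropping the others; replacing it with fq=dateDiffers:true would exclude documents whose last-Modified-Date merely echoes the crawl date.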
Searching for escaped characters
I'm trying to create a test to make sure that character sequences like "&amp;egrave;" are successfully converted to their equivalent UTF character (that is, in this case, "è"). So, I'd like to search my Solr index using the equivalent of the following regular expression: &\w{1,6}; to find any escaped sequences that might have slipped through. Is this possible? I have indexed these fields with text_lu, which looks like this: Thanks, Paul
Re: Concatenate multivalued DIH fields
I solved this problem using the flatten="true" attribute. Given this schema Joe Smith attr_names is a multiValued field in my schema.xml. The flatten attribute tells solr to take all the text from the specified node and below.
RE: boost fields which have value
I believe the sortMissingLast fieldtype attribute is what you want: http://wiki.apache.org/solr/SchemaXml -Original Message- From: Zoltán Altfatter [mailto:altfatt...@gmail.com] Sent: Thursday, April 28, 2011 6:11 AM To: solr-user@lucene.apache.org Subject: boost fields which have value [...]
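For illustration, sortMissingLast is set on the field type in schema.xml; this sketch is based on the stock example schema's string type:

```xml
<fieldType name="string" class="solr.StrField"
           sortMissingLast="true" omitNorms="true"/>
```

Note that sortMissingLast applies when sorting on the field, so documents missing field1 or field2 sort to the end of results ordered by those fields; it does not change relevancy scoring the way bf does.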
Re: Searching for escaped characters
StandardTokenizer will have stripped punctuation, I think. You might try searching for all the entity names though: (agrave | egrave | omacron | etc...) The names are pretty distinctive, although you might have problems with Greek letters. -Mike On 04/28/2011 12:10 PM, Paul wrote: > [...]
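Since the tokenizer strips the punctuation the regex relies on, another option is to run the check outside Solr, against the stored field values retrieved from the index. A sketch of that client-side check, using Paul's pattern (class and method names are made up):

```java
import java.util.regex.Pattern;

public class EntityResidueCheck {
    // Matches HTML/XML entity escapes such as &egrave; or &amp;
    // (an ampersand, 1-6 word characters, then a semicolon).
    private static final Pattern ENTITY = Pattern.compile("&\\w{1,6};");

    static boolean hasEntity(String text) {
        return ENTITY.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(hasEntity("caf&egrave;")); // true: escape slipped through
        System.out.println(hasEntity("caf\u00e8"));   // false: properly converted
    }
}
```

Iterating over documents with a `*:*` query and applying this to each stored text field would flag any document where an escape survived indexing.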
Re: SolrQuery#setStart(Integer) ???
Hi Erick, Correct, I cut some zeros while reading the javadocs, thanks for the heads up! [ ]'s Leonardo da S. Souza °v° Linux user #375225 /(_)\ http://counter.li.org/ ^ ^ On Wed, Apr 27, 2011 at 8:13 PM, Erick Erickson wrote: > Well, the java native int format is 32 bits, so unless you're returning > over 2 billion documents, you should be OK. But you'll run into other > issues > long before you get to that range. > > Best > Erick > > On Wed, Apr 27, 2011 at 5:25 PM, Leonardo Souza > wrote: > > Hi Guys, > > > > We have an index with more than 3 million documents, and we use the > pagination > > feature through the SolrQuery#setStart and SolrQuery#setRows > > methods. Some queries can return a huge amount of documents and I'm worried > > about the integer parameter of the setStart method; this parameter > > should be a long, don't you think? For now I'm considering using the > > ModifiableSolrParams class. Any suggestion is welcome! > > > > thanks! > > > > > > [ ]'s > > Leonardo Souza > > °v° Linux user #375225 > > /(_)\ http://counter.li.org/ > > ^ ^ > > >
Re: Replicaiton Fails with Unreachable error when master host is responding.
Anybody? On 04/27/2011 01:51 PM, Jed Glazner wrote: Hello All, I'm having a very strange problem that I just can't figure out. The slave is not able to replicate from the master, even though the master is reachable from the slave machine. I can telnet to the port it's running on, I can use text based browsers to navigate the master from the slave. I just don't understand why it won't replicate. The admin screen gives me an Unreachable in the status, and in the log there is an exception thrown. Details below: BACKGROUND: OS: Arch Linux Solr Version: svn revision 1096983 from https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/ No custom plugins, just whatever came with the version above. Java Setup: java version "1.6.0_22" OpenJDK Runtime Environment (IcedTea6 1.10) (ArchLinux-6.b22_1.10-1-x86_64) OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode) We have 3 cores running, and all 3 cores are not able to replicate. The admin on the slave shows the Master as http://solr-master-01_dev.la.bo:8983/solr/music/replication - *Unreachable* Replication def on the slave [config markup lost in the archive: masterUrl http://solr-master-01_dev.la.bo:8983/solr/music/replication, pollInterval 00:15:00] Replication def on the master: [config markup lost in the archive: replicateAfter commit and startup, confFiles schema.xml,stopwords.txt] Below is the log start to finish for replication attempts; note that it says connection refused, however, I can telnet to 8983 from the slave to the master, so I know it's up and reachable from the slave: telnet solr-master-01_dev.la.bo 8983 Trying 172.12.65.58... Connected to solr-master-01_dev.la.bo. Escape character is '^]'. I double checked the master to make sure that it didn't have replication turned off, and it's not. So I should be able to replicate but it can't. I just don't know what else to check. The log from the slave is below. Apr 27, 2011 7:39:45 PM org.apache.solr.request.SolrQueryResponse WARNING: org.apache.solr.request.SolrQueryResponse is deprecated.
Please use the corresponding class in org.apache.solr.response Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection refused Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry INFO: Retrying request Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection refused Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry INFO: Retrying request Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry INFO: I/O exception (java.net.ConnectException) caught when processing request: Connection refused Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector executeWithRetry INFO: Retrying request Apr 27, 2011 7:39:45 PM org.apache.solr.handler.ReplicationHandler getReplicationDetails WARNING: Exception while invoking 'details' method for replication on master java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384) at java.net.Socket.connect(Socket.java:546) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:140) at 
org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125) at org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361) at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387) at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397) at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:323) at org.apache.solr.handler.SnapPuller.getNamedListResponse(SnapPuller.java:193) at org.apache.solr.handler.SnapPuller.getCommandResponse(SnapPuller.java:188) at org.apache.solr.handler.ReplicationHandler.getReplicationDetails(ReplicationHandler.java:588)
Re: Replication Fails with Unreachable error when master host is responding.
No clue. Try wireshark to gather more data?

On 04/28/2011 02:53 PM, Jed Glazner wrote:
> Anybody?
>
> On 04/27/2011 01:51 PM, Jed Glazner wrote:
>> Hello All, I'm having a very strange problem that I just can't figure out. The slave is not able to replicate from the master, even though the master is reachable from the slave machine.
>> [...]
Re: fq parameter with partial value
See below:

On Thu, Apr 28, 2011 at 9:03 AM, elisabeth benoit wrote:
> yes, the multivalued field is not broken up into tokens.
>
> so, if I understand well what you mean, I could have
>
> a field CATEGORY with multiValued="true"
> a field CATEGORY_TOKENIZED with multiValued="true"
>
> and then some POI
>
> POI_Name
> ...
> CATEGORY: Restaurant Hotel
> CATEGORY_TOKENIZED: Restaurant
> CATEGORY_TOKENIZED: Hotel

If the above is the document you're sending, then no. The document would be indexed with

CATEGORY: Restaurant Hotel
CATEGORY_TOKENIZED: Restaurant Hotel

Or even just:

CATEGORY: Restaurant Hotel

with a <copyField> set up to copy the value from CATEGORY to CATEGORY_TOKENIZED.

The multiValued part comes from your "And a single POI might have different categories", so your document could have several CATEGORY values, which would look like:

CATEGORY: Restaurant Hotel
CATEGORY: Health Spa
CATEGORY: Dance Hall

and your document would be counted for each of those entries, while searches against CATEGORY_TOKENIZED would match things like "dance", "spa", etc.

But do notice that if you did NOT want a search for restaurant hall (no quotes) to match, you could do proximity searches with a slop less than your increment gap, e.g. (this time with the quotes) "restaurant hall"~50, which would NOT match if your increment gap were 100.

Best
Erick

> do faceting on CATEGORY and fq on CATEGORY_TOKENIZED.
>
> But then, wouldn't it be possible to do faceting on CATEGORY_TOKENIZED?
>
> Best regards
> Elisabeth
>
> 2011/4/28 Erick Erickson
>> So, I assume your CATEGORY field is multiValued but each value is not
>> broken up into tokens, right? If that's the case, would it work to have a
>> second field CATEGORY_TOKENIZED and run your fq against that
>> field instead? 
>> >> You could have this be a multiValued field with an increment gap if you >> wanted >> to prevent matches across separate entries and have your fq do a proximity >> search where the proximity was less than the increment gap >> >> Best >> Erick >> >> On Thu, Apr 28, 2011 at 6:03 AM, elisabeth benoit >> wrote: >> > Hi Stefan, >> > >> > Thanks for answering. >> > >> > In more details, my problem is the following. I'm working on searching >> > points of interest (POIs), which can be hotels, restaurants, plumbers, >> > psychologists, etc. >> > >> > Those POIs can be identified among other things by categories or by >> brand. >> > And a single POIs might have different categories (no maximum number). >> User >> > might enter a query like >> > >> > >> > McDonald’s Paris >> > >> > >> > or >> > >> > >> > Restaurant Paris >> > >> > >> > or >> > >> > >> > many other possible queries >> > >> > >> > First I want to do a facet search on brand and categories, to find out >> which >> > case is the current case. >> > >> > >> > http://localhost:8080/solr /select?q=restaurant paris >> > &facet=true&facet.field=BRAND& facet.field=CATEGORY >> > >> > and get an answer like >> > >> > >> > >> > >> > >> > 598 >> > >> > 451 >> > >> > >> > >> > Then I want to send a request with fq= CATEGORY: Restaurant and still get >> > answers with CATEGORY= Restaurant Hotel. >> > >> > >> > >> > One solution would be to modify the data to add a new document every time >> we >> > have a new category, so a POI with three different categories would be >> index >> > three times, each time with a different category. >> > >> > >> > But I was wondering if there was another way around. >> > >> > >> > >> > Thanks again, >> > >> > Elisabeth >> > >> > >> > 2011/4/28 Stefan Matheis >> > >> >> Hi Elisabeth, >> >> >> >> that's not what FilterQueries are made for :) What against using that >> >> Criteria in the Query? 
>> >> Perhaps you want to describe your UseCase and we'll see if there's >> >> another way to solve it? >> >> >> >> Regards >> >> Stefan >> >> >> >> On Thu, Apr 28, 2011 at 9:09 AM, elisabeth benoit >> >> wrote: >> >> > Hello, >> >> > >> >> > I would like to know if there is a way to use the fq parameter with a >> >> > partial value. >> >> > >> >> > For instance, if I have a request with fq=NAME:Joe, and I would like >> to >> >> > retrieve all answers where NAME contains Joe, including those with >> NAME = >> >> > Joe Smith. >> >> > >> >> > Thanks, >> >> > Elisabeth >> >> > >> >> >> > >> >
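Erick's suggestion corresponds to a schema.xml along these lines (the field and type names here are assumptions for illustration, not taken from the thread):

```xml
<!-- Untokenized, multiValued field: facet counts keep whole values like "Restaurant Hotel" -->
<field name="CATEGORY" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- Tokenized copy: fq=CATEGORY_TOKENIZED:Restaurant also matches "Restaurant Hotel" -->
<field name="CATEGORY_TOKENIZED" type="text" indexed="true" stored="false" multiValued="true"/>
<!-- Populate the tokenized field automatically at index time -->
<copyField source="CATEGORY" dest="CATEGORY_TOKENIZED"/>
```

Faceting stays on CATEGORY while the fq runs against CATEGORY_TOKENIZED; the increment-gap/proximity trick applies to the analyzer of the tokenized field.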
Re: Extra facet query from within a custom search component
Have you looked at: http://wiki.apache.org/solr/TermsComponent? Best Erick On Thu, Apr 28, 2011 at 2:44 PM, Frederik Kraus wrote: > Hi Guys, > > I'm currently working on a custom search component and need to fetch a list > of all possible values within a certain field. > An internal facet (wildcard) query first came to mind, but I'm not quite sure > how to best create and then execute such a query ... > > What would be the best way to do this? > > Can anyone please point me in the right direction? > > Thanks, > > Fred.
Problem with autogeneratePhraseQueries=false
Hi, I'm new to Solr. My Solr instance version is:

Solr Specification Version: 3.1.0
Solr Implementation Version: 3.1.0 1085815 - grantingersoll - 2011-03-26 18:00:07
Lucene Specification Version: 3.1.0
Lucene Implementation Version: 3.1.0 1085809 - 2011-03-26 18:06:58
Current Time: Tue Apr 26 08:01:09 CEST 2011
Server Start Time: Tue Apr 26 07:59:05 CEST 2011

I have the following definition for the textgen type:

I'm using this type for the name field in my index. As you can see, I'm using autoGeneratePhraseQueries="false", but for the query sony vaio 4gb I'm getting the following in the debug output:

rawquerystring: sony vaio 4gb
querystring: sony vaio 4gb
parsedquery: +name:sony +name:vaio +MultiPhraseQuery(name:"(4gb 4) gb")
parsedquery_toString: +name:sony +name:vaio +name:"(4gb 4) gb"

Do you have any idea how I can avoid this MultiPhraseQuery?

Best Regards,
solr_beginner
Re: Problem with autogeneratePhraseQueries
Thank you very much for the answer. You were right. There was no luceneMatchVersion in the solrconfig.xml of our dev core. We thought that values not present in the core configuration were copied from the main solrconfig.xml. I will investigate whether our administrators did something wrong during the upgrade to 3.1.

On Tue, Apr 26, 2011 at 1:35 PM, Robert Muir wrote:
> What do you have in solrconfig.xml for luceneMatchVersion?
>
> If you don't set this, then it's going to default to "Lucene 2.9"
> emulation so that old Solr 1.4 configs work the same way. I tried your
> example and it worked fine here, and I'm guessing this is probably
> what's happening.
>
> the default in the example/solrconfig.xml looks like this:
>
> <luceneMatchVersion>LUCENE_31</luceneMatchVersion>
>
> On Tue, Apr 26, 2011 at 6:51 AM, Solr Beginner wrote:
> > Hi,
> >
> > I'm new to solr. My solr instance version is:
> >
> > Solr Specification Version: 3.1.0
> > Solr Implementation Version: 3.1.0 1085815 - grantingersoll - 2011-03-26 18:00:07
> > Lucene Specification Version: 3.1.0
> > Lucene Implementation Version: 3.1.0 1085809 - 2011-03-26 18:06:58
> > Current Time: Tue Apr 26 08:01:09 CEST 2011
> > Server Start Time: Tue Apr 26 07:59:05 CEST 2011
> >
> > I have following definition for textgen type:
> >
> > positionIncrementGap="100" autoGeneratePhraseQueries="false">
> > words="stopwords.txt" enablePositionIncrements="true" />
> > generateNumberParts="1" catenateWords="1" catenateNumbers="1" preserveOriginal="1"/>
> > maxGramSize="15" side="front" preserveOriginal="1"/>
> > ignoreCase="true" expand="true"/>
> > ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
> > generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" preserveOriginal="1"/>
> >
> > I'm using this type for name field in my index. 
As you can see I'm > > using autoGeneratePhraseQueries="false" but for query sony vaio 4gb I'm > > getting following query in debug: > > > > > > sony vaio 4gb > > sony vaio 4gb > > +name:sony +name:vaio > +MultiPhraseQuery(name:"(4gb > > 4) gb") > > +name:sony +name:vaio +name:"(4gb 4) > > gb" > > > > Do you have any idea how can I avoid this MultiPhraseQuery? > > > > Best Regards, > > solr_beginner > > >
Dynamically loading xml files from webapplication to index
In our webapp, we need to upload an XML data file from the UI (dialogue box) for indexing. We were not able to find a solution in the documentation. Please suggest a way to implement it.

--
View this message in context: http://lucene.472066.n3.nabble.com/Dynamically-loading-xml-files-from-webapplication-to-index-tp2865890p2865890.html
Sent from the Solr - User mailing list archive at Nabble.com.
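For reference, one common pattern (a sketch, not from the thread; the field names are assumptions): have the webapp take the uploaded file and POST it to Solr's XML update handler. The update message format looks like:

```xml
<add>
  <doc>
    <field name="id">doc-1</field>
    <field name="title">Example title</field>
  </doc>
</add>
```

POST this with Content-Type text/xml to http://localhost:8983/solr/update, then POST <commit/> so the new documents become visible to searches.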
Re: fieldCache only on stats page
Solr version:

Solr Specification Version: 3.1.0
Solr Implementation Version: 3.1.0 1085815 - grantingersoll - 2011-03-26 18:00:07
Lucene Specification Version: 3.1.0
Lucene Implementation Version: 3.1.0 1085809 - 2011-03-26 18:06:58
Current Time: Wed Apr 27 14:28:34 CEST 2011
Server Start Time: Wed Apr 27 11:07:00 CEST 2011

On the stats page I can see only the following cache information:

CACHE
name: fieldCache
class: org.apache.solr.search.SolrFieldCacheMBean
version: 1.0
description: Provides introspection of the Lucene FieldCache, this is **NOT** a cache that is managed by Solr.
sourceid: $Id: SolrFieldCacheMBean.java 984594 2010-08-11 21:42:04Z yonik $
source: $URL: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/solr/src/java/org/apache/solr/search/SolrFieldCacheMBean.java $

name: fieldValueCache
class: org.apache.solr.search.FastLRUCache
version: 1.0
description: Concurrent LRU Cache(maxSize=1, initialSize=10, minSize=9000, acceptableSize=9500, cleanupThread=false)
sourceid: $Id: FastLRUCache.java 1065312 2011-01-30 16:08:25Z rmuir $
source: $URL: https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_1/solr/src/java/org/apache/solr/search/FastLRUCache.java $

Nothing about filterCache or documentCache ;/

Best Regards,
Solr Beginner

On Wed, Apr 27, 2011 at 2:00 PM, Erick Erickson wrote:
> There's nothing special you need to do to be able to view the various
> stats from admin/stats.jsp. If another look doesn't show them, could you
> post a screenshot?
>
> And please include the version of Solr you're using; I checked with 1.4.1.
>
> Best
> Erick
>
> On Wed, Apr 27, 2011 at 1:44 AM, Solr Beginner wrote:
>> Hi,
>>
>> I can see only fieldCache (nothing about filter, query or document
>> cache) on the stats page. What am I doing wrong? We have two servers with
>> replication. There are two cores (prod, dev) on each server. Maybe I
>> have to add something to the solrconfig.xml of the cores?
>>
>> Best Regards,
>> Solr Beginner
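One possible explanation (a sketch, assuming the dev core's solrconfig.xml simply never declares these caches): caches only show up on the stats page if they are configured in the <query> section. The sizes below are illustrative, not recommendations:

```xml
<query>
  <!-- caches declared here appear on admin/stats.jsp once the core is reloaded -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
</query>
```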
Re: Extra facet query from within a custom search component
Haaa fantastic! Thanks a lot!

Fred.

On Thursday, 28 April 2011 at 22:21, Erick Erickson wrote:
> Have you looked at: http://wiki.apache.org/solr/TermsComponent?
>
> Best
> Erick
>
> On Thu, Apr 28, 2011 at 2:44 PM, Frederik Kraus wrote:
> > Hi Guys,
> >
> > I'm currently working on a custom search component and need to fetch a list
> > of all possible values within a certain field.
> > An internal facet (wildcard) query first came to mind, but I'm not quite
> > sure how to best create and then execute such a query ...
> >
> > What would be the best way to do this?
> >
> > Can anyone please point me in the right direction?
> >
> > Thanks,
> >
> > Fred.
Re: AlternateDistributedMLT.patch not working (SOLR-788)
On 2/23/2011 11:53 AM, Otis Gospodnetic wrote:
> Hi Isha,
> The patch is out of date. You need to look at the patch and rejection and update your local copy of the code to match the logic from the patch, if it's still applicable to the version of Solr source code you have.

We have a need for distributed More Like This. We're gearing up for a deployment of 3.1, so a patch against 1.4.1 is not very useful for us. I've spent the last couple of days trying to rework both the original and the alternate patches on SOLR-788 to work against 3.1. I don't understand enough about the code to know how to fix it. I knew I had to change the value of PURPOSE_GET_MLT_RESULTS to 0x800 because of the conflict with PURPOSE_GET_TERMS, but the changes in MoreLikeThisComponent.java are beyond me.

Thanks,
Shawn
Re: Spatial Search
1) Create an extra String field on your bean, as Yonik suggests, or
2) Write an UpdateRequestProcessor which reads the doubles and creates the LatLon value from them

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 28 Apr 2011, at 14:44, Yonik Seeley wrote:
> On Thu, Apr 28, 2011 at 5:15 AM, Jonas Lanzendörfer wrote:
>> I am new to solr and try to use the spatial search feature which was added
>> in 3.1. In my schema.xml I have 2 double fields for latitude and longitude.
>> How can I get them into the location field type? I use solrj to fill the
>> index with data. If I would use a location field instead of two double
>> fields, how could I fill this with solrj? I use annotations to link the data
>> from my dto´s to the index fields...
>
> I've not used the annotation stuff in SolrJ, but since the value sent
> in must be of the form 10.3,20.4 then
> I guess one would have to have a String field with this value on your object.
>
> -Yonik
> http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
> 25-26, San Francisco
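A minimal sketch of option 1. The class name, field name store, and coordinates are hypothetical; the point is just that the bean exposes the combined "lat,lon" string that a LatLonType field expects:

```java
// Hypothetical DTO: two doubles combined into the "lat,lon" string
// that Solr's LatLonType field takes. With SolrJ you would put
// @Field("store") on a String member holding this value before sending
// the bean (the annotation is omitted so the sketch is self-contained).
public class Poi {
    private final double latitude;
    private final double longitude;

    public Poi(double latitude, double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }

    // Value to index into the location field, e.g. "10.3,20.4"
    public String getStore() {
        return latitude + "," + longitude;
    }

    public static void main(String[] args) {
        System.out.println(new Poi(10.3, 20.4).getStore()); // prints 10.3,20.4
    }
}
```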
Re: manual background re-indexing
> It would probably be safest just to set up a separate system as
> multi-core from the start, get the process working, and then either use
> the new machine or copy the whole setup to the production machine.
>
> On Thu, Apr 28, 2011 at 8:49 AM, Paul Libbrecht wrote:
>> Just where do I put the new index data with such a command? Simply replacing the segment files appears dangerous to me.

Any idea where I should put the data directory before calling the reload command?

paul
Re: Re: manual background re-indexing
You simply create two cores, one in solr/cores/core1 and another in solr/cores/core2. They each have a separate conf and data directory, and the index is in core#/data/index. Really, it's just introducing one more level.

You can experiment just by configuring a core and copying your index to solr/cores/yourcore/data/index. After, of course, configuring solr.xml to understand cores.

Best
Erick

On Thu, Apr 28, 2011 at 7:27 PM, Paul Libbrecht wrote:
>> It would probably be safest just to set up a separate system as
>> multi-core from the start, get the process working and then either use
>> the new machine or copy the whole setup to the production machine.
>>
>> On Thu, Apr 28, 2011 at 8:49 AM, Paul Libbrecht wrote:
>>> Just where do I put the new index data with such a command? Simply
>>> replacing the segment files appears dangerous to me.
>
> Any idea where I should put the data directory before calling the reload
> command?
> paul
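Erick's layout maps to a solr.xml along these lines (core names and paths are illustrative):

```xml
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core1" instanceDir="cores/core1"/>
    <core name="core2" instanceDir="cores/core2"/>
  </cores>
</solr>
```

Once cores are configured, a rebuilt index can be picked up by reloading (or swapping) a core through the CoreAdmin handler, e.g. /solr/admin/cores?action=RELOAD&core=core1.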
Location of Solr Logs
Hi,

I am a newbie to Solr. Can you please help me find where I can see the logs written by Solr? Is there any configuration required to see the Solr logs?

Thanks for your time and help,
Geeta
Can the Suggester be updated incrementally?
I'm interested in using Suggester (http://wiki.apache.org/solr/Suggester) for auto-complete on the field "Document Title". Does Suggester (either FST, TST or Jaspell) support incremental updates? Say I want to add a new document title to the Suggester, or to change the weight of an existing document title, would I need to rebuild the entire tree for every update? Also, can the Suggester be sharded? If the size of the tree gets bigger than the RAM size, is it possible to shard the Suggester across multiple machines? Thanks Andy
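For context, the Suggester from that wiki page is wired up as a spellcheck-style search component, roughly like this (the component name and the field name title are assumptions):

```xml
<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <!-- TST-based lookup; FST and Jaspell variants plug in the same way -->
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <!-- field whose indexed terms feed the suggester -->
    <str name="field">title</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```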
Re: Can the Suggester be updated incrementally?
It's answered on the wiki site:

"TSTLookup - ternary tree based representation, capable of immediate data structure updates"

Although the EdgeNGram technique is probably more widely adopted; e.g., it's closer to what Google has implemented.

http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

On Thu, Apr 28, 2011 at 9:37 PM, Andy wrote:
> I'm interested in using Suggester (http://wiki.apache.org/solr/Suggester) for
> auto-complete on the field "Document Title".
>
> Does Suggester (either FST, TST or Jaspell) support incremental updates? Say
> I want to add a new document title to the Suggester, or to change the weight
> of an existing document title, would I need to rebuild the entire tree for
> every update?
>
> Also, can the Suggester be sharded? If the size of the tree gets bigger than
> the RAM size, is it possible to shard the Suggester across multiple machines?
>
> Thanks
> Andy
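The EdgeNGram alternative lives entirely in schema.xml; a sketch of such a field type (the type name and gram sizes are assumptions):

```xml
<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- keep each title as one token, then expand it into its prefixes -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front"/>
  </analyzer>
  <analyzer type="query">
    <!-- the user's partial input is matched as-is against the indexed prefixes -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because each title is expanded into prefixes at index time, a prefix lookup becomes an ordinary term query, and updates arrive through normal document adds and commits.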
Re: Question on Batch process
Charles,

Maybe the question to ask is why you are committing at all? Do you need somebody to see index changes while you are indexing? If not, commit just at the end. And optimize if you won't touch the index for a while.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message -----
> From: Charles Wardell
> To: solr-user@lucene.apache.org
> Sent: Wed, April 27, 2011 7:51:20 PM
> Subject: Re: Question on Batch process
>
> Thank you for your response. I did not make the StreamingUpdate application yet, but I did change the other settings that you mentioned. It gave me a huge boost in indexing speed. (I am still using post.sh but hope to change that soon.)
>
> One thing I noticed is the indexing speed was incredibly fast last night, but today the commits are taking so long. Is this to be expected?
>
> --
> Best Regards,
> Charles Wardell
> Blue Chips Technology, Inc.
> www.bcsolution.com
>
> On Wednesday, April 27, 2011 at 6:15 PM, Otis Gospodnetic wrote:
> > Hi Charles,
> >
> > Yes, the threads I was referring to are in the context of the client/indexer, so one of the params for StreamingUpdateSolrServer.
> > post.sh/jar are just there because they are handy. Don't use them for production.
> >
> > It's impossible to tell how long indexing of 100M documents may take. They could be very big or very small. You could perform very light or no analysis or heavy analysis. They could contain 1 or 100 fields. :)
> >
> > Otis
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/
> >
> > ----- Original Message -----
> > > From: Charles Wardell
> > > To: solr-user@lucene.apache.org
> > > Sent: Tue, April 26, 2011 8:01:28 PM
> > > Subject: Re: Question on Batch process
> > >
> > > Thank you Otis. 
> > > Without trying to appear too stupid: when you refer to having the params
> > > matching my # of CPU cores, you are talking about the # of threads I can
> > > spawn with the StreamingUpdateSolrServer object?
> > > Up until now, I have been just utilizing post.sh or post.jar. Are these
> > > capable of that, or do I need to write some code to collect a bunch of
> > > files into the buffer and send it off?
> > >
> > > Also, do you have a sense for how long it should take to index 100,000
> > > files, or in my case 100,000,000 documents?
> > >
> > > StreamingUpdateSolrServer:
> > > public StreamingUpdateSolrServer(String solrServerUrl, int queueSize, int threadCount) throws MalformedURLException
> > >
> > > Thanks again,
> > > Charlie
> > >
> > > --
> > > Best Regards,
> > > Charles Wardell
> > > Blue Chips Technology, Inc.
> > > www.bcsolution.com
> > >
> > > On Tuesday, April 26, 2011 at 5:12 PM, Otis Gospodnetic wrote:
> > > > Charlie,
> > > >
> > > > How's this:
> > > > * -Xmx2g
> > > > * ramBufferSizeMB 512
> > > > * mergeFactor 10 (default, but you could up it to 20 or 30 if ulimit -n allows)
> > > > * ignore/delete maxBufferedDocs - not used if you set ramBufferSizeMB
> > > > * use StreamingUpdateSolrServer (with params matching your number of CPU cores)
> > > > or send batches of, say, 1000 docs with the other SolrServer impl using N threads
> > > > (N = # of your CPU cores)
> > > >
> > > > Otis
> > > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > > > Lucene ecosystem search :: http://search-lucene.com/
> > > >
> > > > ----- Original Message -----
> > > > > From: Charles Wardell
> > > > > To: solr-user@lucene.apache.org
> > > > > Sent: Tue, April 26, 2011 2:32:29 PM
> > > > > Subject: Question on Batch process
> > > > >
> > > > > I am sure that this question has been asked a few times, but I can't seem
> > > > > to find the sweet spot for indexing.
> > > > >
> > > > > I have about 100,000 files each containing 1,000 xml documents ready to be
> > > > > posted to Solr. My desire is to have it index as quickly as possible, and
> > > > > then once completed the daily stream of ADDs will be small in comparison.
> > > > >
> > > > > The individual documents are small. Essentially web postings from the net:
> > > > > Title, postPostContent, date.
> > > > >
> > > > > What would be the ideal configuration? For ramBufferSizeMB, mergeFactor,
> > > > > maxBufferedDocs, etc.
> > > > >
> > > > > My machine is a quad core hyper-threaded, so it shows up as 8 CPUs in top.
> > > > > I have 16GB of available RAM.
> > > > >
> > > > > Thanks in advance.
> > > > > Charlie
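Otis's server-side settings map onto the <indexDefaults> section of solrconfig.xml roughly like this (a sketch using the values from his list, not a drop-in config). The client side pairs it with new StreamingUpdateSolrServer(url, queueSize, threadCount), with threadCount matching the number of CPU cores:

```xml
<indexDefaults>
  <!-- buffer ~512MB of documents before flushing a segment -->
  <ramBufferSizeMB>512</ramBufferSizeMB>
  <!-- default; can be raised to 20-30 if ulimit -n allows more open files -->
  <mergeFactor>10</mergeFactor>
  <!-- maxBufferedDocs deliberately omitted: it is not used once ramBufferSizeMB is set -->
</indexDefaults>
```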
Re: Can the Suggester be updated incrementally?
--- On Fri, 4/29/11, Jason Rutherglen wrote: > It's answered on the wiki site: > > "TSTLookup - ternary tree based representation, capable of > immediate > data structure updates" > But how to update it? The wiki talks about getting data sources from a file or from the main index. In either case it sounds like the entire data structure will be rebuilt, no?
Re: Location of Solr Logs
You can see Solr's logs in your servlet container's log file, i.e. if you are using Tomcat they can be found at [CATALINA_HOME]/logs/catalina.XXX.log

-Thanx:
Grijesh
www.gettinhahead.co.in
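To give Solr its own log file instead of the container default (a sketch; the path is illustrative), you can point the JVM at a custom java.util.logging configuration, since Solr logs through java.util.logging by default:

```properties
# Pass to the JVM with:
#   -Djava.util.logging.config.file=/path/to/logging.properties
handlers = java.util.logging.FileHandler
.level = INFO
java.util.logging.FileHandler.pattern = /var/log/solr/solr-%u.log
java.util.logging.FileHandler.formatter = java.util.logging.SimpleFormatter
```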