Re: Any problem in running two Solr instances on the same machine using the same directory?
I am indexing data provided by the users of our web site. If load on the site increases, the rate of commits also increases. The nature of the data is such that it should be reflected in the index instantaneously.

On Sat, Sep 27, 2008 at 4:00 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:

> On Fri, Sep 26, 2008 at 2:18 AM, Jagadish Rath <[EMAIL PROTECTED]> wrote:
> > - What are the other solutions to the problem of the
> >   "maxWarmingSearchers limit exceeded" error?
>
> Don't commit so rapidly?
> What is the reason for your high commit rate?
>
> -Yonik
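For reference, the knobs involved here live in solrconfig.xml; a sketch with illustrative values (the element names come from the stock example config, but the numbers are assumptions to tune for your own load):

```xml
<!-- Cap on concurrently warming searchers; the error in question
     fires when rapid commits push past this limit. -->
<maxWarmingSearchers>2</maxWarmingSearchers>

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Let Solr batch commits instead of the client committing on
       every user action. -->
  <autoCommit>
    <maxDocs>10000</maxDocs> <!-- commit after this many pending docs -->
    <maxTime>60000</maxTime> <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>
```

With batched commits the client can stop issuing explicit commits entirely, which directly addresses Yonik's "don't commit so rapidly" suggestion.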
Re: Searching Question
On Sep 26, 2008, at 2:10 PM, Otis Gospodnetic wrote:

> It might be easiest to store the thread ID and the number of replies in
> the thread in each post Document in Solr.

Yeah, but that would mean updating every document in a thread every time a new reply is added.

I still keep going back to the solution of putting all the replies in a single document and then using a custom Similarity that overrides the TF function and/or the length normalization. Still, this suffers from having to update the document for every new reply.

Let's take a step back... Can I ask why you want the scoring this way? What have you seen in your results that leads you to believe it is the correct way? Note, I'm not trying to convince you it's wrong, I just want to better understand what's going on.

> Otherwise it sounds like you'll have to combine some search results or
> data post-search.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> From: Jake Conk <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Friday, September 26, 2008 1:50:37 PM
> Subject: Re: Searching Question
>
> Grant,
>
> Each post is its own document, but I can merge them all into a single
> document under one thread if that will allow me to do what I want. The
> number of replies is stored both in Solr and the DB.
>
> Thanks,
> - JC
>
> On Fri, Sep 26, 2008 at 5:24 AM, Grant Ingersoll wrote:
>
> > Is a thread and all of its posts a single document? In other words,
> > how are you modeling your posts as Solr documents? Also, where are
> > you keeping track of the number of replies? Is that in Solr or in a
> > DB?
> >
> > -Grant
> >
> > On Sep 25, 2008, at 8:51 PM, Jake Conk wrote:
> >
> > > Hello,
> > >
> > > We are using Solr for our new forums search feature. If possible,
> > > when searching for the word "Halo" we would like threads that
> > > contain the word "Halo" the most with the fewest posts in that
> > > thread to have a higher score.
> > >
> > > For instance, if we have a thread with 10 posts and the word "Halo"
> > > shows up 5 times, that should have a lower score than a thread
> > > where the word "Halo" appears 3 times within its posts and has 5
> > > replies. Basically, the thread that shows the search string most
> > > frequently relative to the number of posts in the thread should be
> > > the one with the highest score. Is something like this possible?
> > >
> > > Thanks,
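The ranking Jake describes, term hits relative to post count, can be checked outside Solr with a toy calculation (this is only an illustration of the desired arithmetic, not Solr's actual scoring; `thread_score` is a made-up helper):

```python
def thread_score(term_hits, num_posts):
    """Toy score: how concentrated the search term is across a thread's posts."""
    return term_hits / num_posts

# 3 hits spread over 5 posts should outrank 5 hits spread over 10 posts.
low = thread_score(5, 10)   # 0.5
high = thread_score(3, 5)   # 0.6
```

This is essentially a per-thread term frequency with the thread's post count as the length norm, which is why Grant's suggestion of overriding TF and/or length normalization in a custom Similarity points in the same direction.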
Re: Any problem in running two Solr instances on the same machine using the same directory?
Solr today is not suited for real-time search (seeing newly added docs in search results as soon as they've been added - the way databases work, for example). Work on that is in progress, though.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Jagadish Rath <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Saturday, September 27, 2008 6:24:02 AM
> Subject: Re: Any problem in running two Solr instances on the same
> machine using the same directory?
>
> I am indexing data provided by the users of our web site. If load on
> the site increases, the rate of commits also increases. The nature of
> the data is such that it should be reflected in the index
> instantaneously.
>
> On Sat, Sep 27, 2008 at 4:00 AM, Yonik Seeley wrote:
>
> > On Fri, Sep 26, 2008 at 2:18 AM, Jagadish Rath wrote:
> > > - What are the other solutions to the problem of the
> > >   "maxWarmingSearchers limit exceeded" error?
> >
> > Don't commit so rapidly?
> > What is the reason for your high commit rate?
> >
> > -Yonik
Re: Any problem in running two Solr instances on the same machine using the same directory?
The question I have is: what is the optimal approach for integrating real-time search into Solr? What classes should be extended or created?

On Sat, Sep 27, 2008 at 9:40 AM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

> Solr today is not suited for real-time search (seeing newly added docs
> in search results as soon as they've been added - the way databases
> work, for example). Work on that is in progress, though.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
>> From: Jagadish Rath <[EMAIL PROTECTED]>
>> To: solr-user@lucene.apache.org
>> Sent: Saturday, September 27, 2008 6:24:02 AM
>> Subject: Re: Any problem in running two Solr instances on the same
>> machine using the same directory?
>>
>> I am indexing data provided by the users of our web site. If load on
>> the site increases, the rate of commits also increases. The nature of
>> the data is such that it should be reflected in the index
>> instantaneously.
>>
>> On Sat, Sep 27, 2008 at 4:00 AM, Yonik Seeley wrote:
>>
>> > On Fri, Sep 26, 2008 at 2:18 AM, Jagadish Rath wrote:
>> > > - What are the other solutions to the problem of the
>> > >   "maxWarmingSearchers limit exceeded" error?
>> >
>> > Don't commit so rapidly?
>> > What is the reason for your high commit rate?
>> >
>> > -Yonik
Updating the index with a csv file
Hello,

I would like to update my index with a CSV file, but for some reason I get the following error:

"The request sent by the client was syntactically incorrect (missing content stream)"

I get it after using the following statement:

curl http://localhost:8983/solr/update/csv --data-binary @blog.csv -H 'Content-type:text/plain; charset=utf-8'

I use the Windows version of curl, running this statement from the curl folder where the blog.csv file resides as well.

Thank you.
--
View this message in context: http://www.nabble.com/Updating-the-index-with-a-csv-file-tp19706582p19706582.html
Sent from the Solr - User mailing list archive at Nabble.com.
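Independent of the curl issue, it is worth confirming the CSV payload itself is well-formed; a minimal sketch of building one in Python (the field names `id` and `title` are hypothetical; use the fields from your own schema.xml):

```python
import csv
import io

# Hypothetical documents; field names must match fields defined in schema.xml.
docs = [
    {"id": "1", "title": "First blog post"},
    {"id": "2", "title": "Second blog post"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "title"])
writer.writeheader()    # Solr's CSV handler reads field names from the header row
writer.writerows(docs)
payload = buf.getvalue()
```

Writing the file this way also handles quoting and escaping of commas inside field values, which is a common cause of a CSV upload silently indexing the wrong data.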
DataImportHandler: way to merge multiple db-rows to 1 doc using transformer?
Looking at the wiki and code of DataImportHandler, it looks impressive. There's talk about ways to use Transformers to create several rows (Solr docs) based on a single db row.

I'd like to know if it's possible to do the exact opposite: to build custom transformers that take multiple db rows and merge them into a single Solr row/document. If so, how?

Thanks,
Britske
--
View this message in context: http://www.nabble.com/DataImportHandler%3A-way-to-merge-multiple-db-rows-to-1-doc-using-transformer--tp19706722p19706722.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: DataImportHandler: way to merge multiple db-rows to 1 doc using transformer?
If I understand your question right, you would not need a transformer; basically you nest entities under each other, ie:

<dataSource driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost/nhldb?connectTimeout=0&autoReconnect=true"
            user="root" password="" batchSize="-1"/>
<entity ...>
  <entity ...
          processor="org.apache.solr.handler.dataimport.CachedSqlEntityProcessor">
    ...
  </entity>
</entity>

I believe those are the basic steps. Look up CachedSqlEntityProcessor to see if you need it.

- Jon

On Sep 27, 2008, at 5:47 PM, Britske wrote:

> Looking at the wiki and code of DataImportHandler, it looks impressive.
> There's talk about ways to use Transformers to create several rows
> (Solr docs) based on a single db row.
>
> I'd like to know if it's possible to do the exact opposite: to build
> custom transformers that take multiple db rows and merge them into a
> single Solr row/document. If so, how?
>
> Thanks,
> Britske
>
> --
> View this message in context:
> http://www.nabble.com/DataImportHandler%3A-way-to-merge-multiple-db-rows-to-1-doc-using-transformer--tp19706722p19706722.html
> Sent from the Solr - User mailing list archive at Nabble.com.
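The many-rows-to-one-document collapse that DIH's nested entities perform can be pictured in plain Python; a sketch with made-up row and field names (DIH does the same grouping declaratively from the entity queries):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows as a SQL driver might return them, ordered by thread_id
# (groupby requires the input sorted on the grouping key).
rows = [
    {"thread_id": 1, "body": "first post"},
    {"thread_id": 1, "body": "a reply"},
    {"thread_id": 2, "body": "lone post"},
]

# One Solr document per thread, with the post bodies as a multivalued field.
docs = [
    {"id": tid, "posts": [r["body"] for r in group]}
    for tid, group in groupby(rows, key=itemgetter("thread_id"))
]
```

The child entity's rows end up as multiple values of a field on the parent document, which is exactly the "merge multiple db rows into one doc" Britske asked about.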
Re: DataImportHandler: way to merge multiple db-rows to 1 doc using transformer?
Make a view in your database and index that. No point in duplicating database views in Solr.

--wunder

On 9/27/08 2:47 PM, "Britske" <[EMAIL PROTECTED]> wrote:

> Looking at the wiki and code of DataImportHandler, it looks impressive.
> There's talk about ways to use Transformers to create several rows
> (Solr docs) based on a single db row.
>
> I'd like to know if it's possible to do the exact opposite: to build
> custom transformers that take multiple db rows and merge them into a
> single Solr row/document. If so, how?
>
> Thanks,
> Britske
Re: Updating the index with a csv file
: "The request sent by the client was syntactically incorrect (missing
: content stream)"

That usually means either the content type wasn't set, or there was no post data.

: curl http://localhost:8983/solr/update/csv --data-binary @blog.csv -H
: 'Content-type:text/plain; charset=utf-8'
:
: I use the windows version of curl, running this statement from the curl
: folder where the blog.csv file resides as well.

My gut assumption was that you needed some whitespace in the content type (ie: 'Content-type: text/plain; charset=utf-8'), but I was able to get this to work just fine on Linux using the example setup...

curl 'http://localhost:8983/solr/update/csv?commit=true' --data-binary @books.csv -H 'Content-type:text/plain; charset=utf-8'

...perhaps there is some eccentricity about the Windows curl?

-Hoss