Solr for NoSQL
Hi, is there a DataImportHandler that can quickly read data from a NoSQL database, specifically MongoDB, which I am thinking of using? Or, a more general question: how does Solr work with NoSQL databases? Thanks. Jianbin
embedded solrj doesn't refresh index
Hi, I am using embedded solrj. After I add a new doc to the index, I can see the changes through the Solr web interface, but not from embedded solrj. But after I restart embedded solrj, I do see the changes. It works as if there were a cache. Does anyone know the problem? Thanks. Jianbin
RE: embedded solrj doesn't refresh index
Hi, thanks for the response. Here is the whole picture: I use DIH to import and index data, and use embedded solrj, connected to the same index files, for search and other operations. Here is what I found: once data are indexed (and committed), I can see the changes through the Solr web server, but not from embedded solrj. If I restart the embedded Solr server, I do see the changes. Hope that helps. Thanks. -----Original Message----- From: Marco Martinez [mailto:mmarti...@paradigmatecnologico.com] Sent: Wednesday, July 20, 2011 5:09 AM To: solr-user@lucene.apache.org Subject: Re: embedded solrj doesn't refresh index > You should send a commit to your embedded Solr. > Marco Martínez Bautista, http://www.paradigmatecnologico.com
RE: embedded solrj doesn't refresh index
Thanks, Marc. I guess I was not clear in my previous statement, so let me rephrase. I use DIH to import data into Solr and do the indexing; everything works fine. I have another embedded Solr server pointing to the same index files, and I use embedded solrj to search them. So the first Solr instance is for indexing only; it can be turned off once the indexing is done. However, changes in the index files do not show up in embedded solrj: once the new index is built, embedded solrj still returns the old results. Only after I restart the embedded Solr server are the new changes reflected in solrj. Embedded solrj behaves as if there were a cache that it always consults first. Thanks. JB -----Original Message----- From: Marc Sturlese [mailto:marc.sturl...@gmail.com] Sent: Friday, July 22, 2011 1:57 AM To: solr-user@lucene.apache.org Subject: RE: embedded solrj doesn't refresh index > Are you indexing with a full import? If so, and the resulting index has a similar number of docs to the one you had before, try setting reopenReaders to false in solrconfig.xml. You have to send the commit, of course.
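The workaround discussed in this thread can be sketched in SolrJ terms. This is a hedged sketch, not code from the thread: it assumes a SolrJ 3.x-era EmbeddedSolrServer already constructed for the search side, and shows that an (empty) commit on that instance is what forces it to reopen its searcher on the shared index directory:

```java
// Sketch: "searchServer" is assumed to be the EmbeddedSolrServer used for
// querying, pointed at the same index directory that DIH writes to.
void pickUpIndexChanges(
        org.apache.solr.client.solrj.embedded.EmbeddedSolrServer searchServer)
        throws Exception {
    // An empty commit opens a new IndexSearcher, so segments written and
    // committed by the separate indexing process become visible without
    // restarting the embedded server.
    searchServer.commit();
}
```

Without something like this (or the reopenReaders tuning suggested above), the embedded instance keeps serving its original searcher, which is the cache-like behavior described in the thread.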
Help needed on DataImportHandler to index xml files
Hi all, I am new here. Thanks for reading my question. I want to use DataImportHandler to index my tons of xml files (7GB total) stored on my local disk. My data-config.xml is attached below. It works fine with one file (abc.xml), but how can I index all the xml files at one time? Thanks!
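The standard DIH answer for a whole directory of files is to wrap the per-file entity in a FileListEntityProcessor. A sketch follows; the directory, record paths, and field names are assumptions, since the original data-config.xml did not survive in the archive:

```xml
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- Outer entity enumerates the files; rootEntity="false" so Solr
         documents come from the inner, per-record entity instead. -->
    <entity name="f" processor="FileListEntityProcessor"
            baseDir="/path/to/xml/dir" fileName=".*\.xml"
            recursive="true" rootEntity="false" dataSource="null">
      <!-- Inner entity parses each file the outer entity found. -->
      <entity name="rec" processor="XPathEntityProcessor"
              url="${f.fileAbsolutePath}" forEach="/records/record"
              stream="true">
        <field column="id"    xpath="/records/record/id"/>
        <field column="title" xpath="/records/record/title"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

A single full-import against this config then walks every .xml file under baseDir in one run.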
How to index large set data
Hi, I have about 45GB of xml files to be indexed. I am using DataImportHandler. I started the full import 4 hours ago, and it's still running. My computer has 4GB of memory. Any suggestions? Thanks! JB
Re: How to index large set data
Hi Paul, thank you so much for answering my questions. It really helped. After some adjustment, basically setting mergeFactor to 1000 from the default value of 10, I can finish the whole job in 2.5 hours. I checked that during the run only around 18% of memory is used, and VIRT is always 1418m. I am thinking it may be restricted by the JVM memory setting. But I run the data import command through the web, i.e., http://<host>:<port>/solr/dataimport?command=full-import, so how can I set the memory allocation for the JVM? Thanks again! JB --- On Thu, 5/21/09, Noble Paul നോബിള് नोब्ळ् wrote: > Check the status page of DIH and see if it is working properly, and if yes, what the rate of indexing is. > -- Noble Paul | Principal Engineer | AOL | http://aol.com
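The heap cannot be set on the import URL: DIH runs inside the servlet container's JVM, so the memory limit is set wherever Tomcat is started. A typical way with Tomcat (the sizes below are examples, not recommendations):

```shell
# Example: give the Tomcat JVM that hosts Solr (and therefore DIH) a 2GB heap.
# Usually placed in $CATALINA_HOME/bin/setenv.sh so startup.sh picks it up.
export JAVA_OPTS="-Xms512m -Xmx2048m"
```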
Re: How to index large set data
About 2.8M total docs were created. Only the first run finishes; in my 2nd try it hangs forever at the end of indexing (I guess right before the commit), with CPU usage at 100%. A total of 5GB (2050) of index files are created. Now I have two problems: 1. Why does it hang there and fail? 2. How can I speed up the indexing? Here is my solrconfig.xml: false 3000 1000 2147483647 1 false (the XML tag names were stripped in the archive; 3000 is the ramBufferSizeMB and 1000 the mergeFactor mentioned earlier). --- On Thu, 5/21/09, Noble Paul നോബിള് नोब्ळ् wrote: > What is the total no. of docs created? I guess it may not be memory bound; indexing is mostly an IO-bound operation. You may be able to get better perf if an SSD (solid state disk) is used. > -- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: How to index large set data
I don't know exactly how this 3G RAM buffer is used. But what I noticed was that both the index size and the file count kept increasing, and then it got stuck at the commit. --- On Fri, 5/22/09, Otis Gospodnetic wrote: > Hi, those settings are a little "crazy". Are you sure you want to give Solr/Lucene 3G to buffer documents before flushing them to disk? Are you sure you want to use a mergeFactor of 1000? Check the logs to see if there are any errors. Look at the index directory to see if Solr is actually still writing to it (file sizes changing, number of files changing). kill -QUIT the JVM pid to see where things are "stuck", if they are stuck... > Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: How to index large set data
If I do the xml parsing myself and use an embedded client to do the push, would it be more efficient than DIH? --- On Fri, 5/22/09, Grant Ingersoll wrote: > Can you parallelize this? I don't know that the DIH can handle it, but having multiple threads sending docs to Solr is the best performance-wise, so maybe you need to look at alternatives to pulling with DIH and instead use a client to push into Solr. > -- Grant Ingersoll http://www.lucidimagination.com/ Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene: http://www.lucidimagination.com/search
How to use DIH to index attributes in xml file
I have an xml file like this (most of the markup was stripped in the archive): a merchantProduct element carrying "id" and "mid" attributes, with a price child element containing 301.46. In the data-config.xml, I use a field whose xpath points at the merchantProduct/price element, but how can I index "id" and "mid"? Thanks.
Re: How to index large set data
Hi Paul, but in your previous post you said "there is already an issue for writing to Solr in multiple threads, SOLR-1089". Do you think using solrj alone would be better than DIH? Thanks, and have a good weekend! --- On Fri, 5/22/09, Noble Paul നോബിള് नोब्ळ् wrote: > No need to use an embedded SolrServer. You can use SolrJ with streaming, in multiple threads. > -- Noble Paul | Principal Engineer | AOL | http://aol.com
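The push-with-multiple-threads idea above can be sketched without any Solr dependency. The DocSink interface below is a hypothetical stand-in for a SolrJ client's add() call (in Solr 1.4 the usual class for this job is StreamingUpdateSolrServer, which batches and streams from several threads internally); the batching/threading pattern is the point:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelPush {
    // Hypothetical stand-in for a SolrJ client's add(batch) call.
    interface DocSink { void add(List<String> batch); }

    // Split the docs into batches and push them from several threads,
    // the alternative to single-threaded DIH pulling discussed above.
    static void push(List<String> docs, int batchSize, int threads, DocSink sink)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < docs.size(); i += batchSize) {
            List<String> batch = new ArrayList<>(
                    docs.subList(i, Math.min(i + batchSize, docs.size())));
            pool.submit(() -> sink.add(batch));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 1000; i++) docs.add("doc" + i);
        AtomicInteger sent = new AtomicInteger();
        push(docs, 100, 4, batch -> sent.addAndGet(batch.size()));
        System.out.println(sent.get()); // 1000
    }
}
```

In real use the sink would wrap server.add(docs), with a single commit at the end; committing per batch would bring back the merge pressure discussed earlier in the thread.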
Re: How to use DIH to index attributes in xml file
Oh, I guess I didn't say it clearly in my post. I didn't use wildcards in the xpath. My question was how to index the attributes "id" and "mid" in the xml file from my earlier mail (the markup was stripped in the archive): a merchantProduct element with "id" and "mid" attributes, child elements with type="stock-4" and type="cond-0", and a price element containing 301.46. In the data-config.xml I use a field with xpath="/.../merchantProduct/price", but what are the xpaths for "id" and "mid"? Thanks again! --- On Fri, 5/22/09, Noble Paul നോബിള് नोब्ळ् wrote: > Wild cards are not supported. You must use the full xpath. > -- Noble Paul | Principal Engineer | AOL | http://aol.com
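For the record, DIH's XPathEntityProcessor addresses attributes with the /@attr syntax. A sketch; "products" as the root element name is an assumption, since the real xpath was truncated to "/.../" in the mail:

```xml
<!-- "products" is an assumed root element; keep your real leading path. -->
<field column="price" xpath="/products/merchantProduct/price"/>
<field column="id"    xpath="/products/merchantProduct/@id"/>
<field column="mid"   xpath="/products/merchantProduct/@mid"/>
```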
Re: How to index large set data
Hi Paul, hope you are having a great weekend so far. I still have a couple of questions you might help me with: 1. In your earlier email you said: "if possible, you can set up multiple DIH, say /dataimport1, /dataimport2 etc., split your files, and achieve parallelism". I am not sure I understand it right. I put two requestHandlers in solrconfig.xml, pointing at ./data-config.xml and ./data-config2.xml respectively (the XML was stripped in the archive), created data-config.xml and data-config2.xml, and then ran the command http://host:8080/solr/dataimport?command=full-import. But only one data set (the first one) was indexed. Did I get something wrong? 2. I noticed that after Solr indexed about 8M documents (around two hours), it gets very, very slow. I used the "top" command in Linux and noticed that RES is 1g of memory. I did several experiments; every time RES reaches 1g, the indexing process becomes extremely slow. Is this memory limit set by the JVM? And how can I set the JVM memory when I use DIH through the web full-import command? Thanks! JB --- On Fri, 5/22/09, Noble Paul നോബിള് नोब्ळ् wrote: > On Sat, May 23, 2009, Jianbin Dai wrote: >> Do you think using solrj alone would be better than DIH? > Nope, you will have to do the indexing in multiple threads. If possible, you can set up multiple DIH, say /dataimport1, /dataimport2 etc., split your files, and achieve parallelism.
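A sketch of the two-handler setup being attempted (the handler XML in the original mail was stripped; note also that each handler must be triggered by its own URL, which may be why only one import ran):

```xml
<requestHandler name="/dataimport1"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">./data-config.xml</str>
  </lst>
</requestHandler>

<requestHandler name="/dataimport2"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">./data-config2.xml</str>
  </lst>
</requestHandler>
```

With this layout the two imports are started separately, e.g. http://host:8080/solr/dataimport1?command=full-import and http://host:8080/solr/dataimport2?command=full-import; hitting only /dataimport runs only a handler registered under that exact name.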
Is it memory leaking in solr?
I am using DIH to do indexing. After I indexed about 8M documents (it took about 1h40m), it used up almost all the memory (4GB), and the indexing became extremely slow. If I delete the whole index and shut down Tomcat, it still shows over 3GB of memory used. Is it a memory leak? If it is, is the leak in Solr indexing or in DIH? Thanks.
Re: Is it memory leaking in solr?
Again, indexing becomes extremely slow after indexing 8M documents (about 25GB of original file size). Here is the memory usage info of my computer. Does this have anything to do with the Tomcat settings? Thanks.

top - 08:09:53 up 7:22, 1 user, load average: 1.03, 1.01, 1.00
Tasks: 78 total, 2 running, 76 sleeping, 0 stopped, 0 zombie
Cpu(s): 49.9%us, 0.2%sy, 0.0%ni, 49.8%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4044776k total, 3960740k used, 84036k free, 42196k buffers
Swap: 2031608k total, 84k used, 2031524k free, 2729892k cached

 PID USER PR NI VIRT  RES  SHR S %CPU %MEM  TIME+     COMMAND
3322 root 21  0 1357m 1.0g 11m S  100 27.0 397:51.74 java
Re: Is it memory leaking in solr?
Hi Otis, the slowness was due to the JVM memory limit set by Tomcat; I have solved this problem. Initially I thought there might be a memory leak because I noticed the following behavior: at the peak of indexing, almost all 4GB of memory was used. Once indexing was done, memory usage was about 3GB. If I deleted the whole index and shut down Solr, I still saw about 2GB of memory used, much more than the initial usage of about 250M. I am not sure if my guess is right. Thanks. --- On Tue, 5/26/09, Otis Gospodnetic wrote: > Jianbin, if you connect to that Java process with jconsole, do you see a lot of garbage collection activity? What makes you think there is a memory leak? The slowness? > Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
how to do exact search with solrj
Hi, I want to search for "hello the world" in the "title" field using solrj. I set up the query like this: query.addFilterQuery("title"); query.setQuery("hello the world"); but it returns non-exact matches as well. I know one way to do it is to make the "title" field a string instead of text, but is there another way? If I do the search through the Solr Admin web interface with title:"hello the world", it returns exact matches. Thanks. JB
Re: how to do exact search with solrj
I tried, but it doesn't seem to work right. --- On Sat, 5/30/09, Avlesh Singh wrote: > query.setQuery("title:hello the world") is what you need. > Cheers, Avlesh
Re: how to do exact search with solrj
That's correct! Thanks Avlesh. --- On Sat, 5/30/09, Avlesh Singh wrote: > You need an exact match for all three tokens? If yes, try query.setQuery("title:\"hello the world\""); > Cheers, Avlesh
Index Comma Separated numbers
Hi, one of the fields to be indexed is price, which is comma separated, e.g., 12,034.00. How can I index it as a number? I am using DIH to pull the data. Thanks.
Re: how to do exact search with solrj
I still have a problem with exact matching. query.setQuery("title:\"hello the world\""); This returns all docs whose title *contains* "hello the world", i.e., "hello the world, Jack" is also matched. What I want is exactly "hello the world". Setting this field to string instead of text doesn't work well either, because I want something like "Hello, The World" to be matched as well. Any idea? Thanks. JB --- On Sat, 5/30/09, Avlesh Singh wrote: > From: Avlesh Singh > Subject: Re: how to do exact search with solrj > To: solr-user@lucene.apache.org > Date: Saturday, May 30, 2009, 11:45 PM > You need exact match for all three tokens? > If yes, try query.setQuery("title:\"hello the world\""); > > Cheers > Avlesh
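One common way to get whole-field exact matching that still tolerates case and punctuation differences (so "Hello, The World" matches "hello the world") is a schema.xml field type that keeps the whole value as a single token and normalizes it. This is a sketch, not an answer from the thread; the type name and regexes are illustrative:

```xml
<!-- schema.xml sketch: one token per field value, lowercased, punctuation
     stripped and whitespace collapsed, so matching is exact on the whole
     title while ignoring case and punctuation. -->
<fieldType name="string_exact_ci" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- drop everything that is not a letter, digit, or space -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="[^\p{L}\p{N} ]" replacement="" replace="all"/>
    <!-- squeeze runs of whitespace into a single space -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="\s+" replacement=" " replace="all"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>
```

Index the title into a second field of this type and query that field with the quoted phrase; the original text field stays available for partial matches.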
Re: Index Comma Separated numbers
Hi, yes, I put it in data-config.xml, like the following: <entity dataSource="xmlreader" processor="XPathEntityProcessor" url="${f.fileAbsolutePath}" forEach="/abc/def/gh" transformer="NumberFormatTransformer"> But it's not working on comma-separated numbers. Did I miss something? Thanks. --- On Thu, 6/4/09, Noble Paul നോബിള്‍ नोब्ळ् wrote: > From: Noble Paul നോബിള്‍ नोब्ळ् > Subject: Re: Index Comma Separated numbers > To: solr-user@lucene.apache.org > Date: Thursday, June 4, 2009, 9:24 PM > did you try the NumberFormatTransformer? > > On Fri, Jun 5, 2009 at 12:09 AM, Jianbin Dai wrote: > > Hi, one of the fields to be indexed is price, which is comma separated, e.g., 12,034.00. How can I index it as a number? > > I am using DIH to pull the data. Thanks. > > -- > Noble Paul | Principal Engineer | AOL | http://aol.com
Re: Index Comma Separated numbers
I forgot to put formatStyle="number" on the field. It works now. Thanks!! --- On Fri, 6/5/09, Jianbin Dai wrote: > From: Jianbin Dai > Subject: Re: Index Comma Separated numbers > To: solr-user@lucene.apache.org, noble.p...@gmail.com > Date: Friday, June 5, 2009, 12:37 PM > Hi, yes, I put it in data-config.xml with transformer="NumberFormatTransformer", but it's not working on comma-separated numbers. Did I miss something?
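Putting the thread's resolution together, the working DIH entity looks roughly like this. The XPaths come from the thread; the price field name and its XPath are my placeholders:

```xml
<!-- data-config.xml sketch: NumberFormatTransformer only transforms fields
     that declare formatStyle, which is the attribute that was missing. -->
<entity name="item"
        dataSource="xmlreader"
        processor="XPathEntityProcessor"
        url="${f.fileAbsolutePath}"
        forEach="/abc/def/gh"
        transformer="NumberFormatTransformer">
  <!-- parses the locale-formatted "12,034.00" into the number 12034.00 -->
  <field column="price" xpath="/abc/def/gh/price" formatStyle="number"/>
</entity>
```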
Use DIH with large xml file
Hi, I have about 50GB of data to be indexed each day using DIH. Some of the files are as large as 6GB. I set the JVM -Xmx to 3GB, but DIH crashes on those big files. Is there any way to handle it? Thanks. JB
Re: Use DIH with large xml file
Can DIH read item by item instead of reading the whole file before indexing? My biggest file is 6GB, larger than the JVM max heap. --- On Sat, 6/20/09, Erik Hatcher wrote: > From: Erik Hatcher > Subject: Re: Use DIH with large xml file > To: solr-user@lucene.apache.org > Date: Saturday, June 20, 2009, 6:52 PM > How are you configuring DIH to read those files? It is likely that you'll need at least as much RAM in the JVM as the largest file you're processing, though that depends entirely on how the file is being processed. > > Erik
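For what it's worth, XPathEntityProcessor can stream records instead of buffering the whole file: setting stream="true" on the entity makes it emit one row per forEach match as it parses, keeping memory use roughly constant regardless of file size. A sketch under that assumption (entity name, XPaths, and fields are illustrative):

```xml
<!-- data-config.xml sketch: stream="true" tells XPathEntityProcessor to
     parse the XML incrementally, handing each matched row to the indexer
     as it is read rather than loading the 6GB file into the heap. -->
<entity name="doc"
        processor="XPathEntityProcessor"
        url="${f.fileAbsolutePath}"
        forEach="/feed/item"
        stream="true">
  <field column="id" xpath="/feed/item/id"/>
  <field column="title" xpath="/feed/item/title"/>
</entity>
```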
weighted search and index
Hi, I am trying to use solr for a content-match application. A content item is described by a set of keywords with weights attached, e.g., C1: fruit 0.8, apple 0.4, banana 0.2 C2: music 0.9, pop song 0.6, Britney Spears 0.4 Those contents would be indexed in solr. In the search, I also have a set of keywords with weights: Query: Sports 0.8, golf 0.5 I am trying to find the closest-matching contents for this query. My question is how to index the contents with weighted scores, and how to write the search query. I was trying to use boosting, but it doesn't seem to work right. Thanks. Jianbin
RE: weighted search and index
Thank you very much Erick! 1. I used boost in search, but I don't know exactly what the best way to boost is. For Sports 0.8, golf 0.5 in my example, would it be sports^0.8 AND golf^0.5? 2. I cannot use boost in indexing, because the weight is attached to the value, not the field. Look at this example again: C1: fruit 0.8, apple 0.4, banana 0.2 C2: music 0.9, pop song 0.6, Britney Spears 0.4 There is no good way to boost that during indexing. Thanks. JB -----Original Message----- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Wednesday, March 03, 2010 5:45 PM To: solr-user@lucene.apache.org Subject: Re: weighted search and index You have to provide some more details to get meaningful help. You say "I was trying to use boosting". How? At index time? Search time? Both? Can you provide some code snippets? What does your schema look like for the relevant field(s)? You say "but seems not working right". What does that mean? No hits? Hits not ordered as you expect? Have you tried putting "&debugQuery=on" on your URL and examining the return values? Have you looked at your index with the admin page and/or Luke to see if the data in the index is as you expect? As far as I know, boosts are multiplicative, so boosting by a value less than 1 will actually decrease the ranking. But see the Lucene scoring: http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html And remember that boosting will *tend* to move a hit up or down in the ranking, not position it absolutely. HTH Erick
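The query-time half of this can be sketched as a helper that turns the weighted keywords into a boosted Solr query string. Plain Java, no SolrJ dependency; the field name "keywords" is my assumption, and multi-word terms like "pop song" would additionally need phrase quoting:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BoostedQueryBuilder {
    // Turn weighted keywords into a query string with query-time boosts,
    // e.g. {sports=0.8, golf=0.5} -> keywords:sports^0.8 keywords:golf^0.5
    // Space-separated clauses are OR-ed by default, so lower-weight terms
    // still contribute to the score without being required.
    static String boostedQuery(String field, Map<String, Double> weights) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(field).append(':').append(e.getKey())
              .append('^').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the terms in insertion order.
        Map<String, Double> q = new LinkedHashMap<>();
        q.put("sports", 0.8);
        q.put("golf", 0.5);
        // prints keywords:sports^0.8 keywords:golf^0.5
        System.out.println(boostedQuery("keywords", q));
    }
}
```

Using AND instead of spaces would require every term to match, which defeats the "less important but still nice to have" semantics described above.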
RE: weighted search and index
Hi Erick, Each doc contains some keywords that are indexed, but each keyword is associated with a weight representing its importance. In my example, D1: fruit 0.8, apple 0.4, banana 0.2 The keyword fruit is the most important, which means I really, really want it to be matched in a search result, while banana is less important (it would still be good to match, though). Hope that explains it. Thanks. JB -----Original Message----- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Wednesday, March 03, 2010 6:23 PM To: solr-user@lucene.apache.org Subject: Re: weighted search and index Then I'm totally lost as to what you're trying to accomplish. Perhaps a higher-level statement of the problem would help. Because no matter how often I look at your point <2>, I don't see what relevance the numbers have if you're not using them to boost at index time. Why are they even there? Erick
RE: weighted search and index
Thanks! Will try it. -----Original Message----- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Thursday, March 04, 2010 5:59 AM To: solr-user@lucene.apache.org Subject: Re: weighted search and index OK, lights are finally dawning. I think what you want is payloads; see: http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/ for your index-time term boosting. Query-time boosting is as you indicated. HTH Erick
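For reference, the index-time half of the payload approach Erick points to typically starts with a field type like the one below, which stores a per-token weight taken from a delimiter in the input. The type and field names are illustrative; note also that actually *scoring* on payloads in Solr of this era still required a custom similarity or query parser on top of this schema:

```xml
<!-- schema.xml sketch: input arrives as "fruit|0.8 apple|0.4 banana|0.2";
     DelimitedPayloadTokenFilterFactory stores the float after '|' as each
     token's payload, available to payload-aware scoring at query time. -->
<fieldType name="payloads" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory"
            delimiter="|" encoder="float"/>
  </analyzer>
</fieldType>
<field name="keywords" type="payloads" indexed="true" stored="true"/>
```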