RE: custom scorer in Solr
I've been investigating this further and I might have found another path to consider. Would it be possible to create a custom implementation of a SortField, comparable to the RandomSortField, to tackle the problem?

I know it is not your standard question, but I would really appreciate all feedback and suggestions on this, because this is the issue that will make or break the acceptance of Solr for this client.

Thanks,
Tom

-----Original Message-----
From: Fornoville, Tom
Sent: Wednesday, 9 June 2010 15:35
To: solr-user@lucene.apache.org
Subject: custom scorer in Solr

Hi all,

We are currently working on a proof-of-concept for a client using Solr and have been able to configure all the features they want except the scoring.

Problem is that they want scores that make results fall in buckets:

* Bucket 1: exact match on category (score = 4)
* Bucket 2: exact match on name (score = 3)
* Bucket 3: partial match on category (score = 2)
* Bucket 4: partial match on name (score = 1)

First thing we did was develop a custom similarity class that would return the correct score depending on the field and an exact or partial match.

The only problem now is that when a document matches on both the category and name, the scores are added together. Example: searching for "restaurant" returns documents in the category restaurant that also have the word restaurant in their name, and thus get a score of 5 (4+1), but they should only get 4.

I assume for this to work we would need to develop a custom Scorer class, but we have no clue on how to incorporate this in Solr. Maybe there is even a simpler solution that we don't know about.

All suggestions welcome!

Thanks,
Tom
Re: custom scorer in Solr
First of all, do you expect every query to return results for all 4 buckets? In other words: say you make a SortField that sorts for score 4 first, then 3, 2, 1. When displaying the first 10 results, is it ok that these documents potentially all have score 4, and thus only bucket 1 is filled?

If so, I can think of the following out-of-the-box option (which I'm not sure performs well enough, but you can easily test it on your data).

Following your example, create 4 fields:
1. categoryExact - configure analyzers so that only full matches score
2. categoryPartial - configure so that full and partial matches score (likely you have already configured this)
3. nameExact - like 1
4. namePartial - like 2

Configure copyFields: 1 --> 2 and 3 --> 4. This way your indexing client can stay the same as it likely is at the moment. A schema sketch follows below.

Now you have 4 fields whose scores you have to combine at search time, so that the eventual scores are in [1,4]. Out of the box you can do this with function queries:

http://wiki.apache.org/solr/FunctionQuery

I don't have time to write it down exactly, but for each field:
- calc the score of each field (use the query function query, nr 16 in the wiki). If score > 0, use the map function to map it to respectively 4, 3, 2, 1. Now for each document you potentially have multiple scores, for instance 4 and 2 if your doc matches exact and partial on category.
- use the max function query to only return the highest score --> 4 in this case.

You have to find out for yourself if this performs, though.

Hope that helps,
Geert-Jan
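A minimal schema.xml sketch of the four fields and the copyFields described above (the text_exact and text_partial field types are placeholders for whatever exact- and partial-match analysis you configure):

    <field name="categoryExact"   type="text_exact"   indexed="true" stored="false"/>
    <field name="categoryPartial" type="text_partial" indexed="true" stored="false"/>
    <field name="nameExact"       type="text_exact"   indexed="true" stored="false"/>
    <field name="namePartial"     type="text_partial" indexed="true" stored="false"/>

    <copyField source="categoryExact" dest="categoryPartial"/>
    <copyField source="nameExact"     dest="namePartial"/>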
Re: custom scorer in Solr
Just to be clear, this is for the use-case in which it is ok that potentially only 1 bucket gets filled.
RE: custom scorer in Solr
Hello Geert-Jan,

This seems like a very promising idea; I will test it out later today.

It is not expected that we have results in all buckets; we have many use-cases where only 1 or 2 buckets are filled. It is also not a problem that the first 10 results (or 20 in our case) all fall in the same bucket.

I'll keep you updated on how this works out.
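For reference, here is roughly what the per-field step of that recipe looks like with the function-query syntax from the wiki (the query term and the 0.01/1000 bounds are arbitrary stand-ins for "any positive score"):

    q1=categoryExact:restaurant
    map(query($q1,0), 0.01, 1000, 4)

and likewise q2/q3/q4 for nameExact, categoryPartial and namePartial, mapped to 3, 2 and 1. Combining the four mapped values is the part to verify against your Solr version: the max() function documented on the wiki takes a function and a *constant*, so a nested max(a,b) over two function queries may not parse on 1.4, in which case the combination needs custom code (or the dismax approach suggested later in this thread).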
Re: diff logging for each solr-core?
After some more research, I found an even older thread on the list where it was discussed a little more, but still no separate logfiles:

http://search.lucidimagination.com/search/document/a5cdc596b2c76a7c/setting_a_log_file_per_core_with_slf4

Anyway, I will use this in my custom code to add a prefix for each line.

Regards, Alex

-- Alexander Rothenberg, Fotofinder GmbH
Re: diff logging for each solr-core?
Hi Alex,

as I understand the thread, you will have to change the Solr src then, right? The logPath is not available, or did I understand something wrong?

If you are okay with touching Solr, I would rather suggest repackaging the solr.war with a different logging configuration (so that the cores do not fall back to the Tomcat one).

Regards,
Peter.

--
http://karussell.wordpress.com/
Re: diff logging for each solr-core?
On Monday 14 June 2010 13:21:31 Peter Karich wrote:
> as I understand the thread you will have to change the solr src then,
> right? The logPath is not available or did I understand something wrong?

For me, I will only change my own custom code, not the original src from Solr. I had to write a custom dataSource plugin and a custom dataImportHandler to index over 300 DBs (the database design of our customers is special... :s) and badly need to see which log msg comes from which Solr core.

It will not be possible to really influence the logpath from inside the Solr core. The only way I know is altering the log4j.xml, for example along the lines sketched below: that would write all log msgs from the Java package "org.apache.solr.handler.dataimport" to /var/log/indexer_log (and my custom classes are in org.apache.solr.handler.dataimport).

-- Alexander Rothenberg, Fotofinder GmbH
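(The XML itself was stripped by the list archive; this is a guess at the kind of log4j.xml fragment meant, to sit inside the log4j:configuration root — the appender name and conversion pattern are made up:)

    <appender name="INDEXER" class="org.apache.log4j.FileAppender">
      <param name="File" value="/var/log/indexer_log"/>
      <layout class="org.apache.log4j.PatternLayout">
        <param name="ConversionPattern" value="%d %-5p [%c{1}] %m%n"/>
      </layout>
    </appender>

    <category name="org.apache.solr.handler.dataimport" additivity="false">
      <priority value="INFO"/>
      <appender-ref ref="INDEXER"/>
    </category>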
dataimporthandler and javascript transformer and default values
hi,

I have two questions:

1) How can I set a default value on an imported field if the field/column is missing from a SQL query?

2) I had a problem with the dataimporthandler. In one database column (WebDst) I have a string with comma/semicolon separated numbers, like

    100,200; 300;400, 500

There can be a space or not. I want to have a multivalued field in the end, like

    100
    200
    300
    400
    500

I thought that the javascript/script transformer could do the trick; I have a script like the one quoted in full in the reply below. In my entity definition I have transformer="RegexTransformer,script:dst2intern,TemplateTransformer" and then I have a field "intern".

I thought that this would work perfectly. It seems the split can only split on ";" when comparing a single char. The regex with

    webdst.split(/[,; ] */);

doesn't work. I have checked it in a simple HTML page; there the JavaScript split works with the regex. The solution which works for me is to first use a regex transformer on WebDst and then use a simple ";" split in the JavaScript.

I am using Solr 1.4, Java 1.6... Does anyone know or can tell me why the JavaScript split with a regex doesn't work?

Thank you,

markus
Re: dataimporthandler and javascript transformer and default values
hi,

check the Regex Transformer: http://wiki.apache.org/solr/DataImportHandler#RegexTransformer

umar

On Mon, Jun 14, 2010 at 5:44 PM, wrote:
> hi,
>
> i have two questions:
>
> 1) how can i set a default value on an imported field if the
> field/column is missing from a SQL query
> 2) i had a problem with the dataimporthandler. in one database column
> (WebDst) i have a string with comma/semicolon separated numbers, like
>
>    100,200; 300;400, 500
>
> there can be a space or not. i want to have a multivalued field in the
> end like
>
>    100
>    200
>    300
>    400
>    500
>
> i thought that the javascript/script-transformer could do the trick. i
> have a script like
>
>    function dst2intern(row) {
>        var webdst = '';
>        var count = 0;
>        webdst = row.get('WebDst');
>        var arr = new java.util.ArrayList();
>        if (webdst) {
>            // var dst = webdst.split(/[,; ] */);
>            var dst = webdst.split(';');
>            for (var i = 0; i < dst.length; i++) {
>                arr.add(dst[i]);
>                count++;
>            }
>            if (!count) {
>                arr.add('0');
>            }
>            row.put('intern', arr);
>        } else {
>            arr.add('0');
>            row.put('intern', arr);
>        }
>        return row;
>    }
>
> in my entity-definition i have
> transformer="RegexTransformer,script:dst2intern,TemplateTransformer"
> and then i have a field "intern".
>
> i thought that this would work perfect. it seems the split only can
> split on ";" when comparing a single char. the regex with
>
>    webdst.split(/[,; ] */);
>
> doesn't work. i have checked it in a simple html-page, there the
> javascript split works with the regex.
> the solution which works for me is to first use a regex transformer on
> WebDst and use a simple ";" split in the javascript.
>
> i am using solr 1.4, java 1.6...
>
> does anyone know or can tell me why the javascript split with a regex
> doesn't work?
>
> thank you
>
> markus
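Incidentally, the RegexTransformer alone can cover this case without the script -- a sketch, assuming "intern" is declared multivalued in the schema (splitBy takes a Java regex, so comma, semicolon and optional spaces fit in one pattern; the entity name and query are placeholders):

    <entity name="docs" transformer="RegexTransformer" query="...">
        <field column="intern" sourceColName="WebDst" splitBy="[,;] *"/>
    </entity>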
VelocityResponseWriter in Solr Core ?! configuration
Hello.

I want to use the VelocityResponseWriter. I did all the steps from this site: http://wiki.apache.org/solr/VelocityResponseWriter

I built a war file with "ant dist" and used it, but Solr cannot find the VelocityResponseWriter class:

java.lang.NoClassDefFoundError: org/apache/solr/response/QueryResponseWriter
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(Unknown Source)

I think I made a mistake when building the war file, because in my build folder after building, no velocity classes are in there. How can I build a solr.war with these classes?

thx =)
Re: VelocityResponseWriter in Solr Core ?! configuration
What version of Solr are you using? If you're using trunk, the VelocityResponseWriter is built into the example. If you're using previous versions, try specifying "solr.VelocityResponseWriter" as the class name; since it switched from the request to the response package, the "solr." shortcut will find it in either one.

The additional JAR files can go into your Solr home's lib subdirectory and don't need to be built into the WAR at all.

Erik
Re: VelocityResponseWriter in Solr Core ?! configuration
ah okay. I tried it with 1.4 and put the jars into lib of solr.home, but it won't work. I get the same error...

I use 2 cores, and my solr.home is ...path/cores. In this folder I put another folder with the name "lib" and put all these JARs into it:

apache-solr-velocity-1.4-dev.jar
velocity-1.6.1.jar
velocity-tools-2.0-beta3.jar
commons-beanutils-1.7.0.jar
commons-collections-3.2.1.jar
commons-lang-2.1.jar

and then in solrconfig.xml this line:

<queryResponseWriter name="velocity" class="org.apache.solr.response.VelocityResponseWriter"/>

Solr cannot find the jars =(
Using solr with woodstox 4.0.8
Hi all, we are using woodstox-4.0 and solr-1.4 in our project. As solr is using woodstox-3.2.7, there is a version clash. So I tried to check if solr would run with woodstox-4.0. I downloaded a clean solr-1.4.0 and replaced wstx-asl-3.2.7.jar with stax2-api-3.0.2.jar and woodstox-core-lgpl-4.0.8.jar in the lib directory. Then I called "ant clean test" and it succeeded with no failures. Am I missing something? Anything more to test? Cheers, Alex
Re: Solr and Nutch/Droids - to use or not to use?
Just wanted to push the topic a little bit, because these questions come up quite often and it's very interesting for me. Thank you!

- Mitch

MitchK wrote:
>
> Hello community, and a nice Saturday,
>
> from several discussions about Solr and Nutch, I got some questions for a
> virtual web-search-engine.
>
> The requirements:
> I. I need a scalable solution for a growing index that becomes larger than
> one machine can handle. If I add more hardware, I want to linearly improve
> the performance.
>
> II. I want to use technologies like the OPIC algorithm (default algorithm
> in Nutch) or PageRank or... whatever is out there to improve the ranking
> of the webpages.
>
> III. I want to be able to easily add more fields to my documents. Imagine
> one retrieves information from a webpage's content; then I want to make it
> searchable.
>
> IV. While fetching my data, I want to make special searches possible. For
> example, I want to retrieve pictures from a webpage and want to index
> picture-related content into another search index, plus I want to save a
> small thumbnail of the picture itself. Btw: this is (as far as I know) not
> possible with Solr, because Solr was not intended to do such special
> indexing logic.
>
> V. I want to use filter queries (i.e. main query "christopher lee" returns
> 1.5mio results, subquery "action" -> the main query would be a
> filter query and "action" would be the actual query. So a search within
> search results would be easily made available).
>
> VI. I want to be able to use different logic for different pages. Maybe I
> have a pool of 100 domains that I know better than others, and I have special
> scripts that retrieve more special information from those 100 domains. Then
> I want to apply my special logic to those 100 domains, but every other
> domain should use the default logic.
>
> -
>
> The project is only virtual. So why am I asking?
> I want to learn more about websearch and I would like to gain some new
> experience.
>
> What do I know about Solr + Nutch:
> As it is said on lucidimagination.com, Solr + Nutch does not scale if the
> index is too large.
> The article was a little bit older and I don't know whether this problem
> gets fixed with the new distributed abilities of Solr.
>
> Furthermore, I don't want to index the pages with Nutch and reindex them
> with Solr.
> The only exception would be: if the content of a webpage gets indexed by
> Nutch, I want to use the already tokenized content of the body with some
> Solr copyField operations to extend the search (i.e. making fuzzy search
> possible). At the moment, I don't think this is possible.
>
> I don't know much about the Droids project and how well it is documented.
> But from what I can read in some posts of Otis, it seems to be usable as a
> crawler framework.
>
> Pros for Nutch are: it is very scalable! Thanks to Hadoop and MapReduce it
> is a scaling monster (from what I've read).
>
> Cons: the search is not as rich as it is possible with Solr. Extending
> Nutch's search abilities *seems* to be more complicated than with Solr.
> Furthermore, if I want to use Solr to search Nutch's index, looking at my
> requirements I would need to reindex the whole thing - without the
> benefits of Hadoop.
>
> What I don't know at the moment is how it is possible to use algorithms
> like the ones mentioned in II. with Solr.
>
> I hope you understand the problem here - Solr *seems* to me as if it would
> not be the best solution for a web-search-engine, because of scaling
> reasons in indexing.
>
> Where should I dive deeper?
> Solr + Droids?
> Solr + Nutch?
> Nutch + howToExtendNutchToMakeSearchBetter?
>
> Thanks for the discussion!
> - Mitch
FW: Tika in Action
All, FYI, as SolrCell is built on top of Tika, some folks might be interested in this message I posted to the Tika lists. Thanks! Cheers, Chris -- Forwarded Message From: "Mattmann, Chris A (388J)" Reply-To: Date: Fri, 11 Jun 2010 19:07:24 -0700 To: Cc: Subject: Tika in Action Hi Folks, Just wanted to give you an FYI that the book that Jukka Zitting and I are writing on Tika titled "Tika in Action" is now available through Manning's Early Access Program [1]. Feedback, comments welcome. Thanks! Cheers, Chris [1] http://www.manning.com/mattmann/ ++ Chris Mattmann, Ph.D. Senior Computer Scientist NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-266B, Mailstop: 171-246 Email: chris.mattm...@jpl.nasa.gov WWW: http://sunset.usc.edu/~mattmann/ ++ Adjunct Assistant Professor, Computer Science Department University of Southern California, Los Angeles, CA 90089 USA ++ -- End of Forwarded Message
Re: Using solr with woodstox 4.0.8
Hi Alex! > Am I missing something? Anything more to test? > Are you using solrj too? If so, beware of: https://issues.apache.org/jira/browse/SOLR-1950 Regards, Peter.
Re: VelocityResponseWriter in Solr Core ?! configuration
With multicore, you either have to put the JARs in each core's lib/ directory, or use the multicore sharedLib feature to point to the proper lib directory; a solr.xml sketch follows below.

Again, I strongly recommend you use class="solr.VelocityResponseWriter".

Erik
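A sketch of the sharedLib variant (the core names are placeholders; the sharedLib path is resolved relative to solr.home):

    <solr persistent="true" sharedLib="lib">
      <cores adminPath="/admin/cores">
        <core name="core0" instanceDir="core0"/>
        <core name="core1" instanceDir="core1"/>
      </cores>
    </solr>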
Re: Using solr with woodstox 4.0.8
Hi Peter!

Yes, we do. Thanks for the hint!

Cheers, Alex

On 14.06.10 16:49, "Peter Karich" wrote:

> Are you using solrj too? If so, beware of:
> https://issues.apache.org/jira/browse/SOLR-1950
Need help on Solr Cell usage with specific Tika parser
Hi,

I use Solr Cell to send specific content files. I developed a dedicated parser for specific mime types. However, I cannot get Solr to accept my new mime types.

In solrconfig, in the update/extract requestHandler I specified <str name="tika.config">./tika-config.xml</str>, where tika-config.xml is in the conf directory (same as solrconfig).

In tika-config I added my mime types:

    <parser class="org.irisa.genouest.tools.readseq.ReadSeqParser">
        <mime>biosequence/document</mime>
        <mime>biosequence/embl</mime>
        <mime>biosequence/genbank</mime>
    </parser>

For the mime-types file declaration, I do not know whether the path to the tika mimetypes file should be absolute or relative... and even whether this file needs to be redefined if "magic" is not used.

When I run my update/extract, I have an error that "biosequence/document" does not match any known parser.

Thanks

Olivier
Re: Need help on Solr Cell usage with specific Tika parser
Hi Olivier,

Are you setting the mime type explicitly via the stream.type parameter?

-- Ken

Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g
Re: Need help on Solr Cell usage with specific Tika parser
Yeap, I do. As magic is not set, this is the reason why it looks for this specific mime type.

Unfortunately, it seems it either does not read my specific tika-config file or the mime-type file. But there is no error log concerning those files... (not trying to load them?)
Solr 1.4 and Nutch 1.0 Integration
I'm new to Solr, but I'm interested in setting it up to act like a Google Search Appliance to crawl and index my website. It's my understanding that Nutch provides the web crawling but needs to be integrated with Solr in order to get a Google Search Appliance type experience.

Two questions:

1. Is the scenario I'm outlining above possible?
2. If it is possible, where may I find documentation describing how to set up a Solr/Nutch instance?

Thanks for your help,

Dean Del Ponte
need help with multicore dataimport
Hi,

Does anyone know how to access the dataimport handler on a multicore setup?

This is my solr.xml:

    [...]

I've tried http://localhost:8080/solr/advisors/dataimport but that doesn't work. My solrconfig.xml for advisors looks like this:

    <requestHandler name="/advisor/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
        <str name="config">C:\solr\example\solr\advisors\conf\dih-advisors-jdbc.xml</str>
      </lst>
    </requestHandler>

Thanks,

Moazzam
Re: need help with multicore dataimport
The issue is your request handler path: <requestHandler name="/advisor/dataimport" ...> -- use name="/dataimport" instead. Implicitly, all access to a core is /solr/<corename>/ and all paths in solrconfig go after that (a combined sketch follows below).

Erik
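Putting it together (a sketch; the config path and port are taken from the original message):

    <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
        <str name="config">C:\solr\example\solr\advisors\conf\dih-advisors-jdbc.xml</str>
      </lst>
    </requestHandler>

which is then reachable per core, e.g.:

    http://localhost:8080/solr/advisors/dataimport?command=full-import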
Re: need help with multicore dataimport
Thanks! It worked.

- Moazzam
Re: AW: XSLT for JSON
: I only want the response format of StandardSearchHandler for the
: TermsComponent. how can i do this in a simple way ? :D

I still don't understand what you are asking... TermsComponent returns data about terms. The SearchHandler runs multiple components, and returns whatever data those components want to return.

If you are using TermsComponent in SearchHandler, you will get one type of data back in the terms section, and it will be in the "terms structure" (either as XML or as JSON, depending on the writer you use)... if you use some other components in your SearchHandler, they will return *different* data in the data structure that makes sense for that component, which will either be formatted as JSON or XML depending on the response writer you use.

But all of this seems orthogonal to the question you seem adamant about, which is translating the XML response (from some component) into the JSON structure your clients are expecting. In short: sure, you can probably use XSLT to generate JSON from the XML response -- if that's what you really want to do, then go right ahead and try it, but since I don't know anyone else who has ever done that, I can't offer you any specific tips or assistance.

-Hoss
Re: Questions about hsin and dist
I'm not very knowledgeable on spatial search, but...

: for example, if I were to use a filter query such as
:
: {!frange l=0 u=75}dist(2,latitude,longitude,44.0,73.0)
:
: I would expect it to return all results within 75 mi of the given
: latitude and longitude. however, the values being returned are far
: outside of that range:

Nothing in the wiki for the dist function suggests that the returned value is in miles -- it's notably devoid of any mention of units of measurement. I believe (but am not certain), based on skimming the JUnit test, that it's returning a number between 0 and 1 (as noted in the docs, it's finding the distance between two *vectors*).

: {!frange}hsin(1,44.0,73.0,latitude,longitude,true)
:
: expecting that it would return a filtered set of queries in a radius
: of 1 mi within 44.0 lat and 73.0 long, where true tells the hsin
: function to convert to radians. However, whether or not the filter is

That doesn't match my reading of the docs at all -- as I understand it, the "radius" argument to the hsin function is the radius of the sphere, in whatever units you want, and then it computes the distance between two points on that sphere using the same units. So if you want to filter to only points within 1 mile of some specific point (where all points are specified in degrees), you would use something like...

fq={!frange l=0 u=1}hsin(XXX,44.0,73.0,latitude,longitude,true)

...where XXX is the radius of the earth in miles (I didn't bother to look it up)

-Hoss
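(For reference, the earth's mean radius is roughly 3959 miles, so the filled-in filter would look something like the line below -- though note the caveat in the next reply about the state of the spatial functions at the time:)

    fq={!frange l=0 u=1}hsin(3959, 44.0, 73.0, latitude, longitude, true)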
Re: Questions about hsin and dist
On Mon, Jun 14, 2010 at 3:35 PM, Chris Hostetter wrote: > fq={!frange l=0 u=1}hsin(XXX,44.0,73.0,latitude,longitude,true) > > ...where XXX is the radius of hte earth in miles (i didn't bother to look > it up) That's what the docs say, but it doesn't really work in my experience. IMO, the spatial stuff is still in development and not ready for public consumption. -Yonik http://www.lucidimagination.com
Re: Solr Architecture discussion
: B- A backup of the current index would be created
: C- Re-Indexing will happen on Master-core2
: D- When Indexing is done, we'll trigger a swap between Master-core1 and
: core2
...
: But how can I do B, C, and D? I'll do it manually. Wait! I'm not sure my boss will
: pay for that.
: 1/ Can I leverage some Solr mechanisms (that is, by configuration only) in
: order to reach that goal?
: I haven't found how to do it!

Your best bet is some external scheduler -- depending on how your build process works, you can fairly easily integrate it into external publishing tools.

: 2/ Is there any issue while replicating master "swapped" index files? I've
: seen in the literature that there might be some issues.

As long as the "new" version of the index is truly "newer" than the old version, there shouldn't be any problem.

Frankly though: I'm not sure you need core swapping on the master either -- it depends largely on how much "churn" will happen each time you do one of these full rebuilds. You could just as easily do incremental reindexing on your master, with occasional commits (or even autocommits), and your slaves picking up those new segments -- either gradually, or all at once when you do a monolithic commit.

If you're ok with the slaves pulling over the *entire* index after you do the core swap, then you should be fine with the slaves pulling over the *entire* index (or maybe just most of it) after a rebuild directly to the existing core. All you really need to do explicitly on the master is trigger a backup just before you rebuild the world, and if (and only if) something goes terribly wrong, then restore from your backup.

-Hoss
Re: Default filter in solr config (+filter document by now for near time index feeling)
: 10 minutes. Sure, but the idea now is to index all documents with an index
: date, set this index date 10 min in the future, and create a filter
: "INDEX_DATE:[* TO NOW]".
:
: Question 1: is it possible to set this as part of solr-config, so every
: implementation against the server will regard this?

Yes -- see the sketch below.

: Question 2: From a caching point of view this sounds a little ugly, is it? Anybody tried this?

It is very ugly, and I don't recommend it if you even remotely care about caching -- at a minimum you should do something like "INDEX_DATE:[* TO NOW/MINUTE+1MINUTE]" so you at least get reusable queries for 1 minute at a time.

-Hoss
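One way to wire that into the config for every request is an appended filter in the handler declaration -- a sketch, assuming the standard search handler:

    <requestHandler name="standard" class="solr.SearchHandler" default="true">
      <lst name="appends">
        <str name="fq">INDEX_DATE:[* TO NOW/MINUTE+1MINUTE]</str>
      </lst>
    </requestHandler>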
Re: how to use "q=string" in solrconfig.xml `?
: this is my request to solr, and i cannot change this:
: http://host/solr/select/?q=string
:
: i cannot change this =( so i have a new termsComponent. i want to use
: q=string as default for terms.prefix=string.
:
: can i do something like this: ?
:
: true
: suggest
: index
: ${???}

In general: no. For things that are QParsers, there is a "local params" feature that can be used -- but terms.prefix isn't parsed as a query, so it doesn't work that way.

Your best bet is to add a server-side rule (using something like mod_rewrite) that adds a terms.prefix param using the value of the q param -- sketched below.

-Hoss
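A sketch of such a rule for Apache httpd in front of the servlet container (illustrative and untested; the right flags depend on how requests reach the container):

    RewriteEngine On
    RewriteCond %{QUERY_STRING} (?:^|&)q=([^&]+)
    RewriteRule ^/solr/select/?$ /solr/select?terms.prefix=%1 [QSA,PT]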
Re: SolrException: No such core
: Here are the wrappers to use ...solrj.SolrServer
: [code]
: public class SolrCoreServer
: {
:    private static Logger log = LoggerFactory.getLogger(SolrCoreServer.class);
:
:    private SolrServer server=null;
:
:    public SolrCoreServer(CoreContainer container, String coreName)
:    {
:      server = new EmbeddedSolrServer( container, coreName );
:    }

Showing the code for your SolrCoreServer isn't any use if you don't show us how you construct instances of it... how are you initializing that CoreContainer?

In general, you've provided a lot of info, but you haven't answered most of my very specific questions...

: * what does your code for initializing solr look like?

...need the details on the CoreContainer (and what coreName you are passing) for that to be of any use.

: * what does your solr home dir look like (ie: what files are in it)

...you showed us the files, but not the directory structure.

: * what is the full stack trace of these exceptions, and what does your
: code look like around the lines where these stack traces indicate your
: code is interacting with solr?

...no mention whatsoever in your response.

-Hoss
Re: Some basics
: - I want my search to "auto" spell check - that is, if someone types
: "restarant" I'd like the system to automatically search for restaurant.
: I've seen the SpellCheckComponent but that doesn't seem to have a simple way
: to automatically do the "near" type comparison. Is the SpellCheckComponent
: the wrong one or do I just need to manually handle the situation in my
: client code?

At the moment you need to handle this in your client -- if you get no results back (or too few results based on some expectation you have) but the spellcheck component returned a suggestion, then trigger a subsequent search using that suggestion.

: - Also, what is the proper analyzer if I want a search for "thai
: food" or "thai restaurant" to actually match on Thai? I can't totally
: ignore words like food and restaurant but I want to ignore more general
: terms and look for specific ones first (or I should say score them higher).

The issue isn't so much your analyzer as how you structure your query -- I would suggest using the dismax query parser with a very low value for the 'mm' param (ie: '1', or something like '10%' if you expect a lot of queries with many, many words) and a useful "pf" param -- that way two-word queries will return matches for either word, but docs that match both words will score higher, and docs that match the full phrase will score the highest.

-Hoss
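For illustration, dismax params along those lines (the field names and boosts are made up; tune to taste):

    q=thai restaurant
    defType=dismax
    mm=1
    qf=name^2 description
    pf=name^4 description^2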
Re: Indexing stops after exception
: on one of the PDF documents and this causes indexing to stop (the
: TikaEntityProcessor throws a Severe exception). Is it possible to ignore
: this exception and continue indexing by some kind of solr configuration?

I'm not really a power user of DIH, but have you tried adjusting the value of the 'onError' param?

: TikaEntityProcessor to return null in this case. BTW shouldn't the
: inputstream close be in a finally block?

Almost certainly -- can you please open a Jira issue and either attach a patch with your suggested "finally" changes or just cite the files/lines you think look suspicious.

-Hoss
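(onError is set per entity in the DIH config; the entity name and other attributes here are made up for illustration:)

    <entity name="docs" processor="TikaEntityProcessor"
            url="${files.fileAbsolutePath}" format="text"
            onError="skip">
        ...
    </entity>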
Re: general debugging techniques?
: > if you are only seeing one log line per request, then you are just looking
: > at the "request" log ... there should be more logs with messages from all
: > over the code base with various levels of severity -- and using standard
: > java log level controls you can turn these up/down for various components.
:
: Unfortunately, I'm not very familiar with java deploys so I don't know
: where the standard controls are yet. As a concrete example, I do see
: INFO level logs, but haven't found a way to move up to DEBUG level in
: either solr or tomcat. I was hopeful debug statements would point to
: where extraction/indexing hangs were occurring. I will keep poking
: around, thanks for the tips.

Hmm... it sounds like maybe you haven't seen this wiki page...

http://wiki.apache.org/solr/SolrLogging

...as mentioned there, for quick debugging, there is an admin page to adjust the log levels on the fly...

http://localhost:8983/solr/admin/logging.jsp

...but for more long-term changes to the logging configuration, it depends greatly on whether your servlet container customizes the Java LogManager. There are links there to general info about Java logging, and about tweaking this in the example Jetty setup.

-Hoss
Re: Master master?
: Does Solr handle having two masters that are also slaves to each other (ie
: in a cycle)?

no.

-Hoss
Re: Help with Shingled queries
: the queryparser first splits on whitespace.

FWIW: Robert is referring to the LuceneQParser, and it also applies to the DismaxQParser... whitespace is considered markup in those parsers unless it's escaped or quoted. The FieldQParser may make more sense for your use case -- or you may need a custom QParser (hard to tell).

To answer your specific question...

: > the debug output, for example with the term "short red evil fox" I would
: > expect to see the shingles
: > 'short_red' 'red_evil' 'evil_fox'
: >
: > but instead I get the following
: >
: > "debug":{
: >   "rawquerystring":"short red evil fox",
: >   "querystring":"short red evil fox",
: >   "parsedquery":"+() ()",
: >   "parsedquery_toString":"+() ()",
: >   "explain":{},
: >   "QParser":"DisMaxQParser",

...you are using the DisMaxQParser, but evidently you haven't configured the qf or pf params, so you are getting a query that is completely empty.

-Hoss
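For completeness, a field-parser example of the kind meant -- assuming a "shingles" field whose analyzer includes a ShingleFilter, the whole string goes through the field's analyzer as one value, so the shingles survive intact:

    q={!field f=shingles}short red evil fox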
Re: Problem in solr reponse time when Query size is big
You'll have to give us some specific details of what your code/queries look like, and the exact error messages you are getting back, if you expect anyone to be able to come up with a meaningful guess as to what might be going wrong for you.

Off the top of my head, there is no reason I can think of why a "large" query would cause something that might be called an "HTTP Version not supported" error, unless there was a bug in your servlet container, or a bug in your client code, or both.

: Hi All,
:
: I have configured Apache Solr 1.4 with JBoss 5.1.0GA and it works fine when I
: send some small query strings, but my requirement is different and I have to
: build the query string on the fly, pass it to Solr and execute it to get a response.
: It's working fine with a small query, but when passing a big query
: nothing is returned on the page and in the JBoss console I get the message "HTTP
: Version not supported". Can anyone help me find where I am wrong? If there is any other way
: to overcome this problem then please reply to me.
:
: Thanks & Regards,
: Dhirendra

-Hoss
Re: custom scorer in Solr
: Problem is that they want scores that make results fall in buckets:
:
: * Bucket 1: exact match on category (score = 4)
: * Bucket 2: exact match on name (score = 3)
: * Bucket 3: partial match on category (score = 2)
: * Bucket 4: partial match on name (score = 1)
...
: First thing we did was develop a custom similarity class that would
: return the correct score depending on the field and an exact or partial
: match.
...
: The only problem now is that when a document matches on both the
: category and name the scores are added together.

What QParser are you using? What does the resulting Query data structure look like?

I think with your custom Similarity class you might be able to achieve your goal using the DisMaxQParser w/o any other custom code -- just set your "qf=category name" (I'm assuming your Similarity already handles the relative weighting) and set "tie=0" ... that will ensure that the final score only comes from the "Max" scoring field (ie: no tie-breaking values from the other fields).

If that doesn't do what you want -- then your best bet is probably to write a custom QParser that generates *exactly* the query structure you want (likely using a DisjunctionMaxQuery); that will give you the scores you want in conjunction with your similarity class.

-Hoss
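Illustratively, with the field names from earlier in the thread (the params are standard dismax; the rest is a sketch):

    q=restaurant
    defType=dismax
    qf=category name
    tie=0.0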
Re: Request log does not show QTime
: How do you customize the RequestLog to include the query time, hits, and the

The "RequestLog" is a Jetty-specific log file -- it's only going to know about the concepts that Jetty specifically knows about.

: Note, I do see this information in log.solr.0, but it also includes the full
: query parameters which are too verbose, so I need to turn that logging off.
: Jun 10, 2010 1:35:03 PM org.apache.solr.core.SolrCore execute
: INFO: [] webapp=/solr path=/select/ params={...} hits=4587 status=0 QTime=19

That's the format Solr uses for logging individual requests. If you want to change it, you can either write a custom LogHandler or a custom LogFormatter, or you can post-process...

http://java.sun.com/j2se/1.5.0/docs/guide/logging/overview.html

-Hoss
Re: AW: how to get multicore to work?
: As it stands, solr works fine, and sites like
: http://locahost:8983/solr/admin also work.
:
: As soon as I put a solr.xml in the solr directory, and restart the tomcat
: service, it all stops working.

You need to elaborate on "It all stops working" ... what does that mean? What are you trying to do? And what errors are you getting?

When I take an existing (functional) Solr 1.4 SolrHome dir, and drop that solr.xml file into it, everything works as expected for me:

1. Solr starts up
2. This URL lists a link to the admin page for a single core named "core0"...
   http://localhost:8983/solr/
3. This URL lets me use core0...
   http://localhost:8983/solr/core0/admin/
4. This URL (specified in your solr.xml) lets me admin the cores (ie: view-status/add/remove/reload)...
   http://localhost:8983/solr/admin/cores

-Hoss
Re: Copyfield multi valued to single value
: Is there a way to copy a multivalued field to a single value by taking
: for example the first index of the multivalued field?

Unfortunately no. This would either need to be done with an UpdateProcessor, or on the client constructing the doc (either the remote client, or in your DIH config if that's how you are using Tika).

-Hoss
Re: Need help on Solr Cell usage with specific Tika parser
: In solrconfig, in update/extract requesthandler I specified <str
: name="tika.config">./tika-config.xml</str>, where tika-config.xml is in
: conf directory (same as solrconfig).

Can you show us the full requestHandler declaration? ... tika.config needs to be a direct child of the requestHandler (not in the defaults).

I also don't know if using a "local" path like that will work -- it depends on how that file is loaded (if Solr loads it, then you might want to remove the "./"; if Solr just gives the path to Tika, then you probably need an absolute path).

-Hoss
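In other words, a declaration shaped like this (a sketch; the absolute path and the fmap default are illustrative):

    <requestHandler name="/update/extract"
                    class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
      <str name="tika.config">/absolute/path/to/conf/tika-config.xml</str>
      <lst name="defaults">
        <str name="fmap.content">text</str>
      </lst>
    </requestHandler>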
Re: Custom faceting question
: I believe I'll need to write some custom code to accomplish what I want
: (efficiently, that is) but I'm unsure of what would be the best route to
: take. Will this require a custom request handler? Search component?

You'll need a customized version of the FacetComponent if you want to do this all on the server side.

: We have a similar category structure whereas we have top-level categories
: and then sub-categories. I want to be able to perform a search and then only
: return the top 3 top-level categories with their sub-categories also
: faceted. The problem is I don't know what those top 3 top-level categories
: are until after I search.

The main complexity with a situation like this is how you model it. Regardless of whether you do it server side or client side, the straightforward approach is to do basic faceting on a "top level" category field, and then, given the top three responses, do secondary faceting on a field that contains the full category "breadcrumb" -- either using something like facet.prefix, or by walking some other in-memory data structure representing your category graph that lets you access the children of a particular category (depends on whether you need complex rules to identify what documents are in a category).

: Second way. Have the client send multiple requests on the backend. First to
: determine the top 3 categories, then another for all the subcategories. This
: involves more client side coding and I would prefer not to perform 2x the
: requests. If at all possible I would like to do this on the Solr side.

...you've already got the conceptual model of how to do it; all you need now is to implement it as a Component that does the secondary faceting in the same request (which should definitely be more efficient since you can reuse the DocSets) instead of issuing secondary requests from your client.

-Hoss
Re: Indexing Problem with SOLR multicore
I can't think of any way this could happen -- can you provide some more details on what exactly you are doing, and what you are doing to observe the problem? In particular:

* what do each of your DIH config files look like?
* what URLs are you using to trigger DIH imports?
* how are you checking your document counts?
* what URLs are you querying to see the results?
  - what results do you get from these URLs before you stop/start the server that look correct?
  - what results do you get after the stop/start that are incorrect?

: Hi,
: I am using SOLR with Tomcat server. I have configured two
: cores (multicore) inside the SOLR home directory. The solr.xml file looks like
: [...]
: I am also using DIH to upload the data in these two cores separately &
: the document count in the two cores is different. However, whenever I restart
: the tomcat server the document counts in the two cores show the same. Also,
: both cores exist, but whenever I try to search the data in one core it
: returns me data from a different core.
:
: E.g. If I try to search the data in the MyTestCore1 core then solr returns the
: result from the MyTestCore2 core (this is a problem) & if I try to search the
: data in the MyTestCore2 core then solr returns the data from the MyTestCore2 core
: (which is fine) OR sometimes vice versa happens...
:
: Now if I reindex the data in the MyTestCore1 core using "Full data-import with
: cleanup" then the problem gets sorted out, but it comes back again if I restart my tomcat
: server.
:
: Is there any issue with my core configuration? Please help
:
: Thanks,
: Siddharth

-Hoss
Re: Custom faceting question
: ...you've already got the conceptual model of how to do it; all you need
: now is to implement it as a Component that does the secondary faceting in
: the same request (which should definitely be more efficient since you can
: reuse the DocSets) instead of issuing secondary requests from your client

Couldn't I just create a custom search handler to do this, so that all the logic resides on the server side? I'm guessing I would need to subclass SearchHandler and override handleRequestBody.
Re: Multiple location filters per search
: I am currently working with the following:
:
: {code}
: {!frange l=0 u=1 unit=mi}dist(2,32.6126, -86.3950, latitude, longitude)
: {/code}
...
: {code}
: {!frange l=0 u=1 unit=mi}dist(2,32.6126, -86.3950, latitude,
: longitude) OR {!frange l=0 u=1 unit=mi}dist(2,44.1457, -73.8152,
: latitude, longitude)
: {/code}
...
: I get an error. Hoping someone has an idea of how to work with
: multiple locations in a single search.

I think you are confused about how that query is getting parsed... when Solr sees the "{!frange" at the beginning of the param, that tells it that the *entire* param value should be parsed by the frange parser. The frange parser doesn't know anything about keywords like "OR".

What you probably want is to utilize the "_query_" hack of the LuceneQParser, so that you can parse some "Lucene" syntax (ie: A OR B) where the clauses are then generated by using another parser...

http://wiki.apache.org/solr/SolrQuerySyntax

fq=_query_:"{!frange l=0 u=1 unit=mi}dist(2,32.6126, -86.3950, latitude, longitude)" OR _query_:"{!frange l=0 u=1 unit=mi}dist(2,44.1457, -73.8152, latitude, longitude)"

...or a little more readable...

fq=_query_:"{!frange l=0 u=1 unit=mi v=$qa}" OR _query_:"{!frange l=0 u=1 unit=mi v=$qb}"
qa=dist(2,32.6126, -86.3950, latitude, longitude)
qb=dist(2,44.1457, -73.8152, latitude, longitude)

-Hoss
Re: Custom faceting question
: : ...you've already got the conceptual model of how to do it; all you need
: : now is to implement it as a Component that does the secondary faceting in
: : the same request (which should definitely be more efficient since you can
: : reuse the DocSets) instead of issuing secondary requests from your client
:
: Couldn't I just create a custom search handler to do this so all the
: logic resides on the server side? I'm guessing I would need to subclass
: SearchHandler and override handleRequestBody.

I think you're misunderstanding me -- I'm agreeing with you that you can do it on the server side, and that it will make sense to do it on the server side -- I'm saying that instead of implementing a SearchHandler, you should just implement a SearchComponent that you would use in place of (or in addition to) FacetComponent...

http://wiki.apache.org/solr/SearchComponent

-Hoss
CFP for Surge Scalability Conference 2010
We're excited to announce Surge, the Scalability and Performance Conference, to be held in Baltimore on Sept 30 and Oct 1, 2010. The event focuses on case studies that demonstrate successes (and failures) in Web applications and Internet architectures. Our Keynote speakers include John Allspaw and Theo Schlossnagle. We are currently accepting submissions for the Call For Papers through July 9th. You can find more information, including our current list of speakers, online: http://omniti.com/surge/2010 If you've been to Velocity, or wanted to but couldn't afford it, then Surge is just what you've been waiting for. For more information, including CFP, sponsorship of the event, or participating as an exhibitor, please contact us at su...@omniti.com. Thanks, -- Jason Dixon OmniTI Computer Consulting, Inc. jdi...@omniti.com 443.325.1357 x.241
Re: Indexing Problem with SOLR multicore
Hi Chris,

Thank you so much for the help & reply to my query. However, my problem got resolved. There was a configuration problem in my solrconfig.xml file: the <dataDir> tag was not configured properly, which is why both cores were pointing to the same directory for indexing.

Regards,
Siddharth
Spellchecker index cannot be optimized
Hello,

when I rebuild the spellchecker index (by optimizing the data index or by calling cmd=rebuild), the spellchecker index is not optimized. I cannot even delete the old index files on the filesystem, because they are locked by the Solr server. I have to stop the Solr server (Resin) to optimize the spellchecker index with Luke or by deleting the old files. How can I optimize the index without stopping the Solr server?

Thanks
Lutz Pumpenmeier