Re: WordDelimiterFilter and the dot character
Hi, I had a very similar problem while searching in a bibliographic field called "signatur". I could solve it with the help of additional filter classes. At the moment I use the following filters, and with them it works for me: ... ... I added the MappingCharFilterFactory in order to get better support for German umlauts. Concerning the wildcards: it is important that you use the ReversedWildcardFilterFactory at index time only. All the other filters I also use at query time. Perhaps this helps. Dirk
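For reference, a minimal sketch of the kind of field type described above — Dirk's actual XML did not survive the archive, so the field-type name, mapping file and tokenizer here are assumptions:

<fieldType name="signatur_text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- map German umlauts, e.g. "ä" => "ae", via a mapping file -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- index time only: additionally index reversed tokens so leading-wildcard queries are fast -->
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Note that the ReversedWildcardFilterFactory appears only in the index analyzer, as the post stresses; the query analyzer mirrors everything else.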
Re: how to boost query term after tokenizer
Hi, I think that's not the way you can do it, because you cannot give your analyzer a hint at runtime about which text fragment is more relevant than another. There is no marker, so a filter cannot know which terms to boost. You could write your own filter and let it read a file of important terms in order to compare each one with your query terms, but I don't think that would be a good approach. If you have a way to split the search query text into its relevant terms, the first step is done. That is a possible way to do analysis at query time in order to search with the right terms. On the indexing side, you can try to pre-process your data and save the most important keywords in separate search fields. Then you boost those fields at query time. Hope I could help, Dirk
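A minimal sketch of the separate-field boosting idea (the field names and boost factors are invented for illustration): with the dismax/edismax parser you can weight the pre-processed keyword field much higher than the plain full-text field, e.g.

q=some query&defType=edismax&qf=important_keywords^5.0 fulltext^1.0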
Re: Building an enterprise quality search engine using Apache Solr
Hi, your question is not easy to answer. It depends on so many things that there is no standard way to realize an enterprise solution, and time planning depends on just as many things. I can try to give you some brief notes about our solution, but there are some differences in target group and data sources.

I am technically responsible for the system disco (a research and discovery system) at the university library of Münster. (Excuse me, I don't want to make a promotion tour here, I earn no money with such activities :-)). OK, in this search engine, based on Lucene, we search about 200 million articles, books, journals and so on. So we have data sources that differ in structure and also in the way of delivery. At the beginning we thought: let's buy a solution in order to avoid development work of our own, more or less. So we bought a commercial search engine, which works on a Lucene core with a proprietary business logic to talk to the Lucene core. So far so good - or not so good. At that time I was the only person working on this project, and I needed nearly one and a half years of full-time work to fulfil most features and requirements. And the reason for that long time is not that I had no experience (I hope so). I have worked in this area for nearly 15 years in different companies, always as a J2EE developer. (That's rare today, because every experienced developer wants to work as a "leader" or manager - that sounds better, and fewer project leaders are outsourced. OK, other topic.) And other universities (customers) who realized a comparable search engine in that environment took as long or longer. So hopefully it was not just me.

In Germany we say "der Teufel steckt im Detail" (literally: the devil is hidden in the detail), which means you start working and, in parallel to that process, the requirements usually change - sadly, in most cases after development has laid down the software basis. For example, we needed a lot of time for fine-tuning the ranking and for realizing a completely automatic mechanism to update the data sources. And it was one thing to realize the search in development and run a first developer test; it is a completely different thing to make the system fit for 24/7 service and run a production system without problems. We spent most of our time on data pre-processing because of the "garbage in, garbage out" problem. Work on the quality of data is expensive, but you get no appreciation for it, because everybody cares about search features. This requirement showed us that it is mostly impossible to avoid own development completely.

The next thing is the user interface: not every feature a customer knows from good old database-backed systems is easy to realize in a search engine, because of its more or less flat data structure. So we had to develop one service after the other in order to read additional information - in our case, for example, runtime holdings information from our library.

Summarized: if you want to estimate a concrete time frame for realizing a complete, production-ready enterprise search solution, you should talk to some people with similar solutions, think about your own requirements in detail, and then multiply your estimate by 2. Then perhaps you have a realistic estimate. Dirk
Re: "diversity" of search results?
Hi Paul, yes, that's a typical problem when configuring a search engine. A solution depends on your data. Sometimes you can overcome this problem by fine-tuning your search engine at the boosting level. That's not easy and is always based on trial-and-error tests. Another thing you can do is try to realize a data pre-processing step that compensates for the reasons for similar content in certain fields, e.g. in a title field. For example, if you have products with very similar titles and you boost such a field, the result is that you will always find all of those documents in the result list. But if you go on and add some information (perhaps from other search fields) to this title field, you can perhaps reduce the similarity. (A typical example in my branch: book titles in different volumes - then I add the volume number and the year to the title field.) Perhaps it is also necessary to add a pre-processing deduplication step. Here you can find an entry point: http://wiki.apache.org/solr/Deduplication Dirk
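The entry point on that wiki page boils down to an update processor chain along these lines (the signature field and the source fields here are placeholders to adapt to your schema):

<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <!-- the computed signature is stored here and used to detect duplicates -->
    <str name="signatureField">signatureField</str>
    <!-- true: newly indexed duplicates replace existing ones -->
    <bool name="overwriteDupes">true</bool>
    <!-- fields that contribute to the signature -->
    <str name="fields">title,author,year</str>
    <str name="signatureClass">solr.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>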
Extended Dismax Query Parser with AND as default operator
Hello, I have a question about the extended dismax query parser. If the default operator is changed to AND (q.op=AND), then the search results seem to be incorrect. I will explain it with some examples. For this test I use Solr v5.1 and the tika core from the example directory.

== Preparation ==
Add the following lines to the schema.xml file: id
Change the field "text" to stored="true".
Remove the multiValued attribute from the title and text field (we don't need multivalued fields in our test).
Add test data (use curl or fiddler):
Url: http://localhost:8983/solr/tika/update/json?commit=true
Header: Content-type: application/json
[
 {"id":"1", "title":"green", "author":"Jon", "text":"blue"},
 {"id":"2", "title":"green", "author":"Jon Jessie", "text":"red"},
 {"id":"3", "title":"yellow", "author":"Jessie", "text":"blue"},
 {"id":"4", "title":"green", "author":"Jessie", "text":"blue"},
 {"id":"5", "title":"blue", "author":"Jon", "text":"yellow"},
 {"id":"6", "title":"red", "author":"Jon", "text":"green"}
]

== Test ==
The following parameters are always set:
default operator is AND: q.op=AND
use the extended dismax query parser: defType=edismax
set the default query fields to title and text: qf=title text
sort: id asc

=== #1 test ===
q=red green
response:
{ "numFound":2,"start":0, "docs":[
  {"id":"2","title":"green","author":"Jon Jessie","text":"red"},
  {"id":"6","title":"red","author":"Jon","text":"green"}] }
parsedquery_toString: "+(((text:green | title:green) (text:red | title:red))~2)"
This test works as expected.

=== #2 test ===
We use a group: q=(red green)
Same response as in test one.
parsedquery_toString: "+(((text:green | title:green) (text:red | title:red))~2)"
This test works as expected.

=== #3 test ===
q=green red author:Jessie
response:
{ "numFound":1,"start":0, "docs":[{"id":"2","title":"green","author":"Jon Jessie","text":"red"}] }
parsedquery_toString: "+(((text:green | title:green) (text:red | title:red) author:jessie)~3)"
This test works as expected.

=== #4 test ===
q=(green red) author:Jessie
response:
{ "numFound":2,"start":0, "docs":[
  {"id":"2","title":"green","author":"Jon Jessie","text":"red"},
  {"id":"4","title":"green","author":"Jessie","text":"blue"}] }
parsedquery_toString: "+((((text:green | title:green) (text:red | title:red)) author:jessie)~2)"
The same result as in the 3rd test was expected. Why is no AND used for the query group?

=== #5 test ===
q=(+green +red) author:Jessie
response:
{ "numFound":4,"start":0, "docs":[
  {"id":"2","title":"green","author":"Jon Jessie","text":"red"},
  {"id":"3","title":"yellow","author":"Jessie","text":"blue"},
  {"id":"4","title":"green","author":"Jessie","text":"blue"},
  {"id":"6","title":"red","author":"Jon","text":"green"}] }
parsedquery_toString: "+((+(text:green | title:green) +(text:red | title:red)) author:jessie)"
Now AND is used for the group, but the author clause is concatenated with OR. Why?

=== #6 test ===
q=(+green +red) +author:Jessie
response:
{ "numFound":3,"start":0, "docs":[
  {"id":"2","title":"green","author":"Jon Jessie","text":"red"},
  {"id":"3","title":"yellow","author":"Jessie","text":"blue"},
  {"id":"4","title":"green","author":"Jessie","text":"blue"}] }
parsedquery_toString: "+((+(text:green | title:green) +(text:red | title:red)) +author:jessie)"
Still not the expected result.

=== #7 test ===
q=+(+green +red) +author:Jessie
response:
{ "numFound":1,"start":0, "docs":[{"id":"2","title":"green","author":"Jon Jessie","text":"red"}] }
parsedquery_toString: "+(+(+(text:green | title:green) +(text:red | title:red)) +author:jessie)"
Now the result is OK.
But if all operators must be given explicitly, then q.op=AND is useless.

=== #8 test ===
q=green author:(Jon Jessie)
Four results are found; expected is one. The query must be changed to '+green +author:(+Jon +Jessie)' to get the expected result.

Is this a bug in the extended dismax parser, or what is the reason for not consistently applying q.op=AND to the query expression?

Kind regards
Dirk Buchhorn
EarlyTerminatingCollectorException
Our production Solr slave cores (we have about 40 cores, each of a moderate size between 10K and 90K documents) produce many exceptions of this type:

2014-11-05 15:06:06.247 [searcherExecutor-158-thread-1] ERROR org.apache.solr.search.SolrCache: Error during auto-warming of key:org.apache.solr.search.QueryResultKey@62340b01 :org.apache.solr.search.EarlyTerminatingCollectorException

Our relevant solrconfig is 18 2

What exactly does the exception mean? Thank you!

-- Dirk --
Re: EarlyTerminatingCollectorException
https://issues.apache.org/jira/browse/SOLR-6710

2014-11-05 21:56 GMT+01:00 Mikhail Khludnev:
> I'm wondering too, but it seems it warms up the queryResultCache:
> https://github.com/apache/lucene-solr/blob/20f9303f5e2378e2238a5381291414881ddb8172/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L522
> At least these ERRORs break nothing, see:
> https://github.com/apache/lucene-solr/blob/20f9303f5e2378e2238a5381291414881ddb8172/solr/core/src/java/org/apache/solr/search/FastLRUCache.java#L165
>
> Anyway, here are two usability issues:
> - key:org.apache.solr.search.QueryResultKey@62340b01 lacks a readable toString()
> - I don't think regeneration exceptions are ERRORs; they seem like WARNs to me, or even lower. Also, for courtesy, EarlyTerminatingCollectorExceptions in particular could be recognized, and even ignored, in SolrIndexSearcher.java#L522.
>
> Would you mind raising a ticket?
>
> On Wed, Nov 5, 2014 at 6:51 PM, Dirk Högemann wrote:
> > Our production Solr slave cores (we have about 40 cores, each of a moderate size between 10K and 90K documents) produce many exceptions of this type:
> >
> > 2014-11-05 15:06:06.247 [searcherExecutor-158-thread-1] ERROR org.apache.solr.search.SolrCache: Error during auto-warming of key:org.apache.solr.search.QueryResultKey@62340b01 :org.apache.solr.search.EarlyTerminatingCollectorException
> >
> > Our relevant solrconfig is
> >
> > 18
> > 2
> > class="solr.FastLRUCache" size="8192" initialSize="8192" autowarmCount="4096"/>
> > class="solr.FastLRUCache" size="8192" initialSize="8192" autowarmCount="4096"/>
> > class="solr.FastLRUCache" size="8192" initialSize="8192" autowarmCount="4096"/>
> >
> > What exactly does the exception mean? Thank you!
> >
> > -- Dirk --
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
> <http://www.griddynamics.com>
Solr4.2 PostCommit EventListener not working on Replication-Instances
Hello, I have implemented a Solr EventListener which should be fired after committing. This works fine on the Solr master instance, and it also worked in Solr 3.5 on any slave instance. I upgraded my installation to Solr 4.2 and now the postCommit event is no longer fired on the replication (slave) instances, which is a huge problem, as other caches have to be invalidated when replication has taken place. This is my solrconfig.xml configuration on the slaves: 1 ... ... http://localhost:9101/solr/Core1 00:03:00 Any hints? Best regards
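For context, a postCommit listener of the kind described is registered in solrconfig.xml roughly like this — the listener class is a placeholder, while the masterUrl and poll interval correspond to the values quoted above:

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- fired after every commit; on 3.5 this also fired on slaves after replication -->
  <listener event="postCommit" class="com.example.MyCacheInvalidatingListener"/>
</updateHandler>

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://localhost:9101/solr/Core1</str>
    <str name="pollInterval">00:03:00</str>
  </lst>
</requestHandler>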
Solr3.5 PatternTokenizer / Search Analyzer tokenizing always at whitespace?
Hi, I am not sure if I am missing something, or maybe I do not exactly understand the index/search analyzer definitions and their execution. I have a field definition like this: Any field starting with cl2 should be recognized as being of type cl2Tokenized_string. When I try to search for a token in that sense, the query is tokenized at whitespace:

{!q.op=AND df=cl2Categories_NACE}cl2Categories_NACE:08 Gewinnung von Steinen und Erden, sonstiger Bergbau
parsed_filter_queries: +cl2Categories_NACE:08 +cl2Categories_NACE:gewinnung +cl2Categories_NACE:von +cl2Categories_NACE:steinen +cl2Categories_NACE:und +cl2Categories_NACE:erden, +cl2Categories_NACE:sonstiger +cl2Categories_NACE:bergbau

I expected the query parser to also tokenize ONLY at the pattern ###, instead of using a whitespace tokenizer here. Is it possible to define a filter query, without using phrases, that achieves the desired behavior? Maybe local parameters are not the way to go here?

Best
Dirk
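The field definition itself did not survive the archive; reconstructed from the fragments quoted later in this thread (the pattern value and the lowercase filter are inferences, not verbatim), it looks roughly like this — note the second analyzer is declared as type="search", which the thread later identifies as part of the problem:

<fieldType name="cl2Tokenized_string" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer type="index">
    <!-- split only at the literal delimiter ###, keep everything else intact -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern="###" group="-1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="search">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="###" group="-1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<dynamicField name="cl2*" type="cl2Tokenized_string" indexed="true" stored="true"/>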
Re: Solr3.5 PatternTokenizer / Search Analyzer tokenizing always at whitespace?
{!q.op=AND df=cl2Categories_NACE}08 Gewinnung von Steinen und Erden, sonstiger Bergbau
parsed_filter_queries: +cl2Categories_NACE:08 +cl2Categories_NACE:gewinnung +cl2Categories_NACE:von +cl2Categories_NACE:steinen +cl2Categories_NACE:und +cl2Categories_NACE:erden, +cl2Categories_NACE:sonstiger +cl2Categories_NACE:bergbau

That is the relevant debug output from the query.

2012/12/17 Dirk Högemann
> Hi,
>
> I am not sure if I am missing something, or maybe I do not exactly understand the index/search analyzer definitions and their execution.
>
> I have a field definition like this:
> sortMissingLast="true" omitNorms="true">
> group="-1"/>
> group="-1"/>
>
> Any field starting with cl2 should be recognized as being of type cl2Tokenized_string:
> stored="true" />
>
> When I try to search for a token in that sense, the query is tokenized at whitespace:
>
> {!q.op=AND df=cl2Categories_NACE}cl2Categories_NACE:08 Gewinnung von Steinen und Erden, sonstiger Bergbau
> parsed_filter_queries: +cl2Categories_NACE:08 +cl2Categories_NACE:gewinnung +cl2Categories_NACE:von +cl2Categories_NACE:steinen +cl2Categories_NACE:und +cl2Categories_NACE:erden, +cl2Categories_NACE:sonstiger +cl2Categories_NACE:bergbau
>
> I expected the query parser to also tokenize ONLY at the pattern ###, instead of using a whitespace tokenizer here.
> Is it possible to define a filter query, without using phrases, that achieves the desired behavior?
> Maybe local parameters are not the way to go here?
>
> Best
> Dirk
Re: Solr3.5 PatternTokenizer / Search Analyzer tokenizing always at whitespace?
Ok - right, changed that... Nevertheless, I thought I should always use the same analyzers for the query and the index section to have consistent results. Does this mean that the tokenizer in the query section will always be ignored by the given query parsers?

2012/12/17 Jack Krupansky
> The query parsers normally tokenize on white space and query operators, but you can escape any white space with a backslash or put the text in quotes, and then it will be tokenized by the analyzer rather than the query parser.
>
> Also, you have:
>
> <analyzer type="search">
>
> Change "search" to "query", but that won't change your problem, since Solr defaults to using the "index" analyzer if it doesn't "see" a "query" analyzer.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Dirk Högemann
> Sent: Monday, December 17, 2012 5:59 AM
> To: solr-user@lucene.apache.org
> Subject: Solr3.5 PatternTokenizer / Search Analyzer tokenizing always at whitespace?
>
> Hi,
>
> I am not sure if I am missing something, or maybe I do not exactly understand the index/search analyzer definitions and their execution.
>
> I have a field definition like this:
> sortMissingLast="true" omitNorms="true">
> group="-1"/>
> group="-1"/>
>
> Any field starting with cl2 should be recognized as being of type cl2Tokenized_string:
> stored="true" />
>
> When I try to search for a token in that sense, the query is tokenized at whitespace:
>
> {!q.op=AND df=cl2Categories_NACE}cl2Categories_NACE:08 Gewinnung von Steinen und Erden, sonstiger Bergbau
> parsed_filter_queries: +cl2Categories_NACE:08 +cl2Categories_NACE:gewinnung +cl2Categories_NACE:von +cl2Categories_NACE:steinen +cl2Categories_NACE:und +cl2Categories_NACE:erden, +cl2Categories_NACE:sonstiger +cl2Categories_NACE:bergbau
>
> I expected the query parser to also tokenize ONLY at the pattern ###, instead of using a whitespace tokenizer here.
> Is it possible to define a filter query, without using phrases, to achieve the desired behavior?
> Maybe local parameters are not the way to go here?
>
> Best
> Dirk
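To illustrate the escaping Jack describes, using the filter query from this thread (a sketch): escaping each space with a backslash keeps the query parser from splitting the value, so the complete string reaches the field's own tokenizer:

fq=cl2Categories_NACE:08\ Gewinnung\ von\ Steinen\ und\ Erden,\ sonstiger\ Bergbau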
Re: Solr3.5 PatternTokenizer / Search Analyzer tokenizing always at whitespace?
Ah - now I got it. My solution to this was to use phrase queries - now I know why. Thanks!

2012/12/17 Jack Krupansky
> No, the "query" analyzer tokenizer will simply be applied to each term or quoted string AFTER the query parser has already parsed it. You may have escaped or quoted characters which will then be seen by the analyzer tokenizer.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Dirk Högemann
> Sent: Monday, December 17, 2012 11:01 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr3.5 PatternTokenizer / Search Analyzer tokenizing always at whitespace?
>
> Ok - right, changed that... Nevertheless, I thought I should always use the same analyzers for the query and the index section to have consistent results.
> Does this mean that the tokenizer in the query section will always be ignored by the given query parsers?
>
> 2012/12/17 Jack Krupansky
>> The query parsers normally tokenize on white space and query operators, but you can escape any white space with a backslash or put the text in quotes, and then it will be tokenized by the analyzer rather than the query parser.
>>
>> Also, you have:
>>
>> <analyzer type="search">
>>
>> Change "search" to "query", but that won't change your problem, since Solr defaults to using the "index" analyzer if it doesn't "see" a "query" analyzer.
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Dirk Högemann
>> Sent: Monday, December 17, 2012 5:59 AM
>> To: solr-user@lucene.apache.org
>> Subject: Solr3.5 PatternTokenizer / Search Analyzer tokenizing always at whitespace?
>>
>> Hi,
>>
>> I am not sure if I am missing something, or maybe I do not exactly understand the index/search analyzer definitions and their execution.
>>
>> I have a field definition like this:
>> sortMissingLast="true" omitNorms="true">
>> group="-1"/>
>> group="-1"/>
>>
>> Any field starting with cl2 should be recognized as being of type cl2Tokenized_string:
>> stored="true" />
>>
>> When I try to search for a token in that sense, the query is tokenized at whitespace:
>>
>> {!q.op=AND df=cl2Categories_NACE}cl2Categories_NACE:08 Gewinnung von Steinen und Erden, sonstiger Bergbau
>> parsed_filter_queries: +cl2Categories_NACE:08 +cl2Categories_NACE:gewinnung +cl2Categories_NACE:von +cl2Categories_NACE:steinen +cl2Categories_NACE:und +cl2Categories_NACE:erden, +cl2Categories_NACE:sonstiger +cl2Categories_NACE:bergbau
>>
>> I expected the query parser to also tokenize ONLY at the pattern ###, instead of using a whitespace tokenizer here.
>> Is it possible to define a filter query, without using phrases, to achieve the desired behavior?
>> Maybe local parameters are not the way to go here?
>>
>> Best
>> Dirk
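In other words, the working phrase form of the filter query from this thread is:

fq={!q.op=AND df=cl2Categories_NACE}"08 Gewinnung von Steinen und Erden, sonstiger Bergbau"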
Re: Bad performance while query pdf solr documents
You can define the fields to be returned with the fl parameter: fl=the,needed,fields - usually the score and the id...

2012/12/23 uwe72
> hi
>
> i am indexing pdf documents to solr by tika.
>
> when i do the query in the client with solrj the performance is very bad (40 seconds) to load 100 documents.
>
> Probably because it loads all the content. The content i don't need. How can i tell the query not to load the content?
>
> Or are there other reasons why the performance is so bad?
>
> Regards
> Uwe
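For example (the title field is hypothetical), to return only the id, the score and a title instead of the full stored PDF content:

http://localhost:8983/solr/select?q=*:*&fl=id,score,title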
Re: Bad performance while query pdf solr documents
Do you really need them all in the response to show them in the results? Since you now define them as not stored, it does not seem so.

2012/12/23 Otis Gospodnetic
> Hi,
>
> You can specify them in solrconfig.xml for your request handler, so you don't have to specify them for each query unless you want to override fl.
>
> Otis
> Solr & ElasticSearch Support
> http://sematext.com/
> On Dec 23, 2012 4:39 AM, "uwe72" wrote:
> > we have more than hundreds of fields... i don't want to put them all in the fl parameter
> >
> > is there another way, like saying return all fields except these fields...?
> >
> > anyhow, i will change the field from stored to stored=false in the schema.
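What Otis describes looks roughly like this in solrconfig.xml (handler name and field list are placeholders):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- default field list; individual queries can still override fl -->
    <str name="fl">id,score</str>
  </lst>
</requestHandler>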
Highlighting problems
Hi all, I have problems with the highlighting mechanism. The query is: http://127.0.0.1:8983/solr/mpiwgweb/select?facet=true&facet.field=description&facet.field=lang&facet.field=main_content&start=0&q=meier+AND+%28description:member+OR+description:project%29 After that, in the field "main_content", which is the default search field, "meier" as well as "member" and "project" is highlighted, although I am searching for member and project only in the field description. The search results are OK, as far as I can see. My settings: explicit 10 300 on main_content html 200 2 true tvComponent Cheers Dirk
Re: AW: Highlighting problems
Hi André, thanks, this did the job. I also had to enable edismax and set the default parameter there - otherwise no highlighting at all.

Best
Dirk

On 11.03.2013 at 13:59, André Widhani wrote:
> Hi Dirk,
>
> please check http://wiki.apache.org/solr/HighlightingParameters#hl.requireFieldMatch - this may help you.
>
> Regards,
> André
>
> ________
> From: Dirk Wintergruen [dwin...@mpiwg-berlin.mpg.de]
> Sent: Monday, March 11, 2013 13:56
> To: solr-user@lucene.apache.org
> Subject: Highlighting problems
>
> Hi all,
>
> I have problems with the highlighting mechanism.
>
> The query is:
>
> http://127.0.0.1:8983/solr/mpiwgweb/select?facet=true&facet.field=description&facet.field=lang&facet.field=main_content&start=0&q=meier+AND+%28description:member+OR+description:project%29
>
> After that, in the field "main_content", which is the default search field, "meier" as well as "member" and "project" is highlighted, although I am searching for member and project only in the field description.
>
> The search results are OK, as far as I can see.
>
> My settings:
>
> explicit 10 300 on main_content html 200 2 true tvComponent
>
> Cheers
> Dirk
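For reference, the parameter André points to is simply added to the request (the hl.fl values here are the fields from this thread):

&hl=true&hl.fl=main_content,description&hl.requireFieldMatch=true

With hl.requireFieldMatch=true, a term is only highlighted in a field if the query actually matched that particular field.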
Phonetic search and matching
Hi, I have a question about phonetic search and matching in Solr. In our application all the content of an article is written to a full-text search field, which provides stemming and a phonetic filter (Cologne phonetic for German). This is the relevant part of the configuration for the index analyzer (search is analogous): Unfortunately this sometimes results in strange, but also explainable, matches. For example: the content field indexes the following string: Donnerstag von 13 bis 17 Uhr. This results in a match if we search for "puf", as the result of the phonetic filter for this is 13. (As a consequence, the 13 is then also highlighted.) Does anyone have an idea how to handle this in a reasonable way, so that a search for "puf" does not match 13 in the content? Thanks in advance! Dirk
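The analyzer configuration did not survive the archive; reconstructed from the fragments quoted in the reply below (the tokenizer and lowercase filter are assumptions), it was roughly:

<fieldType name="text_de_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
    <!-- inject="true" keeps each original token next to its phonetic code, so the
         literal token "13" and the Cologne code for "puf" (also "13") collide -->
    <filter class="solr.PhoneticFilterFactory" encoder="ColognePhonetic" inject="true"/>
  </analyzer>
</fieldType>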
Re: Phonetic search and matching
Thanks Erick. In the first place we thought of removing numbers with a pattern filter; setting inject to false would have the "same" effect. If we want to be able to search for numbers in the content, this solution will not work, but another field without phonetic filtering, and searching in both fields, would be OK, right?

Dirk

On 07.02.2012 14:01, "Erick Erickson" wrote:
> What happens if you do NOT inject? Setting inject="false" stores only the phonetic reduction, not the original text. In that case your false match on "13" would go away.
>
> Not sure what that means for the rest of your app, though.
>
> Best
> Erick
>
> On Mon, Feb 6, 2012 at 5:44 AM, Dirk Högemann wrote:
> > Hi,
> >
> > I have a question about phonetic search and matching in Solr. In our application all the content of an article is written to a full-text search field, which provides stemming and a phonetic filter (Cologne phonetic for German).
> > This is the relevant part of the configuration for the index analyzer (search is analogous):
> >
> > generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
> > language="German2" />
> > encoder="ColognePhonetic" inject="true"/>
> >
> > Unfortunately this sometimes results in strange, but also explainable, matches. For example:
> >
> > The content field indexes the following string: Donnerstag von 13 bis 17 Uhr.
> >
> > This results in a match if we search for "puf", as the result of the phonetic filter for this is 13. (As a consequence, the 13 is then also highlighted.)
> >
> > Does anyone have an idea how to handle this in a reasonable way, so that a search for "puf" does not match 13 in the content?
> >
> > Thanks in advance!
> >
> > Dirk
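A sketch of the two-field idea from the first paragraph (all names are invented; "text_de_phonetic" stands for the phonetic field type discussed above, "text_general" for a chain without the phonetic filter):

<field name="content" type="text_de_phonetic" indexed="true" stored="true"/>
<field name="content_plain" type="text_general" indexed="true" stored="false"/>
<copyField source="content" dest="content_plain"/>

Searching both fields (e.g. qf=content content_plain with dismax) keeps the phonetic matching while numbers like "13" still match literally in content_plain.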
Solr / Tika Integration
Hello, we use Solr 3.5 and Tika to index a lot of PDFs. The content of those PDFs is searchable via a full-text search; the terms are also used to make search suggestions. Unfortunately PDFBox seems to insert a space character when there are soft hyphens in the content of the PDF, so the extracted text is sometimes very fragmented. For example, the word Medizin is extracted as Me di zin. As a consequence, the suggestions are often unusable and the search does not work as expected. Has anyone a suggestion how to extract the content of PDFs containing soft hyphens without fragmenting it? Best Dirk
Re: Solr / Tika Integration
Thanks so far. I will have a closer look at the PDF. I tried the enableAutoSpace setting with PDFBox 1.6 - it did not work (imports added here for completeness; PDFParser and BodyContentHandler are Tika's classes):

import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;

PDFParser parser = new PDFParser();
parser.setEnableAutoSpace(false);
ContentHandler handler = new BodyContentHandler();

Output: Va ri an te Creutz feldt- Ja kob-Krank heit Stel lung nah men des Ar beits krei ses Blut

Our suggest component and parts of our search are getting hard to use because of this. Any other ideas?

Best
Dirk

2012/2/10 Jan Høydahl
> I think you need to control the parameter "enableAutoSpace" in PDFBox. There's a JIRA for it, but it depends on some Tika 1.1 stuff as far as I can understand:
>
> https://issues.apache.org/jira/browse/SOLR-2930
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
>
> On 10. feb. 2012, at 11:21, Dirk Högemann wrote:
> > Hello,
> >
> > we use Solr 3.5 and Tika to index a lot of PDFs. The content of those PDFs is searchable via a full-text search. Also the terms are used to make search suggestions.
> >
> > Unfortunately PDFBox seems to insert a space character when there are soft hyphens in the content of the PDF, so the extracted text is sometimes very fragmented. For example, the word Medizin is extracted as Me di zin. As a consequence, the suggestions are often unusable and the search does not work as expected.
> >
> > Has anyone a suggestion how to extract the content of PDFs containing soft hyphens without fragmenting it?
> >
> > Best
> > Dirk
Re: Solr / Tika Integration
The interesting thing is that the only tool I found that handles my PDF correctly was pdftotext.

2012/2/10 Robert Muir
> On Fri, Feb 10, 2012 at 6:18 AM, Dirk Högemann wrote:
> > Our suggest component and parts of our search are getting hard to use because of this. Any other ideas?
>
> Looks like https://issues.apache.org/jira/browse/PDFBOX-371
>
> The title of the issue is a bit confusing (I don't think it should go to hyphen either!), but I think it's the reason it is being mapped to a space.
>
> --
> lucidimagination.com
Auto-Commit and failures / schema violations
Hello, we are running a large CMS with multiple customers and we are now going to use Solr for our search and indexing tasks. As we have a lot of users working simultaneously on the CMS, we decided not to commit our changes programmatically (we use StreamingUpdateSolrServer) on each add. Instead we are using the auto-commit functions in solrconfig.xml. To be "reliable" we write timestamp files on each "add" of a document to the StreamingUpdateSolrServer. (In case of a crash we could restart indexing from that timestamp.) Unfortunately, we don't know how to be sure that the add was successful, as (for example) schema violations seem to be detected only on commit, which is therefore too late, as the timestamp has usually already been overwritten by then. So: are there any valid approaches to be sure that an add of a document has been processed successfully? Maybe: is it better to collect a list of documents to add and commit these, instead of using the auto-commit function? Thanks in advance for any help! Dirk Högemann
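For reference, the auto-commit mechanism referred to is configured in solrconfig.xml roughly like this (the thresholds are placeholders):

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- commit after this many added docs or this many milliseconds, whichever comes first -->
    <maxDocs>10000</maxDocs>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>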
Auto commit exception in Solr 4.0 Beta
Hello, I am trying to make our search application Solr 4.0 (Beta) ready and elaborate on the tasks necessary to accomplish this. When I try to reindex our documents I get the following exception:

auto commit error...:java.lang.UnsupportedOperationException: this codec can only be used for reading
    at org.apache.lucene.codecs.lucene3x.Lucene3xCodec$1.writeLiveDocs(Lucene3xCodec.java:74)
    at org.apache.lucene.index.ReadersAndLiveDocs.writeLiveDocs(ReadersAndLiveDocs.java:278)
    at org.apache.lucene.index.IndexWriter$ReaderPool.release(IndexWriter.java:435)
    at org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:278)
    at org.apache.lucene.index.IndexWriter.applyAllDeletes(IndexWriter.java:2928)
    at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:2919)
    at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2666)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2793)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2773)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:531)
    at org.apache.solr.update.CommitTracker.run(CommitTracker.java:214)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

Is this a known bug, or is it maybe a classpath problem I am facing here?

Best
Dirk Hoegemann
Re: Auto commit exception in Solr 4.0 Beta
Perfect. I reindexed the whole index and everything worked fine. The exception was just a little bit confusing.

Best
Dirk

On 21.08.2012 14:39, "Jack Krupansky" wrote:
> Did you explicitly run the IndexUpgrader before adding new documents?
>
> In theory, you don't have to do that, but... who knows for sure.
>
> While you wait for one of the hard-core Lucene guys to respond, you could try IndexUpgrader, if you haven't already.
>
> OTOH, if you are in fact reindexing (rather than reusing your old index), why not start with an empty 4.0 index?
>
> From CHANGES.TXT:
>
> - On upgrading to 4.0, if you do not fully reindex your documents, Lucene will emulate the new flex API on top of the old index, incurring some performance cost (up to ~10% slowdown, typically). To prevent this slowdown, use oal.index.IndexUpgrader to upgrade your indexes to the latest file format (LUCENE-3082).
>
> Mixed flex/pre-flex indexes are perfectly fine -- the two emulation layers (flex API on pre-flex index, and pre-flex API on flex index) will remap the access as required. So on upgrading to 4.0 you can start indexing new documents into an existing index. To get optimal performance, use oal.index.IndexUpgrader to upgrade your indexes to the latest file format (LUCENE-3082).
>
> -- Jack Krupansky
>
> -----Original Message----- From: Dirk Högemann
> Sent: Tuesday, August 21, 2012 9:17 AM
> To: solr-user@lucene.apache.org
> Subject: Auto commit exception in Solr 4.0 Beta
>
> Hello,
>
> I am trying to make our search application Solr 4.0 (Beta) ready and elaborate on the tasks necessary to accomplish this. When I try to reindex our documents I get the following exception:
>
> auto commit error...:java.lang.UnsupportedOperationException: this codec can only be used for reading
>     at org.apache.lucene.codecs.lucene3x.Lucene3xCodec$1.writeLiveDocs(Lucene3xCodec.java:74)
>     at org.apache.lucene.index.ReadersAndLiveDocs.writeLiveDocs(ReadersAndLiveDocs.java:278)
>     at org.apache.lucene.index.IndexWriter$ReaderPool.release(IndexWriter.java:435)
>     at org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:278)
>     at org.apache.lucene.index.IndexWriter.applyAllDeletes(IndexWriter.java:2928)
>     at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:2919)
>     at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2666)
>     at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2793)
>     at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2773)
>     at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:531)
>     at org.apache.solr.update.CommitTracker.run(CommitTracker.java:214)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
>
> Is this a known bug, or is it maybe a Classpath problem I am facing here?
>
> Best
> Dirk Hoegemann
solr4.0 LimitTokenCountFilterFactory NumberFormatException
Hi, I am trying to upgrade from Solr 3.5 to Solr 4.0. I read the following in the example solrconfig: I tried that as follows: ... ... The LimitTokenCountFilterFactory configured like that crashes the startup of the corresponding core with the following exception (without the factory the core startup works):

17.10.2012 17:44:19 org.apache.solr.common.SolrException log
SCHWERWIEGEND: null:org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] fieldType "textgen": Plugin init failure for [schema.xml] analyzer/filter: null
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
    at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:846)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:534)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
    at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:277)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:258)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:382)
    at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:103)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4638)
    at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5294)
    at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:895)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:871)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:615)
    at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:649)
    at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1581)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] analyzer/filter: null
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
    at org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:377)
    at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)
    at org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:43)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151)
    ... 25 more
Caused by: java.lang.NumberFormatException: null
    at java.lang.Integer.parseInt(Integer.java:417)
    at java.lang.Integer.parseInt(Integer.java:499)
    at org.apache.lucene.analysis.miscellaneous.LimitTokenCountFilterFactory.init(LimitTokenCountFilterFactory.java:48)
    at org.apache.solr.schema.FieldTypePluginLoader$3.init(FieldTypePluginLoader.java:367)
    at org.apache.solr.schema.FieldTypePluginLoader$3.init(FieldTypePluginLoader.java:358)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:159)
    ... 29 more

Any ideas?

Best
Dirk
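For reference, the configuration attempted here, reconstructed as a sketch from the fragments quoted in the reply below (the tokenizer is an assumption; the filter attributes match the quoted fragments):

<fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1"
            catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt" enablePositionIncrements="true"/>
  </analyzer>
</fieldType>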
Re: solr4.0 LimitTokenCountFilterFactory NumberFormatException
:-) Great solution... it will look funny in our production system.

On 17.10.2012 16:12, "Jack Krupansky" wrote:
> Anybody want to guess what's wrong with this code:
>
> String maxTokenCountArg = args.get("maxTokenCount");
> if (maxTokenCountArg == null) {
>   throw new IllegalArgumentException("maxTokenCount is mandatory.");
> }
> // bug: the retrieved value is used as a key for a second lookup
> maxTokenCount = Integer.parseInt(args.get(maxTokenCountArg));
>
> Hmmm... try this "workaround":
>
> maxTokenCount="foo" foo="1"/>
>
> -- Jack Krupansky
>
> -----Original Message----- From: Dirk Högemann
> Sent: Wednesday, October 17, 2012 11:50 AM
> To: solr-user@lucene.apache.org
> Subject: solr4.0 LimitTokenCountFilterFactory NumberFormatException
>
> Hi,
>
> I am trying to upgrade from Solr 3.5 to Solr 4.0.
> I read the following in the example solrconfig:
>
> I tried that as follows:
>
> ... positionIncrementGap="100">
> maxTokenCount="10"/>
> generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
> language="German" />
> words="stopwords.txt" enablePositionIncrements="true" />
> ...
>
> The LimitTokenCountFilterFactory configured like that crashes the startup of the corresponding core with the following exception (without the factory the core startup works):
>
> 17.10.2012 17:44:19 org.apache.solr.common.SolrException log
> SCHWERWIEGEND: null:org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] fieldType "textgen": Plugin init failure for [schema.xml] analyzer/filter: null
>     at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
>     at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:369)
>     at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:113)
>     at org.apache.solr.core.CoreContainer.create(CoreContainer.java:846)
>     at org.apache.solr.core.CoreContainer.load(CoreContainer.java:534)
>     at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
>     at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
>     at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
>     at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:277)
>     at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:258)
>     at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:382)
>     at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:103)
>     at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4638)
>     at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5294)
>     at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
>     at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:895)
>     at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:871)
>     at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:615)
>     at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:649)
>     at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1581)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] analyzer/filter: null
>     at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
>     at org.apache.solr.schema.FieldTypePlu
Forwardslash delimiter.Solr4.0 query for path like /Customer/Content/*
Hi, I am currently upgrading from Solr 3.5 to Solr 4.0.

I used to have filter-based restrictions for my search, based on the paths of documents in a content repository, e.g.:

fq={!q.op=OR df=folderPath_}/customer/content/*

Unfortunately this does not work any more, as Lucene now supports regexp searches, delimiting the expression with forward slashes:
http://lucene.apache.org/core/4_0_0-BETA/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches

This leads to a parsed query which is of course not what is intended:

RegexpQuery(folderPath_:/standardlsg/) folderPath_:shareddocs RegexpQuery(folderPath_:/personen/) folderPath_:*

Is there a possibility to make the example query above work without escaping the "/" with "\/"? Otherwise I will have to parse all queries (coming from persisted configurations in the repository) and escape the relevant parts of the queries on that field, which is somewhat ugly...

The field I search on is of type:

Best and thanks for any hints
Dirk
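For illustration, the escaped form that the message refers to (each "/" escaped with "\/" so the 4.0 parser no longer treats the value as a regexp) would look like:

fq={!q.op=OR df=folderPath_}\/customer\/content\/*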
Re: Forwardslash delimiter.Solr4.0 query for path like /Customer/Content/*
OK. If there is no other way, I will have some string parsing to do, but in this case I am wondering a little bit about the chosen delimiter... as it is central to nearly any path in directories, web resources etc., right?

Best
Dirk

On 30.10.2012 19:16, "Jack Krupansky" wrote:
> Maybe a custom search component that runs before the QueryComponent and does the escaping?
>
> -- Jack Krupansky
>
> -----Original Message----- From: Dirk Högemann
> Sent: Tuesday, October 30, 2012 1:07 PM
> To: solr-user@lucene.apache.org
> Subject: Forwardslash delimiter.Solr4.0 query for path like /Customer/Content/*
>
> Hi,
>
> I am currently upgrading from Solr 3.5 to Solr 4.0.
>
> I used to have filter-based restrictions for my search, based on the paths of documents in a content repository, e.g.:
>
> fq={!q.op=OR df=folderPath_}/customer/content/*
>
> Unfortunately this does not work any more, as Lucene now supports regexp searches, delimiting the expression with forward slashes:
> http://lucene.apache.org/core/4_0_0-BETA/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches
>
> This leads to a parsed query which is of course not what is intended:
>
> RegexpQuery(folderPath_:/standardlsg/) folderPath_:shareddocs RegexpQuery(folderPath_:/personen/) folderPath_:*
>
> Is there a possibility to make the example query above work without escaping the "/" with "\/"? Otherwise I will have to parse all queries (coming from persisted configurations in the repository) and escape the relevant parts of the queries on that field, which is somewhat ugly...
>
> The field I search on is of type:
>
> Best and thanks for any hints
> Dirk
Empty shard1 - -:{"shard1":[]} cannot add new replicas
Dear all, I cannot add or remove any replicas of one collection. The diagnostics in the log file show empty shards: "gmpg-fulltext3":{"shard1":[]}, see below. What can I do?

AutoScaling.error.diagnostics.3897285248187441
{
  "sortedNodes":[{
    "node":"gmpg-services8.mpiwg-berlin.mpg.de:48983_solr",
    "isLive":true,
    "cores":3.0,
    "freedisk":466.3022346496582,
    "totaldisk":3814.24609375,
    "replicas":{
      "gmpg-fulltext3":{"shard1":[]},
      "gmpg-db":{"shard1":[{
        "core_node20":{
          "core":"gmpg-db_shard1_replica_n19",
          "shard":"shard1",
          "collection":"gmpg-db",
          "node_name":"gmpg-services8.mpiwg-berlin.mpg.de:48983_solr",
          "type":"NRT",
          "base_url":"http://gmpg-services8.mpiwg-berlin.mpg.de:48983/solr",
          "state":"down",
          "force_set_state":"false",
          "INDEX.sizeInGB":0.06685098074376583}}]},
      "abstracts4":{"shard1":[{
        "core_node4":{
          "core":"abstracts4_shard1_replica_n3",
          "shard":"shard1",
          "collection":"abstracts4",
          "node_name":"gmpg-services8.mpiwg-berlin.mpg.de:48983_solr",
          "type":"NRT",
          "leader":"true",
          "base_url":"http://gmpg-services8.mpiwg-berlin.mpg.de:48983/solr",
          "state":"active",
          "force_set_state":"false",
          "INDEX.sizeInGB":22.7537336172536}}]},
      "gmpg-fulltext-dev":{"shard1":[{
        "core_node2":{
          "core":"gmpg-fulltext-dev_shard1_replica_n1",
          "shard":"shard1",
          "collection":"gmpg-fulltext-dev",
          "node_name":"gmpg-services8.mpiwg-berlin.mpg.de:48983_solr",
          "type":"NRT",
          "leader":"true",
          "base_url":"http://gmpg-services8.mpiwg-berlin.mpg.de:48983/solr",
          "state":"active",
          "force_set_state":"false",
          "INDEX.sizeInGB":1.3224780559539795E-7}}]}}}

The error message is:

2021-02-05 10:57:41.181 ERROR (OverseerThreadFactory-23-thread-3-processing-n:gmpg-services8.mpiwg-berlin.mpg.de:58983_solr) [c:gmpg-fulltext3 s:shard1 ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: gmpg-fulltext3 operation: addreplica failed:org.apache.solr.cloud.api.collections.Assign$AssignmentException: Error getting replica locations : No node can satisfy the rules "[] More details from logs in node : gmpg-services8.mpiwg-berlin.mpg.de:58983_solr, errorId : AutoScaling.error.diagnostics.3897285248187441"
    at org.apache.solr.cloud.api.collections.Assign.getPositionsUsingPolicy(Assign.java:394)
    at org.apache.solr.cloud.api.collections.Assign$PolicyBasedAssignStrategy.assign(Assign.java:630)
    at org.apache.solr.cloud.api.collections.Assign.getNodesForNewReplicas(Assign.java:368)
    at org.apache.solr.cloud.api.collections.AddReplicaCmd.buildReplicaPositions(AddReplicaCmd.java:370)
    at org.apache.solr.cloud.api.collections.AddReplicaCmd.addReplica(AddReplicaCmd.java:156)
    at org.apache.solr.cloud.api.collections.AddReplicaCmd.call(AddReplicaCmd.java:93)
    at org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:263)
    at org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:505)
    at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.solr.common.SolrException: No node can satisfy the rules "[] More details from logs in node : gmpg-services8.mpiwg-berlin.mpg.de:58983_solr, errorId : AutoScaling.error.diagnostics.3897285248187441"
    at org.apache.solr.client.solrj.cloud.autoscaling.PolicyHelper.getReplicaLocations(PolicyHelper.java:185)
    at org.apache.solr.cloud.api.collections.Assign.getPositionsUsingPolicy(Assign.java:382)
    ... 11 more

Cheers
Dirk

--
Dr.-Ing. Dirk Wintergrün
Max-Planck-Institut für Wissenschaftsgeschichte
Max Planck Institute for the History of Science
Department I / Digital and Computational Humanities
Boltzmannstr. 22
14195 Berlin
+49 20 22 66 7108
dwin...@mpiwg-berlin.mpg.de