Re: does solr support distributed index storage?
On Mon, Oct 12, 2009 at 10:27 AM, Pravin Karne <pravin_ka...@persistent.co.in> wrote:
> How to set master/slave setup for solr.

Index documents only on the master. Put the slaves behind a load balancer and query only the slaves. Set up replication between the master and slaves. See http://wiki.apache.org/solr/SolrReplication

--
Regards,
Shalin Shekhar Mangar.
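For reference, the Java-based replication described on that wiki page is configured in solrconfig.xml roughly as follows. This is a sketch: the host name and the 60-second poll interval are placeholders, not values from this thread.

```xml
<!-- master solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- slave solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

The master pushes nothing; each slave polls the master's /replication handler and pulls changed index files after every commit.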
Re: Facet query help
On Mon, Oct 12, 2009 at 6:07 AM, Tommy Chheng wrote:
> The dummy data set is composed of 6 docs.
>
> My query is set for 'tommy' with the facet query of Memory_s:1+GB
>
> http://lh:8983/solr/select/?facet=true&facet.field=CPU_s&facet.field=Memory_s&facet.field=Video+Card_s&wt=ruby&facet.query=Memory_s:1+GB&q=tommy&indent=on
>
> However, in the response (http://pastie.org/650932), I get two docs: one
> which has the correct field Memory_s:1 GB and the second document which has
> a Memory_s:3+GB. Why did the second document match if I set the facet.query
> to just 1+GB??

facet.query does not limit documents. It is used for finding the number of documents matching the query. In order to filter the result set you should use a filter query, e.g. fq=Memory_s:"1 GB"

--
Regards,
Shalin Shekhar Mangar.
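To illustrate the distinction with the query above (host name shortened as in the original): facet.query only adds an extra count to the facet section of the response, while fq actually restricts the documents returned.

```
# counts the matching docs, does not filter the result set:
...&q=tommy&facet=true&facet.query=Memory_s:"1 GB"

# filters the result set:
...&q=tommy&fq=Memory_s:"1 GB"
```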
Re: Is negative boost possible?
Yonik Seeley wrote:
> On Sun, Oct 11, 2009 at 6:04 PM, Lance Norskog wrote:
>> And the other important thing to know about boost values is that the
>> dynamic range is about 6-8 bits
> That's an index-time boost - an 8 bit float with 5 bits of mantissa and
> 3 bits of exponent. Query time boosts are normal 32 bit floats.

To be more specific: index-time float encoding does not permit negative numbers (see SmallFloat), but query-time boosts can be negative, and they DO affect the score - see below. BTW, standard Collectors collect only results with positive scores, so if you want to collect results with negative scores as well then you need to use a custom Collector.

--- BeanShell 2.0b4 - by Pat Niemeyer (p...@pat.net)
bsh % import org.apache.lucene.search.*;
bsh % import org.apache.lucene.index.*;
bsh % import org.apache.lucene.store.*;
bsh % import org.apache.lucene.document.*;
bsh % import org.apache.lucene.analysis.*;
bsh % tq = new TermQuery(new Term("a", "b"));
bsh % print(tq);
a:b
bsh % tq.setBoost(-1);
bsh % print(tq);
a:b^-1.0
bsh % q = new BooleanQuery();
bsh % tq1 = new TermQuery(new Term("a", "c"));
bsh % tq1.setBoost(10);
bsh % q.add(tq1, BooleanClause.Occur.SHOULD);
bsh % q.add(tq, BooleanClause.Occur.SHOULD);
bsh % print(q);
a:c^10.0 a:b^-1.0
bsh % dir = new RAMDirectory();
bsh % w = new IndexWriter(dir, new WhitespaceAnalyzer());
bsh % doc = new Document();
bsh % doc.add(new Field("a", "b c d", Field.Store.YES, Field.Index.ANALYZED));
bsh % w.addDocument(doc);
bsh % w.close();
bsh % r = IndexReader.open(dir);
bsh % is = new IndexSearcher(r);
bsh % td = is.search(q, 10);
bsh % sd = td.scoreDocs;
bsh % print(sd.length);
1
bsh % print(is.explain(q, 0));
0.1373985 = (MATCH) sum of:
  0.15266499 = (MATCH) weight(a:c^10.0 in 0), product of:
    0.99503726 = queryWeight(a:c^10.0), product of:
      10.0 = boost
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.32427183 = queryNorm
    0.15342641 = (MATCH) fieldWeight(a:c in 0), product of:
      1.0 = tf(termFreq(a:c)=1)
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.5 = fieldNorm(field=a, doc=0)
  -0.0152664995 = (MATCH) weight(a:b^-1.0 in 0), product of:
    -0.099503726 = queryWeight(a:b^-1.0), product of:
      -1.0 = boost
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.32427183 = queryNorm
    0.15342641 = (MATCH) fieldWeight(a:b in 0), product of:
      1.0 = tf(termFreq(a:b)=1)
      0.30685282 = idf(docFreq=1, numDocs=1)
      0.5 = fieldNorm(field=a, doc=0)
bsh %

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web; Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
Re: rollback and cumulative_add
Koji Sekiguchi wrote:
> Hello,
>
> I found that rollback resets the adds and docsPending counts,
> but doesn't reset cumulative_adds.
>
> $ cd example/exampledocs
> # comment out the commit line in post.sh so nothing gets committed
> $ ./post.sh *.xml
> => docsPending=19, adds=19, cumulative_adds=19
>
> # do rollback
> $ curl http://localhost:8983/solr/update?rollback=true
> => rollbacks=1, docsPending=0, adds=0, cumulative_adds=19
>
> Is this correct behavior?
>
> Koji
>
> (forwarded from the dev list)

I think this is a bug that I introduced when I contributed the first patch for rollback, and the bug was inherited by the successive patches. I'll reopen SOLR-670 and attach the fix soon: https://issues.apache.org/jira/browse/SOLR-670

Koji
--
http://www.rondhuit.com/
Re: Is negative boost possible?
On Mon, Oct 12, 2009 at 5:58 AM, Andrzej Bialecki wrote: > BTW, standard Collectors collect only results > with positive scores, so if you want to collect results with negative scores > as well then you need to use a custom Collector. Solr never discarded non-positive hits, and now Lucene 2.9 no longer does either. -Yonik
two facet.prefix on one facet field in a single query
Is it possible to have two different facet.prefix values on the same facet field in a single query? I want to get facet counts for two prefixes, "xx" and "yy". I tried using two facet.prefix parameters (i.e. &facet.prefix=xx&facet.prefix=yy) but the second one seems to have no effect.

Bill
Re: Facet query help
OK, so fq != facet.query; I thought it was an alias.

I'm trying your suggestion fq=Memory_s:"1 GB" and now it's returning zero documents, even though there is one document that has "tommy" and "Memory_s:1 GB", as seen in the original pastie (http://pastie.org/650932). I tried the fq query body with quotes and without quotes.

http://lh:8983/solr/select/?facet=true&facet.field=CPU_s&facet.field=Memory_s&facet.field=Video+Card_s&wt=ruby&fq=%22Memory_s:1+GB%22&q=tommy&indent=on

Any thoughts?

thanks,
tommy

On 10/12/09 1:00 AM, Shalin Shekhar Mangar wrote:
> On Mon, Oct 12, 2009 at 6:07 AM, Tommy Chheng wrote:
>> The dummy data set is composed of 6 docs.
>> My query is set for 'tommy' with the facet query of Memory_s:1+GB
>> http://lh:8983/solr/select/?facet=true&facet.field=CPU_s&facet.field=Memory_s&facet.field=Video+Card_s&wt=ruby&facet.query=Memory_s:1+GB&q=tommy&indent=on
>> However, in the response (http://pastie.org/650932), I get two docs: one
>> which has the correct field Memory_s:1 GB and the second document which
>> has a Memory_s:3+GB. Why did the second document match if I set the
>> facet.query to just 1+GB??
> facet.query does not limit documents. It is used for finding the number of
> documents matching the query. In order to filter the result set you should
> use a filter query, e.g. fq=Memory_s:"1 GB"
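One thing worth checking in the URL above: the quotes are wrapped around the entire fq value (fq=%22Memory_s:1+GB%22), so the field name ends up inside the quoted phrase and the whole thing is parsed against the default field. Only the field value should be quoted; URL-encoded, that looks like:

```
fq=Memory_s:%221+GB%22        (i.e. fq=Memory_s:"1 GB")
```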
Re: format of sort parameter in Solr::Request::Standard
I did an experiment that worked. In Solr::Request::Standard, in the to_hash() method, I changed the commented line below to the two lines following it.

    sort = @params[:sort].collect do |sort|
      key = sort.keys[0]
      "#{key.to_s} #{sort[key] == :descending ? 'desc' : 'asc'}"
    end.join(',') if @params[:sort]

    # START OF CHANGES
    #hash[:q] = sort ? "#{@params[:query]};#{sort}" : @params[:query]
    hash[:q] = @params[:query]
    hash[:sort] = sort if sort != nil
    # END OF CHANGES
    hash["q.op"] = @params[:operator]
    hash[:df] = @params[:default_field]

Does this make sense? Should this be changed in the next version of the solr-ruby gem?

Paul Rosen wrote:
> Hi all,
>
> I'm using solr-ruby 0.0.7 and am having trouble getting sort to work.
> I have the following statement:
>
>     req = Solr::Request::Standard.new(:start => start,
>       :rows => max,
>       :sort => [ :title_sort => :ascending ],
>       :query => query,
>       :filter_queries => filter_queries,
>       :field_list => @field_list,
>       :facets => {:fields => @facet_fields, :mincount => 1,
>                   :missing => true, :limit => -1},
>       :highlighting => {:field_list => ['text'], :fragment_size => 600},
>       :shards => @cores)
>
> That produces no results, but removing the :sort parameter does give
> results. Here is the output from solr:
>
> INFO: [merged] webapp=/solr path=/select params={wt=ruby&facet.limit=-1&rows=30&start=0&facet=true&facet.mincount=1&q=(rossetti);title_sort+asc&fl=archive,date_label,genre,role_ART,role_AUT,role_EDT,role_PBL,role_TRL,source,image,thumbnail,text_url,title,alternative,uri,url,exhibit_type,license,title_sort,author_sort&qt=standard&facet.missing=true&hl.fl=text&facet.field=genre&facet.field=archive&facet.field=freeculture&hl.fragsize=600&hl=true&shards=localhost:8983/solr/merged} status=0 QTime=19
>
> It looks to me like the string should have "&sort=title_sort+asc" instead
> of ";title_sort+asc" tacked on to the query, but I'm not sure about that.
>
> Any clues what I'm doing wrong?
>
> Thanks,
> Paul
Re: format of sort parameter in Solr::Request::Standard
Paul -

Trunk solr-ruby has this instead:

    hash[:sort] = @params[:sort].collect do |sort|
      key = sort.keys[0]
      "#{key.to_s} #{sort[key] == :descending ? 'desc' : 'asc'}"
    end.join(',') if @params[:sort]

The ";sort..." syntax is now deprecated in Solr itself. I suppose the 0.8 gem needs to be pushed to rubyforge, eh?

Erik

On Oct 12, 2009, at 11:03 AM, Paul Rosen wrote:
> I did an experiment that worked. In Solr::Request::Standard, in the
> to_hash() method, I changed the commented line below to the two lines
> following it.
> [snip -- the rest of the quoted message appears earlier in this thread]
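For anyone following along, the difference on the wire is the deprecated semicolon syntax (which the 0.0.7 gem produced) versus the standalone sort parameter:

```
q=(rossetti);title_sort+asc           # deprecated ";sort" syntax
q=(rossetti)&sort=title_sort+asc      # current syntax
```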
Solr over DRBD
Hi there,

I have a 2-node cluster running Apache and Solr over a shared partition on top of DRBD. Think of it like a SAN. I'm curious as to how I should do load balancing / sharing with Solr in this setup. I'm already using DNS round robin for Apache.

My Solr installation is on /cluster/Solr. I've been starting an instance of Solr on each server out of the same installation / working directory. Is this safe? I haven't noticed any problems so far. Does this mean they'll share the same index? Is there a better way to do this? Should I perhaps only do commits on one of the servers (and set up heartbeat to determine which server to run the commit on)?

I'm running Solr 1.3, but I'm not against upgrading if that provides me with a better way of load balancing.

Kind regards,
Pieter
capitalization and delimiters
In my search docs, I have content such as 'powershot' and 'powerShot'. I would expect 'powerShot' to be indexed as 'power', 'shot' and 'powershot', so that results for all of these are returned. Instead, only results for 'power' and 'shot' are returned. Any suggestions?

In the schema, index analyzer:

In the schema, query analyzer:

Thanks,
Audrey
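The analyzer definitions were stripped from the archived message, but the behavior described ('powerShot' producing only 'power' and 'shot') matches WordDelimiterFilterFactory's case-change splitting. A hedged sketch of index-time filter settings that would also keep the catenated and original forms; the attribute values are illustrative, not the poster's actual config:

```xml
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1"
        splitOnCaseChange="1"
        catenateWords="1"
        preserveOriginal="1"/>
```

With these settings 'powerShot' yields 'power', 'shot', 'powershot' (catenated), and 'powerShot' (original). Catenation is usually enabled only on the index-time analyzer, not the query-time one.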
Re: Default query parameter for one core
Thanks for your input, Shalin.

On Sun, Oct 11, 2009 at 12:30 AM, Shalin Shekhar Mangar wrote:
>> - I can't use a variable like ${shardsParam} in a single shared
>> solrconfig.xml, because the line
>> ${shardsParam}
>> has to be in there, and that forces a (possibly empty) &shards
>> parameter onto cores that *don't* need one, causing a
>> NullPointerException.
>>
> Well, we can fix the NPE :) Please raise an issue.

The NPE may be the "correct" behavior -- I'm causing an empty &shards= parameter, which doesn't have a defined behavior AFAIK. The deficiency I was pointing out was that using ${shardsParam} doesn't help me achieve my real goal, which is to have the entire tag disappear for some shards.

>> So I think my best bet is to make two mostly-identical
>> solrconfig.xmls, and point core0 to the one specifying a &shards=
>> parameter. I don't like the duplication of config, but at least it
>> accomplishes my goal!
>>
> There is another way too. Each plugin in Solr now supports a configuration
> attribute named "enable" which can be true or false. You can control the
> value (true/false) through a variable. So you can duplicate just the
> handler instead of the complete solrconfig.xml

I had looked into this, but thought it doesn't help because I'm not disabling an entire plugin -- just a tag specifying a default parameter to a request handler. Individual tags don't have an "enable" flag for me to conditionally set to false. Maybe I'm misunderstanding what you're suggesting?

Thanks again,
Michael
Re: Is negative boost possible?
Yonik Seeley wrote:
> On Mon, Oct 12, 2009 at 5:58 AM, Andrzej Bialecki wrote:
>> BTW, standard Collectors collect only results with positive scores, so if
>> you want to collect results with negative scores as well then you need to
>> use a custom Collector.
> Solr never discarded non-positive hits, and now Lucene 2.9 no longer does
> either.

Hmm ... The code that I pasted in my previous email uses Searcher.search(Query, int), which in turn uses search(Query, Filter, int), and it doesn't return any results if only the first clause is present (the one with negative boost) even though it's a matching clause.

I think this is related to the fact that in TopScoreDocCollector:48 the pqTop.score is initialized to 0, and then all results that have a lower score than this are discarded. Perhaps this should be initialized to Float.MIN_VALUE?

--
Best regards,
Andrzej Bialecki
http://www.sigram.com  Contact: info at sigram dot com
Re: Scoring for specific field queries
Avlesh,

I got it, finally, by doing an OR between the two fields: one with an exact-match keyword and the other grouped.

q=suggestion:"formula xxx" OR tokenized_suggestion:(formula )

Thanks for all your help!

Rih

On Fri, Oct 9, 2009 at 4:26 PM, R. Tan wrote:
> I ended up with the same set of results earlier, but I don't get results
> such as "the champion", I think because of the EdgeNGram filter.
>
> With NGram, I'm back to the same problem.
>
> Result for q=ca:
>
> 0.8717008
> Blu Jazz Cafe
>
> 0.8717008
> Café in the Pond
Letters with accent in query
Hi, I'm querying with an accented keyword such as "café" but the debug info shows that it is only searching for "caf". I'm using the ISOLatin1Accent filter as well. Query: http://localhost:8983/solr/select?q=%E9&debugQuery=true Params return shows this: true What am I missing here? Rih
Re: Default query parameter for one core
OK, a hacky but working solution to making one core shard to all others: have the default parameter *name* vary, so that one core gets "&shards=foo" and all other cores get "&dummy=foo".

# solr.xml
...
# solrconfig.xml
${shardsValue}
...

Michael

On Mon, Oct 12, 2009 at 12:00 PM, Michael wrote:
> Thanks for your input, Shalin.
> [snip -- the rest of the quoted message appears earlier in this thread]
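The XML was stripped from this message by the archive; based on the description (the parameter *name* varies per core), the hack was presumably along these lines. This is a hypothetical reconstruction -- the property names and core names are illustrative:

```xml
<!-- solr.xml: give each core a different value for the property -->
<core name="core0" instanceDir="core0">
  <property name="shardsParamName" value="shards"/>
</core>
<core name="core1" instanceDir="core1">
  <property name="shardsParamName" value="dummy"/>
</core>

<!-- shared solrconfig.xml, inside the request handler's defaults -->
<str name="${shardsParamName}">${shardsValue}</str>
```

core0 then receives a real &shards= default, while every other core gets a harmless &dummy= parameter instead.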
Re: Letters with accent in query
What tokenizer and filters are you using in what order? See schema.xml. Also, you may wish to use ASCIIFoldingFilter, which covers more cases than ISOLatin1AccentFilter. Michael On Mon, Oct 12, 2009 at 12:42 PM, R. Tan wrote: > Hi, > I'm querying with an accented keyword such as "café" but the debug info > shows that it is only searching for "caf". I'm using the ISOLatin1Accent > filter as well. > > Query: > http://localhost:8983/solr/select?q=%E9&debugQuery=true > > Params return shows this: > > > true > > > What am I missing here? > > Rih >
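A sketch of an analyzer chain using ASCIIFoldingFilterFactory (the factory name is real; the field type and tokenizer choice here are illustrative). Note also that the query shown in the original message, q=%E9, is the Latin-1 byte for 'é'; Solr expects UTF-8, where 'é' encodes as %C3%A9, which may be why the character is dropped before it ever reaches the analyzer:

```xml
<fieldType name="text_folded" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
```

With folding applied at both index and query time, "café" and "cafe" match each other.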
Search results order
Hi,

I have indexed my xml which contains the following data:

http://www.yahoo.com
yahoomail
yahoo has various links and gives in detail about all the links in it

http://www.rediff.com
It is a good website
Rediff has an interesting homepage

http://www.ndtv.com
Ndtv has a variety of good links
The homepage of Ndtv is very good

In my solr home page, when I search for "good", it displays the docs which have the highest occurrences of "good" first by default. The output comes as follows:

http://www.ndtv.com
Ndtv has a variety of good links
The homepage of Ndtv is very good

http://www.rediff.com
It is a good website
Rediff has an interesting homepage

If I need to display the doc which has the least occurrences of the search input "good" as the first result, what changes should I make in solrconfig to achieve this? Any suggestions would be helpful.

For me the output should come as below:

http://www.rediff.com
It is a good website
Rediff has an interesting homepage

http://www.ndtv.com
Ndtv has a variety of good links
The homepage of Ndtv is very good

Regards
Bhaskar
Re: does solr support distributed index storage?
Hi, How should we setup master and slaves in Solr? What configuration files and parameters should we need to change and how ? Thanks, Chaitali --- On Mon, 10/12/09, Shalin Shekhar Mangar wrote: From: Shalin Shekhar Mangar Subject: Re: dose solr sopport distribute index storage ? To: solr-user@lucene.apache.org Date: Monday, October 12, 2009, 3:17 AM On Mon, Oct 12, 2009 at 10:27 AM, Pravin Karne < pravin_ka...@persistent.co.in> wrote: > How to set master/slave setup for solr. > > Index documents only on the master. Put the slaves behind a load balancer and query only on slaves. Setup replication between the master and slaves. See http://wiki.apache.org/solr/SolrReplication -- Regards, Shalin Shekhar Mangar.
Conditional copyField
Hi,

I am pushing data to Solr from two different sources: nutch and a CMS. I have a data clash: in nutch, a copyField is required to push the url field to the id field, as it is used as the primary lookup in the nutch-solr integration update. The CMS also uses the url field, but populates the id field with a different value. I can't really change either source definition, so is there a way in solrconfig or the schema to check if id is empty and only copy if true? Or is there a better way via the update processor?

Thanks for your help in advance.

Regards,
David
Re: format of sort parameter in Solr::Request::Standard
I've just pushed a new 0.0.8 gem to Rubyforge that includes the fix I described for the sort parameter.

Erik

On Oct 12, 2009, at 11:03 AM, Paul Rosen wrote:
> I did an experiment that worked. In Solr::Request::Standard, in the
> to_hash() method, I changed the commented line below to the two lines
> following it.
> [snip -- the rest of the quoted message appears earlier in this thread]
Re: does solr support distributed index storage?
On 10/12/2009 10:49 AM, Chaitali Gupta wrote:
> Hi,
>
> How should we setup master and slaves in Solr? What configuration files
> and parameters should we need to change and how?
>
> Thanks,
> Chaitali

Hi -

I think Shalin was pretty clear on that; it is documented very well at http://wiki.apache.org/solr/SolrReplication .

I am responding, however, to explain something that took me a bit of time to wrap my brain around, in the hopes that it helps you and perhaps some others.

Solr in itself does not replicate. Instead, Solr relies on an underlying rsync setup to keep these indices synced throughout the collective. When you break it down, it's simply rsync with a configuration file making all the nodes "aware" that they participate in this configuration. Wrap a cron job around this between all the nodes, and they simply replicate raw data from one "master" to one or more slaves.

I would suggest reading up on how snapshots are performed and how the log files are created/what they do. Of course it would benefit you to know the ins and outs of all the elements that help Solr replicate, but it's been my experience that most of it has to do with those particular items.

Thanks
-dant
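The rsync-based setup described above is the older script-based "collection distribution" shipped in solr/bin (Solr 1.3 and earlier; Solr 1.4's ReplicationHandler replaces it). Roughly -- exact options elided, see the CollectionDistribution wiki page:

```
# on the master, after a commit (often wired in as a postCommit listener):
solr/bin/snapshooter

# on each slave, from cron:
solr/bin/snappuller
solr/bin/snapinstaller
```

snapshooter takes a hard-link snapshot of the index, snappuller rsyncs it to the slave, and snapinstaller swaps it in and triggers a commit.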
Re: does solr support distributed index storage?
Sorry for the hijack, but is replication necessary when using a cluster file-system such as GFS2, where the files are the same for any instance of Solr?

On Mon, Oct 12, 2009 at 8:36 PM, Dan Trainor wrote:
> On 10/12/2009 10:49 AM, Chaitali Gupta wrote:
>> How should we setup master and slaves in Solr? What configuration files
>> and parameters should we need to change and how?
> [snip -- the rest of the quoted message appears earlier in this thread]
Re: Search results order
You can reverse the sort order. In this case, you want score ascending: sort=score+asc If you just want documents without that keyword, then try using the minus sign: q=-good http://wiki.apache.org/solr/CommonQueryParameters -Nick On Mon, Oct 12, 2009 at 1:19 PM, bhaskar chandrasekar wrote: > Hi, > > I have indexed my xml which contains the following data. > > > > http://www.yahoo.com > yahoomail > yahoo has various links and gives in detail > about the all the links in it > > > http://www.rediff.com > It is a good website > Rediff has a interesting homepage > > > http://www.ndtv.com > Ndtv has a variety of good links > The homepage of Ndtv is very good > > > > > In my solr home page , when I search input as “good” > > It displays the docs which has “good” as highest occurrences by default. > > The output comes as follows. > > http://www.ndtv.com > Ndtv has a variety of good links > The homepage of Ndtv is very good > > > http://www.rediff.com > It is a good website > Rediff has a interesting homepage > > > If I need to display doc which has least occurrence of search input “good” > as first result. > > What changes should I make in solrconfig file to achieve the same?. > Any suggestions would be helpful. > > > For me the output should come as below. > > > http://www.rediff.com > It is a good website > Rediff has a interesting homepage > > > http://www.ndtv.com > Ndtv has a variety of good links > The homepage of Ndtv is very good > > > Regards > Bhaskar > > >
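Applied to the example above, the full request carries the search term and the reversed relevance sort together:

```
/solr/select?q=good&sort=score+asc
```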
Re: Boosting of words
The easiest way to boost your query is to modify your query string. q=product:red color:red^10 In the above example, I have boosted the color field. If "red" is found in that field, it will get a boost of 10. If it is only found in the product field, then there will be no boost. Here's more information: http://wiki.apache.org/solr/SolrRelevancyCookbook#Boosting_Ranking_Terms Once you're comfortable with that, I suggest that you look into using the DisMax request handler. It will allow you to easily search across multiple fields with custom boost values. http://wiki.apache.org/solr/DisMaxRequestHandler -Nick On Sun, Oct 11, 2009 at 12:26 PM, bhaskar chandrasekar wrote: > Hi, > > I would like to know how can i give boosting to search input in Solr. > Where exactly should i make the changes?. > > Regards > Bhaskar > > >
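With DisMax, the per-field boosts move out of the query string and into the qf parameter; a sketch using the field names from the example above (this assumes a handler named "dismax", as in the example solrconfig.xml):

```
q=red&qt=dismax&qf=product^1.0+color^10.0
```

The user then types only the bare term, and the handler applies the same boosted multi-field search on every query.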
Re: Is negative boost possible?
On Mon, Oct 12, 2009 at 12:03 PM, Andrzej Bialecki wrote: >> Solr never discarded non-positive hits, and now Lucene 2.9 no longer >> does either. > > Hmm ... The code that I pasted in my previous email uses > Searcher.search(Query, int), which in turn uses search(Query, Filter, int), > and it doesn't return any results if only the first clause is present (the > one with negative boost) even though it's a matching clause. > > I think this is related to the fact that in TopScoreDocCollector:48 the > pqTop.score is initialized to 0, and then all results that have lower score > that this are discarded. Perhaps this should be initialized to > Float.MIN_VALUE? Hmmm, You're actually seeing this with Lucene 2.9? The HitQueue (subclass of PriorityQueue) is pre-populated with sentinel objects with scores of -Inf, not zero. -Yonik http://www.lucidimagination.com
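The difference between the two initializations can be sketched without Lucene: a collector whose priority-queue top starts at 0 rejects negative-scoring hits, while one seeded with negative infinity (the sentinel Yonik describes) keeps them. Note that Java's Float.MIN_VALUE is the smallest *positive* float, so Float.NEGATIVE_INFINITY is the sentinel that actually works. This is a simplified sketch of the admission check only; TopScoreDocCollector's real comparison also breaks ties by doc id.

```java
public class SentinelDemo {
    // Simplified TopScoreDocCollector-style admission check: a hit is
    // collected only if it beats the current worst score in the queue.
    static boolean collects(float worstInQueue, float score) {
        return score > worstInQueue;
    }

    public static void main(String[] args) {
        float hit = -0.015f; // score of the negatively boosted match
        System.out.println(collects(0f, hit));                      // false: hit is dropped
        System.out.println(collects(Float.NEGATIVE_INFINITY, hit)); // true: hit is kept
    }
}
```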
Re: Conditional copyField
> Hi, > I am pushing data to solr from two different sources nutch > and a cms. I have a data clash in that in nutch a copyField > is required to push the url field to the id field as it is > used as the primary lookup in the nutch solr > intergration update. The other cms also uses the url field > but also populates the id field with a different value. Now > I can't really change either source definition so is there a > way in solrconfig or schema to check if id is empty and only > copy if true or is there a better way via the > updateprocessor? copyField declaration has three attributes: source, dest and maxChars. Therefore it can be concluded that there is no way to do it in schema.xml Luckily, Wiki [1] has a quick example that implements a conditional copyField. [1] http://wiki.apache.org/solr/UpdateRequestProcessor
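In an UpdateRequestProcessor the decision logic itself is small. The sketch below shows just that logic on a plain Map standing in for the incoming document; the real processor would read and write a SolrInputDocument inside processAdd, and the class and method names here are illustrative, not part of any Solr API.

```java
import java.util.HashMap;
import java.util.Map;

public class ConditionalCopy {
    // Copy "url" into "id" only when "id" is missing or empty, mirroring
    // the conditional copyField wanted for the nutch-fed documents.
    static void copyUrlToIdIfEmpty(Map<String, Object> doc) {
        Object id = doc.get("id");
        if (id == null || id.toString().trim().isEmpty()) {
            doc.put("id", doc.get("url"));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> fromNutch = new HashMap<>();
        fromNutch.put("url", "http://example.com/page");
        copyUrlToIdIfEmpty(fromNutch);
        System.out.println(fromNutch.get("id")); // http://example.com/page

        Map<String, Object> fromCms = new HashMap<>();
        fromCms.put("url", "http://example.com/page");
        fromCms.put("id", "cms-42");
        copyUrlToIdIfEmpty(fromCms);
        System.out.println(fromCms.get("id")); // cms-42
    }
}
```

Documents from the CMS keep their own id, while nutch documents (which arrive with an empty id) get the url copied in.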
Re: Solr 1.4 Release Party
Where does the quote come from :) On Sat, Oct 10, 2009 at 6:38 AM, Israel Ekpo wrote: > I can't wait... > > -- > "Good Enough" is not good enough. > To give anything less than your best is to sacrifice the gift. > Quality First. Measure Twice. Cut Once. >
doing searches from within an UpdateRequestProcessor
Is it possible to do searches from within an UpdateRequestProcessor? The documents in my index reference each other. When a document is deleted, I would like to update all documents containing a reference to the deleted document. My initial idea is to use a custom UpdateRequestProcessor. Is there a better way to do this? Bill
Lucene Merge Threads
Hi,

I'm attempting to optimize a pretty large index, and even though the optimize request timed out, I watched it using a profiler and saw that the optimize thread continued executing. Eventually it completed, but in the background I still see a thread performing a merge:

Lucene Merge Thread #0 [RUNNABLE, IN_NATIVE] CPU time: 17:51
java.io.RandomAccessFile.readBytes(byte[], int, int)
java.io.RandomAccessFile.read(byte[], int, int)
org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], int, int)
org.apache.lucene.store.BufferedIndexInput.refill()
org.apache.lucene.store.BufferedIndexInput.readByte()
org.apache.lucene.store.IndexInput.readVInt()
org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos)
org.apache.lucene.index.SegmentTermEnum.next()
org.apache.lucene.index.SegmentMergeInfo.next()
org.apache.lucene.index.SegmentMerger.mergeTermInfos(FormatPostingsFieldsConsumer)
org.apache.lucene.index.SegmentMerger.mergeTerms()
org.apache.lucene.index.SegmentMerger.merge(boolean)
org.apache.lucene.index.IndexWriter.mergeMiddle(MergePolicy$OneMerge)
org.apache.lucene.index.IndexWriter.merge(MergePolicy$OneMerge)
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(MergePolicy$OneMerge)
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run()

This has taken quite a while, and hasn't really been fully utilizing the machine's resources. After looking at the Lucene source, I noticed that you can set a MaxThreadCount parameter in this class. Is this parameter exposed by Solr somehow? I see the class mentioned, commented out, in my solrconfig.xml, but I'm not sure of the correct way to specify the parameter:

Also, if I can specify this parameter, is it safe to just start/stop my servlet server (Tomcat) mid-merge?

Thanks in advance,
Gio.
Re: Lucene Merge Threads
Try this in solrconfig.xml: 1 Yes you can stop the process mid-merge. The partially merged files will be deleted on restart. We need to update the wiki? On Mon, Oct 12, 2009 at 4:05 PM, Giovanni Fernandez-Kincade wrote: > Hi, > I'm attempting to optimize a pretty large index, and even though the optimize > request timed out, I watched it using a profiler and saw that the optimize > thread continued executing. Eventually it completed, but in the background I > still see a thread performing a merge: > > Lucene Merge Thread #0 [RUNNABLE, IN_NATIVE] CPU time: 17:51 > java.io.RandomAccessFile.readBytes(byte[], int, int) > java.io.RandomAccessFile.read(byte[], int, int) > org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], > int, int) > org.apache.lucene.store.BufferedIndexInput.refill() > org.apache.lucene.store.BufferedIndexInput.readByte() > org.apache.lucene.store.IndexInput.readVInt() > org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos) > org.apache.lucene.index.SegmentTermEnum.next() > org.apache.lucene.index.SegmentMergeInfo.next() > org.apache.lucene.index.SegmentMerger.mergeTermInfos(FormatPostingsFieldsConsumer) > org.apache.lucene.index.SegmentMerger.mergeTerms() > org.apache.lucene.index.SegmentMerger.merge(boolean) > org.apache.lucene.index.IndexWriter.mergeMiddle(MergePolicy$OneMerge) > org.apache.lucene.index.IndexWriter.merge(MergePolicy$OneMerge) > org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(MergePolicy$OneMerge) > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run() > > > This has taken quite a while, and hasn't really been fully utilizing the > machine's resources. After looking at the Lucene source, I noticed that you > can set a MaxThreadCount parameter in this class. Is this parameter exposed > by Solr somehow? 
I see the class mentioned, commented out, in my > solrconfig.xml, but I'm not sure of the correct way to specify the parameter: > > > > > > > Also, if I can specify this parameter, is it safe to just start/stop my > servlet server (Tomcat) mid-merge? > > Thanks in advance, > Gio. >
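The XML of Jason's suggested setting was stripped by the archive; only the value "1" survived. A reconstruction sketch of what the solrconfig.xml entry likely looked like, assuming the Solr 1.4-era syntax for passing arguments to the merge scheduler (the element names here are an educated guess, not the poster's actual config):

```xml
<!-- Sketch, not the original message's XML (stripped by the archive).
     Assumes the newer syntax where args are passed to the scheduler class. -->
<indexDefaults>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxThreadCount">1</int>
  </mergeScheduler>
</indexDefaults>
```

As the later reply in this thread notes, this arg-passing syntax was added recently at the time, so it only works on a sufficiently new Solr build.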
RE: Lucene Merge Threads
Do you have to make a new call to optimize to make it start the merge again? -Original Message- From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com] Sent: Monday, October 12, 2009 7:28 PM To: solr-user@lucene.apache.org Subject: Re: Lucene Merge Threads Try this in solrconfig.xml: 1 Yes you can stop the process mid-merge. The partially merged files will be deleted on restart. We need to update the wiki? On Mon, Oct 12, 2009 at 4:05 PM, Giovanni Fernandez-Kincade wrote: > Hi, > I'm attempting to optimize a pretty large index, and even though the optimize > request timed out, I watched it using a profiler and saw that the optimize > thread continued executing. Eventually it completed, but in the background I > still see a thread performing a merge: > > Lucene Merge Thread #0 [RUNNABLE, IN_NATIVE] CPU time: 17:51 > java.io.RandomAccessFile.readBytes(byte[], int, int) > java.io.RandomAccessFile.read(byte[], int, int) > org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.readInternal(byte[], > int, int) > org.apache.lucene.store.BufferedIndexInput.refill() > org.apache.lucene.store.BufferedIndexInput.readByte() > org.apache.lucene.store.IndexInput.readVInt() > org.apache.lucene.index.TermBuffer.read(IndexInput, FieldInfos) > org.apache.lucene.index.SegmentTermEnum.next() > org.apache.lucene.index.SegmentMergeInfo.next() > org.apache.lucene.index.SegmentMerger.mergeTermInfos(FormatPostingsFieldsConsumer) > org.apache.lucene.index.SegmentMerger.mergeTerms() > org.apache.lucene.index.SegmentMerger.merge(boolean) > org.apache.lucene.index.IndexWriter.mergeMiddle(MergePolicy$OneMerge) > org.apache.lucene.index.IndexWriter.merge(MergePolicy$OneMerge) > org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(MergePolicy$OneMerge) > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run() > > > This has taken quite a while, and hasn't really been fully utilizing the > machine's resources. 
After looking at the Lucene source, I noticed that you > can set a MaxThreadCount parameter in this class. Is this parameter exposed > by Solr somehow? I see the class mentioned, commented out, in my > solrconfig.xml, but I'm not sure of the correct way to specify the parameter: > > > > > > > Also, if I can specify this parameter, is it safe to just start/stop my > servlet server (Tomcat) mid-merge? > > Thanks in advance, > Gio. >
Re: two facet.prefix on one facet field in a single query
It looks like there is a JIRA covering this: https://issues.apache.org/jira/browse/SOLR-1387 On Mon, Oct 12, 2009 at 11:00 AM, Bill Au wrote: > Is it possible to have two different facet.prefix on the same facet field > in a single query. I wan to get facet counts for two prefix, "xx" and > "yy". I tried using two facet.prefix (ie &facet.prefix=xx&facet.prefix=yy) > but the second one seems to have no effect. > > Bill >
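Until SOLR-1387 provides multiple facet.prefix support in one request, a common workaround is simply to issue one request per prefix and merge the counts client-side. A sketch (host, core, and field name hypothetical):

```text
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=cat&facet.prefix=xx
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=cat&facet.prefix=yy
```

Each request returns only the facet values starting with its own prefix; the two responses can then be combined by the client.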
XSLT Response for multivalue fields
I am having trouble generating the xsl file for multivalue entries. I'm not sure if I'm missing something, or if this is how it is supposed to function. I have two authors and I'd like to have separate ByLine nodes in my translation. Here is what solr returns normally ... Crista Souza Darrell Dunn Here is my xsl And here is what it is returning: Crista SouzaDarrell Dunn I was expecting it to return Crista Souza Darrell Dunn I've tried other variations and using templates instead but it keeps displaying the same thing, one ByLine field with things mushed together. Any clues if this is an issue with the xslt code, the xslt response writer, XALAN, or solr? I've no clue where to go from here. Any ideas to point me in the right direction are appreciated.
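The "mushed together" output described above is standard XSLT 1.0 behavior: applying xsl:value-of to the whole multivalued element concatenates the text of all its children into one string. The usual fix is to loop over the values instead. A sketch, assuming Solr's standard XML response format where a multivalued stored field comes back as `<arr name="..."><str>...</str></arr>` (the field name "author" is hypothetical):

```xml
<!-- Sketch: emit one ByLine element per value of a multivalued field.
     Loops over each <str> child instead of taking the string value of <arr>. -->
<xsl:for-each select="arr[@name='author']/str">
  <ByLine><xsl:value-of select="."/></ByLine>
</xsl:for-each>
```

This works in XALAN (the processor Solr's XSLTResponseWriter uses) as well as any other XSLT 1.0 engine, so the issue is in the stylesheet rather than in Solr.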
Re: Solr 1.4 Release Party
It is my email signature. It is a sort of hybrid/mashup from different sources. On Mon, Oct 12, 2009 at 6:49 PM, Michael Masters wrote: > Where does the quote come from :) > > On Sat, Oct 10, 2009 at 6:38 AM, Israel Ekpo wrote: > > I can't wait... > > > > -- > > "Good Enough" is not good enough. > > To give anything less than your best is to sacrifice the gift. > > Quality First. Measure Twice. Cut Once. > > > -- "Good Enough" is not good enough. To give anything less than your best is to sacrifice the gift. Quality First. Measure Twice. Cut Once.
Re: Boosting of words
Hi Nicholas, Thanks for your input. Where exactly should the query q=product:red color:red^10 be used and defined? Help me. Regards Bhaskar --- On Mon, 10/12/09, Nicholas Clark wrote: From: Nicholas Clark Subject: Re: Boosting of words To: solr-user@lucene.apache.org Date: Monday, October 12, 2009, 2:13 PM The easiest way to boost your query is to modify your query string. q=product:red color:red^10 In the above example, I have boosted the color field. If "red" is found in that field, it will get a boost of 10. If it is only found in the product field, then there will be no boost. Here's more information: http://wiki.apache.org/solr/SolrRelevancyCookbook#Boosting_Ranking_Terms Once you're comfortable with that, I suggest that you look into using the DisMax request handler. It will allow you to easily search across multiple fields with custom boost values. http://wiki.apache.org/solr/DisMaxRequestHandler -Nick On Sun, Oct 11, 2009 at 12:26 PM, bhaskar chandrasekar wrote: > Hi, > > I would like to know how can i give boosting to search input in Solr. > Where exactly should i make the changes?. > > Regards > Bhaskar > > >
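To Bhaskar's follow-up: a boost like color:red^10 is not defined anywhere in configuration; it is simply part of the q parameter of the search request URL. With dismax, as Nick suggests, the per-field boosts move into the handler configuration instead, so clients can send plain queries. A hedged sketch of such a solrconfig.xml entry (handler and field names hypothetical):

```xml
<!-- Sketch of a dismax handler where matches in "color" are boosted 10x
     relative to "product"; adjust field names to your schema. -->
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">product^1.0 color^10.0</str>
  </lst>
</requestHandler>
```

With this in place a request like ?q=red searches both fields with the configured boosts, instead of the client building q=product:red color:red^10 by hand.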
RE: Lucene Merge Threads
This didn't end up working. I got the following error when I tried to commit:

Oct 12, 2009 8:36:42 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error loading class ' 5 '
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:310)
        at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:325)
        at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:81)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:178)
        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:123)
        at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:172)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:400)
        at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:168)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
        at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
        at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.ClassNotFoundException: 5
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.security.AccessController.$$YJP$$doPrivileged(Native Method)
        at java.security.AccessController.doPrivileged(Unknown Source)
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at java.net.FactoryURLClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClassInternal(Unknown Source)
        at java.lang.Class.$$YJP$$forName0(Native Method)
        at java.lang.Class.forName0(Unknown Source)
        at java.lang.Class.forName(Unknown Source)
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:294)
        ... 28 more

I believe it's because maxThreadCount is not a public property of the ConcurrentMergeScheduler class. You have to call this method to set it:

    public void setMaxThreadCount(int count) {
      if (count < 1)
        throw new IllegalArgumentException("count should be at least 1");
      maxThreadCount = count;
    }

Is that possible through the solrconfig?

Thanks,
Gio.
-Original Message- From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com] Sent: Monday, October 12, 2009 7:53 PM To: solr-user@lucene.apache.org Subject: RE: Lucene Merge Threads Do you have to make a new call to optimize to make it start the merge again? -Original Message- From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com] Sent: Monday, October 12, 2009 7:28 PM To: solr-user@lucene.apache.org Subject: Re: Lucene Merge Threads Try this in solrconfig.xml: 1 Yes you can stop the process mid-merge. The partially merged files will be deleted on restart. We need to update the wiki? On Mon, Oct 12, 2009 at 4:05 PM, Giovanni Fernandez-Kincade wrote: > Hi, > I'm attempting to optimize a pretty large index, and even though the optimize > request timed out, I watched it using a pro
SpellCheck Index not building
Hi, I am using Solr 1.3 for spell checking. I am facing a strange problem of the spell checking index not being generated. When I have fewer documents indexed (less than 1000) the spell check index builds, but when there are more documents (around 40K), the index for spell checking does not build. I can see the directory for the spell checking build and there are two files under it: segments_3 & segments.gen I am using the following query to build the spell checking index: /select params={spellcheck=true&start=0&qt=contentsearch&wt=xml&rows=0&spellcheck.build=true&version=2.2 In the logs I see: INFO: [] webapp=/solr path=/select params={spellcheck=true&start=0&qt=contentsearch&wt=xml&rows=0&spellcheck.build=true&version=2.2} hits=37467 status=0 QTime=44 Please help me solve this problem. Here is my configuration: *schema.xml:* *solrconfig.xml:* dismax false false 5 true jarowinkler spellcheck textSpell a_spell a_spell ./spellchecker_a_spell 0.7 jarowinkler a_spell org.apache.lucene.search.spell.JaroWinklerDistance ./spellchecker_a_spell 0.7 -- Thanks Varun Gupta
Re: SpellCheck Index not building
On Tue, Oct 13, 2009 at 8:36 AM, Varun Gupta wrote: > Hi, > > I am using Solr 1.3 for spell checking. I am facing a strange problem of > spell checking index not been generated. When I have less number of > documents (less than 1000) indexed then the spell check index builds, but > when the documents are more (around 40K), then the index for spell checking > does not build. I can see the directory for spell checking build and there > are two files under it: segments_3 & segments.gen > > It seems that you might be running out of memory with a larger index. Can you check the logs to see if it has any exceptions recorded? -- Regards, Shalin Shekhar Mangar.
Re: SpellCheck Index not building
No, there are no exceptions in the logs. -- Thanks Varun Gupta On Tue, Oct 13, 2009 at 8:46 AM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Tue, Oct 13, 2009 at 8:36 AM, Varun Gupta > wrote: > > > Hi, > > > > I am using Solr 1.3 for spell checking. I am facing a strange problem of > > spell checking index not been generated. When I have less number of > > documents (less than 1000) indexed then the spell check index builds, but > > when the documents are more (around 40K), then the index for spell > checking > > does not build. I can see the directory for spell checking build and > there > > are two files under it: segments_3 & segments.gen > > > > > It seems that you might be running out of memory with a larger index. Can > you check the logs to see if it has any exceptions recorded? > > -- > Regards, > Shalin Shekhar Mangar. >
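The XML configuration in Varun's original message was stripped by the archive; only loose tokens survived (field a_spell, JaroWinklerDistance, index path ./spellchecker_a_spell, accuracy 0.7). From those fragments, a Solr 1.3 spellcheck component of roughly this shape was probably intended — this is a hedged reconstruction for context, not the poster's actual file:

```xml
<!-- Reconstruction sketch from fragments in the original message;
     element layout assumed from the standard 1.3 SpellCheckComponent config. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpell</str>
  <lst name="spellchecker">
    <str name="name">jarowinkler</str>
    <str name="field">a_spell</str>
    <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
    <str name="spellcheckIndexDir">./spellchecker_a_spell</str>
    <str name="accuracy">0.7</str>
  </lst>
</searchComponent>
```

A segments_3/segments.gen pair with no term files is consistent with a spellcheck index that was created but never populated, which is why the build step is the suspect here.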
Re: doing searches from within an UpdateRequestProcessor
A custom UpdateRequestProcessor is the solution. You can access the searcher in an UpdateRequestProcessor. On Tue, Oct 13, 2009 at 4:20 AM, Bill Au wrote: > Is it possible to do searches from within an UpdateRequestProcessor? The > documents in my index reference each other. When a document is deleted, I > would like to update all documents containing a reference to the deleted > document. My initial idea is to use a custom UpdateRequestProcessor. Is > there a better way to do this? > Bill > -- - Noble Paul | Principal Engineer| AOL | http://aol.com
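A sketch of what this suggestion might look like, assuming Solr 1.3/1.4-era APIs. The class and field names are hypothetical, and this will not compile without the Solr jars on the classpath:

```java
// Hedged sketch: search from inside an UpdateRequestProcessor on delete.
// "ref_id" is a hypothetical field holding references to other documents.
import java.io.IOException;

import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.SolrIndexSearcher;
import org.apache.solr.update.DeleteUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;

public class FixupReferencesProcessor extends UpdateRequestProcessor {
  private final SolrQueryRequest req;

  public FixupReferencesProcessor(SolrQueryRequest req, UpdateRequestProcessor next) {
    super(next);
    this.req = req;
  }

  @Override
  public void processDelete(DeleteUpdateCommand cmd) throws IOException {
    // The request gives access to the currently registered searcher.
    SolrIndexSearcher searcher = req.getSearcher();
    // ... query the searcher for documents whose ref_id matches the
    // deleted document's id, then re-index those documents without the
    // stale reference (details omitted).
    super.processDelete(cmd);
  }
}
```

One caveat worth noting: the searcher obtained this way reflects the last commit, so documents added or deleted earlier in the same uncommitted batch will not be visible to the query.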
Re: search by some functionality
: Maybe I'm missing something, but function queries aren't involved in : determining whether a document matches or not, only its score. How is a : custom function / value-source going to filter? it's not ... i didn't realize that was the context of the question, i was just answering the specific question about how to create custom functions. -Hoss
Re: Weird Facet and KeywordTokenizerFactory Issue
: I had to be brief as my facets are in the order of 100K over 800K documents : and also if I give the complete schema.xml I was afraid nobody would read my : long message :-) ..Hence I showed only relevant pieces of the result showing : different fields having same problem relevant is good, but you have to provide a consistent picture from start to finish ... you don't need to show 1,000 lines of facet field output, but you at least need to show the field names. ...have you used analysis.jsp to see what terms that analyzer produces based on the strings you are indexing for your documents? because combined with synonyms like this... : New York, N.Y., NY => New York ...it doesn't surprise me that you're getting "New" as an indexed term. By default SynonymFilter uses whitespace to delimit tokens in multi-token synonyms, so for some input like "NY" you should see it produce the tokens "New" and "York". you can use the tokenizerFactory attribute on SynonymFilterFactory to specify a TokenizerFactory class to use when parsing synonyms.txt -Hoss
Re: Question about PatternReplace filter and automatic Synonym generation
: There is a Solr.PatternTokenizerFactory class which likely fits the bill in : this case. The related question I have is this - is it possible to have : multiple Tokenizers in your analysis chain? No .. Tokenizers consume CharReaders and produce a TokenStream ... what's needed here is a TokenFilter that consumes a TokenStream and produces a TokenStream -Hoss
Re: De-basing / re-basing docIDs, or how to effectively pass calculated values from a Scorer or Filter up to (Solr's) QueryComponent.process
: In the code I'm working with, I generate a cache of calculated values as a : by-product within a Filter.getDocidSet implementation (and within a Query-ized : version of the filter and its Scorer method) . These values are keyed off the : IndexReader's docID values, since that's all that's accessible at that level. : Ultimately, however, I need to be able to access these values much higher up : in the stack (Solr's QueryComponent.process method), so that I can inject the my suggestion would be to change your Filter to use the FieldCache to look up the uniqueKey for your docid, and base your cache off that ... then other uses of your cache (higher up the chain) will have an id that makes sense outside the context of a segment reader. -Hoss
Re: DIH and EmbeddedSolr
Hey Any reason why it may be happening ?? Regards Rohan On Sun, Oct 11, 2009 at 9:25 PM, rohan rai wrote: > > Small data set.. > > > > 11 > 11 > 11 > > > 22 > 22 > 22 > > > 33 > 33 > 33 > > > > data-config > > > > forEach="/root/test/" > url="/home/test/test_data.xml" > > > > > > > > > > schema > > > > omitNorms="true"/> > > > >multiValued="false" required="true"/> >multiValued="false" /> >multiValued="false" /> > > > id > > name > > > > > Sometime it creates sometimes it gives thread pool exception. It does not > consistently creates the index. > > Regards > Rohan > > > On Sun, Oct 11, 2009 at 3:56 PM, Shalin Shekhar Mangar < > shalinman...@gmail.com> wrote: > >> On Sat, Oct 10, 2009 at 7:44 PM, rohan rai wrote: >> >> > This is pretty unstable...anyone has any clue...Sometimes it even >> creates >> > index, sometimes it does not ?? >> > >> > >> Most DataImportHandler tests run Solr in an embedded-like mode and they >> run >> fine. Can you tell us which version of Solr are you using? Also, any data >> which can help us reproduce the problem would be nice. >> >> -- >> Regards, >> Shalin Shekhar Mangar. >> > >
Re: Lucene Merge Threads
which version of Solr are you using? the 1 syntax was added recently On Tue, Oct 13, 2009 at 8:08 AM, Giovanni Fernandez-Kincade wrote: > This didn't end up working. I got the following error when I tried to commit: > > Oct 12, 2009 8:36:42 PM org.apache.solr.common.SolrException log > SEVERE: org.apache.solr.common.SolrException: Error loading class ' > 5 > ' > at > org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:310) > at > org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:325) > at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:81) > at > org.apache.solr.update.SolrIndexWriter.(SolrIndexWriter.java:178) > at > org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:123) > at > org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:172) > at > org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:400) > at > org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85) > at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:168) > at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69) > at > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) > at > org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) > at > org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) > at > org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213) > at > 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172) > at > org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) > at > org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) > at > org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108) > at > org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174) > at > org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875) > at > org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665) > at > org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528) > at > org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81) > at > org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689) > at java.lang.Thread.run(Unknown Source) > Caused by: java.lang.ClassNotFoundException: > 5 > > at java.net.URLClassLoader$1.run(Unknown Source) > at java.security.AccessController.$$YJP$$doPrivileged(Native Method) > at java.security.AccessController.doPrivileged(Unknown Source) > at java.net.URLClassLoader.findClass(Unknown Source) > at java.lang.ClassLoader.loadClass(Unknown Source) > at java.net.FactoryURLClassLoader.loadClass(Unknown Source) > at java.lang.ClassLoader.loadClass(Unknown Source) > at java.lang.ClassLoader.loadClassInternal(Unknown Source) > at java.lang.Class.$$YJP$$forName0(Native Method) > at java.lang.Class.forName0(Unknown Source) > at java.lang.Class.forName(Unknown Source) > at > org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:294) > ... 28 more > > > I believe it's because the MaxThreadCount is not a public property of the > ConcurrentMergeSchedulerClass. 
You have to call this method to set it: > > public void setMaxThreadCount(int count) { > if (count < 1) > throw new IllegalArgumentException("count should be at least 1"); > maxThreadCount = count; > } > > Is that possible through the solrconfig? > > Thanks, > Gio. > > -Original Message- > From: Giovanni Fernandez-Kincade [mailto:gfernandez-kinc...@capitaliq.com] > Sent: Monday, October 12, 2009 7:53 PM > To: solr-user@lucene.apache.org > Subject: RE: Lucene Merge Threads > > Do you have to make a new call to optimize to make it start the merge again? > > -Original Message- > From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com] > Sent: Monday, October 12, 2009 7:28 PM > To: solr-user@lucene.apache.org > Subject: Re: Lucene Merge Threads > > Try this in solrconfig.xml: > > > 1 > > > Yes you can stop the process mid-mer