Re: Solr & JVM performance issue after 2 days

2010-12-12 Thread Hamid Vahedi
Hi, thanks for the suggestion. I made the following changes in solrconfig.xml: 256 false 1 2000 30 simple. After that, I see one server works fine (it includes 3 cores for 3 languages), but another server (3 cores for 3 other languages) has a problem after 52 hours. I will pla

Re: Solr & JVM performance issue after 2 days

2010-12-12 Thread Erick Erickson
Several things: 1> Your ramBufferSizeMB is probably too large. 128M is often the point of diminishing returns. Your situation may be different... 2> Your logs will show you what is happening with your autocommit properties. If you're really sending 200 docs/second to your index your co

Re: Solr & JVM performance issue after 2 days

2010-12-12 Thread Hamid Vahedi
Dear Erick, thanks for the advice. Index size on all cores is 35 GB for 35 million docs (3 weeks of indexing data). Kind Regards, Hamid From: Erick Erickson To: solr-user@lucene.apache.org Sent: Sun, December 12, 2010 5:24:18 PM Subject: Re: Solr & JVM performance
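
A quick back-of-the-envelope check of the figures quoted above (35 GB, 35 million docs, 3 weeks). This is only a sketch: it assumes decimal gigabytes and a perfectly even indexing rate, neither of which is stated in the thread.

```python
# Figures taken from the message above.
# Assumption: "GB" means 10**9 bytes and indexing ran evenly for 3 weeks.
index_bytes = 35 * 10**9
doc_count = 35_000_000
indexing_seconds = 3 * 7 * 24 * 60 * 60  # three weeks in seconds

bytes_per_doc = index_bytes / doc_count
docs_per_second = doc_count / indexing_seconds

print(f"average index size per document: {bytes_per_doc:.0f} bytes")
print(f"average indexing rate: {docs_per_second:.1f} docs/second")
```

Under these assumptions the average comes out near 1 KB per document and somewhat under 20 documents per second, so a rate of 200 docs/second would have to be a peak rather than the three-week average.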

Re: SOLR geospatial

2010-12-12 Thread Adam Estrada
I am particularly interested in storing and querying polygons. That sort of thing looks like it's on their roadmap, so does anyone know what the status is on that? Also, integration with JTS would make this a core component of any GIS. Again, anyone know what the status is on that? *What’s on the ro

Re: SOLR geospatial

2010-12-12 Thread Erick Erickson
By and large, spatial solr is being replaced by geospatial, see: http://wiki.apache.org/solr/SpatialSearch. I don't think the old spatial contrib is still included in the trunk or 3.x code bases, but I could be wrong. That said, I don't know whether what you want is on the roadmap there either.

Re: SOLR geospatial

2010-12-12 Thread Dennis Gearon
We're in Alpha, heading to Alpha 2. Our requirements are simple: radius searching, and distance from center. Solr Spatial works and is current. GeoSpatial is almost there, but we're going to wait until it's released to spend time with it. We have other tasks to work on and don't want to be part
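
For readers unfamiliar with the two requirements named above (radius searching and distance from center), here is a minimal sketch of the underlying arithmetic using the haversine formula. It is not how Solr Spatial or GeoSpatial implement it; the function names, the Earth-radius constant, and the sample points are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(center, points, radius_km):
    """Return (distance_km, point) pairs inside the radius, nearest first."""
    hits = [(haversine_km(center[0], center[1], p[0], p[1]), p) for p in points]
    return sorted((d, p) for d, p in hits if d <= radius_km)


# Hypothetical sample data: one point close to the center, one far away.
center = (0.0, 0.0)
candidates = [(0.0, 1.0), (0.0, 0.01)]
for distance, point in within_radius(center, candidates, radius_km=3):
    print(f"{point} is {distance:.2f} km from the center")
```

A search engine would normally narrow the candidates with an index before computing exact distances; the brute-force loop here only shows the filter and the sort.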

boosting, both query time and other

2010-12-12 Thread Dennis Gearon
So, our main search results have some very common fields: 'title', 'tags', 'description'. What kind of boosting has everybody been using that makes them and their customers happy with these kinds of fields? What are the pros and cons of query time boosting versus configured boosting? Dennis Gearon

Very high load after replicating

2010-12-12 Thread Mark
After replicating an index of around 20g my slaves experience very high load (50+!!) Is there anything I can do to alleviate this problem? Would solr cloud be of any help? thanks

Re: Very high load after replicating

2010-12-12 Thread Markus Jelsma
There can be numerous explanations such as your configuration (cache warm queries, merge factor, replication events etc) but also I/O having trouble flushing everything to disk. It could also be a memory problem, the OS might start swapping if you allocate too much RAM to the JVM leaving little

Re: Using synonyms in combination with facets

2010-12-12 Thread kirchheimer
Thanks, this is exactly the type of solution I need. -- View this message in context: http://lucene.472066.n3.nabble.com/Using-synonyms-in-combination-with-facets-tp1968584p2074692.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: full text search in multiple fields

2010-12-12 Thread PeterKerk
I went for the * operator, and it works now! Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2075140.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: boosting, both query time and other

2010-12-12 Thread Erick Erickson
Basically that's unanswerable, you have to look at trying various choices with your corpus. Take a look at the defaults in the dismax request handler in the example schema for a place to start... And do be aware that the "correct" values may change as your corpus acquires more data. I'm not sure w

Re: SOLR geospatial

2010-12-12 Thread Adam Estrada
I would be more than happy to help with any of the spatial testing you are working on. adam On Sun, Dec 12, 2010 at 3:08 PM, Dennis Gearon wrote: > We're in Alpha, heading to Alpha 2. Our requirements are simple: radius > searching, and distance from center. Solr Spatial works and is current. >

Re: [Multiple] RSS Feeds at a time...

2010-12-12 Thread Adam Estrada
Hi Ahmet, This is a great idea but still does not appear to be working correctly. The idea is that I want to be able to add an RSS feed and then index that feed on a schedule. My C# method looks something like this. public ActionResult Index() { try { H

[pubDate] is not converting correctly

2010-12-12 Thread Adam Estrada
All, I am having some difficulties parsing the pubDate field that is part of the RSS spec (I believe). I get the warning that states, "Dec 12, 2010 6:45:26 PM org.apache.solr.handler.dataimport.DateFormatTransformer transformRow WARNING: Could not parse a Date field java.text.ParseException: Un

Re: [pubDate] is not converting correctly

2010-12-12 Thread Koji Sekiguchi
(10/12/13 8:49), Adam Estrada wrote: All, I am having some difficulties parsing the pubDate field that is part of the RSS spec (I believe). I get the warning that states, "Dec 12, 2010 6:45:26 PM org.apache.solr.handler.dataimport.DateFormatTransformer transformRow WARNING: Could not parse a

Which query parser and how to do full text on mulitple fields

2010-12-12 Thread Dennis Gearon
Which query parser did my partner set up below, and how do I parse three fields in the index for scoring and returning results? /solr/select?wt=json&indent=true&start=0&rows=20&q={!spatial%20lat=37.326375%20long=-121.892639%20radius=3%20unit=km%20threadCount=3}title:Art%20Loft Dennis Gearon
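
To make the request above easier to read, the following sketch decodes the query string and separates the leading {!...} block from the rest of the q parameter. It only inspects the URL; it does not talk to Solr, and it assumes the first token after "{!" is the name under which the parser is registered.

```python
from urllib.parse import parse_qs

# Query string copied from the message above.
query_string = (
    "wt=json&indent=true&start=0&rows=20"
    "&q={!spatial%20lat=37.326375%20long=-121.892639"
    "%20radius=3%20unit=km%20threadCount=3}title:Art%20Loft"
)

params = parse_qs(query_string)
q = params["q"][0]

# Split the leading {!...} block from the remainder of the query.
local_params, _, main_query = q.partition("}")
# Assumption: the first token after "{!" names the parser.
parser_name = local_params[2:].split()[0]

print("parser named in the {!...} block:", parser_name)
print("remaining query:", main_query)
```

In standard Lucene query syntax, title:Art Loft applies the field only to Art; whether the third-party parser behaves the same way is not stated in the thread.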

Re: full text search in multiple fields

2010-12-12 Thread Dennis Gearon
For those of us who come late to a thread, having at least the last post that you're replying to would help. Me at least ;-) Dennis Gearon Signature Warning It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others’ mistakes,

Re: Which query parser and how to do full text on mulitple fields

2010-12-12 Thread Pradeep Singh
You said you were using a third party plugin. What do you expect people here to know? Solr plugins don't have parameters lat, long, radius and threadCount (they have pt and dist). On Sun, Dec 12, 2010 at 4:47 PM, Dennis Gearon wrote: > Which query parser did my partner set up below, and how do I

Rebuild Spellchecker based on cron expression

2010-12-12 Thread Martin Grotzke
Hi, the spellchecker component already provides a buildOnCommit and buildOnOptimize option. Since we have several spellchecker indices building on each commit is not really what we want to do. Building on optimize is not possible as index optimization is done on the master and the slaves don't ev

Re: Which query parser and how to do full text on mulitple fields

2010-12-12 Thread Markus Jelsma
Pradeep is right, but check the solrconfig; the query parser is defined there. Look for the basedOn attribute in the queryParser element. > You said you were using a third party plugin. What do you expect people > here to know? Solr plugins don't have parameters lat, long, radius and > thread

Re: Which query parser and how to do full text on mulitple fields

2010-12-12 Thread Dennis Gearon
Well, I didn't think the plugin would be an issue. I thought the rest of the query was from the main query parser, and the plugin processes after that. So I thought the rest of the query AFTER the plugin/filter part was like normal, without the filter/plugin. Is that so? Using the plugi

Re: Which query parser and how to do full text on mulitple fields

2010-12-12 Thread Dennis Gearon
And to be more specific, the fields I want to combine for *full text* are just three text fields, they're not geospatial. Dennis Gearon

Re: Rebuild Spellchecker based on cron expression

2010-12-12 Thread Markus Jelsma
Maybe you've overlooked the build parameter? http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.build > Hi, > > the spellchecker component already provides a buildOnCommit and > buildOnOptimize option. > > Since we have several spellchecker indices building on each commit is > not really
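
The spellcheck.build parameter mentioned above is passed on an ordinary request. A minimal sketch of building such a URL follows; the host, core name, and the /select handler path are hypothetical, and it assumes the spellcheck component is attached to that handler.

```python
from urllib.parse import urlencode


def spellcheck_build_url(core_base_url, handler="/select", dictionary=None):
    """Build a request URL that asks the spellchecker to rebuild its index.

    core_base_url and handler are placeholders; the handler must have the
    spellcheck component configured for the request to have any effect.
    """
    params = {
        "q": "*:*",
        "rows": 0,
        "spellcheck": "true",
        "spellcheck.build": "true",
    }
    if dictionary is not None:
        # Selects one named dictionary when several are configured.
        params["spellcheck.dictionary"] = dictionary
    return f"{core_base_url.rstrip('/')}{handler}?{urlencode(params)}"


# Hypothetical slave host and core name.
print(spellcheck_build_url("http://slave1:8983/solr/core_en"))
```

Fetching that URL (for example from a scheduled job) triggers the rebuild on the core it addresses.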

Re: Which query parser and how to do full text on mulitple fields

2010-12-12 Thread Dennis Gearon
Oh, I didn't know that the syntax didn't show the parser used, that it was set in the config file. I'll talk to my partner, thanks. Dennis Gearon

Re: Which query parser and how to do full text on mulitple fields

2010-12-12 Thread Markus Jelsma
The manual answers most questions. > Oh, I didn't know that the syntax didn't show the parser used, that it was > set in the config file. > > I'll talk to my partner, thanks. > > Dennis Gearon > > > Signature Warning > > It is always a good idea to learn from your own mistake

Re: Rebuild Spellchecker based on cron expression

2010-12-12 Thread Martin Grotzke
On Mon, Dec 13, 2010 at 2:12 AM, Markus Jelsma wrote: > Maybe you've overlooked the build parameter? > http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.build I'm aware of this, but we don't want to maintain cron-jobs on all slaves for all spellcheckers for all cores. That's why I'm think

Re: [pubDate] is not converting correctly

2010-12-12 Thread Adam Estrada
Thanks for the feedback! There are quite a few formats that can be used. I am experiencing at least 5 of them. Would something like this work? Note that there are 2 different formats separated by a comma. I don't suppose it will because there is already a comma in the first parser. I guess I am
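
The general idea of coping with several date layouts is to try each pattern in turn and keep the first one that parses. The sketch below shows that idea with Python's datetime.strptime; it is not Solr's DateFormatTransformer, and the three patterns are illustrative guesses at common feed layouts, not the formats from the thread. It also assumes an English locale for day and month names.

```python
from datetime import datetime

# Candidate patterns, tried in order (hypothetical examples).
CANDIDATE_FORMATS = [
    "%a, %d %b %Y %H:%M:%S %z",  # e.g. Sun, 12 Dec 2010 18:45:26 +0000
    "%a, %d %b %Y %H:%M:%S %Z",  # e.g. Sun, 12 Dec 2010 18:45:26 GMT
    "%Y-%m-%dT%H:%M:%SZ",        # e.g. 2010-12-12T18:45:26Z
]


def parse_pub_date(text):
    """Return a datetime for the first matching pattern, or None."""
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # this pattern did not match; try the next one
    return None


for sample in ("Sun, 12 Dec 2010 18:45:26 +0000",
               "2010-12-12T18:45:26Z",
               "not a date"):
    print(repr(sample), "->", parse_pub_date(sample))
```

Returning None for unparseable input leaves it to the caller to decide whether to skip the row or log a warning.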

Re: [Multiple] RSS Feeds at a time...

2010-12-12 Thread Ahmet Arslan
> What else am I missing here because the reload-config > command does not seem > to be working. Any ideas would be great! solr/dataimport?command=reload-config should return the message Configuration Re-loaded sucessfully if everything went well. Maybe you can check that after each reload. May
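
The check suggested above (look for the success message after each reload) could be scripted roughly as follows. This is a sketch: the base URL is a placeholder, and the message text is copied with the spelling quoted in the reply, on the assumption that the server returns it verbatim.

```python
from urllib.request import urlopen

# Spelling as quoted in the message above.
SUCCESS_MESSAGE = "Configuration Re-loaded sucessfully"


def reload_url(core_base_url):
    """URL of the reload-config command for a given Solr base URL."""
    return core_base_url.rstrip("/") + "/dataimport?command=reload-config"


def reload_succeeded(response_body):
    """True if the response text contains the expected success message."""
    return SUCCESS_MESSAGE in response_body


def reload_config(core_base_url, timeout=10):
    """Call reload-config and report whether the success message came back."""
    with urlopen(reload_url(core_base_url), timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return reload_succeeded(body)


# Hypothetical base URL; only the URL is built here, nothing is fetched.
print(reload_url("http://localhost:8983/solr"))
```

Calling reload_config(...) after each configuration change and stopping when it returns False would surface a failed reload before the next import runs.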

Re: Rebuild Spellchecker based on cron expression

2010-12-12 Thread Erick Erickson
I'm shooting in the dark here, but according to this: http://wiki.apache.org/solr/SolrReplication after the slave pulls the index down, it issues a commit. So if your slave is configured to generate the dictionary on commit, will it "just happen"? But a

Search with facet.pivot

2010-12-12 Thread Anders Dam
Hi, I have a minor problem in getting the pivoting working correctly. The thing is that two otherwise equal search queries behave differently, namely one is returning the search result with the facet.pivot fields below and another is returning the search result with an empty facet.pivot. This is a

Re: [pubDate] is not converting correctly

2010-12-12 Thread Lance Norskog
Nice find! This is Apache 2.0, copyright SUN. O Great Apache Elders: Is it kosher to add this to the Solr distribution? It's not in the JDK and is also com.sun.* On Sun, Dec 12, 2010 at 5:33 PM, Adam Estrada wrote: > Thanks for the feedback! There are quite a few formats that can be used. I > a

PDFBOX 1.3.1 Parsing Error

2010-12-12 Thread pankaj bhatt
Hi all, while using PDFBOX 1.3.1 in APACHE TIKA 1.7 I am getting the following error when parsing a PDF document. *Error: Expected an integer type, actual='' " at org.apache.pdfbox.pdfparser.BaseParser.readInt* This error occurs because of SHA-256 Encryption used by Adobe Acro

Re: PDFBOX 1.3.1 Parsing Error

2010-12-12 Thread Pradeep Singh
If the document is encrypted maybe it isn't meant to be indexed and publicly visible after all? On Sun, Dec 12, 2010 at 10:22 PM, pankaj bhatt wrote: > hi All, >While using PDFBOX 1.3.1 in APACHE TIKA 1.7 i am getting the > following error to parse an PDF Document. > *Error: Expected

Re: Rebuild Spellchecker based on cron expression

2010-12-12 Thread Martin Grotzke
Hi, when thinking further about it, it's clear that https://issues.apache.org/jira/browse/SOLR-433 would be even better - we could generate the spellchecker indices on commit/optimize on the master and replicate them to all slaves. Just wondering what's the reason that this patch receives that