Re: Indexing new documents - Real time or hourly?

2012-05-13 Thread Shawn Heisey
On 5/13/2012 3:49 PM, Spadez wrote: So, for the XML it's easy, just update this index once a day. For the database, should I have it incrementally reindexed into SOLR in real time, or do it every hour or two? Realtime is better, but I don't know how much strain this would put on my server. If its

Re: Index Corruption

2012-05-13 Thread Lance Norskog
"Index corruption" usually means data structure problems. There is a Lucene program 'org.apache.lucene.index.CheckIndex' in the lucene core jar. If there is a problem with the data structures, this program will find it: java -cp lucene-core-XX.jar org.apache.lucene.index.CheckIndex /index/data
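A minimal sketch of scripting the command quoted above, in case the check needs to run from a maintenance job. The function names are hypothetical, and the jar name and index path are the placeholders from the message, not real locations; substitute the actual Lucene core jar and index directory.

```python
import subprocess
from typing import List


def check_index_command(lucene_core_jar: str, index_dir: str) -> List[str]:
    """Build the argument list for the CheckIndex invocation quoted in the message.

    Both arguments are placeholders supplied by the caller (assumption: the
    jar is the Lucene core jar that matches the index's Lucene version).
    """
    return [
        "java",
        "-cp",
        lucene_core_jar,
        "org.apache.lucene.index.CheckIndex",
        index_dir,
    ]


def run_check_index(lucene_core_jar: str, index_dir: str) -> int:
    """Run CheckIndex and return the process exit code.

    Requires a `java` executable on PATH; output goes to the console.
    """
    result = subprocess.run(check_index_command(lucene_core_jar, index_dir), check=False)
    return result.returncode
```

Building the argument list separately from running it keeps the command easy to log or inspect before it touches a production index.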

Re: Solr Shards multi core slower then single big core

2012-05-13 Thread arjit
Thanks Erick for the reply. I have 6 cores which don't contain duplicated data; every core has some unique data. What I thought was that a read would query the 6 cores in parallel, join the results, and return them, and that this would be more efficient than reading one big core. My question is wouldn't S

Re: How add custom field to Nutch1.4?

2012-05-13 Thread forwardswing
Who can help me? -- View this message in context: http://lucene.472066.n3.nabble.com/How-add-custom-field-to-Nutch1-4-tp3983549p3983597.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Documents With large number of fields

2012-05-13 Thread Jack Krupansky
I didn't see any response. There was a similar issue recently, where someone had 400 faceted fields with 50-70 facets per query and they were running out of memory due to accumulation of the FieldCache for these faceted fields, but that was on a 3 GB system. It probably could be done, assuming

Re: highlighting unstored externally provided fields solr 1397

2012-05-13 Thread Jamie Johnson
Very true. I'll put together a patch in the next few days and we can start from there. On Sunday, May 13, 2012, Otis Gospodnetic wrote: > 10 votes and 13 watches sounds like solid community interest to me! > > Otis > > Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.co

Re: Suddenly OOM

2012-05-13 Thread Otis Gospodnetic
Hi Jasper, Hm, not sure what it could be without a closer inspection.  If you facet or sort, those two operations can use lots of memory.  Check your Solr caches, make sure they don't have crazy high values.  Consider upgrading to Solr 3.6, it uses less memory than previous versions of Solr.  C

Re: highlighting unstored externally provided fields solr 1397

2012-05-13 Thread Otis Gospodnetic
10 votes and 13 watches sounds like solid community interest to me! Otis  Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm  > > From: Jamie Johnson >To: "solr-user@lucene.apache.org" >Sent: Saturday, May 12, 2012 11:33 PM

Re: Documents With large number of fields

2012-05-13 Thread Darren Govoni
Was there a response to this? On Fri, 2012-05-04 at 10:27 -0400, Keswani, Nitin - BLS CTR wrote: > Hi, > > My data model consists of different types of data. Each data type has its own > characteristics. > > If I include the unique characteristics of each type of data, my single Solr > Document

Index Corruption

2012-05-13 Thread shubham
We have had a problem in the last couple of days, when a particular Solr master was restarted while an import was running. This led to the corruption of some document entities, which had multiple docs with the same unique id, etc. Is this kind of corruption possible? By now I expected that Solr index

Indexing new documents - Real time or hourly?

2012-05-13 Thread Spadez
Hi, I have the following situation. I have two feeds into my SOLR: 1. An XML file containing around 30,000 entries, likely to change significantly each day, but changing only once every 24 hours. 2. A SQL database containing around 30 entries, with only 1-2 new entries each day, but important to be inde

Re: Indexing Wikipedia with Solr/Lucene

2012-05-13 Thread András Bártházi
Hi, How about using the RegexTransformer? I guess you can write a regular expression for the Wikipedia text field to extract the category and external links. Bye, Andras 2012/5/13 vineet yadav > Hi all, > I want to create Lucene/Solr index of
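To make the suggestion above concrete, here is a rough sketch of the kind of patterns involved, written in Python only so they can be tried out quickly. It assumes the usual wikitext conventions (`[[Category:Name]]`, optionally with a `|sortkey`, and `[http://... label]` for external links); the function and constant names are hypothetical, and the patterns would still need to be checked against Java regex syntax and real dump data before going into a RegexTransformer configuration.

```python
import re
from typing import Dict, List

# Assumed wikitext form: [[Category:Name]] or [[Category:Name|sortkey]].
CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)(?:\|[^\]]*)?\]\]")

# Assumed wikitext form: [http://example.org optional label].
# The lookbehind skips internal [[...]] links.
EXTERNAL_LINK_RE = re.compile(r"(?<!\[)\[(https?://[^\s\]]+)[^\]]*\]")


def extract_wiki_fields(wikitext: str) -> Dict[str, List[str]]:
    """Pull category names and external link URLs out of raw wikitext."""
    categories = [name.strip() for name in CATEGORY_RE.findall(wikitext)]
    external_links = EXTERNAL_LINK_RE.findall(wikitext)
    return {"categories": categories, "external_links": external_links}


if __name__ == "__main__":
    sample = (
        "Solr is a search server. [http://lucene.apache.org/solr/ Official site]\n"
        "[[Category:Search engines]]\n"
        "[[Category:Apache Software Foundation|Solr]]"
    )
    print(extract_wiki_fields(sample))
```

Bare URLs that are not wrapped in square brackets are deliberately ignored here; whether those count as external links is a decision for the index design.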

Fwd: Indexing Wikipedia with Solr/Lucene

2012-05-13 Thread vineet yadav
Hi all, I want to create a Lucene/Solr index of the Wikipedia XML dump. I used the Solr example (http://wiki.apache.org/solr/DataImportHandler#Example:_Indexing_wikipedia) to index the Wikipedia XML dump. Since, in Wikipedia, category and external links are part of the Wikipedia text, I am not able to index category a

How add custom field to Nutch1.4?

2012-05-13 Thread forwardswing
I am using Solr 1.4 integrated with Nutch 1.4. When Nutch sends data to Solr for indexing, how can I add some custom fields? -- View this message in context: http://lucene.472066.n3.nabble.com/How-add-custom-field-to-Nutch1-4-tp3983549.html

Re: Merging two DocSets in solr

2012-05-13 Thread Dmitry Kan
Are you operating inside the SOLR source code or on the (solrj) client side? On Fri, May 11, 2012 at 12:46 PM, Ramprakash Ramamoorthy < youngestachie...@gmail.com> wrote: > Dear all, > > I get two different DocSets from two different searchers. I need > to merge them into one and get the fac