SnapPuller Exception in Slave server

2015-11-07 Thread sangeetha.subraman...@gtnexus.com
Hi All, I am using solr 4.5.1 with Master and Slave architecture. I am seeing the below exception in the Slave server SnapPuller Master at: not available. Index fetch failed. Exception: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: I don't

Re: data import extremely slow

2015-11-07 Thread Yangrui Guo
Thanks for your kind reply. I tried using both SqlEntityProcessor and setting batchSize to -1 but didn't get any improvement. It'd be helpful if I could see the data import handler's log. On Saturday, November 7, 2015, Alexandre Rafalovitch wrote: > LoL. Of course I meant SolrJ. I had to misspell the most

Re: Data import handler not indexing all data

2015-11-07 Thread Yangrui Guo
Yes the id is unique. If I only select distinct id,count(id) I get the same results. However I found this is more likely a MySQL issue. I created a new table called director1 and ran query "insert into director1 select * from director" I got only 287041 results inserted, which was the same as Solr.

Re: data import extremely slow

2015-11-07 Thread Alexandre Rafalovitch
LoL. Of course I meant SolrJ. I had to misspell the most important word of the hundreds I wrote in this thread :-) Thank you Erick for the correction. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 7 November 2015 at 19:18, Erick Erickson wro

Re: data import extremely slow

2015-11-07 Thread Erick Erickson
Alexandre, did you mean SolrJ? Here's a way to get started https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/ Best, Erick On Sat, Nov 7, 2015 at 2:22 PM, Alexandre Rafalovitch wrote: > Have you thought of just using Solr. Might be faster than troubleshooting > DIH for complex scenarios

Re: Data import handler not indexing all data

2015-11-07 Thread Alexandre Rafalovitch
That's not quite the question I asked. Do a distinct on 'id' only in the database itself. If your ids are NOT unique, you need to create a composite or a virtual id for Solr, because whatever your solrconfig.xml says is uniqueKey will be used to deduplicate the documents. If you have 10 documents wi
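The deduplication described above can be sketched in a few lines of Python. This is only an illustration of the overwrite-by-key behaviour, not Solr code: the `index_documents` helper and the `movie_id`, `person_id` and `role` columns are hypothetical and are not taken from the thread.

```python
def index_documents(rows, key_fn):
    """Simulate uniqueKey semantics: a later document with the same key
    replaces the earlier one instead of being added alongside it."""
    index = {}
    for row in rows:
        index[key_fn(row)] = row
    return index


# Hypothetical rows: two of them share the same 'id'.
rows = [
    {"id": 1, "movie_id": 10, "person_id": 7, "role": "director"},
    {"id": 1, "movie_id": 11, "person_id": 7, "role": "director"},
    {"id": 2, "movie_id": 10, "person_id": 8, "role": "director"},
]

# Keyed on a non-unique 'id': one row is silently overwritten.
by_id = index_documents(rows, lambda r: r["id"])

# Keyed on a composite id built from several columns: every row survives.
by_composite = index_documents(
    rows, lambda r: f'{r["movie_id"]}-{r["person_id"]}-{r["role"]}'
)

print(len(by_id), len(by_composite))
```

If the key column is not unique, the indexed document count drops below the row count without any error, which is why checking uniqueness in the database first is a cheap diagnostic.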

Re: Data import handler not indexing all data

2015-11-07 Thread Yangrui Guo
Hi, thanks for the continued support. I'm really worried as my project deadline is near. It was 1636549 in MySQL vs 287041 in Solr. I put select distinct in the beginning of the query because IMDB doesn't have a table for cast & crew. It puts movie and person and their roles into one huge table 'cas

Re: Data import handler not indexing all data

2015-11-07 Thread Alexandre Rafalovitch
Just to get the paranoid option out of the way, is 'id' actually the column that has unique ids in your database? If you do "select distinct id from imdb.director" - how many items do you get? Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-s
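A minimal sketch of that uniqueness check, using Python's built-in sqlite3 module in place of MySQL so it runs anywhere. The table contents are made up for illustration. Note that `SELECT DISTINCT COUNT(*)` applies DISTINCT to the single result row, so it returns the total row count and says nothing about whether the ids are unique.

```python
import sqlite3

# In-memory stand-in for the real table; the rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE director (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO director VALUES (?, ?)",
    [(1, "a"), (1, "b"), (2, "c")],
)

# Total number of rows.
total = conn.execute("SELECT COUNT(*) FROM director").fetchone()[0]

# Number of distinct ids: this is the figure to compare against the index size.
distinct_ids = conn.execute(
    "SELECT COUNT(DISTINCT id) FROM director"
).fetchone()[0]

# DISTINCT over a single aggregate row: same value as COUNT(*).
distinct_count = conn.execute(
    "SELECT DISTINCT COUNT(*) FROM director"
).fetchone()[0]

print(total, distinct_ids, distinct_count)
conn.close()
```

If the distinct-id count matches the total, duplicate keys are ruled out and the cause has to be looked for elsewhere.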

Data import handler not indexing all data

2015-11-07 Thread Yangrui Guo
Hello, I'm having trouble with Solr's data import handler. My Solr version is 5.3.1 and MySQL is 5.5. I tried to index IMDb data but found that Solr only partially indexed it. I ran "SELECT DISTINCT COUNT(*) FROM imdb.director" and the query result was 1636549. However DIH only fetched and indexed 287041 row

Re: data import extremely slow

2015-11-07 Thread Alexandre Rafalovitch
Have you thought of just using Solr. Might be faster than troubleshooting DIH for complex scenarios. On 7 Nov 2015 3:39 pm, "Yangrui Guo" wrote: > I found multiple strange things besides the slowness. I performed count(*) > in MySQL but only one-fifth of the records were imported. Also sometimes

Re: data import extremely slow

2015-11-07 Thread Yangrui Guo
I found multiple strange things besides the slowness. I performed count(*) in MySQL but only one-fifth of the records were imported. Also sometimes dataimporthandler either doesn't import at all or only imports a portion of the table. How can I debug the importer? On Saturday, November 7, 2015, Y

Re: data import extremely slow

2015-11-07 Thread Yangrui Guo
I just realized that not everything was ok. Three child entities were not imported. Had set batchSize to -1 but again solr was stuck :( On Fri, Nov 6, 2015 at 3:11 PM, Yangrui Guo wrote: > Thanks for the reply. I just removed CacheKeyLookUp and CachedKey and used > WHERE clause instead. Everythi

Re: SolrCloud breaks and does not recover

2015-11-07 Thread Pushkar Raste
Hi, To minimize GC pauses, try using G1GC and turn on the 'ParallelRefProcEnabled' JVM flag. G1GC works much better for heaps > 4 GB. Lowering 'InitiatingHeapOccupancyPercent' will also help to avoid long GC pauses at the cost of more short pauses. On 3 November 2015 at 12:12, Björn Häuser wrote: >

Re: EC2 instance type recommended for SOLR?

2015-11-07 Thread Toke Eskildsen
Costi Muraru wrote: > 1) The total index size is 89.9GB (somewhere around 20 mil records). > 2) The number of requests that reach Solr is pretty low (thousands per > day), but they are heavy (long queries with frange and stuff like that). > 3) Running Solr 4.10 > 4) The focus is on quick response

EC2 instance type recommended for SOLR?

2015-11-07 Thread Costi Muraru
Hi folks, I'm trying to decide on the EC2 instance type to use for a Solr cluster. Some details about the cluster: 1) The total index size is 89.9GB (somewhere around 20 mil records). 2) The number of requests that reach Solr is pretty low (thousands per day),