Scratch that installation and start over?

Really, it sounds like something is fundamentally messed up with the
Linux install. Perhaps it's something as simple as file paths, or you
have old, mismatched jars hanging around. Or someone manually
deleted files from the Solr install. Or your disk filled up. Or....

How sure are you that the Linux setup was done properly?

Not much help, I know,
Erick

On Tue, Feb 2, 2016 at 10:11 AM, Troy Edwards <tedwards415...@gmail.com> wrote:
> Rerunning the Data Import Handler on the Linux machine has
> started producing some errors and warnings:
>
> On the node on which DIH was started:
>
> WARN SolrWriter Error creating document : SolrInputDocument
>
> org.apache.solr.common.SolrException: No registered leader was found
> after waiting for 4000ms , collection: collectionmain slice: shard1
>
>
>
> On the second node:
>
> WARN ReplicationHandler Exception while writing response for params:
> command=filecontent&checksum=true&generation=1047&qt=/replication&wt=filestream&file=_1oo_Lucene50_0.tip
>
> java.nio.file.NoSuchFileException:
> /var/solr/data/collectionmain_shard2_replica1/data/index/_1oo_Lucene50_0.tip
>
>
> ERROR
>
> Index fetch failed :org.apache.solr.common.SolrException: Unable to
> download _169.si completely. Downloaded 0!=466
>
>
> ReplicationHandler Index fetch failed
> :org.apache.solr.common.SolrException: Unable to download _169.si
> completely. Downloaded 0!=466
>
> WARN
> IndexFetcher File _1pd_Lucene50_0.tim did not match. expected checksum is
> 3549855722 and actual is checksum 2062372352. expected length is 72522 and
> actual length is 39227
>
> WARN UpdateLog Log replay finished. recoveryInfo=RecoveryInfo{adds=840638
> deletes=0 deleteByQuery=0 errors=0 positionOfStart=554264}
>
>
> Any suggestions about this?
>
> Thanks
>
> On Mon, Feb 1, 2016 at 10:03 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> The first thing I'd be looking at is how the JDBC batch size compares
>> between the two machines.....
>>
>> AFAIK, Solr shouldn't notice the difference, and since a large majority
>> of the development is done on Linux-based systems, I'd be surprised if
>> this was worse than Windows, which would lead me to the one thing that
>> is definitely different between the two: Your JDBC driver and its settings.
>> At least that's where I'd look first.
>>
>> If nothing immediate pops up, I'd probably write a small driver program to
>> just access the database from the two machines and process your 10M
>> records _without_ sending them to Solr and see what the comparison is.
>>
>> You can also forgo DIH and do a simple import program via SolrJ. The
>> advantage here is that the comparison I'm talking about above is
>> really simple, just comment out the call that sends data to Solr. Here's an
>> example...
>>
>> https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
>>
>> Best,
>> Erick
>>
>> On Mon, Feb 1, 2016 at 7:34 PM, Troy Edwards <tedwards415...@gmail.com>
>> wrote:
>> > Sorry, I should explain further. The Data Import Handler had been running
>> > for a while retrieving only about 150000 records from the database. In both
>> > the development env (Windows) and the Linux machine it took about 3 mins.
>> >
>> > The query has been changed and we are now trying to retrieve about 10
>> > million records. We do expect the time to increase.
>> >
>> > With the new query the time taken on the Windows machine is consistently
>> > around 40 mins. While the DIH is running, queries slow down, i.e. a query
>> > that typically took 60 msec takes 100 msec.
>> >
>> > The time taken on the Linux machine is consistently around 2.5 hours.
>> > While the DIH is running, queries take about 200 to 400 msec.
>> >
>> > Thanks!
>> >
>> > On Mon, Feb 1, 2016 at 8:45 PM, Erick Erickson <erickerick...@gmail.com>
>> > wrote:
>> >
>> >> What happens if you run just the SQL query from the
>> >> Windows box and from the Linux box? Is there any chance
>> >> that somehow the connection from the Linux box is
>> >> just slower?
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Mon, Feb 1, 2016 at 6:36 PM, Alexandre Rafalovitch
>> >> <arafa...@gmail.com> wrote:
>> >> > What are you importing from? Are the source and the Solr machine
>> >> > co-located in the same fashion on dev and prod?
>> >> >
>> >> > Have you tried running this on a Linux dev machine? Perhaps your prod
>> >> > machine is much more heavily loaded than dev.
>> >> >
>> >> > Regards,
>> >> >    Alex.
>> >> > ----
>> >> > Newsletter and resources for Solr beginners and intermediates:
>> >> > http://www.solr-start.com/
>> >> >
>> >> >
>> >> > On 2 February 2016 at 13:21, Troy Edwards <tedwards415...@gmail.com> wrote:
>> >> >> We have a Windows development machine on which the Data Import Handler
>> >> >> consistently takes about 40 mins to finish. Queries run fine. JVM memory is
>> >> >> 2 GB per node.
>> >> >>
>> >> >> But on a Linux machine it consistently takes about 2.5 hours. The queries
>> >> >> also run slower. JVM memory here is also 2 GB per node.
>> >> >>
>> >> >> How should I go about analyzing and tuning the Linux machine?
>> >> >>
>> >> >> Thanks
>> >>
>>
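
A minimal sketch of the small driver program suggested earlier in the
thread: it runs the import query over plain JDBC and reads every row
without sending anything to Solr, so the same run can be timed on the
Windows and the Linux machine. The class name JdbcReadTimer, the
command-line arguments, and the default fetch size of 500 are assumptions
made for illustration, not details of the original setup. The JDBC driver
for your database has to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Times a plain JDBC read of the import query, with nothing sent to Solr. */
final class JdbcReadTimer {

    /** Rows per second for a given row count and elapsed time in nanoseconds. */
    static double rowsPerSecond(long rows, long elapsedNanos) {
        return elapsedNanos <= 0 ? 0.0 : rows / (elapsedNanos / 1_000_000_000.0);
    }

    /** Runs the query, touches every column of every row, and returns the row count. */
    static long readAll(Connection conn, String sql, int fetchSize) throws SQLException {
        long rows = 0;
        try (Statement stmt = conn.createStatement()) {
            // A hint only; each JDBC driver interprets the fetch size differently.
            stmt.setFetchSize(fetchSize);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                int columns = rs.getMetaData().getColumnCount();
                while (rs.next()) {
                    for (int i = 1; i <= columns; i++) {
                        rs.getObject(i); // make the driver actually read the value
                    }
                    rows++;
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 4) {
            System.err.println(
                "usage: JdbcReadTimer <jdbcUrl> <user> <password> <sql> [fetchSize]");
            return;
        }
        // Hypothetical default; pass the value your DIH config uses to compare like with like.
        int fetchSize = args.length > 4 ? Integer.parseInt(args[4]) : 500;
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            long start = System.nanoTime();
            long rows = readAll(conn, args[3], fetchSize);
            long elapsed = System.nanoTime() - start;
            System.out.printf("%d rows in %.1f s (%.0f rows/s)%n",
                    rows, elapsed / 1e9, rowsPerSecond(rows, elapsed));
        }
    }
}
```

Run it on both machines with the same query and the same fetch size. If
the Linux run is already much slower here, the difference is on the
database, driver, or network side rather than in Solr.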
