That is helpful!

Thank you for the thoughts.


On Tue, Feb 2, 2016 at 12:17 PM, Erick Erickson <erickerick...@gmail.com>
wrote:

> Scratch that installation and start over?
>
> Really, it sounds like something is fundamentally messed up with the
> Linux install. Perhaps something as simple as file paths, or you have
> old jars hanging around that are mis-matched. Or someone manually
> deleted files from the Solr install. Or your disk filled up. Or....
>
> How sure are you that the linux setup was done properly?
>
> Not much help I know,
> Erick
>
> On Tue, Feb 2, 2016 at 10:11 AM, Troy Edwards <tedwards415...@gmail.com>
> wrote:
> > Rerunning the Data Import Handler on the linux machine has
> > started producing some errors and warnings:
> >
> > On the node on which DIH was started:
> >
> > WARN SolrWriter Error creating document : SolrInputDocument
> >
> > org.apache.solr.common.SolrException: No registered leader was found
> > after waiting for 4000ms , collection: collectionmain slice: shard1
> >
> >
> >
> > On the second node:
> >
> > WARN ReplicationHandler Exception while writing response for params:
> >
> > command=filecontent&checksum=true&generation=1047&qt=/replication&wt=filestream&file=_1oo_Lucene50_0.tip
> >
> > java.nio.file.NoSuchFileException:
> >
> > /var/solr/data/collectionmain_shard2_replica1/data/index/_1oo_Lucene50_0.tip
> >
> >
> > ERROR
> >
> > Index fetch failed :org.apache.solr.common.SolrException: Unable to
> > download _169.si completely. Downloaded 0!=466
> >
> >
> > ReplicationHandler Index fetch failed
> > :org.apache.solr.common.SolrException: Unable to download _169.si
> > completely. Downloaded 0!=466
> >
> > WARN
> > IndexFetcher File _1pd_Lucene50_0.tim did not match. expected checksum is
> > 3549855722 and actual is checksum 2062372352. expected length is 72522
> > and actual length is 39227
> >
> > WARN UpdateLog Log replay finished. recoveryInfo=RecoveryInfo{adds=840638
> > deletes=0 deleteByQuery=0 errors=0 positionOfStart=554264}
> >
> >
> > Any suggestions about this?
> >
> > Thanks
> >
> > On Mon, Feb 1, 2016 at 10:03 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> The first thing I'd be looking at is how the JDBC batch size compares
> >> between the two machines.....
> >>
> >> AFAIK, Solr shouldn't notice the difference, and since a large majority
> >> of the development is done on Linux-based systems, I'd be surprised if
> >> this was worse than Windows, which would lead me to the one thing that
> >> is definitely different between the two: Your JDBC driver and its
> >> settings. At least that's where I'd look first.
> >>
> >> If nothing immediate pops up, I'd probably write a small driver program
> >> to just access the database from the two machines and process your 10M
> >> records _without_ sending them to Solr and see what the comparison is.
> >>
> >> You can also forgo DIH and do a simple import program via SolrJ. The
> >> advantage here is that the comparison I'm talking about above is
> >> really simple, just comment out the call that sends data to Solr.
> >> Here's an example...
> >>
> >> https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Feb 1, 2016 at 7:34 PM, Troy Edwards <tedwards415...@gmail.com>
> >> wrote:
> >> > Sorry, I should explain further. The Data Import Handler had been
> >> > running for a while retrieving only about 150000 records from the
> >> > database. In both the development env (windows) and on the linux
> >> > machine it took about 3 mins.
> >> >
> >> > The query has been changed and we are now trying to retrieve about 10
> >> > million records. We do expect the time to increase.
> >> >
> >> > With the new query the time taken on windows machine is consistently
> >> > around 40 mins. While the DIH is running, queries slow down, i.e. a
> >> > query that typically took 60 msec takes 100 msec.
> >> >
> >> > The time taken on linux machine is consistently around 2.5 hours.
> >> > While the DIH is running queries take about 200 to 400 msec.
> >> >
> >> > Thanks!
> >> >
> >> > On Mon, Feb 1, 2016 at 8:45 PM, Erick Erickson <erickerick...@gmail.com>
> >> > wrote:
> >> >
> >> >> What happens if you run just the SQL query from the
> >> >> windows box and from the linux box? Is there any chance
> >> >> that somehow the connection from the linux box is
> >> >> just slower?
> >> >>
> >> >> Best,
> >> >> Erick
> >> >>
> >> >> On Mon, Feb 1, 2016 at 6:36 PM, Alexandre Rafalovitch
> >> >> <arafa...@gmail.com> wrote:
> >> >> > What are you importing from? Are the source and Solr machine
> >> >> > collocated in the same fashion on dev and prod?
> >> >> >
> >> >> > Have you tried running this on a Linux dev machine? Perhaps your
> >> >> > prod machine is loaded much more than a dev.
> >> >> >
> >> >> > Regards,
> >> >> >    Alex.
> >> >> > ----
> >> >> > Newsletter and resources for Solr beginners and intermediates:
> >> >> > http://www.solr-start.com/
> >> >> >
> >> >> >
> >> >> > On 2 February 2016 at 13:21, Troy Edwards <tedwards415...@gmail.com>
> >> >> > wrote:
> >> >> >> We have a windows development machine on which the Data Import
> >> >> >> Handler consistently takes about 40 mins to finish. Queries run
> >> >> >> fine. JVM memory is 2 GB per node.
> >> >> >>
> >> >> >> But on a linux machine it consistently takes about 2.5 hours. The
> >> >> >> queries also run slower. JVM memory here is also 2 GB per node.
> >> >> >>
> >> >> >> How should I go about analyzing and tuning the linux machine?
> >> >> >>
> >> >> >> Thanks
> >> >>
> >>
>
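
Erick's suggestion earlier in the thread, a small driver program that reads the records from the database on each machine without sending anything to Solr, might look roughly like the sketch below. It is only an illustration under stated assumptions: the class name `DbReadTimer`, its helper methods, and the command-line arguments are all hypothetical, the thread never names the database or the JDBC driver, and the matching driver jar would need to be on the classpath. The fetch size is exposed as an argument because the driver and its settings are where Erick suggests looking first; how a given driver interprets that hint varies.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/**
 * Hypothetical driver program: reads every row of a query over JDBC and
 * reports how long the read took. Nothing is sent to Solr.
 */
public class DbReadTimer {

    /** Rows read per second, given a row count and an elapsed time in nanoseconds. */
    static double rowsPerSecond(long rows, long elapsedNanos) {
        return rows / (elapsedNanos / 1_000_000_000.0);
    }

    /** Walks the whole result set, touching every column, and returns the row count. */
    static long drain(ResultSet rs) throws SQLException {
        int columns = rs.getMetaData().getColumnCount();
        long rows = 0;
        while (rs.next()) {
            for (int i = 1; i <= columns; i++) {
                rs.getObject(i); // make the driver actually deliver the value
            }
            rows++;
        }
        return rows;
    }

    /** Runs the query on the given statement and reads all of its rows. */
    static long readAll(Statement stmt, String sql) throws SQLException {
        try (ResultSet rs = stmt.executeQuery(sql)) {
            return drain(rs);
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 5) {
            System.err.println(
                "usage: DbReadTimer <jdbcUrl> <user> <password> <fetchSize> <sql>");
            System.exit(1);
        }
        String url = args[0];
        String user = args[1];
        String password = args[2];
        int fetchSize = Integer.parseInt(args[3]);
        String sql = args[4];

        long start = System.nanoTime();
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(fetchSize); // a hint only; drivers may treat it differently
            long rows = readAll(stmt, sql);
            long elapsed = System.nanoTime() - start;
            System.out.printf("%d rows in %.1f s (%.0f rows/s)%n",
                    rows, elapsed / 1e9, rowsPerSecond(rows, elapsed));
        }
    }
}
```

Running the same class with the same query on both the windows and the linux machine would show whether the database read alone accounts for the gap. For scale, the full DIH runs reported in the thread (about 10 million records in about 40 minutes versus about 2.5 hours) work out to roughly 4,200 and 1,100 records per second.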
