I think the issue is NFS. If you move everything to an NVMe or SSD local to the server, the indexing process will work fine. NFS is the wrong filesystem for Solr.
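On the commit side, the autocommit suggestion quoted further down in this thread could look roughly like the solrconfig.xml fragment below. This is only a sketch, not a configuration tested on this setup: the 60-second and 5-minute intervals are assumed values that would need tuning, and it presumes the indexing clients stop sending an explicit commit after each batch of adds.

```xml
<!-- Sketch only: belongs inside <config> in solrconfig.xml.
     The interval values are assumptions, not recommendations for this index. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush pending updates to disk every 60 s,
       without opening a new searcher -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: make newly added documents visible to queries every 5 minutes -->
  <autoSoftCommit>
    <maxTime>300000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

If solrconfig.xml already has an updateHandler element, the two inner blocks would be merged into it rather than adding a second one. With something like this in place, the 8 indexing servers would only send adds and deletes and leave the committing to Solr.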
I hope this helps.

On Thu, 27 Feb 2020 at 00:03, Massimiliano Randazzo <massimiliano.randa...@gmail.com> wrote:

> On Wed, 26 Feb 2020 at 23:42, Vincenzo D'Amore <v.dam...@gmail.com> wrote:
>
> > Hi Massimiliano,
> >
> > it’s not clear how much memory you have configured for your Solr instance.
>
> SOLR_HEAP="20480m"
> SOLR_JAVA_MEM="-Xms20480m -Xmx20480m"
> GC_LOG_OPTS="-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
> -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime"
>
> > And I would avoid an nfs mount for the datadir.
> >
> > Ciao,
> > Vincenzo
> >
> > --
> > mobile: 3498513251
> > skype: free.dev
> >
> > > On 26 Feb 2020, at 19:44, Massimiliano Randazzo <massimiliano.randa...@gmail.com> wrote:
> > >
> > > On Wed, 26 Feb 2020 at 19:30, Dario Rigolin <dario.rigo...@comperio.it> wrote:
> > >
> > > > You can avoid commit and leave solr do autocommit at certain times.
> > > > Or use softcommit if you have search queries at the same time to answer.
> > > > 550000 pages of 3500 words isn't a big deal for a solr server, what's the
> > > > hardware configuration?
> > >
> > > The solr instance runs on a server with the following configuration:
> > > 12 core Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
> > > 64GB Ram
> > > solr's DataDir is on a volume of another server that I mounted via NFS (I
> > > was thinking of moving the solr server to the server where the DataDir
> > > resides even if it has lower characteristics 8 core Intel(R) Xeon(R) CPU
> > > E5506 @ 2.13GHz 24GB Ram)
> > >
> > > > What's your single solr document: a single newspaper? a single page?
> > >
> > > the single solr document refers to the single word of the document
> > >
> > > > Do you have a solrcloud with 8 nodes? Or are you sending same document to 8
> > > > single solr servers?
> > >
> > > I have 8 servers that process 550,000 newspapers and all of them write on
> > > 1 solr server only
> > >
> > > > > On Wed, 26 Feb 2020 at 19:22, Massimiliano Randazzo <massimiliano.randa...@gmail.com> wrote:
> > > > >
> > > > > Good morning
> > > > > I have the following situation: I have to index the OCR of about 550,000
> > > > > pages of newspapers, counting an average of 3,500 words per page, and
> > > > > making a document per word the records are many.
> > > > > At the moment I have 1 instance of Solr and 8 servers that read and write
> > > > > all on the same instance at the same time. At the beginning everything is
> > > > > fine; after a while, when I add, delete or commit, it gives me a TimeOut
> > > > > error towards the solr server.
> > > > > I suspect the problem is due to the fact that I do many commit
> > > > > operations of many docs at a time (practically, if the newspaper is 30
> > > > > pages I do 105,000 adds and in the end I commit); if everyone does this,
> > > > > with 8 servers within a short time of each other, I think this creates
> > > > > problems for Solr.
> > > > > What can I do to solve the problem?
> > > > > Do I make a commit at each add?
> > > > > Is it possible to configure the solr server to apply the add and delete
> > > > > commands and to commit autonomously, according to the available
> > > > > resources, as it seems to do for the optimize command?
> > > > > Reading the documentation I found this configuration to implement, but
> > > > > I do not know if it solves my problem:
> > > > >
> > > > > <deletionPolicy class="solr.SolrDeletionPolicy">
> > > > > <str name="maxCommitsToKeep">1</str>
> > > > > <str name="maxOptimizedCommitsToKeep">0</str>
> > > > > <str name="maxCommitAge">1DAY</str>
> > > > > </deletionPolicy>
> > > > > <infoStream>false</infoStream>
> > > > >
> > > > > Thanks for your consideration
> > > > > Massimiliano Randazzo
> > > >
> > > > --
> > > > Dario Rigolin
> > > > Comperio srl - CTO
> > > > Mobile: +39 347 7232652 - Office: +39 0425 471482
> > > > Skype: dario.rigolin
> > >
> > > --
> > > Massimiliano Randazzo
> > >
> > > Analista Programmatore,
> > > Sistemista Senior
> > > Mobile +39 335 6488039
> > > email: massimiliano.randa...@gmail.com
> > > pec: massimiliano.randa...@pec.net
>
> --
> Massimiliano Randazzo
>
> Analista Programmatore,
> Sistemista Senior
> Mobile +39 335 6488039
> email: massimiliano.randa...@gmail.com
> pec: massimiliano.randa...@pec.net

--
Dario Rigolin
Comperio srl - CTO
Mobile: +39 347 7232652 - Office: +39 0425 471482
Skype: dario.rigolin