Hi Paras, thank you for your answer. If you don't mind, I have a couple of questions.

I am experiencing very long indexing times. I currently have 8 servers working on 1 instance of Solr. I thought of moving to a SolrCloud cluster of 4 Solr servers with 3 ZooKeeper servers to distribute the load, but I was wondering whether I would have to start the indexing over, or whether there is a tool to load the index of a standalone Solr into SolrCloud and redistribute the load.

Currently, in the "managed-schema" file, the fields to be indexed are configured with type="text_it", to which "lang/stopwords_it.txt" is assigned. I have been asked to remove the stopwords. If I modify the "managed-schema" file to remove the stopwords file, is it possible to re-index the database from the documents already present, without having to reload all the material?

Thank you
Massimiliano Randazzo

On Wed, 26 Feb 2020 at 13:26, Paras Lehana <paras.leh...@indiamart.com> wrote:

> Hi Massimiliano,
>
> > Is it still necessary to run the Optimize command from my application when
> > I have finished indexing?
>
> I guess you can stop worrying about optimizations and let Solr handle that
> implicitly. There's nothing so bad about having more segments.
>
> On Wed, 26 Feb 2020 at 16:02, Massimiliano Randazzo <
> massimiliano.randa...@gmail.com> wrote:
>
> > > Good morning,
> > >
> > > recently I went from version 6.4 to version 8.4.1. I access Solr
> > > through Java applications written by me, in which I have updated the
> > > solr-solrj-8.4.1.jar libraries.
> > >
> > > I am performing the OCR indexing of a newspaper of about 550,000 pages in
> > > production, for which I have calculated at least 1,000,000,000 words, and I
> > > am experiencing slowness. I wanted to know if you could advise me on changes
> > > to the configuration.
> > >
> > > The server I'm using has 12 cores and 64GB of RAM. The only
> > > changes I made to the configuration are:
> > >
> > > solr.in.sh file
> > > SOLR_HEAP="20480m"
> > > SOLR_JAVA_MEM="-Xms20480m -Xmx20480m"
> > > GC_LOG_OPTS="-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
> > > -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime"
> > >
> > > The Java version I use is
> > > java version "1.8.0_51"
> > > Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
> > > Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)
> > >
> > > Also, comparing the Solr web interfaces, I noticed a difference in the
> > > "Overview" page: in Solr 6.4 it showed Optimized and Current and
> > > allowed me to launch Optimize if necessary; in version 8.4.1 Optimized is
> > > no longer present. I hypothesized that this activity is done with the commit
> > > or through some operation in the background. If so, is it still
> > > necessary to run the Optimize command from my application when I have
> > > finished indexing? I noticed that the Optimize function requires
> > > considerable time and resources, especially in large databases.
> > >
> > > Thank you for your attention
> > >
> > > Massimiliano Randazzo
>
> --
> Regards,
>
> *Paras Lehana* [65871]
> Development Engineer, *Auto-Suggest*,
> IndiaMART InterMESH Ltd,
>
> 11th Floor, Tower 2, Assotech Business Cresterra,
> Plot No. 22, Sector 135, Noida, Uttar Pradesh, India 201305
>
> Mob.: +91-9560911996
> Work: 0120-4056700 | Extn: *1196*

--
Massimiliano Randazzo

Programmer Analyst, Senior Systems Administrator
Mobile +39 335 6488039
email: massimiliano.randa...@gmail.com
pec: massimiliano.randa...@pec.net