Hi Paras,

Thank you for your answer. If you don't mind, I have a couple of
questions.

I am experiencing very long indexing times. I currently have 8 servers
working against a single Solr instance. I am thinking of moving to a
SolrCloud cluster of 4 Solr servers with 3 ZooKeeper servers to distribute
the load, but I was wondering whether I would have to start the indexing
over, or whether there is a tool to load the index of a standalone Solr
into SolrCloud and redistribute it?

Currently, in the "managed-schema" file, the fields to be indexed are
configured with type="text_it", to which "lang/stopwords_it.txt" is
assigned. I have been asked to remove the stopwords. If I modify the
"managed-schema" file to remove the stopwords file, is it possible to
re-index the database from the documents already present, without having
to reload all the material?
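
For reference, the field type in question looks roughly like the sketch
below. This is an assumption based on the stock "text_it" definition shipped
in Solr's default configset, not a copy of the actual "managed-schema", so
the real filter chain may differ. The StopFilterFactory line is the one that
would be removed to stop dropping stopwords.

```xml
<!-- Assumed: stock text_it definition from Solr's default configset -->
<fieldType name="text_it" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ElisionFilterFactory" ignoreCase="true" articles="lang/contractions_it.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- The stopword filter referred to above; removing this line disables stopword removal -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_it.txt" format="snowball"/>
    <filter class="solr.ItalianLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```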

Thank you
Massimiliano Randazzo

On Wed, 26 Feb 2020 at 13:26, Paras Lehana <
paras.leh...@indiamart.com> wrote:

> Hi Massimiliano,
>
> > Is it still necessary to run the Optimize command from my application when
> > I have finished indexing?
>
>
> I guess you can stop worrying about optimizations and let Solr handle that
> implicitly. There's nothing so bad about having more segments.
>
> On Wed, 26 Feb 2020 at 16:02, Massimiliano Randazzo <
> massimiliano.randa...@gmail.com> wrote:
>
> > > Good morning,
> > >
> > > I recently upgraded from version 6.4 to version 8.4.1. I access Solr
> > > through Java applications that I wrote, in which I have updated the
> > > libraries to solr-solrj-8.4.1.jar.
> > >
> > > I am performing the OCR indexing of a newspaper of about 550,000 pages in
> > > production, for which I have calculated at least 1,000,000,000 words, and I
> > > am experiencing slowness. I wanted to know if you could advise me on changes
> > > to the configuration.
> > >
> > > The server I'm using has 12 cores and 64GB of RAM. The only
> > > changes I made to the configuration are in the
> > > solr.in.sh file:
> > > SOLR_HEAP="20480m"
> > > SOLR_JAVA_MEM="-Xms20480m -Xmx20480m"
> > > GC_LOG_OPTS="-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
> > >   -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
> > > -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime"
> > > The Java version I use is
> > > java version "1.8.0_51"
> > > Java(TM) SE Runtime Environment (build 1.8.0_51-b16)
> > > Java HotSpot(TM) 64-Bit Server VM (build 25.51-b03, mixed mode)
> > >
> > > Also, comparing the Solr web interfaces, I noticed a difference in the
> > > "Overview" page: in Solr 6.4 it showed Optimized and Current and
> > > allowed me to launch Optimize if necessary, while in version 8.4.1
> > > Optimized is no longer present. I hypothesized that this activity is done
> > > with the commit or through some operation in the background. If so, is it
> > > still necessary to run the Optimize command from my application when I
> > > have finished indexing? I noticed that the Optimize function requires
> > > considerable time and resources, especially in large databases.
> > >
> > > Thank you for your attention
> >
> > Massimiliano Randazzo
> >
> > >
> > >
> >
>
>
> --
> --
> Regards,
>
> *Paras Lehana* [65871]
> Development Engineer, *Auto-Suggest*,
> IndiaMART InterMESH Ltd,
>
> 11th Floor, Tower 2, Assotech Business Cresterra,
> Plot No. 22, Sector 135, Noida, Uttar Pradesh, India 201305
>
> Mob.: +91-9560911996
> Work: 0120-4056700 | Extn:
> *1196*
>
>


-- 
Massimiliano Randazzo

Analista Programmatore,
Sistemista Senior
Mobile +39 335 6488039
email: massimiliano.randa...@gmail.com
pec: massimiliano.randa...@pec.net
