Hello Andrea,

I'm really sorry for the delay in answering, but I needed more information 
before replying.

Yes, 5.365.213 is the numDocs we got just after the sync, and yes, 4.537.651 is 
the numDocs we got on the staging server after the reindexing. The colleague 
who ran the rsync confirms that it completed entirely.

I don't see any incomplete transactions, which normally means that the 
indexing is complete. That's why I don't understand the difference.

Kind Regards

Matthieu

-----Original Message-----
From: Andrea Gazzarini [mailto:a.gazzar...@sease.io] 
Sent: Saturday, 9 February 2019 16:56
To: solr-user@lucene.apache.org
Subject: Re: Solr Index Size after reindex

Yes, those numbers are different, and that should explain the different size. I 
think you should be able to find some information in the Alfresco or Solr log. 
There must be a reason for the missing content. 
For example, are those numbers coming from two comparable snapshots? In other 
words, I imagine that at a given moment X you rsync-ed the two servers:

  * 5.365.213 is the numDocs you got just after the sync, isn't it?
  * 4.537.651 is the numDocs you got in the staging server after the
    reindexing, isn't it? Are you sure the whole reindexing has completed?

maxDoc is the number of documents you have in the index, including the deleted 
docs not yet cleared by a merge. In the console you should also see the 
"Deleted docs" count, which should be equal to (maxDoc - numDocs).
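
As a rough sketch of that arithmetic, here is the subtraction applied to the numDocs and MaxDoc figures quoted further down in this thread. The dictionary layout and the function name `deleted_docs` are only illustrative, not part of Solr or Alfresco:

```python
# Figures quoted from the PRODUCTION / STAGING table later in this thread
# (the dots in the original are thousands separators).
counts = {
    "PRODUCTION": {"numDocs": 5_365_213, "maxDoc": 5_845_469},
    "STAGING": {"numDocs": 4_537_651, "maxDoc": 5_129_556},
}


def deleted_docs(stats):
    """Deleted docs not yet cleared by a merge: maxDoc - numDocs."""
    return stats["maxDoc"] - stats["numDocs"]


for env, stats in counts.items():
    print(f"{env}: deleted docs = {deleted_docs(stats):,}")

# Live documents counted in PRODUCTION but not in STAGING.
num_docs_gap = counts["PRODUCTION"]["numDocs"] - counts["STAGING"]["numDocs"]
print(f"numDocs gap = {num_docs_gap:,}")
```

With these inputs the deleted-docs counts come out at 480,256 for PRODUCTION and 591,905 for STAGING, and the numDocs gap is 827,562. The gap in live documents is therefore separate from the deleted docs still waiting for a merge.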

Ciao

Andrea

On 08/02/2019 15:53, Mathieu Menard wrote:
>
> Hi Andrea,
>
> I've checked this information and here is the result:
>
>              PRODUCTION    STAGING
> *numDocs*    5.365.213     4.537.651
> *MaxDoc*     5.845.469     5.129.556
>
> It seems that there are more than 800.000 additional docs in PRODUCTION, 
> which would explain the larger index size. But there is one thing that 
> I don't understand: we have copied the DB and the contentstore, so the 
> numDocs for the two environments should be the same, no?
>
> Could you also explain the meaning of the maxDocs value, please?
>
> Thanks
>
> Matthieu
>
> *From:*Andrea Gazzarini [mailto:a.gazzar...@sease.io]
> *Sent:* Friday, 8 February 2019 14:54
> *To:* solr-user@lucene.apache.org
> *Subject:* Re: Solr Index Size after reindex
>
> Hi Mathieu,
> what about the docs in the two infrastructures? Do they have the same 
> numbers (numdocs / maxdocs)? Any meaningful message (error or not) in 
> log files?
>
> Andrea
>
> On 08/02/2019 14:19, Mathieu Menard wrote:
>
>     Hello,
>
>     I would like to have your point of view on an observation we
>     have made on our two Alfresco installations (Production and Staging
>     environments), and more specifically on the size of our Solr indexes
>     in these two environments.
>
>     Regularly we do an rsync between the Production and the Staging
>     environments: we make a copy of Alfresco's DB and a copy of the
>     entire contentstore, and after that we reindex all the Alfresco content.
>
>     We have noticed that in the production environment we have 19 GB
>     of indexes, while in staging we have "only" 11 GB of indexes.
>     We have some difficulty understanding this difference, because we
>     assumed that index optimization is the same for a full
>     reindex as for the normal use of Solr.
>
>     I've compared the configuration of the two Solr instances and
>     I don't see any differences. Could you help me to better understand
>     this phenomenon?
>
>     Here you can find some information about our two environments. If
>     you need more details, I will provide them as soon as possible:
>
>                         PRODUCTION                        STAGING
>     Alfresco version    5.1.1.4                           5.1.1.4
>     Solr Version
>     Java version
>     Linux Machine       See Staging_caracteristics.txt    See Staging_caracteristics.txt
>                         file in attachment                file in attachment
>
>     Please let me know if you need any other information; I will send it to
>     you promptly.
>
>     Kind Regards
>
>     Matthieu
>
