RE: Overall large size in Solr across collections

2016-04-26 Thread Allison, Timothy B.
-----Original Message----- From: Zheng Lin Edwin Yeo [mailto:edwinye...@gmail.com] Sent: Thursday, April 21, 2016 12:13 AM To: solr-user@lucene.apache.org Subject: Re: Overall large size in Solr across collections Hi Shawn, Yes, I'm using the Extracting Request Handler. The 0.7GB/hr is the indexing rate

Re: Overall large size in Solr across collections

2016-04-20 Thread Zheng Lin Edwin Yeo
Hi Shawn, Yes, I'm using the Extracting Request Handler. The 0.7GB/hr is the indexing rate, measured by the size of the original documents that get ingested into Solr. This means that every hour, only 0.7GB of my documents gets ingested into Solr. It will require 10 hours just to index documents

Re: Overall large size in Solr across collections

2016-04-20 Thread Shawn Heisey
On 4/20/2016 8:10 PM, Zheng Lin Edwin Yeo wrote: > I'm currently running 4 threads concurrently to run the indexing, which > means I run the script in command prompt in 4 different command windows. > The IDs have been configured in such a way that they will not overwrite each > other during the indexin

Re: Overall large size in Solr across collections

2016-04-20 Thread Zheng Lin Edwin Yeo
Hi Shawn, I'm currently running 4 threads concurrently to run the indexing, which means I run the script in command prompt in 4 different command windows. The IDs have been configured in such a way that they will not overwrite each other during the indexing. Is that considered multi-threading? The ra

Re: Overall large size in Solr across collections

2016-04-20 Thread Shawn Heisey
On 4/19/2016 10:12 PM, Zheng Lin Edwin Yeo wrote: > Thanks for the information, Shawn. > > I believe it could be due to the types of files that are being indexed. > Currently, I'm indexing EML files which are in HTML format, and they > are richer in content (with inline images and full text),

Re: Overall large size in Solr across collections

2016-04-19 Thread Zheng Lin Edwin Yeo
Thanks for the information, Shawn. I believe it could be due to the types of files that are being indexed. Currently, I'm indexing EML files which are in HTML format, and they are richer in content (with inline images and full text), while previously the EML files were in plain text format, wi

Re: Overall large size in Solr across collections

2016-04-19 Thread Shawn Heisey
On 4/19/2016 9:28 AM, Zheng Lin Edwin Yeo wrote: > Currently, the searching performance is still doing fine, but it is the > indexing that is slowing down. Not sure if increasing the RAM, or changing > to an SSD will help with the indexing speed? You need to figure out exactly what is slo

Re: Overall large size in Solr across collections

2016-04-19 Thread Zheng Lin Edwin Yeo
Hi Shawn, Currently, the searching performance is still doing fine, but it is the indexing that is slowing down. Not sure if increasing the RAM, or changing to an SSD will help with the indexing speed? Regards, Edwin On 19 April 2016 at 21:57, Shawn Heisey wrote: > On 4/18/2016 8:50

Re: Overall large size in Solr across collections

2016-04-19 Thread Shawn Heisey
On 4/18/2016 8:50 PM, Zheng Lin Edwin Yeo wrote: > Thanks for your explanation. > > I have set my segment size to 20GB under the TieredMergePolicy > > "maxMergeAtOnce">10 10 "maxMergedSegmentMB">20480 That just controls the maximum size of a segment. This defaults to 5GB. When segments reach

Re: Overall large size in Solr across collections

2016-04-18 Thread Zheng Lin Edwin Yeo
Hi Shawn, Thanks for your explanation. I have set my segment size to 20GB under the TieredMergePolicy 10 10 20480 Does it mean that segment merging will occur more often, as it will need to keep merging during indexing until it reaches 20GB? I do have 192GB of RAM on my server which Sol

Re: Overall large size in Solr across collections

2016-04-18 Thread Shawn Heisey
On 4/18/2016 4:22 AM, Zheng Lin Edwin Yeo wrote: > I have many collections in Solr, but with only 1 shard. I found that the > index size across all the collections has passed the 1TB mark. Currently > the query speed is still normal, but the indexing speed seems to have become > slower. > > Will it a

Overall large size in Solr across collections

2016-04-18 Thread Zheng Lin Edwin Yeo
Hi, I have many collections in Solr, but with only 1 shard. I found that the index size across all the collections has passed the 1TB mark. Currently the query speed is still normal, but the indexing speed seems to have become slower. Will it affect the performance if I continue to increase the ind