Re: Whether SolrCloud can support 2 TB data?

2016-09-24 Thread Erick Erickson
John: The MapReduceIndexerTool (in contrib) is intended for bulk indexing in a Hadoop ecosystem. This doesn't preclude home-grown setups of course, but it's available OOB. The only tricky bit is at the end. Either you have your Solr indexes on HDFS in which case MRIT can merge them into a live Sol

Re: Viewing the Cache Stats [SOLR 6.1.0]

2016-09-24 Thread Tomás Fernández Löbbe
That thread is pretty old and probably talking about the old(est) admin UI (before 4.0). The cache stats can be found by selecting the core in the dropdown and then "Plugin/Stats". See https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=32604180 Tomás On Sat, Sep 24, 2016 at 12:14 PM

Re: slow updates/searches

2016-09-24 Thread Erick Erickson
Hmm.. About <1>: Yep, GC is one of the "more art than science" bits of Java/Solr. Siiigh. About <2>: that's what autowarming is about, particularly for the filterCache and queryResultCache. My guess is that you have the autowarm count on those two caches set to zero. Try setting it to some modest

Re: Viewing the Cache Stats [SOLR 6.1.0]

2016-09-24 Thread Erick Erickson
Solr evolves pretty quickly. The link you reference is from 2006, almost 10 years ago; nothing about that link is relevant at this point. Go to http://host:port/solr. Then select a core from the drop-down. From there, there should be a plugins/stats choice, then the "cache" section. Best, Erick
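
The same cache statistics shown under plugins/stats can usually also be fetched over HTTP. The following is a minimal sketch, assuming the core exposes the MBeans request handler at /admin/mbeans; the helper name cache_stats_url, the host, the port, and the core name are placeholders for illustration and are not taken from the thread.

```python
from urllib.parse import urlencode


def cache_stats_url(base_url, core):
    """Build the URL that asks a core's MBeans handler for cache statistics.

    base_url is the Solr root (for example http://host:port/solr) and core
    is the core name picked in the admin UI drop-down. Both are placeholders.
    """
    # cat=CACHE limits the response to the cache section;
    # stats=true asks for the counters (hit ratio, size, evictions, ...).
    query = urlencode({"cat": "CACHE", "stats": "true", "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/admin/mbeans?{query}"


if __name__ == "__main__":
    # Hypothetical host and core name; paste the printed URL into a browser or curl.
    print(cache_stats_url("http://localhost:8983/solr", "mycore"))
```

Only the URL is built here, so the sketch runs without a live Solr instance; whether the handler is reachable depends on the installation.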

Viewing the Cache Stats [SOLR 6.1.0]

2016-09-24 Thread slee
I'm trying to view the Cache Stats. After reading this thread: Cache Stats, I can't seem to find the Statistics page in the SOLR Admin. Should I be installing some plug-in or doing some configuration?

Challenges with new Solrcloud Backup/Restore functionality

2016-09-24 Thread Stephen Weiss
Hi everyone, We're very excited about SolrCloud's new backup / restore collection APIs, which should introduce some major new efficiencies into our indexing workflow. Unfortunately, we've run into some snags with it that are preventing us from moving into production. I was hoping someone on t

Re: slow updates/searches

2016-09-24 Thread Rallavagu
On 9/22/16 5:59 AM, Shawn Heisey wrote: On 9/22/2016 5:46 AM, Muhammad Zahid Iqbal wrote: Did you find any solution to slow searches? As far as I know, the Jetty container's default configuration is a bit slow for large production environments. This might be true for the default configuration that com

Re: Whether SolrCloud can support 2 TB data?

2016-09-24 Thread Toke Eskildsen
Regarding a 12TB index: Yago Riveiro wrote: > Our cluster is small for the data we hold (12 machines with SSD and 32G of > RAM), but we don't need sub-second queries, we need facet with high > cardinality (in worst case scenarios we aggregate 5M unique string values) > In a peak of inserts we c

Re: Whether SolrCloud can support 2 TB data?

2016-09-24 Thread Toke Eskildsen
John Bickerstaff wrote: > As an aside - I just spoke with someone the other day who is using Hadoop > for re-index in order to save a lot of time. If you control which documents go into which shards, then that is certainly a possibility. We have a collection with long re-indexing time (about 20

Re: Whether SolrCloud can support 2 TB data?

2016-09-24 Thread John Bickerstaff
As an aside - I just spoke with someone the other day who is using Hadoop for re-index in order to save a lot of time. I don't know the details, but I assume they're using Hadoop to call Lucene code and index documents using the map-reduce approach... This was made in their own shop - I don't thin

Re: Unsubscibe from mailing list

2016-09-24 Thread Customer
LOL, and you are a senior engineer? On 23/09/16 23:00, Khalid Galal wrote: Please, I need to unsubscribe from this mailing list. Thanks.

Re: Whether solr can support 2 TB data?

2016-09-24 Thread Toke Eskildsen
Jeffery Yuan wrote: > In our application, every day there is about 800 MB of raw data; we are going > to store this data for 5 years, so it's about 1 or 2 TB of data. > I am wondering whether Solr can support this much data? Yes it can. Or rather: You could probably construct a scenario where it

Re: Whether SolrCloud can support 2 TB data?

2016-09-24 Thread Yago Riveiro
"LucidWorks achieved 150k docs/second" This is only valid if you don't have replication. I don't know your use case, but a realistic use case normally uses some type of redundancy so as not to lose data in a hardware failure: at least 2 replicas, and more implies a reduction of throughput. Also don't

Re: Performance Issue when querying Multivalued fields [SOLR 6.1.0]

2016-09-24 Thread Alexandre Rafalovitch
Yes, swap will switch which core the name points to, for a non-Cloud setup. Just remember, when you are deleting the old one, that your directory name does not get renamed; only the core name in the core.properties file changes. Regards, Alex On 24 Sep 2016 10:28 AM, "slee" wrote: Erick / Alex, I want to