omitNorms for short searchable fields and ID field

2017-08-24 Thread Chaula Ganatra
Hello, We have a use case with a very large index split across 2 shards and 2 replicas. Each shard has around 200GB of data. We want to reduce our index size, so we tried omitting norms for all the fields. We have also done it for the ID field and short searchable fields. We have observed tha

What is the org.apache.solr.uninverting.FieldCacheImpl?

2017-08-24 Thread Sundeep T
Hi, In our enterprise application, we occasionally get range facet queries ordered by the timestamp field. The timestamp field is of date type. Below is the query from solr.log - 2017-08-25 05:18:51.048 INFO (qtp1321530272-90) [ x:drums] o.a.s.c.S.Request [drums] webapp=/solr path=/select pa

Re: Excessive resources consumption migrating from Solr 6.6.0 Master/Slave to SolrCloud 6.6.0 (dozen times more resources)

2017-08-24 Thread Scott Stults
Hi Dani, It seems like your use case falls into the Index-Heavy / Query-Heavy category, so you might try increasing your hard commit frequency to 15 seconds rather than 15 minutes: https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ -Scott On Th

RE: write.lock file appears and solr wont open

2017-08-24 Thread Phil Scadden
SOLR_HOME is /var/www/solr/data The zip was actually the entire data directory which also included configsets. And yes core.properties is in var/www/solr/data/prindex (just has single line name=prindex, in it). No other cores are present. The data directory should have been unzipped before the so

Re: Solr caching the index file make server refuse serving

2017-08-24 Thread Erick Erickson
10 billion documents on 12 cores is over 800M documents/shard at best. This is _very_ aggressive for a shard. Could you give more information about your setup? I've seen 250M docs fit in 12G memory. I've also seen 10M documents strain 32G of memory. Details matter a lot. The only way I've been abl

Error when using IndexMergeTool

2017-08-24 Thread Zheng Lin Edwin Yeo
Hi, I am trying to use the IndexMergeTool to merge two indexes that are indexed in different collections into one. Both collections have the same fields, and are using Parent-Child Block-Join. When I tried to run the following command as stated from the Solr Wiki https://cwiki.apache.org/confluen

Re: write.lock file appears and solr wont open

2017-08-24 Thread Erick Erickson
It's certainly possible to move a core like this. You say you moved the core. Did you move the core.properties file as well? And did it point to the _same_ directory as the original (dataDir property)? The whole purpose of write.lock is to keep two cores from being able to update the same index at

Solr caching the index file make server refuse serving

2017-08-24 Thread 陈永龙
Hello, ENV: solrcloud 6.3 3*dell server 128G 12cores 4.3T /server 3 solr node /server 20G /node (with parameter -m 20G) 10 billion documents total Problem: When we start solrcloud, the cached index makes memory usage reach 98% or more. And if we continue to index documents (batch

write.lock file appears and solr wont open

2017-08-24 Thread Phil Scadden
I am slowly moving 6.5.1 from development to production. After installing solr on the final test machine, I tried to supply a core by zipping up the data directory on development and unzipping on test. When I go to admin I get: [inline image] Write.lock obviously causing a

Re: Slow when query with cursorMark

2017-08-24 Thread Erick Erickson
I would guess that your first query is hitting the queryResultCache. Getting a 5ms response time is highly suspicious, there are very few queries that take that little time to execute unless they're just hitting that cache. Best, Erick On Thu, Aug 24, 2017 at 1:21 AM, hawk wrote: > Hi, > > Our s

Re: indexed vs queried documents count mismatch

2017-08-24 Thread Erick Erickson
I'd try to isolate one of the "extra" documents and see what it looked like. My guess is that your SQL statement is somehow (tm) generating more docs than in the DB. Best, Erick On Thu, Aug 24, 2017 at 12:58 PM, Satya Marivada wrote: > Source is database: 17,920,274 records in db > Indexed docu
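One way to start isolating the "extra" documents is to compare the set of IDs exported from the database with the set of IDs returned by the index. The following is only a minimal sketch in plain Python: the helper name `find_extra_ids` and the sample IDs are hypothetical, and how the two ID lists are exported is left to the reader. The two counts are the ones reported in this thread.

```python
def find_extra_ids(source_ids, indexed_ids):
    """Return IDs present in the index but absent from the source, sorted."""
    return sorted(set(indexed_ids) - set(source_ids))


# Counts reported in this thread.
db_records = 17_920_274
query_hits = 17_948_826
print(query_hits - db_records)  # 28552 documents unaccounted for

# Made-up IDs, for illustration only.
print(find_extra_ids(["1", "2", "3"], ["1", "2", "3", "3-a"]))  # ['3-a']
```

Looking at a handful of the returned IDs should show whether the extra documents follow a pattern, for example rows multiplied by a join in the SQL statement.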

Re: Running Solr-Server inside other process

2017-08-24 Thread Erick Erickson
Solr has the EmbeddedSolrServer; is that what you're looking for? Best, Erick On Thu, Aug 24, 2017 at 11:15 AM, S G wrote: > Hi, > > We are looking to run Solr in-memory for testing and examples. > > For example: > 1) Cassandra has cassandra-unit: > https://github.com/jsevellec/cassandra-unit/wik

Re: SPLITSHARD in SOLR 5.5.1

2017-08-24 Thread Erick Erickson
This is strange. You can use the "bin/solr zk cp..." commands to bring the collection state.json down to your local machine and do anything you want to do with it, including marking the old shard as "active" and removing the entries for the sub-shards. It's just a text file after all. I'd also dele

Re: autoSoftCommit doesn't work as expected / documented

2017-08-24 Thread Walter Underwood
1. I would call this a bug. It should be equal to or greater than. 2. A design that needs a soft commit after every document is also a bug. That is a big performance hit. Soft commits are cheaper than hard commits, but they are not free. If you want a commit after every document, use a database. T

Re: SPLITSHARD in SOLR 5.5.1

2017-08-24 Thread Vannia Rajan
Binoy, I don't see anything wrong with the logs. The newly split shards are up, with the parent set to inactive. But before restart, the new shards had all of the data as in parent. After a restart, it wiped off to 0. I checked the data directory, the data is completely gone. Below, I attach the

Re: autoSoftCommit doesn't work as expected / documented

2017-08-24 Thread Angel Todorov
Hello, So I can never have soft auto commit after each update? This sounds like a bug to me. Thanks Angel On Thu, Aug 24, 2017 at 9:36 PM Susheel Kumar wrote: > I believe the commit triggers on > condition (no of cached docs > maxDocs > then commit). So that's why you need one extra... > > O

Re: indexed vs queried documents count mismatch

2017-08-24 Thread Satya Marivada
Source is database: 17,920,274 records in db Indexed documents from admin screen: 17,920,274 Query the collection: 17,948,826 Thanks, Satya On Thu, Aug 24, 2017 at 3:44 PM Susheel Kumar wrote: > Does this happen again if you repeat above? How much total docs does DIH > query/source shows to co

Re: indexed vs queried documents count mismatch

2017-08-24 Thread Susheel Kumar
Does this happen again if you repeat above? How much total docs does DIH query/source shows to compare with Solr? On Thu, Aug 24, 2017 at 3:23 PM, Satya Marivada wrote: > Hi, > > I have a weird situation, when I index the documents from admin console by > doing "clean, commit and optimize", when

indexed vs queried documents count mismatch

2017-08-24 Thread Satya Marivada
Hi, I have a weird situation, when I index the documents from admin console by doing "clean, commit and optimize", when the indexing is completed, it showed 17,920,274 documents are indexed. When queried from the solr admin console from the query tab for all the documents in that collection, it s

Re: autoSoftCommit doesn't work as expected / documented

2017-08-24 Thread Susheel Kumar
I believe the commit triggers on > condition (no of cached docs > maxDocs then commit). So that's why you need one extra... On Thu, Aug 24, 2017 at 6:59 AM, Angel Todorov wrote: > I also tested, of course, by setting a value of 0, expecting that it would > work in the way I expect it to , but u
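The difference between a strict "greater than" check and an inclusive "greater than or equal" check can be illustrated with a toy counter. This is a sketch of the behavior described in this thread only, not Solr's actual commit tracker; the class name `SoftCommitTracker` and its `strict` flag are hypothetical.

```python
class SoftCommitTracker:
    """Toy model of the trigger discussed in this thread; not Solr's code."""

    def __init__(self, max_docs, strict=True):
        self.max_docs = max_docs
        self.strict = strict  # True: pending > max_docs; False: pending >= max_docs
        self.pending = 0
        self.commits = 0

    def add_document(self):
        """Register one update and report whether it triggered a commit."""
        self.pending += 1
        if self.strict:
            reached = self.pending > self.max_docs
        else:
            reached = self.pending >= self.max_docs
        if reached:
            self.commits += 1
            self.pending = 0
        return reached


strict = SoftCommitTracker(max_docs=1, strict=True)
print([strict.add_document() for _ in range(3)])     # [False, True, False]

inclusive = SoftCommitTracker(max_docs=1, strict=False)
print([inclusive.add_document() for _ in range(3)])  # [True, True, True]
```

With the strict check and maxDocs set to 1, the first update never becomes visible on its own, which matches the "one extra update" symptom reported in the original post.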

Running Solr-Server inside other process

2017-08-24 Thread S G
Hi, We are looking to run Solr in-memory for testing and examples. For example: 1) Cassandra has cassandra-unit: https://github.com/jsevellec/cassandra-unit/wiki/How-to-use-it-in-your-code 2) Storm has local-mode: http://storm.apache.org/releases/current/Local-mode.html Is there something simil

solrcloud restore error

2017-08-24 Thread Xuguang Su
Hi, I have a Solrcloud with 3 nodes and 2 replicas for each node. I did a successful backup but when I tried to restore with it later I got this error: 5005694 org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not restore core Could not restore core500 org.apach

RE: Solr uses lots of shared memory!

2017-08-24 Thread Markus Jelsma
Hello Bernd, According to the man page, i should get a list of stuff in shared memory if i invoke it with just a PID. Which shows a list of libraries that together account for about 25 MB's shared memory usage. According to ps and top, the JVM uses 2800 MB shared memory (not virtual), that leave

Re: Custom StoredFieldVisitor in Solr

2017-08-24 Thread Rick Leir
Jamie, what is the use case? Cheers -- Rick On August 23, 2017 11:30:38 AM MDT, Jamie Johnson wrote: >I thought I had asked this previously, but I can't find reference to it >now. I am interested in using a custom StoredFieldVisitor in Solr and >after spelunking through the code for a little it

Re: SPLITSHARD in SOLR 5.5.1

2017-08-24 Thread Binoy Dalal
Vanniarajan, Do you see errors in your solr logs when you reboot? If so paste them here. Are both the shards up? Have you checked the data directory for both and confirmed that the data is gone? On Thu 24 Aug, 2017, 18:04 Vannia Rajan wrote: > Hi, > > I'm facing weird issues on using SPLITSHARD

Re: Data Dir from a core with init-failure

2017-08-24 Thread davidvartanian
Another great question is: How to clean the mess *WITHOUT* restarting the server.

Re: Excessive resources consumption migrating from Solr 6.6.0 Master/Slave to SolrCloud 6.6.0 (dozen times more resources)

2017-08-24 Thread Daniel Ortega
Hi Scott, In our indexing service we are using that client too (org.apache.solr.client.solrj.impl.CloudSolrClient) :) This is our Update Request Processor chain configuration: true hash false solr.processor.Lookup3Signature < updateRequestProcessorChain processor="signature" name="dedupe">

Re: Solr uses lots of shared memory!

2017-08-24 Thread Bernd Fehling
Just an idea, how about taking a dump with jmap and using MemoryAnalyzerTool to see what is going on? Regards Bernd Am 24.08.2017 um 11:49 schrieb Markus Jelsma: > Hello Shalin, > > Yes, the main search index has DocValues on just a few fields, they are used > for facetting and function querie

Re: Excessive resources consumption migrating from Solr 6.6.0 Master/Slave to SolrCloud 6.6.0 (dozen times more resources)

2017-08-24 Thread Scott Stults
Hi Daniel, SolrJ has a few client implementations to choose from: CloudSolrClient, ConcurrentUpdateSolrClient, HttpSolrClient, LBHttpSolrClient. You said your query service uses CloudSolrClient, but it would be good to verify which implementation your indexing service uses. One of the problems yo

SPLITSHARD in SOLR 5.5.1

2017-08-24 Thread Vannia Rajan
Hi, I'm facing weird issues on using SPLITSHARD on fairly large shards (150GB shard size), using SOLR 5.5.1 in cloud mode (4 nodes, 1 shard per node). After issuing SPLITSHARD, I successfully get the sub-shards activated and my parent shard set "inactive". The newly split shards counts equal the

Re: Please Unsubscribe from the mailing list

2017-08-24 Thread Susheel Kumar
Just send the email to solr-user-unsubscr...@lucene.apache.org. Do not send to solr-user mailing list. http://lucene.apache.org/solr/community.html On Thu, Aug 24, 2017 at 4:56 AM, Manoj Agrawal wrote: > Hi Team, > > Please unsubscribe me from the mailing list. > > Thanks, > Manoj > > On Thu,

Re: autoSoftCommit doesn't work as expected / documented

2017-08-24 Thread Angel Todorov
I also tested, of course, by setting a value of 0, expecting that it would work in the way I expect it to , but unfortunately - it doesn't. Nothing is committed in that case. Thanks On Thu, Aug 24, 2017 at 1:54 PM, Angel Todorov wrote: > Hi all, > > I have this in my config: > > > >

autoSoftCommit doesn't work as expected / documented

2017-08-24 Thread Angel Todorov
Hi all, I have this in my config: 1 My expectation is that SOLR will make changes available in the index after every document change. But this doesn't work - I need to do _ another _ update in order for the changes to be visible. Basically it's like: if maxDocs is 1, it beh

Re: JSON facet SUM precision and accuracy is incorrect

2017-08-24 Thread pchankh
Dear Yonik, Thanks for the response. As many companies using solr are using function to compute results, the loss of precision is a critical problem to resolve and is important to us. How can we escalate this given that there is already a JIRA filed in 2016. Appreciate if the Solr team can help

RE: Solr uses lots of shared memory!

2017-08-24 Thread Markus Jelsma
Hello Shalin, Yes, the main search index has DocValues on just a few fields, they are used for facetting and function queries, we started using DocValues when 6.0 was released. Most fields are content fields for many languages. I don't think it is going to be DocValues because the max shared me

RE: Solr uses lots of shared memory!

2017-08-24 Thread Markus Jelsma
Hello Erick, I know the article, it is about virtual memory. My problem is with shared memory. Correct me if i am wrong, but MMApped files do not occupy shared but virtual instead. If i am wrong, the article must be rewritten. Our main searchers show very normal numbers for virtual, which is in

Slow when query with cursorMark

2017-08-24 Thread hawk
Hi, Our system runs queries with paging, and as the page number increases, the query takes more time. I found in the Solr documentation that it provides the cursorMark functionality to optimize the performance of deep paging. But I found the query using normal paging (start and rows) is much fas
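For reference, the cursor loop itself is short: the first request sends `cursorMark=*`, each response carries a `nextCursorMark`, and paging stops once the returned mark equals the one that was sent. The sort must include the uniqueKey field as a tie-breaker. The sketch below assumes a caller-supplied `fetch_page` function (hypothetical) that sends the parameters to `/select` and returns the parsed JSON response; `make_fake_solr` is an in-memory stand-in so the example runs without a server, and it uses the last ID as the mark, whereas real cursor marks are opaque strings.

```python
def iterate_with_cursor(fetch_page, query, sort, rows=100):
    """Yield every matching document, following nextCursorMark page by page."""
    cursor = "*"  # starting mark for the first request
    while True:
        params = {"q": query, "sort": sort, "rows": rows, "cursorMark": cursor}
        result = fetch_page(params)
        yield from result["response"]["docs"]
        next_cursor = result["nextCursorMark"]
        if next_cursor == cursor:  # unchanged mark: no more results
            break
        cursor = next_cursor


def make_fake_solr(doc_ids):
    """In-memory stand-in for a Solr endpoint, sorted by id ascending."""
    ordered = sorted(doc_ids)

    def fetch_page(params):
        cursor = params["cursorMark"]
        remaining = ordered if cursor == "*" else [d for d in ordered if d > cursor]
        page = remaining[: params["rows"]]
        next_cursor = page[-1] if page else cursor
        return {
            "response": {"docs": [{"id": d} for d in page]},
            "nextCursorMark": next_cursor,
        }

    return fetch_page


fetch = make_fake_solr(["c", "a", "e", "b", "d"])
print([doc["id"] for doc in iterate_with_cursor(fetch, "*:*", "id asc", rows=2)])
```

Because each request is independent, timing a single cursor request against a single `start`/`rows` request for an early page does not show the benefit; the difference is expected to appear only at large offsets.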

unsubscribe

2017-08-24 Thread matthew.fowler
Unsubscribe

Re: Please Unsubscribe from the mailing list

2017-08-24 Thread Manoj Agrawal
Hi Team, Please unsubscribe me from the mailing list. Thanks, Manoj On Thu, Aug 24, 2017 at 3:55 AM, Manoj Agrawal wrote: > Hi Team, > > Please unsubscribe me from the mailing list. > > Thanks, > Manoj >

Please Unsubscribe from the mailing list

2017-08-24 Thread Manoj Agrawal
Hi Team, Please unsubscribe me from the mailing list. Thanks, Manoj

Re: EdgeNGramFilterFactory More specific ?

2017-08-24 Thread Guilleret Florian
Thanks, NGramFilter works perfectly for my case ;) Guilleret Florian Tel : +33 6 21 28 43 06 2017-08-24 10:41 GMT+02:00 Markus Jelsma : > NGramFilter! > > > > -Original message- > > From:Guilleret Florian > > Sent: Thursday 24th August 2017 10:30 > > To: sol

RE: EdgeNGramFilterFactory More specific ?

2017-08-24 Thread Markus Jelsma
NGramFilter! -Original message- > From:Guilleret Florian > Sent: Thursday 24th August 2017 10:30 > To: solr-user@lucene.apache.org > Subject: EdgeNGramFilterFactory More specific ? > > I use a fieldtype who use EdgeNGramFilterFactory min 1 max 15. It work > perfectly and i got this b

EdgeNGramFilterFactory More specific ?

2017-08-24 Thread Guilleret Florian
I use a fieldtype that uses EdgeNGramFilterFactory with min 1 and max 15. It works perfectly and I get this behavior: for a string like DU2083 I get: D DU DU2 DU20 DU208 DU2083. That is OK, but I need to go deeper. I want this behavior: DU2083 should be split like: D DU DU2 DU20 DU208 DU2083 3 83 083 2083
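The gap between the two behaviors can be seen with a small plain-Python sketch. It is not Lucene's filter code and makes no claim about the order in which Solr emits tokens; the function names `edge_ngrams` and `all_ngrams` are hypothetical. Edge n-grams are anchored at the front of the token, while full n-grams slide over every start position, so they also contain the trailing pieces asked for above.

```python
def edge_ngrams(token, min_size, max_size):
    """Prefixes of the token, from min_size up to max_size characters."""
    upper = min(max_size, len(token))
    return [token[:n] for n in range(min_size, upper + 1)]


def all_ngrams(token, min_size, max_size):
    """Every substring of the token with length between min_size and max_size."""
    upper = min(max_size, len(token))
    grams = []
    for n in range(min_size, upper + 1):
        for start in range(len(token) - n + 1):
            grams.append(token[start:start + n])
    return grams


print(edge_ngrams("DU2083", 1, 15))
# ['D', 'DU', 'DU2', 'DU20', 'DU208', 'DU2083']

grams = all_ngrams("DU2083", 1, 15)
print(all(piece in grams for piece in ["3", "83", "083", "2083"]))  # True
```

Note that full n-grams also produce inner pieces such as `U20`, so the index grows faster than with edge n-grams alone.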