Solr old log files are not archived or removed automatically.

2015-08-10 Thread Adrian Liew
Hi there, I am using Solr v.5.2.1 on my local machine. I realized that old log files are not removed in a timely manner by log4j. The logs which I am referring to are the log files that reside within solr_directory\server\logs. So far I have previous two months' worth of log files accumulated i

docFreq vs no docs returned

2015-08-10 Thread Martin Leopold
Hi, I'm new to Solr and I'm trying to understand how relevancy is computed in Solr. I've run across the following difference that puzzles me: I'm seeing a difference between the "docFreq" returned by the debugQuery output and the number of documents for that search term. As I understand docFreq it c

Exception on updating Solr Documents

2015-08-10 Thread viks1234
Hi All, I am facing an issue updating Solr documents. I am using SolrJ and the code works fine if the document is present. However, if the document is not present and we try to update, an exception is thrown. I can handle the exception for a single document update, but sometimes I require to updat

Renaming Solr webapp in release 5.2.1

2015-08-10 Thread saurabh tewari
Hi, Is there any way through which one can change the name of solr webapp? I used to name my instances differently according to their usage and I'd like to keep doing that, since I've mapped the solr APIs in my code using those names only. Regards, Saurabh Tewari

Re: docFreq vs no docs returned

2015-08-10 Thread Mikhail Khludnev
Hello Martin, I think two docs have been deleted. Try to force merge to 1 segment and repeat the observation. Usually numFound(*:*) = maxDocs - numDeleted. On Mon, Aug 10, 2015 at 11:05 AM, Martin Leopold wrote: > Hi, > I'm new to Solr and I'm trying to understand how relevancy is computed > in Sol
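
A rough illustration of the arithmetic in this reply, using made-up numbers. The function name is hypothetical; the only facts assumed are the formula quoted above and that a term's docFreq can still count deleted documents until their segments are merged.

```python
def live_doc_count(max_docs: int, num_deleted: int) -> int:
    """Documents still visible to searches: maxDocs minus deleted docs."""
    return max_docs - num_deleted


# Made-up numbers: 10 docs were written to the index, 2 are marked deleted.
# docFreq for a term may still include the 2 deleted docs until a merge
# removes them, so it can be larger than the number of hits returned.
max_docs, num_deleted = 10, 2
print(live_doc_count(max_docs, num_deleted))
```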

Re: Renaming Solr webapp in release 5.2.1

2015-08-10 Thread Modassar Ather
Maybe you can look at solr-jetty-context.xml present under server/context folder of solr. Regards, Modassar On Mon, Aug 10, 2015 at 2:42 PM, saurabh tewari wrote: > Hi, > > Is there any way through which one can change the name of solr webapp? I > used to name my instances differently accordin

Re: Streaming API running a simple query

2015-08-10 Thread Selvam
Hi, Thanks, that seems to be working! On Sat, Aug 8, 2015 at 9:28 PM, Joel Bernstein wrote: > This sounds doable using nested merge functions like this: > > merge(search(...), >merge(search(...), search(),...), ...) > > Joel Bernstein > http://joelsolr.blogspot.com/ > > On Sat, Aug

Re: Exception on updating Solr Documents

2015-08-10 Thread Mikhail Khludnev
Are you talking about https://issues.apache.org/jira/browse/SOLR-445 ? On Mon, Aug 10, 2015 at 11:11 AM, viks1234 wrote: > Hi All, > > I am facing an issue on updating solr Documents. > I am using SolrJ and the code works fine if the document is present . > However if the document is not present an

Make search faster in Solr

2015-08-10 Thread Nitin Solanki
Hi, I have 32 shards, with a single replica of each shard, on 4 nodes in SolrCloud. I have indexed 16 million documents. Without cache, the total time taken to search a document is 0.2 seconds. With cache it is 0.04 seconds. I don't do anything with the cache. Caches are set by default in solrconfi

Re: Exception on updating Solr Documents

2015-08-10 Thread viks1234
Hi, I guess the issue is mostly the same. I get the issue when I add several documents for update. So what is the final solution for this? -- View this message in context: http://lucene.472066.n3.nabble.com/Exception-on-updating-Solr-Documents-tp4222114p4222145.html Sent from the Solr - User mail

Changing solr.Date to solr.TrieDate

2015-08-10 Thread saurabh tewari
Hi, I recently started to move my Solr index from 4.8 to 5.2 via replication from a 4.8 master to a 5.2 slave. I replaced all the primitive fieldtypes to Trie-fields, since the original ones are removed from 5.2 . Replication was successful, but when I fired a query on the slave, it showed a class-cast ex

Is cache enabled by default?

2015-08-10 Thread Nitin Solanki
Hi, 1) I have commented out "querycache, filterquerycache and document cache". Searching still uses a cache. Why? 2) The first time a query is searched it takes time, and afterwards it doesn't because of the cache; I know that. But how can I make search fast even the first time?

Re: Is cache enabled by default?

2015-08-10 Thread davidphilip cherian
Hi Nitin, You can just set the attributes of the caches to zero: size="0" initialSize="0" autowarmCount="0" and so on. Why do you want to turn off caches, btw? Any specific reasons? IMO, documents are cached in the OS disk cache, which you may not be able to control. It is OS specific. I don't quite

Re: Changing solr.Date to solr.TrieDate

2015-08-10 Thread davidphilip cherian
Hi Saurabh, You could probably try the command=fetchindex functionality: http://node:port/solr//replication?command=fetchindex&masterUrl=http://node:port/solr/ The master URL should be the URL of the Solr instance that holds the existing index. On Mon, Aug 10, 2015 at 6:37 PM, saurabh tewari wrote: > Hi, > > I recently s
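
A minimal sketch of assembling the fetchindex request described in this reply. The host names and the core name `mycore` are hypothetical placeholders, and the helper `fetchindex_url` is not part of Solr. Only the `command=fetchindex` and `masterUrl` parameters come from the message, and the exact form the master URL should take may depend on the Solr version.

```python
from urllib.parse import urlencode


def fetchindex_url(receiving_core_url: str, master_core_url: str) -> str:
    """Build a replication fetchindex URL for the core that should pull the index."""
    # urlencode percent-escapes the nested master URL so it survives as one parameter.
    query = urlencode({"command": "fetchindex", "masterUrl": master_core_url})
    return f"{receiving_core_url.rstrip('/')}/replication?{query}"


# Hypothetical nodes and core name, for illustration only.
url = fetchindex_url(
    "http://new-node:8983/solr/mycore",
    "http://old-node:8983/solr/mycore",
)
print(url)
```

The request is sent to the instance that should receive the index; `masterUrl` points at the instance that already holds it.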

New Solr installation fails to create collection/core

2015-08-10 Thread Scott Derrick
I've sent this twice and it appears the list is dropping it. I think including the entire log output exceeds the size limit, so I've truncated the log in hopes it will get through. I installed Solr 5.2.1 on CentOS 6: got the tgz file, extracted the install_solr_service.sh file, ran the instal

Re: Make search faster in Solr

2015-08-10 Thread davidphilip cherian
Hi Nitin, 32 shards for 16 million documents is too much. 2 shards should suffice considering your document sizes are moderate. Caches are to be monitored and tuned accordingly. You should study about caches a bit here https://cwiki.apache.org/confluence/display/solr/Query+Settings+in+SolrConfig

Re: plagiarism Checker with solr

2015-08-10 Thread Jack Krupansky
The simplest and maybe best approach is to use the edismax query parser and query all terms using the OR operator and use the pf, pf2, and pf3 parameters to boost phrases so that the closest matches rank higher. No need to do any special indexing. You can tune the ps, ps2, and ps3 parameters as
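
A small sketch of the request parameters this approach implies. In edismax the phrase-boost parameters are named `pf`, `pf2`, and `pf3`, with matching slop parameters `ps`, `ps2`, and `ps3`. The field name `content`, the slop values, and the helper name are assumptions made for illustration, not recommendations from the thread.

```python
from urllib.parse import urlencode


def phrase_boost_params(text: str, field: str = "content") -> dict:
    """Query parameters for an OR query with whole-phrase, bigram, and trigram boosts."""
    return {
        "defType": "edismax",
        "q": text,
        "q.op": "OR",   # match any term; the phrase boosts decide the ranking
        "qf": field,
        "pf": field,    # boost docs containing the whole query as a phrase
        "pf2": field,   # boost docs containing adjacent word pairs
        "pf3": field,   # boost docs containing adjacent word triples
        "ps": "2",      # slop values are arbitrary starting points to tune
        "ps2": "1",
        "ps3": "1",
    }


params = phrase_boost_params("the quick brown fox jumps")
print(urlencode(params))
```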

Re: New Solr installation fails to create collection/core

2015-08-10 Thread Erick Erickson
Yeah, be sure to send mail as "plain text", the spam filters are pretty aggressive when it comes to anything else. At a glance this is your problem: .AccessDeniedException: /var/solr/data/demo/data I'm guessing the user you run Solr as doesn't have create permissions there. Best, Erick On Mon,

Re: Changing solr.Date to solr.TrieDate

2015-08-10 Thread Erick Erickson
bq: I replaced all the primitive fieldtypes to Trie-fields, since the original ones are removed from 5.2 . You have to explain this better. What were "the original ones"? pdate? If so, you need to re-index from scratch after changing from pdate to tdate. Simply changing the definitions in the sch

Re: Exception on updating Solr Documents

2015-08-10 Thread Erick Erickson
This has never been really satisfactory, when sending a bunch of docs to Solr it's unclear which ones fail or even which ones were committed. Most people simply re-send the docs either in smaller and smaller batches or one-at-a-time when a batch fails. It's tricky, especially in the asynch case. I
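
The "smaller and smaller batches" strategy described in this reply can be sketched as a recursive split. This is only an illustration under stated assumptions: `send_batch` is a hypothetical callable standing in for whatever client call sends documents and raises on failure, and the fake sender below exists purely for the demo.

```python
from typing import Callable, List, Sequence


def index_with_bisect(
    docs: Sequence[dict],
    send_batch: Callable[[Sequence[dict]], None],
) -> List[dict]:
    """Send docs as one batch; on failure, halve the batch and retry each half.

    Returns the documents that still fail when sent one at a time.
    """
    if not docs:
        return []
    try:
        send_batch(docs)
        return []
    except Exception:
        if len(docs) == 1:
            return list(docs)  # isolated a single failing document
        mid = len(docs) // 2
        return index_with_bisect(docs[:mid], send_batch) + index_with_bisect(
            docs[mid:], send_batch
        )


# Demo with a fake sender that rejects any batch containing a "bad" document.
accepted: List[dict] = []


def fake_send(batch: Sequence[dict]) -> None:
    if any(d.get("bad") for d in batch):
        raise ValueError("batch rejected")
    accepted.extend(batch)


docs = [{"id": 1}, {"id": 2}, {"id": 3, "bad": True}, {"id": 4}]
failed = index_with_bisect(docs, fake_send)
print([d["id"] for d in failed], [d["id"] for d in accepted])
```

This assumes a rejected batch is not partially applied; as the reply notes, real behavior is less clear-cut, so re-sending documents needs to be safe.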

Re: Solr old log files are not archived or removed automatically.

2015-08-10 Thread Erick Erickson
1> how did you install/run your Solr? As a service or "regular"? See the reference guide, "Permanent Logging Settings" for some info on the difference there. 2> what does your log4j.properties file look like? Best, Erick On Mon, Aug 10, 2015 at 12:13 AM, Adrian Liew wrote: > Hi there, > > I am

Cluster down for long time after zookeeper disconnection

2015-08-10 Thread danny teichthal
Hi, We are using SolrCloud with Solr 4.10.4. Over the past week we encountered a problem where all of our servers disconnected from the ZooKeeper cluster. This might be OK; the problem is that after reconnecting to ZooKeeper it looks like, for every collection, both replicas do not have a leader and are

Re: New Solr installation fails to create collection/core

2015-08-10 Thread Scott Derrick
I ran the command bin/solr create -c demo as root! Not sure how I can raise my permissions any more? Scott Original Message Subject: Re: New Solr installation fails to create collection/core From: Erick Erickson To: solr-user@lucene.apache.org Date: 08/10/2015 09:23 AM Yea

Re: Concurrent Indexing and Searching in Solr.

2015-08-10 Thread Erick Erickson
bq: I changed *2* to *100*. And apply simultaneous searching using 100 workers. Do not do this. This has nothing to do with the number of searcher threads. And with your update rate, especially if you continue to insist on adding commit=true to every update request, this will explode your memory

Re: New Solr installation fails to create collection/core

2015-08-10 Thread Alexandre Rafalovitch
> Setup new core instance directory: > /var/solr/data/demo > ... > Failed to create core 'demo' due to: Error CREATEing SolrCore 'demo': Unable > to create core [demo] Caused by: /var/solr/data/demo/data Was one of these entries typed by hand? Because I see 'data/demo' and 'demo/data'. Which does

Re: Cluster down for long time after zookeeper disconnection

2015-08-10 Thread Alexandre Rafalovitch
Did you look at release notes for Solr versions after your own? I am pretty sure some similar things were identified and/or resolved for 5.x. It may not help if you cannot migrate, but would at least give a confirmation and maybe workaround on what you are facing. Regards, Alex. Solr Anal

Re: New Solr installation fails to create collection/core

2015-08-10 Thread Scott Derrick
No, I copied that from the console. I haven't changed any config files, just installed using the install_solr_service.sh file. I've never used Solr before and it seems like it's broken right out of the box? Scott Original Message Subject: Re: New Solr installation fails t

Re: New Solr installation fails to create collection/core

2015-08-10 Thread Shawn Heisey
On 8/10/2015 10:21 AM, Scott Derrick wrote: > No I copied that from the console. > > I haven't changed any config files, just installed using the > install_solr_service.sh file > > I've never used Solr before and it seems like its broken right out of > the box? I think you're running into the prob

Re: Cluster down for long time after zookeeper disconnection

2015-08-10 Thread danny teichthal
Hi Alexander, Thanks for your reply; I looked at the release notes. There is one bug fix - SOLR-7503 – register cores asynchronously. It may reduce the registration time since it is done in parallel, but still, 3 minutes (leaderVoteWait) is a long

Re: Cluster down for long time after zookeeper disconnection

2015-08-10 Thread Erick Erickson
I didn't see the zk timeout you set (just skimmed). But if your Zookeeper was down _very_ temporarily, it may suffice to up the ZK timeout. The default in the 4.10 time-frame (if I remember correctly) was 15 seconds which has proven to be too short in many circumstances. Of course if your ZK was

Re: Cluster down for long time after zookeeper disconnection

2015-08-10 Thread danny teichthal
Erick, I assume you are referring to zkClientTimeout, it is set to 30 seconds. I also see these messages on Solr side: "Client session timed out, have not heard from server in 48865ms for sessionid 0x44efbb91b5f0001, closing socket connection and attempting reconnect". So, I'm not sure what was th

Re: New Solr installation fails to create collection/core

2015-08-10 Thread Scott Derrick
Shawn, That was it exactly! Thanks, Scott Original Message Subject: Re: New Solr installation fails to create collection/core From: Shawn Heisey To: solr-user@lucene.apache.org Date: 08/10/2015 11:08 AM On 8/10/2015 10:21 AM, Scott Derrick wrote: No I copied that f

Re: SolrNet and deep pagination

2015-08-10 Thread Chris Hostetter
: Has anyone worked with deep pagination using SolrNet? The SolrNet : version that I am using is v0.4.0.2002. I followed up with this article, : https://github.com/mausch/SolrNet/blob/master/Documentation/CursorMark.md : , however the version of SolrNet.dll does not expose a StartOrCursor

codec mismatch

2015-08-10 Thread Pavel Hladik
Hi, I would like to upgrade from Solr 4.10.3 to Solr 5.2.1 and I have 6 cores all the same version 4.10.3. When I start 5.2.1, 5 cores are started correctly and 1 core (most important 130M docs) fails. I got the message: Error opening new searcher, because: org.apache.lucene.index.CorruptIndexExc

Re: Cluster down for long time after zookeeper disconnection

2015-08-10 Thread Erick Erickson
Not that I know of. With ZK as the "one source of truth", dropping below quorum is Really Serious, so having to wait 3 minutes or so for action to be taken is the fallback. Best, Erick On Mon, Aug 10, 2015 at 1:34 PM, danny teichthal wrote: > Erick, I assume you are referring to zkClientTimeout,

Filter Out Facet Results

2015-08-10 Thread Paden
Hello, I'm trying to figure out how to filter particular facets out of my results. I'm doing some Named Entity Extraction and putting the entities up as faceting information. However, not all the results I get are exact. For example, the string "w 5th street" will appear in the "Person" facet list. Th

date field in the schema causing a problem

2015-08-10 Thread Scott Derrick
I'm attempting to index my data, which are autogenerated HTML documents, the source being TEI (XML) documents. I created the collection with sudo su - solr -c "/opt/solr/bin/solr create -c mbepp -n data_driven_schema_configs" I'm indexing with find . -mindepth 2 -not -name "person_*.*" -not

SOLR Physical Memory leading to OOM

2015-08-10 Thread rohit
Hi All, I have just started a new project using SolrCloud, and during my performance testing I have been hitting OOM issues. What I notice most is that physical memory keeps increasing and never returns to its original state. I'm indexing 10 million documents and have 4 nodes as lea

Re: date field in the schema causing a problem

2015-08-10 Thread Chris Hostetter
: : : : Most documents have a correctly formatted date string and I would like to keep : that data available for search on the date field. ... : I realize it is complaining because the date string isn't matching the : data_driven_schema file. How can I coerce it into allowing the non-st

Re: SOLR Physical Memory leading to OOM

2015-08-10 Thread Shawn Heisey
On 8/10/2015 5:05 PM, rohit wrote: > I have just started with a new project and using solrCloud and during > my performance testing, i have been finding OOM issues. The thing > which i notice more is physical memory keeps on increasing and never > comes to original state. > > I'm indexing 10 milli

Re: date field in the schema causing a problem

2015-08-10 Thread Scott Derrick
Chris, mucho thanks. The solr.RegexReplaceProcessorFactory looks like what I need. What a fantastic search engine this is! thanks again, Scott On 8/10/2015 5:21 PM, Chris Hostetter wrote: : : : : Most documents have a correctly formatted date string and I would like to keep : that data

Re: SOLR Physical Memory leading to OOM

2015-08-10 Thread rohit
Thanks Shawn. I was looking at the Solr Admin UI and also using the top command on the server. I'm running an endurance test for 4 hr at 50 TPS, and I see the physical memory keep increasing during that time. We also have a scheduled delta import during that time frame, which can import up to 4 million docs. Af

Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-10 Thread deniz
I have a simple 2-node (5.1) cloud env with 6 different cores. One of the cores (CoreB) has an update issue which I am aware of, but in the error logs on Solr I am seeing the following: ERROR - 2015-08-11 05:04:34.592; [*CoreA shard1 core_node2 CoreA*] org.apache.solr.update.StreamingSolrClients$1; er

Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-10 Thread Anshum Gupta
Hi Deniz, Seems like the update that's being forwarded from a non-leader (original node that received the request) is failing. This could be due to multiple reasons, including issue with your schema vs document that you sent. To elaborate more, here's how a typical batched request in SolrCloud wo

Re: Concurrent Indexing and Searching in Solr.

2015-08-10 Thread Nitin Solanki
Hi Erick, Thanks a lot for your help. I will go through MongoDB. On Mon, Aug 10, 2015 at 9:14 PM Erick Erickson wrote: > bq: I changed > *2* > to *100*. And apply simultaneous > searching using 100 workers. > > Do not do this. This has nothing to do with the number of searcher

Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-10 Thread deniz
Hello Anshum, thanks for the quick reply. I know it is being forwarded from one node to the leader node, but for collection names, it shows different collections while the master node address is correct. Dunno if I am missing some points, but my concern is the bold parts below: ERROR - 2015-08-11 05:04:34

Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-10 Thread Anshum Gupta
How did you create your collections? Also, is that verbatim from the logs or is it just because you obfuscated that part while posting it here? On Mon, Aug 10, 2015 at 11:02 PM, deniz wrote: > Hello Anshum, > > thanks for the quick reply > > I know it is being forwarded one node to the leader no

Re: Make search faster in Solr

2015-08-10 Thread Nitin Solanki
Okay davidphilip. On Mon, Aug 10, 2015 at 8:24 PM davidphilip cherian < davidphilipcher...@gmail.com> wrote: > Hi Nitin, > > 32 shards for 16 million documents is too much. 2 shards should suffice > considering your document sizes are moderate. Caches are to be monitored > and tuned accordingly.

Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-10 Thread deniz
I created them by simply creating configs and then using upconfig to upload them to ZooKeeper, then adding it on the admin interface of Solr. I have only changed the IPs of server and server1 and changed the core/collection names to CoreA and CoreB; in the logs CoreA and CoreB are different collections wit

Re: Make search faster in Solr

2015-08-10 Thread Nitin Solanki
Hi davidphilip, Without caching, can we do fast searching? On Tue, Aug 11, 2015 at 11:43 AM Nitin Solanki wrote: > Okay davidphilip. > > On Mon, Aug 10, 2015 at 8:24 PM davidphilip cherian < > davidphilipcher...@gmail.com> wrote: > >> Hi Nitin, >> >> 32 shards for 16 mill

Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-10 Thread Anshum Gupta
bq. adding it on admin interface of solr Did you not use the Collections Admin API? If you try to create your own cores using the Core Admin APIs instead of the Collections Admin API, you could really end up shooting yourself in the foot. Also, the only supported mechanism to create a collection in
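
For comparison with the Add Core route, here is a hedged sketch of what a Collections API CREATE request looks like. The host name, shard count, and replication factor are assumptions chosen to resemble the 2-node setup in this thread; `CoreA` reuses the collection and config names from the messages, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def create_collection_url(
    solr_base_url: str,
    name: str,
    config_name: str,
    num_shards: int = 1,
    replication_factor: int = 2,
) -> str:
    """Build a Collections API CREATE request URL."""
    query = urlencode(
        {
            "action": "CREATE",
            "name": name,
            "numShards": num_shards,
            "replicationFactor": replication_factor,
            # Name of the config set previously uploaded to ZooKeeper (e.g. via upconfig).
            "collection.configName": config_name,
        }
    )
    return f"{solr_base_url.rstrip('/')}/admin/collections?{query}"


# Hypothetical host; collection and config names follow the thread.
create_url = create_collection_url("http://someserver:8983/solr", "CoreA", "CoreA")
print(create_url)
```

With this route SolrCloud places the cores and registers them in the cluster state itself, rather than each core being added by hand.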

Re: Core mismatch in org.apache.solr.update.StreamingSolrClients Errors for ConcurrentUpdateSolrClient

2015-08-10 Thread deniz
okay, to make everything clear, here are the steps: - Creating configs etc and then running: ./zkcli.sh -cmd upconfig -n CoreA -d /path/to/core/configs/CoreA/conf/ -z zk1:2181,zk2:2182,zk3:2183 - Then going to http://someserver:8983/solr/#/~cores - Clicking Add Core: