Re: Purging unused segments.

2013-08-09 Thread Robert Muir
On Fri, Aug 9, 2013 at 7:48 PM, Erick Erickson wrote: > > So is there a good way, without optimizing, to purge any segments not > referenced in the segments file? Actually I doubt that optimizing would > even do it if I _could_, any phantom segments aren't visible from the > segments file anyway..

Re: Question about filter query: "half" of my index is slower than the other?

2013-08-09 Thread Erick Erickson
To add to what Shawn said, this filterCache is enormous. The key statistics are the hit ratio and evictions. Evictions aren't bad if the hit ratio is high. If hit ratio is low and evictions are high, only then should you consider making it larger. So I'd drop it back to 512. Hit ratios around 75%
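The rule of thumb above (only grow the cache when the hit ratio is low and evictions are high) can be sketched in a few lines. This is an illustration only: `CacheStats`, `should_grow`, and both thresholds are hypothetical names and values, not part of Solr; the numbers would be read by hand from the cache statistics page.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    """Counters copied by hand from a cache statistics page (hypothetical holder)."""
    lookups: int
    hits: int
    evictions: int

    @property
    def hit_ratio(self) -> float:
        # Avoid dividing by zero on a cache that has not been used yet.
        return self.hits / self.lookups if self.lookups else 0.0


def should_grow(stats: CacheStats, min_hit_ratio: float, max_evictions: int) -> bool:
    """Suggest a larger cache only when the hit ratio is low AND evictions are high.

    Both thresholds are assumptions supplied by the caller.
    """
    return stats.hit_ratio < min_hit_ratio and stats.evictions > max_evictions


# High evictions but a good hit ratio: leave the cache size alone.
print(should_grow(CacheStats(lookups=1000, hits=900, evictions=5000), 0.5, 100))
```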

Re: Unable to load com.microsoft.sqlserver.jdbc.SQLServerDriver

2013-08-09 Thread PeterKerk
Had to add one more "../" and now it works indeed... thanks for noticing! :)

Re: Unable to load com.microsoft.sqlserver.jdbc.SQLServerDriver

2013-08-09 Thread Erick Erickson
placed sqljdbc4.jar in folder: \solr-4.3.1\example\lib\ in my solrconfig.xml I have: Hold on. You put the jar file in one place (solr-4.3.1/example/lib), then told Solr to look for it in another (../../../dist/). And even if I'm mis-reading that part, the lib directive right above it has on

Purging unused segments.

2013-08-09 Thread Erick Erickson
I have a situation in which I can't safely optimize: the index is on a machine that doesn't have enough disk space. I have no control over the hardware. There are some raw statistics on this index that don't make sense; it's roughly twice the size of a similar index (could be legit, but it seems ou

Shard splitting failure, with and without composite hashing

2013-08-09 Thread Greg Preston
Howdy, I'm trying to test shard splitting, and it's not working for me. I've got a 4 node cloud with a single collection and 2 shards. I've indexed 170k small documents, and I'm using the compositeId router, with an internal "client id" as the shard key, with 4 distinct values across the data se

Re: Filter search items based on creator permission settings

2013-08-09 Thread Chris Hostetter
: In-Reply-To: : : References: : : Subject: Filter search items based on creator permission settings https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead

Re: Percolate feature?

2013-08-09 Thread Roman Chyla
On Fri, Aug 9, 2013 at 2:56 PM, Chris Hostetter wrote: > > : I'll look into this. Thanks for the concrete example as I don't even > : know which classes to start to look at to implement such a feature. > > Either roman isn't understanding what you are asking for, or i'm not -- > but i don't think

RE: Sharding and Replication

2013-08-09 Thread Alexey Kozhemiakin
+1 I'd like to vote for this issue https://issues.apache.org/jira/browse/SOLR-4956 It would be useful to have these parameters configurable. When we index hundreds of millions of documents to a 4-shard SolrCloud in batches of 20K, the overhead of this chatty conversation with replicas and other sh

Re: Percolate feature?

2013-08-09 Thread Jack Krupansky
I thought about that suggested doc/query model, but... Do you really want a query of "Sony xbox" or "Sony ipad" or even "Sony Samsung" to match document "Sony"? Seems quite odd. -- Jack Krupansky -Original Message- From: Chris Hostetter Sent: Friday, August 09, 2013 2:56 PM To: solr

Problem running Solr indexing in Amazon EMR

2013-08-09 Thread Dmitriy Shvadskiy
Hello, We are trying to utilize Amazon Elastic Map Reduce to build Solr indexes. We are using embedded Solr in the Reduce phase to create the actual index. However, we run into the following error and are not sure what is causing it. Solr version is 4.4. The job runs fine locally in Cloudera CDH 4.3 VM T

Re: Percolate feature?

2013-08-09 Thread Chris Hostetter
: I'll look into this. Thanks for the concrete example as I don't even : know which classes to start to look at to implement such a feature. Either roman isn't understanding what you are asking for, or i'm not -- but i don't think what roman described will work for you... : > so if your query

Re: Solr on glassfish with multiple nodes - problem in data import

2013-08-09 Thread Shawn Heisey
On 8/9/2013 12:24 PM, kaustubh147 wrote: We have Solr installed on Glassfish cluster which has 4 nodes and we have a single solr.data directory which is shared among all 4 nodes. This doesn't work well at all. Solr expects exclusive access to the Lucene index, and if you have more than one se

Re: Environment Timezone considered When using SolrJ

2013-08-09 Thread Shawn Heisey
On 8/9/2013 12:20 PM, Chris Hostetter wrote: : If you index a Java date object instead of the text format, it will be : valid in the timezone at the client, and SolrJ will do timezone : translation before sending to Solr. Solr will index/store the date in UTC. this is misleading -- SolrJ does

Solr on glassfish with multiple nodes - problem in data import

2013-08-09 Thread kaustubh147
Hi, We have Solr installed on Glassfish cluster which has 4 nodes and we have a single solr.data directory which is shared among all 4 nodes. When I trigger full data import on one of the cores on the server, because it is a http request, it goes to one of the nodes on the cluster. Now after the

Re: Environment Timezone considered When using SolrJ

2013-08-09 Thread Chris Hostetter
: If you index a Java date object instead of the text format, it will be : valid in the timezone at the client, and SolrJ will do timezone : translation before sending to Solr. Solr will index/store the date in UTC. this is misleading -- SolrJ doesn't do any sort of "timezone translation" ...

Re: external zookeeper with SolrCloud

2013-08-09 Thread Shawn Heisey
On 8/9/2013 11:15 AM, Joshi, Shital wrote: Same thing happens. It only works with N/2 + 1 zookeeper instances up. Got it. An update came in on the issue that I filed. This behavior that you're seeing is currently by design. Because this is expected behavior, I've changed the issue to improv
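The "N/2 + 1" figure quoted above is the usual majority-quorum arithmetic, with the division rounded down. A minimal sketch, assuming N is the configured ensemble size; `quorum_size` and `has_quorum` are hypothetical helper names, not ZooKeeper or Solr APIs:

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of instances that forms a majority: floor(N / 2) + 1."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size: int, instances_up: int) -> bool:
    """True when enough instances are running to form a majority."""
    return instances_up >= quorum_size(ensemble_size)


# With the three-instance ensemble mentioned in this thread,
# two instances must be up before Solr nodes are started.
print(quorum_size(3), has_quorum(3, 1), has_quorum(3, 2))
```

A pre-start check along these lines could gate the Solr startup script, which is the kind of guard the original poster describes needing.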

Re: Question about filter query: "half" of my index is slower than the other?

2013-08-09 Thread Shawn Heisey
On 8/9/2013 9:36 AM, Neal Ensor wrote: I have an 8 million document solr index, roughly divided down the middle by an identifying "product" value, one of two distinct values. The documents in both "sides" are very similar, with stored text fields, etc. I have two nearly identical request handle

Re: Question about filter query: "half" of my index is slower than the other?

2013-08-09 Thread Neal Ensor
It seems (from observation only) that most of the documents on both sides of this equation have the same "weights". I don't see any wide swaths of unpopulated fields on the "good" side. Just wondering if there's some caching involved that I'm missing here, or something I can balance out better...

RE: external zookeeper with SolrCloud

2013-08-09 Thread Joshi, Shital
Same thing happens. It only works with N/2 + 1 zookeeper instances up. -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent: Friday, August 09, 2013 11:22 AM To: solr-user@lucene.apache.org Subject: Re: external zookeeper with SolrCloud On 8/9/2013 9:02 AM, Joshi, Shita

Re: Percolate feature?

2013-08-09 Thread Walter Underwood
All of the query words must match, right? So this is a phrase query in edismax with mm=100%. We have suggestions for exactly matching a whole field, but you need "samsung galaxy" to match the document "samsung galaxy s4". That means you do not need an exact match on the field. If you do need t

Re: Percolate feature?

2013-08-09 Thread Mark
I'll look into this. Thanks for the concrete example as I don't even know which classes to start to look at to implement such a feature. On Aug 9, 2013, at 9:49 AM, Roman Chyla wrote: > On Fri, Aug 9, 2013 at 11:29 AM, Mark wrote: > >>> *All* of the terms in the field must be matched by the q

Re: Percolate feature?

2013-08-09 Thread Roman Chyla
On Fri, Aug 9, 2013 at 11:29 AM, Mark wrote: > > *All* of the terms in the field must be matched by the query, not > vice-versa. > > Exactly. This is why I was trying to explain it as a reverse search. > > I just realized I described it as a *large list of known keywords when > really it's small;

Version Conflict on Atomic Update

2013-08-09 Thread Bruno René Santos
Using the document interface on the Solr admin I try to update the following document: { "responseHeader": { "status": 0, "QTime": 1, "params": { "indent": "true", "q": "*:*", "_": "1376064413493", "wt": "json" } }, "response": { "numFound": 1, "start": 0, "docs": [ { "id": "change.me", "author":

Re: JSON Update create different copies of the same document

2013-08-09 Thread Jack Krupansky
You're getting yourself very, very confused. And then you're using the source code to confuse yourself even more! Sigh. First, you wouldn't (shouldn't) use atomic update for "loading" batches of documents. Atomic update is for selectively updating a subset of the fields of existing documents. If
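To make the distinction above concrete, here is a rough sketch of what an atomic-update document looks like in Solr 4's JSON update syntax: the unique key is sent as a plain value, and each changed field is wrapped in a modifier such as "set". The helper name `atomic_update` is hypothetical, the "id" and "author" fields are borrowed from the example document elsewhere in this listing, and the new author value is made up.

```python
import json


def atomic_update(doc_id: str, changes: dict) -> dict:
    """Build one atomic-update document: plain id, each changed field wrapped in "set".

    Assumes the unique key field is named "id"; adjust to match the schema.
    """
    doc = {"id": doc_id}
    for field, value in changes.items():
        doc[field] = {"set": value}
    return doc


# Body that would be POSTed to the update handler as JSON.
payload = json.dumps([atomic_update("change.me", {"author": "New Author"})])
print(payload)
```

A regular (non-atomic) add would instead send every field as a plain value, replacing the whole document, which is the better fit for bulk loading.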

Re: Percolate feature?

2013-08-09 Thread Jack Krupansky
Starting with the presumption that Solr is a "search engine" for user queries, what exactly would a user query look like? Are you really requiring your users to enter long, carefully constructed, full length product titles?? What kind of application would force its users to do such a thing?

Re: JSON Update create different copies of the same document

2013-08-09 Thread Bruno René Santos
Hi, I think I found out what is really happening. When I try to do an atomic update the document id is transformed into a BytesRef (indexedId variable) on the org.apache.solr.update.AddUpdateCommand. But on line 726 of the org.apache.solr.update.processor.DistributedUpdateProcessor ( SolrInputDocum

Re: Question about filter query: "half" of my index is slower than the other?

2013-08-09 Thread Raymond Wiker
On Aug 9, 2013, at 17:36 , Neal Ensor wrote: > So, I have an oddball question I have been battling with in the last day or > two. > > I have an 8 million document solr index, roughly divided down the middle by > an identifying "product" value, one of two distinct values. The documents > in both

Question about filter query: "half" of my index is slower than the other?

2013-08-09 Thread Neal Ensor
So, I have an oddball question I have been battling with in the last day or two. I have an 8 million document solr index, roughly divided down the middle by an identifying "product" value, one of two distinct values. The documents in both "sides" are very similar, with stored text fields, etc. I

Re: Percolate feature?

2013-08-09 Thread Mark
> *All* of the terms in the field must be matched by the query, not > vice-versa. Exactly. This is why I was trying to explain it as a reverse search. I just realized I described it as a *large list of known keywords when really it's small; no more than 1000. Forgetting about performance how h

Re: Error while indexing in solrcloud

2013-08-09 Thread host@123
hi, Thanks for your reply. Please find my answers below: Do you send all your indexing requests to 1 node (though that doesn't really matter here)?: *Yes* are they all host:8080 or have you replaced some info there? *Yes, they are all port 8080.* You should be able to see in your Cloud Admin tha

Re: external zookeeper with SolrCloud

2013-08-09 Thread Shawn Heisey
On 8/9/2013 9:02 AM, Joshi, Shital wrote: > At this point, we cannot see admin page or query of any solr nodes unless we > restart entire cloud and after that everything is great. So we must put > checks to make sure that N/2 + 1 zookeeper instances are up before we can > bring up any solr nodes

RE: external zookeeper with SolrCloud

2013-08-09 Thread Joshi, Shital
Thanks so much for your reply. Appreciate your help with this. We have 10 Solr4 nodes (5 shards with replication factor 2) and three zookeeper instances. When we bring up 10 Solr4 nodes (while all zookeeper instances are down), we see this exception in Solr4 logs (which makes sense): java.net.Con

Re: Environment Timezone considered When using SolrJ

2013-08-09 Thread Shawn Heisey
On 8/9/2013 6:12 AM, sowja...@pointcross.com wrote: > When using SolrJ I've realized document dates are being modified according > to the environment UTC timezone. > > I have indexed the large amount of data on date fields of Solr (using Solr > 3.3). While retrieving this date using the SolrJ into

Re: Post Call to Solr RequestHandler

2013-08-09 Thread Shawn Heisey
On 8/9/2013 4:47 AM, Vineet Mishra wrote: > Currently I am working with RequestHandler in Solr, where the user defined > query is processed at the class specified by the requesthandler in > Solrconfig.xml. > > But my requirement is that I want to make it a Post call rather than a Get > query call.

Re: Transform data at index time: country -> continent

2013-08-09 Thread omu_negru
Hey, Since you're using solr and have access to the database in question did you consider making an extra index on the machine to hold your country to continent mapping ? I know it's more trouble than it's worth for such a small data set but hey, you get to set up another index :)

Re: Error while indexing in solrcloud

2013-08-09 Thread Daniel Collins
The shard update error in essence means the shard that received the update was trying to forward it on to the leader of that shard. Do you send all your indexing requests to 1 node (though that doesn't really matter here)? The error 503 normally means Solr is down at the remote end, are they all

Re: Spelling suggestions.

2013-08-09 Thread Jason Hellman
The majority of the behavior outlined in that wiki page should work quite sufficiently for 3.5.0. Note that there are only a few items that are marked Solr4.0 only (DirectSolrSpellChecker and WordBreakSolrSpellChecker, for example). On Aug 9, 2013, at 6:26 AM, Kamaljeet Kaur wrote: > Hello,

Re: Percolate feature?

2013-08-09 Thread Yonik Seeley
*All* of the terms in the field must be matched by the query, not vice-versa. And no, we don't have a query for that out of the box. To implement, it seems like it would require the total number of terms indexed for a field (for each document). I guess you could also index start and end tokens a
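The "reverse" matching semantics described above can be illustrated outside of Solr. The sketch below only demonstrates the rule (every term in the document's field must appear in the query, and the count of matched terms must equal the field's total term count); it is not how Lucene would implement it, and `field_covered_by_query` is a hypothetical name.

```python
def field_covered_by_query(field_terms: list, query_terms: list) -> bool:
    """True when every term of the document field also appears in the query.

    Mirrors the idea of comparing the number of matched terms against the
    total number of terms indexed for the field.
    """
    if not field_terms:
        return False  # assumption: an empty field never matches
    query = set(query_terms)
    matched = sum(1 for term in field_terms if term in query)
    return matched == len(field_terms)


# The query "sony xbox" covers the one-term document "sony",
# but the query "sony" does not cover the document "sony xbox".
print(field_covered_by_query(["sony"], ["sony", "xbox"]))
print(field_covered_by_query(["sony", "xbox"], ["sony"]))
```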

Unable to load com.microsoft.sqlserver.jdbc.SQLServerDriver

2013-08-09 Thread PeterKerk
I'm getting this error when trying to use the Data Import Handler via URL: http://localhost:8983/solr/1001/dataimport?command=full-import Caused by: java.lang.ClassNotFoundException: Unable to load com.microsoft.sqlserver.jdbc.SQLServerDriver or org.apache.solr.handler.dataimport.com.microsoft.sql

Error while indexing in solrcloud

2013-08-09 Thread Rachna
Hello everyone, I have a solrcloud of 3 shards and 1 replica each. The documents that I am trying to index have about 30 fields and there are more than a million docs. I am using solrJ to index and I am indexing after adding 5000 docs. I have also enabled softcommit with maxtime as 60 seconds. The ve

Spelling suggestions.

2013-08-09 Thread Kamaljeet Kaur
Hello, I have just configured apache-solr with my django project, and it's working fine with very simple and basic searching. I want to add spelling suggestions if the user misspells any word in the string entered. I searched for it in this particular mailing list. Many have given the link: http://

Re: JSON Update create different copies of the same document

2013-08-09 Thread Bruno René Santos
Hi, I just saw the overwrite option on the backoffice. I am loading the documents in 5000 document batches in JSON so I do not use this interface. How can I use this overwrite = true option in my environment? Or how does the solr admin interface translate this overwrite option into JSON update syntax? Regard

Re: Error loading class 'solr.DisMaxRequestHandler' after upgrade from solr350 to 431

2013-08-09 Thread Erick Erickson
This is probably a classpath problem. I'd guess that you have some old 3.5 jars lying around that are confusing the class loader. First thing I'd do is test with a clean 4.3 installation, then track down where the old jars are. Best Erick On Fri, Aug 9, 2013 at 5:42 AM, PeterKerk wrote: > I'

Re: First Indexing a postgres database

2013-08-09 Thread Erick Erickson
bq: Caused by: java.lang.ClassNotFoundException: solr.apache.solr.handler.dataimport.DataImportHandler So, it looks like part of your configuration uses DIH, but you haven't included the DIH jars in the classpath accessible by Solr running under Tomcat, so as you attempt to create the core, things

Re: Environment Timezone considered When using SolrJ

2013-08-09 Thread Jack Krupansky
What specific evidence do you have that the dates have been modified? All Solr dates are specified using the "Z" suffix, meaning GMT. -- Jack Krupansky -Original Message- From: sowja...@pointcross.com Sent: Friday, August 09, 2013 8:12 AM To: solr-user@lucene.apache.org Subject: Enviro
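As a small illustration of the "Z" convention mentioned above, the sketch below converts a timezone-aware timestamp to UTC and renders it in the ISO 8601 form with the "Z" suffix. It is written in Python rather than SolrJ purely for illustration; `to_solr_date` is a hypothetical helper and the UTC-6 offset is an arbitrary example.

```python
from datetime import datetime, timedelta, timezone


def to_solr_date(dt: datetime) -> str:
    """Render a timezone-aware datetime as UTC with the "Z" suffix."""
    if dt.tzinfo is None:
        # Refuse naive datetimes: guessing the zone is how values appear "modified".
        raise ValueError("datetime must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# 12:20 local time at UTC-6 is the same instant as 18:20 UTC.
local = datetime(2013, 8, 9, 12, 20, tzinfo=timezone(timedelta(hours=-6)))
print(to_solr_date(local))
```

The instant itself never changes; only the zone used to display it does, which is one common reason a retrieved date can look different from the one that was sent.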

Re: Problem with SolrCloud + Zookeeper + DataImportHandler

2013-08-09 Thread Erick Erickson
The mail programs usually strip out attachments, so your attachments didn't go through. Maybe put it on Pastebin or similar? But since you say it works on a single node, I wonder if one or more of your nodes has an old jar on it that's getting used. One could test it by trying to run your import

Re: Percolate feature?

2013-08-09 Thread Erick Erickson
This _looks_ like simple phrase matching (no slop) and highlighting... But whenever I think the answer is really simple, it usually means that I'm missing something Best Erick On Thu, Aug 8, 2013 at 11:18 PM, Mark wrote: > Ok forget the mention of percolate. > > We have a large list of kn

Environment Timezone considered When using SolrJ

2013-08-09 Thread sowja...@pointcross.com
Hi, When using SolrJ I've realized document dates are being modified according to the environment UTC timezone. I have indexed a large amount of data in Solr date fields (using Solr 3.3). When retrieving these dates with SolrJ into a SolrDocumentList, the original date value is modified.

SOLR 3.6.2 setup in Websphere 7.X

2013-08-09 Thread Thirukumaran - Mariappan
Hi Team, I recently tried setting up Solr in Tomcat. It works well without issues. I tried setting up SOLR 3.6.2 in Websphere 7.0.0.25. Deployed the solr war available in the Solr Zip - dist folder. Have referred the following forum as well.. http://wiki.apache.org/solr/SolrWebSphere Below men

Do docValues influence range faceting speed in solr?

2013-08-09 Thread omu_negru
Hello, From my understanding, docValues work like a persistent field cache, which is awesome when you have to sort on a field after matching a certain query (assuming you have docValues enabled for that field). That being the case, field faceting is indeed improved by having docVal

Re: Solr doesn't make indexes for all the enteries

2013-08-09 Thread Kamaljeet Kaur
On Wed, Aug 7, 2013 at 6:06 PM, Raymond Wiker [via Lucene] wrote: > If you want to see all results, you can either increase "rows", or run > multiple queries, increasing "offset" each time. Thanks, I got that. But where can it be done? I mean multiple queries?? Also I am getting the following e
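The "multiple queries" approach quoted above amounts to stepping through the result set one page at a time. In Solr's standard query parameters the offset is called `start` and the page size `rows`. The sketch below only builds the parameter sets; `page_params` is a hypothetical helper, and sending the requests (for example through a Django search backend) is left out.

```python
def page_params(query: str, rows: int, num_found: int) -> list:
    """Return one parameter dict per page needed to cover num_found results."""
    if rows < 1:
        raise ValueError("rows must be at least 1")
    return [
        {"q": query, "start": start, "rows": rows}
        for start in range(0, num_found, rows)
    ]


# 25 matching documents fetched 10 at a time take three requests.
for params in page_params("*:*", rows=10, num_found=25):
    print(params)
```

In practice `num_found` would come from the `numFound` value of the first response.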

Re: Solr doesn't make indexes for all the enteries

2013-08-09 Thread Kamaljeet Kaur
On Fri, Aug 9, 2013 at 1:22 PM, Kamal Kaur wrote: > "The model '' has an > empty model_attr 'company' and doesn't allow a default or null value." Complete traceback is here: http://tny.cz/0215ea04 -- Kamaljeet Kaur kamalkaur188.wordpress.com facebook.com/kaur.188

Filter search items based on creator permission settings

2013-08-09 Thread Mugoma Joseph O.
Hello, I have an application where document creators determine what access permission (s) to give. The permissions are of the form: 1. EVERYONE => 1 2. MY_FRIENDS => 2 3. ME_ONLY => 3 Example: 1. User 1 creates doc1 and sets permission to EVERYONE 2. User 2 created doc2 and sets permission to

Post Call to Solr RequestHandler

2013-08-09 Thread Vineet Mishra
Hi, Currently I am working with RequestHandler in Solr, where the user-defined query is processed by the class specified by the requestHandler in solrconfig.xml. But my requirement is that I want to make it a POST call rather than a GET query call. Is it possible or is there some way we can acco

Error loading class 'solr.DisMaxRequestHandler' after upgrade from solr350 to 431

2013-08-09 Thread PeterKerk
I'm in the process of upgrading from Solr 3.5.0 to 4.3.1. I see this in my log: Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.DisMaxRequestHandler' 1742 [coreLoadExecutor-3-thread-1] ERROR org.apache.solr.core.CoreContainer – null:org.apache.solr.common.SolrException: U

Re: First Indexing a postgres database

2013-08-09 Thread geoport
Hi, the error is from the client, I get it in the browser. I am creating my core by copying the collection1 example, renaming it to postgres_test and adding an item in solr.xml. That is my Catalina output: Aug 09, 2013 11:35:50 AM org.apache.solr.core.CachingDirectoryFactory closeCacheValue INFO: l

Reg Syncing Multiple Lucene Indexes

2013-08-09 Thread VIGNESH S
Hi, How can I sync multiple Lucene indexes and query them? My use case is that I am building indexes on multiple machines, mobiles, or tablets and need to query in a central place where I need to have all these indexes merged. Whenever a new file is indexed on a mobile, tablet, or desktop, I need to syn