Re: Term count in multivalue fields

2014-11-08 Thread Nickolay41189
"while indexing add a field containing the number" isn't suitable for my case. I can't add new field and do indexing. -- View this message in context: http://lucene.472066.n3.nabble.com/Term-count-in-multivalue-fields-tp4168138p4168400.html Sent from the Solr - User mailing list archive at Nabb

Re: on regards to Solr and NoSQL storages integration

2014-11-08 Thread Jack Krupansky
There is no "double storage" of data - the Solr index for DataStax Enterprise ignores the "stored" attribute and only stores the primary key data to allow the Solr document to reference the Cassandra row, which is where the data is stored. The exception would be doc values, where the data does

Re: Occasionally hit ArrayIndexOutOfBoundException when searching

2014-11-08 Thread Mohmed Hussain
Hi All, More analysis revealed that it fails when we have indexed documents with many Japanese characters using the Tika parser. The search is successful when we turn OFF facets on the only date param used. Following is the SolrParam query: q=(+(+((+resource_type:CURRICULUM^0.001

Re: Solrcloud replicas do not match

2014-11-08 Thread Erick Erickson
re: Solr admin console. Hmmm, switch it to a different node? It gets you the same info no matter which node you're pointing at in your SolrCloud. Not sure why this happens though. Best, Erick On Sat, Nov 8, 2014 at 10:12 AM, Michal Krajňanský wrote: > Hi Erick, > > I found the issue to be r

Re: Help with SolrCloud exceptions while recovering

2014-11-08 Thread Erick Erickson
First, for tweets, committing every 500 docs is much too frequent, especially from the client and super-especially if you have multiple clients running. I'd recommend you just configure solrconfig this way as a place to start and do NOT commit from any clients. 1> a hard commit (openSearcher=false)

Re: Occasionally hit ArrayIndexOutOfBoundException when searching

2014-11-08 Thread anil raju
Can anyone here provide help on this? Any further logs or environment details I can provide to help the analysis? On Nov 8, 2014 12:31 AM, "Mohmed Hussain" wrote: > Hey All, > We are using Solr for an enterprise product. Recently we did an upgrade > from 4.7.0 to 4.9.1 and are seeing this exc

Re: Solr 4.10 very slow on build()

2014-11-08 Thread Yonik Seeley
Try commenting out the suggester component & handler in solrconfig.xml: https://issues.apache.org/jira/browse/SOLR-6679 -Yonik http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data On Sat, Nov 8, 2014 at 2:03 PM, Mohsen Saboorian wrote: > I have a ~4GB index

Re: How to dynamically create Solr cores with schema

2014-11-08 Thread Jorge Luis Betancourt González
I remember a talk by CareerBuilder when they wrote an API using the approach explained by Alexandre and they got really good results. - Original Message - From: "Anurag Sharma" To: solr-user@lucene.apache.org Sent: Saturday, November 8, 2014 7:58:48 AM Subject: Re: How to dynamically crea

Solr 4.10 very slow on build()

2014-11-08 Thread Mohsen Saboorian
I have a ~4GB index which takes a minute (or over) to build() when starting the server. I noticed that this happens since I upgraded from Solr 4.0 to 4.10. The index was fully rebuilt with Solr 4.10 (using DIH). How can I speed up startup time? Here is the slow part of the startup log: INFO 141101-23:4

Re: Solrcloud solrconfig.xml

2014-11-08 Thread Michal Krajňanský
Hi Erick, Thank you for making this clearer (it helped me solve the replication issue I asked about in a different thread). However, I suspect I am still doing something wrong. I am running a single Tomcat instance with two instances of Solr. The shared solrconfig.xml contains: ${solr.data.dir:data} An

Re: Solrcloud replicas do not match

2014-11-08 Thread Michal Krajňanský
Hi Erick, I found the issue to be related to my other question (about the shared solrconfig.xml), which you also answered. It turns out that I had set the data.dir variable in solrconfig.xml to an absolute path that coincided with a different index. So the replica tried to be created there and something nasty pr

Help with SolrCloud exceptions while recovering

2014-11-08 Thread Bruno Osiek
Hi, I am a newbie SolrCloud enthusiast. My goal is to implement an infrastructure to enable text analysis (clustering, classification, information extraction, sentiment analysis, etc). My development environment consists of one machine, quad-core processor, 16GB RAM and 1TB HD. Have started impl

Re: Synonymn for Numbers

2014-11-08 Thread Jack Krupansky
Are you using the synonyms for both indexing and query? It sounds like you want to use these synonyms only at query time. Otherwise, "10" in the index becomes "2010" in the index. -- Jack Krupansky -Original Message- From: EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) Se

Re: Delete data from stored documents

2014-11-08 Thread Jack Krupansky
Agreed, but I think it would be great if Lucene and Solr provided an API to delete a single field for the entire index. We could file a Jira, but can Lucene accommodate it? Maybe we'll just have to wait for Elasticsearch to implement this feature! -- Jack Krupansky -Original Message-

Re: Sort documents by exist(multivalued field)

2014-11-08 Thread Erick Erickson
Well, if you can write a custom function that does the right thing with multiValued fields you could sort by that. You still haven't defined the exact use case. The problem here is that sorting by a multiValued field is meaningless. Consider a field with aardvark and zebra. Where should it sort? O

Re: Solr exceptions during batch indexing

2014-11-08 Thread Erick Erickson
bq: Just trying to understand what's the challenge in returning the bad doc Mostly, nobody has done it yet. There's some complication about async updates, ConcurrentUpdateSolrServer for instance. I suspect also that one has to write error handling logic in the client anyway so the motivation is re

Re: Sort documents by first value in multivalued field

2014-11-08 Thread Anurag Sharma
What is 'first value' here, any example? On Fri, Nov 7, 2014 at 5:04 PM, Nickolay41189 wrote: > How can I sort documents by first value in multivalued field? (without > adding > copyField and without some changes in schema.xml)? > > > > -- > View this message in context: > http://lucene.472066.n

Re: Term count in multivalue fields

2014-11-08 Thread Anurag Sharma
Since 'omitTermFreqAndPositions' is enabled, what does the function query 'totaltermfreq(field,term)' return? Another way, not sure this is the correct approach: while indexing add a field containing the number. Filter and sum (function query) on the field while querying. Range query can also be d
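
One way to check what 'totaltermfreq(field,term)' actually returns on a given index is to request it as a pseudo-field in the field list. The following is a minimal sketch only: the base URL, the collection name "collection1", the field "tags", the term "solr", and the assumption that the unique key field is called "id" are all hypothetical and not taken from the thread.

```python
from urllib.parse import urlencode


def ttf_probe_url(base_url, field, term):
    """Build a /select URL that asks Solr to return totaltermfreq(field,term)
    as an aliased pseudo-field named 'ttf' (all names here are illustrative)."""
    params = {
        "q": "*:*",   # match everything; we only care about the function value
        "rows": 1,    # the value is index-wide, so one row is enough
        "fl": f"id,ttf:totaltermfreq({field},'{term}')",
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host, collection, field and term.
print(ttf_probe_url("http://localhost:8983/solr/collection1", "tags", "solr"))
```

Comparing the returned value on a field with and without term frequencies omitted would answer the question empirically; the sketch makes no claim about what that value will be.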

Re: Minimum Term Matching in More Like This Queries

2014-11-08 Thread Anurag Sharma
There is no direct way of retrieving docs based on minimum term match in Solr. The mlt params 'mlt.mintf' and 'mlt.match.offset' can be explored to see if they meet the criteria. Refer to the links below for more details: http://wiki.apache.org/solr/MoreLikeThisHandler https://wiki.apache.org/solr/MoreLikeThis In c

Re: Sort documents by exist(multivalued field)

2014-11-08 Thread Yago Riveiro
Re-indexing the data is bad for me: it's 5TB of data and the time to re-index it is too much, but it seems to be the only option I have. On Sat 8 Nov 2014 at 13:10 Anurag Sharma wrote: > Is it possible to describe the exact use case here. > > On Fri, Nov 7, 2014 at 10:26 PM, Alexandre Rafalovitch >

Re: Sort documents by exist(multivalued field)

2014-11-08 Thread Anurag Sharma
Is it possible to describe the exact use case here. On Fri, Nov 7, 2014 at 10:26 PM, Alexandre Rafalovitch wrote: > You encode that knowledge by using UpdateRequestProcessor. Clone the > field, replace it with true, map it to boolean. That way, you will pay > the price once per document indexed

Re: How to dynamically create Solr cores with schema

2014-11-08 Thread Anurag Sharma
For more advanced dynamic fields, refer to the dynamicField element naming patterns in the schema.xml below: http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/server/solr/configsets/basic_configs/conf/schema.xml The Solr create core API can be used to create a core dynamically. e.g. c
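
As a rough illustration of creating a core dynamically, the sketch below builds a CoreAdmin CREATE request URL. It is not the example from the original message, which is cut off above. The host, the core name "core1" and the instance directory are hypothetical, and it assumes a non-cloud Solr 4.x setup where the instance directory and its conf files (including the schema) already exist on disk.

```python
from urllib.parse import urlencode


def create_core_url(solr_url, name, instance_dir,
                    config="solrconfig.xml", schema="schema.xml"):
    """Build a CoreAdmin CREATE URL; the config and schema file names are
    resolved by Solr relative to the core's conf directory."""
    params = {
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,
        "config": config,
        "schema": schema,
    }
    return f"{solr_url}/admin/cores?{urlencode(params)}"


# Hypothetical host and core name; issuing the request (e.g. with
# urllib.request.urlopen) is left out so the sketch runs without a server.
print(create_core_url("http://localhost:8983/solr", "core1", "core1"))
```

Keeping URL construction separate from the HTTP call makes it easy to log or review the request before it reaches a live server.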

Re: Delete data from stored documents

2014-11-08 Thread Anurag Sharma
Since the data already exists and the need is to remove unwanted fields, using a custom update processor looks less useful here. Erick's recommendation on re-indexing into a new collection if at all possible looks simple and safe. On Sat, Nov 8, 2014 at 12:44 AM, Erick Erickson wrote: > bq: My qu

Re: Synonymn for Numbers

2014-11-08 Thread Anurag Sharma
If you are searching for a single document, can a real-time get on the doc id, as shown below, serve your use case? http://localhost:8983/solr/get?id=mydoc Real-time get for multiple docs: http://localhost:8983/solr/get?id=mydoc&id=mydoc On Sat, Nov 8, 2014 at 12:52 AM, EXTERNAL Taminidi Ravi (ETI, Autom
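
The two real-time get URLs above follow one pattern, a repeated id parameter on the /get handler, which can be sketched as a small helper. The function name and the doc ids in the usage line are hypothetical; only the /get?id=... form comes from the message.

```python
from urllib.parse import urlencode


def realtime_get_url(solr_url, doc_ids):
    """Build a real-time get URL with one 'id' parameter per document id."""
    query = urlencode([("id", doc_id) for doc_id in doc_ids])
    return f"{solr_url}/get?{query}"


# Hypothetical ids; the host matches the one used in the message.
print(realtime_get_url("http://localhost:8983/solr", ["mydoc", "otherdoc"]))
```

Using urlencode rather than string concatenation keeps ids containing spaces or punctuation safely escaped.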

Re: Solr exceptions during batch indexing

2014-11-08 Thread Anurag Sharma
Just trying to understand what's the challenge in returning the bad doc id(s)? Solr already knows which doc(s) failed on update and can return their id(s) in a response or callback. Can we have a JIRA ticket on it if it doesn't exist? This looks like a common use case and every Solr consumer might be w

Occasionally hit ArrayIndexOutOfBoundException when searching

2014-11-08 Thread Mohmed Hussain
Hey All, We are using Solr for an enterprise product. Recently we did an upgrade from 4.7.0 to 4.9.1 and are seeing this exception. It's an EmbeddedSolrServer (we know it's a bad choice and are moving to SolrCloud very soon :)). I used Maven to upgrade; following is the snippet from pom.xml