Re: external indexer for Solr Cloud

2014-09-01 Thread Lee Chunki
Hi, @Jack: the final goal is to generate the index outside of Solr Cloud, but running DIH externally is not bad either. @Shawn: it sounds great to build a new application that works with multiple threads and sends documents to their shards. Please let me know the logic: how can I decide which document should go to which shard?
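
A rough illustration of the routing question above, not SolrCloud's actual code: with the default compositeId router, each shard owns a range of a 32-bit hash space and a document goes to the shard whose range contains the hash of its unique id. Solr uses MurmurHash3 for this; the sketch below uses zlib.crc32 purely as a stand-in hash, and the function names are hypothetical. In practice a cloud-aware client such as SolrJ's CloudSolrServer reads the real ranges from ZooKeeper, so an external indexer normally does not need to compute them itself.

```python
import zlib


def split_hash_space(num_shards):
    """Divide the unsigned 32-bit hash space into contiguous ranges, one per shard."""
    size = 2 ** 32
    step = size // num_shards
    ranges = []
    for i in range(num_shards):
        lo = i * step
        # The last shard absorbs any remainder so the whole space is covered.
        hi = size - 1 if i == num_shards - 1 else (i + 1) * step - 1
        ranges.append((lo, hi))
    return ranges


def shard_for(doc_id, ranges):
    """Return the index of the shard whose hash range contains the id's hash.

    ASSUMPTION: crc32 is a stand-in here; Solr itself hashes with MurmurHash3.
    """
    h = zlib.crc32(doc_id.encode("utf-8")) & 0xFFFFFFFF
    for index, (lo, hi) in enumerate(ranges):
        if lo <= h <= hi:
            return index
    raise ValueError("hash ranges do not cover the full 32-bit space")


if __name__ == "__main__":
    ranges = split_hash_space(4)
    for doc_id in ("doc-1", "doc-2", "doc-3"):
        print(doc_id, "-> shard", shard_for(doc_id, ranges))
```

The same id always hashes to the same range, which is what lets several indexing threads send documents straight to the right shard without coordinating with each other.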

Re: SolR replication issue

2014-09-01 Thread Mauricio Ferreyra
The entire stacktrace: ERROR SolrIndexWriter Coud not unlock directory after seemingly failed IndexWriter#close() org.apache.lucene.store.LockReleaseFailedException: Cannot forcefully unlock a NativeFSLock which is held by another indexer component: /home/miapp/collection1/data/index.2014090114080

Re: Indexing & search list of Key/Value pairs

2014-09-01 Thread amid
Hi Jack, This can really improve the solution, but it's still less robust than an option to search a "pair" (for example, if the skills are indexed with an index key, a query like "(skill1:php skill1_years:[2 TO *]) OR (skill2:php skill2_years:[2 TO *] ...). The solution I describe is not really good a

Re: SolR replication issue

2014-09-01 Thread Shawn Heisey
On 9/1/2014 10:31 AM, Mauricio Ferreyra wrote: > I'm using Solr 4.3.1 with a master/slave configuration. > > Configuration: > > Master: > * * > * commit* > * startup* > * schema.xml,stopwords.txt* > * * > > > Slave: > * * > * http://10.xx.xx.xx:90

Re: Indexing & search list of Key/Value pairs

2014-09-01 Thread Jack Krupansky
You can certainly have a separate multivalued text field, like "skills", that can have arbitrary text values like "PHP", "Ruby", "Software Development", "Agile Methodology", "Agile Development", "Cat Herding", etc., that are analyzed, lower cased, stemmed, etc. As far as the dynamic field names,

Re: Indexing & search list of Key/Value pairs

2014-09-01 Thread amid
Hi Jack, Thanks for the fast response. I assume that using this technique will have the following limitations: 1) Skill characters will be limited 2) Field names are not analyzed and will not get the full search package (synonyms, analyzers...) Am I right? If so, are you familiar with other

Re: Indexing & search list of Key/Value pairs

2014-09-01 Thread Jack Krupansky
Solr supports multivalued fields, but really only for scalar, not structured values. And trying to manage two or more multivalued fields in parallel is also problematic. Better to simply use dynamic fields, such as naming the field "xyz_skill" with the number of years as the value. Then you can sim
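
A minimal sketch of the dynamic-field approach described above, assuming the schema declares a numeric dynamic field pattern such as "*_skill" (the pattern, the base URL, and the helper names are all illustrative, not taken from the thread). It only builds the field name and the range query string; it does not contact a server.

```python
from urllib.parse import urlencode


def skill_field(skill):
    """Map a skill name to a dynamic field name, e.g. 'PHP' -> 'php_skill'."""
    # Keep field names to letters, digits and underscores.
    safe = "".join(c if c.isalnum() else "_" for c in skill.strip().lower())
    return safe + "_skill"


def skill_range_query(skill, min_years):
    """Build a query matching documents with at least min_years of the skill."""
    return "%s:[%d TO *]" % (skill_field(skill), min_years)


def select_url(base_url, skill, min_years):
    """Build a /select URL for the query (base_url is a hypothetical core URL)."""
    params = urlencode({"q": skill_range_query(skill, min_years), "wt": "json"})
    return "%s/select?%s" % (base_url, params)


if __name__ == "__main__":
    print(skill_range_query("php", 2))
    print(select_url("http://localhost:8983/solr/collection1", "php", 2))
```

Because the skill name becomes part of the field name, it is never run through an analyzer, which is the limitation raised in the reply to this message.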

SolR replication issue

2014-09-01 Thread Mauricio Ferreyra
Hi folks, I'm using Solr 4.3.1 with a master/slave configuration. Configuration: Master: * * * commit* * startup* * schema.xml,stopwords.txt* * * Slave: * * * http://10.xx.xx.xx:9081/solr * * 00:00:60* *

Indexing & search list of Key/Value pairs

2014-09-01 Thread amid
Hi, I'm using Solr and trying to index a list of key/value pairs. The key contains a string with a skill and the value is the years of experience (e.g. someone with 5 years of php and 2 years of ruby). I want to be able to create a query which returns all documents with a specific skill and range o

Re: external indexer for Solr Cloud

2014-09-01 Thread Siegfried Goeschl
Hi folks, we are using Apache Camel but could use Spring Integration with the option to upgrade to Apache BatchEE or Spring Batch later on - especially Tika document extraction can kill your server due to CPU consumption, memory usage and plain memory leaks. AFAIK Doug Turnbull also improved

Re: external indexer for Solr Cloud

2014-09-01 Thread Jack Krupansky
Packaging SolrCell in the same manner, with parallel threads and able to talk to multiple SolrCloud servers in parallel would have a lot of the same benefits as well. And maybe there could be some more generic Java framework for indexing as well, that "external indexers" in general could use.

Re: Classloader for plugin jar

2014-09-01 Thread Shawn Heisey
On 9/1/2014 9:30 AM, Nimrod Cohen wrote: > We have a plugin that works as long as we have all the jars in solr > webapps\web-inf\lib > > Once we copy the jar to a different location, Solr starts OK, but at > run time we get the below error; we think that Solr doesn’t use the right > class loader.

Re: Too much mail

2014-09-01 Thread Stefan Matheis
Almost, try https://lucene.apache.org/solr/discussion.html -Stefan On Monday, September 1, 2014 at 5:39 PM, William von Hagen wrote: > unsubscribe > >

Re: external indexer for Solr Cloud

2014-09-01 Thread Shawn Heisey
On 9/1/2014 7:19 AM, Jack Krupansky wrote: > It would be great to have a "standalone DIH" that runs as a separate > server and then sends standard Solr update requests to a Solr cluster. This has been discussed, and I thought we had an issue in Jira, but I can't find it. A completely standalone D

Too much mail

2014-09-01 Thread William von Hagen
unsubscribe

Classloader for plugin jar

2014-09-01 Thread Nimrod Cohen
Hi, We have a plugin that works as long as we have all the jars in solr webapps\web-inf\lib. Once we copy the jar to a different location, Solr starts OK, but at run time we get the below error; we think that Solr doesn't use the right class loader. Do you have any idea how to get around this? Working on

AW: AW: AW: Scaling to large Number of Collections

2014-09-01 Thread Christoph Schmidt
Isn't this just a question of configuration and my hardware? The better my hardware, the more cores I can keep in memory. Let's say I can keep 10,000 cores in memory; when I reach the limit, I will put the least recently used one to "dormant" and "wake up" the new one I need. So I can scale further
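
The eviction policy described above is a least-recently-used (LRU) cache. A minimal sketch, assuming hypothetical load and unload callbacks that stand in for waking a core up and putting it to sleep (none of these names are Solr APIs):

```python
from collections import OrderedDict


class CoreCache:
    """Keep at most `capacity` cores active; evict the least recently used."""

    def __init__(self, capacity, load, unload):
        self.capacity = capacity
        self._load = load      # hypothetical: name -> core ("wake up")
        self._unload = unload  # hypothetical: (name, core) -> None ("dormant")
        self._active = OrderedDict()

    def get(self, name):
        if name in self._active:
            # Mark the core as most recently used.
            self._active.move_to_end(name)
            return self._active[name]
        if len(self._active) >= self.capacity:
            # The first entry is the least recently used one.
            old_name, old_core = self._active.popitem(last=False)
            self._unload(old_name, old_core)
        core = self._load(name)
        self._active[name] = core
        return core


if __name__ == "__main__":
    cache = CoreCache(
        capacity=2,
        load=lambda name: "core:" + name,
        unload=lambda name, core: print("putting", name, "to sleep"),
    )
    for customer in ("a", "b", "a", "c"):
        cache.get(customer)
```

The cost this sketch hides is the load step itself: waking a dormant core is only cheap if reopening its index and warming its caches is fast enough for the first request after the wake-up.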

Re: external indexer for Solr Cloud

2014-09-01 Thread Jack Krupansky
Okay, but please clarify further - do you simply wish to run DIH externally, but still sending each document to SolrCloud for indexing, or... are you expecting to generate the index completely external to the cluster and then somehow "merge" that DIH "index" into the SolrCloud index? It would

Re: Specify Analyzer per field

2014-09-01 Thread Alexandre Rafalovitch
On Mon, Sep 1, 2014 at 2:14 AM, Ankit Jain wrote: > I want to use the schemaless feature of Solr because the schema is created at > runtime as per user input. Did you actually look at dynamic fields? Multiple CMS products are using Solr to allow users to create fields at runtime by using an appropriate prefix

Re: AW: Scaling to large Number of Collections

2014-09-01 Thread Jack Krupansky
And I would add another suggested requirement - "dormant collections" - collections which may once have been active, but have not seen any recent activity and can hence be "suspended" or "swapped out" until such time as activity resumes and they can then be "reactivated" or "reloaded". That ina

Re: Specify Analyzer per field

2014-09-01 Thread Jack Krupansky
Thanks for finally specifying the feature so concisely. IOW, you want the ES feature of being able to specify the analyzer for the field as opposed to the field type. See: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping-intro.html "For analyzed string fields, use the

Solr spellcheck returns more than 1 word for a 1 word spellcheck

2014-09-01 Thread Thomas Michael Engelke
I'm in the process of incorporating Solr spellchecking in our product. For that, I've created a new field: And in the fieldType definitions: Then I feed the names of products into the corresponding core. They can have a lot of words (examples): door lock rear left Door brake,

RE: Solr issue

2014-09-01 Thread Shay Sofer
Hi, None of the above fixes my problem. The problem is with QueryComponent.java: private void groupedFinishStage(final ResponseBuilder rb) { // To have same response as non-distributed request. GroupingSpecification groupSpec = rb.getGroupingSpec(); if (rb.mergedTopGroups.isEmpty())

AW: Scaling to large Number of Collections

2014-09-01 Thread Christoph Schmidt
We already reduced the thread stack size to -Xss256k. How could we reduce the size of the transaction log? By fewer autoCommits? Or could it be cleaned up? Thanks Christoph -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Sunday, 31 August 2014 20:12 To: solr

AW: Scaling to large Number of Collections

2014-09-01 Thread Christoph Schmidt
Yes, this would help us in our scenario. -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Sunday, 31 August 2014 18:10 To: solr-user@lucene.apache.org Subject: Re: Scaling to large Number of Collections We should also consider "lightly-sharded" c

AW: Scaling to large Number of Collections

2014-09-01 Thread Christoph Schmidt
At peak time a lot of customers will be on the system. Surely not all in the same second, but within an hour we see a lot of them. So having something like "stand-by" cores will help to reduce resource consumption (memory, threads) if the cores are rarely used. Doing a warmup at login time of the u

AW: Scaling to large Number of Collections

2014-09-01 Thread Christoph Schmidt
Hm, temporarily having more threads is hard. We already reduced the thread stack size to -Xss256k. Wouldn't it be better to use Callable and Executor as proposed in: http://stackoverflow.com/questions/16789288/java-lang-outofmemoryerror-unable-to-create-new-native-thread and limit the number of used threads to the number of CP

AW: Scaling to large Number of Collections

2014-09-01 Thread Christoph Schmidt
Yes, we will think about how to reorganise the application. Thanks Christoph -Original Message- From: Joseph Obernberger [mailto:joseph.obernber...@gmail.com] Sent: Sunday, 31 August 2014 16:58 To: solr-user@lucene.apache.org Subject: Re: Scaling to large Number of Colle

AW: Scaling to large Number of Collections

2014-09-01 Thread Christoph Schmidt
Is there a Jira task for this? Thanks Christoph -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Sunday, 31 August 2014 14:24 To: solr-user Subject: Re: Scaling to large Number of Collections > On Aug 31, 2014, at 4:04 AM, Christoph Schmidt > wrot