Solr 5: data_driven_schema_config's solrconfig causing error

2015-03-10 Thread Aman Tandon
Hi, For the sake of using the new schema.xml and solrconfig.xml with Solr 5, I put my old required field types & field names (being used with Solr 4.8.1) in the schema.xml given in *basic_configs* & configuration settings given in solrconfig.xml present in *data_driven_schema_configs* and put I pu

Re: Solrcloud Index corruption

2015-03-10 Thread Martin de Vries
Hi, this _sounds_ like you somehow don't have indexed="true" set for the field in question. We investigated a lot more. The CheckIndex tool didn't find any error. We now think the following happened: - We changed the schema two months ago: we changed a field to indexed="true". We reloaded th

Solr phonetics with spelling

2015-03-10 Thread Ashish Mukherjee
Hello, A couple of questions related to phonetics - 1. If I enable the phonetic filter in the managed-schema file for a particular field, how does it affect the spell handler? 2. What is the meaning of the inject attribute in managed-schema? The documentation is not very clear about it. Regar

Re: SolrCloud: Chroot error

2015-03-10 Thread Aman Tandon
Thanks Shawn, I tried it with single string but still no success. So currently i am running it without chroot and it is working fine. With Regards Aman Tandon On Mon, Mar 9, 2015 at 9:46 PM, Shawn Heisey wrote: > On 3/9/2015 10:03 AM, Aman Tandon wrote: > > Thanks for replying, Just to send th

Re: unusually high 4.10.2 vs 4.3.1 RAM consumption

2015-03-10 Thread Dmitry Kan
For the sake of completeness, just wanted to confirm these params had a positive effect: -Dsolr.solr.home=cores -Xmx12000m -Djava.awt.headless=true -XX:+UseParNewGC -XX:+ExplicitGCInvokesConcurrent -XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=8 -XX:CMSInitiatingOccupancyFraction=40

RE: Solr phonetics with spelling

2015-03-10 Thread Dyer, James
Ashish, I would not recommend using spellcheck against a phonetic-analyzed field. Instead, you can use copyField to create a separate field that is lightly analyzed and use the copy for spelling. James Dyer Ingram Content Group -Original Message- From: Ashish Mukherjee [mailto:ashish.mukher

Re: SolrCloud: Chroot error

2015-03-10 Thread Shawn Heisey
On 3/10/2015 6:10 AM, Aman Tandon wrote: > Thanks Shawn, I tried it with single string but still no success. > > So currently i am running it without chroot and it is working fine. That brings up something for me or you to try. I wonder if perhaps there is a bug that will prevent the "director

Re: Solr 5: data_driven_schema_config's solrconfig causing error

2015-03-10 Thread Steve Rowe
Hi Aman, The stack trace shows that the AddSchemaFieldsUpdateProcessorFactory specified in data_driven_schema_configs’s solrconfig.xml expects the “booleans” field type to exist. Solr 5’s data_driven_schema_configs includes the “booleans” field type:

Solr 5 upgrade

2015-03-10 Thread richardg
Ubuntu 14.04.02 Trying to install solr 5 following this: https://cwiki.apache.org/confluence/display/solr/Upgrading+a+Solr+4.x+Cluster+to+Solr+5.0 I keep getting "this script requires extracting a war file with either the jar or unzip utility, please install these utilities or contact your adminis

Re: Solr TCP layer

2015-03-10 Thread Erick Erickson
Just to pile on: I admire your bravery! I'll add to the other comments only by saying that _before_ you start down this path, you really need to articulate the benefit/cost analysis. "to gain a little more communications efficiency" will be a pretty hard sell due to the reasons Shawn outlined. Thi

Re: Parsing cluster result's docs

2015-03-10 Thread Erick Erickson
You can get some fields back besides ID, see the carrot.title and carrot.snippet params. I don't know a good way to get the full underlying documents though. Best, Erick On Mon, Mar 9, 2015 at 9:33 AM, Jorge Luis Lazo wrote: > Hi, > > I have a Solr instance using the clustering component (with t

Num docs, block join, and dupes?

2015-03-10 Thread Timothy Potter
Before I open a JIRA, I wanted to put this out to solicit feedback on what I'm seeing and what Solr should be doing. So I've indexed the following 8 docs into a 2-shard collection (Solr 4.8'ish - internal custom branch roughly based on 4.8) ... notice that the 3 grand-children of 2-1 have dup'd key

Re: Field Rename in SOLR

2015-03-10 Thread Erick Erickson
What do you mean "rename field"? It _looks_ like you're trying to get the results into a doc from your document and changing its name _in the results_. I.e. you have "ProductName" in your document, but want to see Name_en-US in your output. My guess is that the hyphen is the problem. Does it work

Re: Solr 5.0.0 - Multiple instances sharing Solr server *read-only* dir

2015-03-10 Thread Erick Erickson
If I'm understanding your problem correctly, I think you want the -d option, then all the -s guys would be under that. Just to check, though, why are you running multiple Solrs? There are sometimes very good reasons, just checking that you're not making things more difficult than necessary Be

Re: Solrcloud Index corruption

2015-03-10 Thread Erick Erickson
Ahhh, ok. When you reloaded the cores, did you do it core-by-core? I can see how something could get dropped in that case. However, if you used the Collections API and two cores mysteriously failed to reload that would be a bug. Assuming the replicas in question were up and running at the time you

Re: unusually high 4.10.2 vs 4.3.1 RAM consumption

2015-03-10 Thread Erick Erickson
Thanks for letting us know! Erick On Tue, Mar 10, 2015 at 5:20 AM, Dmitry Kan wrote: > For the sake of completeness, just wanted to confirm these params > had a positive effect: > > -Dsolr.solr.home=cores -Xmx12000m -Djava.awt.headless=true -XX:+UseParNewGC > -XX:+ExplicitGCInvokesCon

Re: Solr 5.0.0 - Multiple instances sharing Solr server *read-only* dir

2015-03-10 Thread Timothy Potter
I think the next step here is to ship Solr with the war already extracted so that Jetty doesn't need to extract it on first startup - https://issues.apache.org/jira/browse/SOLR-7227 On Tue, Mar 10, 2015 at 10:15 AM, Erick Erickson wrote: > If I'm understanding your problem correctly, I think you

Re: Cores and and ranking (search quality)

2015-03-10 Thread johnmunir
Thanks Erick for trying to help, I really appreciate it. Unfortunately, I'm still stuck. There are times one must know the inner workings and behavior of the software to make a design decision, and this is one of them. If I knew the inner workings of Solr, I would not be asking. In addition,

Chaining components in request handler

2015-03-10 Thread Ashish Mukherjee
Hello, I would like to create a request handler which chains components in a particular sequence to return the result, similar to a Unix pipe. eg. Component 1 -> result1 -> Component 2 -> result2 result2 is final result returned. Component 1 may be a standard component, Component 2 may be out o

Re: Chaining components in request handler

2015-03-10 Thread Alexandre Rafalovitch
Is that during indexing or during query phase? Indexing has UpdateRequestProcessors (e.g. http://www.solr-start.com/info/update-request-processors/ ) Query has Components (e.g. Faceting, MoreLikeThis, etc) Or something different? Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs a

Re: Cores and and ranking (search quality)

2015-03-10 Thread Shawn Heisey
On 3/10/2015 11:17 AM, johnmu...@aol.com wrote: > If I have two cores, one core has 10 docs another has 100,000 docs. I then > submit two docs that are 100% identical (with the exception of the unique-ID > fields, which is stored but not indexed) one to each core. The question is, > during sea

Re: Solr 5.0.0 - Multiple instances sharing Solr server *read-only* dir

2015-03-10 Thread Damien Dykman
Thanks Timothy for the pointer to the Jira ticket. That's exactly it :-) Erick, the main reason why I would run multiple instances on the same machine is to simulate a multi node environment. But beyond that, I like the idea of being able to clearly separate the server dir and the data dirs. That

Re: Chaining components in request handler

2015-03-10 Thread Alexandre Rafalovitch
Ok. Components then. Defined in solrconfig.xml. You can prepend/append/replace the standard list. Try that and see if that's enough. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.solr-start.com/ On 10 March 2015 at 14:03, Ashish Mukherjee wr

Re: Solr TCP layer

2015-03-10 Thread Saumitra Srivastav
Thanks everyone for the responses. My motivation for TCP comes from a very heavy indexing pipeline where even the smallest optimization matters. I am working on a machine data parser which feeds data into Cassandra and Solr and we have SLAs based on how fast we can make data available in both th

Re: Solr TCP layer

2015-03-10 Thread Erick Erickson
Saumitra: We certainly don't mean to be overly discouraging, so have at it! There has been some talk of using Netty in the future as we pull the war-file distribution out of the distro. Now, I have no technical clue about the merits vs. TCP. But that's another possibility you might want to put in

Re: Solr TCP layer

2015-03-10 Thread Walter Underwood
I would strongly recommend taking a look at HTTP/2. It might not be fast enough for you, but it is fast enough for Google and there are already implementations. http://http2.github.io/faq/ wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) On Mar 10, 2015,

Re: Solr TCP layer

2015-03-10 Thread Shawn Heisey
On 3/10/2015 12:13 PM, Saumitra Srivastav wrote: > Now we want to do the same with Solr. While I do realize that this is going > to be a lot of work, but if its something that will reap benefit in long > run, then so be it. Datastax provides a netty based layer in their > enterprise version which f

Re: Num docs, block join, and dupes?

2015-03-10 Thread Jessica Mallet
We've seen this as well. Before we understood the cause, it seemed very bizarre that hitting different nodes would yield different numFound, as well as using different rows=N (since the proxying node only de-dupes the documents that are returned in the response). I think "consistency" and "correctn
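
The effect described above (numFound changing with rows=N because only returned documents are de-duped) can be illustrated with a toy model. This is a sketch under stated assumptions, not Solr's actual merge code: the function name `merged_num_found`, the shard contents, and the scores are all hypothetical.

```python
def merged_num_found(shard_results, rows):
    """Toy model of a distributed merge (an assumption, not Solr's real code).

    shard_results: one list per shard of (doc_id, score) tuples.
    Starts from the sum of per-shard hit counts and subtracts one for every
    duplicate id encountered while filling the first `rows` merged results.
    """
    num_found = sum(len(docs) for docs in shard_results)
    # Merge all shard hits, best score first.
    merged = sorted(
        (doc for docs in shard_results for doc in docs),
        key=lambda doc: -doc[1],
    )
    seen = set()
    returned = []
    for doc_id, _score in merged:
        if len(returned) >= rows:
            break  # duplicates beyond this point are never noticed
        if doc_id in seen:
            num_found -= 1  # duplicate noticed only because it was reached
            continue
        seen.add(doc_id)
        returned.append(doc_id)
    return num_found, returned


# Hypothetical data: document "a" exists on both shards.
shards = [
    [("a", 3.0), ("b", 1.0)],
    [("a", 2.5), ("c", 0.5)],
]
small_page = merged_num_found(shards, rows=1)
large_page = merged_num_found(shards, rows=10)
print(small_page, large_page)
```

In this model the count is only corrected for duplicates that the merging node actually reaches, which is why a larger rows value can report a smaller numFound.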

Re: Cores and and ranking (search quality)

2015-03-10 Thread Walter Underwood
On Mar 10, 2015, at 10:17 AM, johnmu...@aol.com wrote: > If I have two cores, one core has 10 docs another has 100,000 docs. I then > submit two docs that are 100% identical (with the exception of the unique-ID > fields, which is stored but not indexed) one to each core. The question is, > du

Re: Num docs, block join, and dupes?

2015-03-10 Thread Mikhail Khludnev
On Tue, Mar 10, 2015 at 7:09 PM, Timothy Potter wrote: > So I guess my question is why doesn't the non-distrib query do > de-duping? > Tim, that behavior is by design. The special _root_ field is used as a delete term when a block update is applied, i.e. in the case of a block, uniqueKey is not used. See https://

Invalid Date String:'1992-07-10T17'

2015-03-10 Thread Mirko Torrisi
Hi all, I am very new to Solr (and Lucene) and I use the latest version of it. I do not understand why I obtain this: Exception in thread "main" org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/Collection1: Invalid Dat

RE: Invalid Date String:'1992-07-10T17'

2015-03-10 Thread Ryan, Michael F. (LNG-DAY)
You'll need to wrap the date in quotes, since it contains a colon: String a = "speechDate:\"1992-07-10T17:33:18Z\""; -Michael -Original Message- From: Mirko Torrisi [mailto:mirko.torr...@ucdconnect.ie] Sent: Tuesday, March 10, 2015 3:34 PM To: solr-user@lucene.apache.org Subject: Invali

Re: Cores and and ranking (search quality)

2015-03-10 Thread johnmunir
Thanks Walter. The design decision I'm trying to solve is this: using multiple cores, will my ranking be impacted vs. using single core? I have records to index and each record can be grouped into object-types, such as object-A, object-B, object-C, etc. I have a total of 30 (maybe more) objec

Re: Invalid Date String:'1992-07-10T17'

2015-03-10 Thread Chris Hostetter
":" is a syntactically significant character to the query parser, so it's getting confused by it in the text of your query. you're seeing the same problem as if you tried to search for "foo:bar" in the "yak" field using q=yak:foo:bar you either need to backslash escape the ":" characters, or w
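
A minimal sketch of the two options mentioned above (backslash-escaping the colons, or wrapping the value in quotes), written in Python purely for illustration. The helper names `escape_colons` and `quote_value` are hypothetical; the field name `speechDate` and the date value come from the thread.

```python
def escape_colons(value):
    """Backslash-escape each ':' so the query parser treats it literally."""
    return value.replace(":", "\\:")


def quote_value(value):
    """Wrap the value in double quotes so it is parsed as one phrase.

    Assumes the value itself contains no double quotes or backslashes.
    """
    return '"' + value + '"'


date = "1992-07-10T17:33:18Z"
escaped_query = "speechDate:" + escape_colons(date)
quoted_query = "speechDate:" + quote_value(date)
print(escaped_query)
print(quoted_query)
```

Either string can then be passed as the q parameter; only the colon that separates the field name from the value is left unescaped.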

Re: Cores and and ranking (search quality)

2015-03-10 Thread Walter Underwood
If the documents are distributed randomly across shards/cores, then the statistics will be similar in each core and the results will be similar. If the documents are distributed semantically (say, by topic or type), the statistics of each core will be skewed towards that set of documents and the
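
To make the point about per-core statistics concrete, here is a small sketch that assumes the classic Lucene TF-IDF similarity, whose inverse document frequency is 1 + ln(numDocs / (docFreq + 1)). Whether a given deployment uses that similarity is an assumption, and the document frequencies below are made-up numbers; only the core sizes (10 and 100,000 docs) come from the thread.

```python
import math


def classic_idf(num_docs, doc_freq):
    """IDF in the classic TF-IDF form: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical term statistics: the term occurs in 1 document of the
# 10-doc core and in 1,000 documents of the 100,000-doc core.
small_core_idf = classic_idf(num_docs=10, doc_freq=1)
large_core_idf = classic_idf(num_docs=100_000, doc_freq=1_000)
print(small_core_idf, large_core_idf)
```

Because the IDF factor depends on each core's own counts, two identical documents can receive different scores for the same query when they live in cores with different statistics.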

Import Feed rss delta-import

2015-03-10 Thread Ednardo
Hi, How do I create a DataImportHandler using delta-import for rss feeds? Thanks!! -- View this message in context: http://lucene.472066.n3.nabble.com/Import-Feed-rss-delta-import-tp4192257.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Import Feed rss delta-import

2015-03-10 Thread Alexandre Rafalovitch
I don't think you can since you can't query RSS normally. You just do full import and override on ids. Regards, Alex On 10 Mar 2015 7:16 pm, "Ednardo" wrote: > Hi, > > How do I create a DataImportHandler using delta-import for rss feeds? > > Thanks!! > > > > -- > View this message in context

Re: how to change configurations in solrcloud setup

2015-03-10 Thread Nitin Solanki
Hi Aman, You can apply configuration on solr cloud by using this command - sudo //example/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd upconfig -confdir //example/solr/collection1/conf -confname default and then restart all nodes of solrcloud. On Mon, Mar 9, 2015 a

Re: how to change configurations in solrcloud setup

2015-03-10 Thread Aman Tandon
Hi, Thanks Nitin for replying. Won't it be a costly operation to restart all nodes? What I am doing is uploading the configurations again to ZooKeeper and then reloading my core. And it is working well. So am I missing something? With Regards Aman Tandon On Wed, Mar 11, 2015 at 11:21
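
As a rough sketch of the reload step described above, the snippet below only builds the URL for a Collections API RELOAD request; it does not send it. The helper name `reload_url` is hypothetical, and the host, port, and collection name are assumptions to be replaced with your own values.

```python
from urllib.parse import urlencode


def reload_url(base_url, collection):
    """Build a Collections API RELOAD request URL for one collection."""
    params = urlencode({"action": "RELOAD", "name": collection})
    return base_url.rstrip("/") + "/admin/collections?" + params


# Assumed values: a local node on port 8983 and a collection named collection1.
url = reload_url("http://localhost:8983/solr", "collection1")
print(url)
```

Issuing a GET against that URL after the upconfig step asks the cluster to reload the collection, which is the alternative to restarting every node that the thread discusses.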