Re: faceting performance on fields with high-cardinality

2014-06-19 Thread Shyamsunder R Mutcha
Hi Tag, I don't see any query (q) given for execution in the firstSearcher and newSearcher event listeners. Can you add a query term: <>> Check your logs; they will show that the firstSearcher event executed and print a message with investerdIndex and the number of facet items loaded. Thanks Shyamsund

Surround query with Boolean queries

2014-06-19 Thread Shyamsunder R Mutcha
Hi, I have two fields in the index, company and year. The following surround query, which finds computer and applications within 5 words of each other, works fine with the surround query parser. {!surround maxBasicQueries=10}company:5N(comput*, appli*) Now if I add another boolean query +year

Solr index pdf/word document with attachments

2014-06-19 Thread Prasi S
Hi, How can I index Word / PDF documents with attachments to Solr? I have tried indexing a simple file with an attachment using Tika, but it does not index the attachment separately. Only the original document is getting indexed. Thanks, Prasi

Re: Segment Count of my Index is greater than the Configured MergeFactor

2014-06-19 Thread RadhaJayalakshmi
Thanks Shawn and thanks Chris!! Shawn, your explanation was very clear and clarified my doubts. Chris, the video was also very useful. -- View this message in context: http://lucene.472066.n3.nabble.com/Segment-Count-of-my-Index-is-greater-than-the-Configured-MergeFactor-tp4142783p4142987.html S

Re: [ANN] Heliosearch 0.06 released, native code faceting

2014-06-19 Thread Andy
Congrats! Any idea when native faceting & off-heap fieldcache will be available for multivalued fields? Most of my fields are multivalued so that's the big one for me. Andy On Thursday, June 19, 2014 3:46 PM, Yonik Seeley wrote: FYI, for those who want to try out the new native code facet

Re: Multivalue wild card search

2014-06-19 Thread Jack Krupansky
1. Wildcards do not work within quoted terms. 2. Spaces in terms need to be escaped. 3. The quotes embedded in a term do not need to be escaped. So, try: q=*["Steve",\ "Wonder",\ ""]] or q=*["Steve",\ "Wonder",\ ""]* -- Jack Krupansky -Original Message- From: Ethan Sent: Thursday,
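
The escaping rule in point 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `escape_spaces` and `wildcard_query` are hypothetical helper names, not part of Solr or any client library, and the sketch handles spaces only, not the other characters the query parser treats as special.

```python
def escape_spaces(term: str) -> str:
    """Backslash-escape each space so the query parser sees a single term."""
    return term.replace(" ", "\\ ")


def wildcard_query(fragment: str, leading: bool = True, trailing: bool = False) -> str:
    """Build an unquoted wildcard term; embedded double quotes are left untouched."""
    prefix = "*" if leading else ""
    suffix = "*" if trailing else ""
    return prefix + escape_spaces(fragment) + suffix


if __name__ == "__main__":
    # Mirrors the two suggestions in the message above.
    print("q=" + wildcard_query('["Steve", "Wonder", ""]]'))
    print("q=" + wildcard_query('["Steve", "Wonder", ""]', trailing=True))
```

The value still has to be URL-encoded before it is sent over HTTP; that step is left out here to keep the escaping itself visible.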

[ANN][Meta] "Apache Solr popularizers" LinkedIn group

2014-06-19 Thread Alexandre Rafalovitch
Hello, ( TL;DR: http://www.linkedin.com/groups?gid=6713853 ) Based on - short :-] - Twitter discussion, we have decided to have a go at some sort of a group for Solr popularizers: people who are teaching Solr, running meetups, building Solr examples, and writing Solr books. Basically, anybody w

Re: running Post jar from different server

2014-06-19 Thread Sameer Maggon
Ravi, post.jar is a standalone utility that does not have to be on the same server. If you can share the command you are executing, there might be some pointers in there. Thanks, -- *Sameer Maggon* http://measuredsearch.com On Thu, Jun 19, 2014 at 8:54 PM, EXTERNAL Taminidi Ravi (ETI, Automoti

running Post jar from different server

2014-06-19 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi, I have a situation where my SQL job initiates a console application, in which I call post.jar to upload data to Solr. The SQL DB and Solr are on 2 different servers. I am calling post.jar from my SQL DB, where the path is mapped to a network drive. I am getting a file-not-found error. I

Fwd: Tracing Files Which Have Errors

2014-06-19 Thread Simon Cheng
Hi there, I have posted 190,000 simple XML files using post.jar and only 8 files had errors. But how do I know which ones have errors? Thank you in advance, Simon Cheng.

Indexing a term into separate Lucene indexes

2014-06-19 Thread Huang, Roger
If I have documents with a person and his email address: u...@domain.com How can I configure Solr (4.6) so that the email address source field is indexed as - the user part of the address (e.g., "user") is in Lucene index X - the domain part of the add

Re: Multivalue wild card search

2014-06-19 Thread Ethan
Ahmet, Assuming there is a multiValued field called "Name" of type string stored in index - //Doc 1 "id" : 23512 "HotelId" : [ "12", "23", "12" ] "Name" : [ "[[\"Ethan\", \"G\", \"\"],[\"Steve\", \"Wonder\", \"\"]]", "[]", "[[\"hifte\", \"Grop\", \"\"]]" ] // Doc 2 "id" : 23513 "Ho

Re: Limit Porter stemmer to plural stemming only?

2014-06-19 Thread Chris Hostetter
: Can you please share the Java code for Plural Only Porter Stemmer for English if you don't mind? The Porter stemmer algorithm, by definition, does more than just strip plurals. If you are interested in a lighter weight stemmer for English, this is exactly what the EnglishMinimalStemFilterFac

Re: Cursor deep paging new behavior

2014-06-19 Thread Chris Hostetter
if by "old behavior" you mean incrementing the start param then the order of results when doing concurrent indexing was always dependent on what exactly your sort was. when using a cursor, the impacts of concurrent indexing are also dependent on what your sort clause looks like -- but in d
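
For readers comparing the two paging styles, here is a minimal sketch of a cursor loop in Python. It assumes the standard `cursorMark` / `nextCursorMark` request and response fields and a sort that ends in the uniqueKey field as a tie-breaker; `fetch` is a hypothetical callable that sends the parameters to Solr and returns the parsed JSON response, so the sketch does not depend on any particular client library.

```python
from typing import Callable, Dict, Iterator


def iter_with_cursor(fetch: Callable[[Dict[str, str]], dict],
                     query: str, sort: str, rows: int = 100) -> Iterator[dict]:
    """Yield every matching document by following nextCursorMark until it stops changing."""
    cursor = "*"  # "*" asks for the first page
    while True:
        params = {"q": query, "sort": sort, "rows": str(rows), "cursorMark": cursor}
        response = fetch(params)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:
            # An unchanged cursor means there is nothing left to read.
            return
        cursor = next_cursor
```

Unlike incrementing `start`, each request here resumes after the last sort value seen, which is why the effect of concurrent indexing depends on the sort clause.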

[ANN] Heliosearch 0.06 released, native code faceting

2014-06-19 Thread Yonik Seeley
FYI, for those who want to try out the new native code faceting, this is the first release containing it (for single valued string fields only as of yet). http://heliosearch.org/download/ Heliosearch v0.06 Features: o Heliosearch v0.06 is based on (and contains all features of) Lucene/Solr 4.9.

Re: Calculating filterCache size

2014-06-19 Thread Erick Erickson
That's specific to using facet.method=enum, but I do admit it's easy to miss that. I added a note about that though... Thanks for pointing that out! On Thu, Jun 19, 2014 at 9:38 AM, Benjamin Wiens wrote: > Thanks to both of you. Yes the mentioned config is illustrative, we decided > for 512

Re: Query Response in Html

2014-06-19 Thread Erik Hatcher
Show us the complete code you’re using. Is the “text” not HTML text? What are you receiving exactly and what are you expecting instead? To use SolrJ with other types of responses (non-XML/javabin) you’ll need to configure a ResponseParser. The NoOpResponseParser may do the trick, where you get

Re: clarification on index-to-ram ratio

2014-06-19 Thread Vinay Pothnis
Thanks! And yes, the replica belongs to a different shard - not the same data. -Vinay On 19 June 2014 11:21, Toke Eskildsen wrote: > Vinay Pothnis [poth...@gmail.com] wrote: > > *"... Let's say that you have a Solr index size of 8GB. If your OS, > Solr's > > Java heap, and all other running pr

RE: clarification on index-to-ram ratio

2014-06-19 Thread Toke Eskildsen
Vinay Pothnis [poth...@gmail.com] wrote: > *"... Let's say that you have a Solr index size of 8GB. If your OS, Solr's > Java heap, and all other running programs require 4GB of memory, then > an ideal memory size for that server is at least 12GB ..."* > So, when we say "index size" does it include

Re: Segment Count of my Index is greater than the Configured MergeFactor

2014-06-19 Thread Chris Hostetter
: > I want to understand why 13 segments are created in my index?? : > Could appreciate if i can get response ASAP : Imagine the following scenario. You start from a clean index and do FWIW: the TL;DR of Shawn's response can be seen in this animation of how Log based MergePolicy's work in the

clarification on index-to-ram ratio

2014-06-19 Thread Vinay Pothnis
Hello All, The documentation and general feedback on the mailing list suggest the following: *"... Let's say that you have a Solr index size of 8GB. If your OS, Solr's Java heap, and all other running programs require 4GB of memory, then an ideal memory size for that server is at least 12GB ..."*
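
The arithmetic in the quoted guideline can be written out as a small Python sketch. The function names are hypothetical, and the model is only the rule of thumb quoted above (RAM for everything else plus enough left over for the OS to cache the whole index); it says nothing about how much of the index a given workload actually touches.

```python
def ideal_ram_gb(index_size_gb: float, other_memory_gb: float) -> float:
    """RAM needed so the OS disk cache can hold the entire index."""
    return index_size_gb + other_memory_gb


def cache_coverage(total_ram_gb: float, other_memory_gb: float, index_size_gb: float) -> float:
    """Fraction of the index that fits in the memory left over for the OS disk cache."""
    free_for_cache = max(total_ram_gb - other_memory_gb, 0.0)
    return min(free_for_cache / index_size_gb, 1.0)


if __name__ == "__main__":
    print(ideal_ram_gb(8, 4))        # the 8GB index + 4GB example from the quote
    print(cache_coverage(8, 4, 8))   # same index on an 8GB server
```

With the numbers from the quote, an 8GB index plus 4GB for everything else gives 12GB; on an 8GB server only half of that index could be cached.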

Re: Why aren't my nested documents nesting?

2014-06-19 Thread Vinay B,
Thanks, I tried the block join query via the browser this morning with no success; My URL ( encoded of course). I used this as a guide https://cwiki.apache.org/confluence/display/solr/Other+Parsers http://localhost:8088/solr/test_core/select?q={!parent which="content_type:parentDocument"}ATTRIBUT
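
As a side note on the "encoded of course" part, this is a small Python sketch of URL-encoding a block join query with the standard library. The host, port and core name are copied from the message; the child clause `child_field:value` is a placeholder assumption, because the real clause is cut off in the snippet above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://localhost:8088/solr/test_core/select"


def block_join_url(parent_filter: str, child_query: str) -> str:
    """Build a select URL whose q parameter is a {!parent} block join query."""
    q = '{!parent which="%s"}%s' % (parent_filter, child_query)
    return BASE_URL + "?" + urlencode({"q": q})


if __name__ == "__main__":
    url = block_join_url("content_type:parentDocument", "child_field:value")
    print(url)
    # Decoding the URL again should give back the original local-params query.
    print(parse_qs(urlsplit(url).query)["q"][0])
```

Encoding matters here because the braces, exclamation mark, quotes and space in the local-params prefix are not safe in a raw URL.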

Re: Limit Porter stemmer to plural stemming only?

2014-06-19 Thread jerry.ja...@gmail.com
Hi, Do you mind attaching the Plural only Stemmer? I can't find it in this post. Thanks Jerry -- View this message in context: http://lucene.472066.n3.nabble.com/Limit-Porter-stemmer-to-plural-stemming-only-tp486449p4142867.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Calculating filterCache size

2014-06-19 Thread Benjamin Wiens
Thanks to both of you. Yes, the mentioned config is illustrative; we decided on 512 after thorough testing. However, when you google "Solr filterCache" the first link is the community wiki, which has a config even higher than the illustration, which is quite different from the official reference guid

Query Response in Html

2014-06-19 Thread Venkata krishna
Hi, I am using XSLResponseWriter in my application to transform the XML response into HTML. I have set the following params for that purpose: query.set("wt", "xslt"); query.set("indent","true"); query.set("tr", "example.xsl"); but the response is coming back as normal text. Even though I remove the pa

Re: Bug in Collapsing QParserPlugin : Sort by 3 or more fields is broken

2014-06-19 Thread Joel Bernstein
Umesh, this is a good summary. So, the question is what is the cost (performance and memory) of having the CollapsingQParserPlugin choose the group head by using the Solr sort criteria? Keep in mind that the CollapsingQParserPlugin's main design goal is to provide fast performance when collapsing

Re: Synonyms - 20th and 20

2014-06-19 Thread Erick Erickson
You almost certainly have WordDelimiterFilterFactory in your analysis chain after the synonym insertion. Its _job_ is to split on letter/non-letter transitions. The admin/analysis page is your friend. Best, Erick On Wed, Jun 18, 2014 at 12:47 PM, Diego Fernandez wrote: > What tokenizer and fil

Re: Calculating filterCache size

2014-06-19 Thread Erick Erickson
Ben: As Shawn says, you're on the right track... Do note, though, that a 10K size here is probably excessive, YMMV of course. And an autowarm count of 5,000 is almost _certainly_ far more than you want. All these fq clauses get re-executed whenever a new searcher is opened (soft commit or hard c

Re: Segment Count of my Index is greater than the Configured MergeFactor

2014-06-19 Thread Shawn Heisey
On 6/19/2014 2:51 AM, RadhaJayalakshmi wrote: > I am using Solr 4.5.1. In that i have created an Index 114.8 MB. Also i have > the following index configuration > > 8 > 100 > 10 > > > I have given the ramBufferSizeMB of 100 and mergefac
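
To see how a segment count above mergeFactor can arise, here is a deliberately simplified Python model of level-based merging. It is an assumption-laden sketch, not Lucene's actual merge policy code: it assumes every flush produces one same-sized level-0 segment and that exactly `merge_factor` segments on a level are merged into one segment on the next level, ignoring segment sizes, deletes and concurrent merges.

```python
from typing import List


def segments_per_level(flushes: int, merge_factor: int = 10) -> List[int]:
    """Return how many segments sit on each level after the given number of flushes."""
    levels = [0]
    for _ in range(flushes):
        levels[0] += 1  # each flush adds one small segment at level 0
        level = 0
        while levels[level] == merge_factor:
            # A full level collapses into a single segment one level up.
            levels[level] = 0
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            level += 1
    return levels


if __name__ == "__main__":
    counts = segments_per_level(49)
    print(counts, sum(counts))
```

In this model each level can hold up to `merge_factor - 1` segments at rest, so the total across levels can exceed mergeFactor; 49 flushes, for instance, leave 9 segments on level 0 and 4 on level 1.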

Re: Unable to start solr 4.8

2014-06-19 Thread atp
Hi Markus, It recovered automatically after several attempts; once again, thanks a lot for your help. Now all the nodes have become live. [zk: Hadoop-Main:7001(CONNECTED) 0] get /clusterstate.json {"collection1":{ "shards":{ "shard1":{ "range":"8000-",

Re: Unable to start solr 4.8

2014-06-19 Thread atp
Thank you so much Markus, I have removed the contents from the index and now it's working, but one of the nodes went into Recovering state. The log says the following; please help to make it live. "– Unable to get file names for indexCommit generation: 2" 12118803 [qtp1490747277-11] INFO org.apache.solr.core.So

Re: Unable to start solr 4.8

2014-06-19 Thread Markus Jelsma
Hi - remove the lock file in your solr//data/index.*/ directory. Markus On Thursday, June 19, 2014 04:10:51 AM atp wrote: > Hi experts, > > I have configured SolrCloud on three machines; ZooKeeper started with no > errors, the Tomcat log also has no errors, the Solr log also reports no errors, but all >

Unable to start solr 4.8

2014-06-19 Thread atp
Hi experts, I have configured SolrCloud on three machines. ZooKeeper started with no errors, the Tomcat log also has no errors, and the Solr log also reports no errors, but all the Tomcat-configured Solr clusterstate shows as 'down'. 8870931 [Thread-13] INFO org.apache.solr.common.cloud.ZkStateReader – Up

deep faceting issues in distributed mode

2014-06-19 Thread Dmitry Kan
Hello, We face an issue with deep faceting in a distributed non-SolrCloud setting. A query comes in through the solr frontend (router) and broadcasts to each shard. The exception below appears in the frontend's logs, but shards' logs are clear, each subquery sent by the router succeeds. RAM graph

Re: Tracing Files Which Have Errors

2014-06-19 Thread Alexandre Rafalovitch
How did you post them? Didn't you get an error at some point? If you are completely stuck, you could probably export only the IDs back out (in CSV format) and compare them to the list of what you sent. Quite doable. Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http:
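
The compare step suggested above could look roughly like this in Python. It is a sketch under assumptions: the exported CSV is taken to have a header row with an `id` column, and `missing_ids` is a hypothetical helper name; how the list of sent IDs is obtained (for example from the file names that were posted) is left to the reader.

```python
import csv
from typing import Iterable, List


def missing_ids(sent_ids: Iterable[str], exported_csv_lines: Iterable[str],
                id_column: str = "id") -> List[str]:
    """Return the IDs that were sent but do not appear in the exported CSV."""
    indexed = {row[id_column] for row in csv.DictReader(exported_csv_lines)}
    return sorted(set(sent_ids) - indexed)


if __name__ == "__main__":
    exported = ["id", "doc-1", "doc-3"]  # stand-in for an open CSV file
    print(missing_ids(["doc-1", "doc-2", "doc-3"], exported))
```

Because `csv.DictReader` accepts any iterable of lines, an open file handle for the exported CSV can be passed in directly.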

Re: making solr to understand English

2014-06-19 Thread Alexandre Rafalovitch
LoL. That's several levels of abstraction and complication above what Solr provides. You are looking at full Natural Language Processing and things like SemEval (http://en.wikipedia.org/wiki/SemEval ). Or at least statistical and/or frame-based analysis (http://en.wikipedia.org/wiki/Frame_langu

Tracing Files Which Have Errors

2014-06-19 Thread Simon Cheng
Hi there, I have posted 190,000 simple XML files using post.jar and only 8 files had errors. But how do I know which ones have errors? Thank you in advance, Simon Cheng.

making solr to understand English

2014-06-19 Thread Vivekanand Ittigi
Hi, I'm trying to set up Solr so that it understands English. For example, I've indexed our company website (www.biginfolabs.com), or it could be any other website or our own data. If I enter English-like queries I should get a one-word answer, just as Google does; the queries are: * Where is Indi

Segment Count of my Index is greater than the Configured MergeFactor

2014-06-19 Thread RadhaJayalakshmi
Hi, I am using Solr 4.5.1. In it I have created an index of 114.8 MB. I also have the following index configuration: 8 100 10 I have given a ramBufferSizeMB of 100 and a mergeFactor of 10. So this means, that after indexing is com

Re: add new Fields with SolrJ without changing schema.xml

2014-06-19 Thread benjelloun
Hello, I will use DynamicField for some fields, but some other fields need to be created. This is an example:

Informations  unique ID field  Type  indexed  stored  multivalued  sortmissinglast  required
Iddocument    True             long  false    True    False        True             True

so how to add this f

Store Java object in field and retrieve it in custom function?

2014-06-19 Thread Costi Muraru
Hi, I'm trying to save a Java object in a binary field and afterwards use this value in a custom solr function. I'm able to put and retrieve the Java object in Base64 via the UI, but I can't seem to be able to retrieve the value in the custom function. In the function I'm using: termsIndex = Fiel

Re: add new Fields with SolrJ without changing schema.xml

2014-06-19 Thread Alexandre Rafalovitch
Use dynamic fields definitions perhaps? Just suffix the fields with "_s", "_i", etc. As per schema.xml. You could also use new schemaless mode, but then when they send a value that auto-creates a field of wrong type, it would be really hard to troubleshoot. Regards, Alex. Personal website: htt

Re: add new Fields with SolrJ without changing schema.xml

2014-06-19 Thread benjelloun
Hello, Because I will not stay working with the current company, no one will know how to change it manually if they need to add new fields in the future. So I need to do this with Java code. Can you please help me with an example to complete this: >public static void addNewField(Boolean uniqueId