Solr 4.4.0 on hadoop 2.2.0

2014-07-31 Thread Jeniba Johnson
Hi, I am new to Solr. I have integrated Solr 4.9.0 and Hadoop 2.3.0. I have changed the solrconfig.xml file so that it can index and store the data on HDFS. Solrconfig.xml hdfs://xxx.xx.xx.xx:50070/user/solr/data true 1 true 16384 true true true

Re: Solr vs ElasticSearch

2014-07-31 Thread Otis Gospodnetic
If performance is the main reason, you can stick with Solr. Both Solr and ES have many knobs to turn for performance, so it is impossible to give a direct and correct answer to the question of which is faster. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Suppor

Re: Autocommit, opensearchers and ingestion

2014-07-31 Thread rulinma
good -- View this message in context: http://lucene.472066.n3.nabble.com/Autocommit-opensearchers-and-ingestion-tp4119604p4150558.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr vs ElasticSearch

2014-07-31 Thread Alexandre Rafalovitch
Maybe Charlie Hull can answer that: https://twitter.com/FlaxSearch/status/494859596117602304 . He seems to think that - at least in some cases - Solr is faster. I am also doing a talk and a book on Solr vs. ElasticSearch, but I am not really planning to address those issues either, only the featur

Re: Solr vs ElasticSearch

2014-07-31 Thread Salman Akram
I did see that earlier. My main concern is search performance/scalability/throughput which unfortunately that article didn't address. Any benchmarks or comments about that? We are already using SOLR but there has been a push to check elasticsearch. All the benchmarks I have seen are at least few y

Re: Search on Date Field

2014-07-31 Thread Jack Krupansky
A range query: published_date:["2012-09-26T00:00:00Z" TO "2012-09-27T00:00:00Z"} With LucidWorks Search, you can simply say: published_date:2012-09-26 and it will internally generate that full range query. See: http://docs.lucidworks.com/display/lweug/Date+Queries -- Jack Krupansky -Ori
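For anyone building that range by hand, here is a minimal Python sketch that turns a calendar date into the same kind of query string. The function name is made up for illustration, and it assumes the field holds UTC timestamps. The opening "[" makes the lower bound inclusive and the closing "}" makes the upper bound exclusive, so midnight of the following day is not matched.

```python
from datetime import date, timedelta


def day_range_query(field: str, day: date) -> str:
    """Build a Solr range query covering one whole UTC day (hypothetical helper)."""
    start = day.strftime("%Y-%m-%dT00:00:00Z")
    end = (day + timedelta(days=1)).strftime("%Y-%m-%dT00:00:00Z")
    # '[' = inclusive lower bound, '}' = exclusive upper bound.
    return f'{field}:["{start}" TO "{end}"}}'


query = day_range_query("published_date", date(2012, 9, 26))
print(query)
```

The resulting string can be passed as the q or fq parameter by whatever client you already use.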

Solr Query Elevation Component

2014-07-31 Thread dboychuck
The documentation is very unclear (at least to me) around the Query Elevation Component and filter queries (fq param). The documentation for Solr 4.9 states: The fq Parameter Query elevation respects the standard filter query (fq) parameter. That is, if the query contains the fq parameter, all results

Re: Solr vs ElasticSearch

2014-07-31 Thread Otis Gospodnetic
Not super fresh, but more recent than the 2 links you sent: http://blog.sematext.com/2012/08/23/solr-vs-elasticsearch-part-1-overview/ Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Thu, Jul 31, 2014 at 10:33 PM, Salman Ak

Re: How to search for phrase "IAE_UPC_0001"

2014-07-31 Thread Paul Rogers
Hi Jack Thanks for the info. I'll take a look and see if I can figure it out (just purchased the book). P On 31 July 2014 17:16, Jack Krupansky wrote: > And I have a lot more explanation and examples for word delimiter filter > in my e-book: > http://www.lulu.com/us/en/shop/jack-krupansky/sol

Extend the Solr Terms Component to implement a customized Autosuggest

2014-07-31 Thread Juan Pablo Albuja
Good afternoon guys, I would really appreciate it if someone in the community could help me with the following issue: I need to implement a Solr autosuggest that supports: 1. Get autosuggestion over multivalued fields 2. Case insensitivity 3. Look for content in the middle for examp

Re: How to search for phrase "IAE_UPC_0001"

2014-07-31 Thread Jack Krupansky
And I have a lot more explanation and examples for word delimiter filter in my e-book: http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html -- Jack Krupansky -Original Message- From: Erick Erickson Sent: Thursday, July 31,

Re: Searching and highlighting ten's of fields

2014-07-31 Thread Manuel Le Normand
Right, it works! I was not aware of this functionality, or that it can be customized with the hl.requireFieldMatch param. Thanks

Re: integrating Accumulo with solr

2014-07-31 Thread Jack Krupansky
To be clear, I wasn't suggesting that Accumulo was the cause of integration complexity - EVERY NoSQL will have integration complexity of comparable magnitude. The advantage of DataStax Enterprise or Sqrrl Enterprise is that they have done the integration work for you. -- Jack Krupansky -O

Re: Solr vs ElasticSearch

2014-07-31 Thread Salman Akram
This is quite an old discussion. Wanted to check for any new comparisons after SOLR 4, especially with regard to performance/scalability/throughput? On Tue, Jul 26, 2011 at 7:33 PM, Peter wrote: > Have a look: > > > http://stackoverflow.com/questions/2271600/elasticsearch-sphinx-lucene-solr-xapian-

Re: Solr gives the same fieldnorm for two different-size fields

2014-07-31 Thread Erick Erickson
You can consider, say, a copyField directive and copy the field into a string type (or perhaps keywordTokenizer followed by lowerCaseFilter) and then match or boost on an exact match rather than trying to make scoring fill this role. In any case, I'm thinking of normalizing the sensitive fields and

re: Solr is working very slow after certain time

2014-07-31 Thread Chris Morley
The "Solr Performance Factors" page mentions 2 big tips that may help you, but you have to read the rest of the page to make sure you understand the caveats there. In general, adding many documents per update request is faster than one per update request. Reducing the frequency of automatic comm
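The first tip can be sketched in a few lines of Python: group the documents into fixed-size batches and send each batch as a single update request. The helper name and the batch size of 1000 are assumptions for illustration, not values from the page. The actual send step is left out because it depends on the client in use.

```python
from typing import Iterable, Iterator, List


def batches(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield lists of at most `size` documents (hypothetical helper)."""
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


docs = [{"id": str(i)} for i in range(2500)]
# Each batch would go out as ONE update request instead of one request per document.
sizes = [len(b) for b in batches(docs, 1000)]
print(sizes)
```

With 2500 documents this makes three requests rather than 2500. The best batch size depends on document size and is worth measuring.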

Re: Solr is working very slow after certain time

2014-07-31 Thread Otis Gospodnetic
Can we look at your disk IO and CPU? SPM can help. Isn't "UseCompressedOops" a typo? And deprecated? In general, may want to simplify your JVM params unless you are really sure they are helping. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr &

Solr is working very slow after certain time

2014-07-31 Thread Ameya Aware
Hi, I could index around 10 documents in a couple of hours. But after that the time for indexing is very large (around just 15-20 documents per minute). I have taken care of garbage collection. I am passing the parameters below to Solr: -Xms6144m -Xmx6144m -XX:MaxPermSize=128m -XX:+UseConcMarkSweepGC

Re: How to sync lib directory in SolrCloud?

2014-07-31 Thread Timothy Potter
You'll need to scp the JAR files to all nodes in the cluster. ZK is not a great distribution mechanism for large binary files since it has a 1MB znode size limit (by default). On Thu, Jul 31, 2014 at 10:26 AM, P Williams wrote: > Hi, > > I have an existing collection that I'm trying to add to a ne
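A quick way to see which files would run into that limit is to compare their sizes against it before trying to upload anything. This is only a sketch: the helper name is made up, and the threshold is assumed to be exactly 1 MiB, so check the exact default for your ZooKeeper version.

```python
import os
import tempfile

# Assumed threshold: the "1MB" default mentioned above, taken here as 1 MiB.
ZNODE_LIMIT_BYTES = 1024 * 1024


def too_big_for_zk(paths, limit=ZNODE_LIMIT_BYTES):
    """Return the paths whose file size exceeds the assumed znode limit."""
    return [p for p in paths if os.path.getsize(p) > limit]


# Demo with two throwaway files standing in for JARs in a lib directory.
with tempfile.TemporaryDirectory() as tmp:
    small = os.path.join(tmp, "small.jar")
    large = os.path.join(tmp, "large.jar")
    with open(small, "wb") as f:
        f.write(b"\0" * 1024)
    with open(large, "wb") as f:
        f.write(b"\0" * (2 * 1024 * 1024))
    oversized = [os.path.basename(p) for p in too_big_for_zk([small, large])]

print(oversized)
```

Anything this reports has to be copied to each node by other means, such as scp.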

Re: Solr gives the same fieldnorm for two different-size fields

2014-07-31 Thread gorjida
Thanks so much for your reply... In my case, it really matters because I am going to find the correct institution match for an affiliation string... For example, if an author belongs to the "university of Toronto", his/her affiliation should be normalized against the solr... In this case, "Universi

Re: Searching words with spaces for word without spaces in solr

2014-07-31 Thread sunshine glass
*Point 1:* On Thu, Jul 31, 2014 at 9:32 PM, Dyer, James wrote: > If a user is searching on "ice cream" but your index has "icecream", you > can treat this like a spelling error. WordBreakSolrSpellChecker would > identify the fact that while "ice cream" is not in your index, "icecream" > and th

Re: Solr gives the same fieldnorm for two different-size fields

2014-07-31 Thread Erick Erickson
And it won't be. Basically, the norms are an approximation (they used to be just a byte long), so fields of "close" lengths will have the same value here. Why is this an issue? If you back up a second, is a word appearing in a 4-word field really "enough" more important than one appearing in a 5
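To make the approximation concrete, here is a Python sketch that mimics a one-byte norm. It assumes the classic TF-IDF length norm of 1/sqrt(number of terms) and a lossy byte encoding modelled on Lucene's SmallFloat.floatToByte315. Both are assumptions about the similarity in use, and the function names are mine, so treat the numbers as illustrative.

```python
import math
import struct


def float_to_byte315(f: float) -> int:
    """Lossy float -> byte encoding, modelled on Lucene's SmallFloat.floatToByte315."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]  # raw float32 bits as a signed int
    small = bits >> 21
    fzero = (63 - 15) << 3
    if small <= fzero:
        return 0 if bits <= 0 else 1  # zero/negative -> 0, underflow -> smallest value
    if small >= fzero + 0x100:
        return 255  # overflow -> largest value
    return small - fzero


def byte315_to_float(b: int) -> float:
    """Decode the byte back to the (coarser) float it represents."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_norm(num_terms: int) -> float:
    """Norm as it would come back after a round trip through one byte."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


norms = {n: stored_norm(n) for n in range(1, 9)}
for n, value in norms.items():
    print(n, value)
```

Under these assumptions, 3-term and 4-term fields end up with the same stored norm, and so do 6-term and 7-term fields. That is the "close lengths, same value" effect described above.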

Re: How to search for phrase "IAE_UPC_0001"

2014-07-31 Thread Paul Rogers
Hi Erick Thanks for the reply. I'll have a look and see if it is any help. Again thanks for pointing me in the right direction. regards Paul On 31 July 2014 11:58, Erick Erickson wrote: > Take a look at WordDelimiterFilterFactory. It has a bunch of > options to allow this kind of thing to

Solr gives the same fieldnorm for two different-size fields

2014-07-31 Thread gorjida
I use solr for searching over a collection of institution names... My solr DB contains multiple field names such as name, country, city, A sample document looks like this: { "solr_id": 130950, "rg_id": 140239, "rg_parent_id": 1438, "name": "University of Califo

Re: How to search for phrase "IAE_UPC_0001"

2014-07-31 Thread Erick Erickson
Take a look at WordDelimiterFilterFactory. It has a bunch of options to allow this kind of thing to be indexed and searched. Note that in the default schema, the definition in the index part of the fieldType definition has slightly different parameters than the query time WordDelimiterFilterFactor
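As a rough picture of why splitting on delimiters helps here, the Python toy below breaks a reference into its parts regardless of which separator the file name used. It is not the WordDelimiterFilterFactory implementation, which has many more options such as splitting on case changes and letter/number transitions. It only illustrates the idea, and the space-separated variant is an assumed example.

```python
import re


def split_on_delimiters(token: str) -> list:
    """Split on any run of non-alphanumeric characters, '_' included (toy illustration)."""
    return [part for part in re.split(r"[^A-Za-z0-9]+", token) if part]


# The first two spellings appear in this thread; the third is an assumed variant.
variants = ["IAE-UPC-0001", "IAE_UPC_0001", "IAE UPC 0001"]
parts = [split_on_delimiters(v) for v in variants]
for variant, pieces in zip(variants, parts):
    print(variant, "->", pieces)
```

Because every spelling reduces to the same three parts, a phrase query over those parts can match whichever separator the URL happened to contain.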

Re: Auto suggest with adding accents

2014-07-31 Thread Otis Gospodnetic
You need to do the opposite. Make sure accents are NOT removed at index & query time. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Thu, Jul 31, 2014 at 5:49 PM, benjelloun wrote: > hi, > > q="gene" it suggest "geneve
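To see what accent removal does to the indexed terms, here is a small Python approximation of folding. It uses Unicode decomposition rather than Solr's ASCIIFoldingFilter, so it is a stand-in for illustration only, and the function name is mine.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks after NFD decomposition (rough stand-in for ASCII folding)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


original = "genève"
folded = fold_accents(original)
print(original, "->", folded)
```

Once the term is folded, the accent is gone from what the suggester stores, so it can only hand back the unaccented form. Keeping the accents in the indexed terms is what allows the accented spelling to be suggested.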

How to search for phrase "IAE_UPC_0001"

2014-07-31 Thread Paul Rogers
Hi Guys I have a Solr application searching on data uploaded by Nutch. The search I wish to carry out is for a particular document reference contained within the "url" field, e.g. IAE-UPC-0001. The problem is that the file names that comprise the URLs are not consistent, so a url might conta

How to sync lib directory in SolrCloud?

2014-07-31 Thread P Williams
Hi, I have an existing collection that I'm trying to add to a new SolrCloud. This collection has all the normal files in conf but also has a lib directory to support the filters schema.xml uses. wget https://github.com/projectblacklight/blacklight-jetty/archive/v4.9.0.zip unzip v4.9.0.zip I add

Re: SolrCloud loadbalancing, replication, and failover

2014-07-31 Thread Shawn Heisey
On 7/31/2014 12:58 AM, shuss...@del.aithent.com wrote: > Thanks for giving great explanation about the memory requirements. Could you > tell me what all parameters that I need to change in my SolrConfig.xml to > handle large index size. What are the optimal values that I need to use. > > My index

RE: Searching words with spaces for word without spaces in solr

2014-07-31 Thread Dyer, James
If a user is searching on "ice cream" but your index has "icecream", you can treat this like a spelling error. WordBreakSolrSpellChecker would identify the fact that while "ice cream" is not in your index, "icecream" is, and then you can re-query for the corrected version without the space. The p

Re: Auto suggest with adding accents

2014-07-31 Thread benjelloun
Hi, q="gene" suggests "geneve"; ASCIIFoldingFilter works by stripping the accent. What I need it to suggest is "genève". Any idea? Thanks, best regards, Anass BENJELLOUN -- View this message in context: http://lucene.472066.n3.nabble.com/Auto-suggest-with-adding-accents-tp4150379p4150392.html Sent f

Re: Searching words with spaces for word without spaces in solr

2014-07-31 Thread sunshine glass
I am not clear about this. This link is related to spell check. Can you elaborate on it? On Wed, Jul 30, 2014 at 9:17 PM, Dyer, James wrote: > In addition to the analyzer configuration you're using, you might want to > also use WordBreakSolrSpellChecker to catch possible matches that can't >

Re: Ranking based on match position in field

2014-07-31 Thread Ahmet Arslan
Hi Tomas, Sorry for the confusion. That link (open issue) means that it is a proposed and desired functionality. However, it has not been included in the code base yet. You could: * ping the author through JIRA and request that the patch be brought to trunk * vote for the issue * you could try if patch works w

Re: Auto suggest with adding accents

2014-07-31 Thread Ahmet Arslan
Hi, What happens when you add ASCIIFoldingFilter to the field type definition of suggestField? Ahmet On Thursday, July 31, 2014 5:49 PM, benjelloun wrote: Hello, I'm trying to autosuggest French words with accents, but if the user writes q="gene" it will not suggest "genève", it will suggest "gene

Re: Index a time/date range

2014-07-31 Thread Ryan Cutter
Great resources, thanks everyone! On Wed, Jul 30, 2014 at 8:12 PM, david.w.smi...@gmail.com < david.w.smi...@gmail.com> wrote: > The wiki page on the technique cleans up some small errors from Hoss’s > presentation: > http://wiki.apache.org/solr/SpatialForTimeDurations > > But please try Solr tr

Auto suggest with adding accents

2014-07-31 Thread benjelloun
Hello, I'm trying to autosuggest French words with accents, but if the user writes q="gene" it will not suggest "genève", it will suggest "general", "genetic" ... suggestDic org.apache.solr.spelling.suggest.Suggester org.apache.solr.spelling.suggest.fst.WFSTLookupFactory

Re: Querying from solr shards

2014-07-31 Thread Jack Krupansky
That would be two separate queries, one specifying a single core, and the other specifying all cores. Or, if that one ID is unique to that core, just combine the two queries as an OR. If not unique, try to find some field query that would make it unique. If even that is not unique, then you

RE: Identify specific document insert error inside a solrj batch request

2014-07-31 Thread Liram Vardi
Hi Jack, Thank you for your reply. This is the Solr stack trace. As you can see, the missing field is "hourOfDay". Thanks, Liram 2014-07-30 14:27:54,934 ERROR [qtp-608368492-19] (SolrException.java:108) - org.apache.solr.common.SolrException: [doc=53b16126--0002-2b03-17ac4d4a07b6] missing r

Re: SolrCloud without NRT and indexing only on the master

2014-07-31 Thread Ramkumar R. Aiyengar
I agree with Erick that this gain you are looking at might not be worth it, so do measure and see if there's a difference. Also, the next release of Solr is to have some significant improvements when it comes to CPU usage under heavy indexing load, and we have had at least one anecdote so far where t