Re: About enableLazyFieldLoading and memory

2014-03-19 Thread Miguel
An interesting check would be to disable compression on stored fields and see whether your searcher performs better. Disabling compression should increase the stored size, and the searcher should be quicker. I have read that to disable compression, all you need to do is write a new codec that uses a stored fields f

Re: About enableLazyFieldLoading and memory

2014-03-19 Thread david . davila
That could be an interesting test. Unfortunately I don't have time to do that right now, but maybe in the future. To avoid this memory consumption we have reduced the DocumentCache, and we don't have any problems. Besides, big queries that can cause problems are never made twice, so the DocumentCa

How to secure Solr admin page?

2014-03-19 Thread Tony Xue
Hi all, I was following the instructions in the official wiki: https://wiki.apache.org/solr/SolrSecurity But I don't have any idea about what directory I should put between to secure only admin page. I tried to put /admin/* but it didn't work. Tony

Re: SolrCloud constantly crashes after upgrading to Solr 4.7

2014-03-19 Thread Martin de Vries
We are running stable now for a full day, so the bug has been fixed. Many thanks! Martin

Support for wildcard queries in elevate.xml

2014-03-19 Thread Bratislav Stojanovic
Hi, I have searched the mailing list archives but couldn't find the right answer so far. I want to elevate some results using instructions from QueryElevationComponent page, but I'm not sure how to set queries in *elevate.xml* file. My query looks like this : (content:"foobar" OR text:"foobar")

Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
We run a central database of 14M (and growing) photos with dates, captions, keywords, etc. We are currently upgrading from old Lucene servers to the latest Solr, running on a couple of dedicated servers (6 core, 36GB, 500SSD). Planning on using Solr Cloud. We take in thousands of changes each day (big

Re: SolrCloud constantly crashes after upgrading to Solr 4.7

2014-03-19 Thread Steve Rowe
I’m glad it’s working for you now, thanks for reporting back. - Steve On Mar 19, 2014, at 5:32 AM, Martin de Vries wrote: > We are running stable now for a full day, so the bug has been fixed. > > Many thanks! > > Martin

frange and field with hyphen

2014-03-19 Thread Marcin Rzewucki
Hi everyone, I ran into the following issue recently. I'm trying to use frange on a field which has a hyphen in its name: true on *:* xml {!frange l=1 u=99}sub(if(1, div(acc_curr_834_2-1900_tl, 1), 0), 1) 2.2 I got the following error: DEBUG - 2014-03-19 12:11:53.805; or

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Toke Eskildsen
On Wed, 2014-03-19 at 11:55 +0100, Colin R wrote: > We run a central database of 14M (and growing) photos with dates, captions, > keywords, etc. > > We currently upgrading from old Lucene Servers to latest Solr running with a > couple of dedicated servers (6 core, 36GB, 500SSD). Planning on usin

Re: frange and field with hyphen

2014-03-19 Thread Jack Krupansky
For any "improperly" named field (one that doesn't follow the Java identifier conventions), you simply need to use the field function with the field name in apostrophes: div(acc_curr_834_2-1900_tl,1) becomes: div(field('acc_curr_834_2-1900_tl'),1) -- Jack Krupansky -Original Mess
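A minimal sketch of the rule Jack describes, for anyone building function queries in client code. The helper name field_ref is hypothetical, and the identifier check is a conservative ASCII-only assumption rather than the full Java identifier rules; names containing an apostrophe are rejected instead of guessing at an escaping scheme.

```python
import re

# Conservative, ASCII-only approximation of "looks like a Java identifier".
# This is an assumption for illustration, not the exact rule Solr applies.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def field_ref(name: str) -> str:
    """Return a function-query reference to a field (hypothetical helper).

    Identifier-like names are returned bare; anything else is wrapped in
    field('...'), as suggested in the thread.
    """
    if _IDENTIFIER.match(name):
        return name
    if "'" in name:
        # Escaping rules are not covered in the thread, so refuse to guess.
        raise ValueError(f"field name contains an apostrophe: {name!r}")
    return f"field('{name}')"


# Rebuild the expression from the thread with the hyphenated field wrapped.
expression = f"div({field_ref('acc_curr_834_2-1900_tl')},1)"
print(expression)
```

With the field name from the thread, this prints div(field('acc_curr_834_2-1900_tl'),1), matching the rewritten expression above.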

Re: Indexing large documents

2014-03-19 Thread Alexei Martchenko
Even the most unstructured data has to have some breakpoint. I've seen projects running Solr that used to index whole books as one document per chapter, plus a boosted synopsis doc. The question here is how you need to search and match those docs. alexei martchenko Facebook

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
Hi Toke, thanks for replying. My question is really about index architecture: one big index, or many small ones (with merged big ones)? We probably get 5-10K photos added each day. Others are updated, some are deleted. Updates need to happen quite fast (e.g. within minutes of our Databases receiving them

Re: frange and field with hyphen

2014-03-19 Thread Marcin Rzewucki
Wow, that was a fast reply :) It works. Thank you! On 19 March 2014 13:24, Jack Krupansky wrote: > For any "improperly" named field (that don't use the java identifier > conventions), you simply need to use the field function with the field name > in apostrophes: > > div(acc_curr_834_2-1900_tl,10

Sort by exact match

2014-03-19 Thread Rok Rejc
Hi all, I have a field in the index - let's call it Name. Name can have one or more words. I want to query all documents which match by name (full or partial match), and order the results: - first display result(s) with exact matches - after that display results with partial matches and order them

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Toke Eskildsen
On Wed, 2014-03-19 at 13:28 +0100, Colin R wrote: > My question is really regarding index architecture. One big or many small > (with merged big ones) One difference is that having a single index/collection gives you better ranked searches within each collection. If you only use date/filename sort

Re: Partial Counts in SOLR

2014-03-19 Thread Erick Erickson
Yes, that'll be slow. Wildcards are, at best, interesting and at worst resource consumptive. Especially when you're doing this kind of positioning information as well. Consider looking at the problem sideways. That is, what is your purpose in searching for, say, "buy*"? You want to find buy, buyin

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
Hi Toke, our current configuration is Lucene 2.(something) with a RAILO/CFML app server: 10K drives, quad core, 16GB, two servers. But the indexing and searching are starting to fail and our developer is no longer with us, so it is quicker to rebuild than fix all the code. Our existing config is lots o

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Erick Erickson
Oh my. 2.(something) is ancient; I second your move to scrap the current situation and start over. I'm really curious what the _reason_ for such a complex setup is/was. I second Toke's comments. This is actually quite small by modern Solr/Lucene standards. Personally I would index them all to a

Re: frange and field with hyphen

2014-03-19 Thread Erick Erickson
Jack's solution works, but I really, truly, strongly recommend that you follow the usual Java variable-naming conventions for your fields. In fact, I tend to use only lower case and underscores. The reason is that you'll run into this again and again and again. Your front-end will forget to put th

Re: Sort by exact match

2014-03-19 Thread Erick Erickson
Sorting applies to the entire result set; there's no notion of "sort some docs one way and sort others another way". So I don't know any OOB way to do what you want. I don't know what your response time requirements are, but you could do this by firing off two queries and collating the results. If
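A rough sketch of the "two queries, then collate" idea. It assumes each query has already been run and returned a list of documents as dicts with a unique "id" key; the function name, the key, and the sample documents are all illustrative and not taken from the thread. Exact matches come first, and partial matches follow in their own order with duplicates dropped.

```python
def collate(exact_hits, partial_hits, id_key="id"):
    """Merge two result lists: exact matches first, then partial matches.

    A document that appears in both lists is kept only once, in its
    exact-match position. Relative order inside each list is preserved.
    """
    seen = set()
    merged = []
    for doc in list(exact_hits) + list(partial_hits):
        doc_id = doc[id_key]
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc)
    return merged


# Illustrative data standing in for the responses of the two queries.
exact = [{"id": "7", "Name": "john smith"}]
partial = [
    {"id": "3", "Name": "john smith jr"},
    {"id": "7", "Name": "john smith"},
    {"id": "9", "Name": "mary smith"},
]

merged = collate(exact, partial)
print([doc["id"] for doc in merged])
```

Paging across the boundary between the two lists is the awkward part of this approach and is not handled here.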

join and filter query with AND

2014-03-19 Thread Marcin Rzewucki
Hi, I have the following issue with join query parser and filter query. For such query: *:* (({!join from=inner_id to=outer_id fromIndex=othercore}city:"Stara Zagora")) AND (prod:214) I got error: org.apache.solr.search.SyntaxError: Cannot parse 'city:"Stara': Lexical error at line 1, column

search for single char number when ngram min is 3

2014-03-19 Thread Andreas Owen
Is there a way to tell NGramFilterFactory while indexing that numbers shall never be tokenized? Then the query should be able to find numbers. Or do I have to change the ngram min for numbers to 1, if that is possible? So to speak, put the whole number as a token and not all possible tokens. Or can I

Re: More heap usage in Solr during indexing

2014-03-19 Thread solr2020
We autocommit every five minutes.

Re: search for single char number when ngram min is 3

2014-03-19 Thread Jack Krupansky
Interesting point. I think it would be nice to have an option to treat numeric sequences (or maybe with commas and decimal point as well) as integral tokens that won't be split by ngramming. It's worth a Jira. OTOH, you have to make a value judgment whether a query for "3.14" should only exact

Re: Best SSD block size for large SOLR indexes

2014-03-19 Thread Shawn Heisey
On 3/19/2014 12:09 AM, Salman Akram wrote: Thanks for the info. The articles were really useful but still seems I have to do my own testing to find the right page size? I thought for large indexes there would already be some tests done in SOLR community. Side note: We are heavily using Microsoft

Re: join and filter query with AND

2014-03-19 Thread Erick Erickson
It looks to me like you're feeding this from some kind of text file and you really _do_ have a line break after "Stara, or you have a line break in the string you paste into the URL, or something similar. Kind of shooting in the dark though. Erick On Wed, Mar 19, 2014 at 8:48 AM, Marcin Rzewucki wro

Re: underscore in query error

2014-03-19 Thread Erick Erickson
Attachments don't come through the user list very well, you might have to put it up on pastebin or some such and provide a link. But my guess is that your analysis chain is doing something interesting you don't expect, the analyzer output you tried to paste would help here. Also, if you could pro

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Shawn Heisey
On 3/19/2014 4:55 AM, Colin R wrote: My question is an architecture one. These photos are currently indexed and searched in three ways. 1: The 14M pictures from above are split into a few hundred indexes that feed a single website. This means index sizes of between 100 and 500,000 entries each.

underscore in query error

2014-03-19 Thread Andreas Owen
If I use the underscore in the query I don't get any results. If I remove the underscore it finds the docs with underscore. Can I tell solr to search through the ngtf instead of the wdf or is there any better solution? Query: yh_cug I attached a doc with the analyzer output

Re: Partial Counts in SOLR

2014-03-19 Thread Salman Akram
This was one example. Users can even add phrase searches with wildcards/proximity etc so can't really use stemming. Sharding is definitely something we are already looking into. On Wed, Mar 19, 2014 at 6:59 PM, Erick Erickson wrote: > Yes, that'll be slow. Wildcards are, at best, interesting an

Filter in terms component

2014-03-19 Thread Jilani Shaik
Hi, I have a huge index in Solr. I need the terms component with filtering by a field. Please let me know if there is any way to get this. Please give me some pointers, even if it means developing this myself by going through Lucene. Thanks, Jilani

Re: Filter in terms component

2014-03-19 Thread Ahmet Arslan
Hi Jilani, What features of the terms component are you after? If it is just terms.prefix, it could be simulated with the facet component and the facet.prefix parameter. The faceting component respects filter queries. On Wednesday, March 19, 2014 8:58 PM, Jilani Shaik wrote: Hi, I have huge index and
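As a hedged illustration of Ahmet's suggestion, the sketch below builds the query string for a facet request that lists terms by prefix under a filter query. The parameter names (q, rows, fq, facet, facet.field, facet.prefix, facet.limit, facet.mincount) are standard Solr request parameters; the function name, the field name "title", the prefix, and the filter are made up for the example.

```python
from urllib.parse import urlencode


def prefix_terms_params(field, prefix, filter_query, limit=10):
    """Build a query string that lists terms of `field` starting with
    `prefix`, counted only over documents matching `filter_query`.

    Hypothetical helper: it only assembles parameters and does not
    contact a Solr server.
    """
    return urlencode([
        ("q", "*:*"),
        ("rows", "0"),              # only facet counts are wanted, no docs
        ("fq", filter_query),       # faceting respects filter queries
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),
        ("facet.limit", str(limit)),
        ("facet.mincount", "1"),    # skip terms with no matching docs
    ])


query_string = prefix_terms_params("title", "sol", "category:books")
print(query_string)
```

The resulting string would be appended to a select handler URL. Whether this performs acceptably on an index of several hundred million documents is a separate question that the sketch does not address.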

Re: Filter in terms component

2014-03-19 Thread Jilani Shaik
Hi Ahmet, I have looked at the facet component, but our application has 300+ million docs and it is very time-consuming with this component, and it also uses the cache. So I looked at the terms component, where Solr reads the index for field terms. Is there any approach where I can get the terms

Re: Filter in terms component

2014-03-19 Thread Ahmet Arslan
Hi, If you just need counts, maybe you can make use of http://wiki.apache.org/solr/FunctionQuery#Relevance_Functions Ahmet On Wednesday, March 19, 2014 9:49 PM, Jilani Shaik wrote: Hi Ahmet, I have gone through the facet component, as our application has 300+ million docs and it very time

w/10 ? [was: Partial Counts in SOLR]

2014-03-19 Thread T. Kuro Kurosaka
In the thread "Partial Counts in SOLR", Salman gave us this sample query: ((stock or share*) w/10 (sale or sell* or sold or bought or buy* or purchase* or repurchase*)) w/10 (executive or director) I'm not familiar with this w/10 notation. What does this mean, and what parser(s) support this

Excessive Heap Usage from docValues?

2014-03-19 Thread tradergene
Hello All, I'm hoping to get your assistance in debugging what seems like a memory issue. I have a Solr index with about 32 million docs. Each doc is relatively small but has multiple dynamic fields that are storing INTs. The initial problem that I had to resolve is that we were running into OO

Re: How to return more fields on Solr 4.5.1 Suggester?

2014-03-19 Thread Ahmet Arslan
Hey Omer, Create a copy of movie_title and use the edgy_text type described here: http://searchhub.org/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/ With this approach you can request whatever field you want with the fl parameter. Ahmet On Monday, March 17, 2014 3:48 PM, Erick Erickson

Re: Indexing large documents

2014-03-19 Thread Tom Burton-West
Hi Stephen, We regularly index documents in the range of 500KB-8GB on machines that have about 10GB devoted to Solr. In order to avoid OOM's on Solr versions prior to Solr 4.0, we use a separate indexing machine(s) from the search server machine(s) and also set the termIndexInterval to 8 times th

Re: Zookeeper exceptions - SEVERE

2014-03-19 Thread Chris W
Thanks. Temporarily got over the problem by specifying custom limits through jute.maxbuffer= On Tue, Mar 18, 2014 at 9:45 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > Sorry guys I spoke too fast. I looked at the code again. No it doesn't > correlate with commits at all. I was m

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-19 Thread Otis Gospodnetic
Hi, Guessing it's surround query parser's support for "within" backed by span queries. Otis Solr & ElasticSearch Support http://sematext.com/ On Mar 19, 2014 4:44 PM, "T. Kuro Kurosaka" wrote: > In the thread "Partial Counts in SOLR", Salman gave us this sample query: > > ((stock or share*) w/

Re: Excessive Heap Usage from docValues?

2014-03-19 Thread Otis Gospodnetic
Hi, Which type of doc values? See Wiki or reference guide for a list of types. Otis Solr & ElasticSearch Support http://sematext.com/ On Mar 19, 2014 5:02 PM, "tradergene" wrote: > Hello All, > > I'm hoping to get your assistance in debugging what seems like a memory > issue. > > I have a Solr

Re: searche for single char number when ngram min is 3

2014-03-19 Thread Alexandre Rafalovitch
Does NGram factory support keyword token-type protection? If so, it could be just a matter of marking a number as keyword. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events f

Solr4.7 No live SolrServers available to handle this request

2014-03-19 Thread Sathya
Hi Friends, I am new to Solr. I have 5 Solr nodes on 5 different machines. When I index the data, a "*No live SolrServers available to handle this request*" exception sometimes occurs on 1 or 2 machines. I don't know why it happens or how to solve it. Kindly help me to solve this issue.

Re: Excessive Heap Usage from docValues?

2014-03-19 Thread Toke Eskildsen
On Wed, 2014-03-19 at 22:01 +0100, tradergene wrote: > I have a Solr index with about 32 million docs. Each doc is relatively > small but has multiple dynamic fields that are storing INTs. The initial > problem that I had to resolve is that we were running into OOMs (on a 48GB > heap, 130GB on-di

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-19 Thread Salman Akram
Yup! On Thu, Mar 20, 2014 at 5:13 AM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > Hi, > > Guessing it's surround query parser's support for "within" backed by span > queries. > > Otis > Solr & ElasticSearch Support > http://sematext.com/ > On Mar 19, 2014 4:44 PM, "T. Kuro Kurosaka"