Re: update to 4.3

2013-05-07 Thread Arkadi Colson
Found it on http://wiki.apache.org/solr/SolrLogging! Thx On 05/07/2013 08:40 AM, Arkadi Colson wrote: Any tips on what to do with the configuration files? Where do I have to store them and what should they look like? Any examples? May 07, 2013 6:16:27 AM org.apache.catalina.core.AprLifecycl

Re: When a search query comes to a replica what happens?

2013-05-07 Thread Furkan KAMACI
Hi Otis; I've read somewhere that if you have one replica with a search rate of 1000 queries per second and you switch to 5 replicas, you may get a search rate of 200 qps. What do you think about that, and how does Solr parallelize searching within replicas? By the way, when you say replica do you mean both
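
The figure in the question above is plain division if one assumes that a load balancer spreads queries evenly and that every replica holds the full index. A minimal sketch of that arithmetic (the function name is hypothetical, and even distribution is an assumption, not something the thread confirms):

```python
def per_replica_qps(total_qps, replicas):
    """Queries per second each replica sees, assuming perfectly even balancing."""
    if replicas < 1:
        raise ValueError("need at least one replica")
    return total_qps / replicas


if __name__ == "__main__":
    # 1000 qps spread over 5 replicas
    print(per_replica_qps(1000, 5))
```

This only models how incoming load is divided; it says nothing about how long a single query takes on one replica.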

RE: Solr Cloud with large synonyms.txt

2013-05-07 Thread Roman Chyla
We have synonym files bigger than 5MB so even with compression that would be probably failing (not using solr cloud yet) Roman On 6 May 2013 23:09, "David Parks" wrote: > Wouldn't it make more sense to only store a pointer to a synonyms file in > zookeeper? Maybe just make the synonyms file acces

Re: Rearranging Search Results of a Search?

2013-05-07 Thread Furkan KAMACI
Can I use Transformers for my purpose? 2013/5/3 Furkan KAMACI > I think this looks like what I search for: > https://issues.apache.org/jira/browse/SOLR-4465 > > How about post filter for Lucene, can it help me for my purpose? > > 2013/5/3 Otis Gospodnetic > >> Hi, >> >> You should use search mo

Re: Solr Cloud with large synonyms.txt

2013-05-07 Thread Jan Høydahl
Hi, SolrCloud is designed with an assumption that you should be able to upload your whole disk-based conf folder into ZK, and that you should be able to add an empty Solr node to a cluster and it would download all config from ZK. So immediately a splitting strategy automatically handled by ZkS

Re: Delete from Solr Cloud 4.0 index..

2013-05-07 Thread Annette Newton
Hi Erick, Thanks for the tip. Will docValues help with memory usage? It seemed a bit complicated to set up.. The index size saving was nice because that means that potentially I could use smaller provisioned IOP volumes which cost less... Thanks. On 3 May 2013 18:27, Erick Erickson wrote:

Re: solr adding unique values

2013-05-07 Thread Nikhil Kumar
Thanks Erik, For the reply ! I know about 'set' but that's not my goal, i had to give a better example. I want this and if i have to add another list_c user a[ id:a liists[ list_a, list_b ] ] It Should look like: user a[ id:a liists[ list_a, list_b, list

Lazy load Error on UI analysis area

2013-05-07 Thread yriveiro
Hi, I was exploring the UI interface and in the analysis section I had a lazy load error. The logs says: INFO - 2013-05-07 11:52:06.412; org.apache.solr.core.SolrCore; [] webapp=/solr path=/admin/luke params={_=1367923926380&show=schema&wt=json} status=0 QTime=23 ERROR - 2013-05-07 11:52:06

Re: Search performance: shards or replications?

2013-05-07 Thread Jan Høydahl
Hi, It depends(TM) on what kind of search performance problems you are seeing. If you simply have such a high query load that the server starts to kneel, it will definitely not help to shard, since ALL the shards will still be hit with ALL the queries, and you add some extra overhead with sharding as

Re: Search performance: shards or replications?

2013-05-07 Thread Stanislav Sandalnikov
Hi Yan, Thanks for the quick reply. Thus, replication seems to be the preferable solution. QTime decreases proportional to replications number or there are any other drawbacks? Just to clarify, what amount of documents stands for "tons of documents" in your opinion? :) 2013/5/7 Jan Høydahl >

Re: Search performance: shards or replications?

2013-05-07 Thread Stanislav Sandalnikov
P.S. Sorry for misspelling your name, Jan 2013/5/7 Stanislav Sandalnikov > Hi Yan, > > Thanks for the quick reply. > > Thus, replication seems to be the preferable solution. QTime decreases > proportional to replications number or there are any other drawbacks? > > Just to clarify, what amount

How to get Term Vector Information on Distributed Search

2013-05-07 Thread meghana
Hi, I am using distributed query to fetch records. The Distributed Search document on the wiki says that Distributed Search supports distributed query, but I am getting an error while querying. Not sure if I am doing anything wrong. Below is my query to fetch Term Vector with Distributed Search. http://localho

RE: How to get Term Vector Information on Distributed Search

2013-05-07 Thread Markus Jelsma
hi - this is a known issue: https://issues.apache.org/jira/browse/SOLR-4479 -Original message- > From:meghana > Sent: Tue 07-May-2013 14:28 > To: solr-user@lucene.apache.org > Subject: How to get Term Vector Information on Distributed Search > > Hi, > > I am using distributed query

Solr 1.4 - Proximity Search - Where is configuration for storing positions?

2013-05-07 Thread KnightRider
I have an index built using Solr 1.4 with one field. I was able to run proximity search (Ex: word1 within5 word2) but nowhere in the configuration do I see any information about storing/indexing the positions or offsets of the terms. My understanding is that we need to store/index termvectors positi

RE: Solr 1.4 - Proximity Search - Where is configuration for storing positions?

2013-05-07 Thread Markus Jelsma
Hi - they are indexed by default but can be omitted since 3.4: http://wiki.apache.org/solr/SchemaXml#Common_field_options -Original message- > From:KnightRider > Sent: Tue 07-May-2013 14:41 > To: solr-user@lucene.apache.org > Subject: Solr 1.4 - Proximity Search - Where is configurati

Re: Search performance: shards or replications?

2013-05-07 Thread Andre Bois-Crettez
Some clarifications : 1) *lots of docs, few queries* : If you have a high number of documents (+dozen millions) and lowish number of queries per second (say less than 10), replicas will not help to reduce the Qtime. For this kind of task it is better to shard the index, as each query will effecti

custom facet.sort

2013-05-07 Thread Giovanni Bricconi
I have a string field containing values such as "1khz" "1ghz" "1mhz" etc. I use this field to show a facet, currently I'm showing results in facet.sort=count order. Now I'm asked to reorder the facet according to the unit of measure (khz/mhz/ghz). I also have 3/4 other custom sorting to implement
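
One way to get the ordering asked for above is to keep requesting the facet from Solr as usual and reorder the returned values on the client. The sketch below assumes the facet arrives as (value, count) pairs and that values look like "1khz"; the unit table and function names are hypothetical and not part of any Solr API:

```python
import re

# Multipliers for the units mentioned in the question (assumed spelling).
UNIT_HZ = {"hz": 1, "khz": 10**3, "mhz": 10**6, "ghz": 10**9}

_LABEL = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([a-z]+)\s*")


def frequency_in_hz(label):
    """Convert a label such as '1khz' into a number of hertz."""
    match = _LABEL.fullmatch(label.lower())
    if match is None or match.group(2) not in UNIT_HZ:
        raise ValueError("unrecognized frequency label: %r" % label)
    return float(match.group(1)) * UNIT_HZ[match.group(2)]


def sort_facet_by_frequency(facet_counts):
    """Order (value, count) pairs by physical magnitude instead of by count."""
    return sorted(facet_counts, key=lambda pair: frequency_in_hz(pair[0]))


if __name__ == "__main__":
    print(sort_facet_by_frequency([("1ghz", 5), ("1khz", 9), ("1mhz", 2)]))
```

Sorting on the client only works when all facet values are fetched; with a facet.limit in place the cut-off is still decided by Solr's own ordering.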

RE: Solr 1.4 - Proximity Search - Where is configuration for storing positions?

2013-05-07 Thread KnightRider
Thanks Markus. - Thanks -K'Rider -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-1-4-Proximity-Search-Where-is-configuration-for-storing-positions-tp4061315p4061325.html Sent from the Solr - User mailing list archive at Nabble.com.

SOLR query performance

2013-05-07 Thread Kamal Palei
Dear All, I am using Apache Solr version 3.6.2 for my search engine on a job site. I am observing that a Solr query takes around 15 seconds to complete. I am sure there is something wrong in my approach or I am doing indexing wrongly. I need assistance/pointers to resolve this issue. I am providing

Re: Solr Cloud with large synonyms.txt

2013-05-07 Thread Mark Miller
On May 6, 2013, at 12:32 PM, Son Nguyen wrote: > I did some researches on internet and found out that because Zookeeper znode > size limit is 1MB. I tried to increase the system property "jute.maxbuffer" > but it won't work. > Does anyone have experience of dealing with it? Perhaps hit up the
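
Before uploading a large synonyms file, it can help to check its size against the znode limit discussed above. This is a rough sketch only: the 1MB constant is taken from the figure quoted in the thread, the function names are hypothetical, and the strict "less than" comparison is a conservative assumption rather than ZooKeeper's documented rule:

```python
import zlib

# Assumed limit: the 1MB figure quoted in the thread.
ZNODE_LIMIT_BYTES = 1024 * 1024


def fits_in_znode(payload, limit=ZNODE_LIMIT_BYTES):
    """Return True if the raw bytes stay under the assumed znode limit."""
    return len(payload) < limit


def compressed_size(payload):
    """Size in bytes after zlib compression, to judge whether compressing would help."""
    return len(zlib.compress(payload))


if __name__ == "__main__":
    sample = b"tv, television\n" * 1000
    print(len(sample), compressed_size(sample), fits_in_znode(sample))
```

Repetitive sample data like this compresses far better than a real synonyms file would, so the compressed size of the actual file is the number to look at.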

Re: SOLR query performance

2013-05-07 Thread Alexandre Rafalovitch
Yes, that's what the 'start' and 'rows' parameters do in the query string. I would check the queries Solr sees when you do that long request. There is usually a delay in retrieving items further down the sorted list, but 15 seconds does feel excessive. http://wiki.apache.org/solr/CommonQueryParame

Re: Solr Cloud with large synonyms.txt

2013-05-07 Thread Mark Miller
On May 7, 2013, at 10:24 AM, Mark Miller wrote: > > On May 6, 2013, at 12:32 PM, Son Nguyen wrote: > >> I did some researches on internet and found out that because Zookeeper znode >> size limit is 1MB. I tried to increase the system property "jute.maxbuffer" >> but it won't work. >> Does a

Get Suggester to return same phrase as query

2013-05-07 Thread Rounak Jain
Hi, I'm using the Suggester component in Solr, and if I search for "iPhone 5" the suggestions never give me the same phrase, that is "iPhone 5." Is there any way to alter this behaviour to return "iPhone 5" as well? A backup option could be to always display what the user has entered in the UI, b

Re: SOLR query performance

2013-05-07 Thread Kamal Palei
Thanks a lot, Alex. I will try to make use of the start filter and report back. In the meantime, I need to know how many total search records there are. Example: Let's say I am searching for the keyword "java". There might be 1000 documents containing the keyword java. I need to show only 100 records at a time. When

Re: SOLR query performance

2013-05-07 Thread Shawn Heisey
On 5/7/2013 8:45 AM, Kamal Palei wrote: > When user clicks, 4, I will set "start" filter as 300, "rows" filter as 100 > and do the query. As query result, I am expecting row count as 1000, and > 100 records data (row number 301 to 400). This is what using the start and rows parameter with Solr wil
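
The paging scheme described above (page 4 with 100 rows per page means start=300) can be sketched as follows. The base URL and helper names are assumptions for illustration; `q`, `start`, `rows` and `wt` are standard Solr query parameters:

```python
from urllib.parse import urlencode


def page_params(page, rows=100):
    """Translate a 1-based page number into Solr's start/rows parameters."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return {"start": (page - 1) * rows, "rows": rows}


def build_select_url(base_url, query, page, rows=100):
    """Build a /select URL for the requested page (base_url is a placeholder)."""
    params = {"q": query, "wt": "json"}
    params.update(page_params(page, rows))
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(build_select_url("http://localhost:8983/solr", "java", 4))
```

The total number of matches does not need a second query: Solr reports it as `numFound` in the response, independent of `start` and `rows`.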

Storing positions and offsets vs FieldType IndexOptions DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS

2013-05-07 Thread KnightRider
I see that Lucene 4.x has FieldInfo.IndexOptions that can be used to tell lucene whether to Index Documents/Frequencies/Positions/Offsets. We are in the process of upgrading from Lucene 2.9 to Lucene 4.x and I was wondering if there was a way to tell lucene whether to index docs/freqs/pos/offsets

Re: Search performance: shards or replications?

2013-05-07 Thread Stanislav Sandalnikov
Thank you, everything seems clear. 07.05.2013 20:17 пользователь "Andre Bois-Crettez" написал: > Some clarifications : > > 1) *lots of docs, few queries* : If you have a high number of documents > (+dozen millions) and lowish number of queries per second (say less than > 10), replicas will not h

Re: FieldCache insanity with field used as facet and group

2013-05-07 Thread Chris Hostetter
: I am using the Lucene FieldCache with SolrCloud and I have "insane" instances : with messages like: FWIW: I'm the one that named the result of these "sanity checks" "FieldCacheInsanity" and I have regretted it ever since -- a better label would have been "inconsistency" : VALUEMISMATCH: Mul

RE: Solr Cloud with large synonyms.txt

2013-05-07 Thread Son Nguyen
Mark, I tried to set that property on both ZK (I have only one ZK instance) and Solr, but it still didn't work. But I read somewhere that ZK is not really designed for keeping large data files, so this solution - increasing jute.maxbuffer (if I can implement it) should be just temporary. Son

RE: Solr Cloud with large synonyms.txt

2013-05-07 Thread Son Nguyen
Jan, Thank you for your answer. I've opened a JIRA issue with your suggestion. https://issues.apache.org/jira/browse/SOLR-4793 Son -Original Message- From: Jan Høydahl [mailto:jan@cominvent.com] Sent: Tuesday, May 07, 2013 4:16 PM To: solr-user@lucene.apache.org Subject: Re: Solr Cl

stats cache

2013-05-07 Thread J Mohamed Zahoor
Hi I am computing lots of stats as part of a query… looks like the solr caching is not helping here… Does solr caches stats of a query? ./zahoor

facet.pivot limit

2013-05-07 Thread J Mohamed Zahoor
Hi is there a limit for facet pivot like we have in facet.limit? ./zahoor

Re: Solr Cloud with large synonyms.txt

2013-05-07 Thread Mark Miller
I'm not so worried about the large file in zk issue myself. The concern is that you start storing and accessing lots of large files in ZK. This is not what it was made for, and everything stays in RAM, so they guard against this type of usage. We are talking about a config file that is loaded o

Re: stats cache

2013-05-07 Thread Otis Gospodnetic
Hi, Yes, in the query cache. You should see it in your monitoring tool or your Solr Stats Admin page. Doesn't help if queries don't repeat or cache settings are poor. Otis -- Search Analytics - http://sematext.com/search-analytics/index.html SOLR Performance Monitoring - http://sematext.com/spm

Use case for storing positions and offsets in index?

2013-05-07 Thread KnightRider
Can someone please tell me the usecase for storing term positions and offsets in the index? I am trying to understand the difference between storing positions/offsets vs indexing positions/offsets. Thanks KR - Thanks -K'Rider -- View this message in context: http://lucene.472066.n3.nabble

Re: Delete from Solr Cloud 4.0 index..

2013-05-07 Thread Erick Erickson
bq: Will docValues help with memory usage? I'm still a bit fuzzy on all the ramifications of DocValues, but I somewhat doubt they'll result in index size savings. They _really_ help with loading the values for a field, but the end result is still the values in memory. People who know what they'

Re: Rearranging Search Results of a Search?

2013-05-07 Thread Erick Erickson
No, DocTransformers work on a single document at a time, which is pretty clear if you look at the methods you must implement. Really, you'd do yourself a favor by doing a little more research before asking questions, you might review: http://wiki.apache.org/solr/UsingMailingLists and consider that

Re: Lazy load Error on UI analysis area

2013-05-07 Thread Erick Erickson
It looks like you have old jars in the classpath somewhere, class not found just shouldn't be happening. If this can be reproduced on a fresh install (and even better on a machine that's never had Solr installed) it would be something we'd need to pursue... Best Erick On Tue, May 7, 2013 at 6:56

RE: Unsubscribing from JIRA

2013-05-07 Thread johnmunir
For someone like me, who wants to follow dev discussions but not JIRA, having a separate mailing list subscription for each would be ideal. The incoming mail traffic would be cut drastically (for me, I get far more non relevant emails from JIRA vs. dev). -- MJ -Original Message- Fro

Re: Get Suggester to return same phrase as query

2013-05-07 Thread Erick Erickson
Hmmm, R. Muir did some work here: https://issues.apache.org/jira/browse/SOLR-3143, note that it's 4.0 or later. I haven't implemented this, but this is a common problem so if you do dig into it and get it to work (warning, I haven't a clue) it'd be a great contribution to the Wiki. Best Erick On

Re: Unsubscribing from JIRA

2013-05-07 Thread Alexandre Rafalovitch
Email filters? I mean, you may have a point, but the cost of change at this moment is probably too high. Personal email filters, on the other hand, seems like an easy solution. Regards, Alex. On Tue, May 7, 2013 at 2:01 PM, wrote: > For someone like me, who wants to follow dev discussions but

Search identifier fields containing blanks

2013-05-07 Thread Silvio Hermann
Hello, I am about to index identifier fields containing blanks (shelfmarks) eg. G 23/60 12 The field type is set to Solr.string. To get the exact matching hit (the doc with shelfmark mentioned above) the user must quote the search term. Is there a way to omit the quotes? Best, Silvio

Re: Storing positions and offsets vs FieldType IndexOptions DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS

2013-05-07 Thread Shawn Heisey
On 5/7/2013 9:50 AM, KnightRider wrote: I see that Lucene 4.x has FieldInfo.IndexOptions that can be used to tell lucene whether to Index Documents/Frequencies/Positions/Offsets. I really don't like giving unhelpful responses like this, but I don't think there's any other way to go. This is

dataimport handler

2013-05-07 Thread Eric Myers
In the data import handler I have multiple entities. Each one generates a date in the dataimport.properties i.e. entityname.last_index_time. How do I reference the specific entity time in my delta queries? Thanks Eric

Re: solr.LatLonType type vs solr.SpatialRecursivePrefixTreeFieldType

2013-05-07 Thread Smiley, David W.
Hi Barani, This identical question was posed at the same time on StackOverflow, and I answered it there already: http://stackoverflow.com/questions/16407110/solr-4-2-solr-latlontype-type-v s-solr-spatialrecursiveprefixtreefieldtype/16409327#16409327 ~ David On 5/6/13 12:28 PM, "bbarani" wrote:

Re: ConcurrentUpdateSolrServer "Missing ContentType" error on SOLR 4.2.1

2013-05-07 Thread cleardot
This is resolved, I switched in the 4.2.1 jars and also corrected a mismatch between the compile and runtime JDKs, for some reason the system was overriding my JAVA_HOME setting (6.1) and running the client with a 5.0 JVM. I did not have to use setParser. I did try running the 'new' 4.2.1 SolrJ c

Storing and retrieving Objects using ByteField

2013-05-07 Thread zqzuk
Hi I need to store and retrieve some custom java objects using Solr and I have used ByteField and java serialisation for this. Using the embedded jetty server I can see these byte data but when I use Solrj api to retrieve the data they are not available. Details are below: My schema:

Index compatibility between Solr releases.

2013-05-07 Thread Skand Gupta
We have a fairly large (in the order of 10s of TB) indices built using Solr 3.5. We are considering migrating to Solr 4.3 and was wondering what the policy is on maintaining backward compatibility of the indices? Will 4.3 work with my 3.5 indexes? Because of the large data size, I would ideally lik

Index compatibility between Solr releases.

2013-05-07 Thread Skand Gupta
We have a fairly large (in the order of 10s of TB) indices built using Solr 3.5. We are considering migrating to Solr 4.3 and was wondering what the policy is on maintaining backward compatibility of the indices? Will 4.3 work with my 3.5 indexes? Because of the large data size, I would ideally lik

Re: Index compatibility between Solr releases.

2013-05-07 Thread Shawn Heisey
On 5/7/2013 3:11 PM, Skand Gupta wrote: We have a fairly large (in the order of 10s of TB) indices built using Solr 3.5. We are considering migrating to Solr 4.3 and was wondering what the policy is on maintaining backward compatibility of the indices? Will 4.3 work with my 3.5 indexes? Because o

Re: dataimport handler

2013-05-07 Thread Shalin Shekhar Mangar
Using ${dih.<entity_name>.last_index_time} should work. Make sure you put it in quotes in your query. On Tue, May 7, 2013 at 12:07 PM, Eric Myers wrote: > In the data import handler I have multiple entities. Each one > generates a date in the > dataimport.properties i.e. entityname.last_index_time. > > H

Re: stats cache

2013-05-07 Thread Yonik Seeley
On Tue, May 7, 2013 at 12:48 PM, J Mohamed Zahoor wrote: > Hi > > I am computing lots of stats as part of a query… > looks like the solr caching is not helping here… > > Does solr caches stats of a query? No. Neither facet counts or stats part of a request are cached. The query cache only cache

Re: Index compatibility between Solr releases.

2013-05-07 Thread Skand S Gupta
Thank you Shawn. This was detailed and very helpful. Skand. On May 7, 2013, at 5:54 PM, Shawn Heisey wrote: > On 5/7/2013 3:11 PM, Skand Gupta wrote: >> We have a fairly large (in the order of 10s of TB) indices built using Solr >> 3.5. We are considering migrating to Solr 4.3 and was wonderin

Index corrupted detection from http get command.

2013-05-07 Thread Michel Dion
Hello, I'm looking for a way to detect Solr index corruption using an HTTP GET command. I've looked at the /admin/ping and /admin/luke request handlers but am not sure if their status provides guarantees that everything is all right. The idea is to be able to tell a load balancer to put a given solr ins

Re: Storing positions and offsets vs FieldType IndexOptions DOCS_AND_FREQS_AND_POSITIONS_AND_OFFSETS

2013-05-07 Thread KnightRider
Thanks Shawn. I'll reach out to Lucene discussion group. - Thanks -K'Rider -- View this message in context: http://lucene.472066.n3.nabble.com/Storing-positions-and-offsets-vs-FieldType-IndexOptions-DOCS-AND-FREQS-AND-POSITIONS-AND-OFFSETS-tp4061354p4061457.html Sent from the Solr - User ma

Re: Questions about the performance of Solr

2013-05-07 Thread joo
Thank you. However, fq is already in use. I suspect it may be slow because a single core holds the data for 70 million reviews. Do you have examples of performance degrading once a core grows beyond a certain number of documents? -- View this message in context: http://lucene.472066

Re: Search identifier fields containing blanks

2013-05-07 Thread Chris Hostetter
: I am about to index identifier fields containing blanks (shelfmarks) eg. G : 23/60 12 : The field type is set to Solr.string. To get the exact matching hit (the doc : with shelfmark mentioned above) the user must quote the search term. Is there : a way to omit the quotes? whitespace has to be qu
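
One client-side way to spare users the quotes is to backslash-escape whitespace and query-syntax characters before the term is sent to Solr. The sketch below is a hand-written helper in the spirit of SolrJ's `ClientUtils.escapeQueryChars`; the character set and function name are assumptions and should be checked against the query parser actually in use:

```python
# Characters treated as special by the Lucene/Solr query syntax (assumed set).
SPECIAL_CHARS = set('\\+-!(){}[]^"~*?:/&|;')


def escape_query_term(term):
    """Backslash-escape special characters and whitespace in a single term."""
    escaped = []
    for ch in term:
        if ch in SPECIAL_CHARS or ch.isspace():
            escaped.append("\\")
        escaped.append(ch)
    return "".join(escaped)


if __name__ == "__main__":
    # The shelfmark from the question, escaped so it is parsed as one term.
    print(escape_query_term("G 23/60 12"))
```

The escaped term can then be placed after the field name in the query, so the whole shelfmark is matched as one value against the string field.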

Re: Unsubscribing from JIRA

2013-05-07 Thread Chris Hostetter
: Email filters? I mean, you may have a point, but the cost of change at : this moment is probably too high. Personal email filters, on the other : hand, seems like an easy solution. The reason for having Jira notifications go to the devs list is that all of the comments & discussion in jira are

Re: Search identifier fields containing blanks

2013-05-07 Thread Upayavira
On Wed, May 8, 2013, at 02:07 AM, Chris Hostetter wrote: > > : I am about to index identifier fields containing blanks (shelfmarks) eg. > G > : 23/60 12 > : The field type is set to Solr.string. To get the exact matching hit > (the doc > : with shelfmark mentioned above) the user must quote the s

Re: Scores dilemma after providing boosting with bq as same weigtage for 2 condition

2013-05-07 Thread nishi
"ab_1eb83ef9bc0896":" 0.17063755 = (MATCH) sum of: 3.085E-4 = (MATCH) MatchAllDocsQuery, product of: 3.085E-4 = queryNorm 0.009742409 = (MATCH) product of: 0.019484818 = (MATCH) sum of: 0.016588148 = (MATCH) sum of: 0.0034696688 = (MATCH) weight(articleTopic:Food^1