Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Bernd Fehling
On 06.08.2015 at 17:48, Mikhail Khludnev wrote: > On Thu, Aug 6, 2015 at 3:56 PM, Bernd Fehling < > bernd.fehl...@uni-bielefeld.de> wrote: > >> >> >> On 06.08.2015 at 14:33, Upayavira wrote: >>> Typically such performance issues with faceting are to do with the time >>> spent uninverting the i

Re: How to do sorting instead of using bq

2015-08-06 Thread Upayavira
How do you know those boost values? Do they come from the outside? Could you put them in the index with the docs themselves? Then you can sort on a field in the doc. On Fri, Aug 7, 2015, at 04:40 AM, rachun wrote: > Hi all, > > I'm trying to sort some docs which is about 200 or more docs. > by us
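The suggestion above (store the ordering value in the index and sort on it) could look roughly like the following request parameters. This is only a sketch under assumptions: the field name `manual_rank` is hypothetical and would have to exist in the schema as a sortable numeric field, and the product ids are just sample values taken from the question.

```python
from urllib.parse import urlencode


def build_sort_params(product_ids, rank_field="manual_rank", order="desc"):
    """Build Solr select parameters that sort on an indexed rank field
    instead of boosting each product_id through bq.

    rank_field is a hypothetical field name, not one from the thread.
    """
    # Restrict the result set to the wanted products with a filter query.
    fq = "product_id:(" + " OR ".join(product_ids) + ")"
    return {
        "q": "*:*",
        "fq": fq,
        "sort": f"{rank_field} {order}",
        "rows": len(product_ids),
    }


params = build_sort_params(["L90094438", "W27529923", "W27529678"])
query_string = urlencode(params)
```

The resulting `query_string` can be appended to a collection's `/select` URL. Sorting on a field keeps the order independent of the relevance score, which is the point of moving away from bq.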

Re: Clarification on WordDelimiterFilter.

2015-08-06 Thread Modassar Ather
Hi, Any suggestion will be really helpful. Kindly provide your inputs. Thanks, Modassar On Thu, Aug 6, 2015 at 2:06 PM, Modassar Ather wrote: > I am using WordDelimiterFilter while indexing and searching both with the > following attributes. Parser used is edismax. Solr version is 5.2.1. > > *

How to do sorting instead of using bq

2015-08-06 Thread rachun
Hi all, I'm trying to sort about 200 or more docs by using bq like this: *[bq] => product_id:L90094438^1 product_id:L90094438^3 product_id:L90094438^5 product_id:W27529923^123 product_id:W27529678^127 product_id:W27530909^133* *[sort] => score asc* The score that ret

Filtering documents using payloads

2015-08-06 Thread Jamie Johnson
I am attempting to put together a DocsAndPositionsEnum that can hide terms given the payload on the term. The idea is that if a term has a particular access control and the user does not I don't want it to be visible. I have based this off of https://github.com/roshanp/lucure-core/blob/master/sr

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Upayavira
On Thu, Aug 6, 2015, at 06:56 PM, Toke Eskildsen wrote: > Upayavira wrote: > > Also, attempting to facet across a large number of docs is going to take > > some time. Perhaps you might gain some performance benefit by sharding > > your index? > > One should be aware that distributed faceting on

SolrJ update

2015-08-06 Thread Henrique O. Santos
Hello all, I am using SolrJ to do an index update on one of my collections. This collection has a uniqueKey id field: id This field is configured to be auto-generated in solrconfig.xml like this: id In my Java code, I just add the name field

Re: how to extend JavaBinCodec and make it available in solrj api

2015-08-06 Thread Shalin Shekhar Mangar
What do you mean by a custom format? As long as your custom component is writing primitives or NamedList/SimpleOrderedMap or collections such as List/Map, any response writer should be able to handle them. On Wed, Aug 5, 2015 at 5:08 PM, Dmitry Kan wrote: > Hello, > > Solr: 5.2.1 > class: org.apa

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Toke Eskildsen
Upayavira wrote: > Also, attempting to facet across a large number of docs is going to take > some time. Perhaps you might gain some performance benefit by sharding > your index? One should be aware that distributed faceting on String fields has a non-trivial overhead: It is a two-phase system,

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Upayavira
Also, attempting to facet across a large number of docs is going to take some time. Perhaps you might gain some performance benefit by sharding your index? Upayavira On Thu, Aug 6, 2015, at 04:48 PM, Mikhail Khludnev wrote: > On Thu, Aug 6, 2015 at 3:56 PM, Bernd Fehling < > bernd.fehl...@uni-bie

Re: Embedded Solr stopped to index after a while

2015-08-06 Thread Alexandre Rafalovitch
(shooting in the dark) What does your data directory look like? File sizes, etc. And which operating system? 4GB is where the Windows FAT filesystem has a size limit, but it really should not be that. Regards, Alex. Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter: http://www.

Embedded Solr stopped to index after a while

2015-08-06 Thread Aldric THOMAZO
Hello, I have an issue with embedded solr or a misconfiguration but no clue to resolve it. Solr stopped indexing a large set of data from a database after a while. It was running for many hours and when it reached a size of 4GB it stopped running although we are expecting about 40GB, without

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Mikhail Khludnev
On Thu, Aug 6, 2015 at 3:56 PM, Bernd Fehling < bernd.fehl...@uni-bielefeld.de> wrote: > > > On 06.08.2015 at 14:33, Upayavira wrote: > > Typically such performance issues with faceting are to do with the time > > spent uninverting the index before calculating the facet counts. > > > > If you ind

Functionality of post.jar

2015-08-06 Thread Aniruddh Sharma
Hi, I have a case where I have a csv file on my Unix file system and not in Hadoop file system. For example I have abc.xml in /home/cloudera/abc.xml on my Cloudera VMware. Now in Hadoop I go and I create a collection named test10 according to schema of abc.xml and using post.jar I post the file

Re: serious data loss bug in correlation with "too much data after closed"

2015-08-06 Thread Shawn Heisey
On 8/6/2015 8:31 AM, adfel70 wrote: > Are you sure that this parameter concerns /update requests? > On the one hand, it says that it "specifies the max size of form data > (application/x-www-form-urlencoded) sent via POST. You can use POST to pass > request parameters not fitting into URL" > > and

Re: Embedded Solr now deprecated?

2015-08-06 Thread Lukasz Salwinski
On 08/05/2015 08:34 PM, Ken Krugler wrote: Hi Shawn, We have a different use case than the ones you covered in your response to Robert (below), which I wanted to call out. We currently use the embedded server when building indexes as part of a Hadoop workflow. The results get copied to a produ

Schemaless mode and DIH

2015-08-06 Thread xavi jmlucjav
Hi, While working with DIH, I tried schemaless mode and found out that it does not work if you are indexing with DIH. I could not find any issue or reference to this in the mailing list, though I find it a bit surprising that nobody has tried that combination so far. Did anybody test this before? I manage

Re: serious data loss bug in correlation with "too much data after closed"

2015-08-06 Thread adfel70
Are you sure that this parameter concerns /update requests? On the one hand, it says that it "specifies the max size of form data (application/x-www-form-urlencoded) sent via POST. You can use POST to pass request parameters not fitting into URL" and on the other hand, I see that my bulks are as bi

Re: Upload core.properties to ZooKeeper

2015-08-06 Thread Kevin Lee
You should be able to use user defined properties within core.properties. However, it sounds like you are uploading core.properties to Zookeeper. In SolrCloud, core.properties is not uploaded to Zookeeper. You place core.properties within your core’s top level directory and the cores are aut

Re: serious data loss bug in correlation with "too much data after closed"

2015-08-06 Thread adfel70
I have some docs that I know I've overwritten, but this is fine because it is caused by some duplicate docs with the same data and the same id. I know of data loss because I know that a certain doc with a certain id should be in the index but it isn't. Upayavira wrote > Are you adding all new documents?

Re: SolrCloud on 5.2.1 cluster state

2015-08-06 Thread Suma Shivaprasad
Thanks for clarifying On Thu, Aug 6, 2015 at 6:43 PM, Shawn Heisey wrote: > On 8/6/2015 6:50 AM, Suma Shivaprasad wrote: > > I was having issues since I am using a solr 4.8.1 client to talk to a > 5.2.1 > > server. There is no API in ZKStateReader as well to add a collection to > > watch in the

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Erik Hatcher
> Do you think an all_parameters -> complete_response cache is possible? > It could be initialized right before or during warmup and would not take to > much memory. This is along the lines of Solr’s 304 capabilities. See

Re: SolrCloud on 5.2.1 cluster state

2015-08-06 Thread Shawn Heisey
On 8/6/2015 6:50 AM, Suma Shivaprasad wrote: > I was having issues since I am using a solr 4.8.1 client to talk to a 5.2.1 > server. There is no API in ZKStateReader as well to add a collection to > watch in the 4.8.1 API . I assume that would have caused the > clusterstate.json to be updated? > >

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Toke Eskildsen
On Thu, 2015-08-06 at 14:32 +0200, Bernd Fehling wrote: > It "feels" like this behavior showed up after changing to docValues for > sorting, > because before that the 99th percentile for qtime was at 550ms average and 1.4 > seconds at max. DocValues have faster startup but come with a constant perform

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Bernd Fehling
On 06.08.2015 at 14:33, Upayavira wrote: > Typically such performance issues with faceting are to do with the time > spent uninverting the index before calculating the facet counts. > > If you indexed the fields with docValues enabled, perhaps you could then > use them for faceting, which migh

Re: SolrCloud on 5.2.1 cluster state

2015-08-06 Thread Suma Shivaprasad
I was having issues since I am using a solr 4.8.1 client to talk to a 5.2.1 server. There is no API in ZKStateReader as well to add a collection to watch in the 4.8.1 API . I assume that would have caused the clusterstate.json to be updated? Since I am using a third party library (which in turn de

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Upayavira
Typically such performance issues with faceting are to do with the time spent uninverting the index before calculating the facet counts. If you indexed the fields with docValues enabled, perhaps you could then use them for faceting, which might improve performance. If you are using a non-docValu

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Bernd Fehling
Thanks a lot, your statement makes me feel better :-) It "feels" like this behavior showed up after changing to docValues for sorting, because before that the 99th percentile for qtime was at 550ms average and 1.4 seconds at max. So my assumption is that the inverted index on the sort fields (when _not

Re: Upload core.properties to ZooKeeper

2015-08-06 Thread Upayavira
Have you looked at the collections API? It has the ability to set properties against collections. I wonder if that'll achieve the same thing as adding them to core.properties? I've never used it myself, but wonder if it'll solve your issue. Upayavira On Thu, Aug 6, 2015, at 12:35 PM, marotosg wro

Re: are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Toke Eskildsen
On Thu, 2015-08-06 at 13:00 +0200, Bernd Fehling wrote: > Single Index Solr 4.10.4, optimized Index, 76M docs, 235GB index size. > > I was analysing my solr logs and it turned out that I have some queries > which are above 30 seconds qtime while normally the qtime is below 1 second. > Looking close

Upload core.properties to ZooKeeper

2015-08-06 Thread marotosg
Hi, I am in the process of migrating my master/slave Solr infrastructure to SolrCloud. At the moment I have several cores inside a folder with this structure /MyCores /MyCores/Core1 /MyCores/Core1/conf /MyCores/Core1/core.properties /MyCores/Core2 /MyCores/Core2/conf /MyCores/Core1/core.properti

are facets or MatchAllDocsQuery not cached?

2015-08-06 Thread Bernd Fehling
Single Index Solr 4.10.4, optimized Index, 76M docs, 235GB index size. I was analysing my solr logs and it turned out that I have some queries which are above 30 seconds qtime while normally the qtime is below 1 second. Looking closer at the queries it turned out that this is for MatchAllDocsQ

Re: Limits in individual filter sub queries

2015-08-06 Thread Selvam
Dear Toke, Thanks for your input. In fact, my scenario is much more complex, let me give you an example, q=*.*&fq=(country:india AND age:[25 TO 40] AND sex:male) OR (country:iran AND income:[5 TO 9]) You can see each subquery has different parameters, I may want to limit the first subquer
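When each sub-query needs its own limit, one option is to send one request per sub-query and merge the results on the client. This is a swapped-in technique, not something proposed in the thread. The sketch below only builds the query strings and merges already-fetched document lists. The limits of 50 and the `id` key are illustrative assumptions, and it uses `*:*` (Solr's match-all syntax) and `rows` where the post writes `*.*` and `limit`.

```python
from urllib.parse import urlencode


def build_subquery_requests(subqueries):
    """Return one URL-encoded Solr query string per (filter_query, limit) pair."""
    return [
        urlencode({"q": "*:*", "fq": fq, "rows": limit})
        for fq, limit in subqueries
    ]


def merge_results(result_lists, key="id"):
    """Concatenate document lists, dropping documents already seen by key."""
    seen = set()
    merged = []
    for docs in result_lists:
        for doc in docs:
            if doc[key] not in seen:
                seen.add(doc[key])
                merged.append(doc)
    return merged


# Hypothetical limits; the two filter queries are taken from the example above.
query_strings = build_subquery_requests([
    ("country:india AND age:[25 TO 40] AND sex:male", 50),
    ("country:iran AND income:[5 TO 9]", 50),
])
```

The trade-off is one round trip per sub-query, in exchange for full control over how many documents each sub-query contributes.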

Re: Limits in individual filter sub queries

2015-08-06 Thread Toke Eskildsen
On Thu, 2015-08-06 at 12:32 +0530, Selvam wrote: > Good day, I wanted to run a filter query (fq), say, I need to run > > q=*.*&fq=(country:india) OR (country:iran)&limit=100 > > Now it may return me 100 records that might contain 70 Indians & 30 Iran > records. Now how can I force to fetch 50 Ind

Clarification on WordDelimiterFilter.

2015-08-06 Thread Modassar Ather
I am using WordDelimiterFilter while indexing and searching both with the following attributes. Parser used is edismax. Solr version is 5.2.1. ** During search some of the results returned are not wanted. Following is the example. Search query: "3d image" Search results with 3-d image/3 d image/

RE: Solr spell check not showing any suggestions for other language

2015-08-06 Thread talha
Same results :( . Used following query http://localhost:8983/solr/product_live/select?q=%E0%A6%B8%E0%A6%B9%E0%A6%97&wt=json&indent=true&spellcheck=true&spellcheck.q=%E0%A6%B8%E0%A6%B9%E0%A6%97 -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-spell-check-not-showing-an

Copying index from one Solr cloud to other Solr cloud

2015-08-06 Thread 无线事业部―胡胜波11289
I have the same question. I tried to create a collection with the same name in another SolrCloud and copied the data folders to the new SolrCloud, but it does not work. By the way, my indexes are stored in HDFS. Can anybody help? Thanks 胡胜波 > >同程网络科技股份有限公司 > >苏州工业园区星湖街328号(崇文路口)创意产业园7栋

Limits in individual filter sub queries

2015-08-06 Thread Selvam
Hi All, Good day, I wanted to run a filter query (fq), say, I need to run q=*.*&fq=(country:india) OR (country:iran)&limit=100 Now it may return me 100 records that might contain 70 Indians & 30 Iran records. Now how can I force to fetch 50 Indian & 50 Iran records using a single SOLR query? Sh
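For the simple 50/50 case, one possibility in a single request is Solr's result grouping, which caps the number of documents returned per group. The sketch below only builds the parameters. Whether grouping fits the poster's setup is an assumption, and it uses `*:*` (the match-all syntax) where the post writes `*.*`.

```python
from urllib.parse import urlencode


def build_grouped_params(field, values, per_group):
    """Build Solr parameters that return at most per_group docs per field value."""
    fq = f"{field}:(" + " OR ".join(values) + ")"
    return [
        ("q", "*:*"),
        ("fq", fq),
        ("group", "true"),               # enable result grouping
        ("group.field", field),          # one group per distinct field value
        ("group.limit", str(per_group)), # max docs returned inside each group
    ]


params = build_grouped_params("country", ["india", "iran"], 50)
query_string = urlencode(params)
```

With grouping enabled, `rows` counts groups rather than documents, so the per-country cap is set with `group.limit`.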