Commit Parameters: The server does an auto commit every 30 seconds with
openSearcher=false. The pipeline does a hard commit only at the very end
of its run.
The high CPU issue I am seeing occurs only during reads, not during
writes. Right now I see a direct correlation between latencies and # of
Hi,
We are testing solr.
We have a document with some 100 indexes and there are around 10 million
records. It is failing, either getting stuck or timing out on queries.
Is this indexing job possible with Solr?
If yes, what should the hardware and Solr configuration be, and how many
nodes would be optimum?
Now I am r
You haven't told us anything about _how_ you're
trying to index this document nor what its format
is. Nor what "100 indexes and around 10 million
records" means: 1B total records? 10M total records?
Solr easily handles tens of millions of records on a single decent-sized
node; I've seen between 50M and 300M.
In most of our Solr use-cases, we fetch only fl= or
fl=,. I'd like to be able to do a distributed
search and skip STAGE_GET_FIELDS -- i.e. the stage where each shard is
queried for the documents found in the top ids -- as it seems like we
could be collecting this information earlier in the pipeline
What a coincidence - I was about to commit a patch which makes it
possible. It will be released in 4.8
See https://issues.apache.org/jira/browse/SOLR-1880
On Sun, Feb 23, 2014 at 11:27 PM, Gregg Donovan wrote:
> In most of our Solr use-cases, we fetch only fl= or
> fl=,. I'd like to be able to
I should clarify though that this optimization only works with fl=id,score.
On Sun, Feb 23, 2014 at 11:34 PM, Shalin Shekhar Mangar
wrote:
> What a coincidence - I was about to commit a patch which makes it
> possible. It will be released with 4.8
>
> See https://issues.apache.org/jira/browse/SOLR-1880
On Sun, Feb 23, 2014 at 1:08 PM, Shalin Shekhar Mangar
wrote:
> I should clarify though that this optimization only works with fl=id,score.
Although it seems like it should be relatively simple to make it work
with other fields as well, by passing down the complete "fl" requested
if some optional
Yes, that should be simple. But regardless of the parameter, the
fl=id,score use-case should be optimized by default. I think I'll
commit the patch as-is and open a new issue to add the
distrib.singlePass parameter.
On Sun, Feb 23, 2014 at 11:49 PM, Yonik Seeley wrote:
> On Sun, Feb 23, 2014 at 1:
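Once such a parameter exists, a single-pass distributed request might be built like this. The parameter name distrib.singlePass is taken from the message above; whether the server honors it depends on the Solr version, so this is a sketch of the client side only (the base URL and collection name are illustrative):

```python
from urllib.parse import urlencode

def build_select_url(base, q, fl, single_pass=False):
    """Build a Solr /select URL. distrib.singlePass asks a distributed
    search to return stored fields in the first phase, skipping
    STAGE_GET_FIELDS (parameter name per the thread; server support
    depends on the Solr version)."""
    params = {"q": q, "fl": fl, "wt": "json"}
    if single_pass:
        params["distrib.singlePass"] = "true"
    return f"{base}/select?{urlencode(params)}"

url = build_select_url("http://localhost:8983/solr/collection1",
                       "*:*", "id,score", single_pass=True)
print(url)
```

As noted above, the fl=id,score case is optimized without any parameter; the flag would only matter when requesting additional stored fields.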
I should also mention that, apart from committing, the pipeline also does a
bunch of deletes for stale documents (based on a custom version field). The
number of deletes can be very significant, causing deleted documents to
easily make up 40-50% of the index itself.
Thanks
KNitin
On Sun, Feb 23, 201
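The stale-document cleanup described above can be sketched as a delete-by-query against the custom version field. The field name version_l is an assumption here, as is the use of Solr's JSON update format:

```python
import json

def stale_delete_payload(version_field, current_version):
    """Build a JSON delete-by-query payload removing every document
    whose version field is strictly below current_version.
    (Field name and update format are assumptions, not from the thread.)"""
    # Lucene range syntax: [* TO N} is inclusive-exclusive.
    query = f"{version_field}:[* TO {current_version}}}"
    return json.dumps({"delete": {"query": query}})

payload = stale_delete_payload("version_l", 42)
print(payload)
```

With deleted documents reaching 40-50% of the index, an occasional commit with expungeDeletes=true (or a scheduled merge) is one common way to reclaim that space, at the cost of extra merge I/O.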
I've compared the results when using WikipediaTokenizer as the index-time
analyzer, but there is no difference.
2014-02-23 3:44 GMT+02:00 Ahmet Arslan :
> Hi Furkan,
>
> There is org.apache.lucene.analysis.wikipedia.WikipediaTokenizer
>
> Ahmet
>
>
> On Sunday, February 23, 2014 2:22 AM, Furkan KAM
Hi,
I came across an issue with URL encoding between PHP and Solr.
I have a field indexed with the value *WBE(Honda Edix)* in Solr.
From PHP code, if I urlencode($string) and send it to Solr, I do not get
accurate results.
Here is the part of the Solr query: *fq=model:WBE(Honda+Edix)*
However, if I
On 2/23/2014 8:58 PM, manju16832003 wrote:
> I came across an issue with URL encoding between PHP and Solr.
> I have a field indexed with the value *WBE(Honda Edix)* in Solr.
>
> From PHP code, if I urlencode($string) and send it to Solr, I do not get
> accurate results.
> Here is the part of the so
On Mon, Feb 24, 2014 at 11:52 AM, Shawn Heisey wrote:
>
>
> The Solarium library for PHP also says that it does escaping, but I
> can't find the manual section they mention about term escaping.
> Here's a section that has an example of phrase escaping (putting the
> value in double quotes):
>
Hi Shawn and Rico,
Thank you for your suggestions; they are valuable :-).
If Phrase Query does not work as expected sometimes, I guess we could use
*TermQuery* instead.
http://blog.florian-hopf.de/2013/01/make-your-filters-match-faceting-in-solr.html
This worked fine *fq={!term
Hi,
We have raw binary data stored in a database (not Word, Excel, XML, etc.
files) in a BLOB field.
We are trying to index it using TikaEntityProcessor, but nothing seems to
get indexed.
However, the same configuration works when XML/Word/Excel files are stored
in the BLOB field.
Below is our data-config.xml: