Anyone?
On Mon, Mar 17, 2014 at 12:03 PM, Salman Akram <
salman.ak...@northbaysolutions.net> wrote:
> Below is one of the sample slow query that takes mins!
>
> ((stock or share*) w/10 (sale or sell* or sold or bought or buy* or
> purchase* or repurchase*)) w/10 (executive or director)
>
> If a
We do have a couple of commodity SSDs already and they perform well. However,
our user queries are very complex and quite a few of them take over a minute,
so we really had to do something about it.
Comparing this beast with putting the whole index in RAM, the beast still
seemed the better option. Also we are
Thanks for the info. The articles were really useful, but it still seems I
have to do my own testing to find the right page size. I thought that for
large indexes there would already be some tests done in the Solr community.
Side note: We are heavily using Microsoft technology (.NET etc.) for
development, so by
Sorry guys, I spoke too fast. I looked at the code again. No, it doesn't
correlate with commits at all. I was mistaken.
On Wed, Mar 19, 2014 at 10:06 AM, Chris W wrote:
> Thanks, Shawn and Shalin
>
> How does the frequency of commit affect zookeeper?
>
>
> Thanks
>
>
> On Tue, Mar 18, 2014 at 9:12 PM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
Shalin, "correlated with how frequently you call commit" is it soft commit
or hard commit? , I guess it should be later one.
just curious what data it update to zookeeper during commit
On Tue, Mar 18, 2014 at 9:12 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> SolrCloud will update Zookeeper on state changes (node goes to
> recovery, comes back up etc) or for leader election and during
> collection API commands. It doesn't correlate directly with indexing
> but is correlated with how frequently you call commit.
Thanks, Shawn and Shalin
How does the frequency of commit affect zookeeper?
Thanks
On Tue, Mar 18, 2014 at 9:12 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> SolrCloud will update Zookeeper on state changes (node goes to
> recovery, comes back up etc) or for leader election and during
> collection API commands. It doesn't correlate directly with indexing
> but is correlated with how frequently you call commit.
SolrCloud will update Zookeeper on state changes (node goes to
recovery, comes back up etc) or for leader election and during
collection API commands. It doesn't correlate directly with indexing
but is correlated with how frequently you call commit.
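For example, a minimal SolrJ sketch (not from this thread; the URL and id
are made up) of using commitWithin so adds get folded into periodic commits
instead of one explicit commit per batch:

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class CommitWithinSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            // Fold this add into a commit within 60 seconds rather than
            // committing per batch; fewer commits, less cluster churn.
            server.add(doc, 60000);
            server.shutdown();
        }
    }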
On Wed, Mar 19, 2014 at 5:46 AM, Shawn Heisey wrote:
Hi,
I think you probably want to split giant documents because you / your users
probably want to be able to find smaller sections of those big docs that
are best matches to their queries. Imagine querying War and Peace. Almost
any regular word you query for will produce a match. Yes, you may w
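A minimal SolrJ sketch of that splitting approach (file name, chunk size,
and field names are all assumptions; splitting on page or chapter boundaries
would beat fixed character offsets):

    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class SplitBigDocSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            String text = new String(Files.readAllBytes(Paths.get("war_and_peace.txt")), "UTF-8");
            int chunk = 100000; // characters per section-level document
            List<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
            for (int i = 0, part = 0; i < text.length(); i += chunk, part++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "book-42_part" + part);  // one Solr doc per section
                doc.addField("book_id_s", "book-42");       // ties sections to the source file
                doc.addField("body_txt", text.substring(i, Math.min(i + chunk, text.length())));
                docs.add(doc);
            }
            server.add(docs);
            server.commit();
            server.shutdown();
        }
    }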
Hi Solr Users,
I'm looking for advice on best practices when indexing large documents
(hundreds of MB or even 1-2 GB text files). I've been hunting around on
Google and the mailing list, and have found some suggestions of splitting
the logical document up into multiple Solr documents. However, I h
Hi Lajos,
Could this be due to the heavy query-time processing chain associated with
the TextField? You might also check out AnalyzingInfixLookupFactory if the
suggestion entries are a bit long (this suggester will give matches even
if the query matches a term in the middle of a suggestion entry).
I
Hi Lajos,
Can you elaborate on the "get the overflow when using a text field" part?
The new SuggestComponent should work just as well for DocumentDictionary.
Thanks
Areek
On Mon, Mar 17, 2014 at 6:05 PM, Lajos wrote:
> Hi Steve,
>
> I've posted previously about a nice Stackoverflow exception
On 3/18/2014 5:46 PM, Chris W wrote:
I am running a 3-node ZooKeeper 3.4.5 quorum. I am running into issues
with ZooKeeper transaction logs:
[myid:2] - ERROR [main:QuorumPeer@453] - Unable to load database on disk
java.io.IOException: Unreasonable length = 1048587
at
org.apache.jute.BinaryInpu
I am running a 3-node ZooKeeper 3.4.5 quorum. I am running into issues
with ZooKeeper transaction logs:
[myid:2] - ERROR [main:QuorumPeer@453] - Unable to load database on disk
java.io.IOException: Unreasonable length = 1048587
at
org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.j
Hi all,
I have some questions re shards.tolerant=true and timeAllowed=xxx.
I have seen situations where shards.tolerant=true works: if one of the
shards specified in a query is dead, shards.tolerant seems to work and I get
results from the remaining shards.
However, if one of the shards goes down du
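For what it's worth, a minimal SolrJ sketch (URL illustrative) showing where
the two knobs sit on a query:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class TolerantQuerySketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrQuery q = new SolrQuery("*:*");
            q.set("shards.tolerant", true);  // accept partial results if a shard is dead
            q.setTimeAllowed(5000);          // give up searching after 5s; results may be partial
            QueryResponse rsp = server.query(q);
            System.out.println("numFound: " + rsp.getResults().getNumFound());
            server.shutdown();
        }
    }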
Did you change the schema at all?
No
Did you upgrade Solr from a previous version with the same index?
No
This was fresh install from the website.
Ran "ant run-example"
Killed that instance
Copied Example to Node1
Copied Example to Node2
Switched into Node1
java -Dbootstrap_confdir=solr/collect
Hi Steve,
This feature makes sense for us because we don't have write access in production.
Anyway, I'll write a script to push config file updates directly to ZooKeeper and
reload the collection. But it's always simpler when it's already integrated
in an admin tool.
Thank you for your time.
On 3/18/2014 3:51 PM, AJ Lemke wrote:
I have a strange issue with my local SOLR install.
I have a search that sorts on a boolean field. This search is pulling the following
error: "java.lang.String cannot be cast to org.apache.lucene.util.BytesRef".
The search is over the dummy data that is inc
Joel,
I had a discussion with you earlier about inconsistent ngroups numbers,
where you suggested using the composite id to make sure that identical
(ADSKDedup) fields end up in the same shard.
Here's the thread -->
http://lucene.472066.n3.nabble.com/SolrCloud-Result-Grouping-vs-Collaps
Hello all!
I have a strange issue with my local SOLR install.
I have a search that sorts on a boolean field. This search is pulling the
following error: "java.lang.String cannot be cast to
org.apache.lucene.util.BytesRef".
The search is over the dummy data that is included in the exampledocs.
Hi Francois,
The config file editing functionality was pulled out of Solr before the 4.7
release; what remains is a read-only config directory browser/file viewer.
May I ask why you thought the config file editing functionality was in 4.7?
Steve
On Mar 18, 2014, at 4:39 PM, Francois Perron wrote:
Salman Akram [salman.ak...@northbaysolutions.net] wrote:
[Hundreds of GB index]
> http://www.storagereview.com/micron_p420m_enterprise_pcie_ssd_review
May I ask why you have chosen a drive with such a high speed and matching cost?
We have some years of experience with using SSDs for search at w
Hi,
I installed the latest version of Solr (4.7.0) and I want to try the new
functionality to edit config files in the Admin UI. But when I click on a
file, no edit box appears!
This is the version info:
* solr-spec: 4.7.0
* solr-impl: 4.7.0 1570806 - simon - 2014-02-22 08:36:23
* luce
Hi Shamik,
I see that you are using distributed search. With the
CollapsingQParserPlugin you need to have all the documents that are in the
same group on the same shard.
Is that the way you have the documents indexed?
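For reference, a minimal sketch (ids are made up, and it assumes the
collection uses the default compositeId router) of routing a group to one
shard via the id prefix:

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class CompositeIdRoutingSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrInputDocument doc = new SolrInputDocument();
            // Everything sharing the "dedupKey123!" prefix hashes to the same
            // shard, so a whole collapse group stays co-located.
            doc.addField("id", "dedupKey123!doc-1");
            doc.addField("ADSKDedup", "dedupKey123");
            server.add(doc);
            server.commit();
            server.shutdown();
        }
    }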
Joel
Joel Bernstein
Search Engineer at Heliosearch
On Mon, Mar 17, 2014 at
Varun,
You could use a function query involving “min” with a comma-separated list
of geodist clauses.
See https://cwiki.apache.org/confluence/display/solr/Spatial+Search
“Boost Nearest Results”. You’d replace the geodist() in there with
min(geodist(45.15,-93.85),geodist(50.2,22.3),…) (etc.)
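Sketching that in SolrJ (just one way to wire it in, here as an additive
edismax bf; the sfield name "store" and the recip constants are assumptions):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;

    public class NearestOfManyPointsSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrQuery q = new SolrQuery("*:*");
            q.set("defType", "edismax");
            q.set("sfield", "store");  // location field used by geodist()
            // Distance to the nearest of the two points drives the boost.
            q.set("bf", "recip(min(geodist(45.15,-93.85),geodist(50.2,22.3)),1,1000,1000)");
            server.query(q);
            server.shutdown();
        }
    }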
~
I wonder whether this is a known bug. In previous SolrCloud versions, 4.4
or maybe 4.5, an explicit optimize() without any parameters usually took
2 minutes for a 32-core cluster.
However, in 4.6.1, the same call took about 1 hour. Checking the index
modification time for each core shows 2 m
Thanks Joel - I decided on another route. Since I was almost always grouping,
I am trying another model where we store the data with fewer rows and a few
multivalued fields.
Thanks,
Indeed, the subject line was misleading.
Then I will file a new improvement request for "block atomic update
support".
On Tue, Mar 18, 2014 at 2:08 PM, Jack Krupansky wrote:
> That's a reasonable request and worth a Jira, but different from what you
> have specified in your subject lin
I disabled softCommit and tried to run another indexing process.
Now I see no jetty EofException and no latency peaks.
I also noticed that when I had softCommit every 10 minutes, I also saw
spikes in the major GC (I use CMS) to around 9-10k.
Any idea?
Shawn Heisey-4 wrote
> On 3/17/2014 7:07
On 3/18/2014 3:26 PM, Martin de Vries wrote:
Martin, I’ve committed the SOLR-5875 fix, including to the
lucene_solr_4_7 branch.
Any chance you could test the fix?
Hi Steve,
I'm very happy you found the bug. We are running the version from SVN on
one server and it's already running fine for 5
On 3/18/2014 8:37 AM, Erick Erickson wrote:
> It sounds like you already understand mmap. Even so you might be
> interested in this excellent writeup of MMapDirectory and Lucene by
> Uwe: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
There is some actual bad memory report
Avishai:
It sounds like you already understand mmap. Even so you might be
interested in this excellent writeup of MMapDirectory and Lucene by
Uwe: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
Best,
Erick
On Tue, Mar 18, 2014 at 7:23 AM, Avishai Ish-Shalom
wrote:
> aha
Martin, I’ve committed the SOLR-5875 fix, including to the
lucene_solr_4_7 branch.
Any chance you could test the fix?
Hi Steve,
I'm very happy you found the bug. We are running the version from SVN
on one server and it's already running fine for 5 hours. If it's still
stable tomorrow then w
Done, thanks!
On Tue, Mar 18, 2014 at 3:54 AM, Anders Gustafsson
wrote:
> Yes, please. My Wiki ID is Anders Gustafsson
>
>> But yes, please, add the howto to Wiki. You will need to get your
>> account whitelisted first (due to spammers), so send a separate email
>> with your Apache wiki id and so
Hello,
What is the difference between setting parameters via SolrQuery vs
ModifiableSolrParams? If there is a difference, is there a preferred
choice? I'm using Solr 4.6.1.
SolrQuery query = new SolrQuery();
query.setParam("wt", "json");
ModifiableSolrParams params = new ModifiableSolrParams()
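For what it's worth, a minimal sketch of how the two relate: SolrQuery
extends ModifiableSolrParams, so the difference is only the typed
convenience setters.

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class ParamsVsQuerySketch {
        public static void main(String[] args) {
            // Plain params: every setting is a raw key/value pair.
            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("q", "*:*");
            params.set("wt", "json");

            // SolrQuery is-a ModifiableSolrParams with convenience setters.
            SolrQuery query = new SolrQuery();
            query.setQuery("*:*");         // equivalent to set("q", "*:*")
            query.setParam("wt", "json");

            ModifiableSolrParams upcast = query;  // compiles: same type underneath
            System.out.println(upcast.get("q"));  // prints *:*
        }
    }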
Aha! mmap explains it. Thank you.
On Tue, Mar 18, 2014 at 3:11 PM, Shawn Heisey wrote:
> On 3/18/2014 5:30 AM, Avishai Ish-Shalom wrote:
> > My Solr instances are configured with 10GB heap (Xmx) but Linux shows
> > a resident size of 16-20GB. Even with thread stack and permgen taken into
> > acco
On 3/18/2014 7:39 AM, Salman Akram wrote:
> This SSD default size seems to be 4K not 16K (as can be seen below).
>
> Bytes Per Sector : 512
> Bytes Per Physical Sector : 4096
> Bytes Per Cluster : 4096
> Bytes Per FileRecord Segment: 1024
The *sector* size o
Yes, I did.
I have also tried disabling security and removing the '#' from the masterURL.
... but still get, "Master at: http://:9081/solr/collection1 is not
available. Index fetch failed."
Thanks
-Original Message-
From: Doug Turnbull [mailto:dturnb...@opensourceconnections.com]
Sent
On 3/18/2014 7:48 AM, Cynthia Park wrote:
> What is the difference between setting parameters via SolrQuery vs
> ModifiableSolrParams? If there is a difference, is there a preferred
> choice? I'm using Solr 4.6.1.
>
> SolrQuery query = new SolrQuery();
> query.setParam("wt", "json");
>
> Modifi
On 3/18/2014 7:18 AM, david.dav...@correo.aeat.es wrote:
> Yes, but if I use enableLazyFieldLoading=true and my queries only request
> very small fields like ID, the DocumentCache shouldn't grow, although my
> stored fields are very big. Am I wrong?
Since Solr 4.1, stored fields are compressed.
Hi all,
I have a field that contains dates (it has date type) and I would like
to make a hierarchical (pivot) facet based on that field.
So I would like to have something like this:
date_of_creation:
|__ 2014
|   |__ January
|   |   |__ 01
|   |   |__ 02
|   |   |__ 14
|
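One common workaround (a sketch under assumptions; the *_i component fields
are made-up names you add yourself) is to denormalize the date into one
field per level at index time and pivot over them, since facet.pivot works
on discrete field values:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class DatePivotSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

            // Index time: store each level of the hierarchy separately.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "1");
            doc.addField("date_of_creation", "2014-01-14T00:00:00Z");
            doc.addField("created_year_i", 2014);
            doc.addField("created_month_i", 1);
            doc.addField("created_day_i", 14);
            server.add(doc);
            server.commit();

            // Query time: pivot year -> month -> day to get the tree above.
            SolrQuery q = new SolrQuery("*:*");
            q.setFacet(true);
            q.set("facet.pivot", "created_year_i,created_month_i,created_day_i");
            server.query(q);
            server.shutdown();
        }
    }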
This SSD default size seems to be 4K not 16K (as can be seen below).
Bytes Per Sector : 512
Bytes Per Physical Sector : 4096
Bytes Per Cluster : 4096
Bytes Per FileRecord Segment: 1024
I will go through the articles you sent. Thanks
On Tue, Mar 18, 2014 at
On 3/18/2014 7:12 AM, Salman Akram wrote:
> Is there a rule of thumb for ideal block size for SSDs for large indexes
> (in hundreds of GBs)? Read performance is of top importance for us and we
> can sacrifice the space a little...
>
> This is the one we just got and wanted to see if there are any
Hi Miguel,
Yes, but if I use enableLazyFieldLoading=true and my queries only request
very small fields like ID, the DocumentCache shouldn't grow, although my
stored fields are very big. Am I wrong?
Best regards,
David Dávila Atienza
AEAT - Departamento de Informática Tributaria
Subdirección de
All,
Is there a rule of thumb for ideal block size for SSDs for large indexes
(in hundreds of GBs)? Read performance is of top importance for us and we
can sacrifice the space a little...
This is the one we just got and wanted to see if there are any test results
out there
http://www.storagerevie
Hi David
If you use lazy field loading (enableLazyFieldLoading=true), the
documentCache functionality is somewhat limited. This means that the
document stored in the documentCache will contain only those fields
that were passed to the fl parameter.
documentCache requires memory, the
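To make the fl interaction concrete, a minimal SolrJ sketch (URL
illustrative) of a query that only ever asks for id, so the large stored
fields are never loaded:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;

    public class LazyFieldLoadingSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            SolrQuery q = new SolrQuery("*:*");
            // With enableLazyFieldLoading=true, fields outside fl stay on
            // disk; the cached document holds only what was requested.
            q.setFields("id");
            server.query(q);
            server.shutdown();
        }
    }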
On 3/18/2014 5:30 AM, Avishai Ish-Shalom wrote:
> My Solr instances are configured with 10GB heap (Xmx) but Linux shows
> a resident size of 16-20GB. Even with thread stack and permgen taken into
> account I'm still far off from these numbers. Could it be that JVM IO
> buffers take so much space? doe
That's a reasonable request and worth a Jira, but different from what you
have specified in your subject line: "re-indexing a single document" - the
entire block needs to be re-indexed.
I suppose people might want a "block atomic update" - where multiple child
documents as well as the parent d
How large is your index on disk? Solr memory maps the index into
memory. Thus the virtual memory used will often be quite large. Your
numbers don't sound inconceivable.
A good reference point is Grant Ingersoll's blog post on searchhub:
http://searchhub.org/2011/09/14/estimating-memory-and-storage
Hi,
My Solr instances are configured with 10GB heap (Xmx) but Linux shows a
resident size of 16-20GB. Even with thread stack and permgen taken into
account I'm still far off from these numbers. Could it be that JVM IO
buffers take so much space? Does Lucene use JNI/JNA memory allocations?
Yes, please. My Wiki ID is Anders Gustafsson
> But yes, please, add the howto to Wiki. You will need to get your
> account whitelisted first (due to spammers), so send a separate email
> with your Apache wiki id and somebody will unlock you for editing.
--
Anders Gustafsson
Engineer, CNI, CNE6,
The metadata fields can be all sorts of strange, including spaces and
other strange characters. So, often, there is some issue with mapping.
But yes, please, add the howto to Wiki. You will need to get your
account whitelisted first (due to spammers), so send a separate email
with your Apache wiki i
Thanks again. I already had the Tika jars, but not the command-line one,
so I downloaded 1.5 and ran it against the docx and found:
So the name is prefixed. Does that mean that I should add it prefixed
in the conf files as well? I.e.:
Yes. Did that and now it works. Guess I should take the time
You can just download Tika from the Apache site; it's a separate product
and has a command-line interface.
Or use the Solr extract handler: go through the Solr tutorial, it explains
it. https://lucene.apache.org/solr/4_7_0/tutorial.html
Specifically, http://wiki.apache.org/solr/ExtractingRequestHandler and
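If it helps, a minimal SolrJ sketch of sending one file to the extract
handler (file name and mappings are illustrative; uprefix is what parks
unmapped Tika metadata in the ignored_* dynamic field):

    import java.io.File;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

    public class ExtractHandlerSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
            ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
            req.addFile(new File("sample.docx"),
                "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
            req.setParam("literal.id", "doc-1");
            req.setParam("uprefix", "ignored_");   // catch-all for unmapped metadata
            req.setParam("fmap.content", "text");  // map the extracted body to "text"
            req.setParam("commit", "true");
            server.request(req);
            server.shutdown();
        }
    }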
Thanks for the quick reply. I am a bit of a newb when it comes to Solr, Lux and
Tika, so I would appreciate it if you could give me some quick pointers on how
to use/call Tika directly and/or how to send one file directly and store the
dynamic field.
--
Anders Gustafsson
Engineer, CNI, CNE6, ASE
Have you tried just using Tika directly and seeing what gets output?
Maybe it is all prefixed somehow. Or sending one file as a sample
directly to the extract handler and temporarily storing the ignored_*
dynamicField to see what actually happens?
Basically, check what is there before trying to fi
solr-spec 4.6.1
lucene-spec 4.6.0
lux-appserver 1.1.0
tika 1.4
poi 3.9
Hi!
I set it up, pretty much following the instructions at
http://www.codewrecks.com/blog/index.php/2013/05/25/import-folder-of-documents-with-apache-solr-4-0-and-tika/
Problem is that I cannot seem to import custom propertie
Thanks Jack,
I understand that updating a single document in a block is currently not
supported.
But atomic updates to a single document do not have to be in conflict
with block joins.
If I got it right from the documentation:
currently, if a document is atomically updated, SOLR finds the stor
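For context, a minimal SolrJ sketch (ids made up) of how a block is indexed
today, which is why any change currently means re-sending the whole tree:

    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class BlockIndexSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

            SolrInputDocument parent = new SolrInputDocument();
            parent.addField("id", "parent-1");
            SolrInputDocument child = new SolrInputDocument();
            child.addField("id", "parent-1-child-1");
            parent.addChildDocument(child);  // parent + children written as one block

            // There is no per-child atomic update: to change anything in the
            // block, rebuild and re-send the entire tree like this.
            server.add(parent);
            server.commit();
            server.shutdown();
        }
    }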
Hello,
we have a SolrCloud 4.7 cluster, but this question also relates to other
versions, because we have tested this in several installations.
We have a very big index (more than 400K docs) with big documents, but
in our queries we don't fetch the large fields in the fl parameter. But we
have s
Hi David,
Thanks for the quick reply.
As I haven't migrated to 4.7 (I am still using 4.6), I tested using an OR
clause with multiple geofilt-based query phrases and it seems to be working
great. But I have one more question: how do I boost the score of the
matching documents based on geodist? How wi