Thanks Arcadius,
Excellent suggestion about the view. I'll try to simplify things and see how
I go.
thanks,
Csaba
OK, thank you all for the precious help :)
On 02/24/2013 04:37 PM, Teun Duynstee wrote:
That would depend on your indexing setup. We have a custom application for
indexing, so we just make a value up. In our case a GUID (UUID). But I
imagine that you could also just copy your id field with a prefix
Hello everybody.
I have downloaded the 4.2-SNAPSHOT version that Mark linked at the JIRA and
our first tests have been OK. Slaves no longer need to replicate the entire
index, and index versions between nodes are the same when the replication
process completes.
This 4.2 version is here:
https://i
Hi
I am really frustrated by this problem.
I have built an index of 1.5 billion data records, with a size of about
170GB. It's been optimised and has 12 separate files in the index directory,
looking like below:
_2.fdt --- 58G
_2.fdx --- 80M
_2.fnm --- 900 bytes
_2.si --- 380 bytes
_2.lucene41_0.
Let's say I have a model in my db like this:
product:n <-> n:package
Product properties are: name, package ids.
Package properties are: price, region, subscription.
If the user requirement is to show all product data and product price (and
to sort by price) for products that matched some user cri
Hi all,
I have an id field which always contains a string with this schema
"vw-200130315-"
Which field type and settings should I use to get exactly this id as a result?
Currently I always get more than one result.
Kind regards
Benjamin
Hello!
If what you need is an exact match, try using the simple string
type.
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch
> Hi all,
> I have an id field which always contains a string with this schema
> "vw-200130315-"
> Which field
Hi Mark,
I downloaded the latest ZooKeeper and ran it.
In my glassfish server, I set these system wide properties:
numShards = 1
zkHost = 10.x.x.x:2181
jetty.port = 8080 (port of my domain)
bootstrap_config = true
I copied all the Solr 4.1 dist/*.jar files into my glassfish domain lib/ext
directory. Th
Hello,
adding my 5 cents here as well: it seems we experienced a similar
problem that was supposed to be fixed, or not to appear at all, on 64-bit
systems. Our current solution is a custom build of Solr with
DEFAULT_READ_CHUNK_SIZE set to 10MB in the FSDirectory class. This fix was
done however not
Hello Puska,
I might not have understood your requirements, but if for a given
user, there's only one package per product that should ever be
retrieved, I'd make the document represent one package/price
combination, and then use a filter query to ensure the user's searches
only retrieve package/pr
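For illustration, a rough SolrJ sketch of that one-document-per-package
layout; the field names (product_name, package_region, package_price), the
ids and the URL are invented:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class PackagePriceSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");

        // One document per product/package combination.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "prod42-pkgEU");
        doc.addField("product_name", "Acme Widget");
        doc.addField("package_region", "EU");
        doc.addField("package_price", 9.99);
        solr.add(doc);
        solr.commit();

        // The filter query restricts results to the user's package, so
        // sorting by package_price behaves like sorting by product price.
        SolrQuery q = new SolrQuery("product_name:widget");
        q.addFilterQuery("package_region:EU");
        q.addSortField("package_price", SolrQuery.ORDER.asc);
        System.out.println(solr.query(q).getResults().getNumFound());
    }
}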
Cool. I tried running from source (using the bundled griffonw), but I think the
instructions may be wrong; I had to download the binary dist.
The file permissions for bin/vifun in the binary dist should have +x so you can
execute it with ./vifun
What about the ability to override the "wt" param, so that y
Have you tried one of the extensions out there, such as
https://code.google.com/p/magento-community-edition-solr/ ?
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com
On 25 Feb 2013, at 14:12, Rohan Thakur wrote:
> hi all
>
> I wanted
Hi,
A customer sends large, deeply nested boolean queries to Solr using the default
(lucene) parser.
The default scoring sums up all the scores. For parts of this query they would
like to use the max score instead of the sum, e.g. for q=+A +B +(C D E) we
want the max of C, D, E. I was think
On Feb 25, 2013, at 5:54 AM, raulgrande83 wrote:
> Mark, is there going to be an official 4.2 release soon?
I've suggested on the dev mailing list that I will create a Lucene/Solr 4.2
release within the next few weeks unless someone beats me to it.
I can't do it this week, I likely can't do it next
Hi,
I have two servers, each server one shard in a collection.
I'd like one server to keep the same shardId for every collection I
create (e.g. shard1 on server1 and shard2 on server2).
I thought this would work by setting -DshardId=shard1 when starting the
server.
But the shardId's shard1 a
Great Fergus,
You have really been working on this since the MeetUp in Oslo! Impressive how
much you can do with little code.
Have you started thinking about UI widget support for query box, breadcrumb
path, facets, paging controls etc.? Are you going to bundle in a particular UI
widget framewor
Hi Michael,
As I see it, there are two ways to do that:
1) store all package data in product documents
2) have separate documents for packages
In reality there are lots of packages for every product.
So if I make a document for every product+package combination, I'll get lots
of documents
Hi Ivan,
Generally the normalization strategies you might use in a
relational database are antipatterns when dealing with Solr, so I
wouldn't hesitate to give this option a try. Solr's very good at
reducing the footprint of a field value duplicated across many
documents down to a simpl
On Feb 25, 2013, at 10:00 AM, "Markus.Mirsberger"
wrote:
> How can I fix the shardId used at one server when I create a collection? (I'm
> using the solrj collections api to create collections)
You can't do it with the collections API currently. If you want to control the
shard names explicit
You have to set group.ngroups=true (see
http://wiki.apache.org/solr/FieldCollapsing). Be aware that including the
number of groups is a surprisingly heavy operation, though.
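For what it's worth, a minimal SolrJ sketch (the grouping field "category" is
made up):

import org.apache.solr.client.solrj.SolrQuery;

public class NgroupsSketch {
    public static void main(String[] args) {
        SolrQuery q = new SolrQuery("*:*");
        q.set("group", true);
        q.set("group.field", "category");  // hypothetical field
        q.set("group.ngroups", true);      // adds the total group count to the response
    }
}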
Teun
2013/2/25 Nicholas Ding
> Hello,
>
> I grouped the result, and set group.main=true. I was expecting the numFound
>
My sense tells me that you're heading down the wrong path of trying to
fit such a large index on one server. Even if you resolve this current
issue, you're not likely to be happy with query performance, as one
thread searching a 1.5B-doc index is going to be slower than 10 threads
searching 10 - 150M
We are attempting to leverage the CurrencyField type. We have defined the
currency field type as:
And defined a field as:
When querying the field with something like:
my_money:[* TO *]
The result is ALL documents (even though only 1 document actually has this
field populated).
When query
On 2/25/2013 4:06 AM, zqzuk wrote:
Hi
I am really frustrated by this problem.
I have built an index of 1.5 billion data records, with a size of about
170GB. It's been optimised and has 12 separate files in the index directory,
looking like below:
_2.fdt --- 58G
_2.fdx --- 80M
_2.fnm --- 900byte
Use group.ngroups; check the Solr wiki page on FieldCollapsing.
Carlos Maroto
Search Architect at Search Technologies (www.searchtechnologies.com)
Nicholas Ding wrote:
Hello,
I grouped the result, and set group.main=true. I was expecting the numFound
to equal the number of groups, but act
Have been working with Solr for about 6 months, straightforward stuff, basic
keyword searches. We want to move to more advanced stuff, to support 'must
include', 'must not include', set union, etc. I.e., more advanced query
strings.
We seem to have hit a block, and are considering two paths and wa
Thanks Teun and Carlos. I set group.ngroups=true, but I don't get the
"ngroups" number when using group.main=true.
On Mon, Feb 25, 2013 at 12:02 PM, Carlos Maroto <
cmar...@searchtechnologies.com> wrote:
> Use group.ngroups, check it in the Solr wiki for FieldCollapsing
>
> Carlos Maroto
Hi, thanks for your advice!
I have deliberately allocated 32G to the JVM, with the command "java -Xmx32000m
-jar start.jar" etc. I am using our server, which I think has a total of 48G.
However it still crashes because of that error when I specify any keywords
in my query. The only query that worked, a
The other issue you need to be worried about is long full GC pauses
with -Xmx32000m.
Maybe try reducing your JVM Heap considerably (e.g. -Xmx8g) and
switching to the MMapDirectory - see:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
In solrconfig.xml, this would be:
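presumably the stock MMapDirectoryFactory line (verify against your Solr
version's example config):

<directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory"/>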
On 2/25/2013 11:05 AM, zqzuk wrote:
I have deliberately allocated 32G to JVM, with the command "java -Xmx32000m
-jar start.jar" etc. I am using our server which I think has a total of 48G.
However it still crashes because of that error when I specify any keywords
in my query. The only query that
Jan, thanks for looking at this!
- Running from source: would you care to send me the error you get (if any)
when running from source? I assume you have griffon 1.1.0 installed, right?
- Binary dist: the distribution is created by griffon, so I'll check if the
permission issue (I develop on Windows, and
Maybe I am not understanding correctly, but have you overlooked the qf
parameter for Edismax?
http://wiki.apache.org/solr/ExtendedDisMax#qf_.28Query_Fields.29
Suppose you want to search for the phrase "apples and bananas" in title,
summary, and body. You also want it to have greater emphasis w
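A minimal SolrJ sketch of that setup (the field names and weights are just
examples):

import org.apache.solr.client.solrj.SolrQuery;

public class QfSketch {
    public static void main(String[] args) {
        SolrQuery q = new SolrQuery("apples and bananas");
        q.set("defType", "edismax");
        // Weight title matches highest, then summary, then body.
        q.set("qf", "title^3 summary^2 body");
    }
}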
Oh, wonderful! Thank you :) I was hacking up some simple Python/R scripts that
can do a similar job for qf... the idea was to let the algorithm create
possible combinations of params and compare them against the baseline.
Would it be possible/easy to instruct the tool to harvest results for
different
Thanks again for your kind input!
I followed Tim's advice and tried to use MMapDirectory. Then I get an
OutOfMemoryError on Solr startup (tried giving only 8G, then 4G to the JVM).
I guess this truly indicates that there isn't sufficient memory for such a
huge index.
On another thread I posted days before, rega
Hello Zqzuk,
It's true that this index is probably too big for a single shard, but
make sure you heed Shawn's advice and use a 64-bit JVM in any case!
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
www.appini
Jan,
I think it's worth starting by extending LuceneQParser. Then, after the
parent's parse() returns a query instance, it can be cast to BooleanQuery;
after that it's possible to check that all clauses have SHOULD occur, and
to create a DisjunctionMaxQuery from the given clauses.
Am I
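A rough sketch of that clause juggling against the Lucene 4.x API (not a
drop-in QParser, just the rewrite step):

import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.DisjunctionMaxQuery;
import org.apache.lucene.search.Query;

public class MaxOfShouldClauses {
    // If every clause of the parsed BooleanQuery is SHOULD, replace the
    // sum-of-scores BooleanQuery with a max-of-scores DisjunctionMaxQuery.
    static Query rewrite(Query parsed) {
        if (!(parsed instanceof BooleanQuery)) return parsed;
        BooleanQuery bq = (BooleanQuery) parsed;
        DisjunctionMaxQuery dmq = new DisjunctionMaxQuery(0.0f); // no tie-break
        for (BooleanClause clause : bq.clauses()) {
            if (clause.getOccur() != Occur.SHOULD) return parsed; // leave MUST/MUST_NOT alone
            dmq.add(clause.getQuery());
        }
        return dmq;
    }
}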
Mark,
AFAIK
http://lucene.apache.org/core/4_0_0-ALPHA/queryparser/org/apache/lucene/queryparser/flexible/core/package-summary.html
is a convenient framework for such juggling.
Please also be aware of the good starting point
http://lucene.apache.org/core/4_0_0-ALPHA/queryparser/org/apache/lucene/que
Hi Roman,
I read with interest your thread about relevance testing a couple of weeks
ago and yes, I noticed it was related somehow. But what you were proposing
there is a different approach I think.
In my tool, you have some baseline setting (it might be good or bad), and
using a single query, yo
Do you have the stack trace for the OOM during startup when using
MMapDirectory? That would be interesting to know.
Cheers,
Tim
On Mon, Feb 25, 2013 at 1:15 PM, zqzuk wrote:
> Hi Michael
>
> Yes, I have double checked and am pretty sure it's 64-bit Java. Thanks
Hi,
I actually tried ../griffonw run-app but it says "griffon-app does not appear
to be part of a Griffon application."
I installed griffon and tried again with "griffon run-app" inside griffon-app,
but got the same error.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr T
Ah, I see. The docs say "Although this result format does not have as much
information, it may be easier for existing solr clients to parse". I guess
the ngroups value could be added to this format, but apparently it isn't. I
do agree with you that to be useful (as in possible to read for a client
Hello all, I am from the Toulouse JUG in France. I'm looking for
speakers to talk about Solr at our JUG. Anybody?
Talks in French or English are welcome.
thx
Alexis
Bite the bullet and use a function query for the boost:
&bf=max(query({!v='field:C'}),query({!v='field:D'}),query({!v='field:E'}))
-- Jack Krupansky
-Original Message-
From: Jan Høydahl
Sent: Monday, February 25, 2013 6:32 AM
To: solr-user@lucene.apache.org
Subject: Max Score Query pa
Yeah I had a similar problem. I filed and submitted this patch:
https://issues.apache.org/jira/browse/SOLR-4310
Let me know if this is what you are looking for!
Amit
On Mon, Feb 25, 2013 at 1:50 PM, Teun Duynstee wrote:
> Ah, I see. The docs say "Although this result format does not have as mu
This is cool! I had done something similar except changing via JConsole/JMX:
https://issues.apache.org/jira/browse/SOLR-2306
We had something not as nice at Zvents, but I wanted to expose these as
MBean properties so you could change them via any JMX UI like JVisualVM.
Cheers!
Amit
On Mon, Feb 25
: my_money:[* TO *]
:
: The result is ALL documents (even though only 1 document actually has this
: field populated.
...
: +my_money:[* TO *] -my_money:0
:
: We get the single document back.
Hmmm, I can reproduce, and that definitely doesn't make any sense to me.
There are some open i
Some comments on this topic:
1. compositeId with numShards set
1.1 unique id (a hash algorithm picks the shard)
1.2 in particular, ids that share the same prefix before "!" will be routed
to the same shard (see the sketch below)
2. numShards not set
2.1 the user uses "_field_" (schema.xml) to set where to sink d
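A small SolrJ illustration of 1.2 (the ids are made up):

import org.apache.solr.common.SolrInputDocument;

public class CompositeIdSketch {
    public static void main(String[] args) {
        // With the compositeId router (numShards set), everything before "!"
        // is hashed to pick the shard, so these two docs land on the same shard.
        SolrInputDocument a = new SolrInputDocument();
        a.addField("id", "customer1!order-1001");
        SolrInputDocument b = new SolrInputDocument();
        b.addField("id", "customer1!order-1002");
    }
}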
Hello Solr Users,
I just wrote up a piece about some work I did recently to improve the
throughput of distributed search.
http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html
The short of it is that the stale check in Apache's HTTP Client used by
SolrJ can add a lot of l
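For anyone who wants to try this from the client side, a hedged sketch
(HttpClient 4.x and SolrJ 4.x; this is not the patch itself):

import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.HttpConnectionParams;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class NoStaleCheckSketch {
    public static void main(String[] args) {
        // Disable the per-request stale-connection check on the HttpClient
        // SolrJ uses, so pooled connections are reused without the extra
        // read attempt before each request.
        DefaultHttpClient http = new DefaultHttpClient();
        HttpConnectionParams.setStaleCheckingEnabled(http.getParams(), false);
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr", http);
    }
}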
I don't have anything to add besides saying "this is awesome". Great analysis.
-Michael
On Feb 25, 2013, at 8:14 PM, Ryan Zezeski wrote:
> I would like to see a
> similar fix made upstream and that is why I am posting here.
Please file a JIRA issue and attach your patch. Great write up! (Saw it pop up
on twitter, so I read it a little earlier).
- Mark
> On my particular benchmark rig, each stale check call accounted for an
> additional ~10ms.
That's insane!
It's still not even clear to me how the stale check works (reliably).
Couldn't the server still close the connection between the stale check
and the send of data by the client?
-Yonik
Hi Dejan,
I wouldn't say your problem is because the words are non-English, as there
is nothing in Solr to indicate whether the terms are in English or not. I think
it is a configuration issue in your implementation for the current data set or
test. I would start by trying the following:
-
SolrCloud reads Solr config files from ZooKeeper.
You need to push the config to ZooKeeper and link the collection to the config.
This is exactly what Mark suggested earlier in the thread. It is also
explained in the SolrCloud wiki.
On Monday, February 25, 2013, Darren Govoni wrote:
> Hi Mark,
>
>I download la
Try changing splitOnCaseChange="1" to splitOnCaseChange="0", and fully
reindex your data. One possibility is that you may have indexed Marcos and
Dejan before adding the lower case filter, which would cause the query to be
lower case even though the indexed data might not be lower case.
-- Jac
OK. But it's way more complicated than it should be. It should work smarter.
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: Anirudha Jadhav
Date:
To: solr-user@lucene.apache.org
Subject: Re: zk Config URL?
Solr cloud reads solr cfg files from zooke
On Mon, Feb 25, 2013 at 8:42 PM, Yonik Seeley wrote:
>
>
> That's insane!
>
It is insane. Keep in mind this was a 5-node cluster on the
same physical machine sharing the same resources. It consisted of 5 SmartOS
zones on the same global zone. On my MacBook Pro I saw ~1.5ms per stale
check bu
On Thu, Feb 21, 2013 at 1:19 PM, Upayavira wrote:
> A splitter that uses the same split technique but uses the shard
> assignment algorithm from SolrCloud could be a useful thing.
There is some ongoing work on shard splitting, and I assume a
splitter like this is part of that.
--
- Mark
"Do you use replication instead, or do you just have one instance?"
On 02/25/2013 07:55 PM, Otis Gospodnetic wrote:
Hi,
Quick poll to see what % of Solr users use SolrCloud vs. Master-slave setup:
http://blog.sematext.com/2013/02/25/poll-solr-cloud-or-not/
I have to say I'm surprised with the
Upayavira, did you ever do this?
Ha, look at my email from 20 days ago and this:
https://github.com/javanna/elasticshell
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Wed, Feb 6, 2013 at 2:38 PM, Otis Gospodnetic wrote:
> Btw wouldn't this be a chance to create a solr cli tool, muc
I am running Solr 4.0/Tomcat 7 on CentOS 6.
According to this page http://wiki.apache.org/solr/SolrConfigXml, if <dataDir>
is not absolute, then it is relative to the instanceDir of the
SolrCore.
However the index directory is always created under the directory where I
start Tomcat (startup.sh) rather tha
thanks
On Thu, Feb 21, 2013 at 9:41 PM, Jack Krupansky wrote:
> Yes, each spellchecker (or "dictionary") in your spellcheck search
> component has a "field" parameter to specify the field to be used to
> generate the dictionary index for that spellchecker:
>
> spell
>
> See the Solr example solrc
Interestingly, there is no such bug if I disable index compression, as
discussed here:
https://issues.apache.org/jira/browse/SOLR-4375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566364#comment-13566364
I cannot answer "yes" to any of those options.
Master/slave and cloud have different strengths and weaknesses. We will use
each one where it is appropriate.
The loose coupling in master/slave is a very good thing and increases
robustness for a corpus that does not have tight freshness requireme