Thanks Arcadius,
Excellent suggestion about the view. I'll try to simplify things and see how
I go.
thanks,
Csaba
OK, thank you all for the precious help :)
On 02/24/2013 04:37 PM, Teun Duynstee wrote:
That would depend on your indexing setup. We have a custom application for
indexing, so we just make a value up. In our case a GUID (UUID). But I
imagine that you could also just copy your id field with a prefix
Hello everybody.
I have downloaded the 4.2-SNAPSHOT version that Mark linked at the JIRA and
our first tests have been OK. Slaves no longer need to replicate the entire
index, and index versions between nodes are the same when the replication
process completes.
This 4.2 version is here:
https://i
Hi
I am really frustrated by this problem.
I have built an index of 1.5 billion data records, with a size of about
170GB. It's been optimised and has 12 separate files in the index directory,
looking like below:
_2.fdt --- 58G
_2.fdx --- 80M
_2.fnm --- 900 bytes
_2.si --- 380 bytes
_2.lucene41_0.
Let's say I have a model in my db like this:
product:n <-> n:package
Product properties are: name, package ids.
Package properties are: price, region, subscription.
If the user requirement is to show all product data and product price (and
to sort by price) for products that matched some user cri
Hi all,
I have an id field which always contains a string with this schema
"vw-200130315-"
Which field type and settings should I use to get exactly this id as a result?
Currently I always get more than one result.
Kind regards
Benjamin
Hello!
If what you need is an exact match, try using the simple string
type.
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch
> Hi all,
> I have an id field which always contains a string with this schema
> "vw-200130315-"
> Which field
Hi Mark,
I downloaded the latest ZooKeeper and ran it.
In my glassfish server, I set these system wide properties:
numShards = 1
zkHost = 10.x.x.x:2181
jetty.port = 8080 (port of my domain)
bootstrap_config = true
I copied all the Solr 4.1 dist/*.jar files into my glassfish domain lib/ext
directory. Th
Hello,
adding my 5 cents here as well: it seems we experienced a similar
problem that was supposed to be fixed, or not to appear at all, on 64-bit
systems. Our current solution is a custom build of Solr with
DEFAULT_READ_CHUNK_SIZE set to 10MB in the FSDirectory class. This fix was
done however not
Hello Puska,
I might not have understood your requirements, but if for a given
user, there's only one package per product that should ever be
retrieved, I'd make the document represent one package/price
combination, and then use a filter query to ensure the user's searches
only retrieve package/pr
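For illustration, a rough SolrJ sketch of that one-document-per-package
layout; the field names (product_name, package_region, package_price), the
ids and the URL are invented:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class PackagePriceSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr");

        // One document per product/package combination.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "prod42-pkgEU");
        doc.addField("product_name", "Acme Widget");
        doc.addField("package_region", "EU");
        doc.addField("package_price", 9.99);
        solr.add(doc);
        solr.commit();

        // The filter query restricts results to the user's package, so
        // sorting by package_price behaves like sorting by product price.
        SolrQuery q = new SolrQuery("product_name:widget");
        q.addFilterQuery("package_region:EU");
        q.addSortField("package_price", SolrQuery.ORDER.asc);
        System.out.println(solr.query(q).getResults().getNumFound());
    }
}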
Cool. I tried running from source (using the bundled griffonw), but I think the
instructions may be wrong; I had to download the binary dist.
The file permissions for bin/vifun in the binary dist should have +x so you can
execute it with ./vifun
What about the ability to override the "wt" param, so that y
Have you tried one of the extensions out there, such as
https://code.google.com/p/magento-community-edition-solr/ ?
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com
On 25 Feb 2013, at 14:12, Rohan Thakur wrote:
> hi all
>
> I wanted
Hi,
A customer sends large, deeply nested boolean queries to Solr using the default
(lucene) parser.
The default scoring sums up all the scores. For parts of this query they would
like to use the max score instead of the sum, e.g. for q=+A +B +(C D E) we
want the max of C, D, E. I was think
On Feb 25, 2013, at 5:54 AM, raulgrande83 wrote:
> Mark, is there going to be an official 4.2 release soon?
I've suggested on the dev mailing list that I will create a Lucene/Solr 4.2
release within the next few weeks unless someone beats me to it.
I can't do it this week, I likely can't do it next
Hi,
I have two servers, each server one shard in a collection.
I'd like one server to keep the same shardId for every collection I
create (e.g. shard1 on server1 and shard2 on server2).
I thought this would work by setting -DshardId=shard1 when starting the
server.
But the shardId's shard1 a
Great Fergus,
You have really been working on this since the MeetUp in Oslo! Impressive how
much you can do with little code.
Have you started thinking about UI widget support for query box, breadcrumb
path, facets, paging controls etc.? Are you going to bundle in a particular UI
widget framewor
Hi Michael,
As I see it, there are two ways to do that:
1) store all package data in product documents
2) have separate documents for packages
In reality there are lots of packages for every product.
So if I make a document for every product+package combination, I'll get lots
of documents
Hi Ivan,
Generally the normalization strategies you might use in a
relational database are antipatterns when dealing with Solr, so I
wouldn't hesitate to give this option a try. Solr's very good at
reducing the footprint of a field value duplicated across many
documents down to a simpl
On Feb 25, 2013, at 10:00 AM, "Markus.Mirsberger"
wrote:
> How can I fix the shardId used at one server when I create a collection? (I'm
> using the solrj collections api to create collections)
You can't do it with the collections API currently. If you want to control the
shard names explicit
You have to set group.ngroups=true (see
http://wiki.apache.org/solr/FieldCollapsing). Be aware that including the
number of groups is a surprisingly heavy operation, though.
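For what it's worth, a minimal SolrJ sketch (the grouping field "category" is
made up):

import org.apache.solr.client.solrj.SolrQuery;

public class NgroupsSketch {
    public static void main(String[] args) {
        SolrQuery q = new SolrQuery("*:*");
        q.set("group", true);
        q.set("group.field", "category");  // hypothetical field
        q.set("group.ngroups", true);      // adds the total group count to the response
    }
}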
Teun
2013/2/25 Nicholas Ding
> Hello,
>
> I grouped the result, and set group.main=true. I was expecting the numFound
>
My sense tells me that you're heading down the wrong path of trying to
fit such a large index on one server. Even if you resolve this current
issue, you're not likely to be happy with query performance, as one
thread searching a 1.5B-doc index is going to be slower than 10 threads
searching 10 - 150M
We are attempting to leverage the CurrencyField type. We have defined the
currency field type as:
And defined a field as:
When querying the field with something like:
my_money:[* TO *]
The result is ALL documents (even though only 1 document actually has this
field populated).
When query
On 2/25/2013 4:06 AM, zqzuk wrote:
Hi
I am really frustrated by this problem.
I have built an index of 1.5 billion data records, with a size of about
170GB. It's been optimised and has 12 separate files in the index directory,
looking like below:
_2.fdt --- 58G
_2.fdx --- 80M
_2.fnm --- 900byte
Use group.ngroups; check the Solr wiki page on FieldCollapsing.
Carlos Maroto
Search Architect at Search Technologies (www.searchtechnologies.com)
Nicholas Ding wrote:
Hello,
I grouped the result, and set group.main=true. I was expecting the numFound
to equal the number of groups, but act
Have been working with Solr for about 6 months, straightforward stuff, basic
keyword searches. We want to move to more advanced stuff, to support 'must
include', 'must not include', set union, etc. I.e., more advanced query
strings.
We seem to have hit a block, and are considering two paths and wa
Thanks Teun and Carlos. I set group.ngroups=true, but I don't get the
"ngroups" number when using group.main=true.
On Mon, Feb 25, 2013 at 12:02 PM, Carlos Maroto <
cmar...@searchtechnologies.com> wrote:
> Use group.ngroups, check it in the Solr wiki for FieldCollapsing
>
> Carlos Maroto
Hi, thanks for your advice!
I have deliberately allocated 32G to the JVM, with the command "java -Xmx32000m
-jar start.jar" etc. I am using our server, which I think has a total of 48G.
However it still crashes because of that error when I specify any keywords
in my query. The only query that worked, a
The other issue you need to be worried about is long full GC pauses
with -Xmx32000m.
Maybe try reducing your JVM Heap considerably (e.g. -Xmx8g) and
switching to the MMapDirectory - see:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
In solrconfig.xml, this would be:
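presumably the stock MMapDirectoryFactory line (verify against your Solr
version's example config):

<directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory"/>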
On 2/25/2013 11:05 AM, zqzuk wrote:
I have deliberately allocated 32G to JVM, with the command "java -Xmx32000m
-jar start.jar" etc. I am using our server which I think has a total of 48G.
However it still crashes because of that error when I specify any keywords
in my query. The only query that
Jan, thanks for looking at this!
- Running from source: would you care to send me the error you get (if any)
when running from source? I assume you have griffon 1.1.0 installed, right?
- Binary dist: the distribution is created by griffon, so I'll check if the
permission issue (I develop on Windows, and
Maybe I am not understanding correctly, but have you overlooked the qf
parameter for Edismax?
http://wiki.apache.org/solr/ExtendedDisMax#qf_.28Query_Fields.29
Suppose you want to search for the phrase "apples and bananas" in title,
summary, and body. You also want it to have greater emphasis w
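A minimal SolrJ sketch of that setup (the field names and weights are just
examples):

import org.apache.solr.client.solrj.SolrQuery;

public class QfSketch {
    public static void main(String[] args) {
        SolrQuery q = new SolrQuery("apples and bananas");
        q.set("defType", "edismax");
        // Weight title matches highest, then summary, then body.
        q.set("qf", "title^3 summary^2 body");
    }
}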
Oh, wonderful! Thank you :) I was hacking up some simple Python/R scripts that
can do a similar job for qf... the idea was to let the algorithm create
possible combinations of params and compare them against the baseline.
Would it be possible/easy to instruct the tool to harvest results for
different
Thanks again for your kind input!
I followed Tim's advice and tried to use MMapDirectory. Then I get an
OutOfMemoryError on Solr startup (tried giving only 8G, then 4G to the JVM).
I guess this truly indicates that there isn't sufficient memory for such a
huge index.
On another thread I posted days before, rega
Hello Zqzuk,
It's true that this index is probably too big for a single shard, but
make sure you heed Shawn's advice and use a 64-bit JVM in any case!
Michael Della Bitta
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
www.appini
Jan,
I think it's worth starting by extending LuceneQParser. Then, after the
parent's parse() returns a query instance, it can be cast to BooleanQuery;
after that it's possible to check that all clauses have SHOULD occur, and
to create a DisjunctionMaxQuery from the given clauses.
Am I
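A rough sketch of that clause juggling against the Lucene 4.x API (not a
drop-in QParser, just the rewrite step):

import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.DisjunctionMaxQuery;
import org.apache.lucene.search.Query;

public class MaxOfShouldClauses {
    // If every clause of the parsed BooleanQuery is SHOULD, replace the
    // sum-of-scores BooleanQuery with a max-of-scores DisjunctionMaxQuery.
    static Query rewrite(Query parsed) {
        if (!(parsed instanceof BooleanQuery)) return parsed;
        BooleanQuery bq = (BooleanQuery) parsed;
        DisjunctionMaxQuery dmq = new DisjunctionMaxQuery(0.0f); // no tie-break
        for (BooleanClause clause : bq.clauses()) {
            if (clause.getOccur() != Occur.SHOULD) return parsed; // leave MUST/MUST_NOT alone
            dmq.add(clause.getQuery());
        }
        return dmq;
    }
}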
Mark,
AFAIK
http://lucene.apache.org/core/4_0_0-ALPHA/queryparser/org/apache/lucene/queryparser/flexible/core/package-summary.html
is a convenient framework for such juggling.
Please also be aware of the good starting point
http://lucene.apache.org/core/4_0_0-ALPHA/queryparser/org/apache/lucene/que
Hi Roman,
I read with interest your thread about relevance testing a couple of weeks
ago and yes, I noticed it was related somehow. But what you were proposing
there is a different approach I think.
In my tool, you have some baseline setting (it might be good or bad), and
using a single query, yo
Do you have the stack trace for the OOM during startup when using
MMapDirectory? That would be interesting to know.
Cheers,
Tim
On Mon, Feb 25, 2013 at 1:15 PM, zqzuk wrote:
> Hi Michael
>
> Yes, I have double checked and am pretty sure it's 64-bit Java. Thanks
Hi,
I actually tried ../griffonw run-app but it says "griffon-app does not appear
to be part of a Griffon application."
I installed griffon and tried again with "griffon run-app" inside griffon-app,
but got the same error.
--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr T
Ah, I see. The docs say "Although this result format does not have as much
information, it may be easier for existing solr clients to parse". I guess
the ngroups value could be added to this format, but apparently it isn't. I
do agree with you that to be useful (as in possible to read for a client
Hello all, I am from the Toulouse JUG in France. I'm looking for
speakers to talk about Solr at our JUG. Anybody?
Talks in French or English are welcome.
thx
Alexis
Bite the bullet and use a function query for the boost:
&bf=max(query({!v='field:C'}),query({!v='field:D'}),query({!v='field:E'}))
-- Jack Krupansky
-Original Message-
From: Jan Høydahl
Sent: Monday, February 25, 2013 6:32 AM
To: solr-user@lucene.apache.org
Subject: Max Score Query pa
Yeah I had a similar problem. I filed and submitted this patch:
https://issues.apache.org/jira/browse/SOLR-4310
Let me know if this is what you are looking for!
Amit
On Mon, Feb 25, 2013 at 1:50 PM, Teun Duynstee wrote:
> Ah, I see. The docs say "Although this result format does not have as mu
This is cool! I had done something similar except changing via JConsole/JMX:
https://issues.apache.org/jira/browse/SOLR-2306
We had something not as nice at Zvents, but I wanted to expose these as
MBean properties so you could change them via any JMX UI like JVisualVM.
Cheers!
Amit
On Mon, Feb 25
: my_money:[* TO *]
:
: The result is ALL documents (even though only 1 document actually has this
: field populated.
...
: +my_money:[* TO *] -my_money:0
:
: We get the single document back.
Hmmm, I can reproduce, and that definitely doesn't make any sense to me.
There are some open i
Some comments on this topic:
1. compositeId with numShards set
1.1 unique id (a hash algorithm picks the shard)
1.2 in particular, ids that share the same prefix before "!" will be routed
to the same shard (see the sketch below)
2. numShards not set
2.1 the user uses "_field_" (schema.xml) to set where to sink d
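A small SolrJ illustration of 1.2 (the ids are made up):

import org.apache.solr.common.SolrInputDocument;

public class CompositeIdSketch {
    public static void main(String[] args) {
        // With the compositeId router (numShards set), everything before "!"
        // is hashed to pick the shard, so these two docs land on the same shard.
        SolrInputDocument a = new SolrInputDocument();
        a.addField("id", "customer1!order-1001");
        SolrInputDocument b = new SolrInputDocument();
        b.addField("id", "customer1!order-1002");
    }
}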
Hello Solr Users,
I just wrote up a piece about some work I did recently to improve the
throughput of distributed search.
http://www.zinascii.com/2013/solr-distributed-search-and-the-stale-check.html
The short of it is that the stale check in Apache's HTTP Client used by
SolrJ can add a lot of l
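For anyone who wants to try this from the client side, a hedged sketch
(HttpClient 4.x and SolrJ 4.x; this is not the patch itself):

import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.HttpConnectionParams;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class NoStaleCheckSketch {
    public static void main(String[] args) {
        // Disable the per-request stale-connection check on the HttpClient
        // SolrJ uses, so pooled connections are reused without the extra
        // read attempt before each request.
        DefaultHttpClient http = new DefaultHttpClient();
        HttpConnectionParams.setStaleCheckingEnabled(http.getParams(), false);
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr", http);
    }
}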
I don't have anything to add besides saying "this is awesome". Great analysis.
-Michael
On Feb 25, 2013, at 8:14 PM, Ryan Zezeski wrote:
> I would like to see a
> similar fix made upstream and that is why I am posting here.
Please file a JIRA issue and attach your patch. Great write up! (Saw it pop up
on twitter, so I read it a little earlier).
- Mark
> On my particular benchmark rig, each stale check call accounted for an
> additional ~10ms.
That's insane!
It's still not even clear to me how the stale check works (reliably).
Couldn't the server still close the connection between the stale check
and the send of data by the client?
-Yonik
Hi Dejan,
I wouldn't say your problem is because the words are non-English, as there
is nothing in Solr to indicate whether the terms are in English or not. I think
it is a configuration issue in your implementation for the current data set or
test. I would start by trying the following:
-
SolrCloud reads Solr config files from ZooKeeper.
You need to push the config to ZooKeeper and link the collection to the config.
This is exactly what Mark suggested earlier in the thread. It is also
explained in the SolrCloud wiki.
On Monday, February 25, 2013, Darren Govoni wrote:
> Hi Mark,
>
>I download la
Try changing splitOnCaseChange="1" to splitOnCaseChange="0", and fully
reindex your data. One possibility is that you may have indexed Marcos and
Dejan before adding the lower case filter, which would cause the query to be
lower case even though the indexed data might not be lower case.
-- Jac
OK. But it's way more complicated than it should be. It should work smarter.
Sent from my Verizon Wireless 4G LTE Smartphone
Original message
From: Anirudha Jadhav
Date:
To: solr-user@lucene.apache.org
Subject: Re: zk Config URL?
Solr cloud reads solr cfg files from zooke
On Mon, Feb 25, 2013 at 8:42 PM, Yonik Seeley wrote:
>
>
> That's insane!
>
It is insane. Keep in mind this was a 5-node cluster on the
same physical machine sharing the same resources. It consisted of 5 SmartOS
zones on the same global zone. On my MacBook Pro I saw ~1.5ms per stale
check bu
On Thu, Feb 21, 2013 at 1:19 PM, Upayavira wrote:
> A splitter that uses the same split technique but uses the shard
> assignment algorithm from SolrCloud could be a useful thing.
There is some ongoing work on shard splitting, and I assume a
splitter like this is part of that.
--
- Mark
"Do you use replication instead, or do you just have one instance?"
On 02/25/2013 07:55 PM, Otis Gospodnetic wrote:
Hi,
Quick poll to see what % of Solr users use SolrCloud vs. Master-slave setup:
http://blog.sematext.com/2013/02/25/poll-solr-cloud-or-not/
I have to say I'm surprised with the
Upayavira, did you ever do this?
Ha, look at my email from 20 days ago and this:
https://github.com/javanna/elasticshell
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Wed, Feb 6, 2013 at 2:38 PM, Otis Gospodnetic wrote:
> Btw wouldn't this be a chance to create a solr cli tool, muc
I am running Solr 4.0/Tomcat 7 on CentOS 6.
According to this page http://wiki.apache.org/solr/SolrConfigXml, if <dataDir>
is not absolute, then it is relative to the instanceDir of the
SolrCore.
However the index directory is always created under the directory where I
start Tomcat (startup.sh) rather tha
thanks
On Thu, Feb 21, 2013 at 9:41 PM, Jack Krupansky wrote:
> Yes, each spellchecker (or "dictionary") in your spellcheck search
> component has a "field" parameter to specify the field to be used to
> generate the dictionary index for that spellchecker:
>
> spell
>
> See the Solr example solrc
Interestingly, there is no such bug if I disable index compression, as
discussed here:
https://issues.apache.org/jira/browse/SOLR-4375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566364#comment-13566364
I cannot answer "yes" to any of those options.
Master/slave and cloud have different strengths and weaknesses. We will use
each one where it is appropriate.
The loose coupling in master/slave is a very good thing and increases
robustness for a corpus that does not have tight freshness requireme