tly text fields that cannot have DocValues
>
> -Original Message-
> From: Webster Homer
> Sent: Thursday, July 23, 2020 2:07 PM
> To: solr-user@lucene.apache.org
> Subject: RE: How to measure search performance
>
> Hi Erick,
>
> This is an example of a pseudo
as 10%
Thank you for your quick response.
Webster
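One low-tech way to make before/after comparisons like this meaningful, sketched below with made-up numbers: collect QTime samples for each variant (e.g. from the Solr logs) and compare percentiles rather than single runs, since any single query is dominated by cache state.

```python
# Sketch: compare two sets of measured QTime values (ms) by percentile.
# The sample numbers below are invented for illustration.
def percentile(samples, pct):
    """Nearest-rank percentile of a list of latencies."""
    ordered = sorted(samples)
    rank = max(1, int(round(pct / 100.0 * len(ordered))))
    return ordered[rank - 1]

def summarize(samples):
    return {p: percentile(samples, p) for p in (50, 90, 99)}

baseline = [12, 15, 11, 14, 90, 13, 12, 16, 14, 13]   # without the change
candidate = [13, 17, 12, 15, 95, 14, 13, 18, 15, 14]  # with the change
base_stats, cand_stats = summarize(baseline), summarize(candidate)
```

Comparing p50/p90/p99 side by side shows whether the change costs a roughly constant overhead or only hurts the tail.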
-Original Message-
From: Erick Erickson
Sent: Thursday, July 23, 2020 12:52 PM
To: solr-user@lucene.apache.org
Subject: Re: How to measure search performance
This isn’t usually a cause for concern. Clearing the caches doesn’t necessarily
clear the OS caches for instance. I think you’re already aware that Lucene uses
MMapDirectory, meaning the index pages are mapped to OS memory space. Whether
those pages are actually _in_ the OS physical memory or no
I'm trying to determine the overhead of adding some pseudo fields to one of our
standard searches. The pseudo fields are simply function queries to report if
certain fields matched the query or not. I had thought that I could run the
search without the change and then re-run the searches with th
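A sketch of one way to build such pseudo fields; the helper and field names are hypothetical, but the `exists(query({!df=... v=...}))` form is standard Solr function-query syntax.

```python
# Sketch: extend a Solr request's fl with pseudo fields that report whether
# each listed field matched the user's query. Field names are hypothetical.
def add_match_flags(params, user_query, fields):
    out = dict(params)
    flags = [
        "matched_{0}:exists(query({{!df={0} v='{1}'}}))".format(f, user_query)
        for f in fields
    ]
    out["fl"] = ",".join([out.get("fl", "*")] + flags)
    return out

p = add_match_flags({"q": "acetone", "fl": "id"}, "acetone", ["name", "synonyms"])
# p["fl"] now asks Solr to return matched_name and matched_synonyms booleans
```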
Hi Erick,
thanks for the reply.
Just to follow up, I'm using the "unified" highlighter (fastVector does not
work for my purposes). I search and highlight on a multivalued string
field which contains small strings (usually less than 200 chars).
This multivalued field is subject to various process
I suspect this is spurious. Norms are just an encoding
of the length of a field, offhand I have no clue how having
them (or not) would affect highlighting at all.
Term _vectors_ OTOH could have a major impact. If
FastVectorHighlighter is not used, the highlighter has
to re-analyze the text in ord
I'm using solr-8.3.1 on a solrcloud set up with 2 solr nodes and 2 ZK nodes.
I was experiencing very slow search-with-highlighting on a index that had
'omitNorms="true"' on all fields.
At the suggestion of a stackoverflow post, I changed all fields to be
'omitNorms="false"' and the search-with-high
Thanks Shawn, Alessandro for your feedback.
Sorry if I took this for granted; I was just trying to understand if there
could be a performance gain when *real* queries happen (against other
fields too).
So a smaller collection, a smaller document space, a smaller query in terms
of number of filter
Seconding Shawn, if your queries will always target the active documents, you
will see:
High level, this is what is going to happen:
A) You need to run your query + a filter query that will retrieve only
active documents.
The filter query results will be cached.
Solr will query over the entire docume
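A toy model of step (A), purely illustrative (the class and names are not Solr's): the filter's document set is computed once, cached under the fq string, and later queries just intersect the cached set. Solr's real filterCache lives per searcher, so it is rebuilt after each commit.

```python
# Toy filter cache: first use of an fq evaluates it, later uses hit the cache.
class FilterCache:
    def __init__(self, evaluate):
        self._evaluate = evaluate   # fq string -> set of matching doc ids
        self._cache = {}
        self.misses = 0

    def docset(self, fq):
        if fq not in self._cache:
            self.misses += 1        # only the first query pays this cost
            self._cache[fq] = frozenset(self._evaluate(fq))
        return self._cache[fq]

index = {"active:true": {1, 2, 5}, "active:false": {3, 4}}
cache = FilterCache(lambda fq: index[fq])
cache.docset("active:true")                      # first query: evaluates the fq
hits = cache.docset("active:true") & {1, 3, 5}   # later queries: cached docset
```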
On 11/4/2016 8:22 AM, Vincenzo D'Amore wrote:
> Given 2 collection A and B:
>
> - Collection A has 5M documents with an attribute active: true/false.
> - Collection B has only 2.5M documents, but all the documents have
> attribute active:true
> - in any case, A or B, I can only search upon docu
Hi all,
it's trivia time :) hope you enjoy the question.
Given 2 collection A and B:
- Collection A has 5M documents with an attribute active: true/false.
- Collection B has only 2.5M documents, but all the documents have
attribute active:true
- in any case, A or B, I can only search upon do
ped.
> Restarting solr won't reset it.
>
>
> On unix, you may reset this cache with
> echo 3 > /proc/sys/vm/drop_caches
>
>
> Franck Brisbart
>
>
> On Wednesday, 13 November 2013 at 11:58 +, Jacky.J.Wang
> (mis.cnsh04.Newegg) 41361 wrote:
Dear Lucene,
In order to test Solr search performance, I disabled all Solr caches,
inserted 10 million documents, and found the first search very
slow (700ms) and the second search very quick (20ms). I am sure there is no Solr cache involved.
This problem bothering
Any guesses would be wild ones, but I'm pretty sure you'll notice it,
assuming the result size isn't trivially small. Also, LatLonType will use
much less memory and be more real-time search friendly (i.e. Commit
warming will be faster, assuming you do warming queries as everyone should
do).
To be
Thank you, David.
I believe the field doesn't need to be multivalued.
Can you give me some idea how much query-time performance gain
we can expect by switching to LatLonType from Solr-2155?
On 11/06/2013 09:56 AM, Smiley, David W. wrote:
Hi Kuro,
I don't know of any benchmarks featuring distanc
Hi Kuro,
I don't know of any benchmarks featuring distance-sort performance.
Presumably you are using SOLR-2155 because you have multi-valued spatial
fields? If so, LatLonType is not an option. SOLR-2155 sorting
performance is *probably* about the same as the equivalent in Solr 4 RPT.
If you ac
Are there any performance comparison results available comparing various
methods
to sort result by distance (not just filtering) on Solr 3 and 4?
We are using Solr 3.5 with Solr-2155 patch. I am particularly interested
in learning
performance difference among Solr 3 LatLongType, Solr-2155 GeoH
better.
From: Toke Eskildsen [t...@statsbiblioteket.dk]
Sent: Wednesday, August 07, 2013 7:45 AM
To: solr-user@lucene.apache.org
Subject: Re: poor facet search performance
On Tue, 2013-07-30 at 21:48 +0200, Robert Stewart wrote:
[Custom facet structure
We have a lot of cores on our servers so
it works well.
From: Toke Eskildsen [t...@statsbiblioteket.dk]
Sent: Wednesday, August 07, 2013 7:45 AM
To: solr-user@lucene.apache.org
Subject: Re: poor facet search performance
On Tue, 2013-07-30 at 21:48 +0200, R
ith the facet structures? Just
grep for "UnInverted". They look something like this:
UnInverted multi-valued field
{field=lma_long,memSize=42711405,tindexSize=42,time=979,phase1=964,nTerms=23,bigTerms=6,termInstances=1958468,uses=0}
> Given these metrics (200m docs, 20 facet fields, s
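The UnInverted structure in that log line is essentially a per-searcher doc-to-terms table; once built, facet counting is one pass over the matching docs. A toy sketch (the data is invented; only the field name echoes the log line):

```python
# Toy "uninverted" faceting: build doc -> values once, count per query.
def uninvert(docs, field):
    return {doc_id: doc.get(field, []) for doc_id, doc in docs.items()}

def facet_counts(uninverted, matching_doc_ids):
    counts = {}
    for doc_id in matching_doc_ids:
        for value in uninverted[doc_id]:
            counts[value] = counts.get(value, 0) + 1
    return counts

docs = {1: {"lma_long": ["a", "b"]}, 2: {"lma_long": ["b"]}, 3: {"lma_long": ["c"]}}
uninv = uninvert(docs, "lma_long")       # built once per searcher (the slow part)
counts = facet_counts(uninv, {1, 2})     # cheap per query
```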
On Tue, Jul 30, 2013 at 11:48 PM, Robert Stewart wrote:
> Also we need to issue frequent commits since we are constantly streaming
> new content into the system.
I'd like to say show me profiler snapshot, but after that note. Solr's
filter/field caches are top level datastructures, hence they ar
facet search performance should I expect?
Also we need to issue frequent commits since we are constantly streaming new
content into the system.
Thanks
Bob
ext:
> http://lucene.472066.n3.nabble.com/Replication-process-on-Master-Slave-slowing-down-slave-read-search-performance-tp707934p4078090.html
> Sent from the Solr - User mailing list archive at Nabble.com.
On 6/12/2013 8:50 AM, adfel70 wrote:
> We have a multi-sharded and multi-replicated collection (solr 4.3).
>
> When we perform massive indexing (adding 5 million records with 5k bulks,
> commit after each bulk), the search performance degrades a lot (1 sec
> query can turn
Hi,
We have a multi-sharded and multi-replicated collection (solr 4.3).
When we perform massive indexing (adding 5 million records with 5k bulks,
commit after each bulk), the search performance degrades a lot (1 sec
query can turn to 4 sec query).
Any rule of thumb regarding best
r now, but in the end it will be more than 1.000.000 pages.
How can I improve search performance?
I'm using this configuration:
explicit
json
true
text
edismax
id^10.
ick reply.
>>
>> Thus, replication seems to be the preferable solution. Does QTime decrease
>> proportionally to the number of replicas, or are there any other drawbacks?
>>
>> Just to clarify, what amount of documents stands for "tons of documents"
>> in your opinion? :)
2013/5/7 Jan Høydahl
Hi,
It depends(TM) on what kind of search performance problems you are seeing.
If you simply have so high query load that the server starts to kneal, it will
definitely not help to shard, since ALL the shards will still be hit with
ALL the queries, and you add some extra overhead with sharding as
r 15, 2012 6:27 AM
To: solr-user@lucene.apache.org
Subject: RE: Improving proximity search performance
I have the same problem. Did you get any good ideas? I wish you could share
them. Thanks.
On 2012-2-18, 8:52 AM, "Bryan Loofbourrow" wrote:
Apologies. I meant to type “1.4 TB” and somehow typed “1
must
> be using Sneakernet to run my searches.
>
>
>
> -- Bryan Loofbourrow
>
>
> --
>
> *From:* Bryan Loofbourrow [mailto:bloofbour...@knowledgemosaic.com]
> *Sent:* Thursday, February 16, 2012 7:07 PM
> *To:* 'solr-user@lucene.apache
*Subject:* Improving proximity search performance
Here’s my use case. I expect to set up a Solr index that is approximately
1.4GB (this is a real number from the proof-of-concept using the
improve my proximity search performance?
Second question: If not, I’m very willing to dive into the code and come up
with a patch that would do this. Can someone with knowledge of the
internals comment on whether this is a plausible strategy for improving
performance, and, if so, give tips about the
I've switched my index to use pointtype instead of latlontype for spatial search
queries. Unfortunately I'm seeing much worse performance, and I was wondering
if anybody else knew of any issues between the two types. I would expect a flat
space calculation of pointtype to be better than the spher
5G memory per JVM
Here's a brief reference to that approach in the Lucene FAQ:
http://wiki.apache.org/lucene-java/LuceneFAQ#Can_I_store_the_Lucene_index_in_a_relational_database.3F
Jonathan
So the question is, what is this technique/trick? More broadly: Why can
Lucene/Solr achieve better faceted search perfor
technique/trick? More broadly: Why can
Lucene/Solr achieve better faceted search performance theoretically than
RDBMS could (if so)?
*Note: My first guess would be that Lucene would use some space partitioning
method for partitioning a vector space built from the document fields as
dimensions, but as I
Please re-post the question here so others can see
the discussion without going to another list.
Best
Erick
On Wed, Apr 6, 2011 at 4:09 AM, Robin Palotai wrote:
> Hello List,
>
> Please see my question at
>
> http://stackoverflow.com/questions/5552919/how-does-lucene-solr-achieve-high-performanc
Hello List,
Please see my question at
http://stackoverflow.com/questions/5552919/how-does-lucene-solr-achieve-high-performance-in-multi-field-faceted-search,
I would be interested to know some details.
Thank you,
Robin
On Wednesday 12 January 2011 10:56 AM, Grijesh.singh wrote:
Which type of performance issues you have index time or query time?
-
Grijesh
I have query-time issues. Also, tell me under which conditions the field type
'textspell' is used. Does it affect the performance of Solr queries?
Hi
Please tell me what changes to make in the Solr config file to improve
Solr search performance.
Thanks!
- Original Message
From: Lance Norskog
To: solr-user@lucene.apache.org
Sent: Wed, November 17, 2010 10:53:47 PM
Subject: Re: my index has 500 million docs ,how to impro
e lucene-user list . Also, it's basically windows-specific,so not of use
> to everyone...
>
> The question: does NTFS fragmentation affect search performance "a little
> bit" or "a lot"? It's obvious that "fragmentation will slow things down",
> but
d
up.
Tom
On Wed, Dec 8, 2010 at 9:59 AM, Will Milspec wrote:
Hi all,
Pardon if this isn't the best place to post this email...maybe it belongs on
the lucene-user list . Also, it's basically windows-specific,so not of use
to everyone...
The question: does NTFS fragmentation affect search performance "a little
bit" or "
This is pretty standard. I think the problem is basic probabilities:
when there are multiple shards, the query waits until the final shard
responds, then does another query which may wait for more than one
shard. The nature of probabilities is that there will be "stragglers"
(late responses) an
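The effect can be stated in one line: a distributed query's latency is the maximum over shard latencies, so one straggler sets the pace. A trivial illustration:

```python
# The coordinator must wait for the last shard before it can respond.
def distributed_latency(shard_latencies_ms):
    return max(shard_latencies_ms)

one_shard = distributed_latency([20])
four_shards = distributed_latency([20, 22, 21, 95])  # one straggler dominates
```

With more shards there are more chances that at least one response lands in the latency tail, so the expected maximum grows even when each shard's own latency distribution is unchanged.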
ml content:
id
name
I'm looking forward to your opinion
It's not that EC2 instances have slow disks, it's that they have no
quota system to guarantee you X amount of throughput. I've benchmarked
1x to 3x on the same instance type at different times. That is, 300%
variation in disk speeds.
Filter queries are only slow once; after that they create a
On Mon, 2010-11-15 at 06:35 +0100, lu.rongbin wrote:
> In addition, my index has only two stored fields, id and price; the other
> fields are indexed. I increased the document and query caches. The ec2
> m2.4xLarge instance is 8 cores, 68G memory. All indexes total about 100G.
Looking at http://aws.ama
more time
to view the result. I use Solr filters to search, for example,
category:digital, price:[some price TO some price]; I don't know if that costs
time. Any way to improve the search performance? Thanks.
ext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
>
>
> - Original Message
> > From: bbarani
> > To: solr-user@lucene.apache.org
> > Sent: Wed, June 16, 2010 5:06:55 PM
> > Subject: SOLR search performance - Linux vs Windows servers
Hi,
I have SOLR instances running on both Linux and Windows servers (same version /
same index data). Search performance is good on the Windows box compared to the
Linux box.
Some queries take more than 10 seconds on the Linux box but just a second
on the Windows box. Has anyone encountered this kind of
egards
>
>
> Marco Martínez Bautista
> http://www.paradigmatecnologico.com
> Avenida de Europa, 26. Ática 5. 3ª Planta
> 28224 Pozuelo de Alarcón
> Tel.: 91 352 59 42
>
>
> 2010/4/9 Marcin
>
>> Hi guys,
>>
>> I have noticed that Master/Slave re
Hi guys,
I have noticed that Master/Slave replication process is slowing down
slave read/search performance during replication being done.
please help
cheers
Agreed, Solr uses random access bitsets everywhere so I'm thinking
this could be an improvement or at least a great option to enable and
try out. I'll update LUCENE-1536 so we can benchmark.
On Thu, Aug 27, 2009 at 4:06 AM, Michael
McCandless wrote:
> On Thu, Aug 27, 2009 at 6:30 AM, Grant Ingerso
On Thu, Aug 27, 2009 at 6:30 AM, Grant Ingersoll wrote:
>> I am wondering... are new SOLR filtering features faster than standard
>> Lucene queries like
>> {query} AND {filter}???
>
> The new filtering features in Solr are just doing what Lucene started doing
> in 2.4 and that is using skipping wh
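The skipping idea can be sketched as a leapfrog intersection of two sorted doc-id lists, advancing whichever cursor is behind; this is a simplification of what Lucene's postings actually do (real postings use skip lists to jump ahead in blocks rather than one id at a time).

```python
def leapfrog_intersect(a, b):
    """Intersect two ascending doc-id lists by advancing the lagging cursor."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1   # real implementations skip ahead here instead of stepping
        else:
            j += 1
    return out

matches = leapfrog_intersect([2, 4, 8, 15, 23], [4, 5, 15, 40])
```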
https://issues.apache.org/jira/browse/SOLR-1179
-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: August-26-09 8:50 PM
To: solr-user@lucene.apache.org
Subject: Fwd: Lucene Search Performance Analysis Workshop
While Andrzej's talk will focus on things at the Lucene layer, I'm
sure there'll be some great tips and tricks useful to Solrians too.
Andrzej is one of the sharpest folks I've
Begin forwarded message:
From: Andrzej Bialecki
Date: August 26, 2009 5:44:40 PM EDT
To: java-u...@lucene.apache.org
Subject: Lucene Search Performance Analysis Workshop
Reply-To: java-u...@lucene.apache.org
Hi all,
I am giving a free talk/ workshop next week on how to analyze and
improve Luce
e filter will tokenize just
> as in Exhibit B, except that if the query is only one word long, it
> will return a corresponding single token, rather than zero tokens. In
> other words,
>
> [Exhibit C]
> "please" ->
> "please"
>
> Thing
Mike Klaas suggested last month that I might be able to improve phrase
search performance by indexing word bigrams, aka bigram shingles. I've
been playing with this, and the initial results are very promising. (I
may post some performance data later.) I wanted to describe my
technique, whic
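The core of the technique can be sketched in a few lines; this mirrors in spirit what Lucene's ShingleFilter produces with a shingle size of 2.

```python
# Turn a token stream into word bigrams ("shingles"), so a two-word phrase
# query becomes a cheap single-term lookup instead of a position check.
def bigram_shingles(tokens):
    return ["{} {}".format(a, b) for a, b in zip(tokens, tokens[1:])]

shingles = bigram_shingles(["please", "divide", "this", "sentence"])
```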
>
>
> can you be more specific about what you mean when you say "And I got the
> time from dispatchfilter..." What *exactly* are you looking at (ie: is
> this a time you are seeing in a log file? ifso which log file? ... is this
> timing code you added to the dispatch filter yourself? what *exactl
: 1) how are you timing this (ie: what exactly are you measuring)
: And I got the time from dispatchfilter received the request to
: responsewriter write the response
: It is much larger than QTime.
2008/4/9, Chris Hostetter <[EMAIL PROTECTED]>:
>
>
> : most of time seems to be used for the writer getting and writing the
> docs
> : can those docs prefetched?
>
> as mentioned, the documentCache can help you out in the common case, but
> 1-4 seconds for just the XMLWriting seems pretty high ...
I will try what you suggested!
Thanks a lot~
On 08-4-9, Leonardo Santagada <[EMAIL PROTECTED]> wrote:
On 09/04/2008, at 00:24, 李银松 wrote:
most of time seems to be used for the writer getting and writing the
docs
can those docs prefetched?
There is a cache on solr... if you really want it you could make the
cache and the jvm as big as your memory it should probably fit most of
the 10gb i
most of time seems to be used for the writer getting and writing the docs
can those docs prefetched?
2008/4/9, Leonardo Santagada <[EMAIL PROTECTED]>:
I'm testing Solr search performance using LoadRunner.
The index contains 5M+ docs, about 10.7GB large.
CPU: 3.2GHz*2, RAM: 16GB.
The result is dispiriting:
max: 19s, min: 1.5s, avg: 11.7s
But the QTime is around 1s
(simple query without facet or MLT, just fetching the top 50 IDs).
So it seems that XMLWriter i
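The gap being measured can be framed simply: QTime covers only the server-side search, so client-observed time minus QTime is response writing plus transfer (plus client overhead). A sketch using the numbers from the post; the response dict is fabricated.

```python
# QTime comes from Solr's responseHeader; the client measures wall time.
def response_overhead_ms(response, wall_time_ms):
    return wall_time_ms - response["responseHeader"]["QTime"]

fake_response = {"responseHeader": {"QTime": 1000}}
overhead = response_overhead_ms(fake_response, 11700)   # avg from the test run
```

An overhead this large relative to QTime points at serialization and transfer (here, XMLWriter and the 50 fetched documents) rather than at the search itself.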
k as
new docs are added to them, but never explode.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: fireofenigma <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, February 19, 2008 6:30:34 PM
Subject: Search Performanc
[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, February 19, 2008 6:30:34 PM
> Subject: Search Performance When There Are Many Segments
>
>
> Let me start with an example application/scenario.
>
> I have an application that allows users to uploa
ed. Thanks!