On Wed, 2015-08-19 at 05:55 +, Maulin Rathod wrote:
> SLOW WITH FILTER QUERY (takes more than 1 second)
>
>
> q=+recipient_id:(4042) AND project_id:(332) AND resource_id:(13332247
> 13332245 13332243 13332241 13332239) AND entity_type:(2) AND -acti
On Tue, 2015-08-18 at 14:36 +0530, Modassar Ather wrote:
> So Toke/Daniel is the node showing *gone* on Solr cloud dashboard is
> because of GC pause and it is actually not gone but the ZK is not able to
> get the correct state?
That would be my guess.
> The issue is caused by a huge query with many wildcards and phrases in it.
Hi,
http://stackoverflow.com/questions/11627427/solr-query-q-or-filter-query-fq
As per above link it suggests to use Filter Query but we observed Filter Query
is slower than Normal Query in our case. Are we doing something wrong?
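A minimal sketch of the q-vs-fq distinction the question is about (field names taken from the query in this thread; Python is used only to build the request strings). Each fq entry is cached independently in the filterCache, so fq only pays off when the same filter recurs across many requests; a filter that changes every request can be slower than folding it into q.

```python
from urllib.parse import urlencode

# Variant 1: every constraint inside the main query.
as_one_query = urlencode({
    "q": "+recipient_id:(4042) AND project_id:(332) AND entity_type:(2)",
})

# Variant 2: user query in q, fixed constraints as separate fq clauses,
# each cached on its own in the filterCache.
with_filters = urlencode([
    ("q", "recipient_id:(4042)"),
    ("fq", "project_id:(332)"),
    ("fq", "entity_type:(2)"),
])
```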
SLOW WITH FILTER QUERY (takes more than 1 second)
=================================================
We sometimes get a spike in Solr, and we get like 3K of threads and then
timeouts...
In Solr 5.2.1 the default jetty settings are kinda crazy for threads - since
the value is HIGH!
What do others recommend?
Fusion jetty settings for Threads:
* *
false
T
Ahmet/Chris! Thanks for your replies.
Ahmet, I think "net.agkn.hll.serialization" is used by the hll() function
implementation in Solr.
Chris I will try to create sample data and create a jira ticket with
details.
Regards,
Modassar
On Tue, Aug 18, 2015 at 9:58 PM, Chris Hostetter
wrote:
>
> : > I
Hmm...so I think I have things setup correctly, I have a custom
QParserPlugin building a custom query that wraps the query built from the
base parser and stores the user who is executing the query. I've added the
username to the hashCode and equals checks so I think everything is setup
properly.
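The cache-key idea described above can be sketched in a few lines (Python stands in for the Java Query class here; the principle is the same: Solr's caches key on hashCode/equals of the query object, so the requesting user must participate in both):

```python
class SecureQuery:
    """Illustrative wrapper: a query plus the user who runs it.
    Two users issuing the same wrapped query hash to different
    cache keys, so cached results are never shared across principals."""

    def __init__(self, wrapped, user):
        self.wrapped = wrapped
        self.user = user

    def __eq__(self, other):
        return (isinstance(other, SecureQuery)
                and self.wrapped == other.wrapped
                and self.user == other.user)

    def __hash__(self):
        return hash((self.wrapped, self.user))
```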
: My current expansion expands from the
: user-query
: to the
: +user-query favouring-query-depending-other-params overall-favoring-query
: (where the overall-favoring-query could be computed as a function).
: With the boost parameter, I'd do:
: (+user-query favouring-query-depending-othe
On Tue, Aug 18, 2015 at 9:51 PM, Jamie Johnson wrote:
> Thanks, I'll try to delve into this. We are currently using the parent
> query parser, within which we could use {!secure}, I think. Ultimately I would
> want the solr qparser to actually do the work of parsing and I'd just wrap
> that.
Right...
Thanks, I'll try to delve into this. We are currently using the parent
query parser, within which we could use {!secure}, I think. Ultimately I would
want the solr qparser to actually do the work of parsing and I'd just wrap
that. Are there any examples that I could look at for this? It's not
clear to
On Tue, Aug 18, 2015 at 8:38 PM, Jamie Johnson wrote:
> I really like this idea in concept. My query would literally be just a
> wrapper at that point, what would be the appropriate place to do this?
It depends on how much you are trying to make everything transparent
(that there is security) or
The boost parameter is part of the edismax query parser. If you have your
own query parser you could introduce your own argument "boost" and
interpret it as a value source. Here's the code that parses the external
function query in edismax
https://github.com/apache/lucene-solr/blob/15d634b9ef2ea52
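A sketch of what the edismax boost parameter looks like on the request side (the recip/ms recency function is just an example value source, not something from this thread):

```python
from urllib.parse import urlencode

# edismax multiplies the score by the "boost" function query.
params = urlencode({
    "defType": "edismax",
    "q": "solr cloud",
    # Example value source: decay score by document age.
    "boost": "recip(ms(NOW,entry_date),3.16e-11,1,1)",
})
```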
I really like this idea in concept. My query would literally be just a
wrapper at that point, what would be the appropriate place to do this?
What would I need to do to the query to make it behave with the cache.
Again thanks for the idea, I think this could be a simple way to use the
caches.
On
On Tue, Aug 18, 2015 at 8:19 PM, Jamie Johnson wrote:
> when you say a security filter, are you asking if I can express my security
> constraint as a query? If that is the case then the answer is no. At this
> point I have a requirement to secure Terms (a nightmare I know).
Heh - ok, I figured
when you say a security filter, are you asking if I can express my security
constraint as a query? If that is the case then the answer is no. At this
point I have a requirement to secure Terms (a nightmare I know). Our
fallback is to aggregate the authorizations to a document level and secure
th
Doug Turnbull wrote:
> I'm not sure if you mean organizing function queries under the hood in a
> query component or externally.
>
> Externally, I've always followed John Berryman's great advice for working
> with Solr when dealing with complex/reusable function queries and boosts
> http://opensour
On Tue, Aug 18, 2015 at 7:11 PM, Jamie Johnson wrote:
> Yes, my use case is security. Basically I am executing queries with
> certain auths and when they are executed multiple times with differing
> auths I'm getting cached results.
If it's just simple stuff like top N docs returned, can't you j
I'm not sure if you mean organizing function queries under the hood in a
query component or externally.
Externally, I've always followed John Berryman's great advice for working
with Solr when dealing with complex/reusable function queries and boosts
http://opensourceconnections.com/blog/2013/11/2
Yes, my use case is security. Basically I am executing queries with
certain auths and when they are executed multiple times with differing
auths I'm getting cached results. One option is to have another
implementation that has a number of caches based on the auths, something
that I suspect we wil
You can comment out (some) of the caches.
There are some caches like field caches that are more at the lucene
level and can't be disabled.
Can I ask what you are trying to prevent from being cached and why?
Different caches are for different things, so it would seem to be an
odd usecase to disabl
I see that if Solr is in realtime mode that caching is disabled within the
SolrIndexSearcher that is created in SolrCore, but is there any way to
disable caching without being in realtime mode? Currently I'm implementing
a NoOp cache that implements SolrCache but returns null for everything and
does
Hello Solr experts,
I'm writing a "query expansion" QueryComponent which takes web-app
parameters (e.g. profile information) and turns them into a solr query.
Thus far I've used lucene TermQuery-ies with success.
Now, I would like to use something a bit more elaborate. Either I write
it with qui
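The expansion pattern quoted later in this thread ("+user-query favouring-query ...") can be sketched as a small helper (field names and the ^2 boost are made up for illustration):

```python
def expand_query(user_query, profile):
    """Keep the user query mandatory (+) and append optional
    boosting clauses derived from web-app profile parameters."""
    clauses = [f"+({user_query})"]
    for field, value in profile.items():
        clauses.append(f"{field}:{value}^2")
    return " ".join(clauses)
```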
Where is Zookeeper running? Is it running as an independent service on a
separate box?
Also, 4.0 is very old now - the code has matured a LOT since then.
Upayavira
On Tue, Aug 18, 2015, at 09:54 PM, Erick Erickson wrote:
> You might be hitting: https://issues.apache.org/jira/browse/SOLR-7361
>
You might be hitting: https://issues.apache.org/jira/browse/SOLR-7361
Note that the fix is in the (currently releasing) 5.3 and trunk code,
with virtually no possibility of back-porting to 4.0, unfortunately.
Best,
Erick
On Tue, Aug 18, 2015 at 1:19 PM, Gilles Comeau
wrote:
> Hi all,
>
> Sorry
bq: can I turn off the three cache and send a lot of queries to Solr
I really think you're missing the easiest way to do that.
To not put anything in the filter cache, just don't send any fq clauses.
As far as the doc cache is concerned, by and large I just wouldn't
worry about it. With MMapDirec
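For a one-off filter that should not occupy a filterCache slot, the {!cache=false} local param can be attached to the fq (field name below is illustrative; Python only builds the URL):

```python
from urllib.parse import urlencode

# The local param {!cache=false} tells Solr not to cache this filter.
params = urlencode({
    "q": "text:solr",
    "fq": "{!cache=false}resource_id:(13332247 13332245)",
})
```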
Hi all,
Sorry if this has been asked before, my online searching is not bringing up any
answers.
If I have two shards on different servers with zookeeper, Core1 and Core2, in a
collection that are identical to each other, why won't Core1 return any results
while Core2 is starting up? If Core
Hi Erick,
I just tested 10 different queries with or without the faceting search on
the two properties : departure_date, and hotel_code. Under cold cache
scenario, they have pretty much the same response time, and the faceting
took much less time than the query time. Under cold cache scenario, the
Question:
Can I configure solr to highlight the keyword also? The search results are
correct, but the highlighting is not complete.
Example:
Keyword: stocks
Request: (I only provided the url parameters below.)
hl=true&
hl.fl=spell&
hl.simple.pre=%5BHIGHLIGHT%5D&
hl.simple.post=%5B%2FHIGHLI
those are not that high. I was thinking of facets with thousands to
tens-of-thousands of unique values. I really wouldn't expect this to
be a huge hit unless you're querying all docs.
Let us know what you find.
Best,
Erick
On Tue, Aug 18, 2015 at 11:31 AM, wwang525 wrote:
> Hi Erick,
>
> Two fa
Hi Erick,
Two facets are probably demanding:
departure_date has 365 distinct values and hotel_code can have 800 distinct
values.
The docValues setting definitely helped me a lot even when all the queries
had the above two facets. I will test a list of queries with or without the
two facets afte
https://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh email. Even if you change the
subject line of your email, other mail headers still track which
First, do not think in terms of cores, think "replicas" ;). And do not
use the "core admin" bits of the admin UI to do any SolrCloud-related
operations. It's possible, but far too easy to get wrong.
Use the collections API instead.
Second, 600 collections, assuming all on a single cluster is
get
bq: The issue is caused by a huge query with many wildcards and phrases in it.
Well, the very first thing I'd do is look at whether this is necessary.
For instance:
leading and trailing wildcards are an anti-pattern. You should investigate
using ngrams instead.
trailing wildcards usually resolve
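The ngram suggestion above boils down to: index all character n-grams of a token, so a "contains"-style match becomes a cheap term lookup instead of a leading-wildcard scan. A toy version of the gram generation (Solr would do this with an NGramFilterFactory in the analysis chain):

```python
def ngrams(term, n=2):
    """All character n-grams of a token; e.g. indexing these for 'solr'
    lets a query for 'ol' match without any wildcard expansion."""
    return [term[i:i + n] for i in range(len(term) - n + 1)]
```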
Hello,
I'm a bit confused about how Solr Cloud recovery is supposed to work
exactly in the case of losing a single node completely.
My 600 collections are created with
numShards=3&replicationFactor=3&maxShardsPerNode=3
However, how do i configure a new node to take the place of the dead
node,
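On newer Solr versions, replacing a dead node is done per shard through the Collections API ADDREPLICA action rather than through core admin; a sketch of building such a call (host, collection, and shard names here are hypothetical):

```python
from urllib.parse import urlencode

def add_replica_url(base, collection, shard, node):
    """Build a Collections API ADDREPLICA request URL that places a
    new replica of the given shard on the named node."""
    params = urlencode({
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "node": node,
    })
    return f"{base}/admin/collections?{params}"
```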
Cloudera has back-ported a _bunch_ of Solr JIRAs to their
release, so depending on which CDH version you have, the
functionality may or may not be there. I suggest you contact
Cloudera support to see what's been backported to the
version of CDH you're using because it may not be just Solr
4.4.
Clo
Lot of stuff here, let me reply to a few things:
If you're faceting on high-cardinality fields, this is expensive.
How many unique values are there in the fields you facet on?
Note, I am _not_ asking about how many values are in the fields
of the selected set, but rather how many values corpus-wid
Thanks for the response. Will take a look into using cloud solr server
for updates and review tlog mechanism.
On 8/18/15 9:29 AM, Erick Erickson wrote:
Couple of things:
1> Here's an excellent backgrounder for MMapDirectory, which is
what makes it appear that Solr is consuming all the physical
Couple of things:
1> Here's an excellent backgrounder for MMapDirectory, which is
what makes it appear that Solr is consuming all the physical memory
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
2> It's possible that your transaction log was huge. Perhaps not likel
On Tue, Aug 18, 2015 at 12:23 PM, naga sharathrayapati
wrote:
> Is it possible to clear the cache through query?
>
> I need this for performance evaluation.
No, but you can prevent a query from being cached:
q={!cache=false}my query
What are you trying to test the performance of exactly?
If you t
: > I am getting following exception for the query :
: > *q=field:query&stats=true&stats.field={!cardinality=1.0}field*. The
: > exception is not seen once the cardinality is set to 0.9 or less.
: > The field is *docValues enabled* and *indexed=false*. The same exception
: > I tried to reproduce o
Is it possible to clear the cache through query?
I need this for performance evaluation.
Thanks Shawn.
All participating cloud nodes are running Tomcat and as you suggested
will review the number of threads and increase them as needed.
Essentially, what I have noticed was that two of four nodes caught up
with "bulk" updates instantly while other two nodes took almost 3 hours
to
Hi All,
I am working on a search service based on Solr (v5.1.0). The data size is 15
M records. The size of the index files is 860MB. The test was performed on a
local machine that has 8 cores with 32 G memory and CPU is 3.4GHz (Intel
Core i7-3770).
I found out that setting docValues=true for fa
Check out https://issues.apache.org/jira/browse/SOLR-4722, which will
return matching terms (and their offsets). Patch can be applied cleanly to
Solr 4; doesn't appear to have been tried with Solr 5
-Simon
On Tue, Aug 18, 2015 at 11:30 AM, Jack Krupansky
wrote:
> Maybe a specialized highlighter
On 8/18/2015 7:21 AM, Norgorn wrote:
> SOLR version - 4.10.3
> We have SOLR Cloud cluster, each node has documents only for several
> categories.
> Queries look like "...fq=cat:(1 3 89 ...)&..."
> So, only some nodes need to process, others can answer with zero as soon as
> they check "cat".
>
> The
I did try Highlighting, but it is highlighting only those words which are
part of the query, not the matching phrase.
Maybe a specialized highlighter could be produced that simply lists the
matched terms in a form that apps can easily consume.
-- Jack Krupansky
On Tue, Aug 18, 2015 at 11:11 AM, Mikhail Khludnev <
mkhlud...@griddynamics.com> wrote:
> Hello,
>
> I just wonder what's wrong with highlighting?
>
> O
On 8/18/2015 8:18 AM, Rallavagu wrote:
> Thanks for the response. Does this cache behavior influence the delay
> in catching up with cloud? How can we explain solr cloud replication
> and what are the options to monitor and take proactive action (such as
> initializing, pausing etc) if needed?
I do
Hello,
I just wonder what's wrong with highlighting?
On Tue, Aug 18, 2015 at 4:19 PM, Basheer Shaik
wrote:
> Hi,
> I am new to Solr. We have a requirement to carry out fuzzy search. I am
> able
> to do this and figure out the documents that meet the fuzzy search
> criteria.
> Is there a way to
Solr Cloud Document Routing described at
https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
allows you to omit hitting certain shards, but they need to be assigned
with the different prefixes beforehand.
Do I get your point right?
On Tue, Aug 18, 2015 at 4:57 PM
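The compositeId routing mentioned above works by hashing a prefix embedded in the document id, so all documents sharing the prefix land on the same shard; a toy illustration of the prefix extraction (not Solr's own code):

```python
def route_key(doc_id):
    """Extract the routing prefix from a composite id, e.g. 'cat1!doc42'
    -> 'cat1'. Solr hashes this prefix to pick the shard, which lets
    queries with _route_=cat1! hit only the shards holding that key."""
    prefix, sep, _ = doc_id.partition("!")
    return prefix if sep else None
```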
I second that question! Inquiring minds want to know!
On 8/18/2015 7:19 AM, Basheer Shaik wrote:
Hi,
I am new to Solr. We have a requirement to carry out fuzzy search. I am able
to do this and figure out the documents that meet the fuzzy search criteria.
Is there a way to find out the list of t
Thanks for the response. Does this cache behavior influence the delay in
catching up with cloud? How can we explain solr cloud replication and
what are the option to monitor and take proactive action (such as
initializing, pausing etc) if needed?
On 8/18/15 5:57 AM, Shawn Heisey wrote:
On 8/
Have you tried this with Cache=false?
https://wiki.apache.org/solr/CommonQueryParameters#Caching_of_filters
Because the internal representation of the field value already may be
doing what you want. And the caching of non-repeating filters is what
is slowing it down.
I would just do that as a sanity
I'm sorry for being so unclear.
The problem is in speed - while a node holds only several cats, it can answer
with "numFound=0" if these cats are missing from the query.
It looks like:
node 1 - cats 1,2,3
node 2 - cats 3,4,5
node 3 - cats 50,70
...
Query "q=cat:(1 4)"
QTime per node now is like
node1 -
I am not sure I understand the problem statement. Is it speed? Memory
usage? Something very specific about SolrCloud?
To me it seems the problem is that your 'fq' _are_ getting cached when
you may not want them as the list is different every time. You could
disable that cache.
Or you could try Te
Hi,
I am new to Solr. We have a requirement to carry out fuzzy search. I am able
to do this and figure out the documents that meet the fuzzy search criteria.
Is there a way to find out the list of terms from each selected document
that matched this search criteria?
Appreciate any help on this.
Than
SOLR version - 4.10.3
We have SOLR Cloud cluster, each node has documents only for several
categories.
Queries look like "...fq=cat:(1 3 89 ...)&..."
So, only some nodes need to process, others can answer with zero as soon as
they check "cat".
The problem is to keep separate cache for "cat" values
On 8/18/2015 2:30 AM, Daniel Collins wrote:
> I think this is expected. As Shawn mentioned, your hard commits have
> openSearcher=false, so they flush changes to disk, but don't force a
> re-open of the active searcher.
> By contrast softCommit, sets openSearcher=true, the point of softCommit is
>
On 8/17/2015 10:53 PM, Rallavagu wrote:
> Also, I have noticed that the memory consumption goes very high. For
> instance, each node is configured with 48G memory while java heap is
> configured with 12G. The available physical memory is consumed almost
> 46G and the heap size is well within the li
Hi Modassar,
What is this "net.agkn.hll.serialization" ? Custom plugin or something?
Ahmet
On Tuesday, August 18, 2015 9:23 AM, Modassar Ather
wrote:
Any suggestions please.
Regards,
Modassar
On Thu, Aug 13, 2015 at 4:25 PM, Modassar Ather
wrote:
> Hi,
>
> I am getting following exceptio
This arrived with the latest 5.1/5.2 Solr, so no, it won't work on 4.4,
which is quite old by now.
As to how to do it on an older Solr, if you have the ability to do
additional work at index time, create an entryDate_month field, which
is truncated to the beginning of the month, then do a normal
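The index-time truncation suggested above could be done client-side like this (a sketch; inside Solr an update request processor could compute the same value, and the field name entryDate_month comes from the message above):

```python
from datetime import datetime

def month_start(dt):
    """Truncate a timestamp to the first instant of its month - the
    value an entryDate_month field would hold, so 'group by month'
    becomes a plain facet/range query on that field."""
    return dt.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
```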
So Toke/Daniel is the node showing *gone* on Solr cloud dashboard is
because of GC pause and it is actually not gone but the ZK is not able to
get the correct state?
The issue is caused by a huge query with many wildcards and phrases in it.
If you see I have mentioned about (*The request took too l
Ah ok, it's ZK timeout then
(org.apache.zookeeper.KeeperException$SessionExpiredException)
which is because of your GC pause.
The page Shawn mentioned earlier has several links on how to investigate GC
issues and some common GC settings, sounds like you need to tweak those.
Generally speaking, I b
I think this is expected. As Shawn mentioned, your hard commits have
openSearcher=false, so they flush changes to disk, but don't force a
re-open of the active searcher.
By contrast softCommit, sets openSearcher=true, the point of softCommit is
to make the changes visible, so to do that you have to
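The commit semantics described above can be modeled with a toy index (purely illustrative, not Solr's implementation): a hard commit with openSearcher=false makes data durable without changing what searches see, while a soft commit changes visibility without guaranteeing durability.

```python
class ToyIndex:
    """Toy model of Solr commit visibility."""

    def __init__(self):
        self.pending = []      # in-memory, not yet flushed
        self.on_disk = []      # durably written
        self.searchable = []   # what the current searcher sees

    def add(self, doc):
        self.pending.append(doc)

    def hard_commit(self, open_searcher=False):
        # Flush to disk; refresh the searcher only if asked to.
        self.on_disk.extend(self.pending)
        self.pending = []
        if open_searcher:
            self.searchable = list(self.on_disk)

    def soft_commit(self):
        # Make everything added so far visible, durable or not.
        self.searchable = self.on_disk + self.pending
```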
On Tue, 2015-08-18 at 10:38 +0530, Modassar Ather wrote:
> Kindly help me understand, even if there is a GC pause why the solr node
> will go down.
If a stop-the-world GC is in progress, it is not possible for an
external service to know if this is because a GC is in progress or the
node is dead
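In other words, liveness is inferred purely from heartbeats: if none arrives within the ZK session timeout, the ephemeral node expires and the replica shows as "gone", whether the JVM is dead or just paused. A sketch (the 15 s timeout is only an example default):

```python
def looks_dead(last_heartbeat, now, zk_session_timeout=15.0):
    """From ZooKeeper's point of view, a node in a long stop-the-world
    GC pause is indistinguishable from a crashed one: no heartbeat
    within the session timeout means the session expires."""
    return (now - last_heartbeat) > zk_session_timeout
```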