Re: docValues usage

2020-11-04 Thread Wei
And in the case of both stored=true and docValues=true, will Solr 8.x choose the optimal approach by itself? On Wed, Nov 4, 2020 at 9:15 AM Wei wrote: > Thanks Erick. As indexed is not necessary, and docValues is more > efficient than stored fields for function queries, so we sh

Re: docValues usage

2020-11-04 Thread Wei
Thanks Erick. Since indexed is not necessary, and docValues is more efficient than stored fields for function queries, we shall go with the following: 3) indexed=false, stored=false, docValues=true. Is my understanding correct? Best, Wei On Wed, Nov 4, 2020 at 5:24 AM Erick Erickson
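Option 3 above (indexed=false, stored=false, docValues=true) can be written down as a Schema API add-field command. The following is a minimal sketch, not a statement of how the poster defined the field: the field name `popularity` and the type `pfloat` are assumptions chosen for illustration, and the payload is only built and printed, not sent to Solr.

```python
import json


def docvalues_only_field(name, field_type):
    """Build a Schema API 'add-field' command for a docValues-only field."""
    return {
        "add-field": {
            "name": name,
            "type": field_type,
            "indexed": False,   # no inverted index for this field
            "stored": False,    # no stored copy of the value
            "docValues": True,  # column-oriented values only
        }
    }


# Hypothetical field name and type, for illustration only.
payload = json.dumps(docvalues_only_field("popularity", "pfloat"), indent=2)
print(payload)
```

The resulting JSON is the kind of body one would POST to the collection's /schema endpoint.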

docValues usage

2020-11-03 Thread Wei
, docValues=false 2) indexed=true, stored=false, docValues=true 3) indexed=false, stored=false, docValues=true What would be the performance implications for these options? Best, Wei

Re: solr performance with >1 NUMAs

2020-10-22 Thread Wei
-XX:G1MaxNewSizePercent=20 -XX:MaxGCPauseMillis=150 -XX:+DisableExplicitGC -XX:+DoEscapeAnalysis -XX:+ParallelRefProcEnabled -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions Compared to previous Java 8 + CMS on 2 NUMA servers, P99 latency has improved over 20%. Thanks, Wei On Mon

Re: solr performance with >1 NUMAs

2020-09-28 Thread Wei
Thanks Shawn. Looks like Java 11 is the way to go with -XX:+UseNUMA. Do you see any backward compatibility issue for Solr 8 with Java 11? Can we run Solr 8 built with JDK 8 in Java 11 JRE, or need to rebuild solr with Java 11 JDK? Best, Wei On Sat, Sep 26, 2020 at 6:44 PM Shawn Heisey wrote

Re: What does current mean?

2020-09-26 Thread Wei
My understanding is that current means whether there is data pending to be committed. Best, Wei On Sat, Sep 26, 2020 at 5:09 PM Kayak28 wrote: > Hello, Solr community: > > > > I would like to ask a question about the current icon on the core-overview > > under stat

Re: solr performance with >1 NUMAs

2020-09-26 Thread Wei
tml, seems Java 14 is not officially supported for Solr 8. Best, Wei On Fri, Sep 25, 2020 at 5:50 PM Shawn Heisey wrote: > On 9/23/2020 7:42 PM, Wei wrote: > > Recently we deployed solr 8.4.1 on a batch of new servers with 2 NUMAs. I > > noticed that query latency almost d

Re: solr performance with >1 NUMAs

2020-09-25 Thread Wei
Thanks Dominique. I'll start with the -XX:+UseNUMA option. Best, Wei On Fri, Sep 25, 2020 at 7:04 AM Dominique Bejean wrote: > Hi, > > This would be a Java VM option, not something Solr itself can know about. > Take a look at this article in comments. May be it wil

solr performance with >1 NUMAs

2020-09-23 Thread Wei
inter is appreciated. Best, Wei

How to disable cache for facet.query?

2020-08-08 Thread Wei
) does not work. Is it possible to stop solr from putting facet.query into filter cache? Thanks, Wei

Re: Unbalanced shard requests

2020-05-22 Thread Wei
Hi Michael, I also verified the patch in SOLR-14471 with 8.4.1 and it fixed the issue with shards.preference=replica.location:local,replica.type:TLOG in my setting. Thanks! Wei On Thu, May 21, 2020 at 12:09 PM Phill Campbell wrote: > Yes, JVM heap settings. > > > On May 19, 2020,

Re: Unbalanced shard requests

2020-05-19 Thread Wei
Hi Phill, What is the RAM config you are referring to, JVM size? How is that related to the load balancing, if each node has the same configuration? Thanks, Wei On Mon, May 18, 2020 at 3:07 PM Phill Campbell wrote: > In my previous report I was configured to use as much RAM as possi

Re: Unbalanced shard requests

2020-05-11 Thread Wei
l Gibney wrote: > FYI: https://issues.apache.org/jira/browse/SOLR-14471 > Wei, assuming you have only TLOG replicas, your "last place" matches > (to which the random fallback ordering would not be applied -- see > above issue) would be the same as the "first place" matches selec

Re: Unbalanced shard requests

2020-05-08 Thread Wei
for shard requests are the first node in each shard returned from the CLUSTERSTATUS api. Seems something wrong with shuffling equally compared nodes when shards.preference is set. Will report back if I find more. On Mon, Apr 27, 2020 at 5:59 PM Wei wrote: > Hi Eric, > > I am meas

solr payloads performance

2020-05-08 Thread Wei
terms of filtering, sorting or faceting, how would query performance compare between the two? Thanks, Wei

Re: Unbalanced shard requests

2020-04-27 Thread Wei
:TLOG Nothing seems to cause the strange behavior. Any suggestions how to debug this? -Wei On Mon, Apr 27, 2020 at 5:42 PM Erick Erickson wrote: > Wei: > > How are you measuring utilization here? The number of incoming requests or > CPU? > > The leader for each shard are certai

Unbalanced shard requests

2020-04-27 Thread Wei
idle. There is no change in shard handler configuration: 3 3 500 What could cause the unbalanced internal distributed request? Thanks in advance. Wei

Re: Early termination in Lucene 8

2020-01-23 Thread Wei
uce fewer counts. > > On Thu, Jan 23, 2020 at 2:11 AM Wei wrote: > > > Hi, > > > > I am excited to see Lucene 8 introduced BlockMax WAND as a major speed > > improvement https://issues.apache.org/jira/browse/LUCENE-8135. My > > question > > is, how does it

Early termination in Lucene 8

2020-01-22 Thread Wei
on on this. Any pointer is greatly appreciated. Best, Wei

Convert javabin to json

2019-11-27 Thread Wei
til https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/noggit/JSONUtil.java but seems it is not able to convert parts of the query response such as facet. Are there any other options available? Thanks, Wei

Re: Updates blocked in Tlog solr cloud?

2019-11-25 Thread Wei
wondering if the commits could be caused by the leader-initiated recovery process. Will the Tlog leader do extra commits for the replica to sync up in the recovery process? Best, Wei On Tue, Nov 19, 2019 at 1:22 PM Wei wrote: > Hi Erick, > > I observed that the update request rate dropped fr

Re: Lucene optimization to disable hit count

2019-11-20 Thread Wei
Thanks! Looking forward to having this feature in Solr. On Wed, Nov 20, 2019 at 5:30 PM Tomás Fernández Löbbe wrote: > Not yet: > https://issues.apache.org/jira/browse/SOLR-13289 > > On Wed, Nov 20, 2019 at 4:57 PM Wei wrote: > > > Hi, > > > > I see this lucene o

Lucene optimization to disable hit count

2019-11-20 Thread Wei
Hi, I see this lucene optimization to disable hit counts for better query performance: https://issues.apache.org/jira/browse/LUCENE-8060 Is the feature available in Solr 8.3? Thanks, Wei

Re: Updates blocked in Tlog solr cloud?

2019-11-19 Thread Wei
g for the time out? Also the bad tlog replica is not reachable at the time, so we did a DELETEREPLICA command with collections API to remove it from the cloud. Thanks, Wei On Tue, Nov 19, 2019 at 5:52 AM Erick Erickson wrote: > How long are updates blocked and how did the tlog replica on

Updates blocked in Tlog solr cloud?

2019-11-18 Thread Wei
requests to the cloud are blocked? Does leader need to wait for response from each replica to inform client that update is successful? Best, Wei

Re: How to block expensive solr queries

2019-10-10 Thread Wei
On Wed, Oct 9, 2019 at 9:59 AM Wei wrote: > Thanks all. I debugged a bit and see timeAllowed does not limit stats > call. Also I think it would be useful for solr to support a white list or > black list of operations as Toke suggested. Will create jira for it. > Currently seems the

Re: How to block expensive solr queries

2019-10-07 Thread Wei
Hi Mikhail, Yes, I have the timeAllowed parameter configured; still, in this case it doesn't seem to prevent the stats request from blocking other normal queries. Is it possible to drop the request before Solr executes it? Maybe at the Jetty request filter? Thanks, Wei On Mon, Oct 7, 2019

How to block expensive solr queries

2019-10-07 Thread Wei
We are using solr 7.6.2 with a 10 shard cloud set up. Is there a way to block certain solr queries based on url pattern? i.e. ignore the stats.calcdistinct request in this case. Thanks, Wei
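One way to picture "blocking by URL pattern" is a check that runs in front of Solr, for example in a proxy or gateway. The sketch below is an assumption about how such a check could look, not a Solr feature and not what the thread settled on; the function name `is_blocked`, the host, and the collection name are hypothetical. It only inspects the query string for stats.calcdistinct=true, including the per-field f.<field>. prefix form; other spellings of the same request would need their own checks.

```python
from urllib.parse import urlsplit, parse_qs

# Parameters that should cause a request to be rejected when set to true.
BLOCKED_PARAMS = {"stats.calcdistinct"}


def is_blocked(url):
    """Return True if the request URL enables a blocked parameter."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    for name, values in query.items():
        # Match the plain name and per-field forms such as
        # f.price.stats.calcdistinct.
        matches = any(name == p or name.endswith("." + p) for p in BLOCKED_PARAMS)
        if matches and any(v.lower() == "true" for v in values):
            return True
    return False


# Hypothetical host and collection, for illustration only.
base = "http://localhost:8983/solr/collection1/select?q=*:*&stats=true&stats.field=price"
print(is_blocked(base + "&stats.calcdistinct=true"))
print(is_blocked(base))
```

A gateway would call such a check before forwarding the request and return an error to the client instead of passing the query on.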

Re: Function Query with multi-value field

2019-07-13 Thread Wei
Any suggestion? On Thu, Jul 11, 2019 at 3:03 PM Wei wrote: > Hi, > > I have a question regarding function query that operates on multi-value > fields. For the following field: > > multivalued="true"/> > > Each value is a hex string representation of

Function Query with multi-value field

2019-07-11 Thread Wei
operates on all values of the field? Given color S in query, how to calculate the similarities between S and C1/C2/C3 and find which one is the closest? I checked https://lucene.apache.org/solr/guide/6_6/function-queries.html but didn't see an example. Thanks, Wei

Re: Solr 7.7 autoscaling trigger

2019-07-09 Thread zhenyuan wei
Hi, Maybe SOLR-12715 and SOLR-12716 can help you. Mark Thill wrote on Tue, Jul 9, 2019 at 2:42 AM: > My scenario is: > >- 60 GB collection >- 2 shards of ~30GB >- Each shard having 2 replicas so I have a backup >- So I have 4 nodes with each node holding a single core > > My goal is to have autosc

A consistency of split shard bug in v7.3.1-release

2019-07-02 Thread zhenyuan wei
Hi all, I have a collection1 with 8 shards, each shard's replicationFactor=1. I have an application adding 6000w (60 million) documents, with infinite retry if any exception is caught. That is to say, in the end a query of *:* should find 60 million docs. Normally all is well, but if at the same time, a

Re: Solr filter query on text fields

2019-06-25 Thread Wei
Thanks Erick for the clarification. How does the ps work for fq? I configured ps=4 for q, it doesn't apply to fq though. For phrase queries in fq seems ps=0 is used. Is there a way to config it for fq also? Best, Wei On Tue, Jun 25, 2019 at 9:51 AM Erick Erickson wrote: > q a

Re: Solr filter query on text fields

2019-06-24 Thread Wei
e the values for relate parameters such as ps? Thanks, Wei On Mon, Jun 24, 2019 at 4:51 PM Shawn Heisey wrote: > On 6/24/2019 5:37 PM, Wei wrote: > > stored="true"/> > > I'm assuming that the asterisks here are for emphasis, that they are not > actually pre

Solr filter query on text fields

2019-06-24 Thread Wei
ream bar” and “vanilla ice cream” , but does not match for “ice cold cream”. The results seem neither exact match nor phrase match. What's the expected behavior for fq on text fields? I have tried to look into the solr docs but there is no clear explanation. Thanks, Wei

Hi, how to deal with a shard in recovery_failed status?

2019-06-17 Thread zhenyuan wei
Hi all, I use the solr-7.3.1 release. When splitting shard1 into shard1_0 and shard1_1, I encountered an OOM error, and then shard1_0 and shard1_1 published a status of recovery_failed. How do I deal with a shard in recovery_failed status? Remove shard1_0 and shard1_1 and then split shard1 again? Or is there another way to retry?

Mistake assert tips in FST builder ?

2019-04-15 Thread zhenyuan wei
Hi, With current newest version, 9.0.0-snapshot,In Builder.UnCompileNode.addArc() function, found this line: assert numArcs == 0 || label > arcs[numArcs-1].label: "arc[-1].label=" + arcs[numArcs-1].label + " new label=" + label + " numArcs=" + numArcs; Maybe assert tips is : assert numArcs ==
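The assertion quoted above requires each new arc label to be strictly greater than the previous one. The sketch below restates that invariant in Python for illustration only; the class `ArcList` and its method names are hypothetical and are not Lucene's Builder API.

```python
class ArcList:
    """Hypothetical stand-in that only tracks arc labels (not Lucene code)."""

    def __init__(self):
        self.labels = []

    def add_arc(self, label):
        num_arcs = len(self.labels)
        # Same condition as the quoted assert: the first arc is always allowed,
        # later arcs must carry a strictly larger label than the last one.
        if not (num_arcs == 0 or label > self.labels[num_arcs - 1]):
            raise AssertionError(
                f"arc[-1].label={self.labels[num_arcs - 1]} "
                f"new label={label} numArcs={num_arcs}"
            )
        self.labels.append(label)


arcs = ArcList()
for label in (3, 7, 9):
    arcs.add_arc(label)
print(arcs.labels)
```

Adding a label that is equal to or smaller than the last one raises the error, and the message reports the previous label, the new label, and the arc count, as in the quoted line.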

Question for separate query and updates with TLOG and PULL replicas

2019-04-10 Thread Wei
completely separate query and updates, I think that I might need to have the load-balancer set up to include only the PULL replicas. Is there any other option? Thanks, Wei

Re: solr 7 optimize with Tlog/Pull replicas

2019-03-12 Thread Wei
segment just to add a 1G so > having multiple segments < 20G is perfectly normal. > > Best, > Erick > > > On Mar 10, 2019, at 10:36 PM, Wei wrote: > > > > A side question, for heavy bulk indexing, what's the recommended setting > > for auto commit? As th

Re: solr 7 optimize with Tlog/Pull replicas

2019-03-10 Thread Wei
A side question, for heavy bulk indexing, what's the recommended setting for auto commit? As there is no query needed during the bulk indexing process, I have auto soft commit disabled. Is there any side effect if I also disable auto commit? On Sun, Mar 10, 2019 at 10:22 PM Wei

Re: solr 7 optimize with Tlog/Pull replicas

2019-03-10 Thread Wei
10 20480 But in the end I see multiple segments much smaller than the 20GB limit. In 7.6 is it required to explicitly set the number of segments to 1? e.g shall I use /update?optimize=true&waitSearcher=false&maxSegments=1 Best, Wei On Fri, Mar 8, 2019 at 12:29 PM Erick Erickson
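The optimize request mentioned above can be assembled from its three parameters. This is a minimal sketch that only builds the URL; the base URL and the collection name `collection1` are assumptions for illustration, and nothing is sent to Solr.

```python
from urllib.parse import urlencode


def optimize_url(base_url, collection, max_segments=1, wait_searcher=False):
    """Build the /update URL for an explicit optimize down to max_segments."""
    params = {
        "optimize": "true",
        "waitSearcher": str(wait_searcher).lower(),
        "maxSegments": str(max_segments),
    }
    return f"{base_url}/{collection}/update?{urlencode(params)}"


# Hypothetical host and collection, for illustration only.
print(optimize_url("http://localhost:8983/solr", "collection1"))
```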

solr 7 optimize with Tlog/Pull replicas

2019-03-08 Thread Wei
finished optimization to a single segment, however all the leader replicas still have multiple segments. Previously, in the all-NRT replica cloud, I saw optimization triggered on all nodes. Is the optimization process different with Tlog/Pull replicas? Best, Wei

Re: The parent shard will never be delete/clean?

2019-01-29 Thread zhenyuan wei
uestion directly though, no. Split-shard creates two > > new subshards, but it doesn't do anything to remove or cleanup the > > original shard. The original shard remains with its data and will > > delegate future requests to the result shards. > > > > Hope that h

The parent shard will never be delete/clean?

2019-01-22 Thread zhenyuan wei
Hi, If I split shard1 into shard1_0 and shard1_1, will the parent shard1 never be cleaned up? Best, Tinswzy

Re: Questions for SynonymGraphFilter and WordDelimiterGraphFilter

2019-01-08 Thread Wei
bump.. On Mon, Jan 7, 2019 at 11:53 AM Wei wrote: > Thanks Thomas. You mentioned "Also there is no need for the > FlattenGraphFilter", that's quite interesting because the Solr > documentation says it's mandatory for indexing: > https://lucene.apache.org/solr/gu

Re: Questions for SynonymGraphFilter and WordDelimiterGraphFilter

2019-01-07 Thread Wei
Thanks Thomas. You mentioned "Also there is no need for the FlattenGraphFilter", that's quite interesting because the Solr documentation says it's mandatory for indexing: https://lucene.apache.org/solr/guide/7_6/filter-descriptions.html. Is there any more explanation for this

Questions for SynonymGraphFilter and WordDelimiterGraphFilter

2019-01-04 Thread Wei
dvance for your input. Thanks, Wei

Re: Is there a common tool for SOLR benckmark?

2019-01-01 Thread zhenyuan wei
ej...@eolya.fr> > wrote: > > > Hi, > > > > There are the powerfull JMeter obviously and also SolrMeter ( > > https://github.com/tflobbe/solrmeter). > > > > Regards > > > > Dominique > > > > > > On Thu, Dec 20, 2018 at 03:17, zhenyua

Is there a common tool for SOLR benckmark?

2018-12-19 Thread zhenyuan wei
Hi all, Is there a common tool for benchmarking SOLR? YCSB is not very suitable for SOLR. Is there currently a good benchmark tool for SOLR? Best, TinsWzy

The result of Query all will change all the time?

2018-12-07 Thread zhenyuan wei
Hi all, I indexed 4810 documents and ran some kill-process tests. After all indexing was done, I ran a query for all documents many times and found that numFound is not always the same number. Output example: "responseHeader":{ "zkConnected":true, "status":0, "QTime":8, "params":{ "q ":"*:*", "_":"1544171213624"}}, "res

solr optimize command

2018-11-28 Thread Wei
there are more than 1 segments. Is the optimize command async? What is the best approach to validate that optimize is truly completed? Thanks, Wei

Re: Retrieve field from docValues

2018-11-06 Thread Wei
Also I notice this issue is still open: https://issues.apache.org/jira/browse/SOLR-10816 Does that mean we still need to have stored=true for uniqueKey? On Tue, Nov 6, 2018 at 2:14 PM Wei wrote: > I see there is also a docValuesFormat option, what's the default for this > setting?

Re: Retrieve field from docValues

2018-11-06 Thread Wei
I see there is also a docValuesFormat option, what's the default for this setting? Performance wise is it good to set docValuesFormat="Memory" ? Best, Wei On Tue, Nov 6, 2018 at 11:55 AM Erick Erickson wrote: > Yes, "the most efficient possible" is associated wit

Re: Retrieve field from docValues

2018-11-06 Thread Wei
the uniqueKey field need to be always docValues? Since it is used in the first phase of distributed search. Thanks, Wei On Tue, Nov 6, 2018 at 8:30 AM Erick Erickson wrote: > 2. "it depends". Solr will try to do the most efficient thing > possible. If _all_ the fields are

Retrieve field from docValues

2018-11-05 Thread Wei
Solr retrieve id from docValues instead of stored field? if fl= id, title, score, both id and title are single value field: Do I need to have all fields stored="false" docValues="true" to make solr retrieve from docValues only? I am using Solr 6.6. Thanks, Wei

Re: Index optimization takes too long

2018-11-03 Thread Wei
Thanks everyone! I checked the system metrics during the optimization process. CPU usage is quite low, there is no I/O wait, and memory usage is not much different from before the docValues change. So I wonder what could be the bottleneck. Thanks, Wei On Sat, Nov 3, 2018 at 1:38 PM Erick

Index optimization takes too long

2018-11-02 Thread Wei
maxMergeAtOnceExplicit because the default 30 could be too low: 100 But it doesn't seem to help. Any suggestions? Thanks, Wei

Is that a simple way to start a Mini Solr cluster in other project unittest?

2018-10-12 Thread zhenyuan wei
Hi all, I find it too troublesome to start a Solr mini cluster in my project; MiniSolrCloudCluster has too many properties tied to the folders of the Solr source project. Is there a simple way to start a mini Solr cluster outside the Solr project, such as in my own project?

Re: Is that solr supports multi version operations?

2018-09-19 Thread zhenyuan wei
to solr, it will retry writing to Solr indefinitely. If a write to Solr fails and the server is killed, I can use the transaction log of the true data store to replay and write to Solr again. Shawn Heisey wrote on Wed, Sep 19, 2018 at 10:38 PM: > On 9/18/2018 8:11 PM, zhenyuan wei wrote: > > Hi all, >

Re: Is that solr supports multi version operations?

2018-09-19 Thread zhenyuan wei
Walter Underwood > > wun...@wunderwood.org > > http://observer.wunderwood.org/ (my blog) > > > >> On Sep 18, 2018, at 7:11 PM, zhenyuan wei wrote: > >> > >> Hi all, > >>add solr document with overwrite=false will keepping multi version >

Is that solr supports multi version operations?

2018-09-18 Thread zhenyuan wei
Hi all, adding a Solr document with overwrite=false will keep multiple versions of the document. My questions are: 1. How do I search for the newest documents, and with what options? 2. How do I delete the documents whose version is older than the newest version? For example: { "id":"1002", "name":["james"]

Does solr support rollback or any method to do the same job?

2018-09-18 Thread zhenyuan wei
Hi all, Does Solr support rollback or any method to do the same job? For example, after updating, adding, or deleting a document, can I roll the changes back? Best~ TinsWzy

Re: 20180917-Need Apache SOLR support

2018-09-18 Thread zhenyuan wei
requests per second. Shawn Heisey wrote on Tue, Sep 18, 2018 at 12:07 PM: > On 9/17/2018 9:05 PM, zhenyuan wei wrote: > > Is that means: Small amount of shards gains better performance? > > I also have a usecase which contains 3 billion documents,the collection > > contains 60 shard now. Is th

Re: 20180917-Need Apache SOLR support

2018-09-17 Thread zhenyuan wei
Does that mean a smaller number of shards gives better performance? I also have a use case with 3 billion documents, and the collection has 60 shards now. Would 10 shards be better than 60? Shawn Heisey wrote on Tue, Sep 18, 2018 at 12:04 AM: > On 9/17/2018 7:04 AM, KARTHICKRM wrote: > > Dear SO

Re: Is that a mistake or bug?

2018-09-17 Thread zhenyuan wei
; > } > > Got it now? :) > > Petr > ______ > > From: "zhenyuan wei" > > To: solr-user@lucene.apache.org > > Date: 03.09.2018 11:21 > > Subject: Re: Is that a mistake or bug? > > > >

preferLocalShards setting

2018-09-06 Thread Wei
ach server will only host 2 of the 5 shards( 2 JVMs per server, each JVM have one replica from different shards). Is it useful to set preferLocalShards=true in this case? Thanks, Wei

Re: How long does a query?q=field1:2312 should cost? exactly hit one document.

2018-09-03 Thread zhenyuan wei
":0.0}, "facet_module":{ "time":0.0}, "mlt":{ "time":0.0}, "highlight":{ "time":0.0}, "stats":{ "time":0.0}, "expand":{ "

Re: How long does a query?q=field1:2312 should cost? exactly hit one document.

2018-09-03 Thread zhenyuan wei
":0.0}, "facet_module":{ "time":0.0}, "mlt":{ "time":0.0}, "highlight":{ "time":0.0}, "stats":{ "time":0.0}, "expand":{ "

How long does a query?q=field1:2312 should cost? exactly hit one document.

2018-09-03 Thread zhenyuan wei
Hi, I am curious: how long should a query q=field1:2312 take when it matches exactly one document? Of course, this assumes the query does not hit the queryResultCache. In fact my QTime is 150ms+, which is too long.

Re: Is that a mistake or bug?

2018-09-03 Thread zhenyuan wei
; > The only line that could be improved, is probably replacing > "Boolean.FALSE" by simply "false", but that is really a minor thing... > > Regards > > PB > ______ > > From: "zhenyuan wei"

Re: Is that a mistake or bug?

2018-09-03 Thread zhenyuan wei
> > On Mon, Sep 3, 2018 at 9:09 AM zhenyuan wei wrote: > > > Yeah,got it~. So the QueryResult.segmentTerminatedEarly maybe a boolean, > > instead of Boolean, is better, right? > > > > Mikhail Khludnev wrote on Mon, Sep 3, 2018 at 1:36 PM: > > > > > It's neit

Re: Is that a mistake or bug?

2018-09-02 Thread zhenyuan wei
in result output. see > ResponseBuilder.setResult(QueryResult). > So, if cmd requests early termination, it sets false by default, enabling > "false" output even it won't be the case. And later it might be flipped to > true. > > > On Mon, Sep 3, 2018 at 5:57 AM zheny

Re: question for rule based replica placement

2018-09-02 Thread Wei
'? I cannot find more relevant documentation on how to configure and customize 'snitch'. Thanks, Wei On Sun, Sep 2, 2018 at 9:30 PM Erick Erickson wrote: > You need to provide a "snitch" and define a rule appropriately. This > is a variant of "rack awareness"

Is that a mistake or bug?

2018-09-02 Thread zhenyuan wei
Hi all, I saw the code like following: QueryResult result = new QueryResult(); cmd.setSegmentTerminateEarly(params.getBool(CommonParams.SEGMENT_TERMINATE_EARLY, CommonParams.SEGMENT_TERMINATE_EARLY_DEFAULT)); if (cmd.getSegmentTerminateEarly()) { result.setSegmentTerminatedEarly(Boolean.FAL

question for rule based replica placement

2018-09-02 Thread Wei
for defining the physical host? Thanks, Wei

Re: Multiple solr instances per host vs Multiple cores in same solr instance

2018-08-31 Thread Wei
as you want. > > The node placement rules are primarily intended for automated or very large > setups. Manually placing replicas is simpler for limited numbers. > > Best, > Erick > On Sun, Aug 26, 2018 at 8:10 PM Wei wrote: > > > > Thanks Shawn. When using multiple Sol

Re: “solr.data.dir” can only config a single directory

2018-08-29 Thread zhenyuan wei
Oh, my fault! Sorry for that, I should have said somebody, like me~ Bram Van Dam wrote on Wed, Aug 29, 2018 at 3:28 PM: > On 28/08/18 08:03, zhenyuan wei wrote: > > But this is not a common way to do so, I mean, nobody want to ADDREPLICA > > after collection was created. > > I wouldn't say "nobody".. >

Re: “solr.data.dir” can only config a single directory

2018-08-28 Thread zhenyuan wei
Pretty cool. I created an issue to put this discussion into practice: https://issues.apache.org/jira/browse/SOLR-12713 Best, TinsWzy Erick Erickson wrote on Tue, Aug 28, 2018 at 11:51 PM: > Patches welcome. > > On Mon, Aug 27, 2018, 23:03 zhenyuan wei wrote: > > > But this is n

Re: “solr.data.dir” can only config a single directory

2018-08-27 Thread zhenyuan wei
d you can have as many replicas per Solr instance > as makes sense. > > Best, > Erick > On Mon, Aug 27, 2018 at 8:48 PM zhenyuan wei wrote: > > > > @Christopher Schultz > > So you mean that one 4TB disk is the same as four 1TB disks ? > > HDFS、cassandra、

Re: “solr.data.dir” can only config a single directory

2018-08-27 Thread zhenyuan wei
xplain Christopher Schultz wrote on Tue, Aug 28, 2018 at 11:16 AM: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Shawn, > > On 8/27/18 22:37, Shawn Heisey wrote: > > On 8/27/2018 8:29 PM, zhenyuan wei wrote: > >> I found the “solr.data.dir” can only config a single directory. >

Re: “solr.data.dir” can only config a single directory

2018-08-27 Thread zhenyuan wei
wrote: > On 8/27/2018 8:29 PM, zhenyuan wei wrote: > > I found the “solr.data.dir” can only config a single directory. I > > think it is necessary to be config multi dirs,such as > > ”solr.data.dir:/mnt/disk1,/mnt/disk2,/mnt/disk3" , due to one disk > overload >

“solr.data.dir” can only config a single directory

2018-08-27 Thread zhenyuan wei
Hi all, I found that “solr.data.dir” can only be configured with a single directory. I think it needs to support multiple directories, such as "solr.data.dir:/mnt/disk1,/mnt/disk2,/mnt/disk3", because a single disk can be overloaded or run out of capacity. Is there a reason why this is not supported? Best, TinsWzy

Re: An exception when running Solr on HDFS,why a solr server can not recognize the write.lock file is created by itself before?

2018-08-27 Thread zhenyuan wei
@Shawn Heisey Yeah, deleting the "write.lock" files manually worked in the end. @Walter Underwood Have you done any performance evaluation of Solr on HDFS vs LocalFS recently? Shawn Heisey wrote on Tue, Aug 28, 2018 at 4:10 AM: > On 8/26/2018 7:47 PM, zhenyuan wei wrote: > > I found an exception w

Re: Multiple solr instances per host vs Multiple cores in same solr instance

2018-08-27 Thread Wei
Thanks Bernd. Do you have preferLocalShards=true in both cases? Do you notice CPU/memory utilization difference between the two deployments? How many servers did you use in total? I am curious what's the bottleneck for the one instance and 3 cores configuration. Thanks, Wei On Mon, A

Re: An exception when running Solr on HDFS,why a solr server can not recognize the write.lock file is created by itself before?

2018-08-27 Thread zhenyuan wei
Erickson wrote on Mon, Aug 27, 2018 at 11:41 AM: > Because HDFS doesn't follow the file semantics that Solr expects. > > There's quite a bit of background here: > https://issues.apache.org/jira/browse/SOLR-8335 > > Best, > Erick > On Sun, Aug 26, 2018 at 6:47 PM zhenyuan wei

Re: Multiple solr instances per host vs Multiple cores in same solr instance

2018-08-26 Thread Wei
will be able to get better CPU utilization on multi-core server? Thanks, Wei On Sun, Aug 26, 2018 at 4:37 AM Shawn Heisey wrote: > On 8/26/2018 12:00 AM, Wei wrote: > > I have a question about the deployment configuration in solr cloud. When > > we need to increase the number of

An exception when running Solr on HDFS,why a solr server can not recognize the write.lock file is created by itself before?

2018-08-26 Thread zhenyuan wei
Hi all, I found an exception when running Solr on HDFS. The details are: with Solr running on HDFS and document updates running continuously, kill -9 the Solr JVM or reboot/shut down the Linux OS, then restart everything. The exception looks like: 2018-08-26 22:23:12.529 ERROR (coreContainerWorkExecutor-2-thr

Multiple solr instances per host vs Multiple cores in same solr instance

2018-08-25 Thread Wei
per host, and have multiple cores(shards) in the same solr instance. Which would be better performance wise? For the first option I think JVM size for each solr instance can be smaller, but deployment is more complicated? Are there any differences for cpu utilization? Thanks, Wei

Why queryNorm is negative?

2018-08-24 Thread Wei Zhao
I got the debug info like this. It turns out that queryNorm is negative so the total solr score is negative too. It is not really a problem for me. But I'm curious why it can even be negative after reading its definition (1/sumOfSquaredWeights)? -1.1151254E-4 = (MATCH) product of: -0.0032338637

How to hit filterCache?if filterQuery is a sub range query of another already cache range filterQuery

2018-08-24 Thread zhenyuan wei
Hi All, I am confused about how to hit the filterCache. Suppose a filter query with range [3 to 100] is not yet in the filterCache, but the filterCache already holds a filter query with range [2 to 100]. My question is: will the [3 to 100] filter query fetch its DocSet from the cached [2 to 100] entry?

Re: How to trace one query?the debug/debugQuery info are not enough to find out why a query is slow

2018-08-24 Thread zhenyuan wei
Count on filterCache, find alternatives to > wildcard query and more. > > But all in all, I'd be very very satisfied with those low response times > given the size of your data. > > -- > Jan Høydahl, search solution architect > Cominvent AS - www.cominvent.com > &

Re: How to trace one query?the debug/debugQuery info are not enough to find out why a query is slow

2018-08-23 Thread zhenyuan wei
Thanks for your detailed answer, @Shawn. Yes, I run the query in SolrCloud mode, and my collection has 20 shards; each shard is 30~50GB. There are 4 Solr servers, each Solr JVM uses 6GB; there are 4 HDFS datanodes too, each datanode JVM uses 2.5GB. There are 4 Linux server hosts too, each with 16 core/32GB RAM/1600G

Re: How to trace one query?the debug/debugQuery info are not enough to find out why a query is slow

2018-08-23 Thread zhenyuan wei
I have 4 Solr servers, each allocated 6GB. My dataset on HDFS is 787GB, 2 billion documents in total; each document is 300 bytes. The following is my cache-related configuration. 20 200 zhenyuan wei wrote on Thu, Aug 23, 2018 at 5:41 PM: > Thank you very much to answer. @Jan Høydahl > My query is simple

Re: How to trace one query?the debug/debugQuery info are not enough to find out why a query is slow

2018-08-23 Thread zhenyuan wei
ponent spends the most time? > With shards.info=true you see what shard is the slowest, if your index is > sharded. > With echoParams=all you get the full list of query parameters in use, > perhaps you spot something? > If you start Solr with -v option then you get more verbose logg
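The diagnostic parameters quoted above (shards.info=true and echoParams=all), together with the debugQuery flag named in the thread subject, can be combined on a single request. A minimal sketch that only builds the URL; the host, the collection name, and the example query are assumptions for illustration.

```python
from urllib.parse import urlencode


def diagnostic_url(select_url, query):
    """Append per-request diagnostic parameters to a /select URL."""
    params = [
        ("q", query),
        ("debugQuery", "true"),   # per-query debug output
        ("shards.info", "true"),  # per-shard details, to spot the slowest shard
        ("echoParams", "all"),    # echo every parameter in effect
    ]
    return select_url + "?" + urlencode(params)


# Hypothetical host, collection, and query, for illustration only.
url = diagnostic_url("http://localhost:8983/solr/collection1/select", "field1:2312")
print(url)
```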

How to trace one query?the debug/debugQuery info are not enough to find out why a query is slow

2018-08-23 Thread zhenyuan wei
Hi all, I care about query performance, but I do not know how to find out why a query is so slow. *How to trace one query?* The debug/debugQuery info is not enough to find out why a query is slow. Thanks a lot~

4.10 default ranking scorer, BM25 or classic? How to change that?

2018-08-16 Thread Wei Zhao
Hi, Does anyone know what the default scorer for 4.10 is? BM25 or classic tf-idf? I have been trying to change that, in cloud mode. I have managed to change the schema.xml in the zookeeper to add the following lines: The commented line was also tried. So I have tried different syntax, usin

Re: Solr timeAllowed metric

2018-08-06 Thread Wei
Thanks Mikhail! Is traditional facet subject to timeAllowed? On Mon, Aug 6, 2018 at 3:46 AM, Mikhail Khludnev wrote: > One note: enum facets might be stopped by timeAllowed. > > On Mon, Aug 6, 2018 at 1:45 PM Mikhail Khludnev wrote: > > > Hello, Wei. > > > > "

Solr timeAllowed metric

2018-08-03 Thread Wei
Expansion and Document collection" . Does that mean Solr will not abort the request if timeAllowed is exceeded during the scoring process? What are the components (query, facet, stats, debug etc) this metric is effectively used? Thanks, Wei

solr config ganglia reporter encounter an exception

2018-07-11 Thread zhenyuan wei
Hi all, My solr version is release7.3.1, and I follow the solr 7.3.0 ref guide to config ganglia reporter in solr.xml as below: .. emr-header-1 8649 than start solr service and encounted the execption like: 2018-07-11 17:47:31.246 ERROR (main) [ ] o.a

Re: solr filter query on text field

2018-07-11 Thread Wei
btw, is there any difference if the fq field is a string field vs text field? On Wed, Jul 11, 2018 at 11:59 AM, Wei wrote: > Thanks Erick and Andrea! If my default operator is OR, fq= > my_text_field:(Jurassic park the movie) is equivalent to > my_text_field:(Jurassic > OR pa
