Re: Faster Vector Highlight

2020-06-06 Thread Yasufumi Mizoguchi
Hi, Kaya. How about using the hl.maxAnalyzedChars parameter? Thanks, Yasufumi > On 2020/06/06 at 5:56 PM, Kayak28 wrote: > > Hello, Solr Community: > > I have a question about FasterVectorHighlight. > I know Solr highlight does not return highlighted text if the text in the > highlighted field is too lo

Re: How to get boosted field and values?

2020-03-24 Thread Yasufumi Mizoguchi
Hi, I think the "debug" query parameter or the "explain" document transformer will help you see which fields and query conditions are boosted. https://lucene.apache.org/solr/guide/7_5/common-query-parameters.html https://lucene.apache.org/solr/guide/7_5/transforming-result-documents.html Thanks, Yas

Re: Japanese Query Unexpectedly Misses

2019-10-18 Thread Yasufumi Mizoguchi
Hi, There are two solutions as far as I know. 1. Use the userDictionary attribute. I think this is the common and safe way. Add the userDictionary attribute to your tokenizer configuration and define the userDictionary file as follows. Tokenizer: userDictionary(lang/userdict_ja.txt in above setting): 日本人,日本

Re: Throughput does not increase in spite of low CPU usage

2019-10-01 Thread Yasufumi Mizoguchi
cumentCache. Why not just comment out Cache > settings in solrconfig.xml? > > > > On Tue, 1 Oct 2019 at 15:39, Yasufumi Mizoguchi > wrote: > > > Thank you for your reply. > > > > The following is the current load test setup. > > > > * Load test program :

Re: Throughput does not increase in spite of low CPU usage

2019-10-01 Thread Yasufumi Mizoguchi
That is difficult for me to answer. My customer asked me to achieve 1000 qps with a single Solr instance. Thanks, Yasufumi. On Tue, Oct 1, 2019 at 14:59, Jörn Franke wrote: > Why do you need 1000 qps? > > > > On 30.09.2019 at 07:45, Yasufumi Mizoguchi < > yasufumi0...@gmail.com> wrote:

Re: Throughput does not increase in spite of low CPU usage

2019-10-01 Thread Yasufumi Mizoguchi
Thank you for your reply. I will try changing NewRatio. Thanks, Yasufumi. On Tue, Oct 1, 2019 at 11:19, Deepak Goel wrote: > Hello > > Can you please try increasing 'new size' and 'max new size' to 1GB+? > > Deepak > > On Mon, 30 Sep 2019, 13:35 Yasufumi Mizoguchi,

Re: Throughput does not increase in spite of low CPU usage

2019-10-01 Thread Yasufumi Mizoguchi
> > > On Sep 30, 2019, at 1:28 AM, Yasufumi Mizoguchi > wrote: > > > > Hi, Ere. > > > > Thank you for valuable feedback. > > I will try Xmx31G and Xms31G instead of current ones. > > > > Thanks and Regards, > > Yasufumi. > > > >

Re: Throughput does not increase in spite of low CPU usage

2019-10-01 Thread Yasufumi Mizoguchi
Ah, sorry. Not JUnit, we use JMeter. Thanks, Yasufumi On Tue, Oct 1, 2019 at 19:08, Yasufumi Mizoguchi wrote: > Thank you for your reply. > > The following is the current load test setup. > > * Load test program : JUnit > * The number of Firing hosts : 6 > * [Each host]ThreadGroup.num_th

Re: Throughput does not increase in spite of low CPU usage

2019-10-01 Thread Yasufumi Mizoguchi
, Yasufumi. On Mon, Sep 30, 2019 at 22:18, Shawn Heisey wrote: > On 9/29/2019 11:44 PM, Yasufumi Mizoguchi wrote: > > I am trying some tests to confirm if single Solr instance can perform > over > > 1000 queries per second(!). > > In general, I would never expect a single instance to ha

Re: Throughput does not increase in spite of low CPU usage

2019-10-01 Thread Yasufumi Mizoguchi
ifferent queries that you don’t get them from > the queryResultCache. I had one client who was thrilled they were getting > 3ms response times…. by firing the same query over and over and hitting the > queryResultCache 99.% of the time ;). > > Best, > Erick > > > On

Re: Throughput does not increase in spite of low CPU usage

2019-09-30 Thread Yasufumi Mizoguchi
ctually get better results > with -Xmx31G. For more information, see e.g. > > https://blog.codecentric.de/en/2014/02/35gb-heap-less-32gb-java-jvm-memory-oddities/ > > Regards, > Ere > > Yasufumi Mizoguchi wrote on 30.9.2019 at 11:05: > > Hi, Deepak. > > Thank

Re: Throughput does not increase in spite of low CPU usage

2019-09-30 Thread Yasufumi Mizoguchi
l? > > Deepak > > On Mon, 30 Sep 2019, 11:15 Yasufumi Mizoguchi, > wrote: > > > Hi, > > > > I am trying some tests to confirm if single Solr instance can perform > over > > 1000 queries per second(!). > > > > But now, although CPU usage is 40% o

Throughput does not increase in spite of low CPU usage

2019-09-29 Thread Yasufumi Mizoguchi
Hi, I am running some tests to confirm whether a single Solr instance can handle over 1000 queries per second(!). But now, although CPU usage is 40% or so and iowait is almost 0%, throughput does not increase beyond 60 queries per second. I think there are some bottlenecks around Kernel, JVM, or Solr set

Re: autoGeneratePhraseQueries does not work?

2019-07-25 Thread Yasufumi Mizoguchi
Hi, Since Solr 7.0, the sow (Split-on-Whitespace) parameter has defaulted to false. Because the behavior of the autoGeneratePhraseQueries option depends on that parameter, you should add the sow=true parameter to your query. e.g. q=trigram:&fq=syo_id:1237&debugQuery=on&sow=true Thanks, Yasufumi. 2
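
As a rough illustration of the advice above, the request parameters could be assembled as in the following Python sketch. The field name `trigram` and the filter `syo_id:1237` come from the example in the message; the search phrase `foo bar` and the helper name `build_select_params` are placeholders, since the original query text is not shown.

```python
from urllib.parse import urlencode


def build_select_params(query, filter_query):
    """Return an encoded query string that sets sow=true explicitly."""
    params = {
        "q": query,
        "fq": filter_query,
        "debugQuery": "on",
        # Split on whitespace; per the message, no longer the default since Solr 7.0.
        "sow": "true",
    }
    return urlencode(params)


# "foo bar" is a hypothetical search phrase, not taken from the thread.
query_string = build_select_params("trigram:foo bar", "syo_id:1237")
print(query_string)
```

Building the string with `urlencode` keeps the colon and the space in the query safely escaped, which matters once the phrase contains whitespace.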

Rebalancing shards between some node groups

2019-07-24 Thread Yasufumi Mizoguchi
Hi, community. I am using Solr 7.7.2 in SolrCloud mode. I am looking for a feature to rebalance replicas among node groups. For example: Initial state) Node0 : shard0, shard1 Node1 : shard1, shard2 Node2 : shard2, shard3 Node3 : shard3, shard0 After Re-balancing replicas between two group

Re: Numeric value ignored by EdgeNGramFilterFactory

2019-07-04 Thread Yasufumi Mizoguchi
Hi, EdgeNGramFilterFactory seems to drop tokens shorter than the minGramSize parameter. Check the minGramSize="4" maxGramSize="6" example on the page below. https://lucene.apache.org/solr/guide/8_1/filter-descriptions.html#edge-n-gram-filter So, you should set minGramSize=2 or 1 if you want to keep 7
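
The behaviour described above can be modelled with a small Python sketch. This is only an approximation of leading-edge n-gram generation for illustration, not the Lucene implementation; the function name `edge_ngrams` and the sample tokens are assumptions.

```python
def edge_ngrams(token, min_gram, max_gram):
    """Return the leading-edge n-grams of token, shortest first.

    A token shorter than min_gram yields no grams at all, which models
    the dropping behaviour described in the message.
    """
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


# A one-character numeric token disappears with minGramSize=4 ...
print(edge_ngrams("7", 4, 6))
# ... but survives once minGramSize is lowered to 1.
print(edge_ngrams("7", 1, 6))
print(edge_ngrams("search", 4, 6))
```

In this model a short numeric token simply produces an empty list, which is why lowering the minimum gram size is the suggested fix.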

Re: Sort on PointFieldType

2019-07-04 Thread Yasufumi Mizoguchi
Hi, Which version of Solr are you using? And what are the field settings? The reference guide says that sorting on single-valued *PointType fields requires the docValues="true" option in the field settings. https://lucene.apache.org/solr/guide/8_1/field-types-included-with-solr.html#field-types-included-wit

How to migrate the queries having core-across join and json.facet to SolrCloud

2019-05-27 Thread Yasufumi Mizoguchi
Hi, community. We are trying to migrate from a single Solr instance to SolrCloud with Solr 7.4.0 because the number of documents has grown. We have some join queries running on the current Solr instance and need to migrate them, because join queries have some restrictions when running on SolrCloud. (We cannot use custom do

Cannot get MBean info via JConsole

2019-02-26 Thread Yasufumi Mizoguchi
Hi, I want to access MBean information via JConsole with Solr 6.2. So far, I can get the information via MBeanRequestHandler, but not via JConsole from the same host that Solr runs on. So, how can I do it via JConsole? Any information about this would be greatly appreciated. Thanks, Yasufumi.

Re: What is the benefit of stored="true" in *PointFields

2019-02-06 Thread Yasufumi Mizoguchi
ried stored="false" on some numeric fields, but it was not good. So, I am trying to set stored="false" on some string fields... Thank you for your advice, Yasufumi. On Thu, Feb 7, 2019 at 0:48, Shawn Heisey wrote: > On 2/6/2019 12:42 AM, Yasufumi Mizoguchi wrote: > > I am usin

What is the benefit of stored="true" in *PointFields

2019-02-05 Thread Yasufumi Mizoguchi
Hi, I am using Solr 7.6 and want to reduce index size because of hardware limitations. I have already tried to 1. set the indexed/stored/docValues parameters of unnecessary fields to false in the schema. 2. set compressionMode="BEST_COMPRESSION" in solrconfig. These were quite good, but I still need to reduce index

Re: Per-field slop param in eDisMax

2019-01-25 Thread Yasufumi Mizoguchi
dd an example to the documentation here: > > https://lucene.apache.org/solr/guide/7_6/the-extended-dismax-query-parser.html#using-slop > > Elizabeth > > On Wed, Jan 23, 2019 at 10:30 PM Yasufumi Mizoguchi < > yasufumi0...@gmail.com> > wrote: > > > Hi, > >

Per-field slop param in eDisMax

2019-01-23 Thread Yasufumi Mizoguchi
Hi, I am struggling to set a per-field slop param in the eDisMax query parser with Solr 6.0 and 7.6. What I want to do with eDisMax is similar to the following in the default query parser. * Query string : "aaa bbb" * Target fields : fieldA(TextField), fieldB(TextField) q=fieldA:"aaa bbb"~2 OR fieldB:"aaa

Re: UnifiedHighlighter returns an error when setting hl.maxAnalyzedChars=-1

2019-01-06 Thread Yasufumi Mizoguchi
Hi, I opened a JIRA about this. https://issues.apache.org/jira/browse/SOLR-13121 Thanks, Yasufumi. On Fri, Dec 28, 2018 at 13:39, Yasufumi Mizoguchi wrote: > Hi, > > I faced UnifiedHighlighter error when setting hl.maxAnalyzedChars=-1 in > Solr 7.6. > Here is the procedure for reproducing.

UnifiedHighlighter returns an error when setting hl.maxAnalyzedChars=-1

2018-12-27 Thread Yasufumi Mizoguchi
Hi, I hit a UnifiedHighlighter error when setting hl.maxAnalyzedChars=-1 in Solr 7.6. Here is the procedure to reproduce it. $ bin/solr -e techproducts $ curl -XGET "localhost:8983/solr/techproducts/select?hl.fl=name&hl.maxAnalyzedChars=-1&hl.method=unified&hl=on&q=memory&df=name" I have written

Re: ZooKeeper for Solr 7.6

2018-12-20 Thread Yasufumi Mizoguchi
I also haven't heard of any problems either. > > And do note that 3.4.13 is now being used in Solr branch_7x (which > will be Solr 7.7 if one is released). > > Best, > Erick > > On Wed, Dec 19, 2018 at 4:53 PM Yasufumi Mizoguchi > wrote: > > > > Hi,

Re: ZooKeeper for Solr 7.6

2018-12-19 Thread Yasufumi Mizoguchi
8, 2018 at 12:37 AM Yasufumi Mizoguchi < > yasufumi0...@gmail.com> > wrote: > > > Thank you Jan. > > > > I will try it. > > > > Thanks, > > Yasufumi. > > > > On Tue, Dec 18, 2018 at 17:21, Jan Høydahl wrote: > > > > > That is no problem

Re: ZooKeeper for Solr 7.6

2018-12-18 Thread Yasufumi Mizoguchi
Thank you Jan. I will try it. Thanks, Yasufumi. On Tue, Dec 18, 2018 at 17:21, Jan Høydahl wrote: > That is no problem, doing it myself. > > -- > Jan Høydahl, search solution architect > Cominvent AS - www.cominvent.com > > > On 18 Dec 2018 at 04:34, Yasufumi Mizoguchi > wrote: >

ZooKeeper for Solr 7.6

2018-12-17 Thread Yasufumi Mizoguchi
Hi, I am trying Solr 7.6 in SolrCloud mode. But I found that ZooKeeper 3.4.11 has a critical issue with handling data/log directories. (https://issues.apache.org/jira/browse/ZOOKEEPER-2960) So, I want to know whether using ZooKeeper 3.4.12 with Solr 7.6 is safe. Does anyone know? Thanks, Yasufu

Re: Retrieve field from docValues

2018-11-06 Thread Yasufumi Mizoguchi
Hi, > 1. For schema version 1.6, useDocValuesAsStored=true is default, so there > is no need to explicitly set it in schema.xml? Yes. > 2. With useDocValuesAsStored=true and the following definition, will Solr > retrieve id from docValues instead of stored field? No. AFAIK, if you define both

Re: Java Advanced Imaging (JAI) Image I/O Tools are not installed

2018-11-06 Thread Yasufumi Mizoguchi
Hi, I think this is a PDFBox issue. ( https://pdfbox.apache.org/2.0/dependencies.html ) Thanks, Yasufumi On Tue, Nov 6, 2018 at 16:10, Furkan KAMACI wrote: > Hi All, > > I use Solr 6.5.0 and test OCR capabilities. It OCRs pdf files even it is so > slow. However, I see that error when I check logs: > > o.a.

Re: Solr Cell Input Parameter tika.config

2018-10-25 Thread Yasufumi Mizoguchi
Hello, I could not find the code that parses the tika.config parameter from a Solr request. Maybe the tika.config parameter can only be defined in solrconfig.xml, as follows. tika-config.xml true ignored_ true links ignored_ Thanks, Yasufumi On Fri, Oct 26, 2018 at 7:07, Robertson

Re: Creating CJK bigram tokens with ClassicTokenizer

2018-10-03 Thread Yasufumi Mizoguchi
lysis/common/src/java/org/apache/lucene/analysis/cjk/CJKBigramFilter.java#L64 ) ClassicTokenizer also adds obsolete TOKEN_TYPES "CJ" to the CJ token and "ALPHANUM" to the Korean alphabet, but both are not targets for CJKBigramFilter... Thanks, Yasufumi On Tue, Oct 2, 2018 at 0:05, Shawn H

Creating CJK bigram tokens with ClassicTokenizer

2018-09-30 Thread Yasufumi Mizoguchi
Hi, I am looking for a way to create CJK bigram tokens with ClassicTokenizer. I tried using CJKBigramFilter, but it only supports StandardTokenizer... So, is there any good way to do that? Thanks, Yasufumi

Re: The way to update Managed Resources.

2018-09-30 Thread Yasufumi Mizoguchi
tes it then writes it back out. So > if your sequence is: > manually change something > issue a REST API call > > then the REST call overwrites your manual changes. > HTH, > Erick > On Wed, Sep 26, 2018 at 10:25 AM Yasufumi Mizoguchi > wrote: > > > > Hi,

The way to update Managed Resources.

2018-09-26 Thread Yasufumi Mizoguchi
Hi, I am trying to use ManagedSynonymGraphFilterFactory and want to add a "tokenizerFactory" attribute to Managed Resources (_schema_analysis_synonyms_*.json under the conf directory). To do this, is it OK to update the json file manually? If not, is there any way to update ManagedResources except R

Re: Solr empty highlight entry on match?

2018-09-26 Thread Yasufumi Mizoguchi
Hi, I think the documents might be too long to highlight. See "hl.maxAnalyzedChars" in the reference guide. https://lucene.apache.org/solr/guide/7_4/highlighting.html Try increasing the hl.maxAnalyzedChars value, or using hl.alternateField and hl.maxAlternateFieldLength to create snippets even if Solr fai
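
A minimal sketch of how the parameters named above might be combined into one request, assuming Python on the client side. The parameter names are the ones mentioned in the message; the field name `content`, the numeric values, and the helper name `build_highlight_params` are purely illustrative.

```python
from urllib.parse import urlencode


def build_highlight_params(query, field, max_analyzed_chars, alternate_length):
    """Return an encoded query string with the highlighting fallbacks set."""
    params = {
        "q": query,
        "hl": "on",
        "hl.fl": field,
        # Raise the analysis limit so long documents can still be highlighted.
        "hl.maxAnalyzedChars": max_analyzed_chars,
        # Fall back to a plain excerpt of the same field when no snippet is produced.
        "hl.alternateField": field,
        "hl.maxAlternateFieldLength": alternate_length,
    }
    return urlencode(params)


# "content", 1000000 and 200 are hypothetical values chosen for the example.
highlight_query = build_highlight_params("memory", "content", 1000000, 200)
print(highlight_query)
```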

Re: Solr Import

2018-09-24 Thread Yasufumi Mizoguchi
Hi, I do not have a good idea about No. 1, but No. 2 is clear. > 2. Delta indexing of xml file. > We would be provided with an xml file and that would be imported to Solr > using full-import during the first import. Subsequently we would be > provided with changes made to the xml file (will be pr

Re: Modify Schema for Solr Cloud

2018-09-18 Thread Yasufumi Mizoguchi
Hi, One way is to re-upload the config files via zkcli.sh and reload the collection. See the following. https://lucene.apache.org/solr/guide/7_4/command-line-utilities.html Thanks, Yasufumi. On Tue, Sep 18, 2018 at 14:30, Rathor, Piyush (US - Philadelphia) wrote: > Hi All, > > > > I am new to solr cloud. > > > > Can you

Lucene/Solr bug list caused by JVM's implementations

2018-08-14 Thread Yasufumi Mizoguchi
Hi, I am looking for a list of Lucene/Solr bugs caused by JVM implementations. I found the following, but it does not seem to be updated. https://wiki.apache.org/lucene-java/JavaBugs Where can I check the latest one? Thanks, Yasufumi

How to use tika-OCR in data import handler?

2018-07-23 Thread Yasufumi Mizoguchi
Hi, I am trying to use tika-OCR(Tesseract) in the data import handler and found that processing English documents works quite well. But I am struggling to process other languages such as Japanese, Chinese, etc... So, I want to know how to switch Tesseract-OCR's processing language via data import

Re: How to know the name(url) of documents that data import handler skipped

2018-07-08 Thread Yasufumi Mizoguchi
: > Have you tried changing the log level > https://lucene.apache.org/solr/guide/7_2/configuring-logging.html > > > -- > Rahul Singh > rahul.si...@anant.us > > Anant Corporation > On Jul 8, 2018, 8:54 PM -0500, Yasufumi Mizoguchi , > wrote: > > Hi, > >

How to know the name(url) of documents that data import handler skipped

2018-07-08 Thread Yasufumi Mizoguchi
Hi, I am trying to index files into Solr 7.2 using the data import handler with the onError=skip option. But I am struggling to identify the skipped documents, as the logs do not tell which file was bad. So, how can I find those files? Thanks, Yasufumi

Re: Server refused connection at: http://localhost:xxxx/solr/collectionName

2018-07-02 Thread Yasufumi Mizoguchi
Hi, I think ZooKeeper cannot notice requests to dead nodes if you send requests to Solr nodes directly. It would be better to ask ZooKeeper which Solr nodes are running before sending requests to them, with CloudSolrClient etc... Thanks, Yasufumi On Mon, Jul 2, 2018 at 16:49, Ritesh Kumar wrote: > Hello

Re: Changing Field Assignments

2018-06-11 Thread Yasufumi Mizoguchi
Hi, You can do that by adding the following lines to managed-schema. After adding the above and re-indexing docs, you will get a result like the following. { "responseHeader":{ "status":0, "QTime":0, "params":{ "q":"*:*", "indent": "on", "wt":"json", "_":"1528772599296"}}, "response":{"

Re: Synonym(Graph)FilterFactory seems to ignore tokenizerFactory.* parameters.

2018-06-02 Thread Yasufumi Mizoguchi
Could anyone tell me whether this is a bug or intended behavior? On Thu, May 31, 2018 at 13:53, Yasufumi Mizoguchi wrote: > Hi, community. > > I want to use Synonym(Graph)Filter with JapaneseTokenizer and > NGramTokenizer. > But it turned out that Synonym(Graph)FilterFactory seemed to ignore > tokenizerFa

Synonym(Graph)FilterFactory seems to ignore tokenizerFactory.* parameters.

2018-05-30 Thread Yasufumi Mizoguchi
Hi, community. I want to use Synonym(Graph)Filter with JapaneseTokenizer and NGramTokenizer. But it turned out that Synonym(Graph)FilterFactory seemed to ignore tokenizerFactory.* parameters such as "tokenizerFactory.maxGramSize", "tokenizerFactory.userDictionary" etc... when using managed-schema(

Re: Atomic update error with JSON handler

2018-05-21 Thread Yasufumi Mizoguchi
Hi, At least, it is better to enclose your json body with '[ ]', I think. The following is the result I got using curl. $ curl -XPOST "localhost:8983/solr/test_core/update/json?commit=true" --data-binary '{"id":"test1","title":{"set":"Solr Rocks"}}' { "responseHeader":{ "status":400, "QT
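
The suggestion above, wrapping the update document in a JSON array, might look like the following Python sketch. The document values are the ones from the curl example in the message; the helper name `atomic_update_body` is an assumption, and whether the brackets are strictly required may depend on the Solr version and update handler in use.

```python
import json


def atomic_update_body(doc_id, field, value):
    """Return a JSON array containing a single atomic 'set' update."""
    update = {"id": doc_id, field: {"set": value}}
    # Wrap the single document in a list so the body is enclosed in [ ].
    return json.dumps([update])


body = atomic_update_body("test1", "title", "Solr Rocks")
print(body)
```

The resulting string can be passed as the request body in place of the bare object shown in the curl command.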

Re: Caching Solr Grouping Results

2018-05-21 Thread Yasufumi Mizoguchi
Hi, Have you already tried "group.cache.percent" parameter? It might improve grouping performance. Or if you try CollapsingQParser, you can use expand component to acquire all values in groups, I think. ( https://lucene.apache.org/solr/guide/6_6/collapse-and-expand-results.html#collapse-and-expand

Re: Caching Solr Grouping Results

2018-05-20 Thread Yasufumi Mizoguchi
Hi, I know little about the grouping component, but I think it is very hard, because the query result cache has a {query and conditions} -> {DocList} structure. ( https://github.com/apache/lucene-solr/blob/e30264b31400a147507aabd121b1152020b8aa6d/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#

Re: Solr admin Segments page legend

2018-05-17 Thread Yasufumi Mizoguchi
Hi, I found some information about the pink bar from mail archive. I think this should be written in ref. guide. > I think that pink segments are those segments > which the system thinks are most likely to be chosen for automatic > merging, according to whatever merge policy you have active. Mos

Re: Solr 6.6.3 won't start - unrecognized vm option "UseParNewGC" - Java 10.0.1

2018-05-13 Thread Yasufumi Mizoguchi
Instead of setting the JAVA_HOME variable, if you want to use Java 8 only with Solr, you can use the SOLR_JAVA_HOME variable in Solr's bin/solr.in.sh script (or bin\solr.in.cmd if you use Windows). e.g. SOLR_JAVA_HOME="/home/ubuntu/jdk1.8.0_171" Regards, Yasufumi On Sun, May 13, 2018 at 11:17, Alexandre Rafalovi

Re: Trouble Installing Solr 7.1.0 On Ubunti 17

2017-10-23 Thread Yasufumi Mizoguchi
Hi, Maybe, you have a wrong path. Try below. $ sudo solr-7.1.0/bin/install_solr_service.sh Thanks, Yasufumi. 2017-10-24 12:11 GMT+09:00 Dane Terrell : > Hi I'm new to apache solr. I'm looking to install apache solr 7.1.0 on my > localhost computer. I downloaded and extracted the tar file in my

Re: Unified highlighter returns an error when hl.fl param has undefined fields

2017-09-06 Thread Yasufumi Mizoguchi
Hi Shawn, Thank you for your reply. > that sounds like a bug in the argument parser that needs to be fixed. I have created a JIRA about this. https://issues.apache.org/jira/browse/SOLR-11334 Thanks, Yasufumi On 2017/09/06 9:48 PM, Shawn Heisey wrote: On 9/4/2017 9:49 PM, Yasuf

Re: Unified highlighter returns an error when hl.fl param has undefined fields

2017-09-04 Thread Yasufumi Mizoguchi
me,%20manu). This is because UnifiedSolrHighlighter detects that there is a zero-length string between "," and " ", and treats the string as a field name. Is this a correct behavior? Thanks, Yasufumi On 2017/09/05 12:21 AM, Shawn Heisey wrote: On 9/3/2017 10:31 PM, Yasufumi

Unified highlighter returns an error when hl.fl param has undefined fields

2017-09-03 Thread Yasufumi Mizoguchi
Hi, I am testing UnifiedHighlighter(hl.method=unified) with Solr 6.6 and found that the highlighter returns following error when hl.fl parameter has undefined fields. The error occurs even if hl.fl parameter has ", "( + ) as a field delimiter. (e.g. hl.fl=name, manu) Is this a bug? I think tha

Re: Explicit OR in edismax query with mm=100%

2017-04-19 Thread Yasufumi Mizoguchi
Hi, It looks like edismax respects the mm parameter in your case. You should set "mm=1" if you want to obtain the results of an OR search. "mm=100%" means that all terms in your query must match. Regards, Yasufumi On 2017/04/20 10:40, Nguyen Manh Tien wrote: Hi, I run a query "Solr OR Lucene
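
To make the difference between "mm=1" and "mm=100%" concrete, here is a simplified Python model of how many optional clauses must match. It handles only a plain positive integer or a positive percentage and is not Solr's actual implementation; the function name `required_matches` is an assumption.

```python
def required_matches(mm, optional_clauses):
    """Return how many optional clauses must match for a simple mm value.

    Only a positive integer ("1") or a positive percentage ("100%") is
    handled; negative values and conditional specs are out of scope.
    """
    if mm.endswith("%"):
        percent = int(mm[:-1])
        needed = optional_clauses * percent // 100  # rounded down
    else:
        needed = int(mm)
    # Never require more clauses than the query actually has.
    return min(needed, optional_clauses)


# With two terms, mm=100% requires both, while mm=1 behaves like OR.
print(required_matches("100%", 2))
print(required_matches("1", 2))
```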

Re: Integrating Solr with OpenNLP-UIMA pear file

2017-04-09 Thread Yasufumi Mizoguchi
Hi, I know little about Solr-UIMA integration, but I found a suspicious point in your configuration below. name="analysisEngine">D/:temp/opennlp.uima.OpenNlpTextAnalyzer/opennlp.uima.OpenNlpTextAnalyzer_pear.xml I think a Windows file path should start with [Drive letter] + ':', but the above star

Re: Problem starting solr 6.5

2017-04-03 Thread Yasufumi Mizoguchi
Hi, I think you should check the permission of /usr/local/solr-6/solr-6.5.0/server/log (maybe, you do not have write permission on the directory) regards, Yasufumi On 2017/04/04 11:42, wlee wrote: Try to start solr and get this error message. What is the problem ? $ bin/solr start Exce

Re: Is there any alternative to '*' in SQL interfaces?

2017-02-12 Thread Yasufumi Mizoguchi
t this feature in a release in the near future. The 6.5 release is based on Apache Calcite SQL engine and it has much deeper ties into the Solr schema. Joel Bernstein http://joelsolr.blogspot.com/ On Sun, Feb 12, 2017 at 10:16 AM, Yasufumi Mizoguchi wrote: Hi, I'm a newbie and trying SQL

Is there any alternative to '*' in SQL interfaces?

2017-02-12 Thread Yasufumi Mizoguchi
Hi, I'm a newbie and trying SQL interfaces on Solr 6.4.1. Firstly, I tried to get the values of all fields, but found the following on 6.4 Draft Ref Guide. > The * syntax to indicate all fields is not supported in either limited or unlimited queries. So, is there any alternative to '*' ?