RE: SolrCloud recommended I/O RAID level

2019-07-31 Thread Kaminski, Adi
Hi Shawn, Thanks for your reply. I fully agree with your comments; they further clarify the need for RAID10 in this case. One additional follow-up question: if we follow these guidelines and use RAID10 (which leaves us with an effective capacity of 50%), why would I need replication factor of 2 i
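
The capacity arithmetic behind this question can be sketched roughly as follows. This is only an illustration: the 50% RAID10 figure comes from the message above, while the function name and the 8 TB raw size are hypothetical. It assumes each Solr replica holds a full copy of the index on a separate node, so the two factors multiply.

```python
def usable_index_capacity_tb(raw_tb, raid_efficiency=0.5, replication_factor=1):
    """Estimate how much unique index data fits on the given raw disk space.

    raid_efficiency=0.5 models RAID10 (every block is mirrored once).
    replication_factor is the number of full index copies Solr keeps.
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return raw_tb * raid_efficiency / replication_factor


# Hypothetical example: 8 TB of raw disk across the cluster.
raid10_only = usable_index_capacity_tb(8.0)
raid10_with_two_replicas = usable_index_capacity_tb(8.0, replication_factor=2)
```

RAID10 mirrors disks inside one machine, whereas Solr replicas are copies on different nodes, so the two mechanisms cover different failure modes even though their capacity costs compound.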

Upgrading from 7.3 to 8.2

2019-07-31 Thread Jayadevan Maymala
Hi all, We have a 3-node Solr cluster (version 7.3.0) using ZooKeeper version 3.4.12, running on CentOS Linux release 7.6.1810. I would like to upgrade to Solr version 8.2. Is it necessary to upgrade ZooKeeper as well? Is it OK to upgrade directly to 8.2? Any tips/checklist would be welcome. Regard

Re: Mismatch between replication API & index.properties

2019-07-31 Thread Aman Tandon
Yes, that is my understanding, but if you look at the Replication handler response, it says it is referring to the index folder, not to the one shown in index.properties. Because of that confusion I am not able to delete the folder. Is this some bug or default behavior where irrespective of the

Indexing information on number of attachments and their names in EML file

2019-07-31 Thread Zheng Lin Edwin Yeo
Hi, I would like to check: is there any way we can detect the number of attachments and their names during indexing of EML files in Solr, and index that information into Solr? Currently, Solr is able to use Tika and Tesseract OCR to extract the contents of the attachments. However, I could no

Re: Solr 8.2.0 having issue with ZooKeeper 3.5.5

2019-07-31 Thread Zheng Lin Edwin Yeo
Yes. You can get my full solr.log from the link below. The error is there when I tried to create collection1 (around lines 170 to 300). https://drive.google.com/open?id=1qkMLTRJ4eDSFwbqr15wSqjbg4dJV-bGN Regards, Edwin On Wed, 31 Jul 2019 at 18:39, Jan Høydahl wrote: > Please look for the full

Re: Solr 8.2.0 having issue with ZooKeeper 3.5.5

2019-07-31 Thread Zheng Lin Edwin Yeo
Yes, I have restarted both Solr and ZooKeeper after the changes. In fact I have tried to restart the whole system, but the problem still persists. Below is my configuration for zoo.cfg. # The number of milliseconds of each tick tickTime=2000 # The number of ticks that the initial # synchronizatio

[CVE-2019-0193] Apache Solr, Remote Code Execution via DataImportHandler

2019-07-31 Thread David Smiley
The DataImportHandler, an optional but popular module to pull in data from databases and other sources, has a feature in which the whole DIH configuration can come from a request's "dataConfig" parameter. The debug mode of the DIH admin screen uses this to allow convenient debugging / development o

Re: Mismatch between replication API & index.properties

2019-07-31 Thread jai dutt
This is correct behaviour. Solr stores replica index files in this format only, and the index.properties file points to the latest index. Usually, after a successful full replication, Solr removes the old timestamped directory. On Wed, 31 Jul, 2019, 8:02 PM Aman Tandon, wrote: > Hi, > > We are having a situation
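
The relationship described here, where index.properties names the live index directory and the other timestamped directories are leftovers, can be checked with a small read-only script. This is a minimal sketch, not an official Solr tool. It assumes index.properties contains a line of the form `index=<directory name>`, and that the plain `index` directory is in use when the file is absent. The function names are hypothetical, and the script only lists candidates; it does not delete anything.

```python
import os


def read_active_index_dir(data_dir):
    """Return the directory name that index.properties points to.

    Falls back to "index" when the file or the key is missing (assumption).
    """
    props_path = os.path.join(data_dir, "index.properties")
    if not os.path.exists(props_path):
        return "index"
    with open(props_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            # Skip comments and anything that is not a key=value pair.
            if line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            if key.strip() == "index":
                return value.strip()
    return "index"


def list_inactive_index_dirs(data_dir):
    """List index directories in data_dir that are not the active one."""
    active = read_active_index_dir(data_dir)
    return sorted(
        name
        for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))
        and (name == "index" or name.startswith("index."))
        and name != active
    )
```

Before removing anything the script reports, it would be prudent to confirm that replication has finished and that no searcher still holds the directory open.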

Re: Dataimport problem

2019-07-31 Thread Alexandre Rafalovitch
I wonder if you have some sort of JDBC pool enabled and/or the number of worker threads is configured differently. Compare tomcat level configuration and/or try thread dump of the java runtime when you are stuck. Or maybe something similar on the Postgres side. Regards, Alex. On Wed, 31 Jul 2

RE: Dataimport problem

2019-07-31 Thread Srinivas Kashyap
Hi, 1) Have you tried running _just_ your SQL queries to see how long they take to respond and whether it responds with the full result set of batches The 9th request returns only 2 rows. This behaviour is happening for all the cores which have more than 8 SQL requests. But the same is worki

Mismatch between replication API & index.properties

2019-07-31 Thread Aman Tandon
Hi, We are in a situation where the disk is completely full, and on the server we are seeing multiple index directories ending with a timestamp. Upon checking the index.properties file for a particular shard replica, it is not referring to the folder named *index*, but when I am using the re

Re: Dataimport problem

2019-07-31 Thread Erick Erickson
This code is a little old, but should give you a place to start: https://lucidworks.com/post/indexing-with-solrj/ As for DIH, my guess is that when you moved to Azure, your connectivity to the DB changed, possibly the driver Solr uses etc., and your SQL query in step 9 went from, maybe, batchin

RE: Dataimport problem

2019-07-31 Thread Srinivas Kashyap
Hi, 1) Solr on Tomcat has not been an option for quite a while. So, you must be running an old version of Solr. Which one? We are using Solr 5.2.1(WAR based deployment so) 5) DIH is not actually recommended for production, more for exploration; you may want to consider moving to a stronger ar

Re: Dataimport problem

2019-07-31 Thread Alexandre Rafalovitch
A couple of things: 1) Solr on Tomcat has not been an option for quite a while. So, you must be running an old version of Solr. Which one? 2) Compare that you have the same Solr config. In Admin UI, there will be all O/S variables passed to the Java runtime, I would check them side-by-side 3) You c

Dataimport problem

2019-07-31 Thread Srinivas Kashyap
Hello, We are trying to run Solr (Tomcat) on an Azure instance with Postgres as the DB. When I run a full import (my core has 18 SQL queries), for some reason the requests go up to the 9th and then it hangs indefinitely. But the same setup, Solr (Tomcat) and Postgres database, works fine with AWS hosti

Re: SOLR 8.1.1 EdgeNGramFilterFactory parsing query

2019-07-31 Thread Erick Erickson
This works fine for me. Are you completely sure that 1> you pushed the changed config to the right place 2> you reloaded your server? One thing I do is go to the admin UI and check for the collection (core) and bring up the schema file just to be sure that I’m using the schema I think I am. I’d

Re: Single field in "qf" vs multiple

2019-07-31 Thread Erick Erickson
The short answer is “yes, ranking will be different”. This is inevitable since the stats are different in your X field, there are more terms, the frequency of any given term is different, etc. I’d argue, though, that using qf with a list of fields can be tweaked to give you better results. For ins
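
To make the idea of tweaking qf with a list of fields concrete, here is a minimal sketch of building such a request. The field names A, B and C come from the original question in this thread; the boost values, the query text, the function name and the choice of the edismax parser are assumptions made for illustration only.

```python
from urllib.parse import urlencode


def build_qf_params(user_query, field_boosts):
    """Build query parameters that search several fields with per-field boosts."""
    # qf takes space-separated entries of the form field^boost.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {"q": user_query, "defType": "edismax", "qf": qf}


# Hypothetical boosts: matches in A count more than matches in B or C.
params = build_qf_params("solr cloud", {"A": 3, "B": 2, "C": 1})
query_string = urlencode(params)  # append to the collection's /select URL
```

With a single catch-all field X there is nothing to boost, which is one reason a field list may give more control over ranking.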

Single field in "qf" vs multiple

2019-07-31 Thread Steven White
Hi everyone, I'm indexing my data into multiple Solr fields, such as A, B, C and I'm also copying all the data of those fields into a master field such as X. By default, my "qf" is set to X so anytime a user is searching they are searching across the data that also exist in fields A, B and C. In

NRT for new items in index

2019-07-31 Thread profiuser
Hi, we have about 400 000 000 items in a Solr collection. We have set the auto commit property for this collection to 15 minutes. It is a big collection and we are using some caches etc., so we have a large autocommit value. This has the disadvantage that we don't have NRT searches. We would like

Re: Problem with solr suggester in case of non-ASCII characters

2019-07-31 Thread Szűcs Roland
Hi Erick, Thanks for your advice. I already removed it from the field definition used by the suggester and it works great. I will consider removing it from the entire processing of the other fields. I have only 7000 docs with an index size of 18MB so far, so the memory footprint is not a key issue for me

Re: Contact for Wiki / Support page maintainer

2019-07-31 Thread Jan Høydahl
I tried to add Jaroslaw as an editor of that one page by adding him under the "Restrictions" tab of the page. But it does not work. Is there anyone with better Confluence skills who can tell how to give the edit bit for a single page to individuals? I know how to add edit permission for the whole WIKI space

Re: Problem with solr suggester in case of non-ASCII characters

2019-07-31 Thread Erick Erickson
Roland: Have you considered just not using stopwords anywhere? Largely they’re a holdover from a long time ago when every byte counted. Plus using stopwords has “interesting” issues with things like highlighting and phrase queries and the like. Sure, not using stopwords will make your index lar

Re: Solr 7.7.2 vs Solr 8.2.0

2019-07-31 Thread Erick Erickson
Do be aware that if you are using indexes created with 6x you will be required to completely re-index when you upgrade to Solr 8. IndexUpgraderTool doesn’t help with this, i.e. you _cannot_ go from 6x->7x with 7x's IndexUpgraderTool then go from 7x->8x. Best, Erick > On Jul 31, 2019, at 6:35 A

Issue with inplace update when TimeStampUpdateProcessor is added in updateRequestProcessorChain in solrconfig

2019-07-31 Thread Dominic Dsouza
Hello, I have found a strange issue with in-place updates. When I have TimeStampUpdateProcessor configured in my updateRequestProcessorChain like the following: mydate and I am doing an in-place update on a pint field, then the in-place update runs fine, but the indexed fields(fieldtype t

Re: Solr 8.2.0 having issue with ZooKeeper 3.5.5

2019-07-31 Thread Jörn Franke
Updated correct zoo.cfg? Did you restart zookeeper after config change ? > Am 30.07.2019 um 04:05 schrieb Zheng Lin Edwin Yeo : > > Hi, > > I am using the new Solr 8.2.0 with SolrCloud and external ZooKeeper 3.5.5. > > However, after adding in the line under zoo.cfg > *4lw.commands.whitelist=**

Re: Solr 8.2.0 having issue with ZooKeeper 3.5.5

2019-07-31 Thread Jan Høydahl
Please look for the full log file solr.log in your Solr server, and share it via some file sharing service or gist or similar for us to be able to decipher the collection create error. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 31. jul. 2019 kl. 08:33 skrev Zhe

Re: Solr Backup

2019-07-31 Thread Jayadevan Maymala
On Tue, Jul 30, 2019 at 7:54 PM Jan Høydahl wrote: > The FS backup feature requires a shared drive as you say, and this is > clearly documented. No way around it. Cloud Filestore would likely fix it. > > Or you could write a new backup repo plugin for backup directly to Google > Cloud Storage? >

Re: Solr 7.7.2 vs Solr 8.2.0

2019-07-31 Thread Jan Høydahl
Hi, Go for 8.2, as 7.x will be end of life later this year. If you find any known bugs in 8.2.0 that you cannot live with, wait for 8.2.1 which would maximize stability. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 30. jul. 2019 kl. 22:53 skrev Arnold Bronley : >