SolrCloud: Configured socket timeouts not reflecting

2019-06-12 Thread Rahul Goswami
Hello, I am running Solr 7.2.1 in cloud mode. To overcome a setup hardware bottleneck, I tried to configure distribUpdateSoTimeout and socketTimeout to a value greater than the default 10 mins. I did this by passing these as system properties at Solr start up time (-DdistribUpdateSoTimeout and -Ds

Re: SolrCloud indexing triggers merges and timeouts

2019-06-12 Thread Rahul Goswami
Updating the thread with further findings: So turns out that the nodes hosting Solr are VMs with Virtual disks. Additionally, a Windows system process (the infamous PID 4) is hogging a lot of disk. This is indicated by disk response times in excess of 100 ms and a disk drive queue length of 5 which

RE: Is it possible configure a single data-config.xml file for all the environments?

2019-06-12 Thread Hugo Angel Rodriguez
Thanks, Shawn, for your answers. Regarding your question: "Are these environments on separate Solr instances, separate servers, or are they on the same Solr instance?" My answer is: These environments are on separate Solr instances, separate servers. Are we dealing with SolrCloud (which is Solr +

Re: CursorMark, batch size/speed

2019-06-12 Thread Erick Erickson
If there’s any chance of using Streaming for this rather than re-querying the data using CursorMark, it would solve a lot of these issues. > On Jun 12, 2019, at 3:26 PM, Mikhail Khludnev wrote: > > Every cursorMark request goes through full results. Previous results just > bypass scoring heap. S

Re: CursorMark, batch size/speed

2019-06-12 Thread Mikhail Khludnev
Every cursorMark request goes through full results. Previous results just bypass scoring heap. So, reducing number of such request should reasonably reduce wall-clock time exporting all results. On Wed, Jun 12, 2019 at 11:59 PM Markus Jelsma wrote: > Hello, > > One of our collections hates Curso
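The point about request count can be sketched in Python. This is only an illustration, not code from the thread: `export_all` and `make_fake_fetch` are hypothetical names, and the fake fetch function stands in for a real Solr `/select` call. It relies on two documented cursorMark behaviors: the first request uses `cursorMark=*`, and the export is finished when the returned `nextCursorMark` equals the value that was sent.

```python
from typing import Callable, List, Tuple

Page = Tuple[List[dict], str]


def export_all(fetch_page: Callable[[str, int], Page], rows: int) -> Tuple[List[dict], int]:
    """Walk a cursorMark result set; return (all docs, number of requests made)."""
    cursor = "*"  # Solr's documented starting value for cursorMark
    docs: List[dict] = []
    requests = 0
    while True:
        page, next_cursor = fetch_page(cursor, rows)
        requests += 1
        docs.extend(page)
        if next_cursor == cursor:  # unchanged cursor means the results are exhausted
            break
        cursor = next_cursor
    return docs, requests


def make_fake_fetch(total: int) -> Callable[[str, int], Page]:
    """Hypothetical in-memory stand-in for a Solr /select call, for illustration only."""
    corpus = [{"id": i} for i in range(total)]

    def fetch(cursor: str, rows: int) -> Page:
        start = 0 if cursor == "*" else int(cursor)
        page = corpus[start:start + rows]
        # Like Solr, return the same cursor when there is nothing more to read.
        next_cursor = str(start + len(page)) if page else cursor
        return page, next_cursor

    return fetch
```

With the fake corpus of 1,000 documents, `export_all(make_fake_fetch(1000), 100)` makes 11 requests while `rows=500` makes 3. Since each request walks the full result set, a larger batch size should reduce total export time, at the cost of larger responses per request.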

Re: Is it possible configure a single data-config.xml file for all the environments?

2019-06-12 Thread Shawn Heisey
On 6/12/2019 9:05 AM, Hugo Angel Rodriguez wrote: I need to configure a single data-config.xml file in Solr for SAS AML 7.1. I have three environments: Development, quality and production, and you know the first lines in a data-config.xml file are for the connection to a database (database name, dat

Re: Different facet count between 7.7.1 and 8.1.1

2019-06-12 Thread Jan Høydahl
Can you reproduce it from a clean 7.7.1 install? I mean, index N docs and then run the facet query? Is it a distributed query or a single shard? Does an "optimize" change anything? Is this DocValues strings? -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > 12. jun. 2

CursorMark, batch size/speed

2019-06-12 Thread Markus Jelsma
Hello, One of our collections hates CursorMark, it really does. When under very heavy load the nodes can occasionally consume GBs of additional heap for no clear reason immediately after downloading the entire corpus. Although the additional heap consumption is a separate problem that I hope anyo

Different facet count between 7.7.1 and 8.1.1

2019-06-12 Thread Markus Jelsma
Hello again, We found another oddity when upgrading to Solr 8. For a *:* query, the facet counts for a simple string field do not match at all between these versions. Solr 7.7.1 gives lower or zero counts, whereas for 8 we see the correct counts. So something seems fixed for a bug that I was not

Re: Issue with connect to zookeeper

2019-06-12 Thread Erick Erickson
You do not send Solr queries to ZooKeeper; send them to a Solr node. You should create a CloudSolrClient with your ZK ensemble, but thereafter you send queries to a _collection_ that you specify as part of the request. Best, Erick > On Jun 12, 2019, at 8:18 AM, Kiran Shetty wrote: > > Hi, >
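As a rough illustration of that distinction over plain HTTP (rather than SolrJ's CloudSolrClient), the Python sketch below builds a query URL that targets a collection on a Solr node. The helper name `build_select_url`, the host `solrhost`, and the collection `mycollection` are hypothetical; 8983 is Solr's default port, whereas ZooKeeper listens on 2181 by default and does not answer search requests.

```python
from urllib.parse import urlencode


def build_select_url(solr_node: str, collection: str, query: str, rows: int = 10) -> str:
    """Build a /select URL addressed to a collection on a Solr node (not ZooKeeper)."""
    params = {"q": query, "rows": rows, "wt": "json"}
    # The collection is part of the request path, as described above.
    return f"{solr_node}/solr/{collection}/select?{urlencode(params)}"


# Hypothetical host and collection names, for illustration only.
url = build_select_url("http://solrhost:8983", "mycollection", "*:*")
```

The resulting URL can be fetched with any HTTP client. A cloud-aware client such as CloudSolrClient uses the ZooKeeper ensemble only to discover which Solr nodes host the collection.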

Re: Intermittent BasicAuthPlugin Not Authorized

2019-06-12 Thread Erick Erickson
Thanks for letting us know… Erick > On Jun 12, 2019, at 10:42 AM, Brian Lininger wrote: > > Turns out that the problem wasn't intermittent, but that our streaming code > isn't properly setting the credentials on the client. Not sure why that is > yet, but it's not an issue with Solr, it just l

Issue with connect to zookeeper

2019-06-12 Thread Kiran Shetty
Hi, I am having an issue with Solr on my search-related project, which is an Adobe Experience Manager (AEM) Maven project. I am not able to connect to ZooKeeper properly to get the Solr query response. We are using Solr version 7.6.0 and a Solr dependency bundle org.apache.servicemix.bun

Re: Intermittent BasicAuthPlugin Not Authorized

2019-06-12 Thread Brian Lininger
Turns out that the problem wasn't intermittent, but that our streaming code isn't properly setting the credentials on the client. Not sure why that is yet, but it's not an issue with Solr, it just looked intermittent because of the intermittent use of streaming in our application... On Wed, Jun

Re: ExtractRequestHandler with url instead of path to file

2019-06-12 Thread marotosg
Found the issue: I was using the wrong parameter. It should be stream.url instead of stream.file: http://solrhost:8983/solr/document/update/extract?&extractOnly=true&stream.url=http://serverwith
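A minimal Python sketch of building such a request follows. It is an illustration under stated assumptions rather than the poster's actual code: `build_extract_url` is a hypothetical helper, and the document URL `http://docs.example.com:8080/Box_Sync.log` is a placeholder, since the original address is truncated in the archive. Remote streaming must also be enabled on the Solr side for `stream.url` to be honored.

```python
from urllib.parse import urlencode


def build_extract_url(solr_base: str, collection: str, document_url: str) -> str:
    """Build an /update/extract request that points Solr at a remote document."""
    params = {
        "extractOnly": "true",
        "stream.url": document_url,  # stream.file is for paths local to the Solr server
    }
    return f"{solr_base}/{collection}/update/extract?{urlencode(params)}"


# Placeholder document URL; the real one is truncated in the archived message.
extract_url = build_extract_url(
    "http://solrhost:8983/solr",
    "document",
    "http://docs.example.com:8080/Box_Sync.log",
)
```

Passing the document address through `urlencode` percent-encodes its `:` and `/` characters, so the nested URL cannot be confused with the outer query string.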

ApacheCon North America 2019 Schedule Now Live!

2019-06-12 Thread Rich Bowen
Dear Apache Enthusiast, (You’re receiving this message because you’re subscribed to one or more Apache Software Foundation project user mailing lists.) We’re thrilled to announce the schedule for our upcoming conference, ApacheCon North America 2019, in Las Vegas, Nevada. See it now at https

Is it possible configure a single data-config.xml file for all the environments?

2019-06-12 Thread Hugo Angel Rodriguez
Hi, I need to configure a single data-config.xml file in Solr for SAS AML 7.1. I have three environments: Development, quality and production, and you know the first lines in a data-config.xml file are for the connection to a database (database name, database server, port, user, password, etc). Accord

Re: SOLR JOIN

2019-06-12 Thread Erick Erickson
How are you trying to do this? Streaming Expressions? ParallelSQL? What _Solr_ constructs are you planning on using? Best, Erick > On Jun 12, 2019, at 7:33 AM, Paresh wrote: > > Hi, > > I have two collections both having different schema. > Collection1: ID, Field1, Field2 > Collection3: ID, O

SOLR JOIN

2019-06-12 Thread Paresh
Hi, I have two collections both having different schema. Collection1: ID, Field1, Field2 Collection3: ID, Oc1, OC2, col3_Field1 ID1, col3_Oc1 I want to do following JOINs (-)Query on Collection1 (-)Field2:value2 (-)JOIN across collection3 Field1 = collection3.col3_Field1

Re: Loading of zkCredentialsProvider has changed in Solr 7 or 8?

2019-06-12 Thread Colvin Cowie
I realize that attachments might not work on the mailing list, so here is the log on Drive https://drive.google.com/file/d/0B7mypFpwbHptWkp0X2U0azU2dGREb1k2WGlpeUM3MlRIWmRB/view?usp=sharing On Tue, 11 Jun 2019 at 11:21, Colvin Cowie wrote: > Hello all > > I hit another problem in moving from Sol

Re: ContentStreamUpdateRequest no longer closes stream

2019-06-12 Thread Colvin Cowie
I realize that attachments might not work on the mailing list, so here is the test case on Drive https://drive.google.com/file/d/0B7mypFpwbHptTE5nZE0weURFOExFSHphRFlUV0EyTElaOC0w/view?usp=sharing On Mon, 10 Jun 2019 at 13:17, Colvin Cowie wrote: > Hello, I'm in the process of moving from Solr 6.

Re: ExtractRequestHandler with url instead of path to file

2019-06-12 Thread Alexandre Rafalovitch
Have you tried enabling remoteStreaming? https://lucene.apache.org/solr/guide/8_0/content-streams.html#remote-streaming Regards, Alex. On Wed, 12 Jun 2019 at 08:59, marotosg wrote: > > Hi, > > I would like to make a request to Solr to index documents hosted as urls. > This works when I send a

Re: Issue in solr result

2019-06-12 Thread Shawn Heisey
On 6/12/2019 12:22 AM, Nikhil Reddy wrote: I am setting up an elastic search engine using SOLR. I have a few columns where the datatype is VARCHAR(). The column in the result below (image uploaded) is an alphanumeric document ID. So when I use varchar as a column datatype and ingest the data into so

ExtractRequestHandler with url instead of path to file

2019-06-12 Thread marotosg
Hi, I would like to make a request to Solr to index documents hosted at URLs. This works when I send a path to the file but seems to fail when sending a URL. Sample request http://solrhost:8983/solr/document/update/extract?&extractOnly=true&stream.file=http://serverwith docs:8080/Box_Sync.log

Facing issue with MinMaxNormalizer

2019-06-12 Thread Kamal Kishore Aggarwal
Hi All, Appreciate if someone can help. I am using LTR with MinMaxNormalizer in solr 6.6.2. Model.json "class": "org.apache.solr.ltr.model.MultipleAdditiveTreesModel", "name": "XGBOOST-BBB-LTR-Model", "store":"BBB-Feature-Model", "features": [ { "name": "TFIDF",

Issue in solr result

2019-06-12 Thread Nikhil Reddy
Dear community members, I am setting up an elastic search engine using SOLR. I have a few columns where the datatype is VARCHAR(). The column in the result below (image uploaded) is an alphanumeric document ID. So when I use varchar as a column datatype and ingest the data into solr as shown in the bel

Re: Intermittent BasicAuthPlugin Not Authorized

2019-06-12 Thread Colvin Cowie
Hi Brian, That's correct, we never had any issues with authentication on Solr 6.x, as far as I remember. Have you checked Solr's own logs to see whether the 401s are coming from internode requests between shards, or if they are coming from the initial request to Solr from the client? I believe for