AW: AW: AW: 6.6 -> 7.5 SolrJ, seeing many "Connection evictor"-Threads

2018-10-22 Thread Clemens Wyss DEV
On 10/22/2018 6:15 AM, Shawn Heisey wrote: > autoSoftCommit is pretty aggressive . If your commits are taking 1-2 seconds > or les well, some take minutes (re-index)! > autoCommit is quite long. I'd probably go with 60 seconds Which means every 1min the "pending"/"soft" commits are effectively s

Re: ZookeeperServer not running/Client Session timed out

2018-10-22 Thread Susheel Kumar
Hi Shawn, Here is the link to the Solr GC log, and it doesn't look like a Solr GC problem. The total GC is 12 GB. The GC log is from yesterday and the issue happened this morning, i.e. 10/22. https://www.dropbox.com/s/zdlu9sk8kc469ls/Screen%20Shot%202018-10-22%20at%2010.08.37%20PM.png?dl=0 It may be netwo

Slow import from MsSQL and down cluster during process

2018-10-22 Thread Daniel Carrasco
Hello, I have a Solr cluster made up of 7 machines on AWS instances. The Solr version is 7.2.1 (b2b6438b37073bee1fca40374e85bf91aa457c0b), all nodes are running in NRT mode, and I have one replica per node (7 replicas). One node is used for importing, and the rest just serve data.

Re: ZookeeperServer not running/Client Session timed out

2018-10-22 Thread Shawn Heisey
On 10/22/2018 7:32 PM, Susheel Kumar wrote: Hi Shawn, you meant ZK GC log correct? There was another potential cause I was thinking of, but when I got to where I was going to list them in the previous message, I could not for the life of me remember what the other one was. I just remembered

Re: ZookeeperServer not running/Client Session timed out

2018-10-22 Thread Shawn Heisey
On 10/22/2018 7:32 PM, Susheel Kumar wrote: Hi Shawn, you meant ZK GC log correct? No, the GC log from Solr.  A heap that's too small could happen to ZK as well, but I would expect that problem more on the Solr side. You could try increasing the heap size to see if that makes any difference

Re: ZookeeperServer not running/Client Session timed out

2018-10-22 Thread Susheel Kumar
Hi Shawn, you meant the ZK GC log, correct? Thanks On Mon, Oct 22, 2018 at 7:03 PM Shawn Heisey wrote: > On 10/22/2018 3:31 PM, Susheel Kumar wrote: > > Hello, > > > > I am seeing "ZookeeperServer not running" WARN messages in zookeeper logs > > which are causing the Solr client connections to time out.

Re: Storing & using feature vectors

2018-10-22 Thread Ken Krugler
Hi Doug, Many thanks for the tons of useful information! Some comments/questions inline below. — Ken > On Oct 19, 2018, at 10:46 AM, Doug Turnbull > wrote: > > This is a pretty big hole in Lucene-based search right now that many > practitioners have struggled with > > I know a couple of peo

Re: Query to multiple collections

2018-10-22 Thread Rohan Kasat
Thanks Shawn for the update. I am going ahead with the standard aliases approach, which suits my use case. Regards, Rohan Kasat On Mon, Oct 22, 2018 at 4:49 PM Shawn Heisey wrote: > On 10/22/2018 1:26 PM, Chris Ulicny wrote: > > There weren't any particular problems we ran into since the client tha

Re: Query to multiple collections

2018-10-22 Thread Shawn Heisey
On 10/22/2018 1:26 PM, Chris Ulicny wrote: There weren't any particular problems we ran into since the client that makes the queries to multiple collections previously would query multiple cores using the 'shards' parameter before we moved to solrcloud. We didn't have any complicated sorting or s

Re: Integrate nutch with solr

2018-10-22 Thread Stephen Bianamara
To highlight Shawn's point, nutch leverages SOLR. That means that nutch defines the integration and is responsible for providing its documentation. On Mon, Oct 22, 2018, 4:14 PM Atita Arora wrote: > and > https://lobster1234.github.io/2017/08/14/search-with-nutch-mongodb-solr/ > > On Tue, Oct

Re: Integrate nutch with solr

2018-10-22 Thread Atita Arora
and https://lobster1234.github.io/2017/08/14/search-with-nutch-mongodb-solr/ On Tue, Oct 23, 2018 at 1:12 AM Atita Arora wrote: > I think this should be kind of useful : > > > https://blog.building-blocks.com/building-a-search-engine-with-nutch-and-solr-in-10-minutes/ > > I integrated Aperture w

Re: Integrate nutch with solr

2018-10-22 Thread Atita Arora
I think this should be kind of useful : https://blog.building-blocks.com/building-a-search-engine-with-nutch-and-solr-in-10-minutes/ I integrated Aperture with Solr way back in 2008. On Mon, Oct 22, 2018 at 11:27 PM Dinesh Sundaram wrote: > Thanks Shawn for the reply, yes I do have some questi

Re: ZookeeperServer not running/Client Session timed out

2018-10-22 Thread Shawn Heisey
On 10/22/2018 3:31 PM, Susheel Kumar wrote: Hello, I am seeing "ZookeeperServer not running" WARN messages in zookeeper logs which are causing the Solr client connections to time out... What could be the problem? ZK: 3.4.10 Zookeeper.out == For help with the ZK server log, you'll need to cons

Re: Integrate nutch with solr

2018-10-22 Thread Shawn Heisey
On 10/22/2018 3:26 PM, Dinesh Sundaram wrote: Thanks Shawn for the reply, yes I do have some questions on the Solr side too. Can you please share the steps on the Solr side to integrate nutch, or are no steps needed in Solr? Since I have no idea what has to happen on the nutch side, I really can't o

Re: SOLR External Id field

2018-10-22 Thread Rohan Kasat
Hi Piyush, There can be only a single unique identifier for a particular collection. And you can index the external field as Id for already existing record and it will replace the existing record. Regards, Rohan Kasat On Mon, Oct 22, 2018 at 2:20 PM Rathor, Piyush (US - Philadelphia) < prat...@
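The replace-on-same-ID behaviour described in this reply can be pictured with a toy model. The sketch below is plain Python, not Solr code: the dictionary stands in for a collection, and the field name "id" and the sample values are assumptions chosen only for illustration.

```python
def index_document(index, doc, unique_key="id"):
    """Add doc to a toy in-memory index keyed by its unique key.

    A document whose unique key already exists overwrites the stored
    one, mirroring the replace behaviour described in the thread.
    """
    index[doc[unique_key]] = doc
    return index


# Hypothetical records: the second one reuses the same ID.
collection = {}
index_document(collection, {"id": "ext-1", "title": "old title"})
index_document(collection, {"id": "ext-1", "title": "new title"})
```

After both calls the toy collection still holds a single record for "ext-1", carrying the newer title.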

ZookeeperServer not running/Client Session timed out

2018-10-22 Thread Susheel Kumar
Hello, I am seeing "ZookeeperServer not running" WARN messages in zookeeper logs which are causing the Solr client connections to time out... What could be the problem? ZK: 3.4.10 Zookeeper.out == 2018-10-22 06:04:51,071 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@600] - Notificati

Re: Integrate nutch with solr

2018-10-22 Thread Dinesh Sundaram
Thanks Shawn for the reply, yes I do have some questions on the Solr side too. Can you please share the steps on the Solr side to integrate nutch, or are no steps needed in Solr? On Thu, Oct 18, 2018 at 8:35 PM Shawn Heisey wrote: > On 10/18/2018 12:35 PM, Dinesh Sundaram wrote: > > Can you please sha

Re: SOLR External Id field

2018-10-22 Thread Walter Underwood
Solr doesn’t have an “external id”, so people aren’t understanding your question. Each document in a Solr collection has a unique ID. One field is chosen to be that ID. I usually make a field named “id”, but that isn’t necessary. If documents have an ID in the repository, you can send that ID t

RE: SOLR External Id field

2018-10-22 Thread Rathor, Piyush (US - Philadelphia)
Thanks Shawn. So we cannot update records based on external id? Thanks & Regards Piyush Rathor -Original Message- From: Shawn Heisey Sent: Monday, October 22, 2018 2:56 PM To: solr-user@lucene.apache.org Subject: [EXT] Re: SOLR External

RE: SOLR External Id field

2018-10-22 Thread Rathor, Piyush (US - Philadelphia)
Hi Rohan, We need to update certain records based on external id. Please let me know how we can do it. Thanks & Regards Piyush Rathor -Original Message- From: Rohan Kasat Sent: Monday, October 22, 2018 2:46 PM To: solr-user@lucene.apach

Re: Query to multiple collections

2018-10-22 Thread Rohan Kasat
Thanks Chris. This helps. Regards, Rohan On Mon, Oct 22, 2018 at 12:26 PM Chris Ulicny wrote: > There weren't any particular problems we ran into since the client that > makes the queries to multiple collections previously would query multiple > cores using the 'shards' parameter before we move

Re: Query to multiple collections

2018-10-22 Thread Chris Ulicny
There weren't any particular problems we ran into since the client that makes the queries to multiple collections previously would query multiple cores using the 'shards' parameter before we moved to solrcloud. We didn't have any complicated sorting or scoring requirements fortunately. The one thi

Re: Dealing with null values in streaming rollup

2018-10-22 Thread Joel Bernstein
This sounds like a bug, please log a ticket. I think the workaround you suggest is the only way to solve this problem currently. Joel Bernstein http://joelsolr.blogspot.com/ On Mon, Oct 22, 2018 at 11:28 AM Kojo wrote: > I think that you can use stream evaluators in your expressions to filter

Re: SOLR External Id field

2018-10-22 Thread Shawn Heisey
On 10/22/2018 12:46 PM, Rathor, Piyush (US - Philadelphia) wrote: We are storing data in solr. Please let me know on the following: * How can we set a field as external id which can be used for update. * What operation/query needs to be sent to update the same external id record. Solr

Re: SOLR External Id field

2018-10-22 Thread Rohan Kasat
Piyush, can you elaborate on your question about the external ID? Is this the field that distinguishes each record in your indexes? Regards, Rohan Kasat On Mon, Oct 22, 2018 at 11:46 AM Rathor, Piyush (US - Philadelphia) < prat...@deloitte.com> wrote: > Hi All, > > > > We are storing data in solr. Pleas

SOLR External Id field

2018-10-22 Thread Rathor, Piyush (US - Philadelphia)
Hi All, We are storing data in solr. Please let me know on the following: * How can we set a field as external id which can be used for update. * What operation/query needs to be sent to update the same external id record. Thanks & Regards Piyush Rathor

Re: Help with multi-lang searches

2018-10-22 Thread Tim Casey
Hi Sambhav, Calculate the percentage of letter pairs per language in the index. Given the letter pairs in the incoming token, find the closest "match" for the languages in the indexes. Even on a small number of tokens you will get close to the intended language. You can also calculate the "sourc
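The letter-pair approach described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions: the tiny sample texts stand in for the per-language index content, cosine similarity is one possible choice for "closest match", and all function names are hypothetical.

```python
import math
from collections import Counter


def letter_pairs(text):
    """Return the adjacent letter pairs of every word in text, lowercased."""
    pairs = []
    for word in text.lower().split():
        letters = [c for c in word if c.isalpha()]
        pairs.extend(a + b for a, b in zip(letters, letters[1:]))
    return pairs


def profile(text):
    """Map each letter pair to its share of all pairs in text."""
    counts = Counter(letter_pairs(text))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse pair-frequency profiles."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def guess_language(token, profiles):
    """Return the language whose profile is closest to the token's pairs."""
    token_profile = profile(token)
    return max(profiles, key=lambda lang: cosine(token_profile, profiles[lang]))


# Assumed sample texts; a real setup would build profiles from indexed content.
PROFILES = {
    "en": profile("the quick brown fox jumps over the lazy dog and then the weather"),
    "es": profile("el rapido zorro marron salta sobre el perro perezoso y luego"),
}
```

With profiles this small the guess is only meaningful for tokens that share pairs with the samples; larger per-language samples make the match more robust.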

Re: Query to multiple collections

2018-10-22 Thread Rohan Kasat
Thanks Alex. I checked aliases but didn't focus on them much; I will try to relate them more to my use case and take another look. I guess specifying the collections in the query should be useful. Regards, Rohan Kasat On Mon, Oct 22, 2018 at 11:21 AM Alexandre Rafalovitch wrote: > Have you tri

Re: Query to multiple collections

2018-10-22 Thread Rohan Kasat
Thanks Chris for the update. I was thinking along the same lines and just wanted to check whether you faced any specific issues. Regards, Rohan Kasat On Mon, Oct 22, 2018 at 11:20 AM Chris Ulicny wrote: > Rohan, > > I do not remember where I came across it or what restrictions exist on it, > but it work

Re: Query to multiple collections

2018-10-22 Thread Alexandre Rafalovitch
Have you tried using aliases: http://lucene.apache.org/solr/guide/7_5/collections-api.html#collections-api You can also - I think - specify a collection of shards/collections directly in the query, but there may be side edge-cases with that (not sure). Regards, Alex. On Mon, 22 Oct 2018 at 13

Re: Query to multiple collections

2018-10-22 Thread Chris Ulicny
Rohan, I do not remember where I came across it or what restrictions exist on it, but it works for our use case of querying multiple archived collections with identical schemas in the same SolrCloud cluster. The queries have the following form: http::/solr/current/select?collection=current,archiv
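A query of the form described here, sent to one collection with a comma-separated "collection" parameter, could be assembled like this. It is a minimal sketch: the host, port, and the collection name "archive1" are assumptions (the preview above is cut off before the real names), and the function only builds the URL without sending it.

```python
from urllib.parse import urlencode


def multi_collection_select_url(base_url, primary, collections, query):
    """Build a /select URL on `primary` that searches all `collections`.

    The collections are passed as one comma-separated `collection`
    parameter, as in the query form quoted in the message.
    """
    params = {"q": query, "collection": ",".join(collections)}
    # Keep ',', ':' and '*' unescaped so the URL stays readable.
    return f"{base_url}/{primary}/select?{urlencode(params, safe=',:*')}"


# Hypothetical host and collection names, for illustration only.
url = multi_collection_select_url(
    "http://localhost:8983/solr", "current", ["current", "archive1"], "*:*"
)
```

As noted in the thread, this relies on the collections having identical schemas and living in the same SolrCloud cluster.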

Re: Help with multi-lang searches

2018-10-22 Thread Alexandre Rafalovitch
Additional possibilities: 1) omitNorms and maybe omitTermFreqAndPositions for the fields to avoid frequency of term mattering http://lucene.apache.org/solr/guide/7_5/defining-fields.html#optional-field-type-override-properties 2) Constant score: http://lucene.apache.org/solr/guide/7_5/the-standard-

Query to multiple collections

2018-10-22 Thread Rohan Kasat
Hi All, I have a SolrCloud setup with multiple collections. I have created, say, two collections here, as the data sources for the two collections are different and I wanted to store them separately. There is a use case where I need to query both collections and show unified search res

Help with multi-lang searches

2018-10-22 Thread Sambhav Kothari (BLOOMBERG/ LONDON)
Hi, We have a problem with searches with multiple languages. Our schema looks something like this: field_en = English content for field field_es = Spanish field_it = Italian etc. When a user searches for a keyword, e.g.: "brexit" it can also specify several languages s/he wants to

RE: Securying ONLY the web interface console

2018-10-22 Thread Davis, Daniel (NIH/NLM) [C]
I think that it is not really Solr's job to solve this. I'm sure that there are many Java ways to solve this with Jetty configuration of JAAS, but the *safest* ways involve ports and rights. In other words, port 8983 and zookeeper ports are then for Solr nodes to communicate with each other

Re: Dealing with null values in streaming rollup

2018-10-22 Thread Kojo
I think that you can use stream evaluators in your expressions to filter the values you want: https://lucene.apache.org/solr/guide/6_6/stream-evaluators.html On Mon, Oct 22, 2018 at 12:10, RAUNAK AGRAWAL wrote: > Thanks a lot Jan. Will try with 7.5 > > I am currently using 7.2.1 ver

Re: Dealing with null values in streaming rollup

2018-10-22 Thread RAUNAK AGRAWAL
Thanks a lot Jan. Will try with 7.5. I am currently using version 7.2.1. Is there a way to fix it? On Fri, Oct 19, 2018 at 12:31 AM Jan Høydahl wrote: > Have you tried with Solr 7.5? I think it may have been fixed in that > version? At least for the timeseries() expression... > > -- > Jan Høydah

Re: Securying ONLY the web interface console

2018-10-22 Thread Amanda Shuman
Just a follow-up to say that I never have resolved this issue satisfactorily. -- Dr. Amanda Shuman Post-doc researcher, University of Freiburg, The Maoist Legacy Project PhD, University of California, Santa Cruz http://www.amandashuman.net/ http://www

Re: indexed and stored for fields that are sources of a copy field

2018-10-22 Thread Emir Arnautović
Hi Chris, Even better: you can contribute to the documentation by creating a JIRA issue with a patch. Thanks, Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training - http://sematext.com/ > On 22 Oct 2018, at 15:43, Chris Wareham > wrote

Re: indexed and stored for fields that are sources of a copy field

2018-10-22 Thread Chris Wareham
Hi Emir, Many thanks for the confirmation. I'd kind of inferred this was correct from the paragraph starting with "Copying is done at the stream source level", but it would be good to mention it in the "Copying Fields" section of the Solr documentation. Should I create a JIRA issue asking for thi

Re: indexed and stored for fields that are sources of a copy field

2018-10-22 Thread Emir Arnautović
Hi Chris, Yes, you can do that. There is also type="ignored" that you can use in such a scenario. HTH, Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elasticsearch Consulting Support Training - http://sematext.com/ > On 22 Oct 2018, at 15:22, Chris Wareham > wrote: >

indexed and stored for fields that are sources of a copy field

2018-10-22 Thread Chris Wareham
Hi folks, I have a number of fields defined in my managed-schema file that are used as the sources for a copy field: stored="true"/> stored="true" multiValued="true"/> stored="true" multiValued="true"/> stored="false" multiValued="true"/> Can I set both the indexed and

Re: AW: AW: 6.6 -> 7.5 SolrJ, seeing many "Connection evictor"-Threads

2018-10-22 Thread Shawn Heisey
On 10/21/2018 11:10 PM, Clemens Wyss DEV wrote: For the UpdateRequests it is the "commitWithinMs"-parameter? To me this parameter sounds like telling the solr-server I need to see this data within "x ms". As we have autoCommit and autoSoftCommit The commitWithin parameter is effectively equiva
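For readers unfamiliar with the parameter under discussion, commitWithin is passed on an update request as a number of milliseconds. The sketch below only assembles such a request and does not send it; the base URL, collection name, and document are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode


def build_update_request(base_url, collection, docs, commit_within_ms):
    """Return (url, headers, body) for a JSON update with commitWithin.

    `commit_within_ms` is the upper bound, in milliseconds, requested
    for the added documents to be committed.
    """
    query = urlencode({"commitWithin": commit_within_ms})
    url = f"{base_url}/{collection}/update?{query}"
    headers = {"Content-Type": "application/json"}
    body = json.dumps(docs)
    return url, headers, body


# Hypothetical endpoint and document.
request = build_update_request(
    "http://localhost:8983/solr", "mycollection", [{"id": "1"}], 10000
)
```

The returned tuple can be handed to any HTTP client; whether to rely on commitWithin or on the autoCommit and autoSoftCommit settings is the question this thread discusses.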

Re: Device I/O trouble with solr 7.5

2018-10-22 Thread Toke Eskildsen
On Tue, 2018-10-16 at 14:04 +0200, zoolette wrote: > The SOLR instance in 7.5 is up and ready but no traffic is sent to it. > On the 2 websites, one generated approximately between 5000 and 8000 > requests / minute on SOLR on 2 handlers. > One search handler is dedicated to complex search from the s

Re: Is there a tool to directly index hdfs files to solr?

2018-10-22 Thread Shawn Heisey
On 10/18/2018 6:17 AM, shreck wrote: why remove "\solr\contrib\map-reduce" lib from solr6.6.1? Those contrib modules were removed for two primary reasons: * They are available elsewhere. * The copy included with Solr was not being maintained. See this issue: https://issues.apache.org/jira/b