Re: NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-03-03 Thread Tomás Fernández Löbbe
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:214) at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627) Can this be fixed in a patch for Solr 8.8? I do not want to have to go back to Solr 6 and reindex the system; that takes 2 days using 180 EMR instances. Please advise. Thank you.

Re: NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-03-01 Thread Phill Campbell
…fixed in a patch for Solr 8.8? I do not want to have to go back to Solr 6 and reindex the system; that takes 2 days using 180 EMR instances. Please advise. Thank you.

NPE in QueryComponent.mergeIds when using timeAllowed and sorting SOLR 8.7

2021-02-24 Thread Phill Campbell
…handleRequest(RequestHandlerBase.java:214) at org.apache.solr.core.SolrCore.execute(SolrCore.java:2627) Can this be fixed in a patch for Solr 8.8? I do not want to have to go back to Solr 6 and reindex the system; that takes 2 days using 180 EMR instances. Please advise. Thank you.


Re: Using multiple language stop words in Solr Core

2021-02-11 Thread Markus Jelsma
Hello Abhay, Do not enable stopwords unless you absolutely know what you are doing. In general, it is a bad practice that somehow still lingers on. But to answer the question, you must have one field and fieldType for each language, so language-specific filters go there. Also, using edismax and

Using multiple language stop words in Solr Core

2021-02-11 Thread Abhay Kumar
Hello Team, Solr provides some data types out of the box in the managed schema for different languages such as English, French, Japanese, etc. We are using the common data type "text_general" for field declarations and using stopwords.txt for stopword

Re: SSL using CloudSolrClient

2021-02-03 Thread ChienHuaWang
Thanks for the information. Could you advise whether CloudSolrClient is compatible with non-TLS? Even if the client is not configured, can it still connect to Solr (TLS enabled)?

Re: SSL using CloudSolrClient

2021-02-03 Thread Jörn Franke
ChienHuaWang wrote: Hi, I am implementing SSL between Solr and Client communication. The clients connect to Solr via CloudSolrClient. According to the doc <https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html#index-a-docum

Re: SSL using CloudSolrClient

2021-02-03 Thread Jörn Franke
…connect to Solr via CloudSolrClient. According to the doc <https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html#index-a-document-using-cloudsolrclient>, the passwords should also be set in clients. However, in testing, the client is still working well without

SSL using CloudSolrClient

2021-02-03 Thread ChienHuaWang
Hi, I am implementing SSL between Solr and Client communication. The clients connect to Solr via CloudSolrClient According to doc <https://lucene.apache.org/solr/guide/8_5/enabling-ssl.html#index-a-document-using-cloudsolrclient> , the passwords should also be set in clients. Howev

Re: Change uniqueKey using SolrJ

2021-02-01 Thread Jason Gerlowski
Hi, SolrJ doesn't have any purpose-made request class to change the uniqueKey, afaict. However doing so is still possible (though less convenient) using the "GenericSolrRequest" class, which can be used to hit arbitrary Solr APIs. If you'd like to see better support for t

Re: Getting Solr's statistic using SolrJ

2021-02-01 Thread Jason Gerlowski
st request = new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/metrics/history", params); final SimpleSolrResponse response = request.process(solrClient); Hope that helps, Jason On Fri, Jan 22, 2021 at 11:21 AM Gael Jourdan-Weil wrote: > > Hello Steven, > > I believe w

Re: Is there way to autowarm new searcher using recently ran queries

2021-01-28 Thread Chris Hostetter
: I am wondering if there is a way to warmup the new searcher on commit by rerunning queries processed by the last searcher. Maybe it happens by default, but then I can't understand why we see high query times if those searchers are being warmed. It only happens by default if you have an 'auto

Re: Is there way to autowarm new searcher using recently ran queries

2021-01-27 Thread Joel Bernstein
Typically what you would do is add static warming queries to warm all the caches. These queries are hardcoded into the solrconfig.xml. You'll want to run the facets you're using in the warming queries particularly facets on string fields. Once you add these it will take longer to wa
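The static warming queries Joel describes live in solrconfig.xml under a newSearcher event listener. A minimal sketch, assuming a hypothetical string facet field (the field and query values below are illustrative, not from the thread):

```xml
<!-- solrconfig.xml: run these queries whenever a new searcher opens,
     so caches (including facets on string fields) are warm before it
     serves live traffic -->
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="facet">true</str>
      <str name="facet.field">category_s</str> <!-- hypothetical facet field -->
    </lst>
  </arr>
</listener>
```

As the thread notes, each query added here lengthens the time before a new searcher is registered, so keep the list to the facets and sorts production traffic actually uses.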

Is there way to autowarm new searcher using recently ran queries

2021-01-27 Thread Pushkar Raste
Hi, A rookie question. We have a Solr cluster that doesn't get too much traffic. We see that our queries take long time unless we run a script to send more traffic to Solr. We are indexing data all the time and use autoCommit. I am wondering if there is a way to warmup new searcher on commit by

RE: Getting Solr's statistic using SolrJ

2021-01-22 Thread Gael Jourdan-Weil
Hello Steven, I believe what you are looking for cannot be accessed using SolrJ (I didn't really check though). But you can easily access it either via the Collections APIs and/or the Metrics API depending on what you need exactly. See https://lucene.apache.org/solr/guide/8_4/cluster

Getting Solr's statistic using SolrJ

2021-01-22 Thread Steven White
each core, etc. etc. using SolrJ API. Thanks Steven

Change uniqueKey using SolrJ

2021-01-22 Thread Timo Grün
Hi All, I’m currently trying to change the uniqueKey of my Solr Cloud schema using Solrj. While creating new Fields and FieldDefinitions is pretty straight forward, I struggle to find any solution to change the Unique Key field with Solrj. Any advice here? Best Regards, Timo Gruen

Re: Exact matching without using new fields

2021-01-21 Thread Alexandre Rafalovitch
> START information retrieval END > START advanced information retrieval with solr END > And with our custom query parser, when an EXACT operator is found, I tokenize the query to match the first case. Otherwise pass it through.

Re: Exact matching without using new fields

2021-01-21 Thread Doss
…pass it through. Needs custom analyzers on the query and index sides to generate the correct token sequences. It's worked out well for our case. Dave

Re: Exact matching without using new fields

2021-01-19 Thread gnandre
…tokenize the query to match the first case. Otherwise pass it through. Needs custom analyzers on the query and index sides to generate the correct token sequences. It's worked out well for our case. Dave

Re: Exact matching without using new fields

2021-01-19 Thread David R
…sequences. It's worked out well for our case. Dave From: gnandre Sent: Tuesday, January 19, 2021 4:07 PM To: solr-user@lucene.apache.org Subject: Exact matching without using new fields Hi, I am aware that to do exact matching (only whatever is provided inside double quotes should be matc

Exact matching without using new fields

2021-01-19 Thread gnandre
Hi, I am aware that to do exact matching (only whatever is provided inside double quotes should be matched) in Solr, we can copy existing fields with the help of copyFields into new fields that have very minimal tokenization or no tokenization (e.g. using KeywordTokenizer or using string field
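The copyField approach described in this thread can be sketched in managed-schema terms. Field and type names below are illustrative assumptions, not from the original message:

```xml
<!-- managed-schema: copy the analyzed field into a minimally tokenized
     sibling used only for exact (double-quoted) matching -->
<fieldType name="string_exact" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/> <!-- whole value as one token -->
    <filter class="solr.LowerCaseFilterFactory"/>      <!-- optional: case-insensitive exact match -->
  </analyzer>
</fieldType>
<field name="title_exact" type="string_exact" indexed="true" stored="false"/>
<copyField source="title" dest="title_exact"/>
```

Quoted queries can then be routed to `title_exact` while ordinary queries hit the tokenized `title` field.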

Re: Solr using all available CPU and becoming unresponsive

2021-01-12 Thread Charlie Hull
amount for OS page cache) 2. disable swap, if you can (this is esp. important if using network storage as swap). There are potential downsides to this (so proceed with caution); but if part of your heap gets swapped out (and it almost certainly will, with a sufficiently large heap) full GCs lead to a

Re: Solr using all available CPU and becoming unresponsive

2021-01-12 Thread Michael Gibney
mount for OS page cache) 2. disable swap, if you can (this is esp. important if using network storage as swap). There are potential downsides to this (so proceed with caution); but if part of your heap gets swapped out (and it almost certainly will, with a sufficiently large heap) full GCs lead to a

Re: Solr using all available CPU and becoming unresponsive

2021-01-12 Thread Jeremy Smith
e query section (and maybe the StopFilterFactory from the index section as well)? Thanks again, Jeremy From: Michael Gibney Sent: Monday, January 11, 2021 8:30 PM To: solr-user@lucene.apache.org Subject: Re: Solr using all available CPU and becoming unresponsive

Re: Solr using all available CPU and becoming unresponsive

2021-01-11 Thread Michael Gibney
For the filterCache, we have tried sizes as low as 128, which caused our CPU usage to go up and didn't solve our issue. autowarmCount used to be much higher, but we have reduced it to try to address this issue. The behavior we see: Solr i

Solr using all available CPU and becoming unresponsive

2021-01-11 Thread Jeremy Smith
filterCache, we have tried sizes as low as 128, which caused our CPU usage to go up and didn't solve our issue. autowarmCount used to be much higher, but we have reduced it to try to address this issue. The behavior we see: Solr is normally using ~3-6GB of heap and we usually have

Re: Possible bug on LTR when using solr 8.6.3 - index out of bounds DisiPriorityQueue.add(DisiPriorityQueue.java:102)

2021-01-06 Thread Florin Babes
…racting-features [2] https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.6.3/solr/contrib/ltr/src/java/org/apache/solr/ltr/feature/SolrFeature.java#L243 [3] https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.6.3/solr/contrib/ltr/src/java/org/apache/

Re:Possible bug on LTR when using solr 8.6.3 - index out of bounds DisiPriorityQueue.add(DisiPriorityQueue.java:102)

2021-01-05 Thread Christine Poerschke (BLOOMBERG/ LONDON)
.java#L520-L525 From: solr-user@lucene.apache.org At: 01/04/21 17:31:44To: solr-user@lucene.apache.org Subject: Possible bug on LTR when using solr 8.6.3 - index out of bounds DisiPriorityQueue.add(DisiPriorityQueue.java:102) Hello, We are trying to update Solr from 8.3.1 to 8.6.3. On Solr 8.

Possible bug on LTR when using solr 8.6.3 - index out of bounds DisiPriorityQueue.add(DisiPriorityQueue.java:102)

2021-01-04 Thread Florin Babes
Hello, We are trying to update Solr from 8.3.1 to 8.6.3. On Solr 8.3.1 we are using LTR in production using a MultipleAdditiveTrees model. On Solr 8.6.3 we receive an error when we try to compute some SolrFeatures. We didn't find any pattern of the queries that fail. Example: We have the foll

Suggester using up memory

2020-11-20 Thread Nick Vercammen
Hey, We have a problem on one of our installations with the suggestComponent. The index has about 16 million documents and contains a "Global" field which contains the data of multiple other fields. This "Global" field is used to build up the suggestions. A short time after starting Solr it is k

Re: Using fromIndex for single collection

2020-11-19 Thread Jason Gerlowski
Hi Irina, Yes, the "fromIndex" parameter can be used to perform a join from the host collection to a separate, single-shard collection in SolrCloud. If specified, this "fromIndex" collection must be present on whichever host is processing the request. (Often this involves over-replicating your "f
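A cross-collection join of the kind Jason describes uses the join query parser's fromIndex parameter. The collection and field names below are hypothetical, purely to show the shape:

```text
# Query the "products" collection, keeping only documents whose id appears
# in matching documents of the single-shard "prices" collection
# (hypothetical collection and field names).
q={!join fromIndex=prices from=product_id to=id}discounted:true
```

As noted in the reply, the fromIndex collection must be co-located (effectively replicated) on every node that can process the request.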

RE: Using Multiple collections with streaming expressions

2020-11-12 Thread ufuk yılmaz
Many thanks for the info Joel --ufuk Sent from Mail for Windows 10 From: Joel Bernstein Sent: 12 November 2020 17:00 To: solr-user@lucene.apache.org Subject: Re: Using Multiple collections with streaming expressions T

Re: Using Multiple collections with streaming expressions

2020-11-12 Thread Joel Bernstein
From: Erick Erickson Sent: 10 November 2020 16:48 To: solr-user@lucene.apache.org Subject: Re: Using Multiple collections with streaming expressions

RE: Using Multiple collections with streaming expressions

2020-11-10 Thread ufuk yılmaz
16:48 To: solr-user@lucene.apache.org Subject: Re: Using Multiple collections with streaming expressions Y

Re: Using Multiple collections with streaming expressions

2020-11-10 Thread Erick Erickson
You need to open multiple streams, one to each collection then combine them. For instance, open a significantTerms stream to collection1, another to collection2 and wrap both in a merge stream. Best, Erick > On Nov 9, 2020, at 1:58 PM, ufuk yılmaz wrote: > > For example the streaming expressi
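Erick's suggestion can be sketched as a single streaming expression. The collection names, query, and sort key below are illustrative assumptions; merge() requires both inner streams to be sorted on the key given in its on parameter:

```text
merge(
  significantTerms(collection1, q="body:Solr", field="author", limit="50", minDocFreq="10"),
  significantTerms(collection2, q="body:Solr", field="author", limit="50", minDocFreq="10"),
  on="score desc"
)
```

Each significantTerms stream is opened against one collection, and merge interleaves the two result streams by score.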

Using Multiple collections with streaming expressions

2020-11-09 Thread ufuk yılmaz
For example the streaming expression significantTerms: https://lucene.apache.org/solr/guide/8_4/stream-source-reference.html#significantterms significantTerms(collection1, q="body:Solr", field="author", limit="50", minDocFreq="1

solr-exporter using string arrays - 2

2020-11-03 Thread Maximilian Renner
Sorry for the bad format of the first mail, once again: Hello there, while playing around with the https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml I found a bug when trying to use string arrays like 'facet.field': Exception

solr-exporter using string arrays

2020-11-03 Thread Maximilian Renner
Hello there, while playing around with the https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml I found a bug when trying to use string arrays like 'facet.field': … test … /select … {!EX=PUBLICATION}PUBLICAT

Using fromIndex for single collection

2020-10-07 Thread Irina Kamalova
I suppose my question is very simple. Am I right that if I want to use joins in the single collection in SolrCloud across several shards, I need to use semantic "fromIndex"? According to documentation I should use it only if I have different collections. I have one single collection across multiple

Re: Help using Noggit for streaming JSON data

2020-10-07 Thread Christopher Schultz
Yonik, Thanks for the reply, and apologies for the long delay in this reply. Also apologies for top-posting, I'm writing from my phone. :( Oh, of course... simply subclass the CharArr. In my case, I should be able to immediately base64-decode the value (saves 1/4 in-memory representation) and,

Re: Daylight savings time issue using NOW in Solr 6.1.0

2020-10-06 Thread Bernd Fehling
Hi, because you are using solr.in.cmd I guess you are using Windows OS. I don't know much about Solr and Windows but you can check your Windows, Jetty and Solr time by looking at your solr-8983-console.log file after starting Solr. First the timestamp of the file itself, then the timestamp o

RE: Using streaming expressions with shards filter

2020-10-06 Thread Gael Jourdan-Weil
Thanks Joel. I will try it in the future if I still need it (for now I went for another solution that fits my needs). Gaël

Daylight savings time issue using NOW in Solr 6.1.0

2020-10-06 Thread vishal patel
Hi I am using Solr 6.1.0. My SOLR_TIMEZONE=UTC in solr.in.cmd. My current Solr server machine time zone is also UTC. My one collection has below one field in schema. Suppose my current Solr server machine time is 2020-10-01 10:00:00.000. I have one document in that collection and in that

Re: Using streaming expressions with shards filter

2020-10-06 Thread Joel Bernstein
…I expected to be able to use the "shards" parameter like on a regular query on "/select" for instance, but this appears to not work or I don't know how to do it. Is this somehow a feature/restriction of Streaming Expressions?

Re: Using streaming expressions with shards filter

2020-10-06 Thread Joel Bernstein
…expressions? Or am I missing something? Note that the Streaming Expression I use is actually using the "/export" request handler. Example of the streaming expression: curl -X POST -v --data-urlencode 'expr=search(myCollection,q="*:*",fl="

Re: Daylight savings time issue using NOW in Solr 6.1.0

2020-10-04 Thread vishal patel
Hello, Can anyone help me? Regards, Vishal From: vishal patel Sent: Thursday, October 1, 2020 4:51 PM To: solr-user@lucene.apache.org Subject: Daylight savings time issue using NOW in Solr 6.1.0 Hi I am usin

Using streaming expressions with shards filter

2020-10-01 Thread Gael Jourdan-Weil
Is this somehow a feature/restriction of Streaming expressions? Or am I missing something? Note that the Streaming Expression I use is actually using the "/export" request handler. Example of the streaming expression: curl -X POST -v --data-urlencode 'expr=search(myCollection,q

Daylight savings time issue using NOW in Solr 6.1.0

2020-10-01 Thread vishal patel
Hi I am using Solr 6.1.0. My SOLR_TIMEZONE=UTC in solr.in.cmd. My current Solr server machine time zone is also UTC. My one collection has below one field in schema. Suppose my current Solr server machine time is 2020-10-01 10:00:00.000. I have one document in that collection and in that

Using Autoscaling Simulation Framework to simulate a lost node in a cluster

2020-09-21 Thread Howard Gonzalez
Hello folks, has anyone tried to use the autoscaling simulation framework to simulate a lost node in a Solr cluster? I was trying to do the following: 1. Take a current production cluster state snapshot using bin/solr autoscaling -save 2. Modify the clusterstate and livenodes json files in

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Steven White
…apache.org/solr/guide/8_6/update-request-processors.html and see the extensive list of processors you can leverage. The specific mentioned one is this one: https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/StatelessScr

Re: Handling failure when adding docs to Solr using SolrJ

2020-09-17 Thread Erick Erickson
I recommend _against_ issuing explicit commits from the client, let your solrconfig.xml autocommit settings take care of it. Make sure either your soft or hard commits open a new searcher for the docs to be searchable. I’ll bend a little bit if you can _guarantee_ that you only ever have one index
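Erick's advice, letting solrconfig.xml drive commits instead of the client, corresponds to settings like these (the interval values are illustrative, not from the thread):

```xml
<!-- solrconfig.xml: rely on autocommit instead of client-side commits -->
<autoCommit>
  <maxTime>60000</maxTime>          <!-- hard commit every 60s for durability -->
  <openSearcher>false</openSearcher> <!-- do not open a searcher on hard commit -->
</autoCommit>
<autoSoftCommit>
  <maxTime>5000</maxTime>           <!-- soft commit every 5s makes docs searchable -->
</autoSoftCommit>
```

With openSearcher=false on the hard commit, the soft commit is what opens new searchers, which is the "open a new searcher for the docs to be searchable" condition Erick mentions.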

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Erick Erickson
>> You can read all about it at: >> https://lucene.apache.org/solr/guide/8_6/update-request-processors.html and >> see the extensive list of processors you can leverage. The specific >> mentioned one is this one: >> https://lucene.apache.org/solr/8_6_0//solr-core/org/apache

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Walter Underwood
specific > mentioned one is this one: > https://lucene.apache.org/solr/8_6_0//solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html > > Just a word of warning that Stateless URP is using Javascript, which is > getting a bit of a complicated story as underlying

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Alexandre Rafalovitch
/update/processor/StatelessScriptUpdateProcessorFactory.html Just a word of warning that Stateless URP is using Javascript, which is getting a bit of a complicated story as underlying JVM is upgraded (Oracle dropped their javascript engine in JDK 14). So if one of the simpler URPs will do the job or a

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Steven White
…might save you some grief down the line. For instance, I've seen designs where instead of field1:some_value field2:other_value… you use a single field with _tokens_ like: field:field1_some_va

Handling failure when adding docs to Solr using SolrJ

2020-09-17 Thread Steven White
Hi everyone, I'm trying to figure out when and how I should handle failures that may occur during indexing. In the sample code below, look at my comment and let me know what state my index is in when things fail: SolrClient solrClient = new HttpSolrClient.Builder(url).build(); solrClient.

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Erick Erickson
…field1:some_value field2:other_value… you use a single field with _tokens_ like: field:field1_some_value field:field2_other_value that drops the complexity and increases performance. Anyway, just a th

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Steven White
…increases performance. Anyway, just a thought you might want to consider. Best, Erick On Sep 16, 2020, at 9:31 PM, Steven White wrote: > Hi everyone, > I figured it out. It is as simple as creating a List and using

Re: Help using Noggit for streaming JSON data

2020-09-17 Thread Yonik Seeley
See this method: /** Reads a JSON string into the output, decoding any escaped characters. */ public void getString(CharArr output) throws IOException And then the idea is to create a subclass of CharArr to incrementally handle the string that is written to it. You could overload write method

Help using Noggit for streaming JSON data

2020-09-17 Thread Christopher Schultz
All, Is this an appropriate forum for asking questions about how to use Noggit? The Github doesn't have any discussions available and filing an "issue" to ask a question is kinda silly. I'm happy to be redirected to the right place if this isn't appropriate. I've been able to figure out most thin

Re: Doing what <copyField> does using SolrJ API

2020-09-17 Thread Erick Erickson
…Anyway, just a thought you might want to consider. Best, Erick > On Sep 16, 2020, at 9:31 PM, Steven White wrote: > Hi everyone, > I figured it out. It is as simple as creating a List and using that as the value part for the SolrInputDocument.addField() API. > Thanks

Re: Doing what <copyField> does using SolrJ API

2020-09-16 Thread Steven White
Hi everyone, I figured it out. It is as simple as creating a List and using that as the value part for the SolrInputDocument.addField() API. Thanks, Steven On Wed, Sep 16, 2020 at 9:13 PM Steven White wrote: > Hi everyone, > > I want to avoid creating a <copyField source="OneFieldOfMany"

Doing what <copyField> does using SolrJ API

2020-09-16 Thread Steven White
Hi everyone, I want to avoid creating a <copyField> in my schema (there will be over 1000 of them and maybe more, so managing it will be a pain). Instead, I want to use the SolrJ API to do what <copyField> does. Any example of how I can do this? If there is an example online, that would be great. Thanks in advance. Ste

NullPointerException in IndexSearcher.explain() when using ComplexPhraseQueryParser

2020-09-09 Thread Michał Słomkowski
Hello, I get NPE when I use IndexSearcher.explain(). Checked with Lucene 8.6.0 and 8.6.2. The query: (lorem AND NOT "dolor lorem") OR ipsum The text: dolor lorem ipsum Stack trace: > java.lang.NullPointerException > at java.util.Objects.requireNonNull(Objects.java:203) > at org.apach

Retrieving Parent and Child Documents using the Block Join Query Technique when the Child and Parent Document have an identical field

2020-09-08 Thread Nagaraj S
Hi Solr Team, I am trying to retrieve the parent document using the Block Join Parent Query Parser (q={!parent which=allParents}someChildren), but the filter condition I gave has the same field in both the parent and the child document, so the parser is throwing the error: "

HEY, are you using the Analytics contrib?

2020-09-03 Thread David Smiley
I wonder who is using the Analytics contrib? Why do you use it instead of other Solr features like the JSON Faceting module that seem to have competing functionality. My motivation is to ascertain if it ought to be maintained as a 3rd party plugin/package or remain as a 1st party contrib where

RE: Using Solr's zkcli.sh

2020-09-02 Thread Victor Kretzer
Vincent -- Your suggestion worked perfectly. After using chmod I'm now able to use the zkcli script. Thank you so much for the quick save. Victor Victor Kretzer Sitecore Developer Application Services GDC IT Solutions Office: 717-262-2080 ext. 151 www.gdcitsolutions.com -Ori

Re: Using Solr's zkcli.sh

2020-09-02 Thread Vincent Brehin
…other commands, including zkcli. So you should first launch "sudo chmod a+x server/scripts/cloud-scripts/zkcli.sh", then you should be able to use the command. Let us know! Vincent On Tue, Sep 1, 2020 at 23:35, Victor Kretzer wrote: > Thank you in advance. This is my first time us

Using Solr's zkcli.sh

2020-09-01 Thread Victor Kretzer
Thank you in advance. This is my first time using a mailing list like this so hopefully I am doing so correctly. I am attempting to setup SolrCloud (Solr 6.6.6) and an external zookeeper ensemble on Azure. I have three dedicated to the zookeeper ensemble and two for solr all running Ubuntu

Re: PDF extraction using Tika

2020-08-26 Thread Walter Underwood
…Thanks, Joe D. On 25/08/2020 10:54, Charlie Hull wrote: On 25/08/2020 06:04, Srinivas Kashyap wrote: Hi Alexandre, Yes, these are the same PDF files running in windows a

RE: [EXT] Re: PDF extraction using Tika

2020-08-26 Thread Hanjan, Harinderdeep S.
) and if one is not responding, move on to the next one. This will also allow you to easily incorporate using multiple PDF extraction tools, should Tika fail on a PDF. The way this would work is something like this: - Your code sees a PDF - It sends the PDF to Tika Server - Tika Server parses the PDF

Re: PDF extraction using Tika

2020-08-26 Thread Jan Høydahl
…On 25/08/2020 10:54, Charlie Hull wrote: On 25/08/2020 06:04, Srinivas Kashyap wrote: Hi Alexandre, Yes, these are the same PDF files running in windows and linux. There are around 30 pdf files and I t

Re: PDF extraction using Tika

2020-08-26 Thread Charlie Hull
…Alexandre, Yes, these are the same PDF files running in windows and linux. There are around 30 pdf files and I tried indexing a single file, but faced the same error. Is it related to how the PDF is stored in linux? Did you try running Tika (the same version as you're using in Solr) standalone on the file as

RE: PDF extraction using Tika

2020-08-25 Thread Srinivas Kashyap
Thanks Phil, I will modify it according to the need. Thanks, Srinivas -Original Message- From: Phil Scadden Sent: 26 August 2020 02:44 To: solr-user@lucene.apache.org Subject: RE: PDF extraction using Tika Code for solrj is going to be very dependent on your needs but the beating

RE: PDF extraction using Tika

2020-08-25 Thread Phil Scadden
Admin", password); UpdateResponse ur = req.process(solr,"prindex"); req.commit(solr, "prindex"); -Original Message- From: Srinivas Kashyap Sent: Tuesday, 25 August 2020 17:04 To: solr-user@lucene.apache.org Subject: RE: PDF extraction usi

Re: PDF extraction using Tika

2020-08-25 Thread Joe Doupnik
Alexandre, Yes, these are the same PDF files running in windows and linux. There are around 30 pdf files and I tried indexing single file, but faced same error. Is it related to how PDF stored in linux? Did you try running Tika (the same version as you're using in Solr) standalone on the fi

Re: PDF extraction using Tika

2020-08-25 Thread Charlie Hull
as you're using in Solr) standalone on the file as Alexandre suggested? And with regard to DIH and TIKA going away, can you share if any program which extracts from PDF and pushes into solr? https://lucidworks.com/post/indexing-with-solrj/ is one example. You should run Tika separate

RE: PDF extraction using Tika

2020-08-24 Thread Srinivas Kashyap
from PDF and pushes into solr? Thanks, Srinivas Kashyap -Original Message- From: Alexandre Rafalovitch Sent: 24 August 2020 20:54 To: solr-user Subject: Re: PDF extraction using Tika The issue seems to be more with a specific file and at the level way below Solr's or possibly

Re: How to Write Autoscaling Policy changes to Zookeeper/SolrCloud using the autoscaling Java API

2020-08-24 Thread Howard Gonzalez
Good morning! To add more context on the question, I can successfully use the Java API to build the list of new Clauses. However, the problem that I have is that I don't know how to "write" those changes back to solr using the Java API. I see there's a writeMap method

Re: PDF extraction using Tika

2020-08-24 Thread Alexandre Rafalovitch
Solr, possibly pre-processed. On Mon, 24 Aug 2020 at 11:09, Srinivas Kashyap wrote: > > Hello, > > We are using TikaEntityProcessor to extract the content out of PDF and make > the content searchable. > > When jetty is run on windows based machine, we are able to successful

PDF extraction using Tika

2020-08-24 Thread Srinivas Kashyap
Hello, We are using TikaEntityProcessor to extract the content out of PDFs and make the content searchable. When jetty is run on a windows-based machine, we are able to successfully load documents using the full-import DIH (tika entity). Here the PDFs are maintained in the windows file system. But

How to Write Autoscaling Policy changes to Zookeeper/SolrCloud using the autoscaling Java API

2020-08-21 Thread Howard Gonzalez
Hello. I am trying to use the autoscaling Java API to write some cluster policy changes to a Zookeeper/SolrCloud cluster. However, I can't find the right way to do it. I can get all the autoscaling cluster policy clauses using: autoScalingConfig.getPolicy.getClusterPolicy However,

Re: Manipulating client's query using a Query object

2020-08-17 Thread Erick Erickson
…original input string. In there you have access to that too. Regards, Markus -Original message- > From: Edward Turner > Sent: Monday 17th August 2020 21:25 > To: solr-user@lucene.apache.org > Su

Re: Manipulating client's query using a Query object

2020-08-17 Thread Edward Turner
…string. In there you have access to that too. Regards, Markus -Original message- > From: Edward Turner > Sent: Monday 17th August 2020 21:25 > To: solr-user@lucene.apache.org > Subject: Re: Manipulating client's query using a Quer

RE: Manipulating client's query using a Query object

2020-08-17 Thread Markus Jelsma
Dismax has some variable (I think it was qstr) that contains the original input string. In there you have access to that too. Regards, Markus -Original message- > From: Edward Turner > Sent: Monday 17th August 2020 21:25 > To: solr-user@lucene.apache.org > Subject: Re: Manipula

Re: Manipulating client's query using a Query object

2020-08-17 Thread Edward Turner
Hi Markus, That's really great info. Thank you. Supposing we've now modified the Query object, do you know how we would get the corresponding query String, which we could then forward to our Solrcloud via SolrClient? (Or should we be using this extended ExtendedDisMaxQParser class s

RE: Manipulating client's query using a Query object

2020-08-17 Thread Markus Jelsma
…message- > From: Edward Turner > Sent: Monday 17th August 2020 15:53 > To: solr-user@lucene.apache.org > Subject: Manipulating client's query using a Query object > Hi all, > Thanks for all your help recently. We're now using the edismax query parser > and are happy wi

Manipulating client's query using a Query object

2020-08-17 Thread Edward Turner
Hi all, Thanks for all your help recently. We're now using the edismax query parser and are happy with its behaviour. We have another question which maybe someone can help with. We have one use case where we optimise our query before sending it to Solr, and we do this by manipulatin

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Bram Van Dam
On 11/08/2020 13:15, Erick Erickson wrote: > CDCR is being deprecated. so I wouldn’t suggest it for the long term. Ah yes, thanks for pointing that out. That makes Dominique's alternative less attractive. I guess I'll stick to my original proposal! Thanks Erick :-) - Bram

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Dominique Bejean
…Been reading up about the various ways of creating backups. The whole "shared filesystem for Solrcloud backups" thing is kind of a no-go in our environment, so I've been looking for ways around that, an

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Erick Erickson
…kind of a no-go in our environment, so I've been looking for ways around that, and here's what I've come up with so far: 1. Stop applications from writing to solr 2. Commit everything

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Dominique Bejean
…backups" thing is kind of a no-go in our environment, so I've been looking for ways around that, and here's what I've come up with so far: 1. Stop applications from writing to solr 2. Commit every

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Erick Erickson
1. Stop applications from writing to solr 2. Commit everything 3. Identify a single core for each shard in each collection 4. Snapshot that core using CREATESNAPSHOT in the Collections API 5. Once complete, re-enable appl

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-11 Thread Dominique Bejean
…Stop applications from writing to solr 2. Commit everything 3. Identify a single core for each shard in each collection 4. Snapshot that core using CREATESNAPSHOT in the Collections API 5. Once complete, re-enable application write access to Solr 6

Re: Backups in SolrCloud using snapshots of individual cores?

2020-08-10 Thread Ashwin Ramesh
…I've been looking for ways around that, and here's what I've come up with so far: 1. Stop applications from writing to solr 2. Commit everything 3. Identify a single core for each shard in each collection 4. Snapshot that core using CR

Backups in SolrCloud using snapshots of individual cores?

2020-08-06 Thread Bram Van Dam
…Stop applications from writing to solr 2. Commit everything 3. Identify a single core for each shard in each collection 4. Snapshot that core using CREATESNAPSHOT in the Collections API 5. Once complete, re-enable application write access to Solr 6. Create a backup from these snapshots using the replication

Re: Querying solr using many QueryParser in one call

2020-07-20 Thread Charlie Hull
…performance with strategies like the caching you describe. Charlie On 16/07/2020 18:14, harjag...@gmail.com wrote: Hi All, Below are questions regarding querying solr using many QueryParsers in one call. We need to do a search by keyword and also include a few specific documents in the result. We do

How do I use dismax or edismax to rank using 60% tf-idf and 40% a numeric field?

2020-07-16 Thread Russell Jurney
Hello Solarians, I know how to boost a query and I see the methods for tf and idf in streaming scripting. What I don’t know is how to incorporate these things together at a specific percentage of the ranking function. How do I write a query to use dismax or edismax to rank using 60% tf-idf score
