Thanks Shawn!
Can any of the committers comment about the CDCR error that I posted above?
Thanks
Jay
On Fri, Oct 25, 2019 at 2:56 PM Shawn Heisey wrote:
> On 10/25/2019 3:22 PM, Jay Potharaju wrote:
> > Is there a solr slack channel?
>
> People with @apache.org email addresses can readily joi
On 10/25/2019 2:30 PM, rhys J wrote:
So I went back to one of the fields that is multi-valued, which I
explicitly did not choose when I created the field, and I re-created it.
It still created the field with multiValued set to true.
Why is this?
Did you reload the core/collection or restart Solr so the
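If it helps, a rough sketch of forcing a field to be single-valued through the Schema API and then reloading so the change takes effect (collection and field names here are placeholders, and documents indexed while the field was multiValued still need reindexing). Note that in some recent default configsets the text_general fieldType itself declares multiValued="true", so a field can inherit it unless the field definition overrides it:

  # Redefine the field with multiValued=false (placeholder names; adjust to your schema).
  curl -X POST -H 'Content-type:application/json' \
    "http://localhost:8983/solr/mycollection/schema" \
    --data-binary '{
      "replace-field": {
        "name": "myfield",
        "type": "text_general",
        "multiValued": false
      }
    }'

  # Reload so the schema change is picked up (use the CoreAdmin RELOAD action in standalone mode).
  curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection"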
On 10/25/2019 3:22 PM, Jay Potharaju wrote:
Is there a solr slack channel?
People with @apache.org email addresses can readily join the ASF
workspace, I do not know whether it is possible for others. That
workspace might be only for ASF members.
https://the-asf.slack.com
In that workspace
Is there a solr slack channel?
Thanks
Jay Potharaju
On Fri, Oct 25, 2019 at 9:00 AM Jay Potharaju wrote:
> Hi,
> I am frequently seeing cdcr-replicator null pointer exception errors in
> the logs.
> Any suggestions on how to address this?
> *Solr version: 7.7.2*
>
> ExecutorUtil
> Uncaught exc
>
>
> > "dl2":["Great Plains"],
> > "do_not_call":false,
>
> There are no hashes inside the document. If there were, they would be
> surrounded by {} characters. The whole document is a hash, which is why
> it has {} characters. Referring to the snippet that I included above,
On 10/25/2019 1:48 PM, rhys J wrote:
Is there some reason that text_general fields are returned as arrays, and
other fields are returned as hashes in the json response from a curl query?
Here's the response:
"dl2":["Great Plains"],
"do_not_call":false,
There are no h
Is there some reason that text_general fields are returned as arrays, and
other fields are returned as hashes in the json response from a curl query?
Here's my curl query:
curl "http://10.40.10.14:8983/solr/dbtr/select?indent=on&q=debtor_id:393291";
Here's the response:
response":{"numFound":1,
Also, the OpenNLP Solr POS tagger [1] uses the TypeAsSynonymFilter to
store the POS:
"Index the POS for each token as a synonym, after prefixing the POS with @"
Not sure how to deal with the POS after such indexing, but this looks like
an interesting approach?
[1]
http://lucene.apache.org/solr/guide/7_3
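For anyone curious, a rough sketch of what that analyzer chain can look like when added through the Schema API. The field type name and collection are placeholders, and the OpenNLP model files (en-sent.bin, en-token.bin, en-pos-maxent.bin) have to be present in the configset:

  # Add a field type that indexes each token's part of speech as an @-prefixed synonym.
  curl -X POST -H 'Content-type:application/json' \
    "http://localhost:8983/solr/mycollection/schema" \
    --data-binary '{
      "add-field-type": {
        "name": "text_pos",
        "class": "solr.TextField",
        "positionIncrementGap": "100",
        "analyzer": {
          "tokenizer": {
            "class": "solr.OpenNLPTokenizerFactory",
            "sentenceModel": "en-sent.bin",
            "tokenizerModel": "en-token.bin"
          },
          "filters": [
            {"class": "solr.OpenNLPPOSFilterFactory", "posTaggerModel": "en-pos-maxent.bin"},
            {"class": "solr.TypeAsSynonymFilterFactory", "prefix": "@"},
            {"class": "solr.LowerCaseFilterFactory"}
          ]
        }
      }
    }'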
Just to stir the pot on this topic, here is an article about why and how to use
Tika inside of Solr:
https://opensourceconnections.com/blog/2019/10/24/it-s-okay-to-run-tika-inside-of-solr-if-and-only-if/
> On Oct 23, 2019, at 7:21 PM, Erick Erickson wrote:
>
> Here’s a blog about why and how t
Yeah, my mistake in the explanation. But it really does improve relevance
in the returned documents.
> On Oct 25, 2019, at 12:39 PM, Audrey Lorberfeld - audrey.lorberf...@ibm.com
> wrote:
>
> Oh I see I see
>
> --
> Audrey Lorberfeld
> Data Scientist, w3 Search
> IBM
> audrey.lorberf...
Oh I see I see
--
Audrey Lorberfeld
Data Scientist, w3 Search
IBM
audrey.lorberf...@ibm.com
On 10/25/19, 12:21 PM, "David Hastings" wrote:
oh i see what you mean, sorry, i explained it incorrectly.
those sentences are what would be in the index, and a general search for
'rush
How can a field itself be tagged with a part of speech?
--
Audrey Lorberfeld
Data Scientist, w3 Search
IBM
audrey.lorberf...@ibm.com
On 10/25/19, 12:12 PM, "David Hastings" wrote:
nope, i boost the fields already tagged at query time against the query
On Fri, Oct 25, 2019 at 12
> Do you use the POS tagger at query time, or just at index time?
I have the POS tagger pipeline ready but nothing done yet on the Solr
part. Right now I am wondering how to use it, but I am still looking for a
relevant implementation.
I guess having the POS information ready before indexing gives the
oh i see what you mean, sorry, i explained it incorrectly.
those sentences are what would be in the index, and a general search for
'rush limbaugh' would come back with results where he is an entity higher
than if it was two words in a sentence
On Fri, Oct 25, 2019 at 12:12 PM David Hastings <
ha
nope, i boost the fields already tagged at query time against the query
On Fri, Oct 25, 2019 at 12:11 PM Audrey Lorberfeld -
audrey.lorberf...@ibm.com wrote:
> So then you do run your POS tagger at query-time, Dave?
>
> --
> Audrey Lorberfeld
> Data Scientist, w3 Search
> IBM
> audrey.lorberf...
So then you do run your POS tagger at query-time, Dave?
--
Audrey Lorberfeld
Data Scientist, w3 Search
IBM
audrey.lorberf...@ibm.com
On 10/25/19, 12:06 PM, "David Hastings" wrote:
I use them for query boosting, so if someone searches for:
i dont want to rush limbaugh out the do
Nicolas,
Do you use the POS tagger at query time, or just at index time?
We are thinking of using it to filter the tokens we will eventually perform ML
on. Basically, we have a bunch of acronyms in our corpus. However, many
departments use the same acronyms but expand those acronyms to differe
I use them for query boosting, so if someone searches for:
i dont want to rush limbaugh out the door
vs
i talked to rush limbaugh through the door
my documents where 'rush limbaugh' is a known entity (noun) and a person
(look at the sentence, it's obviously a person and the NLP finds that) have
'r
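A sketch of the kind of query-time boost being described, using edismax. The field name entities_person is hypothetical and stands for whatever field the NLP pre-processing fills at index time:

  # Score matches in the pre-tagged entity field higher than plain text matches.
  curl "http://localhost:8983/solr/mycollection/select" \
    --data-urlencode "q=rush limbaugh" \
    --data-urlencode "defType=edismax" \
    --data-urlencode "qf=text entities_person^5"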
Hi,
I am frequently seeing cdcr-replicator null pointer exception errors in the
logs.
Any suggestions on how to address this?
*Solr version: 7.7.2*
ExecutorUtil
Uncaught exception java.lang.NullPointerException thrown by thread:
cdcr-replicator-773-thread-3
java.lang.Exception: Submitter stack tra
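Not a fix for the NPE itself, but when chasing cdcr-replicator errors it can help to look at what the CDCR API on the source collection reports (collection name is a placeholder):

  # Current CDCR state (process started/stopped, buffer enabled/disabled).
  curl "http://localhost:8983/solr/mycollection/cdcr?action=STATUS"

  # Queue sizes per target cluster, and recent/consecutive replication errors.
  curl "http://localhost:8983/solr/mycollection/cdcr?action=QUEUES"
  curl "http://localhost:8983/solr/mycollection/cdcr?action=ERRORS"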
Also, we are using the Stanford POS tagger for French. The processing time is
mitigated by the spark-corenlp package, which distributes the process over
multiple nodes.
I am also interested in the way you use POS information within Solr
queries, or Solr fields.
Thanks,
On Fri, Oct 25, 2019 at 10:42:43A
How can I group my Solr query results using a numeric field into x buckets,
where the bucket start and end values are determined when the query is run?
For example, if I want to count and group documents into 5 buckets by a
wordCount field, the results should be:
250-500 words: 3438 results
500-75
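A sketch of one way to do this with range faceting; Solr counts the documents per bucket, but the caller supplies start/end/gap (the collection name is a placeholder, the field name is from the question):

  # Count documents in 250-word-wide buckets of the wordCount field.
  curl "http://localhost:8983/solr/mycollection/select" \
    --data-urlencode "q=*:*" \
    --data-urlencode "rows=0" \
    --data-urlencode "facet=true" \
    --data-urlencode "facet.range=wordCount" \
    --data-urlencode "facet.range.start=250" \
    --data-urlencode "facet.range.end=1500" \
    --data-urlencode "facet.range.gap=250"

If the boundaries really have to be derived at query time (say, 5 equal-width buckets between the observed min and max), one approach is to fetch min/max first with the stats component or a JSON facet and compute the gap on the client before issuing the faceted query.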
ah, yeah, it's not the fastest but it proved to be the best for my purposes.
I use it to pre-process data before indexing, to apply more metadata to the
documents in separate field(s)
On Fri, Oct 25, 2019 at 10:40 AM Audrey Lorberfeld -
audrey.lorberf...@ibm.com wrote:
> No, I meant for part-of-
No, I meant for part-of-speech tagging. But it's interesting that you use
StanfordNLP. I've read that it's very slow, so we are concerned that it might
not work for us at query-time. Do you use it at query-time, or just index-time?
--
Audrey Lorberfeld
Data Scientist, w3 Search
IBM
audrey.l
https://nlp.stanford.edu/
On Fri, Oct 25, 2019 at 10:29 AM David Hastings <
hastings.recurs...@gmail.com> wrote:
> Do you mean for entity extraction?
> I make a LOT of use from the stanford nlp project, and get out the
> entities and use them for different purposes in solr
> -Dave
>
> On Fri, Oct
Do you mean for entity extraction?
I make a LOT of use of the Stanford NLP project, and get out the entities
and use them for different purposes in Solr
-Dave
On Fri, Oct 25, 2019 at 10:16 AM Audrey Lorberfeld -
audrey.lorberf...@ibm.com wrote:
> Hi All,
>
> Does anyone use a POS tagger with t
Hi All,
Does anyone use a POS tagger with their Solr instance other than OpenNLP’s? We
are considering OpenNLP, SpaCy, and Watson.
Thanks!
--
Audrey Lorberfeld
Data Scientist, w3 Search
IBM
audrey.lorberf...@ibm.com
I’m also surprised that you see a slowdown; it’s worth investigating.
Let’s take the NRT case with only a leader. I’ve seen the NRT indexing time
increase when even a single follower was added (30-40% in this case). We
believed that the issue was the time the leader sat waiting around for the
fo
If you _are_ using SolrCloud, you can use the collections API SPLITSHARD
command.
> On Oct 25, 2019, at 7:37 AM, Shawn Heisey wrote:
>
> On 10/24/2019 11:19 PM, Hafiz Muhammad Shafiq wrote:
>> Hi,
>> I am using Solr 6.x for search purposes. Now the data has
>> increased in one shard.
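For reference, a sketch of the SPLITSHARD call; the collection and shard names are placeholders, and async is optional but useful for large shards:

  # Split shard1 of "mycollection" into two sub-shards, running asynchronously.
  curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1&async=split1"

  # Poll the async request until it reports completed.
  curl "http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=split1"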
On 25.10.2019 at 14:54, Shawn Heisey wrote:
> With newer Solr versions, you can ask SolrCloud to prefer PULL replicas
> for querying, so queries will be targeted to those replicas, unless they
> all go down, in which case it will go to non-preferred replica types. I
> do not know how to do this,
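For what it's worth, on recent versions (7.4 and later) this kind of preference is usually expressed with the shards.preference request parameter; a sketch:

  # Ask SolrCloud to route this query to PULL replicas whenever any are available.
  curl "http://localhost:8983/solr/mycollection/select?q=*:*&shards.preference=replica.type:PULL"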
Shawn,
So, I understand that while a non-leader TLOG replica is copying the index from
the leader, the leader stops indexing.
One-shot large heavy bulk indexing should be much more impacted than
continuous light indexing.
Regards.
Dominique
On Fri, Oct 25, 2019 at 1:54 PM, Shawn Heisey wrote:
> On 10/25
On 10/25/2019 5:44 AM, Danilo Tomasoni wrote:
Another question: is softCommit sufficient to ensure visibility, or
should I call a commit to ensure a new searcher will be opened?
Does softCommit automatically open a new searcher?
There would be little point to doing a soft commit with openSearcher
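A sketch of one way to set this up through the Config API instead of editing solrconfig.xml by hand; the times below are illustrative, not recommendations:

  # Hard commit for durability without opening a searcher; the soft commit controls visibility.
  curl -X POST -H 'Content-type:application/json' \
    "http://localhost:8983/solr/mycollection/config" \
    --data-binary '{
      "set-property": {
        "updateHandler.autoCommit.maxTime": 60000,
        "updateHandler.autoCommit.openSearcher": false,
        "updateHandler.autoSoftCommit.maxTime": 120000
      }
    }'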
On 10/25/2019 1:16 AM, Dominique Bejean wrote:
For collection created with all replicas as NRT
* Indexing time : 22 minutes
For collection created with all replicas as TLOG
* Indexing time : 34 minutes
NRT indexes simultaneously on all replicas. So when indexing is done on
one, it is a
Thank you all for your suggestions.
Now I changed my import strategy to ensure that the same document will
eventually be updated by different "batches";
this way I need a single programmatic softCommit at the end of each
batch.
Configuration-side, I enabled autoCommit with openSearcher=f
Hello Erick/Emir
Thanks for your valuable suggestions. I will keep it in mind while doing
such operations.
Best,
Shubham
On Wed, Oct 23, 2019 at 5:56 PM Erick Erickson
wrote:
> Really, just don’t do this. Please. As others have pointed out, it may
> look like it works, but it won’t. I’ve spent
On 10/24/2019 11:19 PM, Hafiz Muhammad Shafiq wrote:
Hi,
I am using Solr 6.x for search purposes. Now the data has
increased in one shard. I have to create some additional shards and also
have to rebalance based on the number of documents. From my searching, Solr
does not provide rebalan
Hi Russell,
I've noticed a few differences between the solr8 schema and the solr6 one: a few
omitNorms params are missing, and a few solr.FlattenGraphFilterFactory entries
are missing too.
But perhaps the most important difference between 6 and 8 is the memory
configuration.
solr 6 has
SOLR_HEAP="27158m"
SOLR_JAVA_MEM="-Xms27
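For comparison, the relevant solr.in.sh lines look roughly like this (the value is the one quoted above, not a recommendation). In the standard start script SOLR_HEAP normally takes precedence, so setting SOLR_JAVA_MEM as well is usually redundant:

  # solr.in.sh (illustrative)
  SOLR_HEAP="27158m"        # the start script expands this to -Xms27158m -Xmx27158m
  # SOLR_JAVA_MEM="..."     # expert override; typically left unset when SOLR_HEAP is used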
Hi Jörn,
I am using version 8.2.
I repeated the test twice for each mode.
I restarted the Solr nodes and deleted/created an empty collection each time.
Regards.
Dominique
On Fri, Oct 25, 2019 at 9:20 AM, Jörn Franke wrote:
> Which Solr version are you using and how often did you repeat the test?
>
Which Solr version are you using and how often did you repeat the test?
> On 25.10.2019 at 09:16, Dominique Bejean wrote:
>
> Hi,
>
> I made some benchmarks for bulk indexing in order to compare performance
> and resource usage for NRT versus TLOG replicas.
>
> Environment:
> * Solrcloud wi
Hi,
I made some benchmarks for bulk indexing in order to compare performance
and resource usage for NRT versus TLOG replicas.
Environment:
* SolrCloud with 4 Solr nodes (8 GB RAM, 4 GB heap)
* 1 collection with 2 shards x 2 replicas (all NRT or all TLOG)
* 1 core per Solr Server
Indexing of a
Hi,
I am using Solr 6.x for search purposes. Now the data has
increased in one shard. I have to create some additional shards and also
have to rebalance based on the number of documents. From my searching, Solr
does not provide a rebalance API. Is that correct? How can I do this?
On Fri, O