Hi,
Is it possible in any way to get the first value in a multivalued field? Using
function queries, streaming expressions or any other way without reindexing?
(Stream decorators have array(), but no way to get a value at a specific index?)
Another one, is it possible to match a regex to a text
ore shards in the query are idle
> beyond the timeout threshold. This happens because lots of data is being
> read from other shards.
>
> Breaking the query into small parts would be a good strategy.
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
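A minimal sketch of the "small parts" idea from the quoted reply, assuming the long-running query can be restricted by a numeric range. The field name `timestamp_l` and the bounds are hypothetical; each generated filter would be run as its own smaller query.

```python
def split_range(start, end, parts):
    """Split the half-open range [start, end) into at most `parts` consecutive sub-ranges."""
    step = -(-(end - start) // parts)  # ceiling division
    bounds = []
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        bounds.append((lo, hi))
        lo = hi
    return bounds


def range_filters(field, start, end, parts):
    """Build one Solr range query per sub-range (inclusive lower, exclusive upper bound)."""
    return [f"{field}:[{lo} TO {hi}}}" for lo, hi in split_range(start, end, parts)]


# `timestamp_l` is a hypothetical field name used only for illustration.
for fq in range_filters("timestamp_l", 0, 10, 3):
    print(fq)
```

Using an exclusive upper bound keeps neighbouring parts from overlapping, so no document is processed twice.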
I changed the tokenizer class from KeywordTokenizerFactory to
WhitespaceTokenizerFactory for the query analyzer using the Schema API, and that
seems to have solved the problem.
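A rough sketch of what such a Schema API payload could look like, with separate `indexAnalyzer` and `queryAnalyzer` entries. The field type name and the `sortMissingLast`/`omitNorms` attributes come from the messages in this thread; the lower-case filter is an assumption (suggested only by the `string_ci` name), and the real field type may carry other attributes.

```python
import json


def build_replace_field_type(name, index_tokenizer, query_tokenizer):
    """Build a replace-field-type command with different index and query analyzers."""
    # Assumption: a lower-case filter on both sides; adjust to the real definition.
    filters = [{"class": "solr.LowerCaseFilterFactory"}]
    return {
        "replace-field-type": {
            "name": name,
            "class": "solr.TextField",
            "sortMissingLast": True,
            "omitNorms": True,
            "indexAnalyzer": {
                "tokenizer": {"class": index_tokenizer},
                "filters": filters,
            },
            "queryAnalyzer": {
                "tokenizer": {"class": query_tokenizer},
                "filters": filters,
            },
        }
    }


payload = build_replace_field_type(
    "string_ci",
    "solr.KeywordTokenizerFactory",
    "solr.WhitespaceTokenizerFactory",
)
print(json.dumps(payload, indent=2))
```

The printed JSON would be POSTed to the collection's schema endpoint, for example http://localhost:8983/solr/gettingstarted/schema. Because replace-field-type expects the full definition, every attribute that should survive has to be repeated in the payload.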
Sent from Mail for Windows 10
From: ufuk yılmaz
Sent: 02 March 2021 20:47
To: solr-user@lucene.apache.org
Subject: Default
"class":"solr.KeywordTokenizerFactory" }}}
}' http://localhost:8983/solr/gettingstarted/schema
So, indexAnalyzer/queryAnalyzer, rather than array:
https://lucene.apache.org/solr/guide/8_8/schema-api.html#add-a-new-field-type
Hope this works,
Alex.
P.s. Also check wh
I’m using the following example on Lucidworks to use streaming expressions from
SolrJ:
https://lucidworks.com/post/streaming-expressions-in-solrj/
Problem is, when I run it inside a for loop, even the simplest expression
(echo) stops executing after about 5 iterations. I thought the underlying
Hello,
I’m trying to change a field’s query analysers. The following works but it
replaces both index and query type analysers:
{
"replace-field-type": {
"name": "string_ci",
"class": "solr.TextField",
"sortMissingLast": true,
"omitNorms": true,
"store
Hello all,
From the Solr 8.4 (my version) documentation:
“The OR operator is the default conjunction operator. This means that if there
is no Boolean operator between two terms, the OR operator is used. To search
for documents that contain either "jakarta apache" or just "jakarta," use the
qu
e that specifically suppresses these
> errors without backporting the full Solr 9.0 changes which impact the
> memory footprint of export.
>
>
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Mon, Mar 1, 2021 at 10:29 AM ufuk yılmaz
> wrote:
>
>
Hello all,
I’m running a large streaming expression and feeding the result to an update
expression.
update(targetCollection, ...long running stream here...,
I tried sending the exact same query multiple times. It sometimes works and
indexes some results, then gives an exception; other times it fails
"text": "abc"
},
{
"EOF": true,
"RESPONSE_TIME": 70
}
]
}
}
--ufuk yilmaz
From: ufuk yılmaz
Sent: 26 February 2021 16:38
To: solr-user@lucene.apache.org
Subject: Select streaming expression, add
Hello all,
Solr version 8.4
I have a very simple select expression here. What I’m trying to do is to add a
constant value to incoming tuples.
My collection has only 1 document. Id_str is of type String. Other fields are
Solr generated.
{
"_version_":1692761378187640832,
"id_st
I have a type=”text_general” multivalued=”true” field, named fieldA.
When I use a function query, with fields like
fields=if(true, fieldA, -1), fieldA
Response is:
"response":{"numFound":1,"start":0,"maxScore":4.6553917,"docs":[
{
"fieldA":["SomeMixedCaseValue"],
"if(true,fiel
erties and schema
This list strips attachments so you'll have to figure out another way to
show the difference,
Cheers
Charlie
On 16/02/2021 15:16, ufuk yılmaz wrote:
>
> There’s a collection at our customer’s site giving weird exceptions
> when a particular field is involved
There’s a collection at our customer’s site giving weird exceptions when a
particular field is involved (asked another question detailing that).
When I inspected it, there’s only one difference between it and the dozens of
other collections that work fine, which is:
A text_general field in all othe
We have a SolrCloud cluster, version 8.4
At the customer’s site there’s a collection with very few documents, around 12.
We usually have collections with hundreds of millions of documents, so that
collection is a bit of an exception.
When I send a significantTerms streaming expression it immedi
Is it because the main place for q&a is this mailing list, or somewhere else
that I don’t know?
Or is Solr not as ‘hot’ as some other topics?
When I have a copyfield directive like,
k the situation you describe (high-cardinality
> field, known low-cardinality for the particular domain) sounds like a
> perfect use-case for dvhash.
>
> Michael
>
> On Fri, Feb 5, 2021 at 11:56 AM ufuk yılmaz
> wrote:
>
>> Hello,
>>
>> I’m using
Hello,
I’m using Solr 8.4. Very excited about performance improvements in 8.8:
http://joelsolr.blogspot.com/2021/01/optimizations-coming-to-solr.html
As I understand it, the main determinant of performance and RAM usage of a terms
facet is the cardinality of the field in the whole collection, but not the
should be for one collection. This will be where the
collection is compiled and run. It has no effect on what is actually being
searched. That is specified in the expressions themselves.
Joel Bernstein
http://joelsolr.blogspot.com/
On Wed, Jan 20, 2021 at 1:34 PM ufuk yılmaz
wrote:
> Do collect
Should I create a Java project with a dependency on SolrJ or Solr core,
implement the Expressible interface,
then build my project as a jar and put it on the classpath of each SolrCloud
node?
Or should I take a completely different route?
Many thanks
~ufuk
It’s asking for a searchscale.com email address?
From: Ishan Chattopadhyaya
Sent: 26 January 2021 13:33
To: solr-user
Subject: Re: Solr Slack Workspace
There is a Slack backed by official IRC support. Please see
https://lucene.472066.n3.nabble.com/Solr-Users-Slack-t
Solr version 8.4. I’m getting an unexplained NullPointerException when
executing a simple 2-level nodes stream. Do you have any idea what may cause
this?
I tried setting /stream?partialResults=true&shards.tolerant=true and
shards.tolerant=true in nodes expressions, with no luck. I also tried
I looked at the source code of the parallel stream, and it seems the workers
parameter needs to equal the number of SHARDS. I thought it had to match the
number of replicas, but it was shards.
Maybe this helps someone.
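As an illustration only, here is a small sketch that builds a parallel expression whose workers value is taken from the shard count, following the observation above (which I have not verified beyond this thread). The collection names, field names and shard count are placeholders.

```python
def parallel_expression(worker_collection, inner, workers, sort):
    """Wrap an inner streaming expression in parallel() with the given worker count."""
    return (
        f"parallel({worker_collection},\n"
        f"         {inner},\n"
        f'         workers="{workers}",\n'
        f'         sort="{sort}")'
    )


# Assumed number of shards in the worker collection (placeholder value).
shard_count = 2

# The inner stream partitions tuples across workers with partitionKeys.
inner = (
    'search(collection1, q="*:*", fl="id", sort="id asc", '
    'qt="/export", partitionKeys="id")'
)

expr = parallel_expression("workerCollection", inner, shard_count, "id asc")
print(expr)
```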
From: ufuk yılmaz
Sent: 21 January 2021 11:16
To: solr-user
It only works when I set workers to 1, which defeats the point of parallel.
From: ufuk yılmaz
Sent: 21 January 2021 11:16
To: solr-user@lucene.apache.org
Subject: Parallel streaming expression java.lang.IndexOutOfBoundsException
Hello all,
https
Hello all,
https://lucene.apache.org/solr/guide/8_4/stream-decorator-reference.html#parallel
I’m sending the same query as in the docs (with just the collection names
changed) to my Solr, but I always get the exception:
{
"result-set":{
"docs":[{
"EXCEPTION":"java.lang.IndexOutOfBoundsExcep
Do collection names in request url affect how the query works in any way?
A streaming expression is sent to http://mySolrHost/solr/col1,col2/stream
(notice multiple collections in url)
Col1 has 2 shards, each has 3 replicas.
* Shard1 has replicas on nodes A, B, C
* Shard2 has replicas on D,E,F
I’ve been trying to learn all I can about Solr for a year now, and I still
scratch my head when it comes to the effects of shards and replicas on performance.
- info about my setup
We have a SolrCloud setup with 6 nodes.
Each collection has 2 shards and 2 replicas. 1 shard’s size is about 100GB.
Each
When I perform a long running streaming expression, sometimes I get:
{
"error": {
"metadata": [
"error-class",
"org.apache.solr.common.SolrException",
"root-error-class",
"java.net.SocketTimeoutException"
],
"msg": "Error
Hi,
A while ago I asked the same thing here. Looking at the JavaScript source code
of the frontend app, I saw a 10k millisecond timeout config in httpInterceptor
inside app.js. I changed it to something much larger and results of long
queries began to show.
Hope it helps
Should I stop indexing new documents, or stop indexing and wait for collections
to recover?
Recently our disk got 100% full and Solr started to throw various errors. So I
deleted some unnecessary documents and committed with expungeDeletes=true. It
freed some space but many collections went int
-api.html#rename
On Thu, Jan 7, 2021 at 2:07 PM ufuk yılmaz wrote:
>
> Hi again,
>
> Let’s say I have a collection named A.
> I’m trying to rename it to A_1, then create an alias named A, which points to
> the A_1 collection.
> Is this possible without deleting and reindexin
Hi again,
Let’s say I have a collection named A.
I’m trying to rename it to A_1, then create an alias named A, which points to
the A_1 collection.
Is this possible without deleting and reindexing the collection from scratch?
Regards,
uyilmaz
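For the alias step only, a small sketch of building the Collections API CREATEALIAS request. It assumes a collection named A_1 already exists and says nothing about whether the rename itself can be done in place; the base URL is a placeholder.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collection):
    """Build a Collections API URL that points `alias` at `collection`."""
    params = {"action": "CREATEALIAS", "name": alias, "collections": collection}
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder host; replace with the address of a real Solr node.
url = create_alias_url("http://localhost:8983/solr", "A", "A_1")
print(url)
```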
Hello all,
I have been looking at our SolrCloud indexing performance statistics and trying
to make sense of the numbers. We are using a custom Flume sink and sending
updates to Solr (8.4) using SolrJ.
I know these things depend on a lot of factors, but can you tell me if these
statistics are horr
Hello All,
Is there a way to see currently executing queries in a SolrCloud? Or a general
strategy to detect a query using an absurd amount of resources?
We are using Solr not only for simple querying but also for running complex streaming
expressions, facets with large data etc. Sometimes, randomly, CPU
Hello all,
I have a plong field in my schema representing a Unix timestamp.
I’m doing a range facet over this field to find which event occurred on which
day. I’m setting “start” to some date at 00:00, “end” to another, and
setting gap to 86400 (total seconds in a day).
...
"type": "range"
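A sketch of how the start, end and gap values for such a daily range facet could be computed, assuming the timestamps are in seconds and days are cut at UTC midnight. The field name `event_ts` and the sample timestamps are made up for illustration.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def day_start(ts):
    """Round a Unix timestamp (in seconds) down to 00:00 UTC of the same day."""
    return ts - ts % SECONDS_PER_DAY


def build_range_facet(field, start_ts, end_ts):
    """Build a JSON Facet API range facet with one bucket per UTC day."""
    return {
        "type": "range",
        "field": field,
        "start": day_start(start_ts),
        "end": day_start(end_ts),
        "gap": SECONDS_PER_DAY,
    }


# `event_ts` is a hypothetical field name; the timestamps are arbitrary samples.
facet = build_range_facet("event_ts", 1614717000, 1614976200)
print(facet)
print(datetime.fromtimestamp(facet["start"], tz=timezone.utc).isoformat())
```

If the field stores milliseconds instead of seconds, both the rounding and the gap would need to be multiplied by 1000, and a local-time day boundary would need an explicit offset.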
Hi everyone,
Yesterday I was comparing term+range facet counts from two different collections
with exactly the same data and schema. The only difference is that one collection has 2
shards, the other 1. After searching about this I came upon an article:
medium.com
My results were like this:
Counts from coll
Hello all,
Documentation states “Fields are copied before analysis is done, meaning you
can have two fields with identical original content, but which use different
analysis chains and are stored in the index differently.”
I have a field definition for a case insensitive string which I use for
Fetch would work for my specific case (since I’m working with ids there’s no
one to many), if I were able to restrict fetch’s target domain with a query. I
would first get all possible deleted ids, then use fetch to the items
collection. But then the current fetch implementation would find all d
Hi all,
I’m looking for a way to query two collections and find documents that exist in
both, I know this can be done with innerJoin streaming expression but I want to
avoid it, since one of the collection streams can possibly have billions of
results:
Let’s say two collections are:
deletedIt
Hey,
Can anyone give me an example of how eval
https://lucene.apache.org/solr/guide/8_4/stream-decorator-reference.html#eval
can be used?
The docs say it allows running streaming expressions that are created on the fly, but
I can’t wrap my head around how an expression can be created on the fly, maybe
u
Many thanks for the info Joel
--ufuk
From: Joel Bernstein
Sent: 12 November 2020 17:00
To: solr-user@lucene.apache.org
Subject: Re: Using Multiple collections with streaming expressions
T
Thanks again Erick, that’s a good idea!
Alternatively, I use an alias covering multiple collections in these
situations, but there may be too many combinations of collections, so it’s not
always suitable.
Merged significantTerms streams will have meaningless scores in tuples I think,
it would b
For example the streaming expression significantTerms:
https://lucene.apache.org/solr/guide/8_4/stream-source-reference.html#significantterms
significantTerms(collection1,
q="body:Solr",
field="author",
limit="50",
minDocFreq="1