Re: The _version_ field; why is it necessary?

2023-10-18 Thread David Smiley
Thank you both.  It helps to know that "_version_" is for, I would say
succinctly, "NRT replication".  I mean, that deserves to be said internally
in some places!
Might it be advantageous to imagine it being optional for non-NRT
replicas?  I'm not sure if it saves anything or reduces complexity anywhere.
Related question:  Is the VersionInfo (with its striped VersionBucket
locks) related to this -- is it a vestige of "_version_" or is it for
something else?  If it isn't for something else, then I could imagine it
being omitted for non-NRT; maybe a dummy implementation.  BTW Bruno opened
an issue/PR on it yesterday --
https://issues.apache.org/jira/browse/SOLR-17036

~ David


On Wed, Oct 18, 2023 at 1:41 AM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> FYI, SOLR-5944 is unreadable, but it introduced the concept of a previous
> version or something like that.
>
> On Wed, 18 Oct, 2023, 10:35 am Mark Miller wrote:
>
> > The primary reason is as Ishan says - so that update reorders from leader
> > to replica can be handled in both normal and failure cases.
> >
> > It’s also true that a part of the reason that the per document, NRT
> design,
> > with versions, was chosen was a desire to support per document optimistic
> > concurrency.
> >
> > On Tue, Oct 17, 2023 at 11:37 PM Ishan Chattopadhyaya <
> > ichattopadhy...@gmail.com> wrote:
> >
> > > Also DBQs use the version field to ensure they are applied correctly,
> > even
> > > if a DBQ is reordered
> > >
> > > On Wed, 18 Oct, 2023, 10:05 am Ishan Chattopadhyaya, <
> > > ichattopadhy...@gmail.com> wrote:
> > >
> > > > To ensure reordered updates are processed properly from leader to
> other
> > > > replicas in NRT replication mode.
> > > >
> > > > On Wed, 18 Oct, 2023, 9:55 am David Smiley, 
> > wrote:
> > > >
> > > >> Question: Does the _version_ field have a purpose other than for
> > "atomic
> > > >> updates"?
> > > >> I know SolrCloud and/or having an UpdateLog insists on it.  But I
> > don't
> > > >> know if it's for that feature alone, or for additional non-obvious
> > > >> internal
> > > >> workings of SolrCloud.  Mostly I'm just asking to have a deeper
> > > >> understanding; the field doesn't bother me.  If someone knows of any
> > > docs
> > > >> on it or old interesting JIRAs to read, I'd appreciate it.
> > > >>
> > > >> ~ David Smiley
> > > >> Apache Lucene/Solr Search Developer
> > > >> http://www.linkedin.com/in/davidwsmiley
> > > >>
> > > >
> > >
> >
>
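
The reordering case described in the quoted replies can be illustrated with a minimal sketch. It is a simplified model and not Solr's actual update path: the `Update` and `Replica` classes, the in-memory dictionaries, and the plain increasing integer versions are all assumptions made for illustration. The only idea taken from the thread is that a replica can use the leader-assigned version to recognise an update that arrived out of order.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Update:
    doc_id: str
    version: int          # assumed: assigned by the leader, larger means newer
    body: Optional[dict]  # None models a delete-by-id in this sketch


@dataclass
class Replica:
    # Hypothetical in-memory stand-in for a replica's index.
    docs: Dict[str, dict] = field(default_factory=dict)
    versions: Dict[str, int] = field(default_factory=dict)

    def apply(self, update: Update) -> bool:
        """Apply the update unless a newer one was already applied."""
        seen = self.versions.get(update.doc_id, 0)
        if update.version <= seen:
            return False  # stale or duplicate: it arrived out of order, drop it
        self.versions[update.doc_id] = update.version
        if update.body is None:
            self.docs.pop(update.doc_id, None)
        else:
            self.docs[update.doc_id] = update.body
        return True


replica = Replica()
# The leader assigned versions 1 then 2, but version 2 was delivered first.
replica.apply(Update("doc1", 2, {"title": "new"}))
applied = replica.apply(Update("doc1", 1, {"title": "old"}))
```

In this model the late update with version 1 is rejected, so the replica keeps the newer document. Without a per-document version the replica would have no way to tell which of the two writes should win.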


Re: Rerank / rewritten queries never get cached in queryResultCache?

2023-10-18 Thread Doug Turnbull
Created a Jira, and sent in a PR to LTR

https://issues.apache.org/jira/browse/SOLR-17037
https://github.com/apache/solr/pull/2022

🙏

On Tue, Oct 17, 2023 at 4:45 PM Doug Turnbull wrote:

> My google-fu got better -- FWIW This looks like an issue fixed in the
> normal rerank query code, but not in LTR, which is a different query parser
> / query path?
>
> https://issues.apache.org/jira/browse/SOLR-7689
>
> On Tue, Oct 17, 2023 at 4:03 PM Doug Turnbull wrote:
>
>> Hi all,
>>
>> I'm noticing an issue where rerank queries never appear to enter the Solr
>> queryResultCache
>>
>> You can notice this, as a user, by simply repeating requests to a node.
>> You'll get an instantaneous response repeating most Solr queries. However,
>> repeating anything with rq= added, each response takes about the same
>> amount of time.
>>
>> I dug into this some in a debugger, and created a test (in LTR) to
>> recreate this
>>
>>
>> https://github.com/apache/solr/compare/main...softwaredoug:solr:no-rerank-caching?expand=1
>>
>> We do a lookup using a NON rewritten version of the query, here
>>
>> https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1561
>>
>> Later, prior to executing the query, Lucene's search method performs a
>> rewrite:
>>
>>
>> https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java#L514
>>
>> This causes a rerank query (like LTR's) to rewrite its main query, saving
>> it internally:
>>
>>
>> https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/search/AbstractReRankQuery.java#L100
>>
>> Then this is what's put into the cache.
>>
>> So later on, when the same query comes in, in its non-rewritten form, it's
>> never seen in the cache.
>>
>> Am I missing something? It would appear that any query where its child
>> queries get rewritten would never get cached by Solr?
>>
>> Thanks
>> -Doug
>>
>
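
The lookup/insert mismatch described in the quoted message can be reproduced in a small sketch. It is a toy model only: `RerankQuery`, `search`, and the dictionary cache are hypothetical stand-ins, and the in-place `rewrite` follows the message's description of the rerank query saving its rewritten main query internally rather than any confirmed Solr or Lucene code.

```python
class RerankQuery:
    """Hypothetical stand-in for a rerank query wrapping a main query."""

    def __init__(self, main_query: str):
        self.main_query = main_query

    def rewrite(self) -> None:
        # Models the rewrite step replacing the main query in place.
        self.main_query = f"rewritten({self.main_query})"

    def __eq__(self, other):
        return isinstance(other, RerankQuery) and self.main_query == other.main_query

    def __hash__(self):
        return hash(self.main_query)


def search(cache: dict, query: RerankQuery) -> str:
    if query in cache:        # lookup uses the non-rewritten form
        return "hit"
    query.rewrite()           # the search rewrites the query before executing
    cache[query] = "results"  # insertion therefore uses the rewritten form
    return "miss"


cache = {}
first = search(cache, RerankQuery("title:solr"))
second = search(cache, RerankQuery("title:solr"))
```

Both calls miss in this model, because the key stored after the first call is the rewritten form and the second lookup presents the original form. Keying the cache on the pre-rewrite query (or keeping equality and hashing independent of the rewrite) would make the second call a hit.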


Re: Apache Solr Newsletter proposal September 2023

2023-10-18 Thread David Smiley
Sounds good Alejandro.  You are defining the protocol (with collaborative
input).  We have freedom to pretty much do as we feel is best.  It's not a
voted artifact the project/PMC produces (which would have rules /
ceremony).  If you want to "publish" the newsletter at a certain date &
time that is approaching, I recommend that you share a reminder of this a
couple days or so in advance so we could have a last chance to edit /
discuss.  Maybe by default the deadline is simply the last hour of the last
day of the month, and you might allow another day or two for you to clean
it up as you see fit?  If it's not substantial enough, tell us you have
made this decision and propose a new possible date so that we know.

I think actual publishing will be simple -- rename the title, provide a
date on the page, and send/publish it wherever you wish.

~ David


On Wed, Oct 18, 2023 at 2:39 AM Arrieta, Alejandro <
aarri...@perrinsoftware.com> wrote:

> Hello Team,
>
> > I'm not very comfortable over-publicizing what I view as a first draft of
> >the first edition.
> Sure, no problem. Nothing shared.
>
> >What is 10-20-30 about?
> The number of news/links available for that newsletter. We need to decide
> on the minimum news (non-relevant, like a new release) to publish the
> newsletter. If there is not enough news, we roll over them to next month.
>
> >At first you seemed super fixed on monthly, here you seem more
> >flexible.
> If we do not set an internal monthly release goal, it will continue to next
> month. But if there is not enough/relevant news for one month, nothing to
> do, move to next month. That is why I did not mention "monthly newsletter"
> in my draft/proposal, but instead, to start working on the October edition.
>
> > BTW I just created
> >https://cwiki.apache.org/confluence/display/SOLR/Newsletters as a parent
> >for the newsletters.
> Awesome. :-)
>
> We need to establish how a draft/proposal becomes official. Maybe there is
> already a default protocol, like for the release candidates.
> Send the draft/link to wiki to dev the first day of each month and wait YY
> hours for votes. If we do not have enough news for that month, inform the
> dev and move all news to next month.
>
> Note: I am just a member of the Solr community trying to help. I may not be
> familiar with all the protocols or politics involved. :-)
>
> Kind Regards,
> Alejandro Arrieta
>
> On Tue, Oct 17, 2023 at 7:49 PM David Smiley 
> wrote:
>
> > On Tue, Oct 17, 2023 at 1:01 PM Arrieta, Alejandro <
> > aarri...@perrinsoftware.com> wrote:
> >
> > >
> > > Editing:  I agree with David's comments.
> > > I would add only one thing. The September edition is finished/over
> unless
> > > something is wrong with it and needs to be fixed. Draft or not draft
> that
> > > is the September edition.  It is better to use our limited time
> preparing
> > > the October/November
> >
> > newsletter and make it better than the previous edition.
> > >
> >
> > This is more about periodicity (covered below) than it is about editing.
> >
> >
> > > I will share the September newsletter as Draft/Proposal on my social
> > media
> > > later today to make Solr noise :-)
> > >
> > > Publishing:
> > > Anywhere it can be indexed by the googles and bings, and
> PC/Tablet/Mobile
> > > can access it. Solr Wiki is fine.
> > > We need to make noise about it everywhere: LinkedIn, Twitter/X,
> Mastodon,
> > > mailing lists, ASF social media.
> > > Here is one example of a newsletter from another project:
> > > https://opensearch.org/blog/opensearch-newsletter-vol1-issue1/
> >
> >
> > There's something to be said for not dating a newsletter, as I see there
> in
> > the URL of that one.  Flexibility.
> >
> >
> > > Brian told me they can share on ASF social media accounts what we need
> to
> > > share, the newsletter, or a link to the newsletter. ASF publishes at
> > least
> > > once per one or two weeks project news on Linkedin. We need to send him
> > > what and when to share it.
> > >
> >
> > I'm not very comfortable over-publicizing what I view as a first draft of
> > the first edition.
> >
> > I propose that *EITHER* (A): more text be put into the September
> > newsletter, not to change the past, only to elaborate what happened in
> the
> > past :-).  Then share it widely.  In its current form, I think it's too
> > draft-like to be our first inaugural newsletter.  It will get attention;
> > it's our first one.  I want to impress people and not have Solr appear
> > second rate simply because you ran out of time Alejandro.  OR (B) let's
> > make October's newsletter be the awesome one and that which will be
> shared
> > widely.  Not September's newsletter.  That is what I would most prefer;
> we
> > can announce both a new Solr version and also take the opportunity to
> > highlight topics discussed at Community-over-Code.  WDYT?
> >
> >
> > > Periodicity:
> > > -If we have enough news/links to share on the release date (first three
> > > days of the month), the threshold

Re: The _version_ field; why is it necessary?

2023-10-18 Thread Houston Putman
I believe it's still useful for TLOG replicas as well. When they gain
leadership, they replay the TLOG, which could have the same issues that
non-leader NRT replicas have.

- Houston

On Wed, Oct 18, 2023 at 8:26 AM David Smiley wrote:

> Thank you both.  It helps to know that "_version_" is for, I would say
> succinctly, "NRT replication".  I mean, that deserves to be said internally
> in some places!
> Might it be advantageous to imagine it being optional for non-NRT
> replicas?  I'm not sure if it saves anything or reduces complexity
> anywhere.
> Related question:  Is the VersionInfo (with its striped VersionBucket
> locks) related to this -- is it a vestige of "_version_" or is it for
> something else?  If it isn't for something else, then I could imagine it
> being omitted for non-NRT; maybe a dummy implementation.  BTW Bruno opened
> an issue/PR on it yesterday --
> https://issues.apache.org/jira/browse/SOLR-17036
>
> ~ David
>
>
> On Wed, Oct 18, 2023 at 1:41 AM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
> > FYI, SOLR-5944 is unreadable, but it introduced the concept of a previous
> > version or something like that.
> >
> > On Wed, 18 Oct, 2023, 10:35 am Mark Miller, 
> wrote:
> >
> > > The primary reason is as Ishan says - so that update reorders from
> leader
> > > to replica can be handled in both normal and failure cases.
> > >
> > > It’s also true that a part of the reason that the per document, NRT
> > design,
> > > with versions, was chosen was a desire to support per document
> optimistic
> > > concurrency.
> > >
> > > On Tue, Oct 17, 2023 at 11:37 PM Ishan Chattopadhyaya <
> > > ichattopadhy...@gmail.com> wrote:
> > >
> > > > Also DBQs use the version field to ensure they are applied correctly,
> > > even
> > > > if a DBQ is reordered
> > > >
> > > > On Wed, 18 Oct, 2023, 10:05 am Ishan Chattopadhyaya, <
> > > > ichattopadhy...@gmail.com> wrote:
> > > >
> > > > > To ensure reordered updates are processed properly from leader to
> > other
> > > > > replicas in NRT replication mode.
> > > > >
> > > > > On Wed, 18 Oct, 2023, 9:55 am David Smiley, 
> > > wrote:
> > > > >
> > > > >> Question: Does the _version_ field have a purpose other than for
> > > "atomic
> > > > >> updates"?
> > > > >> I know SolrCloud and/or having an UpdateLog insists on it.  But I
> > > don't
> > > > >> know if it's for that feature alone, or for additional non-obvious
> > > > >> internal
> > > > >> workings of SolrCloud.  Mostly I'm just asking to have a deeper
> > > > >> understanding; the field doesn't bother me.  If someone knows of
> any
> > > > docs
> > > > >> on it or old interesting JIRAs to read, I'd appreciate it.
> > > > >>
> > > > >> ~ David Smiley
> > > > >> Apache Lucene/Solr Search Developer
> > > > >> http://www.linkedin.com/in/davidwsmiley
> > > > >>
> > > > >
> > > >
> > >
> >
>
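
The per-document optimistic concurrency mentioned in the quoted thread can be sketched as a version check performed before an update is accepted. The rules below follow the documented meaning of the `_version_` value sent with an update (greater than 1: must match exactly; 1: the document must exist; negative: the document must not exist; 0: no check). The function name and the use of `None` for a missing document are assumptions for illustration, not Solr's internal API.

```python
from typing import Optional


def version_check_passes(requested: int, current: Optional[int]) -> bool:
    """Decide whether an update carrying `requested` may proceed.

    `current` is the stored version of the document, or None if the
    document does not exist (a convention of this sketch).
    """
    if requested > 1:
        return current == requested  # must match the stored version exactly
    if requested == 1:
        return current is not None   # the document must already exist
    if requested < 0:
        return current is None       # the document must not exist
    return True                      # 0: no concurrency check is made


# A writer read version 5, but another writer has since produced version 6.
conflict = not version_check_passes(5, 6)
```

A failed check is what a client would see as a version conflict; the usual response is to re-read the document, reapply the change, and retry with the new version.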


Re: Apache Solr Newsletter proposal September 2023

2023-10-18 Thread Alessandro Benedetti
I like this initiative, and I love that we have a volunteer helping with
it. Is it clear what the concrete next steps are?
Should we summarise them so that the relevant people can act on them?
I'm a bit busy after the conference (and another coming) but I can try to
help if needed.

Cheers
--
*Alessandro Benedetti*
Director @ Sease Ltd.
*Apache Lucene/Solr Committer*
*Apache Solr PMC Member*

e-mail: a.benede...@sease.io


*Sease* - Information Retrieval Applied
Consulting | Training | Open Source

Website: Sease.io
LinkedIn | Twitter | Youtube | Github


On Wed, 18 Oct 2023 at 15:37, David Smiley wrote:

> Sounds good Alejandro.  You are defining the protocol (with collaborative
> input).  We have freedom to pretty much do as we feel is best.  It's not a
> voted artifact the project/PMC produces (which would have rules /
> ceremony).  If you want to "publish" the newsletter at a certain date &
> time that is approaching, I recommend that you share a reminder of this a
> couple days or so in advance so we could have a last chance to edit /
> discuss.  Maybe by default the deadline is simply the last hour of the last
> day of the month, and you might allow another day or two for you to clean
> it up as you see fit?  If it's not substantial enough, tell us you have
> made this decision and propose a new possible date so that we know.
>
> I think actual publishing will be simple -- rename the title, provide a
> date on the page, and send/publish it wherever you wish.
>
> ~ David
>
>
> On Wed, Oct 18, 2023 at 2:39 AM Arrieta, Alejandro <
> aarri...@perrinsoftware.com> wrote:
>
> > Hello Team,
> >
> > > I'm not very comfortable over-publicizing what I view as a first draft
> of
> > >the first edition.
> > Sure, no problem. Nothing shared.
> >
> > >What is 10-20-30 about?
> > The number of news/links available for that newsletter. We need to decide
> > on the minimum news (non-relevant, like a new release) to publish the
> > newsletter. If there is not enough news, we roll over them to next month.
> >
> > >At first you seemed super fixed on monthly, here you seem more
> > >flexible.
> > If we do not set an internal monthly release goal, it will continue to
> next
> > month. But if there is not enough/relevant news for one month, nothing to
> > do, move to next month. That is why I did not mention "monthly
> newsletter"
> > in my draft/proposal, but instead, to start working on the October
> edition.
> >
> > > BTW I just created
> > >https://cwiki.apache.org/confluence/display/SOLR/Newsletters as a
> parent
> > >for the newsletters.
> > Awesome. :-)
> >
> > We need to establish how a draft/proposal becomes official. Maybe there
> is
> > already a default protocol, like for the release candidates.
> > Send the draft/link to wiki to dev the first day of each month and wait
> YY
> > hours for votes. If we do not have enough news for that month, inform the
> > dev and move all news to next month.
> >
> > Note: I am just a member of the Solr community trying to help. I may not
> be
> > familiar with all the protocols or politics involved. :-)
> >
> > Kind Regards,
> > Alejandro Arrieta
> >
> > On Tue, Oct 17, 2023 at 7:49 PM David Smiley 
> > wrote:
> >
> > > On Tue, Oct 17, 2023 at 1:01 PM Arrieta, Alejandro <
> > > aarri...@perrinsoftware.com> wrote:
> > >
> > > >
> > > > Editing:  I agree with David's comments.
> > > > I would add only one thing. The September edition is finished/over
> > unless
> > > > something is wrong with it and needs to be fixed. Draft or not draft
> > that
> > > > is the September edition.  It is better to use our limited time
> > preparing
> > > > the October/November
> > >
> > > newsletter and make it better than the previous edition.
> > > >
> > >
> > > This is more about periodicity (covered below) than it is about
> editing.
> > >
> > >
> > > > I will share the September newsletter as Draft/Proposal on my social
> > > media
> > > > later today to make Solr noise :-)
> > > >
> > > > Publishing:
> > > > Anywhere it can be indexed by the googles and bings, and
> > PC/Tablet/Mobile
> > > > can access it. Solr Wiki is fine.
> > > > We need to make noise about it everywhere: LinkedIn, Twitter/X,
> > Mastodon,
> > > > mailing lists, ASF social media.
> > > > Here is one example of a newsletter from another project:
> > > > https://opensearch.org/blog/opensearch-newsletter-vol1-issue1/
> > >
> > >
> > > There's something to be said for not dating a newsletter, as I see
> there
> > in
> > > the URL of that one.  Flexibility.
> > >
> > >
> > > > Brian told me they can share on ASF social media accounts what we
> need
> > to
> > > > share, the newsletter, or a link to the newsletter. ASF publishes at
> > > least
> > > > once per one or two weeks project news on Linkedin. We need to send
> him
> 

How about using JDK 21 in the official docker image?

2023-10-18 Thread Tomasz Elendt
I noticed that JDK 21 LTS was released some time ago. Is there any reason why 
official docker images still use JDK 17?

I'm asking because I know there are some preview JDK features that Lucene 
utilizes and Solr enables them when it detects a newer version (e.g. 
SOLR-16500).

Does it make sense to switch now that there is a new LTS version?

Cheers,
Tomasz
-
To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
For additional commands, e-mail: dev-h...@solr.apache.org



Re: How about using JDK 21 in the official docker image?

2023-10-18 Thread Houston Putman
Might be a good thing to do for Solr 9.5.

We hit some issues when moving 9.0 to JDK 17, which caused some trouble
and weren't particularly easy to fix without causing some compatibility
issues.
However, we are better equipped now to do quick fixes in the dockerfile
that don't require a bug-fix release of Solr.

I say +1 and target for 9.5, just making sure the change is represented in
the Solr upgrade notes.

- Houston

On Wed, Oct 18, 2023 at 12:11 PM Tomasz Elendt wrote:

> I noticed that JDK 21 LTS was released some time ago. Is there any reason
> why official docker images still use JDK 17?
>
> I'm asking because I know there are some preview JDK features that Lucene
> utilizes and Solr enables them when it detects a newer version (e.g.
> SOLR-16500).
>
> Does it make sense to switch now that there is a new LTS version?
>
> Cheers,
> Tomasz
>
>


Re: How about using JDK 21 in the official docker image?

2023-10-18 Thread Tomasz Elendt
If that helps, I can build a custom image myself, upgrade a few clusters in
my company, and report back if there are any issues.

(Of course our deployment might not necessarily cover all edge cases.)

On Wed, Oct 18, 2023, 18:25 Houston Putman wrote:

> Might be a good thing to do for Solr 9.5.
>
> We hit some issues when moving 9.0 to JDK 17, which caused some trouble
> and weren't particularly easy to fix without causing some compatibility
> issues.
> However, we are better equipped now to do quick fixes in the dockerfile
> that don't require a bug-fix release of Solr.
>
> I say +1 and target for 9.5, just making sure the change is represented in
> the Solr upgrade notes.
>
> - Houston
>
> On Wed, Oct 18, 2023 at 12:11 PM Tomasz Elendt 
> wrote:
>
> > I noticed that JDK 21 LTS was released some time ago. Is there any reason
> > why official docker images still use JDK 17?
> >
> > I'm asking because I know there are some preview JDK features that Lucene
> > utilizes and Solr enables them when it detects a newer version (e.g.
> > SOLR-16500).
> >
> > Does it make sense to switch now that there is a new LTS version?
> >
> > Cheers,
> > Tomasz
> >
> >
>


Re: How about using JDK 21 in the official docker image?

2023-10-18 Thread Houston Putman
That would definitely help. Create a JIRA issue, then post your findings
after y'all have tried it out!

- Houston

On Wed, Oct 18, 2023 at 12:35 PM Tomasz Elendt wrote:

> If that helps, I can build a custom image myself, upgrade a few clusters in
> my company and report back if there are any issues.
>
> (Of course our deployment might not necessarily cover all edge cases.)
>
> On Wed, Oct 18, 2023, 18:25 Houston Putman  wrote:
>
> > Might be a good thing to do for Solr 9.5.
> >
> > We hit some issues when moving 9.0 to JDK 17, which caused some trouble
> > and weren't particularly easy to fix without causing some compatibility
> > issues.
> > However, we are better equipped now to do quick fixes in the dockerfile
> > that don't require a bug-fix release of Solr.
> >
> > I say +1 and target for 9.5, just making sure the change is represented
> in
> > the Solr upgrade notes.
> >
> > - Houston
> >
> > On Wed, Oct 18, 2023 at 12:11 PM Tomasz Elendt 
> > wrote:
> >
> > > I noticed that JDK 21 LTS was released some time ago. Is there any
> reason
> > > why official docker images still use JDK 17?
> > >
> > > I'm asking because I know there are some preview JDK features that
> Lucene
> > > utilizes and Solr enables them when it detects a newer version (e.g.
> > > SOLR-16500).
> > >
> > > Does it make sense to switch now that there is a new LTS version?
> > >
> > > Cheers,
> > > Tomasz
> > >
> > >
> >
>