Re: JAX-RS APIs in Solr

2021-12-05 Thread David Smiley
Just a simple +1 of support to modernization efforts in general.  It's
encouraging to see that Jason & Eric had some fun together on this.
Modernization, I think, helps with the fun of any open-source project, and
thus helps sustain and renew everyone's interest in
Solr.  If we/others feel we can't make fundamental changes, then I think
our interests (and that of contributors) will wane.  Personally, I really
enjoy refactoring codebases, even if it may not seem sexy to some.

I don't think we can rush into a SIP before more research/POC is tried.
It's too abstract at this stage.  We don't even know what framework to use
yet ;-)


Re: BadApple annotation removed from Lucene 10.x

2021-12-05 Thread David Smiley
Thanks for the heads-up!

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Dec 1, 2021 at 12:05 PM Adrien Grand  wrote:

> Hello Solr devs,
>
> This is a heads up that the BadApple annotation has been removed from
> Lucene 10.x since it wasn't used in Lucene. I know some Solr tests are
> using it, so you will need to add it on the Solr side when upgrading
> to Lucene 10. See https://issues.apache.org/jira/browse/LUCENE-10253.
>
> --
> Adrien
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> For additional commands, e-mail: dev-h...@solr.apache.org
>
>


Updated docker images for the Log4j CVE

2021-12-11 Thread David Smiley
A contributor to docker-solr has a straightforward patch[1] setting the
system property that remediates the Log4j2 vulnerability.  I plan to test
this, merge it, and publish these.  This will update all our images that
you can see here: https://hub.docker.com/_/solr (spans 5x thru 8x).  If
anyone has concerns that I shouldn't do this now then let me know.

[1] https://github.com/docker-solr/docker-solr/pull/396

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


[ANNOUNCEMENT] Solr's Docker images were updated to remediate a CVE

2021-12-12 Thread David Smiley
Apache Solr's Docker images were updated some hours ago with a simple
remediation to avoid the Log4j 2 vulnerability[1] that many of you are
becoming aware of -- Log4j 2 CVE-2021-44228.
Just a "docker pull solr:tagVersionYouUse"  (e.g. 8.11 or whatever) will
update it for you.  The remediation in these updated images was simply
setting a Java system property to disable this misfeature of Log4j 2.  If
you have your own custom Docker image, you can easily do likewise, e.g. by
customizing the command to run the image to have an additional argument[2]
(a common remediation for other affected images).  To have confidence that
this was done correctly, log into your Solr admin screen and see the "Args"
section and look for
"-Dlog4j2.formatMsgNoLookups=true".
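
For anyone who would rather check this from code than from the admin screen, here is a minimal Java sketch. It assumes it runs inside the same JVM as Solr (for example from a small diagnostic class on the classpath). The class name `Log4jMitigationCheck` and its method are illustrative only and are not part of Solr or Log4j.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical diagnostic helper; not part of Solr or Log4j. */
class Log4jMitigationCheck {

  /** The JVM argument the updated images set. */
  static final String FLAG = "-Dlog4j2.formatMsgNoLookups=true";

  /** Returns true if the given JVM arguments contain the mitigation flag verbatim. */
  static boolean hasMitigationFlag(List<String> jvmArgs) {
    return jvmArgs.contains(FLAG);
  }

  public static void main(String[] args) {
    // The arguments this JVM was started with.
    List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
    System.out.println("Mitigation flag present: " + hasMitigationFlag(jvmArgs));
  }
}
```

The check is a plain string match on the startup arguments, so it would miss a property set some other way (for example programmatically); treat it as a quick sanity check rather than proof.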

This is sufficient, but understand that vulnerability scanners will
continue to report that Solr's images are vulnerable because they can't
realistically know if Solr's configuration (e.g. via this system property)
defeats the problem.  It's possible the Solr project may retroactively
update these images in the future for this reason.

[1]
https://solr.apache.org/security.html#apache-solr-affected-by-apache-log4j-cve-2021-44228
[2] https://www.docker.com/blog/apache-log4j-2-cve-2021-44228/

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Risks of Log4j 2 with the Prometheus Exporter?

2021-12-12 Thread David Smiley
Just a simple question here -- does the Prometheus Exporter present a risk
for the Log4j 2 vulnerability?  It was added to the news page but
instinctively I don't see how an attacker might exploit it.  If it's not
expected to be a concern, I think we should state so in the news; no reason
to raise undue alarm bells.  Maybe we should remove it.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Lucene/Solr 8.11.1 release

2021-12-13 Thread David Smiley
Looks good; thanks Jan!

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Dec 13, 2021 at 9:34 AM Jan Høydahl  wrote:

> Including Lucene dev as well.
>
> As I see no Lucene level bug fixes for 8.11.1, I have prepared an "empty"
> release announcement:
> https://cwiki.apache.org/confluence/display/LUCENE/ReleaseNote8_11_1
> Please edit as you see fit.
>
> The Solr announcement is slightly updated, proof-read welcome
> https://cwiki.apache.org/confluence/display/SOLR/ReleaseNote8_11_1
>
> There are now 18 CHANGES entries for Solr:
> https://github.com/apache/lucene-solr/blob/branch_8_11/solr/CHANGES.txt
>
> Jan
>
> > 13. des. 2021 kl. 02:24 skrev Jan Høydahl :
> >
> > There seem to be no open blockers for 8.11.1, so I'll proceed with RC1
> soon.
> > Shout out if you want me to wait for a specific important bugfix.
> >
> > Please also review the Release Notes at
> https://cwiki.apache.org/confluence/display/SOLR/ReleaseNote8_11_1
> >
> > Jan
> >
> >> 8. des. 2021 kl. 02:48 skrev Timothy Potter :
> >>
> >> agreed! thanks for stepping up to be the RM Jan ;-)
> >>
> >> On Tue, Dec 7, 2021 at 6:05 PM Jan Høydahl 
> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Solr has 13 bug fixes lined up in branch_8_11 already. Lucene has no
> changes.
> >>> Now that Lucene 9.0 is out the door (congrats!), let's do the 8.11.1
> release.
> >>>
> >>> I volunteer as RM and propose a first RC on Monday 13th. That should
> allow some time to merge any last bug fixes both for Lucene and Solr.
> >>> Please feel free to backport bug fixes with your own judgement (no new
> features please). If you are uncertain whether a backport is "safe", please
> raise a question here.
> >>>
> >>> I'll re-enable Jenkins for branch_8_11 now. Commits that fix or
> @BadApples unstable tests highly appreciated.
> >>>
> >>> Jan
> >>>
> >>>
> >>>
> >>
> >>
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


Re: Risks of Log4j 2 with the Prometheus Exporter?

2021-12-13 Thread David Smiley
Correct.  I just reviewed occurrences of log.info, log.warn etc. and it's
all boring stuff that definitely doesn't take user input.

I'm going to remove this from the news in my PR:
https://github.com/apache/solr-site/pull/54

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Dec 13, 2021 at 7:07 PM Cassandra Targett 
wrote:

> Can someone explain why it’s no risk & can’t be exploited? Because it
> doesn’t take input?
> On Dec 12, 2021, 4:26 PM -0600, Uwe Schindler , wrote:
>
> +1
>
> I was wondering about this, too. It makes mitigation too complex. There is
> no risk in the exporter script. Just mention this as a single sentence.
>
> Possibly also add the sentence u declining the importance and why in my
> previous message on private list.
>
> Am 12. Dezember 2021 22:16:38 UTC schrieb David Smiley :
>
>>
>> Just a simple question here -- does the Prometheus Exporter present a
>> risk for the Log4j 2 vulnerability?  It was added to the news page but
>> instinctively I don't see how an attacker might exploit it.  If it's not
>> expected to be a concern, I think we should state so in the news; no reason
>> to raise undue alarm bells.  Maybe we should remove it.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
> --
> Uwe Schindler
> Achterdiek 19, 28357 Bremen
> https://www.thetaphi.de
>
>


Re: Risks of Log4j 2 with the Prometheus Exporter?

2021-12-13 Thread David Smiley
I created a new one actually: https://github.com/apache/solr-site/pull/55

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Dec 13, 2021 at 7:39 PM David Smiley  wrote:

> Correct.  I just reviewed occurrences of log.info, log.warn etc. and it's
> all boring stuff that definitely doesn't take user input.
>
> I'm going to remove this from the news in my PR:
> https://github.com/apache/solr-site/pull/54
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Mon, Dec 13, 2021 at 7:07 PM Cassandra Targett 
> wrote:
>
>> Can someone explain why it’s no risk & can’t be exploited? Because it
>> doesn’t take input?
>> On Dec 12, 2021, 4:26 PM -0600, Uwe Schindler , wrote:
>>
>> +1
>>
>> I was wondering about this, too. It makes mitigation too complex. There
>> is no risk in the exporter script. Just mention this as a single sentence.
>>
>> Possibly also add the sentence u declining the importance and why in my
>> previous message on private list.
>>
>> Am 12. Dezember 2021 22:16:38 UTC schrieb David Smiley <
>> dsmi...@apache.org>:
>>>
>>> Just a simple question here -- does the Prometheus Exporter present a
>>> risk for the Log4j 2 vulnerability?  It was added to the news page but
>>> instinctively I don't see how an attacker might exploit it.  If it's not
>>> expected to be a concern, I think we should state so in the news; no reason
>>> to raise undue alarm bells.  Maybe we should remove it.
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>> --
>> Uwe Schindler
>> Achterdiek 19, 28357 Bremen
>> https://www.thetaphi.de
>>
>>


Re: Inventory updates via join query and caches

2021-12-19 Thread David Smiley
I'm not sure there is a clean/simple solution to this specific problem.
But I could imagine a more general & simple feature that could solve this
scenario, with just a bit more work by the user.

Imagine an optional cache-key on ExtendedQuery auto-parsed, perhaps with
local-param "cacheKey".  It would wrap any Query with one having a special
equals & hashcode on this key.  Solr wouldn't parse the string for
this query so long as it can look it up in a special cache of these.  That
special cache would be a Map with weak values such that if it's
not used anymore (e.g. not in the filter cache), it would be GC'ed.  This
would be useful for expensive queries that might resolve from some
network location (e.g. access control filters that refer to data in
who-knows-where).  So that's useful on its own but doesn't solve your
conundrum.  Then, imagine some new request handler that allows you to
provide this key & query and have it perform a filter cache save,
overwriting whatever entry that may have been there.  You could even do
this in a newSearcher event on the inventory core, calling into the primary
product core.
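
To make the idea a little more concrete, here is a rough Java sketch of the two pieces: a wrapper whose equals/hashCode depend only on the key, and a registry that holds wrappers weakly. Everything here is hypothetical (`KeyedQuery`, `KeyedQueryRegistry`, `getOrParse` do not exist in Solr), and the generic type `Q` merely stands in for a parsed Lucene Query so the sketch stays self-contained.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical wrapper: equality depends only on the user-supplied cache key. */
final class KeyedQuery<Q> {
  final String cacheKey;
  final Q delegate; // stands in for the parsed query

  KeyedQuery(String cacheKey, Q delegate) {
    this.cacheKey = Objects.requireNonNull(cacheKey);
    this.delegate = delegate;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof KeyedQuery && ((KeyedQuery<?>) o).cacheKey.equals(cacheKey);
  }

  @Override
  public int hashCode() {
    return cacheKey.hashCode();
  }
}

/** Hypothetical registry mapping a key to a weakly referenced wrapper. */
final class KeyedQueryRegistry<Q> {
  private final Map<String, WeakReference<KeyedQuery<Q>>> byKey = new ConcurrentHashMap<>();

  /**
   * Returns the wrapper already registered for the key if something else
   * (e.g. a cache entry) still holds it strongly; otherwise runs the parser.
   * Not atomic, and cleared references are never purged; a real
   * implementation would need to handle both.
   */
  KeyedQuery<Q> getOrParse(String cacheKey, Supplier<Q> parser) {
    WeakReference<KeyedQuery<Q>> ref = byKey.get(cacheKey);
    KeyedQuery<Q> existing = (ref == null) ? null : ref.get();
    if (existing != null) {
      return existing;
    }
    KeyedQuery<Q> fresh = new KeyedQuery<>(cacheKey, parser.get());
    byKey.put(cacheKey, new WeakReference<>(fresh));
    return fresh;
  }
}
```

Because equality ignores the delegate, a handler that stores a new result under the same key would replace the old cache entry, which is the overwrite behaviour described above.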

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Dec 14, 2021 at 4:24 PM Mikhail Khludnev  wrote:

> Hello, Colleagues.
> I want to discuss one frequent usecase: inventory updates.
> Let's say we can't reindex docs when inventory numbers are updated. We can put
> inventory in separate index, and apply fq={!join ..
> fromIndex=inventory}left:(0 TO *]. Once it's cached in main index filter
> cache it gets a good response time. We can even shard main collection, but
> keep inventory single shard. Ok.
> The sad moment occurs when commit goes into inventory core, after searcher
> is refreshed it's going to be cache misses on those inventory queries, and
> many of them go into new inventory searcher. That's not good. I can think
> of two workarounds:
>  - relax {!join} equality regarding fromIndex timestamp, so for some time
> it will be outdated inventory, but it's ok. And then we need to somehow,
> evict, invalidate, regenerate inventory filter
>  - newSearcher listener in inventory core can introspect main core cache
> entries find {!join .. fromIndex=inventory}... regenerate and insert
> results.
> I'm afraid to think about queryResult cache.
>
> Is it worth having something like this in Solr distro?
>
> --
> Sincerely yours
> Mikhail Khludnev
>


Re: Planning for a World without Java Security Manager

2021-12-19 Thread David Smiley
What is this warning message?
Regardless, bin/solr could detect that this scenario is going to occur and
print a message of its own so that users have better context on the
situation.
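
As an illustration of the kind of check I mean, here is a small Java sketch. bin/solr is a shell script, so this only shows the logic, not where it would live. It assumes Java 10+ (for `Runtime.version().feature()`) and assumes that the `java.security.manager` system property being set to anything other than "allow" or "disallow" means the Security Manager was requested at startup. The class and method names are hypothetical.

```java
/** Hypothetical startup check; names are illustrative, not Solr's. */
class SecurityManagerNotice {

  /**
   * Returns true when the Security Manager was requested via the
   * java.security.manager property on a Java version where JEP 411
   * deprecates it (17 or later).
   */
  static boolean shouldExplain(int javaFeatureVersion, String securityManagerProperty) {
    boolean requested = securityManagerProperty != null
        && !securityManagerProperty.equals("allow")
        && !securityManagerProperty.equals("disallow");
    return requested && javaFeatureVersion >= 17;
  }

  public static void main(String[] args) {
    int feature = Runtime.version().feature();
    String prop = System.getProperty("java.security.manager");
    if (shouldExplain(feature, prop)) {
      // Wording is a placeholder; the real message would need discussion.
      System.out.println("Note: the JVM may print a Security Manager deprecation warning.");
    }
  }
}
```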

In other ways, we are investing in securing Solr.  Modularization comes to
my mind first.  And I really wish for a dev vs prod mode to gate better
defaults but no action there yet :-/.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Dec 17, 2021 at 5:22 PM Marcus Eagan  wrote:

> Hi,
>
> As a part of the Log4j madness we all have dealt with, I learned of
> JEP-411(https://openjdk.java.net/jeps/411). There is a wish to deprecate
> the Security Manager in Java 17 for eventual removal. I feel it is likely
> to land. As a result, I think we should start to think about what it means
> to run SOLR without the option of a Security Manager for SOLR 10 (or
> whatever the next major version will be named). I know that people can turn
> it off today if they wish to do so.
>
> Is it premature to have this discussion?
>
> I suggest it is not too early because there is a proposed warning message
> on startup of an application with Security Manager. The message alone could
> cause problems for some organizations using SOLR and lead them to abandon
> the project. Instead, there would need to be a multi-person effort to
> ensure that other countermeasures are sufficient and/or added to protect
> SOLR users from more pernicious and pervasive threats in today's world and
> the future. Enabling the Security Manager by default in SOLR was a good
> future-proofing measure for today's reality.
>
> Thank you all for your contributions,
>
> --
> Marcus Eagan
>
>


Re: Log4j < 2.15.0 may still be vulnerable even if -Dlog4j2.formatMsgNoLookups=true is set

2021-12-21 Thread David Smiley
(switching to dev@solr.apache.org; the O.P. unfortunately sent this to
Lucene)

BTW I'm having a good conversation[1] with Ralph Goers on the Log4j2 PMC
about the efficacy of log4j2.formatMsgNoLookups.  So far I've learned
nothing that concerns me and I feel better in fact about other apps using
this mitigation.
[1]: https://lists.apache.org/thread/kgh63sncrsm2bls884pg87mnt8vqztmz

I think we should update our security news to reference this conversation
for those that want to dig deeper as evidence.  The fact that Log4j's
security page refers to this technique as "discredited" puts us in a
position where we have to acknowledge this word on their part and defend
ourselves so it's clear our guidance came out *after* theirs, and that we
are confident.
Yes and link to the Wiki's discredited list; linking to it.  I'll get on
that.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sun, Dec 19, 2021 at 4:26 PM David Smiley  wrote:

> I like the idea of using our Wiki more as you describe.  Not so much
> *new* news entries because I think search-ability of these CVEs is fine to
> an existing entry.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Sat, Dec 18, 2021 at 4:39 PM Gus Heck  wrote:
>
>> Thinking about it some more, maybe the problem with my suggestion is
>> the table on that page is organized by the library version and, if
>> unmitigated, the version of the library is still a problem. Maybe another
>> way to be clearer about it and avoid rewriting things that people have
>> already read would be to add independent entries to the security news page
>> for the newer CVE's
>>
>> On Sat, Dec 18, 2021 at 12:20 PM Gus Heck  wrote:
>>
>>> I think perhaps in the shock of such a deep and surprising vulnerability
>>> with such high visibility, we've begun to break with how we normally handle
>>> CVE's that don't apply to our usage of the library. Previously, they just
>>> got added to the list of known false positives
>>> <https://cwiki.apache.org/confluence/display/SOLR/SolrSecurity#SolrSecurity-SolrandVulnerabilityScanningTools>.
>>> Normally we wouldn't even mention them on the security news page, but
>>> because of the high visibility we should simply have a line mentioning that
>>> these two CVE's are on our false positives page and explain details there.
>>> The wiki would provide revision history automatically.
>>>
>>> On Sat, Dec 18, 2021 at 11:25 AM Jan Høydahl 
>>> wrote:
>>>
>>>> We make edits to the log4j advisory almost daily, see
>>>> https://github.com/apache/solr-site/commits/e10a6a9fe0eed8dcba3ad1a076c8208e014e76ff/content/solr/security/2021-12-10-cve-2021-44228.md
>>>> I wonder if we should include a "Revision history" paragraph in the
>>>> advisory for transparency?
>>>>
>>>> Jan
>>>>
>>>> 15. des. 2021 kl. 19:09 skrev Uwe Schindler :
>>>>
>>>> Hi all, I prepared a PR about the followup CVE-2021-45046:
>>>> https://github.com/apache/solr-site/pull/59
>>>>
>>>> Please verify and make suggestion. I will merge this into
>>>> main/production later.
>>>>
>>>> Uwe
>>>>
>>>> -
>>>> Uwe Schindler
>>>> Achterdiek 19, D-28357 Bremen
>>>> https://www.thetaphi.de
>>>> eMail: u...@thetaphi.de
>>>>
>>>> *From:* Uwe Schindler 
>>>> *Sent:* Wednesday, December 15, 2021 3:31 PM
>>>> *To:* 'd...@lucene.apache.org' 
>>>> *Subject:* RE: Log4j < 2.15.0 may still be vulnerable even if
>>>> -Dlog4j2.formatMsgNoLookups=true is set
>>>>
>>>> We should add this to the webpage. Another one asked on the security
>>>> mailing list.
>>>>
>>>> Uwe
>>>>
>>>> -
>>>> Uwe Schindler
>>>> Achterdiek 19, D-28357 Bremen
>>>> https://www.thetaphi.de
>>>> eMail: u...@thetaphi.de
>>>>
>>>> *From:* Gus Heck 
>>>> *Sent:* Wednesday, December 15, 2021 12:39 AM
>>>> *To:* dev 
>>>> *Subject:* Re: Log4j < 2.15.0 may still be vulnerable even if
>>>> -Dlog4j2.formatMsgNoLookups=true is set
>>>>
>>>> Perhaps we could tweak it to say that the system property fix is
>>>> sufficient *for Solr* (i.e. not imply that it is a valid work around for
>>>>

Re: Solr 9.0.0 release in February

2021-12-21 Thread David Smiley
Thanks for volunteering to be the RM!

No comment on the timeline; I'm in denial of the time flying.  Log4shell
and all that.

Let's go to Lucene 9.1 and not 9.0.  I'm seeing a massive change to
lucene-test-framework in 9.1 on its way that IMO ought to have been done
in 9.0.  Going right to 9.1 averts issues there for Solr users writing
plugins.

You're right RE blockers -- it's always tough to let go of our
ideals/hopes/dreams on what we want 9.0 to be.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Dec 21, 2021 at 10:57 AM Jan Høydahl  wrote:

> Hi,
>
> Solr's next feature release will be 9.0 (as 8x is in bugfix mode).
> Let's not even think about hacking an 8.12 release based on lucene-solr 8x
> branch. It will be ugly.
>
> The "Solr 9.0 release blockers" thread
> <https://lists.apache.org/thread/m7k2gvgxldkns7jqjnw1ghhqx7s3tpl1> was
> started exactly 2 months ago to try to prepare us. But we're moving slowly.
> The same happened for Lucene, until the 9.0 release :) So I'll start the
> train right now...
>
> I propose the following rough roadmap:
>
>
> - *December*: Cut branch_9x next week and enter feature freeze on that
>   branch
> - *January*: Remove blockers, prepare build & release machinery,
>   including Docker
> - *February*: Cut branch_9_0 and build RC1 - branch_9x is again
>   re-opened for new features
>
>
> I volunteer as RM.
>
> Wrt blockers, we need to be tough on ourselves and ask the question "Is it
> possible to release 9.0 without this?"..
> At the end of January we should have only a few real blockers left, that
> are all actively in progress.
> The delay between branch_9x and branch_9_0 is to avoid having to backport
> everything twice during the hardening phase.
>
> Jan
>
>


Re: Planning for a World without Java Security Manager

2021-12-23 Thread David Smiley
I created one now: https://issues.apache.org/jira/browse/SOLR-15875 In the
comment, I suggest this probably should be a SIP, and that there are
possibly conflicting/redundant ideas (yet may be complementary?) in
SOLR-14049.  So, discussion is definitely necessary.  That's really the
point of a SIP anyway -- forcing a discussion on major decisions.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Dec 22, 2021 at 10:36 PM Marcus Eagan  wrote:

> It doesn't seem that bad, yet I know some people will freak. According to the 
> proposal, it will say this:
>
>
> WARNING: A command line option has enabled the Security Manager
> WARNING: The Security Manager is deprecated and will be removed in a future 
> release
>
>
> I think the modularization goal is great, and I feel the same way for dev and 
> prod. Is there a ticket for dev and prod modes? I think I could schedule time
> to do that.
>
>
> On Sun, Dec 19, 2021 at 3:22 PM David Smiley  wrote:
>
>> What is this warning message?
>> Regardless, bin/solr could detect that this scenario is going to occur
>> and print a message of its own so that users have better context on the
>> situation.
>>
>> In other ways, we are investing in securing Solr.  Modularization comes
>> to my mind first.  And I really wish for a dev vs prod mode to gate better
>> defaults but no action there yet :-/.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Fri, Dec 17, 2021 at 5:22 PM Marcus Eagan 
>> wrote:
>>
>>> Hi,
>>>
>>> As a part of the Log4j madness we all have dealt with, I learned of
>>> JEP-411(https://openjdk.java.net/jeps/411). There is a wish to
>>> deprecate the Security Manager in Java 17 for eventual removal. I feel it
>>> is likely to land. As a result, I think we should start to think about what
>>> it means to run SOLR without the option of a Security Manager for SOLR 10
>>> (or whatever the next major version will be named). I know that people can
>>> turn it off today if they wish to do so.
>>>
>>> Is it premature to have this discussion?
>>>
>>> I suggest it is not too early because there is a proposed warning
>>> message on startup of an application with Security Manager. The message
>>> alone could cause problems for some organizations using SOLR and lead them
>>> to abandon the project. Instead, there would need to be a multi-person
>>> effort to ensure that other countermeasures are sufficient and/or added to
>>> protect SOLR users from more pernicious and pervasive threats in today's
>>> world and the future. Enabling the Security Manager by default in SOLR was
>>> a good future-proofing measure for today's reality.
>>>
>>> Thank you all for your contributions,
>>>
>>> --
>>> Marcus Eagan
>>>
>>>
>
> --
> Marcus Eagan
>
>


Re: 7.7.x-mas

2021-12-25 Thread David Smiley
Users have a valid mitigation that is easy to apply (that sys prop =true),
and they could upgrade Log4j themselves if they are extra paranoid (e.g.
corp mandates, which I am familiar with). So I think no further action by
our project is necessary.


(Merry Christmas to you all)

On Fri, Dec 24, 2021 at 11:11 AM Shawn Heisey  wrote:

> On 12/24/2021 5:12 AM, Jan Høydahl wrote:
> > Merry Christmas to all fellow committers and the wider community!
> >
> > If there are no plans of (quickly) releasing a 7.7.4 with all known
> vulnerabilities fixed, I propose we publish a statement that 7.x is
> officially not supported and urge users to upgrade to 8.11.
>
> I agree.  7.x is in maintenance mode until 9.0 is released, and users
> have a few options for a workaround.  If patching and recompiling were
> the only option for users to fix the problem themselves, then I think we
> would need to make a new release.
>
> Thanks,
> Shawn
>
>
> --
Sent from Gmail Mobile


Re: Solr 9.0.0 release in February

2022-01-03 Thread David Smiley
I completely agree with Houston; let's not create branch_9_0 yet.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Jan 3, 2022 at 11:36 AM Houston Putman 
wrote:

> I think it's fine to start with just branch_9x until we are ready to
> actually do the release, even if it is unconventional for our processes.
> There’s no need to have a branch_9_0 until there are actual reasons that
> 9x and 9_0 would differ (i.e. 9.0.0 is ready to be released and people want
> to add things for 9.1.0).
>
> On Mon, Jan 3, 2022 at 10:31 AM Jan Høydahl  wrote:
>
>> Happy New Year everyone!
>>
>> According to my initial mail it's now time to cut branch_9x. However, I'm
>> in the middle of some build and build-script tuning, so it may delay a few
>> days more.
>>
>> I'm also wondering whether it's better to cut both branch_9x as well as
>> branch_9_0 so everyone can continue adding features for 9.1, with the cost
>> of having to do another backport for every fix that is targeted for 9.0.
>> Will it be confusing to treat branch_9x as a feature-frozen release-branch
>> for all of January?
>>
>> Jan
>>
>> 21. des. 2021 kl. 20:03 skrev David Smiley :
>>
>> Thanks for volunteering to be the RM!
>>
>> No comment on the timeline; I'm in denial of the time flying.  Log4shell
>> and all that.
>>
>> Let's go to Lucene 9.1 and not 9.0.  I'm seeing a massive change to
>> lucene-test-framework in 9.1 on its way that IMO ought to have been done
>> in 9.0.  Going right to 9.1 averts issues there for Solr users writing
>> plugins.
>>
>> You're right RE blockers -- it's always tough to let go of our
>> ideals/hopes/dreams on what we want 9.0 to be.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Tue, Dec 21, 2021 at 10:57 AM Jan Høydahl 
>> wrote:
>>
>>> Hi,
>>>
>>> Solr's next feature release will be 9.0 (as 8x is in bugfix mode).
>>> Let's not even think about hacking an 8.12 release based on lucene-solr
>>> 8x branch. It will be ugly.
>>>
>>> The "Solr 9.0 release blockers" thread
>>> <https://lists.apache.org/thread/m7k2gvgxldkns7jqjnw1ghhqx7s3tpl1> was
>>> started exactly 2 months ago to try to prepare us. But we're moving slowly.
>>> The same happened for Lucene, until the 9.0 release :) So I'll start the
>>> train right now...
>>>
>>> I propose the following rough roadmap:
>>>
>>>
>>> - *December*: Cut branch_9x next week and enter feature freeze on
>>>   that branch
>>> - *January*: Remove blockers, prepare build & release machinery,
>>>   including Docker
>>> - *February*: Cut branch_9_0 and build RC1 - branch_9x is again
>>>   re-opened for new features
>>>
>>>
>>> I volunteer as RM.
>>>
>>> Wrt blockers, we need to be tough on ourselves and ask the question "Is
>>> it possible to release 9.0 without this?"..
>>> At the end of January we should have only a few real blockers left, that
>>> are all actively in progress.
>>> The delay between branch_9x and branch_9_0 is to avoid having to
>>> backport everything twice during the hardening phase.
>>>
>>> Jan
>>>
>>>
>>


Re: [Solr] does not use the filterCache

2022-01-03 Thread David Smiley
Daniele,

The filter cache contains unsorted lists of docs; an entry ultimately needs
to be sorted to what the user wants.  The score in particular requires
actually running the query, at which point there isn't a point in using the
filter cache.  Well sort of; I could imagine a hybrid to visit only the
matching docs but that would add complexity.
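
For reference, the decision Daniele describes in the quoted message below can be modelled roughly like this in Java. This is a simplified, self-contained sketch and not the actual SolrIndexSearcher code: the class name, the flag values, and the use of plain strings for sort fields are all assumptions made for illustration.

```java
import java.util.List;

/** Hypothetical, simplified model of the decision; not Solr's actual code. */
class FilterCacheDecision {
  // Illustrative bit values only; do not rely on them matching Solr's constants.
  static final int GET_SCORES = 0x01;
  static final int NO_CHECK_FILTERCACHE = 0x02;

  /**
   * @param sortFields names of the sort fields, or null when no sort was given
   *                   (in which case results are ordered by score)
   */
  static boolean useFilterCache(boolean filterCacheExists, int flags,
                                boolean useFilterForSortedQuery, List<String> sortFields) {
    if (!filterCacheExists) return false;
    if ((flags & (GET_SCORES | NO_CHECK_FILTERCACHE)) != 0) return false;
    if (!useFilterForSortedQuery) return false;
    if (sortFields == null) return false; // implicit sort by score
    // Scores are only known after running the query, so a cached doc set cannot be sorted by them.
    return !sortFields.contains("score");
  }
}
```

In this model the cached doc set is only usable when every sort key can be read without executing the query, which is why any involvement of the score rules it out.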

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Jan 3, 2022 at 2:30 PM Daniele Antuzi 
wrote:

> Hi Mikhail,
> Thanks for your reply.
> Probably I wasn't clear enough, actually, in the piece of code I pointed
> out
> <https://github.com/apache/solr/blob/c2db3a943e665cfb39e9ea53640be40cf2c09fbc/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1387-L1398>,
> the searcher decides whether to use (or not use) the filterCache by setting
> the boolean *useFilterCache*.
>
> The searcher will use the filterCache in the search only if
>
> - the filterCache exists
> - AND the flags *GET_SCORES* and *NO_CHECK_FILTERCACHE* are not set
> - AND the parameter *useFilterForSortedQuery* is true (by default it is
>   false and I don't really understand why)
> - AND the sort is not null
> - AND none of the sort clauses contains the score
>
> If I'm not mistaken, if the sort is null the result set is sorted by the
> score.
> So, if the result set is sorted implicitly or explicitly by score, the
> searcher does not use the filterCache. Does anyone know why?
>
>
>
> Il giorno lun 3 gen 2022 alle ore 16:50 Mikhail Khludnev 
> ha scritto:
>
>> Hi, Adrien. Thanks for forwarding this.
>> Daniele, you pointed to the code which bypasses Lucene searching and just
>> sorts cached docset.
>> Applying filter before searching is done by getProcessedFilter()
>> https://github.com/apache/solr/blob/c2db3a943e665cfb39e9ea53640be40cf2c09fbc/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L956
>>
>> Happy New Year!
>>
>> On Mon, Jan 3, 2022 at 5:12 PM Adrien Grand  wrote:
>>
>>> Hi Daniele,
>>>
>>> This is the Lucene dev list, I'm redirecting your question to
>>> dev@solr.apache.org.
>>>
>>> On Fri, Dec 31, 2021 at 5:35 PM Daniele Antuzi 
>>> wrote:
>>> >
>>> > Hi,
>>> > I was taking a look at the Solr searcher to see how the filterCache is
>>> used:
>>> https://github.com/apache/solr/blob/c2db3a943e665cfb39e9ea53640be40cf2c09fbc/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L1379-L1398
>>> > Reading the code, it turned out that the filterCache is not used if
>>> the sort contains the score or if we don't have any score specified (by
>>> default, it sorts by score).
>>> > As far as I know, the filterCache contains an unordered set of
>>> documents so the sort must be calculated after the application of the
>>> filter query.
>>> > Then, also the score should be computed after the filter query to have
>>> a smaller set of documents.
>>> > That being said, I don't understand why Solr does not use the
>>> filterCache if the score is somehow involved in the sort.
>>> > In theory, it can
>>> >
>>> > 1. apply the filter query, reducing the number of results
>>> > 2. compute the score
>>> > 3. sort the results
>>> >
>>> > Am I missing something?
>>> >
>>> > Happy new year,
>>> > Daniele
>>> >
>>>
>>>
>>> --
>>> Adrien
>>>
>>>
>>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>>
>


Re: Mirroring the later 8.x release tags in the "new" split repositories

2022-01-04 Thread David Smiley
+1 to Houston's proposal.  Given all the release tags seen here:
https://github.com/apache/solr/tags it makes sense that it would include
the tag for 8.11 and the others we're missing.  I think this is a really
easy decision as it's weird/inconsistent that these particular versions are
omitted yet the many older ones exist.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Jan 4, 2022 at 4:01 PM Houston Putman  wrote:

> Dawid,
>
> I did mean that we should be pushing the tags as well as their associated
> commits. I was unaware that you could push the tags without the commits,
> sorry if I caused confusion there.
>
> Jan,
>
> Looking in the diff between the "history/branches/lucene-solr/branch_8x"
> tag in apache/solr and the current "branch_8_11" in apache/lucene-solr,
> there is around 12 MB of commits to add. This is a rough estimate, but it
> should be close enough.
>
> The best approximation I have of the apache solr repository is that its
> size is around 400 MB. So adding these tags/refs would cause a 3% increase
> in the size of the repo. The lucene repo is a little larger currently, but
> the new tag sizes should be identical.
>
> - Houston
>
> On Tue, Jan 4, 2022 at 3:36 PM Jan Høydahl  wrote:
>
>> We have edit history ever since the earliest svn commits; we just lack a
>> year's worth of commits from the latest 8.x versions, so from a traceability
>> view it makes sense, instead of having to look in two repos. Do you know
>> how much weight it will add to a full clone?
>>
>> Jan Høydahl
>>
>> > 4. jan. 2022 kl. 21:01 skrev Dawid Weiss :
>> >
>> > 
>> >>
>> >> You can push a tag to a repo that doesn't already have that commit (or
>> history of commits)
>> > in an existing branch, without issue.
>> >
>> > But why do it? These are refs - if they point to non-existing commits
>> > then I honestly don't see any value in having them. It would
>> > confuse the hell out of me.
>> >
>> >> They are separate projects, but with a shared history. I'd like to be
>> able to go to the apache/solr github
>> > and be able to go through the history of a file in different release
>> > versions, even if that specific release happened
>> > under apache/lucene-solr.
>> >
>> > This is a different requirement, actually. If Solr (or Lucene) would
>> > like to keep such a history then I think it should just fetch those
>> > release refs and all the commits leading to them. Since these projects
>> > share a common root, there is nothing to prevent this from happening.
>> > Then tags point at actual revisions and everything makes sense.
>> >
>> > This does not change the fact that I don't really see much value in
>> > doing all this.
>> >
>> > Dawid
>> >
>> >> On Tue, Jan 4, 2022 at 8:30 PM Houston Putman 
>> wrote:
>> >>
>> >> They don't have those commits, but they also don't have the commits
>> for the
>> >> previous release tags in the repo. You can go to any of the release
>> tags, choose
>> >> a commit to view and you will get a message saying:
>> >>
>> >>>
>> >>> This commit does not belong to any branch on this repository,
>> >>> and may belong to a fork outside of the repository.
>> >>
>> >>
>> >> You can push a tag to a repo that doesn't already have that commit (or
>> history of commits)
>> >> in an existing branch, without issue.
>> >>
>> >> They are separate projects, but with a shared history. I'd like to be
>> able to go to the apache/solr github
>> >> and be able to go through the history of a file in different release
>> versions, even if that specific release happened
>> >> under apache/lucene-solr.
>> >>
>> >> - Houston
>> >>
>> >>> On Tue, Jan 4, 2022 at 2:02 PM Dawid Weiss 
>> wrote:
>> >>>
>> >>>> As mentioned in SOLR-15874, we are not hosting the tags for the
>> latest 8.x releases in the split apache/solr and apache/lucene
>> repositories. All release tags made prior to the repository split exist in
>> the new repos, so I see no reason that the newer 8.x tags cannot exist in
>> the new repos as well.
>> >>>
>> >>> I'm not sure I understand - to create a tag you'd need that particular
>> >>> commit - the "new" repositories for each project don't have those
>> >>> commits (and arguably shouldn't have since they're, well, separate
>> >>> projects now).
>> >>>
>> >>> Dawid
>> >>>
>>
>>


Re: Mirroring the later 8.x release tags in the "new" split repositories

2022-01-06 Thread David Smiley
Removing the old tags is valid too.  But the current state is
confusing/inconsistent and something should be done.  Thanks for raising
this Houston.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jan 6, 2022 at 8:56 AM Uwe Schindler  wrote:

> I agree with Dawid, why the hell do we need those tags? The old
> lucene-solr repo can stay forever on Github. If I want to checkout an older
> version,  I would go into the old repo and check it out. In fact that’s
> also what tools may do, because the old git repo is stated in the pom.xml
> files (or similar).
>
>
>
> I would rather go and nuke the tags (not the commits of course) from new
> repo for everything before 9.
>
>
>
> Uwe
>
>
>
> -
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> https://www.thetaphi.de
>
> eMail: u...@thetaphi.de
>
>
>
> *From:* Dawid Weiss 
> *Sent:* Wednesday, January 5, 2022 8:17 AM
> *To:* Lucene Dev 
> *Cc:* Solr Dev 
> *Subject:* Re: Mirroring the later 8.x release tags in the "new" split
> repositories
>
>
>
>
>
> I did mean that we should be pushing the tags as well as their associated
> commits.
>
>
>
> You can even edit them by hand so you can definitely have references
> pointing at void...
>
>
>
> I already expressed my opinion on the matter but I won't object if you
> wish to do it. The problem I see is that it's really easy to break things
> in a catastrophic way by force-pushing refs or by pushing refs that
> shouldn't be copied - it's not hard, but it's easy to make a mistake. I'd
> try out ten times on a bare test clone somewhere before actually doing it
> on the target git repository.
>
>
>
> But it is definitely doable. Git repositories are conceptually very simple
> - just a graph of commits and tags/ labels.
>
>
>
> Dawid
>


Propose Solr 9 *Docker* image use Java 17

2022-01-06 Thread David Smiley
I'd like to propose that our Docker image for Solr 9 move from Java 11 to
Java 17.  Admittedly I don't have any familiarity with running 17, so I
would really like to hear from those of you using it.  I'm guessing
(informed from some quick google searches) there are some ~minor
performance improvements but nothing eye-popping there.  Mostly, I propose
this because a 9.0 release is an ideal time to make such a change instead
of some minor release in between that could introduce a subtle surprise for
some users.  The new Shenandoah GC looks exciting but may not be
sufficiently ready for us to recommend (if I recall from a recent user who
reported a problem with it) -- and that's okay.  Having this as an option
for users is great, especially as time progresses and future Docker Solr
releases include minor updates to the JVM base image that will increase the
viability.

I'm aware our nifty image building enables people to do a custom build to
specify their own preferred FROM image, which is cool.  Still, I think we
should move on to 17 as the default.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: New branch and feature freeze for Solr 9.0.0

2022-01-06 Thread David Smiley
What do we think about a "beta" release, to give users (including
*ourselves* in many cases) more time to try out 9.0 to report issues? I
don't think a beta release would necessitate a typical feature freeze.  If
we ultimately decline on a beta release, a counter-proposal would be to
promote our nightly docker images everywhere (solr-users list, twitter,
Slack) to solicit feedback.

It would be a shame to release Solr 9 without support for the vector based
index in Lucene 9.  Thankfully there's a JIRA issue with a PR!
https://issues.apache.org/jira/browse/SOLR-15880 .  It's as much about
optics as anything.  I think many users are probably more at a curiosity /
exploratory stage with this topic but still -- Solr 9 without the ability
to explore this is disappointing, leaving them to consider other options to
scratch that itch.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jan 6, 2022 at 2:11 PM Timothy Potter  wrote:

> thanks Jan, PR looks good now! 😀
>
>
> On Thu, Jan 6, 2022 at 11:52 AM Jan Høydahl  wrote:
>
>> False alarm, I had a dirty checkout.
>> Please see if your PR passes precommit.
>>
>> Jan
>>
>> > 6. jan. 2022 kl. 19:49 skrev Jan Høydahl :
>> >
>> > Tim, I pushed a change to gradle that now uses hardcoded 9.0.0 for
>> tests.luceneMatchVersion. That's a stop-gap, will make it dynamically
>> follow the current lucene-version, but somehow my gradle project picked up
>> an old version of org.apache.lucene.utils.Version class...
>> >
>> > Now I get a new error
>> >
>> > * What went wrong:
>> > Execution failed for task ':validateSourcePatterns'.
>> >> Found 10 violations in source files (@author javadoc tag, svn keyword,
>> tabs instead spaces).
>> >
>> > Jan
>> >
>> >> 6. jan. 2022 kl. 17:53 skrev Timothy Potter :
>> >>
>> >> Thanks for the update Jan!
>> >>
>> >> One of my PRs (sync'd with main) is now failing precommit with:
>> >>
>> >> 105 actionable tasks: 103 executed, 2 up-to-date
>> >> FAILURE: Build failed with an exception.
>> >>
>> >> * Where:
>> >> Script '/home/runner/work/solr/solr/gradle/validation/solr.config-file-sanity.gradle' line: 40
>> >>
>> >> * What went wrong:
>> >> Execution failed for task ':solr:validateConfigFileSanity'.
>> >> > Configset does not refer to the correct luceneMatchVersion (10.0.0): /home/runner/work/solr/solr/solr/server/solr/configsets/_default/conf/solrconfig.xml
>> >>
>> >>
>> >> Any ideas what's wrong there?
>> >>
>> >> On Thu, Jan 6, 2022 at 9:40 AM Jan Høydahl 
>> wrote:
>> >>>
>> >>> NOTICE:
>> >>>
>> >>> Branch branch_9x has been cut and versions updated to 10.0 on 'main'
>> branch.
>> >>>
>> >>> This follows the plan from previous notice about 9.0 release [1].
>> Here is what will happen:
>> >>>
>> >>> Today: Cut branch_9x and enter feature freeze on that branch
>> >>> Next few weeks: Remove blockers, prepare build & release machinery
>> >>> February: Cut branch_9_0 and build RC1
>> >>>
>> >>> This is how we'll use the branches until we cut the branch_9_0
>> release-branch:
>> >>>
>> >>> main: All new features and bug fixes (as today)
>> >>> branch_9x: Only backport of bugfixes and blockers for the 9.0 release.
>> >>>
>> >>>
>> >>> FAQ:
>> >>> --
>> >>> Q: Where do I put a feature intended for 9.1?
>> >>> A: On main branch. Then in February, bulk backport to branch_9x
>> >>>
>> >>> Q: Can we go to Java17 on main branch now?
>> >>> A: Not yet, let's keep main branch as-is until branch_9_0 is cut, to
>> ease backporting
>> >>>
>> >>> Q: But my feature is almost ready and low-risk, I can surely put it
>> on branch_9x ?
>> >>> A: No, only blockers and bugfixes please. You can argue on dev@ that
>> your feature is a blocker
>> >>>
>> >>> Q: How can I help with the 9.0 release?
>> >>> A: You can check out the JIRA for blockers [2] and help fix those
>> >>>
>> >>> Q: Why do we need to wait until February with cutting the release
>> branch?
>> >>> A: We don't - if blockers are resolved and we feel close to RC1
>> before then...
>> >>>
>> >>>
>> >>> [1] https://lists.apache.org/thread/qv9n2b7jkmzr26ov5p50lc3h2dy7htzo
>> >>> [2] https://issues.apache.org/jira/issues/?filter=12351219
>> >>
>>
>>


Re: Propose Solr 9 *Docker* image use Java 17

2022-01-06 Thread David Smiley
And to those of you who may not know, our Docker Solr image for Solr 8 uses
Java 11 even though Solr 8 supports Java 8.  Solr 9 raises the minimum
requirement to Java 11 (not Java 17), and I'm only proposing to bump the
Docker-Solr default upwards accordingly.  In a containerized world, I think
picking the most recent LTS (which is currently Java 17) should be our
standard practice because the onus of upgrading is on *us*, unlike classic
bare metal where the upgrading effort is on the user.  Users have asked for
this: https://github.com/docker-solr/docker-solr/issues/231 (3 people +1'ed
my proposal to move to Java 17 at this juncture).

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jan 6, 2022 at 4:03 PM David Smiley  wrote:

> I'd like to propose that our Docker image for Solr 9 move from Java 11 to
> Java 17.  Admittedly I don't have any familiarity with running 17, so I
> would really like to hear from those of you using it.  I'm guessing
> (informed from some quick google searches) there are some ~minor
> performance improvements but nothing eye-popping there.  Mostly, I propose
> this because a 9.0 release is an ideal time to make such a change instead
> of some minor release in between that could introduce a subtle surprise for
> some users.  The new Shenandoah GC looks exciting but may not be
> sufficiently ready for us to recommend (if I recall from a recent user who
> reported a problem with it) -- and that's okay.  Having this as an option
> for users is great, especially as time progresses and future Docker Solr
> releases include minor updates to the JVM base image that will increase the
> viability.
>
> I'm aware our nifty image building enables people to do a custom build to
> specify their own preferred FROM image, which is cool.  Still, I think we
> should move on to 17 as the default.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>


Disable metrics reporting to JMX

2022-01-09 Thread David Smiley
I noticed Solr auto-creates a metrics SolrJmxReporter if there is a
platform "MBeanServer" that exists, which AFAICT is always.  Thanks?  Ehh,
no thanks.  After some fruitless Google searches, it's still not evident how
to disable JMX.  Don't get me wrong, I like jconsole, jvisualvm, JFR etc and I
think some of these things may rely on JMX but I don't particularly need
Solr to expose its metrics to these tools ever since Solr gained pretty
excellent /admin/metrics support that is easier to get at.

I see Solr's code that makes this decision in
SolrXmlConfig.getMetricReporterPluginInfos and I could see that I could
enhance it with a few lines of code to check pluginInfo.isEnabled().  Thus
to disable JMX reporting, one would configure it with the enable="false"
XML attribute.  Or maybe we just remove the automatic enablement.
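
A rough sketch of the check described above, in plain Java.  This is not
Solr's actual code: ReporterConfig and effectiveReporters are made-up
stand-ins for PluginInfo and SolrXmlConfig.getMetricReporterPluginInfos, and
the "explicit config suppresses the automatic default" rule is an assumption
about how the enhancement could behave.

```java
import java.util.ArrayList;
import java.util.List;

public class JmxReporterSketch {
    // Reporter class name as mentioned in this thread.
    static final String JMX_REPORTER_CLASS =
        "org.apache.solr.metrics.reporters.SolrJmxReporter";

    // Hypothetical stand-in for one parsed reporter entry (not Solr's PluginInfo).
    static final class ReporterConfig {
        final String name;
        final String className;
        final boolean enabled; // models the enable="true|false" XML attribute

        ReporterConfig(String name, String className, boolean enabled) {
            this.name = name;
            this.className = className;
            this.enabled = enabled;
        }
    }

    // Keeps only enabled reporters. A default JMX reporter is added only when
    // an MBeanServer is present AND no JMX reporter was configured at all, so
    // an explicit enable="false" entry switches the automatic behaviour off.
    static List<ReporterConfig> effectiveReporters(
            List<ReporterConfig> configured, boolean mbeanServerPresent) {
        List<ReporterConfig> result = new ArrayList<>();
        boolean jmxMentioned = false;
        for (ReporterConfig rc : configured) {
            if (JMX_REPORTER_CLASS.equals(rc.className)) {
                jmxMentioned = true; // explicit config wins, even when disabled
            }
            if (rc.enabled) {
                result.add(rc);
            }
        }
        if (mbeanServerPresent && !jmxMentioned) {
            result.add(new ReporterConfig("default", JMX_REPORTER_CLASS, true));
        }
        return result;
    }

    public static void main(String[] args) {
        List<ReporterConfig> disabled =
            List.of(new ReporterConfig("jmx", JMX_REPORTER_CLASS, false));
        // Explicitly disabled: no JMX reporter is created.
        System.out.println(effectiveReporters(disabled, true).size());
        // Nothing configured: the automatic default is still added.
        System.out.println(effectiveReporters(List.of(), true).size());
    }
}
```

The alternative mentioned above (removing the automatic enablement) would
amount to dropping the final "if" block in this sketch.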

BTW what's driving me to look at this is that there is some time spent
registering and unregistering SolrCore level metrics to JMX when SolrCores
are loaded and unloaded, and logs to this effect likewise.  Not a big deal
but it's something.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Disable metrics reporting to JMX

2022-01-10 Thread David Smiley
Thanks for sharing that, Matthias; point taken.  I know it's useful for some
users; it isn't going away.

I filed a JIRA issue: https://issues.apache.org/jira/browse/SOLR-15905
It's debatable whether solr.xml should come with it enabled by default or
not.  I don't have a strong opinion there.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Jan 10, 2022 at 10:22 AM Matthias Krueger  wrote:

> JMX has its issues but we should be aware that it currently provides a
> relatively generic and easy to use integration point and is used by, for
> example, the Datadog Solr integration:
> https://github.com/DataDog/integrations-core/tree/master/solr (and maybe
> others).
> On 10.01.22 14:30, Andrzej Białecki wrote:
>
> I agree, we should disable it by default (8x probably still needs to
> enable it by default for back-compat?)
>
>
> On 10 Jan 2022, at 13:56, Eric Pugh 
> wrote:
>
> Seems like if folks are not using it as much, maybe it should be disabled
> by default?
>
> In SOLR-15887 I removed the  from the solrconfig.xml files, and
> added a commented out setup in solr.xml:
>
> https://github.com/apache/solr/blob/main/solr/server/solr/solr.xml#L61
>
> I wonder if it should be NOT commented out, but enabled=“false” ?   Or, if
> it isn’t enabled, then that would imply that JMX reporting would be
> disabled?
>
> Or am I misunderstanding how org.apache.solr.metrics.reporters.SolrJmxReporter
> works?
>
>
>
> On Jan 9, 2022, at 7:40 PM, Mark Miller  wrote:
>
> JMX is really a toy metric system and comes with potential security
> concerns that have to be considered and managed over time.
>
> The cost in the case you are seeing has also been potentially much worse
> in the past - a variety of expensive metrics are now cached I believe - but
> as it iterated over each object's metrics it would rapidly gather all of the
> metrics for the object once for each metric the object had. If you had many
> large cores, each with many index files for example, this was not good to
> say the least.
>
> I would certainly not want to be exposed to these types of things when I
> was not using the metrics or using the more scalable and logical metrics
> api.
>
> Mark Miller
>
> On January 9, 2022 at 22:46 GMT, David Smiley  wrote:
>
>
> I noticed Solr auto-creates a metrics SolrJmxReporter if there is a
> platform "MBeanServer" that exists, which AFAICT is always.  Thanks?  Ehh,
> no thanks.  It's not evident how to disable JMX after some fruitless google
> searches.  Don't get me wrong, I like jconsole, jvisualvm, JFR etc and I
> think some of these things may rely on JMX but I don't particularly need
> Solr to expose its metrics to these tools ever since Solr gained pretty
> excellent /admin/metrics support that is easier to get at.
>
> I see Solr's code that makes this decision in
> SolrXmlConfig.getMetricReporterPluginInfos and I could see that I could
> enhance it with a few lines of code to check pluginInfo.isEnabled().  Thus
> to disable JMX reporting, one would configure it with the enable="false"
> XML attribute.  Or maybe we just remove the automatic enablement.
>
> BTW what's driving me to look at this is that there is some time spent
> registering and unregistering SolrCore level metrics to JMX when SolrCores
> are loaded and unloaded, and logs to this effect likewise.  Not a big deal
> but it's something.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> ___
> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467
> | http://www.opensourceconnections.com | My Free/Busy
> <http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>
>
>


Re: New branch and feature freeze for Solr 9.0.0

2022-01-10 Thread David Smiley
Major releases are special, not like minor releases.  More people will look
at what's neat in a major release than a minor one.  It's a de facto PR
event for the project to *potentially* shine.  If Lucene or whatever
project didn't use them to its advantage years ago then it was a missed
opportunity.  Also, just as importantly, it's a rare chance to make
backwards-incompatible changes.  So I don't think we should release a major
version simply because someone is willing to go through the procedures to
do so.  Of course this is motivated by 8x being in feature freeze, which we
are somewhat beholden to because of our unfortunate wedlock with Lucene
that is not yet fully severed.  Yes, we all knew this time was coming and
we were all busy and didn't get to some of the things we all wanted to do
-- (heavy sigh).  Personally, a metaphorical fire is under my butt to try
to make the changes in Solr I want, and I hope you all are mustering the
time to do likewise.  I'm not sure if they will all make Jan's deadlines or
not.  Scant few are marked release blockers because in my mind it's more of
a best-effort.

A beta version would help here -- a "real" release that may not have all of
the changes we want to make yet -- be they cool things like vector search
or backwards-incompatible changes.  It could also give us a reasonable
excuse to ship some features and have them be refined subsequently.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Jan 10, 2022 at 3:32 AM Jan Høydahl  wrote:

> Hi Cassandra,
>
> I browsed through the confluence page and the Antora project a bit. Looks
> like many good reasons for doing the move, and the new structure at
> https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-antora/solr/HEAD/index.html
> looks very promising.
> I'll not stand in the way of this for 9.0. It is not a code change
> jeopardizing Solr stability. The only thing it may challenge is open PRs
> that need editing of reorganized refguide files, but that will happen
> anyway.
>
> I also see that there is lots of work left wrt deployment, multi-version,
> UI part, a non-released gradle plugin etc. I see Hoss has been involved,
> hope others can lend a hand too.
> I see you note that it may be hard to build the guide locally, so perhaps
> this is the time to let the official guide be built by Jenkins triggered on
> every commit to release-branches, that changes the folder?
> I hope local workflow won't be bogged down if ref-guide needs to build 10
> versions back from different branches every time? Or perhaps there could be
> a separate antora playbook for single-version build?
> Multi-version support is cool, but is there a way to add some static links
> in that dropdown menu that would take you to historic online versions such
> as 8.x?
>
> I'll certainly assist in updating release process in whatever way
> necessary.
>
> Jan
>
> 9. jan. 2022 kl. 17:09 skrev Cassandra Targett :
>
> One thing we haven’t talked about yet is the Ref Guide for 9.0.
>
> SOLR-1 re-organized the entire guide and changed a lot of page names.
> At the time I merged this into the main branch there was a little bit of
> comment about trying to provide page redirects for the changed page names
> but AFAIK no one has worked on that yet at all (SOLR-15557).
>
> Then I embarked on trying to move us to use Antora instead of Jekyll (
> https://cwiki.apache.org/confluence/display/SOLR/Antora+Migration+Notes),
> which when complete will dramatically change page URL paths. It would be
> sensible and less disruptive for users if we only make page name/path
> changes once, but it isn’t ready yet.
>
> I could try to make a big push to get it done, but I will need some help,
> and of course it is a major change so sort of technically violates the
> intent of code freeze so I won’t kill myself if folks are going to balk at
> some point in February if it does actually get done in time.
> On Jan 9, 2022, 7:56 AM -0600, Jan Høydahl , wrote:
>
> Hi,
>
> This is quite simple.
> - The 9.0 release is on-going - now on branch_9_0
> - The normal feature freeze rules apply. Anything explicitly approved as
> blocker gets in
>
> We can still consider adding features to the branch if the benefit/risk
> ratio is high enough. Having a long stabilization period also helps. But
> please ask before merging.
>
> The discussion whether we have "enough features" is i.m.o. silly, that's
> not how it works, we release as often as possible, and major versions
> annually. But this time around we wanted to wait for Lucene 9 which was
> also delayed. Now, after almost 2 years on 8.x we have Java11, Lucene 9,
> gradle build, embedded Docker i

Quarterly Committer Meetings

2022-01-10 Thread David Smiley
Hello everyone,

I would like to propose that we have Solr committer online meetings, as we
did sporadically previously, but henceforth quarterly & scheduled in
advance.  I enjoyed seeing all your faces, complimenting each other on our
fine work, and getting down to the business of discussing the evolution of
Solr.  Mike D's proposal last time to share compliments was awesome!
I think seeing each other is really helpful for the group of us on multiple
levels.  If we schedule them in advance with predetermined organizers
and maybe even the time, they will happen and not be forgotten.  It should
also be less work to plan if it's automatic.  Credit on this idea is shared
with Eric Pugh.

I volunteer to do the next 4 of them, and to use the Google Meet platform.
I can record them for the benefit of anyone who can't make it.

Let's start next week on Thursday, January 20th; okay?  I routinely meet
with colleagues between Europe and US West coast, and I think
noon probably is the balance between the two.  In India, this is at
10:30pm.  If you would prefer I do the doodle.com thing to find the best
coordinated time, I can do that.

https://www.timeanddate.com/worldclock/fixedtime.html?msg=Solr+Committer+Meeting&iso=20220120T11&p1=43&ah=1


~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Quarterly Committer Meetings

2022-01-11 Thread David Smiley
Woops; wrong URL.  I meant what I said -- noon EST.
Correct URL
https://www.timeanddate.com/worldclock/fixedtime.html?msg=Solr+Committer+Meeting&iso=20220120T12&p1=43&ah=1
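
For anyone who wants to double-check the conversion, here is a small
java.time sketch.  It assumes the meeting is at 12:00 in the
America/New_York zone on 2022-01-20; America/Los_Angeles and Europe/Berlin
are only illustrative picks for "US West coast" and "Europe", and the class
and method names are made up for this example.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class MeetingTimeSketch {
    // The proposed slot: noon US Eastern on Thursday, January 20th, 2022.
    static ZonedDateTime meeting() {
        return ZonedDateTime.of(
            LocalDateTime.of(2022, 1, 20, 12, 0),
            ZoneId.of("America/New_York"));
    }

    // Renders the same instant as wall-clock time in another zone.
    static String localTime(ZonedDateTime instant, String zoneId) {
        return instant
            .withZoneSameInstant(ZoneId.of(zoneId))
            .format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm"));
    }

    public static void main(String[] args) {
        String[] zones = {"America/Los_Angeles", "Europe/Berlin", "Asia/Kolkata"};
        for (String zone : zones) {
            System.out.println(zone + " " + localTime(meeting(), zone));
        }
        // Prints:
        // America/Los_Angeles 2022-01-20 09:00
        // Europe/Berlin 2022-01-20 18:00
        // Asia/Kolkata 2022-01-20 22:30
    }
}
```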

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Jan 11, 2022 at 9:47 AM Mike Drob  wrote:

> David,
>
> Thank you for taking the initiative! Happy New Year!
>
> In your text you propose noon, but the time link is for 11am Boston Time.
> Which did you mean?
>
> Mike
>
> On Mon, Jan 10, 2022 at 8:36 PM David Smiley  wrote:
>
>> Hello everyone,
>>
>> I would like to propose that we have Solr committer online meetings, as
>> we did sporadically previously, but henceforth quarterly & scheduled in
>> advance.  I enjoyed seeing all your faces, complimenting each other on our
>> fine work, and getting down to the business of discussing the evolution of
>> Solr.  Mike D's proposal last time to share compliments was awesome!
>> I think seeing each other is really helpful for the group of us on multiple
>> levels.  If we schedule them in advance with predetermined organizers
>> and maybe even the time, they will happen and not be forgotten.  It should
>> also be less work to plan if it's automatic.  Credit on this idea is shared
>> with Eric Pugh.
>>
>> I volunteer to do the next 4 of them, and to use the Google Meet
>> platform.  I can record them for the benefit of anyone who can't make it.
>>
>> Let's start next week on Thursday, January 20th; okay?  I routinely meet
>> with colleagues between Europe and US West coast, and I think
>> noon probably is the balance between the two.  In India, this is at
>> 10:30pm.  If you would prefer I do the doodle.com thing to find the best
>> coordinated time; I can do that.
>>
>>
>> https://www.timeanddate.com/worldclock/fixedtime.html?msg=Solr+Committer+Meeting&iso=20220120T11&p1=43&ah=1
>>
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>


Fails: PackageManagerCLITest

2022-01-11 Thread David Smiley
org.apache.solr.cloud.PackageManagerCLITest
is usually failing:
http://fucit.org/solr-jenkins-reports/history-trend-of-recent-failures.html#series/org.apache.solr.cloud.PackageManagerCLITest.testPackageManager
Based on a quick look, it seems that the package manager is calling
System.exit, which isn't allowed by the Lucene test SecurityManager that
we're using.  The relevant code was recently touched by Jan but the
System.exit calls have been there for a while.
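
One common way around this kind of failure, sketched below, is to have the
CLI logic return a status code and leave the actual System.exit call to a
thin main() wrapper, so tests can call the logic directly.  This is only an
illustration of the pattern, not the package manager's real code; the class
name, the run method and the "list" command are all hypothetical.

```java
public class ExitCodeSketch {
    // Hypothetical CLI logic: reports success or failure via the return value
    // instead of terminating the JVM, so a test can invoke it safely.
    static int run(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: tool <command>");
            return 0;
        }
        if (!"list".equals(args[0])) {
            System.err.println("unknown command: " + args[0]);
            return 2;
        }
        System.out.println("listing packages (placeholder output)");
        return 0;
    }

    // Only the outermost entry point ever calls System.exit.
    public static void main(String[] args) {
        int status = run(args);
        if (status != 0) {
            System.exit(status);
        }
    }
}
```

A test would then assert on the value returned by run(...) rather than going
through main(...).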

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Modularizing Solr with new contrib packages

2022-01-12 Thread David Smiley
All awesomeness!

Speaking of modularization:
* https://issues.apache.org/jira/browse/SOLR-15904 Move SQLHandler to a
contrib/module/package
-- just a JIRA issue; I don't have time for this one now.
* https://issues.apache.org/jira/browse/SOLR-14660


~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Jan 12, 2022 at 10:31 AM Jan Høydahl  wrote:

> Hi,
>
> I just did an attempt to lift out the JWT auth plugin from solr-core into
> its own contrib [1] and it wasn't too hard.
> I think it gives much better insight into the dependency situation and
> nice to have a separate solr-jwt-auth-9.0.0.jar
> This is also a first step towards converting it to a proper package, this
> needs to be done first in any case.
>
> I think there are lots of pieces of code in solr-core that can easily be
> extracted the same way.
> Some perhaps even for 9.0.0, as it slims down the core and reduces attack
> surface for most users as well.
>
> To aid in the process I hacked a python tool that scaffolds a new contrib
> module [2].
> Go give it a spin and see where YOU can un-bloat Solr-core today :)
>
> Related to this I also suggest [3] to make it easier to add contribs to
> classpath when starting Solr. I think users would love it :)
> That was inspired by solrOptions.solrModules in Solr's helm-chart for
> Kubernetes [4]
>
> [1] https://github.com/apache/solr/pull/518
> [2] https://github.com/apache/solr/pull/519
> [3] https://issues.apache.org/jira/browse/SOLR-15914
> [4] https://artifacthub.io/packages/helm/apache-solr/solr#running-solr
>
> Jan
>
>


Re: Modularizing Solr with new contrib packages

2022-01-12 Thread David Smiley
(my previous email was accidentally sent; it was incomplete!)

Speaking of modularization:
* https://issues.apache.org/jira/browse/SOLR-15904 "Move SQLHandler to a
contrib/module/package" -- just a JIRA issue; I don't have time for this
one now.
* https://issues.apache.org/jira/browse/SOLR-14660 "Migrating HDFS into a
package" -- the contributor messaged me a couple days ago and is committed
to this one; no ETA.  Also, maybe it should cover all of the Hadoop-related
stuff (which expands the scope to some fancy authentication like Kerberos,
which confusingly also uses Hadoop libs).
* https://issues.apache.org/jira/browse/SOLR-15342 "Separate out a
SolrJ-Zookeeper module" -- I'm working with a colleague on this one. I
anticipate something by the end of the week.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Jan 12, 2022 at 10:31 AM Jan Høydahl  wrote:

> Hi,
>
> I just did an attempt to lift out the JWT auth plugin from solr-core into
> its own contrib [1] and it wasn't too hard.
> I think it gives much better insight into the dependency situation and
> nice to have a separate solr-jwt-auth-9.0.0.jar
> This is also a first step towards converting it to a proper package, this
> needs to be done first in any case.
>
> I think there are lots of pieces of code in solr-core that can easily be
> extracted the same way.
> Some perhaps even for 9.0.0, as it slims down the core and reduces attack
> surface for most users as well.
>
> To aid in the process I hacked a python tool that scaffolds a new contrib
> module [2].
> Go give it a spin and see where YOU can un-bloat Solr-core today :)
>
> Related to this I also suggest [3] to make it easier to add contribs to
> classpath when starting Solr. I think users would love it :)
> That was inspired by solrOptions.solrModules in Solr's helm-chart for
> Kubernetes [4]
>
> [1] https://github.com/apache/solr/pull/518
> [2] https://github.com/apache/solr/pull/519
> [3] https://issues.apache.org/jira/browse/SOLR-15914
> [4] https://artifacthub.io/packages/helm/apache-solr/solr#running-solr
>
> Jan
>
>


Prometheus Exporter; new location?

2022-01-12 Thread David Smiley
The inevitable rename of the "contribs" module to something else (
https://issues.apache.org/jira/browse/SOLR-15917 ) will be a time for us to
move the prometheus exporter somewhere else, as it is not a module/package
within Solr; it's an independent service with its own dependencies; one of
which is SolrJ.  It does not depend on Solr-core or such internals as it
once did.

I just want to list some options here; perhaps there is another or a
variation.  I suppose the last option is maybe the best but ultimately
someone would need to step up to the task.  If nobody steps up (not me!),
then (A) will get done; it's very easy.

(A) The least effort is merely to move it somewhere else in our directory
structure.  If it's still under /solr, like /solr/prometheus-exporter, then
definitely very minor effort & impact.

(B) More effort is to move to the top level of our source repo to distance
itself further from the Solr server.  But that probably means it would not
be in our distribution, which also means it would not be in Solr's Docker
image?  We could write a Dockerfile easily enough; I'm more unsure of how
to publish it and how much effort that is.

(C) Even more effort is to outright move it to a new ASF git repo; to
arrange for CI (or just rely on GitHub Automation?).  I'm unfamiliar with
the effort in all that.  I could help with extracting source history in
initializing the git repo so we don't lose that.  There is also the need to
add a Dockerfile and to publish it.  The beauty of this is that it can have
its own release cycle!  That means very few releases in practice.  Although
it means extra work for actually doing these releases (voting,
release-manager steps, publishing), instead of a free ride on the Solr
release train.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Modularizing Solr with new contrib packages

2022-01-12 Thread David Smiley
Shawn:
* RE redundancies of stuff in /dist/, see
https://issues.apache.org/jira/browse/SOLR-15916
* RE "contrib" vs "module" vs "package", see:
https://issues.apache.org/jira/browse/SOLR-15917
* RE not shipping these extras with the Solr distribution, see: "slim
distro" mention in the document "Solr first party packages"
https://docs.google.com/document/d/1n7gB2JAdZhlJKFrCd4Txcw4HDkdk7hlULyAZBS-wXrE/edit?usp=sharing

It could very well be worth shipping two docker images in the meantime.
Or maybe a zip of each module could be a separate artifact that is
published?  I'm not sure what freedoms we have to do this in the ASF.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Jan 12, 2022 at 8:21 PM Shawn Heisey  wrote:

> On 1/12/2022 8:31 AM, Jan Høydahl wrote:
> > I think there are lots of pieces of code in solr-core that can easily be
> extracted the same way.
> > Some perhaps even for 9.0.0, as it slims down the core and reduces
> attack surface for most users as well.
>
> I think it would be really awesome if we had a core download that only
> included basic functionality, and all the other fancy things that Solr
> does now out of the box (as well as those that are contrib) could be
> added after download via package scripting or just additional downloads.
>
> The size of solr-8.11.1.tgz is 207MiB, or 218076598 bytes.  The .zip
> version is slightly larger.  8.0.0 was 163MiB, 7.0.0 was 142MiB, 6.0.0
> was 131MiB, and 1.4.1 was 53.7MiB.  I think it's insane that the
> download is so big ... and a lot of what makes it big are things that
> the vast majority of our users will never use.
>
> Large reductions in the overall size of the main download would be
> possible by putting hadoop, calcite, some of the really large lucene
> analysis components, and the contrib stuff into packages.  The
> extraction contrib alone is 43.5MiB compressed in zip format.
>
> I would suggest moving zookeeper and its dependencies as well, but I
> think we probably want SolrCloud to be part of base functionality.
>
> Some of the large jars are included for what are probably insignificant
> usages, and I wonder if that functionality could be replaced by newer
> native functions available in Java 8 and later.  I am eyeballing things
> like guava and the commons-* jars here, but I am sure there are other
> things in this category.  I'd like to eliminate as many dependencies as
> we can.
>
> Extracting some things from the solr-core jar into other jars sounds
> like a really awesome idea.
>
> I don't think the solr-core jar should be in the dist directory.  It's
> useless by itself, because it will still have a LOT of dependencies even
> if we shrink it.  And there are likely other things in the dist
> directory that fall into that category.  The test framework and its
> dependencies are a good candidate for removal.
>
> By removing some of the low-hanging fruit that I am SURE isn't needed
> for base binary functionality on the 8.11.1 download, I was able to end
> up with a .zip file sized in at 60.4MiB, and I am sure at least a little
> bit of further reduction is possible if we can fully map out
> dependencies.  I think we can leverage gradle to provide some dependency
> info.
>
> Exactly how to organize the code repo to create divided artifacts is
> something that we would need to think about.  My initial idea is
> changing "contrib" to "package" and then making some new directories
> under package.
>
> Thanks,
> Shawn
>
>
>


Re: Propose changing the "dist" layout

2022-01-12 Thread David Smiley
(now to dev@solr.apache.org not lucene)
I'm just re-surfacing an older conversation where it is time for us to take
action.
Some JIRAs were recently created, as was a separate thread about the
prometheus-exporter.

Full thread in ponymail:
https://lists.apache.org/thread/jbs4ds0w3r3v1hto9rqhs4qq1xfk5z61

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sun, Jun 13, 2021 at 7:31 AM Jan Høydahl  wrote:

> When we collapse the solr/solr structure I hope we can keep the number of
> top-level folders in git to a minimum. There are too many already, so
> adding all contribs doesn’t help.
>
> I hope the contribs will be converted to 1st party packages soon, so
> perhaps “plugins” or “packages” is a good top level folder name?
>
> Those that are not yet converted can stay in “contribs”?
>
> Can we make solr-exporter a separate git repo? With separate artifacts and
> separate docker image. Don’t know if that means it must also be a full sub
> project?
>
> Jan Høydahl
>
> On 11 Jun 2021, at 22:46, David Smiley wrote:
>
> 
> We (all?) agree to do away with "contrib" :-).
> I think a folder grouping the modules (that which can plug inside Solr) is
> useful as there are a number of them -- as such this is a nice organization
> IMO.  There's a bunch of other stuff at the top level and I'd rather not
> intermix all our modules at this layer.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Jun 11, 2021 at 4:41 PM Mike Drob  wrote:
>
>> We can have modules, but why do they need to be in an additional folder
>> deep? Why not just have langid next to solrj and core? Contrib to me
>> connotes experimental or unsupported, which these things are decidedly not.
>>
>> On Fri, Jun 11, 2021 at 2:59 PM David Smiley  wrote:
>>
>>> The contrib folder is just a folder of modules -- optional plugins for
>>> solr-core.  IMO we should simply rename "contrib" to "modules".  I think
>>> the only non-module in there is the prometheus exporter which could move
>>> out.  Mike, I'm not sure if you have a different notion of what "module"
>>> is?  I believe most of us would be happy to move away from "contrib"
>>> wording, anyway.
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>>
>>> On Fri, Jun 11, 2021 at 3:03 PM Mike Drob  wrote:
>>>
>>>> I think related to this, I would like to see some "contribs" moved out
>>>> from the contrib folder and into proper modules. Right now the
>>>> definition of contrib seems to be anything that isn't core or solrj,
>>>> but maybe there is room for a backup module that has gcs and s3 and
>>>> hdfs all under it. LangId is already mentioned in our ref guide but we
>>>> pretend like it is always present and don't think of it as a contrib.
>>>> We kind of think of contrib as optional extra stuff, so maybe we call
>>>> the things what they are - plugins and extensions? Then we don't have
>>>> to think as hard about why certain things are showing up in which lib
>>>> folders.
>>>>
>>>> Also, minor benefit, I would then be able to type c<TAB> instead of
>>>> having to type cor<TAB> to disambiguate from con in my terminal.
>>>>
>>>> On Fri, Jun 11, 2021 at 8:09 AM David Smiley 
>>>> wrote:
>>>> >
>>>> > I believe we can do a fair amount of re-organization pertaining to
>>>> Jetty without losing the Jetty configuration that I think is valuable to
>>>> users who want to tweak something.
>>>> >
>>>> > ~ David Smiley
>>>> > Apache Lucene/Solr Search Developer
>>>> > http://www.linkedin.com/in/davidwsmiley
>>>> >
>>>> >
>>>> > On Fri, Jun 11, 2021 at 8:01 AM Jan Høydahl 
>>>> wrote:
>>>> >>
>>>> >> +1 to a cleanup here for 9.0. As clean and neat organization as
>>>> possible. Perhaps rename "dist" -> "lib"?
>>>> >>
>>>> >> I wish we could get rid of the server (jetty) folder altogether, and
>>>> move everything from server/solr-webapp/webapp/WEB-INF/lib to "lib/deps/".
>>>> But that ties into custom boot-class, getting rid of web.xml and building
>>>> Jetty context in Java code.. I'm willing to help 

Re: Modularizing Solr with new contrib packages

2022-01-13 Thread David Smiley
+1 to your phasing.


> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to the
> classloader

I'll create a JIRA :)


SOLR_HOME/lib is already supported --
https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-main/libs.html
This is what I recommend people use in general.
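
To illustrate what a lib directory amounts to, here is a minimal Java sketch that collects the jars in a directory and exposes them through a URLClassLoader.  The class name LibDirLoader and its methods are hypothetical; this is not Solr's actual loader code, and Solr's real handling of SOLR_HOME/lib is what the reference guide page above describes.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of Solr. */
final class LibDirLoader {
  private LibDirLoader() {}

  /** Lists the *.jar files directly inside libDir, sorted; empty if libDir is not a directory. */
  static List<Path> listJars(Path libDir) throws IOException {
    List<Path> jars = new ArrayList<>();
    if (!Files.isDirectory(libDir)) {
      return jars;
    }
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(libDir, "*.jar")) {
      for (Path jar : stream) {
        jars.add(jar);
      }
    }
    Collections.sort(jars); // deterministic order regardless of file system
    return jars;
  }

  /** Builds a class loader over the jars in libDir; the caller should close it when done. */
  static URLClassLoader newLoader(Path libDir, ClassLoader parent) throws IOException {
    List<Path> jars = listJars(libDir);
    URL[] urls = new URL[jars.size()];
    for (int i = 0; i < urls.length; i++) {
      urls[i] = jars.get(i).toUri().toURL();
    }
    return new URLClassLoader(urls, parent);
  }

  public static void main(String[] args) throws IOException {
    // Pass a directory as the first argument; defaults to ./lib (an assumption).
    Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
    for (Path jar : listJars(libDir)) {
      System.out.println(jar);
    }
  }
}
```

The point of the sketch is only that the directory has to be readable at a known path when the JVM starts, which is why its location relative to volumes matters.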

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jan 13, 2022 at 10:59 AM Houston Putman  wrote:

> It could very well be worth shipping two docker images in the meantime.
>> Or maybe a zip of each module could be a separate artifact that is
>> published?  I'm not sure what freedoms we have to do this in the ASF.
>>
>
> I think for 9.0 we could realistically shoot for 2 binary releases and 2
> docker images, slim (without the modules) and full-featured (with the
> modules), having the full-featured be the default.
>
> Starting in the 9.x line, we could start packaging the modules as separate
> binary artifacts for the solr release. Then in 10.x we can make the slim
> release be the default (still having the fat tgz available as well, as
> solr-extended-10.0.0.tgz or something like that).
>
>
>> Phase 1 (9.0): Modularize Solr by extracting obvious low-hanging-fruit
>> plugins into contribs/modules. Make it super easy to launch Solr with any of
>> these on the class-path (SOLR-15914
>> <https://issues.apache.org/jira/browse/SOLR-15914>).
>> Phase 2 (9.x): Evolve the package manager and make it possible to optionally
>> install the modules as 1st party packages instead (still fat distro)
>> Phase 3 (10.0?): Extract even more features as modules, and publish all
>> modules as separate delivery artifacts on DLCDN
>>
>
> I really like this plan. I agree that for 9.x we really don't have an option
> but to keep publishing the fat tgz as the default. Even in 10.x I think we
> want to offer both a full-featured download and a slim download, but with
> first-party packages we can make slim the "default".
>
> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to the
>> classloader
>
> I'll create a JIRA :)
>
>
> Yes please. That would be a lovely improvement! People currently bend over
> backward to add custom libs.
>
> - Houston
>
> On Thu, Jan 13, 2022 at 8:09 AM Jan Høydahl  wrote:
>
>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to the
>> classloader, similar to what we have with $SOLR_HOME/lib today. The
>> disadvantage of $SOLR_HOME/lib is that it can be anywhere, perhaps on a
>> Docker volume or a different disk, so you cannot e.g make a Dockerfile like
>>
>> FROM solr:9.0
>> ADD foo.jar /var/solr/data/lib/foo.jar
>>
>> ...since /var/solr/data is a volume and will resolve to the volume
>> partition of the user, not the content from the image. So if we instead
>> allow users to do
>>
>> FROM solr:9.0
>> ADD foo.jar /opt/solr/lib/
>>
>> That is both logical and beautiful, and would always work.
>>
>> I'll create a JIRA :)
>>
>> Jan
>>
>> On 13 Jan 2022, at 13:57, Jan Høydahl wrote:
>>
>> There is not a lack of vision for future local and remote package
>> repositories, but the story is that package mgmt development has stalled,
>> and is out of reach for 1st party pkgs in the 9.0.0 timeframe.
>> So we have to think progress over perfection - once again
>>
>> Phase 1 (9.0): Modularize Solr by extracting obvious low-hanging-fruit
>> plugins into contribs/modules. Make it super easy to launch Solr with any of
>> these on the class-path (SOLR-15914
>> <https://issues.apache.org/jira/browse/SOLR-15914>).
>> Phase 2 (9.x): Evolve the package manager and make it possible to optionally
>> install the modules as 1st party packages instead (still fat distro)
>> Phase 3 (10.0?): Extract even more features as modules, and publish all
>> modules as separate delivery artifacts on DLCDN
>>
>> Regarding phase 2 in 9.x. We cannot really extract a feature into a
>> module in e.g. 9.1 so users upgrading from 9.0 will get
>> ClassNotFoundException. That breaks back-compat. But perhaps we could
>> continue modularization efforts in 9.x if we make sure that all new modules
>> extracted in a minor release are automatically added to the classloader?
>> Then the classes will disappear from solr-core.jar so would possibly break
>> someone's custom embedded usecase, but 99% of users would be unaffected.
>> Wdyt?
>>
>> In any case, I think for 9.x the realistic route is to keep our fat tgz,
>> but make it slimmer by removing redundancy and prune down on th

Re: Quarterly Committer Meetings

2022-01-13 Thread David Smiley
I created a wiki page for the meeting:
https://cwiki.apache.org/confluence/display/SOLR/2022-01-20+Meeting+notes

I propose that we try to keep this meeting to one topic: 9.0, which is big
because it covers any changes that may or may not make 9.0.  The meeting
cadence will increase so we'll get to other topics.

Add your name to the page as an attendee if you'd like to attend.  I'll
create a meeting invite eventually and definitely will share the link
directly/privately and also in #asf-slack because it's not public.


Some 9.0 proposals of mine

2022-01-13 Thread David Smiley
The following changes are on my mind for 9.0, and some others.  I will do
many; others I'm a reviewer for a contributor. I think these are best done
on a major release, but this isn't to say each is "important".  Jan (as
RM), please let me know what you think.  I suppose I need your approval?

Observability:
* https://issues.apache.org/jira/browse/SOLR-14686 Remove log "[coreName]"
(logid) which is redundant with MDC
-- PR just updated and tagged some possible reviewers.  No feedback yet :-/
 I'll merge soon.
* https://issues.apache.org/jira/browse/SOLR-15905 "Don't automatically
register Solr's metrics with JMX (SolrJmxReporter)"
-- Yet our default solr.xml could keep it?  ETA Jan 24
* https://issues.apache.org/jira/browse/SOLR-14401 ""distrib" request
handler metrics should only be tracked on pertinent handlers"
-- Looking for some feedback on the issue first.  ETA Jan 24

Highlighting:
* https://issues.apache.org/jira/browse/SOLR-15259 lower default
hl.fragAlignRatio
-- minor change but better to change highlighting fragment defaults in a
major release but not critical.  ETA: Jan 17
* https://issues.apache.org/jira/browse/SOLR-12901 Make UnifiedHighlighter
the default
-- There are some overlaps between the highlighters but I definitely think
the UH is the best highlighter.  ETA Jan 21
-- separate issue, TBD: removing the big/verbose configs for the other
highlighters from the default solrconfig.xml to keep it leaner.

Docker:
* JIRA TBD, Java 17 runtime.  ETA Jan 17

Filter.java; remove/hide
* https://issues.apache.org/jira/browse/SOLR-12336 "Remove Filter from Solr"
-- a contributor has something but is getting approval.  If we don't get
this in time, I could do something simple to just ensure the class isn't
public.

Nested Docs:
* https://issues.apache.org/jira/browse/SOLR-15064 "Atomic/partial updates
to nested docs should not assume _route_ param is the root ID"
-- Debt/confusion to be removed. ETA 21 Jan

Modularizing:
* Solrj-Zookeeper: ETA Jan 17
* https://issues.apache.org/jira/browse/SOLR-14660  HDFS (or Hadoop?)
-- Waiting for the contributor to return to it.  See the issue for
discussion on what to do if it stagnates further.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Modularizing Solr with new contrib packages

2022-01-14 Thread David Smiley
I believe the root cause here is fixed by my "Immutable Infrastructure"
adherence proposal relating to a new SOLR_VAR:
https://lists.apache.org/thread/3vvld3xnndtthtl7sfgdbsgkbtpm55b0
Thus SOLR_HOME stays with the solr installation; mutable data like the
indexes go in a new SOLR_VAR -- ultimately the same path to the data that
exists today.  But since SOLR_HOME stays with Solr, so does the lib and
thus it's easy to mount in some other path or whatever.

I didn't create a JIRA issue... I've been extremely busy.  But before I do,
WDYT about this?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 14, 2022 at 4:20 AM Jan Høydahl  wrote:

> Yep, have also been using SOLR_HOME/lib for years. But for a recent
> client, they needed to package up 2-3 plugin jars into the docker image, so
> then we tried $SOLR_HOME/lib, but since /var/solr/data is defined as a
> Docker volume in our Dockerfile, it won't help copying libs in that
> location in custom Dockerfile, since at runtime the volume location will be
> used instead, where some old jars would be used instead. So we added the
> libs to some /opt/foo/lib folder, and made an init-script in
> "/docker-entrypoint-initdb.d/" that on container startup would do a "rm
> /var/solr/data/lib/*.jar && cp /opt/foo/lib/*.jar /var/solr/data/lib/",
> i.e. clean up existing jars from the docker-host's existing volume and copy
> in the fresh plugin jars from the newest image. Phew. And the same with
> solr.xml initialization...
>
> Of course we could have used export SOLR_OPTS=$SOLR_OPTS
> -Dsolr.sharedLib=/opt/foo/lib or something, but it is still not super easy.
> So that's what the new standard location tries to solve - you load code
> from a stable path, not together with your data.
>
> Jan
>
> On 13 Jan 2022, at 19:04, David Smiley wrote:
>
> +1 to your phasing.
>
>
>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to the
>> classloader
>
> I'll create a JIRA :)
>
>
> SOLR_HOME/lib is already supported --
> https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-main/libs.html
> This is what I recommend people use in general.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Thu, Jan 13, 2022 at 10:59 AM Houston Putman 
> wrote:
>
>> It could very well be worth shipping two docker images in the meantime.
>>> Or maybe a zip of each module could be a separate artifact that is
>>> published?  I'm not sure what freedoms we have to do this in the ASF.
>>>
>>
>> I think for 9.0 we could realistically shoot for 2 binary releases and 2
>> docker images, slim (without the modules) and full-featured (with the
>> modules), having the full-featured be the default.
>>
>> Starting in the 9.x line, we could start packaging the modules as
>> separate binary artifacts for the solr release. Then in 10.x we can make
>> the slim release be the default (still having the fat tgz available as well,
>> as solr-extended-10.0.0.tgz or something like that).
>>
>>
>>> Phase 1 (9.0): Modularize Solr by extracting obvious low-hanging-fruit
>>> plugins into contribs/modules. Make it super easy to launch Solr with any of
>>> these on the class-path (SOLR-15914
>>> <https://issues.apache.org/jira/browse/SOLR-15914>).
>>> Phase 2 (9.x): Evolve the package manager and make it possible to optionally
>>> install the modules as 1st party packages instead (still fat distro)
>>> Phase 3 (10.0?): Extract even more features as modules, and publish all
>>> modules as separate delivery artifacts on DLCDN
>>>
>>
>> I really like this plan. I agree that for 9.x we really don't have an option
>> but to keep publishing the fat tgz as the default. Even in 10.x I think we
>> want to offer both a full-featured download and a slim download, but with
>> first-party packages we can make slim the "default".
>>
>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to the
>>> classloader
>>
>> I'll create a JIRA :)
>>
>>
>> Yes please. That would be a lovely improvement! People currently bend over
>> backward to add custom libs.
>>
>> - Houston
>>
>> On Thu, Jan 13, 2022 at 8:09 AM Jan Høydahl 
>> wrote:
>>
>>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to
>>> the classloader, similar to what we have with $SOLR_HOME/lib today. The
>>> disadvantage of $SOLR_HOME/lib is that it can be anywhere, perhap

Re: Modularizing Solr with new contrib packages

2022-01-14 Thread David Smiley
Fair points.  I might take a stab at this on the weekend to see.

I propose no change to the SOLR_HOME detection logic, which will naturally
end up being SOLR_INSTALL/server/solr (where solr.xml is).  Docker stuff
won't need to set it / play games as it does now.
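
To make the proposed split concrete, here is a rough Java sketch of how the three directories could be resolved.  Everything in it is an assumption for illustration: the class SolrDirsSketch is hypothetical, the fallback of SOLR_VAR to SOLR_HOME is my guess at a compatible default, and none of this is Solr's existing detection logic.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

/** Hypothetical holder for the directories discussed above; not Solr's actual API. */
final class SolrDirsSketch {
  final Path home; // solr.xml and lib; stays with the installation
  final Path var;  // mutable data such as indexes
  final Path lib;  // plugin jars, resolved under home

  private SolrDirsSketch(Path home, Path var) {
    this.home = home;
    this.var = var;
    this.lib = home.resolve("lib");
  }

  /** Resolves the directories from an install dir and an environment-like map. */
  static SolrDirsSketch resolve(Path installDir, Map<String, String> env) {
    String homeEnv = env.get("SOLR_HOME");
    // Default taken from the thread: SOLR_INSTALL/server/solr, where solr.xml is.
    Path home = (homeEnv == null || homeEnv.isEmpty())
        ? installDir.resolve("server").resolve("solr")
        : Paths.get(homeEnv);

    String varEnv = env.get("SOLR_VAR");
    // Assumption: without SOLR_VAR, data stays under home, as it does today.
    Path var = (varEnv == null || varEnv.isEmpty()) ? home : Paths.get(varEnv);

    return new SolrDirsSketch(home, var);
  }

  public static void main(String[] args) {
    SolrDirsSketch dirs = resolve(Paths.get("/opt/solr"), System.getenv());
    System.out.println("home=" + dirs.home + " var=" + dirs.var + " lib=" + dirs.lib);
  }
}
```

With an install at /opt/solr and SOLR_VAR pointing at the data volume, lib resolves under the installation rather than under the volume, which is the property the Docker use case needs.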

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 14, 2022 at 9:08 AM Jan Høydahl  wrote:

> Hmm, yea it's always been a bit odd how SOLR_HOME does not point to where
> you untared solr, i.e. /opt/solr, like for every other software out there.
> So I support such a change.
> Will SOLR_VAR be exactly what the old SOLR_HOME was, i.e. /var/solr/data,
> or will it point to /var/solr? It's also a bit odd how we don't (I think)
> have a var pointing to /var/solr as laid out by the install script and in
> Dockerfile.
>
> Such a change will have to happen either in 9.0 or 10.0. Sounds a tad too
> large for 9.0, since it's not even started. But a JIRA is a good start.
> Perhaps it is easier than we imagine, and suddenly someone has put up a
> PR? :)
>
> I did not quite get where you wanted the "new" SOLR_HOME to point to. I
> think if we should change anything, it should point to the root of the Solr
> installation?
>
> Jan
>
> On 14 Jan 2022, at 14:47, David Smiley wrote:
>
> I believe the root cause here is fixed by my "Immutable Infrastructure"
> adherence proposal relating to a new SOLR_VAR:
> https://lists.apache.org/thread/3vvld3xnndtthtl7sfgdbsgkbtpm55b0
> Thus SOLR_HOME stays with the solr installation; mutable data like the
> indexes go in a new SOLR_VAR -- ultimately the same path to the data that
> exists today.  But since SOLR_HOME stays with Solr, so does the lib and
> thus it's easy to mount in some other path or whatever.
>
> I didn't create a JIRA issue... I've been extremely busy.  But before I
> do, WDYT about this?
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Jan 14, 2022 at 4:20 AM Jan Høydahl  wrote:
>
>> Yep, have also been using SOLR_HOME/lib for years. But for a recent
>> client, they needed to package up 2-3 plugin jars into the docker image, so
>> then we tried $SOLR_HOME/lib, but since /var/solr/data is defined as a
>> Docker volume in our Dockerfile, it won't help copying libs in that
>> location in custom Dockerfile, since at runtime the volume location will be
>> used instead, where some old jars would be used instead. So we added the
>> libs to some /opt/foo/lib folder, and made an init-script in
>> "/docker-entrypoint-initdb.d/" that on container startup would do a "rm
>> /var/solr/data/lib/*.jar && cp /opt/foo/lib/*.jar /var/solr/data/lib/",
>> i.e. clean up existing jars from the docker-host's existing volume and copy
>> in the fresh plugin jars from the newest image. Phew. And the same with
>> solr.xml initialization...
>>
>> Of course we could have used export SOLR_OPTS=$SOLR_OPTS
>> -Dsolr.sharedLib=/opt/foo/lib or something, but it is still not super easy.
>> So that's what the new standard location tries to solve - you load code
>> from a stable path, not together with your data.
>>
>> Jan
>>
>> On 13 Jan 2022, at 19:04, David Smiley wrote:
>>
>> +1 to your phasing.
>>
>>
>>> Another minor improvement for users is if we pre-add $SOLR_TIP/lib to
>>> the classloader
>>
>> I'll create a JIRA :)
>>
>>
>> SOLR_HOME/lib is already supported --
>> https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-main/libs.html
>> This is what I recommend people use in general.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Thu, Jan 13, 2022 at 10:59 AM Houston Putman 
>> wrote:
>>
>>> It could very well be worth shipping two docker images in the meantime.
>>>> Or maybe a zip of each module could be a separate artifact that is
>>>> published?  I'm not sure what freedoms we have to do this in the ASF.
>>>>
>>>
>>> I think for 9.0 we could realistically shoot for 2 binary releases and 2
>>> docker images, slim (without the modules) and full-featured (with the
>>> modules), having the full-featured be the default.
>>>
>>> Starting in the 9.x line, we could start packaging the modules as
>>> separate binary artifacts for the solr release. Then in 10.x we can make
>>> the slim release be the default (still having the fat tgz available a

Re: Some 9.0 proposals of mine

2022-01-14 Thread David Smiley
I was merely listing PRs I am working on or am closely involved with; it's
not meant to exclude what others are working on :-)

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 14, 2022 at 12:45 PM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Hi Alessandro,
> I'd strongly appreciate if this can be included in 9.0. It is a very
> thorough work and clearly the best highlight of 9.0, if you're able to
> include it. Thanks for your work.
> Regards,
> Ishan
>
> On Fri, 14 Jan, 2022, 10:59 pm Alessandro Benedetti, <
> abenede...@apache.org> wrote:
>
>> What about the Neural Search contribution I am working on:
>> https://github.com/apache/solr/pull/476 ?
>> It's pretty much done and low risk (just waiting for some additional
>> review, but I hope I fixed most of the concerns found so far).
>> To me, it could easily be on 9.1 but the fact that it is completed could
>> spark some discussions about including it in 9.0 or not :)
>>
>> Cheers
>> --
>> Alessandro Benedetti
>> Apache Lucene/Solr PMC member and Committer
>> Director, R&D Software Engineer, Search Consultant
>>
>> www.sease.io
>>
>>
>> On Fri, 14 Jan 2022 at 04:47, David Smiley  wrote:
>>
>>> The following changes are on my mind for 9.0, and some others.  I will
>>> do many; others I'm a reviewer for a contributor. I think these are best
>>> done on a major release, but this isn't to say each is "important".  Jan
>>> (as RM), please let me know what you think.  I suppose I need your approval?
>>>
>>> Observability:
>>> * https://issues.apache.org/jira/browse/SOLR-14686 Remove log
>>> "[coreName]" (logid) which is redundant with MDC
>>> -- PR just updated and tagged some possible reviewers.  No feedback yet
>>> :-/  I'll merge soon.
>>> * https://issues.apache.org/jira/browse/SOLR-15905 "Don't automatically
>>> register Solr's metrics with JMX (SolrJmxReporter)"
>>> -- Yet our default solr.xml could keep it?  ETA Jan 24
>>> * https://issues.apache.org/jira/browse/SOLR-14401 ""distrib" request
>>> handler metrics should only be tracked on pertinent handlers"
>>> -- Looking for some feedback on the issue first.  ETA Jan 24
>>>
>>> Highlighting:
>>> * https://issues.apache.org/jira/browse/SOLR-15259 lower default
>>> hl.fragAlignRatio
>>> -- minor change but better to change highlighting fragment defaults in a
>>> major release but not critical.  ETA: Jan 17
>>> * https://issues.apache.org/jira/browse/SOLR-12901 Make
>>> UnifiedHighlighter the default
>>> -- There are some overlaps between the highlighters but I definitely
>>> think the UH is the best highlighter.  ETA Jan 21
>>> -- separate issue, TBD: removing the big/verbose configs for the other
>>> highlighters from the default solrconfig.xml to keep it leaner.
>>>
>>> Docker:
>>> * JIRA TBD, Java 17 runtime.  ETA Jan 17
>>>
>>> Filter.java; remove/hide
>>> * https://issues.apache.org/jira/browse/SOLR-12336 "Remove Filter from
>>> Solr"
>>> -- a contributor has something but is getting approval.  If we don't get
>>> this in time, I could do something simple to just ensure the class isn't
>>> public.
>>>
>>> Nested Docs:
>>> * https://issues.apache.org/jira/browse/SOLR-15064 "Atomic/partial
>>> updates to nested docs should not assume _route_ param is the root ID"
>>> -- Debt/confusion to be removed. ETA 21 Jan
>>>
>>> Modularizing:
>>> * Solrj-Zookeeper: ETA Jan 17
>>> * https://issues.apache.org/jira/browse/SOLR-14660  HDFS (or Hadoop?)
>>> -- Waiting for the contributor to return to it.  See the issue for
>>> discussion on what to do if it stagnates further.
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>


Re: Tests: Flag to run only unit tests

2022-01-14 Thread David Smiley
Something is needed.

I think unit tests should allow use of a single embedded Solr server (not
Jetty and thus not SolrCloud either).  Whatever we choose, we should
document this in Javadocs on the annotation so we can point ourselves &
contributors to correct use of these annotations.

We could even have MiniSolrCloudCluster or the Jetty Solr runner do a
runtime check to see whether the current test is classified as an
integration test, and fail otherwise.  Just an idea to help us enforce
correct use of this.
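
A rough sketch of that idea in Java, purely as an assumption of how it might look: the @IntegrationTest annotation and the IntegrationTestGuard class are hypothetical names, not existing Solr or Lucene test-framework code.  Marking the annotation @Inherited means subclasses of an annotated base class count as integration tests too (this inheritance applies to superclasses only, not to interfaces).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Inherited;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Hypothetical marker annotation; not an existing Solr or Lucene annotation. */
@Inherited
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface IntegrationTest {}

/** Hypothetical guard that a cluster or Jetty runner could call when it starts up. */
final class IntegrationTestGuard {
  private IntegrationTestGuard() {}

  /** True if the class, or one of its superclasses, carries @IntegrationTest. */
  static boolean isIntegrationTest(Class<?> testClass) {
    return testClass.isAnnotationPresent(IntegrationTest.class);
  }

  /** Fails fast when an untagged test tries to start a cluster. */
  static void requireIntegrationTest(Class<?> testClass) {
    if (!isIntegrationTest(testClass)) {
      throw new IllegalStateException(
          testClass.getName() + " starts a cluster but is not annotated @IntegrationTest");
    }
  }

  public static void main(String[] args) {
    System.out.println(isIntegrationTest(ExampleCloudTest.class));
    System.out.println(isIntegrationTest(ExampleUnitTest.class));
  }
}

/** Stand-in for a cluster test base class; annotated once here. */
@IntegrationTest
abstract class ExampleCloudTestBase {}

/** Inherits the marker from its base class. */
class ExampleCloudTest extends ExampleCloudTestBase {}

/** A plain unit test with no marker. */
class ExampleUnitTest {}
```

How the runner would learn which test class is currently executing is left open here; that part depends on the test framework.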

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 14, 2022 at 10:47 AM Jan Høydahl  wrote:

> Created https://issues.apache.org/jira/browse/SOLR-15925 for this
> Feel free to continue the concept discussion here and tech discussion in
> the JIRA.
> PS: I'm not planning to assign myself to this JIRA, so up for grabs.
>
> Jan
>
> On 14 Jan 2022, at 15:06, Uwe Schindler wrote:
>
> +1
>
> On Jenkins we would enable those.
>
> On 14 January 2022 at 12:53:21 UTC, "Jan Høydahl" <
> jan@cominvent.com> wrote:
>>
>> Tests take forever to run, and we have totally intermingled unit tests
>> (testing one class or with mocks) with integration tests (spinning up solr
>> clusters, indexing etc), which is not good project hygiene imo.
>> Can we start tagging integration tests in a way so we can choose to leave
>> them out for the quick dev iterations? We already have these properties for
>> test filtering:
>>
>> // filtering
>> [propName: 'tests.filter', value: null, description: "Applies a test filter (see :helpTests)."],
>> [propName: 'tests.slow', value: true, description: "Enables or disables @Slow tests."],
>> [propName: 'tests.nightly', value: false, description: "Enables or disables @Nightly tests."],
>> [propName: 'tests.weekly', value: false, description: "Enables or disables @Weekly tests."],
>> [propName: 'tests.monster', value: false, description: "Enables or disables @Monster tests."],
>> [propName: 'tests.awaitsfix', value: null, description: "Enables or disables @AwaitsFix tests."],
>> [propName: 'tests.badapples', value: null, description: "Enables or disables @BadApple tests."],
>>
>> So perhaps add an @IntegrationTest annotation and a
>> -Ptests.integrationtest=true/false flag. Then we don't need to move test
>> files to another folder; annotating them is enough. For a start,
>> all SolrCloudTestCase tests could be annotated (are annotations inherited?)
>> Then I imagine I'd run unit tests frequently and the whole suite
>> occasionally, right before merging a large feature.
>> It would also be interesting to run only unit tests and then have a look
>> at the JaCoCo test coverage stats. I have a suspicion we have not been good
>> enough at writing basic unit tests, including for failure and corner cases.
>>
>> Don't get me wrong. I think it is crucial to test real, live clusters for
>> a project like Solr, and we should keep the tests (and stabilize them). But
>> we cannot focus only on integration. It is developer hostile :)
>> I'd like a "gradlew test" to take 3-5 minutes, at most. Perhaps "gradlew
>> check" could include integration tests while "test" doesn't?
>> I also know that Mark Miller could write 24 e-mails on this topic and how
>> fast our tests could be. In fact, lightning fast I'm told. But let's not go
>> there in this thread :) Progress over perfection.
>>
>> Jan
>>
>> --
> Uwe Schindler
> Achterdiek 19, 28357 Bremen
> https://www.thetaphi.de
>
>
>


Re: Propose changing the "dist" layout

2022-01-14 Thread David Smiley
I think your proposal is basically a competitive/alternative proposal to my
proposal for SOLR_VAR in
https://lists.apache.org/thread/3vvld3xnndtthtl7sfgdbsgkbtpm55b0
which you completely agreed with.
In addition to my proposal, we could move SOLR_HOME from
$SOLR_TIP/server/solr to $SOLR_TIP/home since that's what Solr knows this
as and there's no reason, I think, for it to be under Jetty's "server".  I
know the purpose of "solr home" / "home" is maybe ambiguous to a new
user, but it's been a Solr concept since forever, and it can hold
not just configuration but data as well, depending on how someone
configures/uses it.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jan 13, 2022 at 11:27 AM Houston Putman  wrote:

> So would make sense to move configsets out to $SOLR_TIP/configsets and
>>> have the default location of SOLR_HOME be not "./solr" but $TEMP/solr or
>>> something?
>>>
>>
>> Regardless of the rest of this discussion I think this would be a big
>> win. Currently it's a pain (especially for new users/contributors) to find
>> the default configsets that solr ships with. This would make it a lot
>> easier.
>>
>
> Looking at this again, maybe we just move solr/server/solr to
> solr/defaults or something else. This directory can be copied to $SOLR_HOME
> when starting solr, but I think it's nice to have all of those defaults
> (configsets, solr.xml, zoo.cfg) together.
>
> On Thu, Jan 13, 2022 at 11:08 AM Houston Putman 
> wrote:
>
>> So would make sense to move configsets out to $SOLR_TIP/configsets and
>>> have the default location of SOLR_HOME be not "./solr" but $TEMP/solr or
>>> something?
>>>
>>
>> Regardless of the rest of this discussion I think this would be a big
>> win. Currently it's a pain (especially for new users/contributors) to find
>> the default configsets that solr ships with. This would make it a lot
>> easier.
>>
>> That would also have the benefit of never polluting the git area with
>>> test data?
>>>
>>
>> Spent a few hours chasing down an issue that turned out to be this, so I
>> am very much +1 here
>>
>> +1 renaming for contribs -> *modules*
>> +1 for re-structuring dist
>>
>> On Thu, Jan 13, 2022 at 8:12 AM Jan Høydahl 
>> wrote:
>>
>>> See https://issues.apache.org/jira/browse/SOLR-15917 for renaming of
>>> "contribs" discussion.
>>>
>>> Wrt server/ folder that is in reality the jetty distribution with an
>>> added "solr" folder for historic reasons. Yea, it is confusing. The "solr"
>>> folder will become $SOLR_HOME if you start Solr without an explicit
>>> $SOLR_HOME var. But it is also the authoritative location of our
>>> config-sets, even if you have $SOLR_HOME somewhere else (I believe?). So
>>> would make sense to move configsets out to $SOLR_TIP/configsets and have
>>> the default location of SOLR_HOME be not "./solr" but $TEMP/solr or
>>> something? That would also have the benefit of never polluting the git area
>>> with test data?
>>>
>>> Jan
>>>
>>> 13. jan. 2022 kl. 12:00 skrev Alessandro Benedetti <
>>> abenede...@apache.org>:
>>>
>>> +1 renaming for contribs -> plugins
>>> +1 for re-structuring dist
>>>
>>> I also would like to raise some concern over the 'server' directory
>>> structure:
>>> README.txt lib resources solr-webapp
>>> contexts logs scripts start.jar
>>> etc modules solr
>>>
>>> I have been using it for years, and still, sometimes I get lost, some of
>>> them are not really clear:
>>> *resources* -> very generic, resources for what? currently, it contains
>>> logging configurations
>>> *etc* -> even more generic, currently it contains a lot of jetty stuff?
>>> *contexts* -> also in this case, not sure what to expect in here (web
>>> app context I suppose?)
>>> *solr* -> we are in the solr binary directory, so 'solr' is again very
>>> generic, this feels like the solr-home to me?
>>> *modules* -> also in this case, not sure what to expect
>>> *solr-webapp* -> is it just the UI? also in this case, it sounds a bit
>>> generic
>>> *start.jar* -> It's clear it's the entrypoint jar, but what does it
>>> contain?
>>>
>>> Cheers
>>>
>>> On Thu, 13 Jan 202

Re: Some 9.0 proposals of mine

2022-01-14 Thread David Smiley
I like letting features bake in main & 9x!
I don't plan to merge anything to 9.0 until after the committer meeting,
and then will do a dev list follow-up so anyone can weigh in if they didn't
attend.

Two problems:
(1) CHANGES.txt management has always been a pain (IMO) and it could be
here especially if we don't even know if it's going to make 9.0 initially.
We'd have to move the bullets around and do extra commits and deal with the
merge issues (albeit minor).  We could deliberately not add some
CHANGES.txt entries until the release inclusion decision is made?  (An
aside: removing the CHANGES.txt in lieu of other things that mostly exist
would be awesome, but I digress.)
(2) Some changes should be made in a major release only.  I suppose
it could then simply stay in main... but if it's big/impactful then the
timing is terrible for our backports over the next year.  Oh well, I guess.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 14, 2022 at 2:59 PM Jan Høydahl  wrote:

> Thanks for being so structured and transparent about your proposals,
> David. I'll not comment on each JIRA here but reply to your "approval"
> question.
>
> As RM my primary concern is to see to it that the release happens, in due
> time, with good quality and with no last-minute changes that cause
> instability.
> I do NOT have a need to display authority as RM, by pretending to know
> better than anyone else what is best for the project.
> So I'll likely respect any consensus from the committer's meeting next
> week wrt JIRAs to add as blockers for 9.0.
>
> I'm pleased to see the momentum on wrapping things up, this is exactly how
> preparing for a major release in a healthy project should be.
> We all now realize that we have to let go of some hopes we had for 9.0.
> Still, many impactful changes are still within reach if they are not too
> intrusive.
> I think starting a new thread here on dev@ for each proposal has worked
> really well so far. Perhaps tag the subject with [9.0 PROPOSAL] ?
>
> Wrt branches, let's utilize the main -> branch_9x -> if(blocker) {
> branch_9_0 } workflow.
> It is much easier to advocate for a blocker if it has baked in main for a
> week or two!
>
> Jan
>
>
> 14. jan. 2022 kl. 05:47 skrev David Smiley :
>
> The following changes are on my mind for 9.0, and some others.  I will do
> many; others I'm a reviewer for a contributor. I think these are best done
> on a major release, but this isn't to say each is "important".  Jan (as
> RM), please let me know what you think.  I suppose I need your approval?
>
> Observability:
> * https://issues.apache.org/jira/browse/SOLR-14686 Remove log
> "[coreName]" (logid) which is redundant with MDC
> -- PR just updated and tagged some possible reviewers.  No feedback yet
> :-/  I'll merge soon.
> * https://issues.apache.org/jira/browse/SOLR-15905 "Don't automatically
> register Solr's metrics with JMX (SolrJmxReporter)"
> -- Yet our default solr.xml could keep it?.  ETA Jan 24
> * https://issues.apache.org/jira/browse/SOLR-14401 ""distrib" request
> handler metrics should only be tracked on pertinent handlers"
> -- Looking for some feedback on the issue first.  ETA Jan 24
>
> Highlighting:
> * https://issues.apache.org/jira/browse/SOLR-15259 lower default
> hl.fragAlignRatio
> -- minor change but better to change highlighting fragment defaults in a
> major release but not critical.  ETA: Jan 17
> * https://issues.apache.org/jira/browse/SOLR-12901 Make
> UnifiedHighlighter the default
> -- There are some overlaps between the highlighters but I definitely think
> the UH is the best highlighter.  ETA Jan 21
> -- separate issue, TBD: removing the big/verbose configs for the other
> highlighters from the default solrconfig.xml to keep it leaner.
>
> Docker:
> * JIRA TBD, Java 17 runtime.  ETA Jan 17
>
> Filter.java; remove/hide
> * https://issues.apache.org/jira/browse/SOLR-12336 "Remove Filter from
> Solr"
> -- a contributor has something but is getting approval.  If we don't get
> this in time, I could do something simple to just ensure the class isn't
> public.
>
> Nested Docs:
> * https://issues.apache.org/jira/browse/SOLR-15064 "Atomic/partial
> updates to nested docs should not assume _route_ param is the root ID"
> -- Debt/confusion to be removed. ETA 21 Jan
>
> Modularizing:
> * Solrj-Zookeeper: ETA Jan 17
> * https://issues.apache.org/jira/browse/SOLR-14660  HDFS (or Hadoop?)
> -- Waiting for the contributor to return to it.  See the issue for
> discussion on what to do if it stagnates further.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
>


Moving out all Hadoop plugins into one module

2022-01-20 Thread David Smiley
The issue https://issues.apache.org/jira/browse/SOLR-14660 is about moving
the HDFS plugins out of core into a module.  While a great thing, it still
leaves quite a few Hadoop related dependencies in solr-core because Hadoop
is not there only for HDFS; it's there for some exotic authentication &
authorization plugins.  In that JIRA issue I proposed that this module be
"hadoop" and have any hadoop related plugins.

As a quick experiment, I commented out the hadoop-auth dependency and tried
to compile to see what the compiler caught. It exposed the following two
Solr plugins:
* HadoopAuthPlugin
* KerberosPlugin

Are we okay with expanding the scope of SOLR-14660 to include these?

Note that SOLR-14660 *might* result in 9.0 not including this module in the
release distribution if we don't feel the module will be sufficiently ready
to release.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Moving out all Hadoop plugins into one module

2022-01-20 Thread David Smiley
Separate modules will mean our distro will end up duplicating hadoop-common
and other related JARs for both modules.  I was trying to be practical.
But it's not important to me; ok.

implementation ('org.apache.hadoop:hadoop-common') { transitive = false } // too many to ignore
implementation ('org.apache.hadoop:hadoop-annotations')
runtimeOnly 'org.apache.htrace:htrace-core4' // note: removed in Hadoop 3.3.2
runtimeOnly "org.apache.commons:commons-configuration2"

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jan 20, 2022 at 4:55 PM Kevin Risden  wrote:

> My preference would be as a separate HadoopAuthentication or something
> module. HDFS the filesystem / blockcache / etc support is unique and
> separate from the authentication part. It shouldn't all be in one module.
>
> Kevin Risden
>
>
> On Thu, Jan 20, 2022 at 4:48 PM David Smiley  wrote:
>
>> The issue https://issues.apache.org/jira/browse/SOLR-14660 is about
>> moving the HDFS plugins out of core into a module.  While a great thing, it
>> still leaves quite a few Hadoop related dependencies in solr-core because
>> Hadoop is not there only for HDFS; it's there for some exotic
>> authentication & authorization plugins.  In that JIRA issue I proposed that
>> this module be "hadoop" and have any hadoop related plugins.
>>
>> As a quick experiment, I commented out the hadoop-auth dependency and
>> tried to compile to see what the compiler caught. It exposed the following
>> two Solr plugins:
>> * HadoopAuthPlugin
>> * KerberosPlugin
>>
>> Are we okay with expanding the scope of SOLR-14660 to include these?
>>
>> Note that SOLR-14660 *might* result in 9.0 not including this module in
>> the release distribution if we don't feel the module will be sufficiently
>> ready to release.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>


Re: Quarterly Committer Meetings

2022-01-20 Thread David Smiley
Thanks for the comments Shawn but maybe follow-ups to your ideas should be
a new email thread so we can keep this thread about the meeting itself.

I just want to follow up on this thread as I've done in the past.  We
actually got through the agenda
https://cwiki.apache.org/confluence/display/SOLR/2022-01-20+Meeting+notes
and also discussed CHANGES.txt and change tracking generally at the end.
Instead of summarizing everything, I think we should try to keep matters to
their respective JIRA tickets and/or dev list threads as applicable.  I see
comments on some JIRAs today for some issues we discussed.  Maybe we don't
need the "Action Items" template Confluence added.

CHANGES.txt / change tracking doesn't have a dev list thread but deserves
one.  I don't have time for it at the moment TBH; hopefully someone starts
that important discussion.  I'm sure there's an existing thread that can be
re-awoken.

I'll bring up the scheduling/organizing of the next meeting in a new thread
now so that it isn't forgotten.

It was great to hear all the peer-to-peer compliments at the end.  If I
could have kept the meeting going, we could have kept that going for twice
as long :-)

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


2022 April, Solr committer meeting

2022-01-20 Thread David Smiley
I propose that the next committer meeting be held on Thursday, April 14th.
I'd be happy to host again unless someone wishes to take over.  The time is
TBD.

Continuing from today's meeting: We discussed the meeting invitation itself
a little.  To be clear, all committers are officially invited via the email
announcement to the dev list.  As a practical matter, I'm skeptical
about adding 89 committers[1] to my Google Calendar.  Setting that up seems
like a pain and I wonder if a list that long is allowed.  Instead, I
propose I duplicate the previous one, thus carry the list forward.  This is
less work for everyone, I think.  When someone next takes over for me as
host, we'll try to work this out.

Perhaps the next meeting should be at a time more friendly to other
timezones like those in India?  Just an idea; I feel bad when some of us
can't attend.  Even if there are no requests for this, I propose we at
least move the next meeting to one hour earlier than previously to make it
a little nicer for Europe timezones.  Pacific can wake up at 8am --
personally I wake up at 6:40am every day :-)

[1]: https://projects.apache.org/committee.html?solr

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Tagger Handler response format - change proposition

2022-01-20 Thread David Smiley
I posted a JIRA issue & PR: https://issues.apache.org/jira/browse/SOLR-15944

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sun, Aug 29, 2021 at 3:13 PM Christine Poerschke (BLOOMBERG/ LONDON) <
cpoersc...@bloomberg.net> wrote:

> Based on similar unexpected formatting elsewhere in the code base (in the
> past) and a quick look at the code I would guess that
> https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.9.0/solr/core/src/java/org/apache/solr/handler/tagger/TaggerRequestHandler.java#L191
> allocating SimpleOrderedMap instead of NamedList might give the proposed
> change.
>
> From: dev@solr.apache.org At: 08/29/21 16:54:53 UTC+1:00
> To: dev@solr.apache.org
> Subject: Re: Tagger Handler response format - change proposition
>
> How about the json.nl parameter? That ought to control how this is
> output.
>
> On Sun, Aug 29, 2021, 10:49 Eric Pugh 
> wrote:
>
>> Interesting change, and yes, that makes sense to me.  Would love to see a
>> PR.
>>
>> On Aug 28, 2021, at 3:32 AM, Łukasz Sokołowski <
>> lukasz.sokolow...@xtech.pl> wrote:
>>
>> Hi,
>>
>> Forgive me for using this channel, but I didn’t find a better place to ask
>> about, and possibly propose, the change below.
>>
>> I’m implementing content tagging with the Solr Tagger Handler and
>> I’m wondering why “tags” in the Tagger response is an array of arrays
>> instead of an array of objects?
>>
>> As in example from
>> https://solr.apache.org/guide/8_7/the-tagger-handler.html
>>
>> {
>>   "responseHeader":{
>>     "status":0,
>>     "QTime":1},
>>   "tagsCount":1,
>>   "tags":[[
>>       "startOffset",6,
>>       "endOffset",19,
>>       "ids",["5128581"]]],
>>   "response":{"numFound":1,"start":0,"docs":[
>>       {
>>         "id":"5128581",
>>         "name":["New York City"],
>>         "countrycode":["US"]}]
>>   }}
>>
>> Wouldn’t it be more intuitive and easier to use like this:
>>
>>
>> …
>>   "tags":[{
>>       "startOffset":6,
>>       "endOffset":19,
>>       "ids":["5128581"]}],
>> …
>>
>>
>>
>> Best regards,
>> Łukasz Sokołowski
>>
>>
>> ___
>> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
>> | http://www.opensourceconnections.com | My Free/Busy
>> <http://tinyurl.com/eric-cal>
>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>> This e-mail and all contents, including attachments, is considered to be
>> Company Confidential unless explicitly stated otherwise, regardless
>> of whether attachments are marked as such.
>>
>>
>
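
To make the difference between the two response shapes in the quoted
thread concrete, here is a small plain-Java sketch.  It does not use any
Solr classes; FlatPairsToObject is a hypothetical name, and the code only
shows how a flat list of alternating names and values (the current "tags"
entry) corresponds to the name/value object form that was proposed.  This
is also the kind of conversion a client would otherwise have to do itself.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of Solr or SolrJ.
final class FlatPairsToObject {

  /** Turns [name1, value1, name2, value2, ...] into an insertion-ordered map. */
  static Map<String, Object> toObject(List<?> flat) {
    if (flat.size() % 2 != 0) {
      throw new IllegalArgumentException("expected alternating name/value entries");
    }
    Map<String, Object> out = new LinkedHashMap<>();
    for (int i = 0; i < flat.size(); i += 2) {
      out.put((String) flat.get(i), flat.get(i + 1));
    }
    return out;
  }

  public static void main(String[] args) {
    // One entry of "tags" as it appears in the example response above.
    List<Object> tag = List.of("startOffset", 6, "endOffset", 19, "ids", List.of("5128581"));
    System.out.println(toObject(tag));
  }
}
```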


Re: [DISCUSS] Standardizing module naming

2022-01-21 Thread David Smiley
Now is a great time to do some name changes.  I suggest that you make a
specific proposal of what the names should be.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 21, 2022 at 8:18 AM Alessandro Benedetti 
wrote:

> I would also add a tangential question (rather than answers at this point):
> What makes a module(contrib) a module(contrib)?
> *From now on I'll use 'module' where I intend a package under contrib.*
>
> I am referring to first-party modules such as ltr or langid.
> My initial understanding was that a module in contrib, is an integration
> with some external dependency (like langid with OpenNLP, Tika or
> langdetect).
> But then, why is *ltr* a module? It doesn't really integrate with any
> external dependency.
> It's additional query parsers and components for a key Solr functionality.
> Is it just a legacy consequence of the fact that initially, Bloomberg
> contributed the module?
> Maybe this applies to other modules as well (analytics?).
> Then, should this be fixed and brought inside the Solr core?
>
> And what about first party/third party modules?
> I don't think there's any visible difference right now, but in case we
> want to make a difference, should we create a sort of official "Solr Plugin
> Marketplace" ?
> (I proposed the idea to Lucidworks many years ago when I was working for a
> partner, and for a certain amount of time, I think there was a Solr Plugin
> Marketplace, but it was proprietary).
>
> I am curious to understand what you think about this and then reason about
> the naming convention.
>
> Cheers
>
>
> --
> Alessandro Benedetti
> Apache Lucene/Solr PMC member and Committer
> Director, R&D Software Engineer, Search Consultant
>
> www.sease.io
>
>
> On Fri, 21 Jan 2022 at 10:47, Jan Høydahl  wrote:
>
>> Hi,
>>
>> In
>> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
>> I suggested standardizing contrib/module names. We did not discuss it in
>> yesterday's committer meeting, and it may be a bit too much for 9.0. But
>> I'd like to discuss it, since we are anyway renaming everything in
>> SOLR-15917 "contrib->module".
>> With as few contribs as we had so far it has not really been an issue.
>> But the reason I suggested it is because I anticipate a huge growth in
>> number of modules/packages during 9.x, and it can get messy. Another reason
>> for having a convention is that it forces the module/package creator to
>> think through whether the proposed module has the right granularity. Take
>> for instance the new "HDFS" or "Hadoop" module. It won't fit into either of
>> my proposed types, as it contains both a directoryFactory, one or two
>> authentication plugins and one backup repository. That of course suggests
>> that the module is too big and should be divided. Another reason is that
>> when we have 50 modules / packages it would be far better for users to be
>> able to find all backup repositories by looking for backup-* rather than
>> guess from naming what it is. Perhaps a bad example since both repo
>> contribs have a suffix "-repository" today. But then "-repository" is not
>> as user friendly as "backup-".
>>
>> So I guess I'd like your opinion on
>>
>> 1) Do we even want a convention (at least for our own code?)
>> 2) If yes, should we rename the contribs/modules for 9.0 when we throw
>> them around anyway?
>> 3) When we start adding package manifests to the modules, should there be
>> a 1:1 between module name and package name?
>>
>> Regarding the last point, we could apply such a standardized naming
>> convention for the packages only and leave module names as-is, i.e. you'd
>> do "solr package install update-extraction" even if the module name is
>> "extraction".
>>
>> Jan
>>
>


Re: Modularizing Solr with new contrib packages

2022-01-21 Thread David Smiley
Yeah +1 to increase modularization in general.  If for some reason this
makes the functionality harder to use (which I sympathize with), I think we
should instead direct our energy to making modules/packages easier to use.
I'm thrilled about Jan's proposal to simply list the module names on
bin/solr at startup.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 21, 2022 at 8:55 AM Jan Høydahl  wrote:

> There was a message from Alessandro in another thread
> <https://lists.apache.org/thread/q014rj1o4cnhq6olr3krnm66q44868x6>, which
> I'll answer here instead:
> (I'm posting the full question below my answers here for context)
>
> Many questions, and definitely a tangent :) But let me try a short answer
>
> What makes a module(contrib) a module(contrib)?
>
>
> Historically I think it actually was some new contribution that we wanted
> to be optional, perhaps since it was experimental, perhaps due to extra
> dependencies, don't know.
> I think the funny name "contrib" has kept us from putting new
> functionality in modules, and instead committers have continued to add
> all kinds of things to solr-core.
> What I'm trying to push for with the dev-doc linked in the original mail
> is to shift our mindsets.
> You should have a really good reason for NOT putting some new
> functionality that is not going to be used by a majority of users, into a
> module.
>
> *From now on I'll use 'module' where I intend a package under contrib.*
>
>
> This is still in flux until
> https://issues.apache.org/jira/browse/SOLR-15917 lands on main. After
> that "contrib" will be history both in code and docs.
>
> I am referring to first-party modules such as ltr or langid.
> My initial understanding was that a module in contrib, is an integration
> with some external dependency (like langid with OpenNLP, Tika or
> langdetect).
> But then, why is *ltr* a module? It doesn't really integrate with any
> external dependency.
>
>
> Again, it's all historic reasons here. Bloomberg wisely contributed ltr
> and analysis as contrib modules. Kudos!
>
> Then, should this be fixed and brought inside the Solr core?
>
>
> Rather the opposite. We should lift much more non-core features out of
> core and into modules. That's why I wrote a scaffoldNewModule.py script
> to lower the bar for those wanting to do that. I'm lifting out
> JWTAuthPlugin in https://issues.apache.org/jira/browse/SOLR-15907 since
> it is not needed by all users.
>
> And what about first party/third party modules?
> I don't think there's any visible difference right now, but in case we
> want to make a difference, should we create a sort of official "Solr Plugin
> Marketplace" ?
>
>
> A 1st party package will be (still not ready) a module that has a manifest
> added to it and that can be installed locally via the package manager.
>
> The pkg manager has a concept of package repos. So you can already today
> add a remote repo, see examples in the dev-doc.
> See also https://issues.apache.org/jira/browse/SOLR-14688 for a proposed
> 1st party package design. That JIRA is 1.5 years old :)
> The thought is that the Solr project will at some point release our modules
> as separate JAR files to the download repository, and publish
> a repository.json file at solr.apache.org/repository or similar. Then we
> can release a slim tgz with only solr-core, and users can pull
> down the packages they like. Perhaps we'll see some of this materialize in
> 9.x or at least in 10.0. Until then, all we have is
> contribs (soon to be named modules) :)
>
> Jan
>
> 21. jan. 2022 kl. 14:17 skrev Alessandro Benedetti :
>
> I would also add a tangential question (rather than answers at this point):
> What makes a module(contrib) a module(contrib)?
> *From now on I'll use 'module' where I intend a package under contrib.*
>
> I am referring to first-party modules such as ltr or langid.
> My initial understanding was that a module in contrib, is an integration
> with some external dependency (like langid with OpenNLP, Tika or
> langdetect).
> But then, why is *ltr* a module? It doesn't really integrate with any
> external dependency.
> It's additional query parsers and components for a key Solr functionality.
> Is it just a legacy consequence of the fact that initially, Bloomberg
> contributed the module?
> Maybe this applies to other modules as well (analytics?).
> Then, should this be fixed and brought inside the Solr core?
>
> And what about first party/third party modules?
> I don't think there's any visible difference right now, but in case we
> want to make a difference, should we create a sort of official "Solr Plugin
> Marketplace" ?
> (I proposed the idea to Lucidworks many years ago when I was working for a
> partner, and for a certain amount of time, I think there was a Solr Plugin
> Marketplace, but it was proprietary).
>
> I am curious to understand what you think about this and then reason about
> the naming convention.
>
> Cheers
>
>
>
> ---
>


Re: [DISCUSS] Standardizing module naming

2022-01-21 Thread David Smiley
+1 I like your proposed names.  Some of our names are so short now that
only we know what they are at a glance.
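
A rough sketch of how the "<type>-<name>" convention from the quoted
proposal could be checked mechanically.  ModuleNameCheck is a hypothetical
name, and the set of type prefixes contains only the three that appear in
the examples in this thread ("backup", "update", "search"); the real list
of types is still open and is an assumption here.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not an existing Solr build check.
final class ModuleNameCheck {

  // Assumption: only the prefixes used in the thread's examples.
  static final Set<String> TYPES = Set.of("backup", "update", "search");

  // A lowercase type, a hyphen, then a lowercase name that may itself contain hyphens.
  static final Pattern SHAPE = Pattern.compile("([a-z]+)-([a-z0-9]+(?:-[a-z0-9]+)*)");

  /** Returns true if the name has the form type-name with a known type prefix. */
  static boolean followsConvention(String moduleName) {
    Matcher m = SHAPE.matcher(moduleName);
    return m.matches() && TYPES.contains(m.group(1));
  }

  public static void main(String[] args) {
    for (String name : new String[] {"backup-s3", "update-langid", "extraction", "ltr"}) {
      System.out.println(name + " -> " + followsConvention(name));
    }
  }
}
```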


~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 21, 2022 at 11:01 AM Jan Høydahl  wrote:

> There is kind of a proposal in
> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
> already, but I'd like to discuss the general idea and what structure makes
> the most sense here. With my "type" proposal, you can easily map the new
> names for the various contribs, e.g. "backup-s3", "backup-gce",
> "update-extraction", "update-langid", "search-analytics" etc. Other
> structures are also probably possible? Or we could just leave it up to each
> module author as before :)
>
> Jan
>
> 21. jan. 2022 kl. 15:25 skrev David Smiley :
>
> Now is a great time to do some name changes.  I suggest that you make a
> specific proposal of what the names should be.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Jan 21, 2022 at 8:18 AM Alessandro Benedetti 
> wrote:
>
>> I would also add a tangential question (rather than answers at this
>> point):
>> What makes a module(contrib) a module(contrib)?
>> *From now on I'll use 'module' where I intend a package under contrib.*
>>
>> I am referring to first-party modules such as ltr or langid.
>> My initial understanding was that a module in contrib, is an integration
>> with some external dependency (like langid with OpenNLP, Tika or
>> langdetect).
>> But then, why is *ltr* a module? It doesn't really integrate with any
>> external dependency.
>> It's additional query parsers and components for a key Solr functionality.
>> Is it just a legacy consequence of the fact that initially, Bloomberg
>> contributed the module?
>> Maybe this applies to other modules as well (analytics?).
>> Then, should this be fixed and brought inside the Solr core?
>>
>> And what about first party/third party modules?
>> I don't think there's any visible difference right now, but in case we
>> want to make a difference, should we create a sort of official "Solr Plugin
>> Marketplace" ?
>> (I proposed the idea to Lucidworks many years ago when I was working for
>> a partner, and for a certain amount of time, I think there was a Solr
>> Plugin Marketplace, but it was proprietary).
>>
>> I am curious to understand what you think about this and then reason
>> about the naming convention.
>>
>> Cheers
>>
>>
>> --
>> Alessandro Benedetti
>> Apache Lucene/Solr PMC member and Committer
>> Director, R&D Software Engineer, Search Consultant
>>
>> www.sease.io
>>
>>
>> On Fri, 21 Jan 2022 at 10:47, Jan Høydahl  wrote:
>>
>>> Hi,
>>>
>>> In
>>> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
>>> I suggested standardizing contrib/module names. We did not discuss it in
>>> yesterday's committer meeting, and it may be a bit too much for 9.0. But
>>> I'd like to discuss it, since we are anyway renaming everything in
>>> SOLR-15917 "contrib->module".
>>> With as few contribs as we had so far it has not really been an issue.
>>> But the reason I suggested it is because I anticipate a huge growth in
>>> number of modules/packages during 9.x, and it can get messy. Another reason
>>> for having a convention is that it forces the module/package creator to
>>> think through whether the proposed module has the right granularity. Take
>>> for instance the new "HDFS" or "Hadoop" module. It won't fit into either of
>>> my proposed types, as it contains both a directoryFactory, one or two
>>> authentication plugins and one backup repository. That of course suggests
>>> that the module is too big and should be divided. Another reason is that
>>> when we have 50 modules / packages it would be far better for users to be
>>> able to find all backup repositories by looking for backup-* rather than
>>> guess from naming what it is. Perhaps a bad example since both repo
>>> contribs have a suffix "-repository" today. But then "-repository" is not
>>> as user friendly as "backup-".
>>>
>>> So I guess I'd like your opinion on
>>>
>>> 1) Do we even want a convention (at least for our own code?)
>>> 2) If yes, should we rename the contribs/modules for 9.0 when we throw
>>> them around anyway?
>>> 3) When we start adding package manifests to the modules, should there
>>> be a 1:1 between module name and package name?
>>>
>>> Regarding the last point, we could apply such a standardized naming
>>> convention to the packages only and leave module names as-is, i.e. you'd
>>> do "solr package install update-extraction" even if the module name is
>>> "extraction".
>>>
>>> Jan
>>>
>>
>


Re: Proposal for node configs to adhere to immutable infrastructure

2022-01-24 Thread David Smiley
There is an alternative approach to achieve the aims here, which I think is
a bit simpler.  No SOLR_VAR env var; we don't need it.  Instead, SOLR_HOME
plays that role.  Anything that goes in SOLR_HOME would be resolved with
defaulting logic off of SOLR_TIP (the Solr install dir) at the typical paths
from there (e.g. server/solr).  That means solr.xml, zoo.cfg, configSets,
fileStore, lib, ... -- if none of this is in SOLR_HOME, then it will be
resolved against SOLR_TIP/server/solr.  It would be fine to start Solr with
an empty SOLR_HOME for a new node (SOLR-9575).  SOLR_TIP is to be treated as
read-only.
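
To make the defaulting idea concrete, here is a minimal sketch in Python.  It
is not Solr code: the function name resolve_node_file is invented for
illustration, and it assumes the fallback is applied per file (solr.xml,
zoo.cfg, ...) rather than all-or-nothing, which the proposal does not spell
out.

```python
from pathlib import Path
import tempfile


def resolve_node_file(name, solr_home, solr_tip):
    """Prefer SOLR_HOME/name; otherwise fall back to SOLR_TIP/server/solr/name.

    Hypothetical helper: the per-file fallback is an assumption made for
    this sketch, and SOLR_TIP is only ever read, never written.
    """
    candidate = Path(solr_home) / name
    if candidate.exists():
        return candidate
    return Path(solr_tip) / "server" / "solr" / name


def demo():
    """Build a throwaway install dir and node dir, then resolve two files."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        tip = root / "opt" / "solr"    # stands in for SOLR_TIP
        home = root / "var" / "solr"   # stands in for SOLR_HOME
        defaults = tip / "server" / "solr"
        defaults.mkdir(parents=True)
        home.mkdir(parents=True)
        (defaults / "solr.xml").write_text("<solr/>")
        (defaults / "zoo.cfg").write_text("tickTime=2000\n")
        # The node overrides solr.xml only; zoo.cfg comes from the install.
        (home / "solr.xml").write_text("<solr><!-- node override --></solr>")
        found_in_home = resolve_node_file("solr.xml", home, tip)
        found_in_tip = resolve_node_file("zoo.cfg", home, tip)
        return (found_in_home.relative_to(root).as_posix(),
                found_in_tip.relative_to(root).as_posix())
```

With an empty SOLR_HOME every lookup would fall through to the install
directory, which is the "new node" case mentioned above.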

Furthermore, SOLR_HOME ought to point to /var/solr (not /var/solr/data) and
coreRootDirectory ought to default to "data", thus keeping cores separate
from our other stuff.  This should have been done a long time ago.
Related: https://issues.apache.org/jira/browse/SOLR-11508

Any concerns?


Re: New branch and feature freeze for Solr 9.0.0

2022-01-29 Thread David Smiley
SOLR-15064 "Atomic/partial updates to nested docs should not assume _route_
param is the root ID" will be simple; I will do it.

Regarding the UnifiedHighlighter (UH) being the default: I'm about to post a PR.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sat, Jan 29, 2022 at 5:11 PM Jan Høydahl  wrote:

> Hi,
>
> Nine days ago we had 24 blockers. Several blockers have since been added
> and several resolved, and today we are at 19.
>
> I notified in the previous email that I'd remove inactive blockers. Here
> is the list of JIRAs for which I intend to remove the blocker flag on
> February 1st:
>
> SOLR-14097  coreRootDirectory should be solr_home/cores
> SOLR-15064  Atomic/partial updates to nested docs should not assume
> _route_ param is the root ID
> SOLR-15242  Consolidate README.md with solr/README.md
> SOLR-15096  [REGRESSION] Collection Delete Performance significantly
> degraded in Java 11 v 8
> SOLR-15223  Deprecate HttpSolrClient, mark httpcomponents dep as
> "optional" in SolrJ
> SOLR-15835  Collection creation failing with https
>
> Shout out if you are working on one of these and expect it to be finished
> soon-ish.
>
> That leaves 13 blockers:
>
> Issue key   Summary
>  AssigneeReporter
> SOLR-14660  Migrating HDFS into a module
> krisden ichattopadhyaya
> SOLR-15956  Add documentation for creating a docker image from the binary
> dis  houston houston
> SOLR-13138  Remove deprecated code prior to 9.0
>  romseygeek
> SOLR-15556  Ref Guide Redesign Phase 3: Replace Jekyll
> ctargettctargett
> SOLR-15949  Use Java 17 in docker
>  dsmiley dsmiley
> SOLR-15926  Fix version specification in the Solr Ref Guide
>  houston
> SOLR-14290  Fix NPE in SolrTestCaseJ4 breaking external usage for
> master/9.x   gus gus
> SOLR-12901  Make UnifiedHighlighter the default
>  dsmiley dsmiley
> SOLR-14401  "distrib" request handler metrics should only be tracked on
> pe  dsmiley dsmiley
> SOLR-15587  Replicas end up with base_url as http on client side even if
> clus  thelabdude  thelabdude
> SOLR-15557  Figure out how to handle ref guide page renames/redirects
>  ctargett
> SOLR-15898  Complete Major changes and Upgrade Notes in RefGudie for 9.0.0
> janhoy
> SOLR-15321  Flesh out process for managing/storing "official"
> Dockerfiles   houston hossman
>
> It seems like most of these have had some recent activity.
> Please everyone, have a look if you can lend a hand with any of these, so
> we can get the list to zero early in February and do the first RC.
> I'd appreciate some help on SOLR-15898, consolidating and structuring the
> "Major changes" chapter of the reference guide.
>
> Jan
>
>
> On 20 Jan 2022, at 19:48, Jan Høydahl wrote:
>
> Hi,
>
> The list of release blockers can be seen with this JIRA filter:
> https://issues.apache.org/jira/issues/?filter=12351219
>
> After the committer's meeting today we decided to add these to the
> blockers list:
> - SOLR-15556 Ref Guide Redesign Phase 3: Replace Jekyll
> - SOLR-15917 Rename 'contrib' as 'module'
> - SOLR-15880 Introduce Support to K Nearest Neighbors Search
> - SOLR-14660 Migrating HDFS into a package
> - SOLR-12901 Make UnifiedHighlighter the default
> - SOLR-15914 Make it super simple to add a contrib module to shared
> classpath
> - And probably some minor ones too
>
> Each of these is in-flight and is expected to be ready really soon™.
>
> There are currently 24 blockers, but some of those are Unassigned and/or
> have not been given any attention for some time.
> On Feb 1st I'll take the freedom to remove blocker flag for those that
> have not moved anywhere since.
> If you want to own one of them, please assign yourself and communicate
> progress and an ETA.
>
> Jan
>
>
>


Re: [DISCUSS] Standardizing module naming

2022-01-30 Thread David Smiley
Yes; I was thinking something like this as well.  This way we can make
meaningful progress on modularization during the 9x series without breaking
compatibility.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sat, Jan 29, 2022 at 4:37 PM Jan Høydahl  wrote:

> Hi,
>
> There does not seem to be overwhelming support for enforcing a naming
> convention - at least not yet.
> So let the suggestion be a recommendation, and we'll see during 9.x what
> naming makes sense for new modules.
>
> I thought about whether we can extract code from solr-core into modules in
> a 9.x minor release.
> If it breaks existing use, e.g. package name change, or if the plugin is
> no longer on classpath by default, we cannot.
> But if we want to extract a certain feature, such as Hadoop-Auth, in 9.1 -
> if we keep the package name and make the new module included in
> SOLR_MODULES by default, then perhaps? Views?
>
> Jan
>
>
> On 24 Jan 2022, at 17:23, Jason Gerlowski wrote:
>
>
> 1. [Do we want a convention?] I'd be fine with a convention as long as
> we're willing to be flexible on it or evolve it as more modules come in.
> If we're expecting that 9.x will bring in other new modules but we don't
> know what those are, then we can't be too strict on any particular naming.
>
> 2. [should we rename the contribs/modules for 9.0 when we throw them
> around anyway?] Sure, +1 to the proposed names.
>
> Jason
>
> On Fri, Jan 21, 2022 at 1:53 PM Houston Putman 
> wrote:
>
>> I agree that standardizing the names would be nice.
>>
>> Another good option is to have a ref-guide page that lists all the
>> modules, explains their purpose and links to relevant documentation.
>> This page could be broken down by feature, much like your proposed names
>> would be.
>>
>> On Fri, Jan 21, 2022 at 1:47 PM David Smiley  wrote:
>>
>>> +1 I like your proposed names.  Some of our names are so short now that
>>> only we know what they are at a glance.
>>>
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>>
>>> On Fri, Jan 21, 2022 at 11:01 AM Jan Høydahl 
>>> wrote:
>>>
>>>> There is kind of a proposal in
>>>> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
>>>> already, but I'd like to discuss the general idea and what structure makes
>>>> the most sense here. With my "type" proposal, you can easily map the new
>>>> names for the various contribs, e.g. "backup-s3", "backup-gce",
>>>> "update-extraction", "update-langid", "search-analytics" etc. Other
>>>> structures are also probably possible? Or we could just leave it up to each
>>>> module author as before :)
>>>>
>>>> Jan
>>>>
>>>> On 21 Jan 2022, at 15:25, David Smiley wrote:
>>>>
>>>> Now is a great time to do some name changes.  I suggest that you make a
>>>> specific proposal of what the names should be.
>>>>
>>>> ~ David Smiley
>>>> Apache Lucene/Solr Search Developer
>>>> http://www.linkedin.com/in/davidwsmiley
>>>>
>>>>
>>>> On Fri, Jan 21, 2022 at 8:18 AM Alessandro Benedetti <
>>>> a.benede...@sease.io> wrote:
>>>>
>>>>> I would also add a tangential question (rather than answers at this
>>>>> point):
>>>>> What makes a module(contrib) a module(contrib)?
>>>>> *From now on I'll use 'module' where I intend a package under contrib.*
>>>>>
>>>>> I am referring to first-party modules such as ltr or langid.
>>>>> My initial understanding was that a module in contrib, is an
>>>>> integration with some external dependency (like langid with OpenNLP, Tika
>>>>> or langdetect).
>>>>> But then, why is *ltr* a module? It doesn't really integrate with any
>>>>> external dependency.
>>>>> It's additional query parsers and components for a key Solr
>>>>> functionality.
>>>>> Is it just a legacy consequence of the fact that initially, Bloomberg
>>>>> contributed the module?
>>>>> Maybe this applies to other modules as well (analytics?).
>>>>> Then, should this be fixed and brought inside the Solr core?
>>>>>

Re: [DISCUSS] Standardizing module naming

2022-01-31 Thread David Smiley
I don't think we should embrace the module system in 9x; too much pain for
too little gain.  Maybe the only exception would be the SolrJ side as it
affects our users more directly.

As for changing package names without breaking back-compat: depending on the
plugin, it may already be supported via the "solr.MyClassName" pattern, since
a class referenced that way does not name its package.  These will work
automatically.  If needed, we could do some small hack in
SolrResourceLoader to remap older class names to new ones.
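
A toy model of that remapping idea, written in Python rather than as actual
SolrResourceLoader code.  Everything here is an assumption made for
illustration: the function name, the remap table, the ".hadoop" target
package, and the short list of sub-packages searched for the "solr."
shorthand.

```python
# Hypothetical old -> new names; the ".hadoop" package is invented here.
LEGACY_CLASS_NAMES = {
    "org.apache.solr.security.HadoopAuthPlugin":
        "org.apache.solr.security.hadoop.HadoopAuthPlugin",
}

# Illustrative subset of sub-packages searched for the "solr." shorthand.
SHORTHAND_SUBPACKAGES = ("", "handler.", "search.", "security.")


def resolve_class_name(configured, available):
    """Map a configured class name onto a class that is actually available."""
    # Step 1: rewrite a moved fully qualified name, if it is in the table.
    name = LEGACY_CLASS_NAMES.get(configured, configured)
    # Step 2: expand the package-independent "solr.X" shorthand.
    if name.startswith("solr."):
        simple = name[len("solr."):]
        for sub in SHORTHAND_SUBPACKAGES:
            candidate = "org.apache.solr." + sub + simple
            if candidate in available:
                return candidate
        raise LookupError("no class found for " + configured)
    if name in available:
        return name
    raise LookupError("no class found for " + configured)
```

The shorthand branch is why such configs survive a package move without any
table entry; only fully qualified names need the explicit remap.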

Another interesting case would be auto-registered handlers like /sql for
modularizing the SQLHandler --
https://issues.apache.org/jira/browse/SOLR-15904. I think we could support
this by making ImplicitPlugins.json loading tolerant of a
ClassNotFoundException.  Alternatively it could be interesting to support a
way for a module to self-declare plugins that should be automatically
registered... although that would need to be reconciled with the package
manager's approach.
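
The "tolerant loading" option could look roughly like the Python sketch
below.  It is only an analogy: the JSON layout is a simplified guess at the
shape of ImplicitPlugins.json, Python's ImportError stands in for Java's
ClassNotFoundException, and the module name solr_sql_module is made up so
that the lookup fails on purpose.

```python
import importlib
import json

# Simplified stand-in for ImplicitPlugins.json (layout assumed, not copied).
IMPLICIT_PLUGINS_JSON = """
{
  "requestHandler": {
    "/select": {"class": "collections.OrderedDict"},
    "/sql": {"class": "solr_sql_module.SQLHandler"}
  }
}
"""


def load_class(dotted_name):
    """Import 'package.module.ClassName' and return the class object."""
    module_name, _, class_name = dotted_name.rpartition(".")
    module = importlib.import_module(module_name)  # ImportError if absent
    return getattr(module, class_name)


def load_implicit_handlers(text):
    """Register every handler whose class can be loaded; skip the rest."""
    registered, skipped = {}, []
    for path, info in json.loads(text)["requestHandler"].items():
        try:
            registered[path] = load_class(info["class"])
        except (ImportError, AttributeError):
            # The module providing this handler is not installed; carry on.
            skipped.append(path)
    return registered, skipped
```

In this sketch /sql is silently skipped when its module is missing and would
be registered as soon as the module is on the path, with no config change.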

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Jan 31, 2022 at 6:01 AM Jan Høydahl  wrote:

> Could this work? :
>
> Say, in 9.1 that we move feature FOO out of core into a module. We
> introduce a new SOLR_DEFAULT_MODULES=foo variable that will append to
> SOLR_MODULES,
> so if a 9.0 user has SOLR_MODULES=extracting and upgrades to 9.1, they will
> not need to change anything and will get both. But if they do not need FOO,
> then they
> can get rid of it from classpath by setting SOLR_DEFAULT_MODULES="". Then
> in 10.0 SOLR_DEFAULT_MODULES is again empty.
>
> The only thing I'm worried about is split packages. E.g. HadoopAuthPlugin
> lives in org.apache.solr.security which will be shared with core. As I
> understand, that may be a
> problem for JavaDoc, and for java module system if we want to embrace it.
> Anything else? Is there a clever way we could change package name in 9.1
> without
> breaking back-compat?
>
> Jan
>
> On 31 Jan 2022, at 04:11, David Smiley wrote:
>
> Yes; I was thinking something like this as well.  This way we can make
> meaningful progress on modularization during the 9x series without breaking
> compatibility.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Sat, Jan 29, 2022 at 4:37 PM Jan Høydahl  wrote:
>
>> Hi,
>>
>> There does not seem to be overwhelming support for enforcing a naming
>> convention - at least not yet.
>> So let the suggestion be a recommendation, and we'll see during 9.x what
>> naming makes sense for new modules.
>>
>> I thought about whether we can extract code from solr-core into modules
>> in a 9.x minor release.
>> If it breaks existing use, e.g. package name change, or if the plugin is
>> no longer on classpath by default, we cannot.
>> But if we want to extract a certain feature, such as Hadoop-Auth, in 9.1
>> - if we keep the package name and make the new module included in
>> SOLR_MODULES by default, then perhaps? Views?
>>
>> Jan
>>
>>
>> On 24 Jan 2022, at 17:23, Jason Gerlowski wrote:
>>
>>
>> 1. [Do we want a convention?] I'd be fine with a convention as long as
>> we're willing to be flexible on it or evolve it as more modules come in.
>> If we're expecting that 9.x will bring in other new modules but we don't
>> know what those are, then we can't be too strict on any particular naming.
>>
>> 2. [should we rename the contribs/modules for 9.0 when we throw them
>> around anyway?] Sure, +1 to the proposed names.
>>
>> Jason
>>
>> On Fri, Jan 21, 2022 at 1:53 PM Houston Putman 
>> wrote:
>>
>>> I agree that standardizing the names would be nice.
>>>
>>> Another good option is to have a ref-guide page that lists all the
>>> modules, explains their purpose and links to relevant documentation.
>>> This page could be broken down by feature, much like your proposed names
>>> would be.
>>>
>>> On Fri, Jan 21, 2022 at 1:47 PM David Smiley  wrote:
>>>
>>>> +1 I like your proposed names.  Some of our names are so short now that
>>>> only we know what they are at a glance.
>>>>
>>>>
>>>> ~ David Smiley
>>>> Apache Lucene/Solr Search Developer
>>>> http://www.linkedin.com/in/davidwsmiley
>>>>
>>>>
>>>> On Fri, Jan 21, 2022 at 11:01 AM Jan Høydahl 
>>>> wrote:
>>>>
>>>>> There is kind of a proposal in
>>>>> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
>>

Re: Warning - Ref Guide has been migrated to Antora

2022-02-10 Thread David Smiley
This is a nice fresh look; thanks Cassandra, Houston, and Mike!

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Feb 10, 2022 at 12:47 PM Houston Putman  wrote:

> Hey everyone. The new ref guide is officially merged into all relevant
> branches (down to branches_9_0). Luckily, there shouldn't be much change to
> your workflow!
>
> *Building*
> In order to build a local site, use: "gradlew buildLocalSite", or just
> "gradlew assemble".
> The output will be in "solr/solr-ref-guide/build/site/index.html", but
> this is also output when the task is run.
>
> *Page source*
> The source for the pages is now found under the
> "solr/solr-ref-guide/modules" directory. At first it might be hard to find
> files, but they are pretty logically separated out.
>
> The syntax is still asciidoctor, but there is a large change in how you
> link between pages in the ref guide. You can find lots of examples
> throughout the existing pages, but it is documented here:
> https://github.com/apache/solr/blob/main/dev-docs/ref-guide/asciidoc-syntax.adoc#link-to-other-pagessections-of-the-guide
>
> Please make sure that the merge goes cleanly for the PRs you have already
> created, before the new ref-guide was committed. The only real issue you
> should see is the new link syntax, mentioned above, but there is a
> possibility there will be worse problems. I'm happy to help with any merge
> issues you run into so please reach out.
>
> *Check it out*
> You can check out the local build here:
> https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-antora/solr/10_0/
>
> That link is up-to-date as of yesterday. Soon we will have it up to date
> with all current versions (9.0, 9.1 and 10.0), hopefully tomorrow at some
> point. (I will also go through and make sure we didn't backport things from
> 10 and 9.1 to 9.0 that shouldn't have been included...)
>
>
> Thanks to Cassandra for doing the heavy lifting here (and Mike as well)!
> This is a major improvement for our docs, and I'm really excited to have it
> out there soon!
>
> - Houston
>


Re: Warning - Ref Guide has been migrated to Antora

2022-02-10 Thread David Smiley
OMG I see we have search now -- good search with snippets too!  How does it
work?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Feb 10, 2022 at 1:09 PM David Smiley  wrote:

> This is a nice fresh look; thanks Cassandra, Houston, and Mike!
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Thu, Feb 10, 2022 at 12:47 PM Houston Putman 
> wrote:
>
>> Hey everyone. The new ref guide is officially merged into all relevant
>> branches (down to branches_9_0). Luckily, there shouldn't be much change to
>> your workflow!
>>
>> *Building*
>> In order to build a local site, use: "gradlew buildLocalSite", or just
>> "gradlew assemble".
>> The output will be in "solr/solr-ref-guide/build/site/index.html", but
>> this is also output when the task is run.
>>
>> *Page source*
>> The source for the pages is now found under the
>> "solr/solr-ref-guide/modules" directory. At first it might be hard to find
>> files, but they are pretty logically separated out.
>>
>> The syntax is still asciidoctor, but there is a large change in how you
>> link between pages in the ref guide. You can find lots of examples
>> throughout the existing pages, but it is documented here:
>> https://github.com/apache/solr/blob/main/dev-docs/ref-guide/asciidoc-syntax.adoc#link-to-other-pagessections-of-the-guide
>>
>> Please make sure that the merge goes cleanly for the PRs you have already
>> created, before the new ref-guide was committed. The only real issue you
>> should see is the new link syntax, mentioned above, but there is a
>> possibility there will be worse problems. I'm happy to help with any merge
>> issues you run into so please reach out.
>>
>> *Check it out*
>> You can check out the local build here:
>> https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-antora/solr/10_0/
>>
>> That link is up-to-date as of yesterday. Soon we will have it up to date
>> with all current versions (9.0, 9.1 and 10.0), hopefully tomorrow at some
>> point. (I will also go through and make sure we didn't backport things from
>> 10 and 9.1 to 9.0 that shouldn't have been included...)
>>
>>
>> Thanks to Cassandra for doing the heavy lifting here (and Mike as well)!
>> This is a major improvement for our docs, and I'm really excited to have it
>> out there soon!
>>
>> - Houston
>>
>


Re: Propose Solr 9 *Docker* image use Java 17

2022-02-11 Thread David Smiley
Closing the loop here: https://issues.apache.org/jira/browse/SOLR-15949 for
Java 17 via the Eclipse Temurin distribution.  I'll merge this weekend.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sun, Jan 9, 2022 at 3:01 AM Mark Miller  wrote:

> That is generally an expected trade off when using the new collectors -
> perhaps a 15% hit to throughput due to locking in concurrent compaction but
> better latency with the shorter pauses.
>
> The last I saw, for smaller heaps, G1 generally wins  - better
> throughput and latency that’s just as good.
>
> For larger heaps, if memory is not a constraint, the new collectors win.
>
> If memory is a constraint, you pay for the better latency with throughput
> with the new collectors.
>
> G1 remains a good default generally.
>
>
>   *Mark Miller*
>
> On January 8, 2022 at 2:33 GMT, Shawn Heisey  wrote:
>
>
> On 1/6/2022 2:03 PM, David Smiley wrote:
> > The new Shenandoah GC looks exciting but may not be sufficiently ready
> > for us to recommend (if I recall from a recent user who reported a
> > problem with it) -- and that's okay.
>
> I've done some experiments with Shenandoah and ZGC. My index is tiny,
> 155297 docs and a total index size of 644.22MB. I'm running it with a
> max heap of 512MB.
>
> When I uploaded the GC logs to gceasy, the newer collectors showed much
> smaller pauses than G1, and more collections. I think the overall pause
> time is significantly smaller, but throughput took a hit. It was a
> larger impact than I imagined. Last time I tested it, a full rebuild of
> the index took about 8 minutes with G1, and over 9 minutes with the
> newer collectors. That index is built and used by my dovecot install.
> There's pretty much no query activity on my system, so I used the full
> index rebuild to exercise it.
>
> I had a remote co-conspirator on this testing. On that user's index,
> their reindex procedure failed to complete at all with the newer
> collectors, but it did work with G1. I did not get any details about
> how it failed, but I know that their testing and mine were done with
> OpenJDK 11.
>
> I was going to do some additional testing with OpenJDK 17.0.1, but I
> found that when I tried to start Solr with Shenandoah, it has the
> following in the console log:
>
> Error occurred during initialization of VM
> Option -XX:+UseShenandoahGC not supported
>
> This is very weird, because I saw an article about sub-millisecond
> pauses using OpenJDK 17 and Shenandoah. Unless Oracle has decided to
> pull Shenandoah completely in-house and not make it available in OpenJDK
> any more.
>
> Thanks,
> Shawn
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> For additional commands, e-mail: dev-h...@solr.apache.org
>
>
>
>


Re: Warning - Ref Guide has been migrated to Antora

2022-02-14 Thread David Smiley
When I do "gw check", I see an error about an "npx" program not being
found:

> Process 'command
> '/Users/dsmiley/SFDCDev/solr_main/.gradle/node/nodejs/node-v16.13.2-darwin-x64/bin/npx''
> finished with non-zero exit value 1
>

Um... are there new prerequisites to do a build?  Is this documented?


Re: Define what requires a JIRA and an entry in CHANGES.txt?

2022-02-16 Thread David Smiley
This has been discussed and documented before:
https://cwiki.apache.org/confluence/display/SOLR/Commit+Process+Guidelines#CommitProcessGuidelines-UpdatingCHANGES.txt

In general I push for us to not waste our time on busy-work like this, not
to mention hassling everyone else in perpetuity who reads the CHANGES.txt
to read that some test doesn't need a dependency it once needed ;-).
Fundamentally, ask yourself *who cares*?  Are there user-perceivable
consequences?  Are there non-trivial risks?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Feb 16, 2022 at 2:22 AM Eric Pugh 
wrote:

> Hi all!  Do we have a written definition of what requires a JIRA and/or
> an entry in CHANGES.txt?   Something that could go in our developer docs?
>
> Is there a consensus on this?   For example, changes to documentation
> typically have NOT gone into the CHANGES.txt file.   It appears that some
> refactorings don't need a JIRA issue either.
>
> Thoughts on this?
>
>
> Eric
>
> ___
> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
> | http://www.opensourceconnections.com | My Free/Busy
> <http://tinyurl.com/eric-cal>
> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
>
>


Re: Request for a mentor

2022-02-21 Thread David Smiley
Hi Raveesh!

Have you thought of participating in Google Summer of Code or similar
programs?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sun, Feb 20, 2022 at 11:04 PM Raveesh Gupta 
wrote:

> Hi,
>
>> My name is Raveesh Gupta and I am a university student who would like to
>> contribute to Apache Lucene. I am very passionate about search engines and
>> information retrieval. Moreover I would like a mentor to show the
>> direction. Please find my qualifications from my resume attached. Kindly
>> consider me as I am crazy about building things from scratch and really
>> love to work on apache. Thank you for your time and consideration.
>>
>> [image: PDF file]
>> Raveesh_Resume(6).pdf
>>
>> <https://drive.google.com/file/d/1Xg-pPPFUtgXP130D9O7p4Z0bQpwKypMp/view?usp=drivesdk>
>>
>>
>> Best regards,
>> Raveesh Gupta
>>
> --
> Best regards,
> Raveesh Gupta
>


Re: Warning - Ref Guide has been migrated to Antora

2022-02-22 Thread David Smiley
At the root of the project I run the command: gw check -x test
The "rat" task is run on most modules producing one line, and then after
that, this is what's output:

> Task :solr:solr-ref-guide:downloadAntoraCli

up to date, audited 294 packages in 5s

35 packages are looking for funding
  run `npm fund` for details

9 high severity vulnerabilities

To address issues that do not require attention, run:
  npm audit fix

To address all issues (including breaking changes), run:
  npm audit fix --force

Run `npm audit` for details.

> Task :solr:solrj:compileJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

> Task :solr:solr-ref-guide:downloadAntoraLunrExtension

up to date, audited 294 packages in 2s

35 packages are looking for funding
  run `npm fund` for details

9 high severity vulnerabilities

To address issues that do not require attention, run:
  npm audit fix

To address all issues (including breaking changes), run:
  npm audit fix --force

Run `npm audit` for details.

> Task :solr:solr-ref-guide:downloadAntoraSiteGenerator

up to date, audited 294 packages in 1s

35 packages are looking for funding
  run `npm fund` for details

9 high severity vulnerabilities

To address issues that do not require attention, run:
  npm audit fix

To address all issues (including breaking changes), run:
  npm audit fix --force

Run `npm audit` for details.

> Task :solr:solr-ref-guide:downloadAsciidoctorMathjaxExtension

up to date, audited 294 packages in 2s

35 packages are looking for funding
  run `npm fund` for details

9 high severity vulnerabilities

To address issues that do not require attention, run:
  npm audit fix

To address all issues (including breaking changes), run:
  npm audit fix --force

Run `npm audit` for details.

> Task :solr:solr-ref-guide:buildLocalAntoraSite
{"level":"fatal","time":1645556621695,"name":"antora","hint":"Add the
--stacktrace option to see the cause of the error.","msg":"Local content
source must be a git repository: /Users/dsmiley/SFDCDev/solr_main"}

> Task :solr:solr-ref-guide:buildLocalAntoraSite FAILED

> Task :solr:core:compileJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':solr:solr-ref-guide:buildLocalAntoraSite'.
> Process 'command
'/Users/dsmiley/SFDCDev/solr_main/.gradle/node/nodejs/node-v16.13.2-darwin-x64/bin/npx''
finished with non-zero exit value 1

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or
--debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 1m 20s



I would appreciate any help!  Maybe we might use Slack?  I'll share the
conclusion here once this is figured out.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Feb 14, 2022 at 11:52 PM Mike Drob  wrote:

> The build should download and install node/npm/npx for you under .gradle,
> so no new prerequisites aside from gradle pulling in new dependency trees.
>
> Do you have more detail on the error message so we can investigate further?
>
> On Mon, Feb 14, 2022 at 10:30 PM David Smiley  wrote:
>
>> When I do "gw check", I see an error about an "npx" program not being
>> found:
>>
>> > Process 'command
>>> '/Users/dsmiley/SFDCDev/solr_main/.gradle/node/nodejs/node-v16.13.2-darwin-x64/bin/npx''
>>> finished with non-zero exit value 1
>>>
>>
>> Um... are there new prerequisites to do a build?  Is this documented?
>>
>


Re: Warning - Ref Guide has been migrated to Antora

2022-02-22 Thread David Smiley
It turns out it was because I was using git worktree (which I use
extensively) and Antora didn't like that my main/primary checkout was at a
branch that didn't have the ref guide.  I took this opportunity to make
"main" my primary checkout, and then the problem disappeared.  It took a
bit of doing "git worktree repair LINKNAME" to my linked checkouts to
re-establish the links bidirectionally.

Thanks Houston for some pointers!

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Feb 22, 2022 at 2:08 PM David Smiley  wrote:

> At the root of the project I run the command: gw check -x test
> The "rat" task is run on most modules producing one line, and then after
> that, this is what's output:
>
> > Task :solr:solr-ref-guide:downloadAntoraCli
>
> up to date, audited 294 packages in 5s
>
> 35 packages are looking for funding
>   run `npm fund` for details
>
> 9 high severity vulnerabilities
>
> To address issues that do not require attention, run:
>   npm audit fix
>
> To address all issues (including breaking changes), run:
>   npm audit fix --force
>
> Run `npm audit` for details.
>
> > Task :solr:solrj:compileJava
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
>
> > Task :solr:solr-ref-guide:downloadAntoraLunrExtension
>
> up to date, audited 294 packages in 2s
>
> 35 packages are looking for funding
>   run `npm fund` for details
>
> 9 high severity vulnerabilities
>
> To address issues that do not require attention, run:
>   npm audit fix
>
> To address all issues (including breaking changes), run:
>   npm audit fix --force
>
> Run `npm audit` for details.
>
> > Task :solr:solr-ref-guide:downloadAntoraSiteGenerator
>
> up to date, audited 294 packages in 1s
>
> 35 packages are looking for funding
>   run `npm fund` for details
>
> 9 high severity vulnerabilities
>
> To address issues that do not require attention, run:
>   npm audit fix
>
> To address all issues (including breaking changes), run:
>   npm audit fix --force
>
> Run `npm audit` for details.
>
> > Task :solr:solr-ref-guide:downloadAsciidoctorMathjaxExtension
>
> up to date, audited 294 packages in 2s
>
> 35 packages are looking for funding
>   run `npm fund` for details
>
> 9 high severity vulnerabilities
>
> To address issues that do not require attention, run:
>   npm audit fix
>
> To address all issues (including breaking changes), run:
>   npm audit fix --force
>
> Run `npm audit` for details.
>
> > Task :solr:solr-ref-guide:buildLocalAntoraSite
> {"level":"fatal","time":1645556621695,"name":"antora","hint":"Add the
> --stacktrace option to see the cause of the error.","msg":"Local content
> source must be a git repository: /Users/dsmiley/SFDCDev/solr_main"}
>
> > Task :solr:solr-ref-guide:buildLocalAntoraSite FAILED
>
> > Task :solr:core:compileJava
> Note: Some input files use or override a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
>
> FAILURE: Build failed with an exception.
>
> * What went wrong:
> Execution failed for task ':solr:solr-ref-guide:buildLocalAntoraSite'.
> > Process 'command
> '/Users/dsmiley/SFDCDev/solr_main/.gradle/node/nodejs/node-v16.13.2-darwin-x64/bin/npx''
> finished with non-zero exit value 1
>
> * Try:
> Run with --stacktrace option to get the stack trace. Run with --info or
> --debug option to get more log output. Run with --scan to get full insights.
>
> * Get more help at https://help.gradle.org
>
> BUILD FAILED in 1m 20s
>
>
>
> I would appreciate any help!  Maybe we might use Slack?  I'll share the
> conclusion here once this is figured out.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Mon, Feb 14, 2022 at 11:52 PM Mike Drob  wrote:
>
>> The build should download and install node/npm/npx for you under .gradle,
>> so no new prerequisites aside from gradle pulling in new dependency trees.
>>
>> Do you have more detail on the error message so we can
>> investigate further?
>>
>> On Mon, Feb 14, 2022 at 10:30 PM David Smiley  wrote:
>>
>>> When I do "gw check", I see an error about an "npx" program not being
>>> found:
>>>
>>> > Process 'command
>>>> '/Users/dsmiley/SFDCDev/solr_main/.gradle/node/nodejs/node-v16.13.2-darwin-x64/bin/npx''
>>>> finished with non-zero exit value 1
>>>>
>>>
>>> Um... are there new prerequisites to do a build?  Is this documented?
>>>
>>


Metrics: solr.jetty -- where did it go?

2022-02-27 Thread David Smiley
We used to have a metrics group "solr.jetty" but it's absent now on main &
9_0.  I noticed this when running the prometheus exporter with the default
configuration, which logged errors because it could not find this group.
That I'm the one seeing this shows we lack even a basic sanity check (no
errors) for the prometheus exporter's default config.  I think this is a
release blocker.  I'm putting this aside for now, but perhaps someone knows
what happened to it?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: New branch and feature freeze for Solr 9.0.0

2022-03-01 Thread David Smiley
On Tue, Mar 1, 2022 at 4:46 AM Jan Høydahl  wrote:

> Hi, and welcome to March!
>
> Our initial goal of a RC1 within February slipped, but we are still in a
> good position.
> I'll try to summarize the current code blockers:
>
>
> *SOLR-16061  Decouple CloudSolrClient from ZkStateReader*
>
> This is new, a spin-off from SOLR-15342 to prepare for solrj
> modularization. There is already a draft PR. Hope there will be progress on
> this so we don't have to delay solrj modularization until 10.0
>

I'm working with Haythem on this (a colleague).  I think it's close; it's
"just" a refactoring.  The main constraint on this is Haythem's time.


>
> *SOLR-14290  Fix NPE in SolrTestCaseJ4 breaking external usage for
> master/9.x*
>
> This has not seen any movement despite repeated reminders, so unless there
> is progress within a few days I'll remove it as blocker and add a note to
> the release notes that users relying on running test framework locally
> should wait for a later release.
>

I'm interested in looking but not until I get through the other two.


> *SOLR-14401  "distrib" request handler metrics should only be tracked
> on...*
>
> There is a PR, not sure how close to merge it is though.
>

I think the core of the change is fine but there are downstream changes
needed.  The first level is the prometheus exporter configuration, which
should no longer look for ".distrib." vs ".local."; it's different now.  The
next level is the Grafana dashboard.  I don't normally play with JQ,
Prometheus or Grafana so it's taking me some time this week.  I'd appreciate
any feedback on the choices here; so far only Houston has weighed in.  Tim
Potter, if you're reading this, your input would be useful given you did
major work here.


>
> Also, David found a new blocker bug yesterday - the "jetty" metrics group
> is missing in 9.x. There will likely be another blocker due to this.
>
>
> Appreciate an update in this thread on the ETA for each of these.
>
> Jan
>
> 22. feb. 2022 kl. 12:20 skrev Jan Høydahl :
>
> I created a new blocker
>
> SOLR-16040  Fix split packages in hdfs module
>
> Not sure if it needs to be a blocker though, but we should try to avoid
> split packages as far as we can, and this cannot be done in 9.x.
> Meanwhile, SOLR-15064 is resolved and SOLR-14401 is in PR review phase.
> Jenkins is now mostly green after some turmoil!
>
> *SOLR-14290 (SolrTestCaseJ4 NPE) seems to be stalled - anyone who can lend
> a hand there?*
>
> We also discussed in SOLR-15342
>  whether refactoring
> CloudSolrClient to untangle ZkStateProvider should be done now, and also
> rename solr-solrj as solr-solrj-all so that we can continue with the solrj
> modularization in 9.x without back-compat breaks.
> It seems worthy of a blocker to me, but we need someone willing to do the
> work in the next few days. Anyone?
>
> I also created SOLR-16041
>  (not blocker) to try
> to setup nightly smoketestRelease Jenkins jobs, I may try to give it a go.
>
> Assuming progress on the above, I'm still hopeful for an RC1 in the
> timeframe of next week.
>
> Jan
>
> 16. feb. 2022 kl. 17:05 skrev Jan Høydahl :
>
> These are the three main code-blockers for doing 9.0.0 RC1:
>
> (P) SOLR-15064  Atomic/partial updates to nested docs should not assume
> _route_dsmiley dsmiley
> (S) SOLR-14290  Fix NPE in SolrTestCaseJ4 breaking external usage for
> master/9.x   gus gus
> (S) SOLR-14401  """distrib"" request handler metrics should only be
> tracked on pe  dsmiley dsmiley
>
> The other blockers are mostly about the release process itself, including
> docker and refguide. I'm doing a clean-up of 9.0  CHANGES too.
> When these are resolved, I'll prepare RC1. That means *we're really close
> now!!*
>
> Anshum is preparing a release notes draft, and we also need to complete
> "Major Changes in 9.0"  and "Upgrade Notes" in ref-guide before publishing
> the guide.
>
> Jan
>
> 7. feb. 2022 kl. 14:52 skrev Jan Høydahl :
>
> Congrats on HDFS as a package! Huge win! Also some other blockers have
> been closed recently.
>
> Status on the 9.0 release, one week into February.
>
> - I have done a dry-run of an RC and the smoketester. Think the release
> scripts are ready!
> - 11 open blockers:
>
> (P) SOLR-15587  Replicas end up with base_url as http on client side even
> if clus  thelabdude  thelabdude
> (P) SOLR-15556  Ref Guide Redesign Phase 3: Replace Jekyll
> ctargettctargett
> (P) SOLR-15557  Figure out how to handle ref guide page renames/redirects
>  janhoy  ctargett
> (A) SOLR-15064  Atomic/partial updates to nested docs should not assume
> _route_dsmiley dsmiley
> (A) SOLR-15949  Use Java 17 in docker
>  dsmiley dsmiley
> (S) SOLR-14290  Fix NPE in SolrTestCaseJ4 breaking external usage for
> master/9.x   gus gus
> (S) SOLR-14401  """distrib"" request handler metrics should only be
> tracked on pe  dsmiley ds

Re: New branch and feature freeze for Solr 9.0.0

2022-03-01 Thread David Smiley
I suppose the biggest spots for peer review are:
* The use of brackets [ ] in the metric name where the request handler is,
thus "/select[shard]".
* There is a fundamental difference in how the metrics are tracked on a
handler.  Previously, there were metrics for all of /select (no matter how
it was invoked), and a few for .distrib. & .shard. depending on how it was
invoked.  Now, each request is first classified as a shard request or not a
shard request, and then a separate set of metrics (same type/semantics) is
updated based on that classification, as if there were two distinct request
handlers even though just one is registered.  I think the PR makes this
clear.  While I like it, the main trade-off is that a user is forced to
aggregate metrics if they want a single metric for the handler.  I think the
isShard=true request changes the personality/mode of the handler so much
that I prefer to present it as its own identity from a metrics standpoint.
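
To make the classification concrete, here is a minimal sketch of the idea.
It is not the code in the PR: the class and method names are hypothetical,
and it assumes the non-shard metric keeps the plain handler path while the
shard variant gets the "[shard]" suffix.  The total() method is the
aggregation a user would now have to do themselves.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a handler's request counters; not Solr's metrics code. */
final class HandlerRequestCounters {
  private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

  /** Assumed naming: plain path for normal requests, "[shard]" suffix for shard requests. */
  static String metricName(String handlerPath, boolean isShard) {
    return isShard ? handlerPath + "[shard]" : handlerPath;
  }

  /** Classify the request first, then update only the counter for that class. */
  void recordRequest(String handlerPath, boolean isShard) {
    counters
        .computeIfAbsent(metricName(handlerPath, isShard), k -> new LongAdder())
        .increment();
  }

  long count(String handlerPath, boolean isShard) {
    LongAdder adder = counters.get(metricName(handlerPath, isShard));
    return adder == null ? 0L : adder.sum();
  }

  /** The trade-off: a single number per handler now needs explicit aggregation. */
  long total(String handlerPath) {
    return count(handlerPath, false) + count(handlerPath, true);
  }
}
```

In a dashboard the same sum would be written in the query language rather
than in Java, but the shape of the aggregation is the same.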

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Mar 1, 2022 at 4:19 PM Timothy Potter  wrote:

> Hi David,
>
> I read your note about SOLR-14401 but not clear what you need from me?
> Seems like you're renaming existing metrics and removing "distrib"
> from handlers that don't support a distrib mode, seems right to me.
>
> I actually haven't done much work on the metrics backend. For Grafana,
> it's a JSON file so search / replace the metrics you're changing. The
> Solr operator makes it really easy to set up Solr + ZK + Grafana +
> Prometheus + Exporter to test out your changes. It'll be pretty
> obvious if the dashboard is broken.
>
> Tim

CloudSolrClient; do we deprecate or not?

2022-03-14 Thread David Smiley
I want to bring an important SolrJ decision to the dev list.

There's a JIRA issue https://issues.apache.org/jira/browse/SOLR-15223
"Deprecate HttpSolrClient and friends in 9.0"

Sounds great by the title -- we want to transition over time to the Jetty
client instead.  Jan submitted a PR to deprecate CloudSolrClient and some
others, and I approved it because these classes intimately assume the
Apache HttpClient.  It's merged.

But I have serious doubts now and wish to discuss it with the dev list.
Copying my last message on the issue:

Now that I'm "seeing" the results of this in my IDE, seeing the
> strike-through of deprecated usage on innocent looking classes like
> CloudSolrClient in particular, I have doubts about the approach.
> "CloudSolrClient" is an intuitive/obvious name to a user that wants to talk
> to SolrCloud. Which HTTP protocol or which HTTP library the client uses is
> an implementation detail. Ideally such decisions would be made in the
> builder, either a common builder or, if not, a builder specific to those
> libraries if needed (less nice but acceptable IMO).
>
> The easiest way to get there is to rename CloudSolrClient to
> CloudHttp1SolrClient in one commit (merge it) and then rename
> BaseCloudSolrClient to simply CloudSolrClient in the next. Then add a
> Builder to this class that is the one in Http2; subclass it or something
> (details TBD).
>
> WDYT?
>
> Of course, today they are separated by their classes. Maybe we should
> simply convey the deprecation intent in the upgrade notes as an advanced
> warning, but not deprecate CloudSolrClient in particular.
>

Jan replied:

Since we did not deprecate these in 8.x, we still have a back-compat
> promise to keep these classes around in 9.x, and thus also the old http
> client. But perhaps we are breaking that promise already in SOLR-16061
> <https://issues.apache.org/jira/browse/SOLR-16061>, so maybe we can
> change even more
>
> I don't like the CloudHttp2SolrClient naming either, would prefer the Http
> client to be abstracted away so that it could be swapped out as an impl
> detail, but it was not designed that way. I fear that re-using the same
> class name but with slightly different contract is harder to explain than a
> clear deprecation message in the IDE pointing you to the replacement.
>
> Perhaps the one client to rule them all in 10.0 should be
> ClusterSolrClient? And aim for it to be constructed with either a Jetty
> client or JDK8-HttpClient as backbone through different factories/builder?
>

How is the contract between CloudSolrClient & BaseCloudSolrClient different,
Jan?  I suspect that if there's breakage, it'd be in relatively obscure
options on the builder.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: CloudSolrClient; do we deprecate or not?

2022-03-15 Thread David Smiley
On Tue, Mar 15, 2022 at 8:47 AM Jan Høydahl  wrote:

> I re-opened SOLR-15223 to highlight that we are still blocked by this
> decision.
>
> I don't clearly see the full effects of your suggestion right now. Does
> your proposal also involve deprecating CloudHttp2SolrClient as a separate
> class?
>

No; it would stay.  Perhaps ideally it would have a name reflecting it uses
the Jetty client but no big deal; it can stay as-is.  Its name already
isn't necessarily true; you can use this class (and thus the Jetty client)
and tell it not to use Http2 :-). I'm reminded that HdfsDirectory doesn't
require HDFS :-). (It requires the HDFS client libs but not necessarily an
HDFS backend, if you're curious).


> I would imagine users with existing SolrJ code would after upgrading get
> an instance of BaseCloudSolrClient (with a new name) using Jetty client
> under the hood? What if that application code assumes org.apache.http as
> client and tries to obtain HttpSolrClient and even org.apache.http objects
> based on CloudSolrClient? The code would fail since the contract is broken.
>

If the client/user truly assumes org.apache.http then, well, clearly they
will be disrupted by this change.  You want to call that a "contract" --
shrug; I call it an implementation detail that can change :-).  They may be
calling getLbClient and using the LBHttpSolrClient subclass of
LBSolrClient, perhaps, or specifying builder options relating to it.  It's
undeniable that _some_ clients will be impacted.  We can only hypothesize
_why_ a client is dependent in the first place (vs. perhaps an accidental
dependency/assumption, e.g. in dependency management).  Perhaps to
tweak/tune advanced options, timeouts.  Perhaps to instrument mTLS details,
although I know from experience that this can be done without calling
special methods on builders, by setting special system properties that refer
to one's own classes, which are then called in certain ways.  If you do that
(and we have), the way to do it for the Apache based client differs from
Jetty; we've done it for both because Solr uses both internally.  Anyway,
this is off the beaten path of most users.


>
> With the current pure deprecation and switch to CloudHttp2SolrClient,
> existing users' code would continue to work..
>

Hey, this is a major release; let's not hold ourselves to a standard that
is too onerous for us to maintain.  We can make our intentions clear in
upgrade notes.

~ David


> Jan

Re: CloudSolrClient; do we deprecate or not?

2022-03-16 Thread David Smiley
"ClusterSolrClient" is a fine name but we already have a fine name
that users are using.  Waiting till 10.0 is depressing to me, particularly
because it seems unnecessary.  Does anyone really think that the possibility
of some users having to change something is too much to ask in a major
version?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Mar 16, 2022 at 4:53 PM Jason Gerlowski 
wrote:

> > We can only hypothesize _why_ a client is dependent in the first place
> ...  Perhaps to tweak/tune advanced options, timeouts.  Perhaps to
> instrument mTLS details
>
> Another use-case to add to this list would be auth settings.  I'm
> struggling to come up with a concrete example this minute, but I know
> I've written SolrJ code that customized the underlying HttpClient for
> auth-related purposes.
>
> > "CloudSolrClient" is an intuitive/obvious name to a user that wants to
> talk to SolrCloud [...] HTTP protocol or whether the client is using
> whatever HTTP library is all an implementation detail.
>
> +1  I like the idea of keeping implementation details out of the name
> of any types we're putting front-and-center for SolrJ users.  But I
> share Jan's concern about breaking clients who rely on a particular
> underlying client type.
>
> My favorite idea so far is probably Jan's point that balancing those
> two gets a lot easier if we introduce some "new" name like
> "ClusterSolrClient" as the long term successor to
> CloudSolrClient/BaseCloudSolrClient.  It'd be nice to keep the name
> 'CloudSolrClient' itself for the sake of continuity, but
> ClusterSolrClient at least preserves the reasons we like
> 'CloudSolrClient' as a name and it makes keeping backcompat pretty
> easy:
>
> I guess concretely, this would look something like:
>
> 1. Create a new class, 'ClusterSolrClient', that's a trivial extension
> of BaseCloudSolrClient. (i.e. `class ClusterSolrClient extends
> BaseCloudSolrClient {}`)
> 2. Add a builder for the new 'ClusterSolrClient' that can create
> either the apache or jetty-powered CloudSolrClient based on the
> builder methods invoked.
> 3. Deprecate BaseCloudSolrClient, CloudSolrClient, CSC.Builder, and
> CloudHttp2SolrClient.Builder for 9.0, directing users over to the new
> ClusterSolrClient and its builder.
> 4. Remove the deprecated classes in 10.0
>
> Does something like this sound do-able?
>
> Jason
>
> On Wed, Mar 16, 2022 at 10:50 AM Mike Drob  wrote:
> >
> > I feel like CloudSolrClient doesn't imply anything about HTTP 1 or 2,
> anything about Apache or Jetty (or java.net.http). If we have exposed those
> internal details in some ways, then that is unfortunate and should be
> addressed.
> >
> > I personally never use CloudHttp2SolrClient because I kind of assumed
> that it was an implementation detail and the various builders would give me
> the http2 client when I needed it. Maybe that's not the case. I've never
> thought about it too much. CloudSolrClient looks like the "simpler" one to
> use so that's what people gravitate towards.
> >
> > A quick look in my editor suggests that we have 100 uses of
> CloudSolrClient, including some in the ref guide. If we want to deprecate
> this, then we should update our documentation to guide people away from it
> as well. I suspect that if we try to examine which uses of CloudSolrClient
> in our code could just be SolrClient, we wouldn't make much progress on
> this though.
> >
> > I know this isn't offering much in the way of solutions, but I'm mostly
> trying to say that I agree it is a problem.
> >
> >
> > Mike

Re: CloudSolrClient; do we deprecate or not?

2022-03-17 Thread David Smiley
Thank you for accepting my proposal -- I definitely volunteer to implement
it!

ETA:   I've started... I should have a PR to share this weekend.

One thing I want to point out is that, as tempting as it may be, all the
places inside Solr that call the existing Http1 (Apache HttpClient based)
builder will *continue* to do so.  Migrating to Http2 is out of scope of
this issue; there are risks around authentication propagation since there
are known gaps there.  There will be some judgement calls as to which
internal method signatures should take CloudSolrClient or
CloudHttp1SolrClient, but I lean towards keeping CloudSolrClient and doing a
bit of casting on occasion when necessary (e.g. to access the HttpClient
inside).
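
Roughly what I mean by the casting, as a self-contained sketch.  Every class
here is a toy stand-in with a "Sketch" prefix, not the real SolrJ type, and
the method names are made up for illustration; the point is only that an
internal method keeps the general type in its signature and casts in the
rare spot that needs the Apache client.

```java
/** Toy stand-in for the Apache HttpClient type; purely illustrative. */
final class SketchApacheHttpClient {}

/** Toy stand-in for the general cloud client (the proposed CloudSolrClient). */
abstract class SketchCloudSolrClient {
  abstract String backend();
}

/** Toy stand-in for the Apache-based subclass (the proposed CloudHttp1SolrClient). */
final class SketchCloudHttp1SolrClient extends SketchCloudSolrClient {
  private final SketchApacheHttpClient httpClient = new SketchApacheHttpClient();

  @Override
  String backend() {
    return "apache";
  }

  SketchApacheHttpClient getHttpClient() {
    return httpClient;
  }
}

/** Toy stand-in for the Jetty-based subclass. */
final class SketchCloudHttp2SolrClient extends SketchCloudSolrClient {
  @Override
  String backend() {
    return "jetty";
  }
}

final class SketchInternalCaller {
  /** Signature takes the general type; the cast happens only where the Apache client is needed. */
  static SketchApacheHttpClient apacheClientOf(SketchCloudSolrClient client) {
    if (client instanceof SketchCloudHttp1SolrClient) {
      return ((SketchCloudHttp1SolrClient) client).getHttpClient();
    }
    throw new IllegalArgumentException(
        "Expected the Apache-based client but got backend: " + client.backend());
  }
}
```

The instanceof check keeps the failure explicit if a Jetty-backed client
ever reaches one of these code paths.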

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Mar 17, 2022 at 12:17 PM Jan Høydahl  wrote:

> To wrap up:
>
> David's proposal:
> * Un-deprecate CloudSolrClient to use it as the main cluster client going
> forward
> * Rename CloudSolrClient -> CloudHttp1SolrClient and rename
> BaseCloudSolrClient -> CloudSolrClient
> * Make a new Builder that will instantiate a CloudSolrClient instance with
> a LBHttp2SolrClient / Jetty client backing it
>   Users who want / need to use the old apache clients will now
> use CloudHttp1SolrClient's Builder instead
> * CloudHttp2SolrClient will remain (but can be deprecated?)
>
> Most SolrJ users will need to adapt their app when upgrading to solrj 9.0,
> but we are willing to accept that even if things are not pre-announced with
> deprecations.
> We can introduce some deprecations and "bridge" code in 8.11.2 if we want
> to provide a smoother path.
>
> I'm also willing to accept such a compat break, given that users can still
> use 8.11 solrj with solr9 as a bridge, and that we document the changes.
>
> David, I assume you were volunteering to land the proposed refactorings?
> :) Do you have an ETA?
>
>
> Can someone please also comment on whether the rest of the
> deprecations look ok? The Auth stuff is closely tied to apache-http-client
> so will need to switch to Jetty-client before 10.0 if we're going to get
> rid of the dependency from Solr:
>
> ConcurrentUpdateSolrClient
> HttpClientUtil
> HttpClusterStateProvider
> HttpSolrClient
> Krb5HttpClientBuilder
> LBHttpSolrClient
> PreemptiveAuth
> PreemptiveBasicAuthClientBuilderFactory
> SolrClientBuilder
> SolrHttpClientBuilder
> SolrHttpClientContextBuilder
> SolrHttpRequestRetryHandler
>
> Jan
>
> 17. mar. 2022 kl. 16:47 skrev Houston Putman :
>
> I think it's fine to change the SolrJ code in 9.0, it's a major version
> and we are not doing it for a silly reason.
>
> As long as we document the changes well (maybe we need a separate page for
> Major changes in SolrJ-9), I don't see a reason why we can't make these
> changes.
>
> It could be that we should be even bolder in 10.0 and provide a more
>> modern Cluster SolrJ client that supports an instant pub/sub over HTTP/2
>> for clusterstate changes (i.e. a push from Solr server to client over
>> HTTP/2), eliminating the need for user apps talking to Zookeeper at all.
>> That would also make it easier for 3rd party clients to implement a good
>> Solr client.
>>
>
> That sounds like a great idea (would love to eliminate the need for users
> to talk to ZK).
>
> On Thu, Mar 17, 2022 at 8:54 AM Jan Høydahl  wrote:
>
>> One simple solution is to revert SOLR-15223 Deprecate HttpSolrClient and
>> friends in 9.0, do the 9.0 release and then continue planning for the
>> next-gen Cloud client.
>>
>> It could be that we should be even bolder in 10.0 and provide a more
>> modern Cluster SolrJ client that supports an instant pub/sub over HTTP/2
>> for clusterstate changes (i.e. a push from Solr server to client over
>> HTTP/2), eliminating the need for user apps talking to Zookeeper at all.
>> That would also make it easier for 3rd party clients to implement a good
>> Solr client.
>>
>> Jan

Re: CloudSolrClient; do we deprecate or not?

2022-03-18 Thread David Smiley
The name CloudHttp1SolrClient is understandable because we have one with
"Http2" in the name.  But our Http2 one speaks HTTP 1.1 too :-)
I think CloudApacheHttpSolrClient or CloudLegacySolrClient are good names,
and I lean to the latter because with the word "legacy" in its name, it
screams "don't use me if you can avoid it" ;-).  Also,
CloudApacheHttpSolrClient is even more of a mouthful, and our users might
not parse it the way we intend: since Solr itself is under the ASF, "Apache"
could be read as referring to the project rather than to the specific Apache
HttpClient library as opposed to other HTTP clients.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley



Reviving the "repro" build?

2022-03-18 Thread David Smiley
With builds failing pretty often, I think it's important that reproducible
test failures be detected as such and flagged, so that we can focus our
build triage attention where it's most fruitful and reduce the time a bug
stays active.

There used to be a "repro" build that Steve Rowe originally worked on:
https://ci-builds.apache.org/job/Lucene/job/Lucene-Solr-repro/ The
corresponding script is: dev-tools/scripts/reproduceJenkinsFailures.py
Clearly it hasn't survived the gradle transition.

Does anyone have thoughts on reviving this vs some other ideas you may have?

One idea I've had is to have the Jenkins job for the builds include a
reproducibility step -- an encore for the failing tests to try to do their
thing again.  Then we could differentiate the build status this way --
Unstable vs Failure on the reproducibility.
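
As a rough illustration of that idea (this is not an existing Jenkins
feature, nor the old repro script), the status decision could be sketched
in Python as below.  The function name, the `rerun` hook, the number of
attempts, and the mapping of "reproduced" to FAILURE and "did not
reproduce" to UNSTABLE are all assumptions made for the sketch.

```python
from typing import Callable, Iterable


def classify_build(
    failed_tests: Iterable[str],
    rerun: Callable[[str], bool],
    attempts: int = 3,
) -> str:
    """Return a Jenkins-style status string for a finished build.

    `rerun(test)` is a hypothetical hook that re-executes one failed test
    (ideally with the original seed) and returns True if the test passes.
    """
    failed = list(failed_tests)
    if not failed:
        return "SUCCESS"
    for test in failed:
        # Assumption: a failure counts as "reproduced" if any retry fails.
        if any(not rerun(test) for _ in range(attempts)):
            return "FAILURE"
    # Every failed test passed on all retries: likely flaky, not reproducible.
    return "UNSTABLE"
```

A real job would plug the actual test runner in as `rerun`; the point is
only that the encore step turns one ambiguous red build into two
distinguishable outcomes.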


~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Welcome Houston Putman as Solr's new PMC chair

2022-03-21 Thread David Smiley
Congrats Houston and thanks for your PMC chores Jan!

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Mar 17, 2022 at 11:36 AM Houston Putman  wrote:

> Thanks Jan!
>
> Thank you so much for all of the hard work you have done over the past
> year.
> It's been an important, but difficult, year and we have been lucky to have
> you at the helm!
>
> I'm really excited for the next year and to see how far Solr is able to go
> by the time I "leave office".
> Please don't hesitate to reach out if I can help in any way (be it
> code/organization/community related).
>
> - Houston Putman
>
>
>
> On Thu, Mar 17, 2022 at 9:14 AM Jan Høydahl  wrote:
>
>> Hi,
>>
>> It's been an honour to serve as chair for the Apache Solr PMC (Project
>> Management Committee) for the last year. Quite a busy year for the project!
>>
>> We have a tradition to rotate the role among PMC members, and the PMC has
>> elected Houston Putman as the next Solr Chair.
>>
>> Congrats Houston, and good luck!
>>
>> Jan
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
>> For additional commands, e-mail: dev-h...@solr.apache.org
>>
>>


Re: CloudSolrClient; do we deprecate or not?

2022-03-22 Thread David Smiley
I pushed the rename we discussed and added change notes.

I can see that the naming/deprecation choice varies between standalone vs
SolrCloud (simple deprecation vs rename & swap).  Not a problem but not
ideal.  I don't care too much because a user will likely do just one or the
other and not care much about the other side.  Still, for HttpSolrClient in
particular (the most common), it's not too late to un-deprecate it, and a
similar change could happen later.  Interestingly, BaseHttpSolrClient has
nothing of interest, unlike the cloud side of things.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Mar 22, 2022 at 10:16 AM Jan Høydahl  wrote:

> Hi all,
>
> Please help close this last(?) 9.0 blocker by reviewing
> https://github.com/apache/solr/pull/750
>
> Jan
>
> 18. mar. 2022 kl. 13:51 skrev David Smiley :
>
> The name CloudHttp1SolrClient is understandable because we have one with
> "Http2" in the name.  But our Http2 one speaks HTTP 1.1 too :-)
> I think the names CloudApacheHttpSolrClient or CloudLegacySolrClient are
> good names, and I lean to the latter because with the word "legacy" in its
> name, it screams, don't use me if you can avoid it ;-).  Also,
> CloudApacheHttpSolrClient is even more of a mouthful, and it could be not
> so obvious how to parse that (to our users) since Solr is also under the
> ASF and the Http part could be sort of obvious vs what we intend to mean --
> a specific "Apache" http client vs whatever other ones.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Mar 18, 2022 at 12:28 AM David Smiley  wrote:
>
>> Thank you for accepting my proposal -- I definitely volunteer to
>> implement it!
>>
>> ETA:   I've started... I should have a PR to share this weekend.
>>
>> One thing I want to point out that I see is that, as tempting as it may
>> be, all the places inside Solr that call the existing Http1 (using Apache
>> HttpClient) based builder will *continue* to do so.  Migrating to Http2 is
>> out of scope of this issue.  There are risks around authentication
>> propagation since there are known gaps there.  There will be some judgement
>> calls as to which internal method signatures should take CloudSolrClient
>> or CloudHttp1SolrClient but I lean to keep CloudSolrClient and do a bit of
>> casting on occasion when necessary (e.g. to access the HttpClient inside).
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Thu, Mar 17, 2022 at 12:17 PM Jan Høydahl 
>> wrote:
>>
>>> To wrap up:
>>>
>>> David's proposal:
>>> * Un-deprecate CloudSolrClient to use it as the main cluster client
>>> going forward
>>> * Rename CloudSolrClient -> CloudHttp1SolrClient and rename
>>> BaseCloudSolrClient -> CloudSolrClient
>>> * Make a new Builder that will instantiate a CloudSolrClient instance
>>> with a LBHttp2SolrClient / Jetty client backing it
>>>Users who want / need to use the old apache clients will now
>>> use CloudHttp1SolrClient's Builder instead
>>> * CloudHttp2SolrClient will remain (but can be deprecated?)
>>>
>>> Most SolrJ users will need to adapt their app when upgrading to solrj
>>> 9.0, but we are willing to accept that even if things are not pre-announced
>>> with deprecations.
>>> We can introduce some deprecations and "bridge" code in 8.11.2 if we
>>> want to provide a smoother path.
>>>
>>> I'm also willing to accept such a compat break, given that users can
>>> still use 8.11 solrj with solr9 as a bridge, and that we document the
>>> changes.
>>>
>>> David, I assume you were volunteering to land the proposed refactorings?
>>> :) Do you have an ETA?
>>>
>>>
>>> Can someone please also comment on whether the rest of the
>>> deprecations look ok? The Auth stuff is closely tied to apache-http-client
>>> so will need to switch to Jetty-client before 10.0 if we're going to get
>>> rid of the dependency from Solr:
>>>
>>> ConcurrentUpdateSolrClient
>>> HttpClientUtil
>>> HttpClusterStateProvider
>>> HttpSolrClient
>>> Krb5HttpClientBuilder
>>> LBHttpSolrClient
>>> PreemptiveAuth
>>> PreemptiveBasicAuthClientBuilderFactory
>>> SolrClientBuilder
>>> SolrHttpClientBuilder
>>> SolrHttpClientContextBuilder
>>> S

Re: CloudSolrClient; do we deprecate or not?

2022-03-23 Thread David Smiley
I have too little time to do more so the current state is it for me.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Mar 23, 2022 at 9:05 AM Jan Høydahl  wrote:

> If you feel inclined for a cleanup go ahead.
> But I think we can just as well move on as-is for now, and if we want
> nicer sounding client names in 10.x then re-visit.
>
> Jan
>
> 22. mar. 2022 kl. 16:22 skrev David Smiley :
>
> I pushed the rename we discussed and added change notes.
>
> I can see that the naming/deprecation choice varies between standalone vs
> SolrCloud (simple deprecation vs rename & swap).  Not a problem but not
> ideal.  I don't care too much because a user will likely do just one or the
> other and not care much about the other side.  Still, for HttpSolrClient in
> particular (the most common), it's not too late to un-deprecate it, and a
> similar change could happen later.  Interestingly, BaseHttpSolrClient has
> nothing of interest, unlike the cloud side of things.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, Mar 22, 2022 at 10:16 AM Jan Høydahl 
> wrote:
>
>> Hi all,
>>
>> Please help close this last(?) 9.0 blocker by reviewing
>> https://github.com/apache/solr/pull/750
>>
>> Jan
>>
>> 18. mar. 2022 kl. 13:51 skrev David Smiley :
>>
>> The name CloudHttp1SolrClient is understandable because we have one with
>> "Http2" in the name.  But our Http2 one speaks HTTP 1.1 too :-)
>> I think the names CloudApacheHttpSolrClient or CloudLegacySolrClient are
>> good names, and I lean to the latter because with the word "legacy" in its
>> name, it screams, don't use me if you can avoid it ;-).  Also,
>> CloudApacheHttpSolrClient is even more of a mouthful, and it could be not
>> so obvious how to parse that (to our users) since Solr is also under the
>> ASF and the Http part could be sort of obvious vs what we intend to mean --
>> a specific "Apache" http client vs whatever other ones.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Fri, Mar 18, 2022 at 12:28 AM David Smiley  wrote:
>>
>>> Thank you for accepting my proposal -- I definitely volunteer to
>>> implement it!
>>>
>>> ETA:   I've started... I should have a PR to share this weekend.
>>>
>>> One thing I want to point out that I see is that, as tempting as it may
>>> be, all the places inside Solr that call the existing Http1 (using Apache
>>> HttpClient) based builder will *continue* to do so.  Migrating to Http2 is
>>> out of scope of this issue.  There are risks around authentication
>>> propagation since there are known gaps there.  There will be some judgement
>>> calls as to which internal method signatures should take CloudSolrClient
>>> or CloudHttp1SolrClient but I lean to keep CloudSolrClient and do a bit of
>>> casting on occasion when necessary (e.g. to access the HttpClient inside).
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>>
>>> On Thu, Mar 17, 2022 at 12:17 PM Jan Høydahl 
>>> wrote:
>>>
>>>> To wrap up:
>>>>
>>>> David's proposal:
>>>> * Un-deprecate CloudSolrClient to use it as the main cluster client
>>>> going forward
>>>> * Rename CloudSolrClient -> CloudHttp1SolrClient and rename
>>>> BaseCloudSolrClient -> CloudSolrClient
>>>> * Make a new Builder that will instantiate a CloudSolrClient instance
>>>> with a LBHttp2SolrClient / Jetty client backing it
>>>>Users who want / need to use the old apache clients will now
>>>> use CloudHttp1SolrClient's Builder instead
>>>> * CloudHttp2SolrClient will remain (but can be deprecated?)
>>>>
>>>> Most SolrJ users will need to adapt their app when upgrading to solrj
>>>> 9.0, but we are willing to accept that even if things are not pre-announced
>>>> with deprecations.
>>>> We can introduce some deprecations and "bridge" code in 8.11.2 if we
>>>> want to provide a smoother path.
>>>>
>>>> I'm also willing to accept such a compat break, given that users can
>>>> still use 8.11 solrj with solr9 as a bridge, and that we document the
>>>> changes.
>>>>
>>>> David, I

Re: New branch and feature freeze for Solr 9.0.0

2022-03-25 Thread David Smiley
Woohoo!

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Mar 25, 2022 at 4:16 PM Jan Høydahl  wrote:

> Hi,
>
> All code blockers are now cleared:
> https://issues.apache.org/jira/issues/?filter=12351219
> Some work remains for RefGuide and docker release procedures, we can
> continue on those in parallel with the RCs.
>
> I'll proceed with RC1.
>
> Jan
>
> 4. mar. 2022 kl. 21:52 skrev Houston Putman :
>
> I think we have another blocker for 9.0. Basically there is a bug in the
> updated version of commons-io that causes index files to be set to
> read-only in the filesystem occasionally. The solution is to upgrade
> commons-io, or find a workaround in Solr, but we can have that discussion
> on the JIRA.
>
> More info here: https://issues.apache.org/jira/browse/SOLR-16074
>
> On Tue, Mar 1, 2022 at 5:02 PM David Smiley  wrote:
>
>> I suppose the biggest spots for peer review are:
>> * use of brackets [ ] in the metric name where the request handler is.
>> Thus "/select[shard]"
>> * There is a fundamental difference in how the metrics are tracked on a
>> handler.  Previously, there were metrics for all of /select (no matter how
>> it was invoked), and a few for .distrib. & .shard. depending on how it was
>> invoked.  Now, the request is classified to be a shard request, or not a
>> shard request, after which separate metrics (same type/semantics) are
>> manipulated based on that classification, kind of as if there are two
>> distinct request handlers even though just one is registered.  I think
>> the PR makes this clear.  While I like it, the main trade-off is that a
>> user would be forced to aggregate metrics if they wanted a single metric
>> for the handler.  I think the isShard=true request changes the
>> personality/mode of the handler so much that I prefer to present it as its
>> own identity from a metrics standpoint.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Tue, Mar 1, 2022 at 4:19 PM Timothy Potter 
>> wrote:
>>
>>> Hi David,
>>>
>>> I read your note about SOLR-14401 but not clear what you need from me?
>>> Seems like you're renaming existing metrics and removing "distrib"
>>> from handlers that don't support a distrib mode, seems right to me.
>>>
>>> I actually haven't done much work on the metrics backend. For Grafana,
>>> it's a JSON file so search / replace the metrics you're changing. The
>>> Solr operator makes it really easy to set up Solr + ZK + Grafana +
>>> Prometheus + Exporter to test out your changes. It'll be pretty
>>> obvious if the dashboard is broken.
>>>
>>> Tim
>>>
>>> On Tue, Mar 1, 2022 at 7:01 AM David Smiley  wrote:
>>> >
>>> >
>>> >
>>> > On Tue, Mar 1, 2022 at 4:46 AM Jan Høydahl 
>>> wrote:
>>> >>
>>> >> Hi, and welcome to March!
>>> >>
>>> >> Our initial goal of a RC1 within February slipped, but we are still
>>> in a good position.
>>> >> I'll try to summarize the current code blockers:
>>> >>
>>> >>
>>> >> SOLR-16061  Decouple CloudSolrClient from ZkStateReader
>>> >>
>>> >> This is new, a spin-off from SOLR-15342 to prepare for solrj
>>> modularization. There is already a draft PR. Hope there will be progress on
>>> this so we don't have to delay solrj modularization until 10.0
>>> >
>>> >
>>> > I'm working with Haythem on this (a colleague).  I think it's close;
>>> it's "just" a refactoring.  The main constraint on this is Haythem's time.
>>> >
>>> >>
>>> >>
>>> >> SOLR-14290  Fix NPE in SolrTestCaseJ4 breaking external usage for
>>> master/9.x
>>> >>
>>> >> This has not seen any movement despite repeated reminders, so unless
>>> there is progress within a few days I'll remove it as blocker and add a
>>> note to the release notes that users relying on running test framework
>>> locally should wait for a later release.
>>> >
>>> >
>>> > I'm interested in looking but not until I get through the other two.
>>> >
>>> >>
>>> >> SOLR-14401  "distrib" request handler metrics should only be tracked
>>>

Re: [VOTE] Release Solr 9.0.0 RC3

2022-04-06 Thread David Smiley
+0 Not sure yet.

I got two failures:

(1) TestReplicationHandler -- didn't reproduce.  The logs refer to
"java.lang.AssertionError:
Directory not closed: MockDirectoryWrapper(ByteBuffersDirectory@1814bb6a
lockFactory=org.apache.lucene.store.SingleInstanceLockFactory@1b70f84b)" for
class level errors.  When I run this test in my IDE, the tests pass but I
get these errors for the test class overall.  So there is some test cleanup
to do.

(2) TestRollingRestart.  It didn't reproduce.  But I saw an easy NPE and
filed a JIRA issue & PR: https://issues.apache.org/jira/browse/SOLR-16145

Also, I want to point out a minor issue: the distribution doesn't include
the docker "examples" subdirectory, and maybe some other misc
files/directories specific to some modules:
https://github.com/apache/lucene-solr/pull/2104 (I guess I need to redo
this)

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Apr 5, 2022 at 5:58 PM Jan Høydahl  wrote:

> Please vote for release candidate 3 for Solr 9.0.0
>
> The artifacts can be downloaded from:
>
> https://dist.apache.org/repos/dist/dev/solr/solr-9.0.0-RC3-rev-e9e64f83a8c972b5a3f3460899c81ee9ccde2d1e
>
> You can run the smoke tester directly with this command:
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/solr/solr-9.0.0-RC3-rev-e9e64f83a8c972b5a3f3460899c81ee9ccde2d1e
>
> You are encouraged to do an extra thorough test and manual inspection
> beyond
> running the smoketester, since this is a major release.
>
> You can build a release-candidate of the official docker image using the
> following command:
>
> DIST_BASE=https://dist.apache.org/repos/dist/dev/solr && \
>   RC_FOLDER=solr-9.0.0-RC3-rev-e9e64f83a8c972b5a3f3460899c81ee9ccde2d1e && \
>   docker build $DIST_BASE/$RC_FOLDER/solr/docker/Dockerfile.official \
>   --build-arg SOLR_DOWNLOAD_URL=$DIST_BASE/$RC_FOLDER/solr/solr-9.0.0.tgz \
>   -t solr-rc:9.0.0-3
>
> The vote will be open for at least 72 hours i.e. until 2022-04-08 22:00
> UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> SUCCESS! [0:46:42.638796]
>
>


Re: 2022 April, Solr committer meeting

2022-04-06 Thread David Smiley
So the next quarterly committer meeting was tentatively set by me for April
14th -- a week away now.  I've been in denial of this as I've been putting
off the prospect of coordinating yet another time slot that appeals to many
people around the globe.  One crude way to handle this is inertia -- it's
the same time (noon US eastern) unless someone volunteers to host it at a
time convenient to them?  At least this way it's not a recurring onerous
issue / hurdle.  And I personally don't have an agenda but that's what
crowd-sourcing is for ;-).

https://cwiki.apache.org/confluence/display/SOLR/2022-04+Meeting+notes

The meeting can be a good sounding board for socializing new things you are
working on or for discussing pain points.  And of course just to say
"hello" :-)

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jan 20, 2022 at 8:21 PM David Smiley  wrote:

> I propose that the next committer meeting be held on Thursday, April
> 14th.  I'd be happy to host again unless someone wishes to take over.  The
> time is TBD.
>
> Continuing from today's meeting: We discussed the meeting invitation
> itself a little.  To be clear, all committers are officially invited via
> the email announcement to the dev list.  As a practical matter, I'm
> skeptical about adding 89 committers[1] to my Google Calendar.  Setting
> that up seems like a pain and I wonder if a list that long is allowed.
> Instead, I propose I duplicate the previous one, thus carry the list
> forward.  This is less work for everyone, I think.  When someone next takes
> over for me as host, we'll try to work this out.
>
> Perhaps the next meeting should be at a time more friendly to other
> timezones like those in India?  Just an idea; I feel bad when some of us
> can't attend.  Even if there are no requests for this, I propose we at
> least move the next meeting to one hour earlier than previously to make it
> a little nicer for Europe timezones.  Pacific can wake up at 8am --
> personally I wake up at 6:40am every day :-)
>
> [1]: https://projects.apache.org/committee.html?solr
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>


Re: 2022 April, Solr committer meeting

2022-04-13 Thread David Smiley
Thanks for stepping up Houston!  I'll create the calendar invite.  There's
a 25% chance I'll miss it at this date & time, so... perhaps you'd accept
one hour earlier?  Europe would like this more :-)

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Apr 13, 2022 at 3:38 PM Eric Pugh 
wrote:

> That works for me!
>
> On Apr 13, 2022, at 11:51 AM, Houston Putman  wrote:
>
> How about we do next month? We should have 9.0 out by then and we can
> start discussing next steps!
>
> I'm happy to take on the planning of it, but unfortunately I don't have
> the ability to record google hangouts like you do David.
> So if you can do that part, I will take on planning the rest.
>
> Let's tentatively say May 11th.
>
> On Wed, Apr 13, 2022 at 11:46 AM Eric Pugh <
> ep...@opensourceconnections.com> wrote:
>
>> So….I went to check on this, and I guess it was yesterday?
>>
>> How about making the next one three months from yesterday and be July 12
>> at noon EST?
>>
>> Eric
>>
>>
>> On Apr 7, 2022, at 11:41 AM, Houston Putman  wrote:
>>
>> Tuesday April 12 is relatively safe, though it's a bit short notice?
>>
>> Wish that site went further than 7 days...
>>
>> On Thu, Apr 7, 2022 at 4:36 AM Jan Høydahl  wrote:
>>
>>> April 14th is a public holiday in Norway as well as a huge number of
>>> other countries https://www.timeanddate.com/holidays/
>>> Probably wise to find another date.
>>>
>>> Jan
>>>
>>> 7. apr. 2022 kl. 07:14 skrev David Smiley :
>>>
>>> So the next quarterly committer meeting was tentatively set by me for
>>> April 14th -- a week away now.  I've been in denial of this as I've been
>>> putting off the prospect of coordinating yet another time slot that appeals
>>> to many people around the globe.  One crude way to handle this is inertia
>>> -- it's the same time (noon US eastern) unless someone volunteers to host
>>> it at a time convenient to them?  At least this way it's not a
>>> recurring onerous issue / hurdle.  And I personally don't have an agenda
>>> but that's what crowd-sourcing is for ;-).
>>>
>>> https://cwiki.apache.org/confluence/display/SOLR/2022-04+Meeting+notes
>>>
>>> The meeting can be a good sounding board for socializing new things you
>>> are working on or for discussing pain points.  And of course just to say
>>> "hello" :-)
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>>
>>> On Thu, Jan 20, 2022 at 8:21 PM David Smiley  wrote:
>>>
>>>> I propose that the next committer meeting be held on Thursday, April
>>>> 14th.  I'd be happy to host again unless someone wishes to take over.  The
>>>> time is TBD.
>>>>
>>>> Continuing from today's meeting: We discussed the meeting invitation
>>>> itself a little.  To be clear, all committers are officially invited via
>>>> the email announcement to the dev list.  As a practical matter, I'm
>>>> skeptical about adding 89 committers[1] to my Google Calendar.  Setting
>>>> that up seems like a pain and I wonder if a list that long is allowed.
>>>> Instead, I propose I duplicate the previous one, thus carry the list
>>>> forward.  This is less work for everyone, I think.  When someone next takes
>>>> over for me as host, we'll try to work this out.
>>>>
>>>> Perhaps the next meeting should be at a time more friendly to other
>>>> timezones like those in India?  Just an idea; I feel bad when some of us
>>>> can't attend.  Even if there are no requests for this, I propose we at
>>>> least move the next meeting to one hour earlier than previously to make it
>>>> a little nicer for Europe timezones.  Pacific can wake up at 8am --
>>>> personally I wake up at 6:40am every day :-)
>>>>
>>>> [1]: https://projects.apache.org/committee.html?solr
>>>>
>>>> ~ David Smiley
>>>> Apache Lucene/Solr Search Developer
>>>> http://www.linkedin.com/in/davidwsmiley
>>>>
>>>
>>>
>> ___
>> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
>> | http://www.opensourceconnections.com | My Free/Busy
>> <http://tinyurl.com/eric-cal>

Re: [RESULT][VOTE] Release Solr 9.0.0 RC3

2022-04-13 Thread David Smiley
This definitely needs to be addressed; basic working Maven coordinates are
important.  Next time I guess we need to validate this via the smoke
tester, I suppose?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Apr 13, 2022 at 5:59 PM Anshum Gupta  wrote:

> Thanks for the update, Houston.
>
> I think this calls for an RC4.
>
> I'll be happy to review the PRs as they come in.
>
> On Wed, Apr 13, 2022 at 2:32 PM Houston Putman  wrote:
>
>> Hello everyone,
>>
>> Unfortunately, while the build passed I noticed some issues when building
>> the remaining infrastructure to finish the release process (A fair amount
>> of stuff left to do since the docker and gradle migration).
>>
>> The biggest issue I found was that the solr-core maven artifact cannot be
>> used, since it still relies on the solr-server artifact, which doesn't
>> exist. There are also modules that don't have maven artifacts created. (And
>> module artifacts rely on solr-core, so they also cannot be used)
>>
>> I think this warrants an RC4 (and while we do have the RC3 release
>> artifacts in https://dist.apache.org/repos/dist/release/solr/solr/), we
>> can work with Apache infra to get that removed.
>>
>> If y'all think we should continue with the release anyways, I can do that
>> as well. Will wait for some consensus while continuing to build in the
>> necessary release steps. (for RC3 or RC4)
>>
>> Will have a fair number of PRs coming up to fix these things, so help
>> reviewing them would be much appreciated.
>>
>> - Houston
>>
>> On Fri, Apr 8, 2022 at 6:55 PM Jan Høydahl  wrote:
>>
>>> It's been >72h since the vote was initiated and the result is:
>>>
>>> +1  7  (7 binding)
>>>  0  1
>>> -1  0
>>>
>>> This vote has PASSED
>>>
>>> Congratulations everyone for reaching this milestone!
>>>
>>> There are still a few loose ends before the release can be published and
>>> announced.
>>> I'm away the next week, and Houston Putman has agreed to take over the
>>> RM job from here.
>>>
>>> Thanks for stepping up Houston!
>>>
>>> Jan
>>>
>>> 8. apr. 2022 kl. 23:43 skrev Mike Drob :
>>>
>>> If we're not doing anything with the release artifacts anyway (since
>>> you'll be on holiday, Jan), would it be fine to leave it open over the
>>> weekend? I didn't get nearly the amount of testing done on this that I
>>> wanted to, mainly trying to figure out if SOLR-16143 was a real problem
>>> (yes) or just a test issue (no) or worthy of blocking a release (probably
>>> not).
>>>
>>> On Fri, Apr 8, 2022 at 4:40 PM Anshum Gupta 
>>> wrote:
>>>
>>>> and just around the deadline, I got the smoke-tester to pass. Thanks to
>>>> everyone who helped :)
>>>>
>>>> I changed a ton of things but most likely, it was an init.gradle file
>>>> with some random stuff that was causing the issue. I remember having
>>>> deleted that file a few months ago, but not sure what job regenerates that
>>>> and adds that to the ~/.gradle folder.
>>>>
>>>> Here's my +1 (binding)
>>>> SUCCESS! [0:46:31.241458]
>>>>
>>>> Also, tested SolrJ and some basic search/indexing using some sample app.
>>>>
>>>>
>>>> On Fri, Apr 8, 2022 at 1:06 PM Jan Høydahl 
>>>> wrote:
>>>>
>>>>> This vote ends in an hour or so.
>>>>> While there are currently five +1's and a few +0, I feel it is
>>>>> inconclusive due to the SolrCell issue?
>>>>> Kevin, guess your vote will decide :)
>>>>>
>>>>> Jan
>>>>>
>>>>> 8. apr. 2022 kl. 18:00 skrev Kevin Risden :
>>>>>
>>>>> The smoke tester passed for me: SUCCESS! [0:54:21.639508]
>>>>>
>>>>> However, I'm running into issues checking Tika integration / Solr
>>>>> Cell. Following
>>>>> https://nightlies.apache.org/solr/draft-guides/solr-reference-guide-nightly/solr/9_0/indexing-guide/indexing-with-tika.html
>>>>>
>>>>> 2022-04-08 14:52:18.768 ERROR (qtp201274566-26) []
>>>>>> o.a.s.s.SolrRequestParsers Couldn't get multipart parts in order to delete them => java.lang.IllegalStateExceptio

Re: [RESULT][VOTE] Release Solr 9.0.0 RC3

2022-04-15 Thread David Smiley
(no opinion) but thanks for driving this.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Apr 15, 2022 at 12:23 PM Kevin Risden  wrote:

> +1 option 2
>
> Kevin Risden
>
>
> On Fri, Apr 15, 2022 at 12:04 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> +1 for deleting the tag. Option 2.
>>
>> On Fri, Apr 15, 2022 at 9:28 PM Anshum Gupta 
>> wrote:
>>
>>> I'd suggest #2 because there's no point moving forward with a release,
>>> then announcing the issues with it or confusing users with a missing
>>> release announcement.
>>>
>>> On Fri, Apr 15, 2022 at 8:53 AM Houston Putman 
>>> wrote:
>>>
>>>> So I guess we need to make a decision everyone. Do we:
>>>>
>>>>    1. Continue with the release to make sure that everything is there
>>>>    that should be, but skip announcing it (Because we know there are
>>>>    issues). Then immediately start the 9.0.1 release with the necessary
>>>>    fixes.
>>>>    2. Stop this release, remove the release artifacts in
>>>>    dist.apache.org and also delete the 9.0.0 git tag. (not sure how
>>>>    feasible either of these are). Then start an RC4 for 9.0.0, with the
>>>>    necessary fixes.
>>>>
>>>> I guess, could everyone weigh in with their opinion? This is a sticky
>>>> situation and I don't want to move forward without consensus.
>>>>
>>>> - Houston
>>>>
>>>>
>>>> On Wed, Apr 13, 2022 at 6:06 PM Houston Putman 
>>>> wrote:
>>>>
>>>>> You’re right Mike, that slipped my mind.
>>>>>
>>>>> On Wed, Apr 13, 2022 at 6:05 PM David Smiley 
>>>>> wrote:
>>>>>
>>>>>> This definitely needs to be addressed; basic working Maven
>>>>>> coordinates are important.  Next time I guess we need to validate this
>>>>>> via the smoke tester, I suppose?
>>>>>>
>>>>>> ~ David Smiley
>>>>>> Apache Lucene/Solr Search Developer
>>>>>> http://www.linkedin.com/in/davidwsmiley
>>>>>>
>>>>>>
>>>>>> On Wed, Apr 13, 2022 at 5:59 PM Anshum Gupta 
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for the update, Houston.
>>>>>>>
>>>>>>> I think this calls for an RC4.
>>>>>>>
>>>>>>> I'll be happy to review the PRs as they come in.
>>>>>>>
>>>>>>> On Wed, Apr 13, 2022 at 2:32 PM Houston Putman 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hello everyone,
>>>>>>>>
>>>>>>>> Unfortunately, while the build passed I noticed some issues when
>>>>>>>> building the remaining infrastructure to finish the release process (A
>>>>>>>> fair amount of stuff left to do since the docker and gradle migration).
>>>>>>>>
>>>>>>>> The biggest issue I found was that the solr-core maven artifact
>>>>>>>> cannot be used, since it still relies on the solr-server artifact,
>>>>>>>> which doesn't exist. There are also modules that don't have maven
>>>>>>>> artifacts created. (And module artifacts rely on solr-core, so they
>>>>>>>> also cannot be used)
>>>>>>>>
>>>>>>>> I think this warrants an RC4 (and while we do have the RC3 release
>>>>>>>> artifacts in https://dist.apache.org/repos/dist/release/solr/solr/),
>>>>>>>> we can work with Apache infra to get that removed.
>>>>>>>>
>>>>>>>> If y'all think we should continue with the release anyways, I can
>>>>>>>> do that as well. Will wait for some consensus while continuing to
>>>>>>>> build in the necessary release steps. (for RC3 or RC4)
>>>>>>>>
>>>>>>>> Will have a fair number of PRs coming up to fix these things, so
>>>>>>>> help reviewing them would be much appreciated.

Re: Can we eliminate solr.disableConfigSetsCreateAuthChecks parameter?

2022-04-26 Thread David Smiley
I'm glad you raised the question here, but definitely raise it on
SOLR-14663 as well so that you get attention from the pertinent folks
there.  I wouldn't be surprised if some of our users have JIRA accounts but
don't subscribe to this list.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Apr 26, 2022 at 9:19 AM Andras Salamon 
wrote:

> Hi,
>
> We are using this property so I think it would be good to keep this.
>
> Best,
> Andras
>
> On Mon, Apr 25, 2022 at 4:46 PM Eric Pugh 
> wrote:
>
>> While looking at https://issues.apache.org/jira/browse/SOLR-16110 "Using
>> Schema/Config API breaks the File-Upload of Config Set File", I learned
>> about the "solr.disableConfigSetsCreateAuthChecks" property:
>>
>>
>> https://github.com/apache/solr/blob/c99af207c761ec34812ef1cc3054eb2804b7448b/solr/core/src/java/org/apache/solr/handler/admin/ConfigSetsHandler.java#L71
>>
>> It’s not documented in the Ref Guide, and it’s listed as "// this is for
>> back compat only".   Is it worth removing this feature in 9.1?
>>
>> It was introduced in https://issues.apache.org/jira/browse/SOLR-14663
>> "ConfigSets CREATE does not set trusted flag" which was first released
>> in 8.6.3.
>>
>> If we are keeping it, then I’m happy to add it to the ref guide….
>>
>> Eric
>>
>> ___
>> *Eric Pugh **| *Founder & CEO | OpenSource Connections, LLC | 434.466.1467
>> | http://www.opensourceconnections.com | My Free/Busy
>> <http://tinyurl.com/eric-cal>
>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>> This e-mail and all contents, including attachments, is considered to be
>> Company Confidential unless explicitly stated otherwise, regardless
>> of whether attachments are marked as such.
>>
>>


Re: Why is SolrRequest.getParams() abstract?

2022-04-27 Thread David Smiley
On Tue, Apr 26, 2022 at 12:50 PM Gus Heck  wrote:

> I was shocked to discover https://issues.apache.org/jira/browse/SOLR-14967
> causes solr to violate one of its most key precepts that zk is not
> involved on every query.
>
> In looking into this, I ran across this code:
>
> https://github.com/apache/solr/blob/b218c177b8e3b387ada03acbad214a0c3bfe1443/solr/solrj/src/java/org/apache/solr/client/solrj/impl/CloudSolrClient.java#L965
>

That logic that checks that the params is an instanceof
ModifiableSolrParams has such a smell to it that I like to think it
wouldn't have passed anyone's code review.  Not that there is an obvious
solution (we can think of some but there are trade-offs) but the logic as
it was committed is too hacky / WIP.


> And this has a couple issues...
>
> First there is no guarantee that changes to the object returned by
> getParams() will be reflected on the request, which is apparently entirely
> free to return a copy of the parameters instead of the object it holds (a
> number of request classes do this) or construct a new object every time the
> method is called (quite a lot of request classes do this too).
>

Right.
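
A minimal sketch of that first pitfall, using hypothetical stand-in classes
(Params, CopyingRequest, SharingRequest) rather than the real SolrJ types.
It only assumes what is described above: some requests hand out a copy from
getParams() and others hand out the object they hold.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not SolrJ classes.
class GetParamsPitfall {

  /** A tiny mutable parameter bag, standing in for a modifiable params object. */
  static class Params {
    private final Map<String, String> values = new HashMap<>();

    Params set(String name, String value) {
      values.put(name, value);
      return this;
    }

    String get(String name) {
      return values.get(name);
    }

    Params copy() {
      Params p = new Params();
      p.values.putAll(values);
      return p;
    }
  }

  interface Request {
    Params getParams();
  }

  /** Returns a defensive copy on every call, so outside edits are lost. */
  static class CopyingRequest implements Request {
    private final Params params = new Params();

    @Override
    public Params getParams() {
      return params.copy();
    }
  }

  /** Returns the object it holds, so outside edits are visible later. */
  static class SharingRequest implements Request {
    private final Params params = new Params();

    @Override
    public Params getParams() {
      return params;
    }
  }

  /** Mimics a client that tries to decorate a request through getParams(). */
  static String decorateThenRead(Request request) {
    request.getParams().set("hypotheticalParam", "42");
    return request.getParams().get("hypotheticalParam");
  }

  public static void main(String[] args) {
    System.out.println(decorateThenRead(new SharingRequest())); // the edit sticks
    System.out.println(decorateThenRead(new CopyingRequest())); // the edit is silently dropped
  }
}
```

The caller cannot tell the two request types apart through the interface,
which is why code that relies on mutating the returned object is fragile.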


> Second, as it turns out the most relevant classes do in fact return a
> reference rather than a copy (AbstractUpdateRequest) and QueryRequest,
> though QueryRequest can hold any type of SolrParams, which might not be
> modifiable... and as the comment asks... what then?
>
> I am not sure I can come up with a good reason that this freedom exists in
> the API, and why there are so many implementations (mostly admin) where
> request objects produce a new (mutable!) object every time getParams() is
> called. Is there somewhere we pass request objects to secondary threads
> that I'm not remembering?
>


> This entire area seems ripe for a (10.x) revamp... which if there's no
> good answer to the above questions should maybe standardize on use of
> ModifiableSolrParams by default and any subclass that really thinks it
> needs defensive/immutable params documenting that in its javadoc and using
> a subclass that throws an exception if an attempt to modify is made...
>

Something like that sounds reasonable.  Maybe draft up something more
concrete with an example or two.  I could also imagine making
ModifiableSolrParams more of a builder that *produces* an immutable
SolrParams.  Just an idea.  Of course any talk of messing with SolrParams
has vast repercussions across the codebase.  We should tread carefully (get
broad agreement and consider what API is breaking when).
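
To make the builder idea a bit more concrete, here is a rough sketch.  All
names (ImmutableParams, Builder, toBuilder) are hypothetical and chosen for
illustration; this is not a proposal for the actual SolrParams API, just the
general shape of "mutable builder produces immutable snapshot".

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; class and method names are illustrative, not Solr APIs.
class ParamsBuilderSketch {

  /** Read-only view of parameters; there are no mutators to call. */
  static final class ImmutableParams {
    private final Map<String, String> values;

    private ImmutableParams(Map<String, String> values) {
      // Copy first so later builder changes cannot leak in, then lock it.
      this.values = Collections.unmodifiableMap(new LinkedHashMap<>(values));
    }

    String get(String name) {
      return values.get(name);
    }

    /** Start a new builder pre-populated with these values. */
    Builder toBuilder() {
      Builder b = new Builder();
      b.values.putAll(values);
      return b;
    }
  }

  /** The only mutable piece; build() hands out an independent snapshot. */
  static final class Builder {
    private final Map<String, String> values = new LinkedHashMap<>();

    Builder set(String name, String value) {
      values.put(name, value);
      return this;
    }

    ImmutableParams build() {
      return new ImmutableParams(values);
    }
  }

  public static void main(String[] args) {
    Builder builder = new Builder().set("q", "*:*");
    ImmutableParams first = builder.build();
    builder.set("q", "title:solr"); // does not affect the earlier snapshot
    System.out.println(first.get("q"));
    System.out.println(builder.build().get("q"));
  }
}
```

With this shape, a request could safely return the same immutable object
from every getParams() call, and anyone wanting to change it would have to
go through an explicit builder step.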


> That might be crazy talk, but I haven't yet talked myself out of it, maybe
> someone here can save me some time and tell me why it's crazy? ;-)
>
> -Gus
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>


Re: [VOTE] Release Solr 9.0.0 RC4

2022-04-29 Thread David Smiley
+1 (binding) but flaky tests are in sad shape; not worth holding up a
release over.

I ran the smoke tester and only one test
failed: org.apache.solr.servlet.TestRequestRateLimiter.testConcurrentQueries
However, I recall from conversations that this test has some known flakiness.
Moreover, tons of tests are showing up as failing very often
in http://fucit.org/solr-jenkins-reports/failure-report.html (this test is
maybe in the middle; there are plenty worse).
I suppose it's adequate.  I wish the smoketester ran the tests as the very
last step and not prior to other checking.  "checkMaven" happens after.

I then repeated with --test-java17 and got 3 different errors.  But there
were clues the failures occurred from Java 11 VMs, not 17.  Looking at the
smoketester script, it appears java17 is run _after_ java 11; so it never
made it this far.

I commented out Java 11 tests (because they were failing) and ran again so
that the Java 17 ones would run.  Again, 2-3 tests failed.

I did the docker build as instructed and I completed the Solr Text Tagger
tutorial.  Worked fine.

I'd like to try out the maven dependencies in a side project, since it was
observed there were issues.  I'm on travel at the moment but if someone has
tips on where to locate them / how to use them (anything non-obvious), I'd
appreciate it.

BTW thanks to everyone (esp. Jan & Houston) doing the legwork of doing the
release.  And the manual release checking I see most of you do -- very
important.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Wed, Apr 27, 2022 at 9:15 AM Jan Høydahl  wrote:

> Please vote for release candidate 4 for Solr 9.0.0
>
> The artifacts can be downloaded from:
>
> https://dist.apache.org/repos/dist/dev/solr/solr-9.0.0-RC4-rev-d6e36d590896755ca962c6d2ddedf78ca4f463cc
>
> You can run the smoke tester directly with this command:
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
>
> https://dist.apache.org/repos/dist/dev/solr/solr-9.0.0-RC4-rev-d6e36d590896755ca962c6d2ddedf78ca4f463cc
>
> You are encouraged to do an extra thorough test and manual inspection
> beyond
> running the smoketester, since this is a major release.
>
> You can build a release-candidate of the official docker image using the
> following command:
>
> DIST_BASE=https://dist.apache.org/repos/dist/dev/solr && \
>   RC_FOLDER=solr-9.0.0-RC4-rev-d6e36d590896755ca962c6d2ddedf78ca4f463cc &&
> \
>   docker build $DIST_BASE/$RC_FOLDER/solr/docker/Dockerfile.official \
>   --build-arg SOLR_DOWNLOAD_URL=$DIST_BASE/$RC_FOLDER/solr/solr-9.0.0.tgz \
>   -t solr-rc:9.0.0-4
>
> The vote will be open for at least 72 hours i.e. until 2022-04-30 13:00
> UTC.
>
> [ ] +1  approve
> [ ] +0  no opinion
> [ ] -1  disapprove (and reason why)
>
> Here is my +1
>
> SUCCESS! [0:56:56.134141]
> -
> To unsubscribe, e-mail: dev-unsubscr...@solr.apache.org
> For additional commands, e-mail: dev-h...@solr.apache.org
>
>


Re: [VOTE] Release Solr 9.0.0 RC5

2022-05-09 Thread David Smiley
+1 (binding)

SUCCESS! [0:45:47.169811]


~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, May 9, 2022 at 8:58 AM Jason Gerlowski 
wrote:

> +1 (binding)
>
> SUCCESS! [1:02:41.827601]
>
> Ran the smoketester and did some manual tests for incremental backups.
>
> On Sat, May 7, 2022 at 12:56 PM Joel Bernstein  wrote:
>
>> +1 (binding)
>>
>> SUCCESS! [0:50:47.511601]
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>>
>> On Fri, May 6, 2022 at 11:26 AM Eric Pugh <
>> ep...@opensourceconnections.com> wrote:
>>
>>> +1 (binding)
>>>
>>> SUCCESS! [1:05:27.496955]
>>>
>>> On May 6, 2022, at 11:16 AM, Timothy Potter 
>>> wrote:
>>>
>>> +1 (binding)
>>>
>>> SUCCESS! [0:47:39.581876]
>>>
>>> Just ran the smoke tester for this one, didn't have time to do any
>>> other manual tests.
>>>
>>> On Thu, May 5, 2022 at 1:47 PM Mike Drob  wrote:
>>>
>>>
>>> [INFO] There was an issue with SOLR-16133 that caused the smoke tester
>>> to fail with gpg errors on macOS. That change has been reverted from
>>> branch_9_0 and if you run into it, please try fetching the latest changes
>>> and running the smoke tester command again.
>>>
>>> On Thu, May 5, 2022 at 2:48 AM Jan Høydahl 
>>> wrote:
>>>
>>>
>>> Please vote for release candidate 5 for Solr 9.0.0
>>>
>>> The artifacts can be downloaded from:
>>>
>>> https://dist.apache.org/repos/dist/dev/solr/solr-9.0.0-RC5-rev-a4eb7aa123dc53f8dac74d80b66a490f2d6b4a26
>>>
>>> You can run the smoke tester directly with this command:
>>>
>>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>>
>>> https://dist.apache.org/repos/dist/dev/solr/solr-9.0.0-RC5-rev-a4eb7aa123dc53f8dac74d80b66a490f2d6b4a26
>>>
>>> You can build a release-candidate of the official docker image using the
>>> following command:
>>>
>>> DIST_BASE=https://dist.apache.org/repos/dist/dev/solr && \
>>>  RC_FOLDER=solr-9.0.0-RC5-rev-a4eb7aa123dc53f8dac74d80b66a490f2d6b4a26
>>> && \
>>>  docker build $DIST_BASE/$RC_FOLDER/solr/docker/Dockerfile.official \
>>>  --build-arg SOLR_DOWNLOAD_URL=$DIST_BASE/$RC_FOLDER/solr/solr-9.0.0.tgz
>>> \
>>>  -t solr-rc:9.0.0-5
>>>
>>> The vote will be open for at least 72 hours (plus weekend) i.e. until
>>> 2022-05-10 08:00 UTC.
>>>
>>> [x] +1  approve
>>> [ ] +0  no opinion
>>> [ ] -1  disapprove (and reason why)
>>>
>>> Here is my +1
>>>
>>> SUCCESS! [0:59:56.100439]
>>>
>>>
>>>
>>>
>>>
>>>


Re: 2022 May, Solr committer meeting

2022-05-09 Thread David Smiley
*If you didn't receive an email from me about the meeting invite, message
me directly.*
Minutes ago, I duplicated the previous committer meeting from January and
edited it slightly for tomorrow's meeting.  Therefore it has the same
invite list.  If for some reason you didn't get any notification / email
about this, simply let me know and I will add you.  Then henceforth, I
imagine it'll be a non-issue for you.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Sun, May 8, 2022 at 7:51 PM Houston Putman  wrote:

> Thanks to everyone that filled out their availability!
>
> It looks like the best time (accounting for answers and timezone support)
> is:
>
> *Tuesday, May 10 at 12:00 PM (noon) EDT (GMT-4)*
>
> David agreed to make a Google Meet for everyone, so will defer to him on
> that.
>
> Please go ahead and fill out the agenda before the meeting if you have
> topics to discuss!
>
> - Houston
>
> On Fri, May 6, 2022 at 8:39 AM Houston Putman  wrote:
>
>> Hey everyone, a reminder to fill out your available times above. I will
>> be making a selection in a day or two and announce it before the week
>> starts.
>>
>> - Houston
>>
>> On Wed, Apr 27, 2022 at 9:23 AM Houston Putman 
>> wrote:
>>
>>> Hello Solr committers,
>>>
>>> The time has come for the next committers meeting.
>>>
>>> It will take place on one of *Wednesday, May 10-12th*.
>>>
>>> Please fill out what days/times work best for you here:
>>> https://doodle.com/meeting/organize/id/erkqM2Ka
>>>
>>> Note: This meeting is restricted to Solr committers, but the notes of
>>> the meeting will be made public.
>>>
>>> If you want items added to the agenda, please include them here:
>>> https://cwiki.apache.org/confluence/display/SOLR/2022-05+Meeting+notes
>>>
>>> - Houston
>>>
>>


Re: 2022 May, Solr committer meeting

2022-05-10 Thread David Smiley
As usual, I recorded the session.  It was an hour long.  If you're a
committer who wants access, just let me know and I will share it!


Re: [RESULT] [VOTE] Release Solr 9.0.0 RC5

2022-05-10 Thread David Smiley
Woohoo!

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, May 10, 2022 at 11:09 AM Jan Høydahl  wrote:

> It's been >72h since the vote was initiated and the result is:
>
> +1  11  (9 binding)
>  0  0
> -1  0
>
> This vote has PASSED
>
> I'll wait until tomorrow with publishing the artifacts and announcing the
> release, to allow some latest edits to RefGuide.
>
> Congrats!
>
> Jan
>
> > 5. mai 2022 kl. 09:48 skrev Jan Høydahl :
> >
> > Please vote for release candidate 5 for Solr 9.0.0
> >
> > The artifacts can be downloaded from:
> >
> https://dist.apache.org/repos/dist/dev/solr/solr-9.0.0-RC5-rev-a4eb7aa123dc53f8dac74d80b66a490f2d6b4a26
> >
> > You can run the smoke tester directly with this command:
> >
> > python3 -u dev-tools/scripts/smokeTestRelease.py \
> >
> https://dist.apache.org/repos/dist/dev/solr/solr-9.0.0-RC5-rev-a4eb7aa123dc53f8dac74d80b66a490f2d6b4a26
> >
> > You can build a release-candidate of the official docker image using the
> following command:
> >
> > DIST_BASE=https://dist.apache.org/repos/dist/dev/solr && \
> >  RC_FOLDER=solr-9.0.0-RC5-rev-a4eb7aa123dc53f8dac74d80b66a490f2d6b4a26
> && \
> >  docker build $DIST_BASE/$RC_FOLDER/solr/docker/Dockerfile.official \
> >  --build-arg SOLR_DOWNLOAD_URL=$DIST_BASE/$RC_FOLDER/solr/solr-9.0.0.tgz
> \
> >  -t solr-rc:9.0.0-5
> >
> > The vote will be open for at least 72 hours (plus weekend) i.e. until
> 2022-05-10 08:00 UTC.
> >
> > [x] +1  approve
> > [ ] +0  no opinion
> > [ ] -1  disapprove (and reason why)
> >
> > Here is my +1
> >
> > SUCCESS! [0:59:56.100439]
> >
>
>
>
>


Re: RefGuide for 9.0 viewable on website

2022-05-10 Thread David Smiley
It's really a wonderful upgrade to the ref guide!

The home/landing page has a "Query Guide" navigation panel linking directly
to the "Common Query Parameters" page.  That's wrong; it should point to
the top-most page about queries, which happens to be "Query Syntax and
Parsers".  The same issue is here with the "Indexing Guide".  The other
panels don't seem to have this problem.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, May 10, 2022 at 1:39 PM Marcus Eagan  wrote:

> +1 non binding.
>
> I absolutely love it.
>
> On Tue, May 10, 2022 at 2:50 AM Alessandro Benedetti 
> wrote:
>
>> Hi Jan,
>> I took a look, there's no problem with the triple search as before (but
>> this was expected as Houston gave the explanation weeks ago).
>> The new sections look good as well (minor formatting with the dense
>> vector similarity distances, but I don't really know how to fix it/ if it
>> is worth fixing it).
>>
>> To me it's a +1
>>
>> Cheers
>> --
>> *Alessandro Benedetti*
>> CEO @ Sease Ltd.
>> *Apache Lucene/Solr Committer*
>> *Apache Solr PMC Member*
>>
>> e-mail: a.benede...@sease.io
>>
>>
>> *Sease* - Information Retrieval Applied
>> Consulting | Training | Open Source
>>
>> Website: Sease.io <http://sease.io/>
>> LinkedIn <https://linkedin.com/company/sease-ltd> | Twitter
>> <https://twitter.com/seaseltd> | Youtube
>> <https://www.youtube.com/channel/UCDx86ZKLYNpI3gzMercM7BQ> | Github
>> <https://github.com/seaseltd>
>>
>>
>> On Sun, 8 May 2022 at 23:49, Jan Høydahl  wrote:
>>
>>> Hi,
>>>
>>> The 9.0 (beta) reference guide is now viewable on the main website
>>> (although not yet linked to from anywhere):
>>>
>>> https://solr.apache.org/guide/solr/9_0/
>>>
>>> Please take a moment to browse through and report (or better, submit a
>>> PR for) any issues.
>>>
>>> Things to pay attention to:
>>> - New structure, links etc
>>> - Test that tutorials, examples, commands work
>>> - All new 9.0 features documented
>>> - Removed features removed from guide
>>>
>>> PS: The "Upgrade Notes" is still work in progress.
>>>
>>> Jan
>>>
>>
>
> --
> Marcus Eagan
>
>


Re: Cleaning up IntelliJ warnings in code base

2022-05-27 Thread David Smiley
IntelliJ is produced by a company and I have no idea how they go about
selecting what the default inspections (what IntelliJ calls these) are.
Maybe it was one person there, maybe it was arbitrary by whoever wrote
the inspection, or maybe they had some more thoughtful approach that looked
at literature.  Regardless, I disagree with some of their choices.  I think
we should decide for ourselves which inspections to address, and not address
them *just* because JetBrains included them.  I routinely adjust my IntelliJ
inspection settings to not harass me about some matters that I consider to
be frivolous.  For example boolean expression simplifications -- where we
as a project (when a part of Lucene) have chosen "== false" to be
clearer than an exclamation point adjacent to a boolean expression.
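
As a small illustration of that style preference (the method and variable
names here are made up), both methods below compute the same thing; the
second spells the negation out, which is the form IntelliJ's default
simplification inspection would flag:

```java
// Illustrative only; the method and variable names are made up.
class NegationStyle {

  static boolean skipWithBang(boolean enabled, boolean dirty) {
    // The leading '!' is easy to miss next to a parenthesized expression.
    return !(enabled && dirty);
  }

  static boolean skipWithEqualsFalse(boolean enabled, boolean dirty) {
    // Same logic, with the negation spelled out at the end.
    return (enabled && dirty) == false;
  }
}
```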

If we do some of this:  Agreed on picking exactly one "inspection" and
scoping to just one module at first.  Could increase to more commits in the
same PR if you get good feedback.
Personally, I wouldn't do this endeavor unless the particular inspection is
something that particularly motivates me / was a pet-peeve.
I think "getting to green" is a total lost cause unless we were to enforce a
particular configured list of inspections (which is IntelliJ only,
remember).

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, May 27, 2022 at 12:52 PM Shawn Heisey  wrote:

> On 5/27/2022 8:24 AM, Eric Pugh wrote:
> > Hey all, was poking around at a unit test while watching TV and
> > noticed lots of warnings from IntelliJ, little stuff like exceptions
> > being thrown that don’t need to be thrown, unused variables, or typos.
>
> In eclipse, there are THOUSANDS of warnings.  And last I checked, even a
> bunch of errors.  But I was able to build 10.0.0-SNAPSHOT successfully.
>
> Thanks,
> Shawn
>
>
>
>


Re: Bugfix release Lucene/Solr 8.11.2

2022-06-06 Thread David Smiley
I merged SOLR-16227 to main, 9, 8.11 some minutes ago.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jun 3, 2022 at 1:59 PM Mike Drob  wrote:

> Yes, please fix and backport SOLR-16227, it looks almost ready from the
> conversation on the PR. I will plan to do the first RC on Monday if we can
> get the backport completed today.
>
> On Thu, Jun 2, 2022 at 9:18 AM kiran chitturi 
> wrote:
>
>> Hi Mike,
>>
>> I found a new issue in Solr SQL (SOLR-16227) and have a fix for it (
>> https://github.com/apache/solr/pull/887). Can you wait on the release
>> for 8.11.2 till I can backport this?
>>
>> Thank you,
>> Kiran.
>>
>> On Tue, May 31, 2022 at 9:21 AM Mike Drob  wrote:
>>
>>> Howdy folks, now that Lucene 9.2 has wrapped and we're past the holiday
>>> weekend in the United States, I'd like to take a look at getting this
>>> rolling by the end of the week. I see an open PR for
>>> backporting LUCENE-10236 but it doesn't look like anything else would
>>> really be waiting at the moment.
>>>
>>> I will plan to build a release candidate on Thursday (sooner if
>>> LUCENE-10236 is committed, later if somebody else shouts that they have
>>> other issues).
>>>
>>> Thanks!
>>>
>>> On Tue, May 24, 2022 at 3:48 PM Jan Høydahl 
>>> wrote:
>>>
>>>> I bumped Jackson in https://issues.apache.org/jira/browse/SOLR-16213 and
>>>> also backported to 8_11. Wdyt?
>>>>
>>>> Jan
>>>>
>>>> 18. mai 2022 kl. 15:22 skrev Gus Heck :
>>>>
>>>> SOLR-16194 is in and ported to 8.11.2
>>>>
>>>> On Wed, May 18, 2022 at 7:12 AM Jan Høydahl 
>>>> wrote:
>>>>
>>>>> I was pinged on https://issues.apache.org/jira/browse/SOLR-16019 because
>>>>> I have an in-flight PR with a backport. I'll complete and merge that PR.
>>>>>
>>>>> Jan
>>>>>
>>>>>
>>>>> 13. mai 2022 kl. 01:03 skrev Mike Drob :
>>>>>
>>>>> To: dev@lucene, dev@solr
>>>>>
>>>>> NOTICE:
>>>>>
>>>>> I am planning on preparing a bugfix release from branch branch_8_11
>>>>> (likely mid next week)
>>>>>
>>>>> Please observe the normal rules for committing to this branch:
>>>>>
>>>>> * Before committing to the branch, reply to this thread and argue
>>>>>   why the fix needs backporting and how long it will take.
>>>>> ** If you're backporting stuff this week still or over the weekend,
>>>>> then skip
>>>>> the bit about how long it will take.
>>>>> * All issues accepted for backporting should be marked with 8.11.2
>>>>>   in JIRA, and issues that should delay the release must be marked as
>>>>> Blocker
>>>>> * All patches that are intended for the branch should first be
>>>>> committed
>>>>>   to the unstable branch, merged into the stable branch, and then into
>>>>>   the current release branch.
>>>>> * Only Jira issues with Fix version 8.11.2 and priority "Blocker" will
>>>>> delay
>>>>>   a release candidate build.
>>>>>
>>>>> Also, please observe that since 9.0 already exists, there cannot be
>>>>> any index format breaking changes. It really should only be bug fixes that
>>>>> have already been verified on the 9x branch.
>>>>>
>>>>> Thanks,
>>>>> Mike
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> http://www.needhamsoftware.com (work)
>>>> http://www.the111shift.com (play)
>>>>
>>>>
>>>>


Re: Welcome Markus Jelsma as Solr committer

2022-06-21 Thread David Smiley
Welcome Markus!
Thanks for contributing to Solr over the years.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Jun 21, 2022 at 2:13 PM Christine Poerschke (BLOOMBERG/ LONDON) <
cpoersc...@bloomberg.net> wrote:

> Hello everyone.
>
> On behalf of the Apache Solr PMC, I'm pleased to announce that Markus
> Jelsma has accepted the invitation to become a Solr committer.
>
> Markus - it's a tradition that you introduce yourself with a brief bio, if
> you wish.
>
> Congratulations and welcome!
>
> --
> Christine
>


Re: Potential Dead Code: MetricsCollectorHandler, SolrReporter

2022-07-01 Thread David Smiley
These are opt-in plugins living in Solr-core. A user can configure
reporters on all their nodes publishing to a node that has the
MetricsCollectorHandler configured.  AB explains their use here:
https://issues.apache.org/jira/browse/SOLR-15007?focusedCommentId=17234635&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17234635

Before deeming something unused/unneeded, I recommend digging through
commit history & JIRA, and even commenting on the JIRA that introduced it,
even years later, to ask questions like this.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jun 23, 2022 at 10:16 AM Jason Gerlowski 
wrote:

> Hi all,
>
> I was poking around and noticed what I think are some remnants on
> "main" of the metrics-history feature that was removed in 9.0.
> Particularly I'm wondering about the "MetricsCollectorHandler" and
> "SolrReporter" classes.
>
> Neither is referenced elsewhere in the code base or ref-guide as far
> as I can tell.  It looks like they were introduced as a part of
> SOLR-9858, which looks like a piece of the metrics-history work.
> (Though to play devil's advocate: that doesn't necessarily mean that
> they're not still useful even now that metrics-history is gone.)
>
> Anyway - if anyone has context on these classes or knows whether they
> can be deleted, please let me know.
>
> Best,
>
> Jason
>
>
>


Re: MoveReplicaCmd with concurrent updates

2022-07-06 Thread David Smiley
On Tue, Jul 5, 2022 at 10:48 AM Houston Putman  wrote:

> That, or maybe the distributed update processor checks the state of the
> cluster on a failure and if the replica no longer exists, it acts
> accordingly.
>

That is what we're exploring.


Re: ClientUtils.escapeQueryChars escaping whitespace?

2022-07-11 Thread David Smiley
Yeah; agreed with Houston.  I think it's working as designed.
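
To illustrate Houston's point quoted below, here is a simplified stand-in
for a term escaper.  It is not the actual SolrJ implementation, and its list
of special characters is an assumption made for the example; the only point
is that whitespace gets a backslash just like the syntax characters do.

```java
// Simplified stand-in for illustration; not the actual SolrJ implementation,
// and the set of special characters here is an assumption for the example.
class TermEscapeSketch {

  static String escapeTerm(String term) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < term.length(); i++) {
      char c = term.charAt(i);
      boolean special = "\\+-!():^[]\"{}~*?|&;/".indexOf(c) >= 0;
      // Whitespace separates clauses in a query string, so inside a term
      // it needs the same treatment as the syntax characters.
      if (special || Character.isWhitespace(c)) {
        sb.append('\\');
      }
      sb.append(c);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    String raw = "New York";
    // Unescaped, the space would typically split this into two clauses.
    System.out.println("name:" + raw);
    // Escaped, the space stays inside the single term for the name field.
    System.out.println("name:" + escapeTerm(raw));
  }
}
```

If the whole user input is meant to be a multi-word query rather than one
term, escaping the entire string this way is probably not what the caller
wants, which may be the source of the surprise.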

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Jul 11, 2022 at 11:40 AM Houston Putman  wrote:

> From my understanding, that method is supposed to basically escape
> characters inside of a term you are trying to query for.
> So since whitespace is not treated as a part of a term in a query string
> (it separates terms to be queried, using the default operator), it has to
> be escaped.
>
> How are you trying to use the method?
>
> On Thu, Jul 7, 2022 at 5:18 PM Joel Bernstein  wrote:
>
>> I was using the ClientUtils.escapeQueryChars method and was
>> surprised that white space is being escaped.
>>
>>
>> https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/util/ClientUtils.java#L186
>>
>> I was considering removing this but wanted to see if anyone had a reason
>> for whitespace to be escaped.
>>
>>
>>
>> Joel Bernstein
>> http://joelsolr.blogspot.com/
>>
>


Re: UpdateLog memory usage on startup

2022-07-13 Thread David Smiley
Makes sense Bram.

I note that it's been over a month with no response.  Just a suggestion --
try commenting on the pertinent JIRA because it will get the attention of
the last committer (and interested parties).
BTW we could cap the initial ArrayList size to, say, Math.min(1024,n)
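
A rough sketch of what I mean, together with a check of the reallocation
count from Bram's napkin math.  The helper names are hypothetical and this
is not the UpdateLog code; the growth rule (new = old + old/2) and the
default capacity of 10 are what OpenJDK's ArrayList uses, as far as I know.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only; the helper names are hypothetical and this is not the UpdateLog code.
class CappedCapacitySketch {

  static final int MAX_INITIAL_CAPACITY = 1024;

  /** Pre-size for small requests, but never reserve more than the cap up front. */
  static List<Long> newVersionList(int n) {
    return new ArrayList<>(initialCapacity(n));
  }

  static int initialCapacity(int n) {
    return Math.min(MAX_INITIAL_CAPACITY, n);
  }

  /**
   * Counts how many times a backing array that grows by 50% (new = old + old/2)
   * must be reallocated before it can hold {@code target} elements.
   */
  static int growthSteps(int startCapacity, int target) {
    if (startCapacity < 2) {
      throw new IllegalArgumentException("startCapacity must be at least 2");
    }
    int capacity = startCapacity;
    int steps = 0;
    while (capacity < target) {
      capacity += capacity >> 1;
      steps++;
    }
    return steps;
  }

  public static void main(String[] args) {
    // Starting from the default capacity of 10.
    System.out.println(growthSteps(10, 10_000_000));
    // Starting from the capped initial capacity.
    System.out.println(growthSteps(initialCapacity(10_000_000), 10_000_000));
  }
}
```

Under those assumptions, starting from 10 takes 35 reallocations to reach
10M entries (matching Bram's figure) and starting from 1024 takes 23, while
a request for only a handful of versions allocates just what it asks for.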

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jun 9, 2022 at 11:34 AM Bram Van Dam  wrote:

> Howdy,
>
> We've noticed that enabling larger transaction logs causes the memory
> requirements for Solr to increase: Solr consumes large amounts of memory
> at startup.
>
> After procuring a heap dump, this seems to be because Solr initializes
> ArrayLists in UpdateLog::getVersions with size
> maxNbTransactionLogEntries. While this may be more efficient if your
> actual log files are close to maximum size, this wastes memory when the
> actual logs are small. This is something that occurs frequently when you
> have a small number of shards which receive a lot of writes, and a lot
> of shards which receive few (or no) writes.
>
> We've seen cases where Solr needs an additional 10GiB of memory during
> startup. It gets freed afterwards, but it does make startup painful.
>
> The fix for SOLR-15676 further increased the memory footprint by
> allocating a LongSet of the same size.
>
> public List<Long> getVersions(int n, long maxVersion) {
>    List<Long> ret = new ArrayList<>(n);
>    LongSet set = new LongSet(n);
>
> The naïve fix would be to simply replace this init of new ArrayList<>(n)
> with new ArrayList<>(). ArrayList grows its capacity by 50% every time
> it's full, resulting in some extra garbage overhead and extra calls to
> array copy.
>
> A quick bit of napkin math shows that for 10M entries, the array will
> have to be reallocated 35 times. In our case, this is worth the extra
> overhead. In the general case, it might not be?
>
> Does anyone have any further insights?
>
> Thanks,
>
>   - Bram
>
>
>


Re: v2 API "Broader Changes" and Next Steps

2022-07-14 Thread David Smiley
I did look over it twice now, here & there.  Boy, there is a big API
surface area to Solr!  Thanks for working on this huge task Jason!

I think it would be best to tackle one limited area first so we can see it
in practice both at the surface and implementation.  From an implementation
perspective, I strongly prefer to move our functionality to implement the
new API at its core/directly, and then have v1 be a shim on top.  This
will lead to new/clean code in our future with legacy concerns off to the
side that will be easy to jettison some day.  I think the v2 we have today
didn't take that approach.
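
A rough sketch of the layering I have in mind.  None of these names are
real Solr classes or endpoints; they are made up to show the direction of
the dependency: the v2 class owns the logic, and the v1 entry point only
translates loose parameters and delegates.

```java
import java.util.Map;

// Hypothetical sketch of the layering; none of these names are real Solr classes or endpoints.
class ApiLayeringSketch {

  /** The new API owns the real logic and takes typed arguments. */
  static final class V2CreateThingApi {
    String create(String name, int replicas) {
      if (name == null || name.isEmpty()) {
        throw new IllegalArgumentException("name is required");
      }
      return "created " + name + " with " + replicas + " replica(s)";
    }
  }

  /** The legacy entry point only translates loose parameters and delegates. */
  static final class V1CreateThingShim {
    private final V2CreateThingApi delegate = new V2CreateThingApi();

    String handle(Map<String, String> params) {
      String name = params.get("name");
      int replicas = Integer.parseInt(params.getOrDefault("replicas", "1"));
      return delegate.create(name, replicas);
    }
  }

  public static void main(String[] args) {
    System.out.println(new V1CreateThingShim().handle(Map.of("name", "demo")));
  }
}
```

Deleting the shim class later would leave the v2 code untouched, which is
the "easy to jettison" property I'm after.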

I'm not familiar with OpenAPI, but based on the conversations here, I love
the prospect of having our users in various languages more easily talk to
Solr.  Maybe it could be used in Solrj to reduce code maintenance as well.
Overall, similar to Swagger, which I have used in a POC and immediately
loved that I had a simple UI that I didn't have to write/maintain.

Let's not constrain ourselves with supporting the current v2 API on Solr
10!  Doing so increases the risk that this API change won't even happen in
the first place because it's too hard.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jul 14, 2022 at 7:00 AM Jason Gerlowski 
wrote:

> Hey all,
>
> Wanted to give another "plug" to the spreadsheet of proposed v2 API
> changes, in case folks missed it the first time around.  Please take a
> look and review whenever you get a chance!
>
>
> https://docs.google.com/spreadsheets/d/1HAoBBFPpSiT8mJmgNZKkZAPwfCfPvlc08m5jz3fQBpA/edit?usp=sharing
>
> > Love to see CORS support added
>
> I'm not a CORS expert, so maybe I'm misunderstanding it, but I always
> thought CORS was a security feature that was somewhat independent from
> an API's shape/design? Do endpoints need to be designed "for" CORS in
> some way I'm missing?  Just trying to understand if and how it'd
> dovetail with the v2 effort here.
>
> Best,
>
> Jason
>
> On Thu, Jul 7, 2022 at 8:20 AM Eric Pugh
>  wrote:
> >
> > Love to see CORS support added ;-)
> >
> >
> > On Jul 6, 2022, at 9:45 AM, Jason Gerlowski 
> wrote:
> >
> > [Jan] I think it is better for the project to evolve and fix this
> >
> >
> > Glad to hear it; sorry for the confusion if I misunderstood your
> concerns!
> >
> > Well in that case it sounds like there's general support for the idea
> > of broader changes to the v2 API, and no categorical objections
> > (albeit a few concerns about helping users upgrade, etc.)
> >
> > Of course, there'll need to be a good bit of discussion still around
> > what specific changes to make.  REST and OpenAPI support are the two
> > things that've come up repeatedly in past discussions, so I've gone
> > ahead and put together a Google Sheet with first-drafts of the changes
> > each API would need if we go in that direction.  I've attached the
> > sheet to SOLR-15871 ("Cosmetic and consistency improvements for the v2
> > API") and linked it below.  Hopefully that'll be a good way to
> > kickstart the discussion.
> >
> >
> https://docs.google.com/spreadsheets/d/1HAoBBFPpSiT8mJmgNZKkZAPwfCfPvlc08m5jz3fQBpA/edit?usp=sharing
> >
> > Thanks all for the feedback so far!
> >
> > Best,
> >
> > Jason
> >
> > On Tue, Jun 21, 2022 at 4:25 AM Jan Høydahl 
> wrote:
> >
> >
> > I'd love to find a way to
> > address your concerns and still evolve v2 without backcompat, if we
> > can.
> >
> >
> > I just wanted to highlight that some users may be using v2 without
> realizing it was experimental due to the back-and-forth communication we
> have had on this.
> > Personally I think it is better for the project to evolve and fix this,
> even if that means we'll put extra migration effort on some v2 users in
> minor releases. We'd of course need to clearly mark such changes so it
> won't come as a surprise.
> >
> > Jan
> >
> >
> >
> >
> >
>
>
>


Re: v2 API "Broader Changes" and Next Steps

2022-07-14 Thread David Smiley
On Thu, Jul 14, 2022 at 4:31 PM Jason Gerlowski 
wrote:

> > I think it would be best to tackle one limited area first so we can see
> it in practice both at the surface and implementation.
>
> I think that makes sense, assuming that by "tackling" you mean
> updating the API path/verb/etc and moving the "real" logic over to the
> v2 class?  I'd rather we not take a "limited area" approach to getting
> consensus around the API endpoint design itself, as IMO that'd produce
> a less cohesive result.  I don't think that was what you meant, but
> just double-checking.
>

Sure -- API first.  There may be some implementation realities that cause
us to reconsider choices but we can take that as it happens.

~ David


Re: UpdateLog memory usage on startup

2022-07-19 Thread David Smiley
The migration you speak of is in Lucene, not Solr.  It would be noticed by
"Watchers".

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Tue, Jul 19, 2022 at 12:29 PM Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Great find! Let's have it committed.
>
> On Tue, Jul 19, 2022 at 9:49 PM Bram Van Dam  wrote:
>
> > Thanks for your reply, David. Given the apparent migration from
> > Jira->Github, I didn't think that would get more response than the
> > mailing list 😅
> >
> > We've been running a patched version of 7.7 with a smaller Versions
> > arraylist for a while now, without any ill effects.
> >
> >   - Bram
> >
> > On 13/07/2022 23.54, David Smiley wrote:
> > > Makes sense Bram.
> > >
> > > I note that it's been over a month with no response.  Just a suggestion
> > --
> > > try commenting on the pertinent JIRA because it will get the attention
> of
> > > the last committer (and interested parties).
> > > BTW we could cap the initial ArrayList size to, say, Math.min(1024,n)
> > >
> > > ~ David Smiley
> > > Apache Lucene/Solr Search Developer
> > > http://www.linkedin.com/in/davidwsmiley
> > >
> > >
> > > On Thu, Jun 9, 2022 at 11:34 AM Bram Van Dam 
> > wrote:
> > >
> > >> Howdy,
> > >>
> > >> We've noticed that enabling larger transaction logs causes the memory
> > >> requirements for Solr to increase: Solr consumes large amounts of
> memory
> > >> at startup.
> > >>
> > >> After procuring a heap dump, this seems to be because Solr initializes
> > >> ArrayLists in UpdateLog::getVersions with size
> > >> maxNbTransactionLogEntries. While this may be more efficient if your
> > >> actual log files are close to maximum size, this wastes memory when
> the
> > >> actual logs are small. This is something that occurs frequently when
> you
> > >> have a small number of shards which receive a lot of writes, and a lot
> > >> of shards which receive few (or no) writes.
> > >>
> > >> We've seen cases where Solr needs an additional 10GiB of memory during
> > >> startup. It gets freed afterwards, but it does make startup painful.
> > >>
> > >> The fix for SOLR-15676 further increased the memory footprint by
> > >> allocating a LongSet of the same size.
> > >>
> > >> public List<Long> getVersions(int n, long maxVersion) {
> > >>   List<Long> ret = new ArrayList<>(n);
> > >>   LongSet set = new LongSet(n);
> > >>
> > >> The naïve fix would be to simply replace this init of new
> ArrayList<>(n)
> > >> with new ArrayList<>(). ArrayList grows its capacity by 50% every time
> > >> it's full, resulting in some extra garbage overhead and extra calls to
> > >> array copy.
> > >>
> > >> A quick bit of napkin math shows that for 10M entries, the array will
> > >> have to be reallocated 35 times. In our case, this is worth the extra
> > >> overhead. In the general case, it might not be?
> > >>
> > >> Does anyone have any further insights?
> > >>
> > >> Thanks,
> > >>
> > >>- Bram
> > >>
> > >>
> > >>
> > >
> >
> >
> >
> >
>
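
The napkin math in this thread can be checked with a small model of how
ArrayList grows.  This is only a sketch, not Solr code: the class and method
names are hypothetical, and it assumes OpenJDK's current behaviour (a
default-constructed list gets capacity 10 on the first add, then each growth
step takes the capacity to old + old/2), which is an implementation detail
rather than a documented guarantee.  It also includes the Math.min(1024, n)
cap suggested above, so the two starting points can be compared.

```java
/** Hypothetical helper (not part of Solr): models ArrayList capacity growth. */
final class ArrayListGrowthSketch {

  private ArrayListGrowthSketch() {}

  /**
   * Counts how many times the backing array would be reallocated before its
   * capacity reaches {@code target}, starting from {@code initialCapacity}.
   * Assumption: each growth step is newCapacity = old + (old >> 1), as in
   * OpenJDK, with a minimum growth of one slot.
   */
  static int countGrowths(long initialCapacity, long target) {
    if (initialCapacity < 1) {
      throw new IllegalArgumentException("initialCapacity must be >= 1");
    }
    long capacity = initialCapacity;
    int growths = 0;
    while (capacity < target) {
      long grown = capacity + (capacity >> 1);
      // A capacity of 1 would never grow by 50%, so always grow by at least 1.
      capacity = Math.max(grown, capacity + 1);
      growths++;
    }
    return growths;
  }

  /** The cap suggested in the thread: pre-allocate at most 1024 slots. */
  static int cappedInitialCapacity(int n) {
    return Math.min(1024, n);
  }

  public static void main(String[] args) {
    long entries = 10_000_000L;
    int capped = cappedInitialCapacity(10_000_000);
    System.out.println("growths from default capacity 10: "
        + countGrowths(10, entries));
    System.out.println("growths from capped capacity " + capped + ": "
        + countGrowths(capped, entries));
  }
}
```

Under these assumptions the model agrees with the estimate of 35
reallocations for 10M entries when starting from the default capacity.
Starting from a capped capacity of 1024 needs fewer growth steps while still
avoiding the large up-front allocation for logs that turn out to be small.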


Re: On tests labelled @Slow...

2022-07-21 Thread David Smiley
Thanks for spearheading this!

Your definition of "slow" seems fine.  We can change it later.  As long as
the build publishes tests with a runtime exceeding this threshold, we can
maintain this easily.

I think keeping @Slow makes sense so that we can identify these tests
as-such to avoid running them at the CLI during normal development to keep
us productive.  Obviously, slow tests need to run _sometimes_, which I
think should be at least CI & probably PR validation too.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Thu, Jul 21, 2022 at 4:00 PM Mike Drob  wrote:

> Howdy devs,
>
> I stumbled onto https://issues.apache.org/jira/browse/SOLR-16304 while
> trying to upgrade our Lucene dependency and it's motivated me to take a
> little bit of a look at our tests. I know that there are dragons here and
> I'm under no illusions that I can fix everything, but I feel like a
> thorough audit might be useful.
>
> The short of it is that @Slow is going away. We have choices on what to do.
> We currently have 112 tests annotated as such.
>
> Let's start with some definitions? What is our threshold for how slow
> is @Slow? Obviously this will vary from machine to machine, but maybe let's
> say that anything under 10s on my 2017 iMac Pro is fast and anything longer
> is slow? Arbitrary, and I reserve the right to move this later if I feel
> there's a better cut off.
>
> So maybe some tests get a new lease on life by being unlabelled. Maybe
> some other ones get fixed (reducing data size is one idea...)
>
> Some tests are slow because we have distributed systems and
> propagation delay and lots of gross sleeps and waits, and I don't want to
> touch those. Maybe those become Nightlies.
>
> Are there other approaches? What do folks want to do to move us forward?
>
> Mike
>


Re: On tests labelled @Slow...

2022-07-22 Thread David Smiley
Or Slow should be disabled by default?  One or the other.

In the Lucene issue you linked to
https://issues.apache.org/jira/browse/LUCENE-10532 Tomoko did a comparison
table of tests running with & without Slow, and across threads.  If we
assume at least 4 workers, what are the results?  I wouldn't be surprised
if disabling Slow could make a big difference for Solr due to the long tail
of slower tests and Gradle's inability to keep all workers busy.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jul 22, 2022 at 9:58 PM Mike Drob  wrote:

> Ok, another correction, tests.slow is enabled by default, so if they're
> already running most of the time then it's pretty "safe" to just axe the
> annotations.
>
> On Thu, Jul 21, 2022 at 8:19 PM Mike Drob  wrote:
>
> > Hmm... correction here - the failing Slow tests also happen to be
> > AwaitsFix tests so they were broken anyway. I wonder why my gradle
> command
> > decided to include them.
> >
> > On Thu, Jul 21, 2022 at 8:14 PM Mike Drob  wrote:
> >
> >> While I would agree with you in principle, I don't think the Slow tests
> >> are currently running anywhere right now. I tried running them locally
> and
> >> immediately got three reproducible failures.
> >>
> >> Uwe's jenkins doesn't run the slow tests and I don't see any jobs on ASF
> >> Jenkins that seem to do that either.
> >>
> >> On Thu, Jul 21, 2022 at 3:42 PM David Smiley 
> wrote:
> >>
> >>> Thanks for spearheading this!
> >>>
> >>> Your definition of "slow" seems fine.  We can change it later.  As long
> >>> as
> >>> the build publishes tests with a runtime exceeding this threshold, we
> can
> >>> maintain this easily.
> >>>
> >>> I think keeping @Slow makes sense so that we can identify these tests
> >>> as-such to avoid running them at the CLI during normal development to
> >>> keep
> >>> us productive.  Obviously, slow tests need to run _sometimes_, which I
> >>> think should be at least CI & probably PR validation too.
> >>>
> >>> ~ David Smiley
> >>> Apache Lucene/Solr Search Developer
> >>> http://www.linkedin.com/in/davidwsmiley
> >>>
> >>>
> >>> On Thu, Jul 21, 2022 at 4:00 PM Mike Drob  wrote:
> >>>
> >>> > Howdy devs,
> >>> >
> >>> > I stumbled onto https://issues.apache.org/jira/browse/SOLR-16304
> while
> >>> > trying to upgrade our Lucene dependency and it's motivated me to
> take a
> >>> > little bit of a look at our tests. I know that there are dragons here
> >>> and
> >>> > I'm under no illusions that I can fix everything, but I feel like a
> >>> > thorough audit might be useful.
> >>> >
> >>> > The short of it is that @Slow is going away. We have choices on what
> >>> to do.
> >>> > We currently have 112 tests annotated as such.
> >>> >
> >>> > Let's start with some definitions? What is our threshold for how slow
> >>> > is @Slow? Obviously this will vary from machine to machine, but maybe
> >>> let's
> >>> > say that anything under 10s on my 2017 iMac Pro is fast and anything
> >>> longer
> >>> > is slow? Arbitrary, and I reserve the right to move this later if I
> >>> feel
> >>> > there's a better cut off.
> >>> >
> >>> > So maybe some tests get a new lease on life by being unlabelled.
> Maybe
> >>> > some other ones get fixed (reducing data size is one idea...)
> >>> >
> >>> > Some tests are slow because we have distributed systems and
> >>> > propagation delay and lots of gross sleeps and waits, and I don't
> want
> >>> to
> >>> > touch those. Maybe those become Nightlies.
> >>> >
> >>> > Are there other approaches? What do folks want to do to move us
> >>> forward?
> >>> >
> >>> > Mike
> >>> >
> >>>
> >>
>


Finding an updated Java base Docker image for Solr 8

2022-07-26 Thread David Smiley
FYI https://github.com/docker-library/docs/pull/2162#issuecomment-1194542898

Essentially, the "openjdk" image isn't maintained anymore, and we ought to
pick an alternative.  There are support implications because image
suppliers/variants have OS or other quirks as to what is included.  I
haven't researched the alternatives yet but compatibility is paramount
because Solr 8 is the previous release and we wouldn't want to do anything
disruptive.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley

