Re: NOTICE: Tests failing on branch_9x

2024-10-18 Thread Eric Pugh
I have time this evening (US Eastern time) from 7 pm until about midnight to 
take a fresh stab at sorting this out.




> On Oct 18, 2024, at 7:29 AM, Jan Høydahl  wrote:
> 
> Hi,
> 
> Just a heads up that all tests on branch_9x are currently failing. See 
> https://ci-builds.apache.org/job/Solr/job/Solr-Check-9.x/lastBuild/
> The consistently failing test is 
> "SolrCloudExampleTest.testLoadDocsIntoGettingStartedCollection"
> 
> The first run in which they started failing was 
> https://ci-builds.apache.org/job/Solr/job/Solr-Check-9.x/6215/
> 
> * SOLR-17489: CLI: Deprecate variations on solr urls options (#2756)
> * Support backport of SOLR-17489
> * Fix tests to pass in branch_9x
> 
> Eric and I have been looking into it a bit in 
> https://github.com/apache/solr/pull/2778 but the baseUrl stuff is quite 
> convoluted...
> 
> Jan





Re: Welcome Christos Malliaridis as Solr Committer

2024-10-18 Thread Gus Heck
Congratulations :) Welcome!

On Fri, Oct 18, 2024 at 8:21 PM Houston Putman  wrote:

> Welcome Christos!
>
> - Houston
>
> On Fri, Oct 18, 2024 at 5:54 PM David Smiley  wrote:
>
> > The Project Management Committee (PMC) for Apache Solr has invited Christos
> > Malliaridis to become a committer and we are pleased to announce that
> > Christos has accepted.
> >
> > Christos, the tradition is that new committers introduce themselves with a
> > brief bio.
> >
> > Congratulations and welcome!
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
>


-- 
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)


Re: Welcome Christos Malliaridis as Solr Committer

2024-10-18 Thread sanjay dutt
Yay! Congratulations and Welcome Christos.

Sanjay

On Sat., Oct. 19, 2024, 5:51 a.m. Houston Putman wrote:

> Welcome Christos!
>
> - Houston
>
> On Fri, Oct 18, 2024 at 5:54 PM David Smiley  wrote:
>
> > The Project Management Committee (PMC) for Apache Solr has invited Christos
> > Malliaridis to become a committer and we are pleased to announce that
> > Christos has accepted.
> >
> > Christos, the tradition is that new committers introduce themselves with a
> > brief bio.
> >
> > Congratulations and welcome!
> >
> > ~ David Smiley
> > Apache Lucene/Solr Search Developer
> > http://www.linkedin.com/in/davidwsmiley
> >
>


Missing Docs for the cross-dc module?

2024-10-18 Thread David Eric Pugh
I noticed the README.md is empty. I did spot the 
https://github.com/apache/solr-sandbox/blob/main/CROSSDC.md document, which 
looks like it could be the basis... It would be nice to have this potentially 
really useful feature properly documented!
Eric


Re: Welcome Christos Malliaridis as Solr Committer

2024-10-18 Thread Houston Putman
Welcome Christos!

- Houston

On Fri, Oct 18, 2024 at 5:54 PM David Smiley  wrote:

> The Project Management Committee (PMC) for Apache Solr has invited Christos
> Malliaridis to become a committer and we are pleased to announce that
> Christos has accepted.
>
> Christos, the tradition is that new committers introduce themselves with a
> brief bio.
>
> Congratulations and welcome!
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>


Re: Missing Docs for the cross-dc module?

2024-10-18 Thread Houston Putman
Hmmm that is not great. I think I converted that MD file into a ref guide
page instead. But the module should have a Readme for sure.

- Houston

On Fri, Oct 18, 2024 at 7:10 PM David Eric Pugh wrote:

> I noticed the README.md is empty. I did spot the
> https://github.com/apache/solr-sandbox/blob/main/CROSSDC.md document that
> looks like it could be the basis... Would be nice to have this
> potentially really useful feature properly documented!
> Eric
>


Welcome Christos Malliaridis as Solr Committer

2024-10-18 Thread David Smiley
The Project Management Committee (PMC) for Apache Solr has invited Christos
Malliaridis to become a committer and we are pleased to announce that
Christos has accepted.

Christos, the tradition is that new committers introduce themselves with a
brief bio.

Congratulations and welcome!

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Missing Docs for the cross-dc module?

2024-10-18 Thread David Smiley
We needn't repeat ourselves.  It's adequate to have a README with a brief
summary and then point to the ref guide for further info.

On Fri, Oct 18, 2024 at 8:21 PM Houston Putman  wrote:

> Hmmm that is not great. I think I converted that MD file into a ref guide
> page instead. But the module should have a Readme for sure.
>
> - Houston
>
> On Fri, Oct 18, 2024 at 7:10 PM David Eric Pugh wrote:
>
> > I noticed the README.md is empty. I did spot the
> > https://github.com/apache/solr-sandbox/blob/main/CROSSDC.md document that
> > looks like it could be the basis... Would be nice to have this
> > potentially really useful feature properly documented!
> > Eric
> >
>


Proposed 10x blocker

2024-10-18 Thread Gus Heck
https://issues.apache.org/jira/browse/SOLR-17503

This is the stickiest bit in our lagging dependencies, but also one of the
most irritating from a user perspective IMHO.

-- 
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)


NOTICE: Tests failing on branch_9x

2024-10-18 Thread Jan Høydahl
Hi,

Just a heads up that all tests on branch_9x are currently failing. See 
https://ci-builds.apache.org/job/Solr/job/Solr-Check-9.x/lastBuild/
The consistently failing test is 
"SolrCloudExampleTest.testLoadDocsIntoGettingStartedCollection"

The first run in which they started failing was 
https://ci-builds.apache.org/job/Solr/job/Solr-Check-9.x/6215/

* SOLR-17489: CLI: Deprecate variations on solr urls options (#2756)
* Support backport of SOLR-17489
* Fix tests to pass in branch_9x

Eric and I have been looking into it a bit in 
https://github.com/apache/solr/pull/2778 but the baseUrl stuff is quite 
convoluted...

Jan


Moving away from Zookeeper in SolrJ

2024-10-18 Thread David Smiley
I strongly believe that we need to get ZooKeeper out of our clients (that
use CloudSolrClient), and use Solr URLs (HTTP) for the cluster state
instead.  I'm arguing to make this strategic direction clear, and we're
already going in the right direction.  Realistically, I don't think
solrj-zookeeper should be eliminated as it exists for Solr 10 but I could
see doing so eventually (no rush!).  Starting with Solr 9.8, I'd like users
to start using the Solr HTTP alternative option, encouraged by the release
notes.  In Solr 10 we can remove any documentation in the ref guide on
CloudSolrClient working with ZooKeeper.  Javadocs in
CloudSolrClient.Builder can recommend Solr URLs instead of the ZooKeeper
option.  I don't have a strong opinion on exactly when to deprecate it.
Today is too soon.

Why:

   - Principled — ZooKeeper is conceptually behind Solr; clients shouldn’t
   talk to it.
   - Fewer dependencies for clients (no ZooKeeper or Netty).
   - Better security — only Solr should talk to ZooKeeper!  Security
   settings and key configuration files are stored in ZooKeeper.
   - Eliminate impact of ZK storage on clients.  The change of where the
   configSet name was stored in ZK is an example.  PRS is another.  And
   other changes I’ve seen in a fork.
   - Reduce complexity of SolrJ from an operational standpoint and bug
   risks (e.g. no ZkStateReader there).  No ZooKeeper-related configuration
   (jute.maxbuffer, etc.).
   - Reduce complexity of SolrCloud by limiting the range of use of key
   classes like ZkStateReader to only be in Solr instead of also existing in
   SolrJ.  For example it’s not clear if/when LazyCollectionRef’s are used
   in SolrJ but with this separation, it’d be clearer that it couldn’t exist
   in SolrJ.
   - Increase our options for classes in solrj-zookeeper, like adding more
   dependencies (traces & metrics) without concern of burdening any
   user/client.
   - Reliably working with a collection after collection creation.  If
   you’ve seen waitForActiveCollection after creating a collection in our
   tests, this is what I mean (and it’s not strictly a test issue).  It's
   sad; make them go away!

Progress has been made on the alternative:  Ishan & Noble got the ball
rolling years ago to introduce the HTTP alternative option.  I call it
HttpCSP internally based on an abbreviation of its class name.  But I don't
think anyone actually uses it based on how poorly it performed, as reported
in JIRA.  In Solr 9.1, SolrJ was modularized, creating the
"solrj-zookeeper" module (opt-out), and made opt-in for Solr 10.  Finally,
key performance improvements landed in Solr 9.8 for the HTTP option making
it viable for most users (IMO).  Credit to my colleagues Haythem & Aparna
on some of these.


That said, HttpCSP (and CloudSolrClient actually) hasn't reached its ideal
state yet.  Some improvement possibilities / problems:

   - The cached DocCollection (i.e. a collection's state) expires out of a
   cache with a hard-coded TTL, even if it’s actively being used.  I don’t
   think it should.  It’d lead to poor p99 client-experienced request
   metrics for those requests that additionally have to fetch the
   DocCollection, and avoidably so.
   - There’s a DocCollection version staleness mechanism but IMO it’s not
   robust.
   - If all live nodes disappear temporarily (hard cluster restart), I
   could imagine the client failing permanently.  (credit to Ilan)
   - CloudSolrClient.getClusterState (and its equivalent method on the
   provider) goes from a trivial getter to a slow remote call fetching the
   entire cluster’s state; no cache.  We have code using it in various places;
   surely users too.  This class has issues (out of scope of this post), so
   I want to deprecate this so that the client never touches ClusterState.
   Getting live-nodes, DocCollection, and cluster properties are still
   accessible though.

The last one, basically banning ClusterState in SolrJ, is the biggest
performance trap / issue that needs to be prioritized; I plan to create a
JIRA or two.
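
To make the first item in the list above (the hard-coded TTL) concrete, here
is a minimal Python sketch. It is illustrative only and is not SolrJ code:
the names TtlCache and simulate, the 60-second TTL, and the "gettingstarted"
collection name are all assumptions made for the example. It contrasts a
fixed TTL, where an actively used entry still expires and some request pays
for the refetch, with an expiry that is pushed forward on every access.

```python
class TtlCache:
    """Tiny cache keyed by collection name (hypothetical, not a SolrJ class).

    `loader` stands in for the slow remote fetch of a collection's state.
    """

    def __init__(self, loader, ttl_seconds, clock, sliding=False):
        self.loader = loader
        self.ttl = ttl_seconds
        self.clock = clock      # injectable clock so behaviour can be tested
        self.sliding = sliding  # True: every hit pushes the expiry forward
        self.entries = {}       # name -> (value, expires_at)
        self.loads = 0          # number of slow fetches performed

    def get(self, name):
        now = self.clock()
        entry = self.entries.get(name)
        if entry is not None and now < entry[1]:
            value = entry[0]
            if self.sliding:
                # Active use keeps the entry alive.
                self.entries[name] = (value, now + self.ttl)
            return value
        # Miss or expired: this caller pays for the fetch.
        self.loads += 1
        value = self.loader(name)
        self.entries[name] = (value, now + self.ttl)
        return value


def simulate(sliding):
    """One request per second for five minutes against a 60-second TTL."""
    now = [0.0]
    cache = TtlCache(lambda name: {"name": name}, ttl_seconds=60,
                     clock=lambda: now[0], sliding=sliding)
    for _ in range(300):
        cache.get("gettingstarted")
        now[0] += 1.0
    return cache.loads


print("fixed TTL fetches:  ", simulate(sliding=False))
print("sliding TTL fetches:", simulate(sliding=True))
```

With a sliding expiry, an entry in constant use is never refreshed by the
cache itself, so detecting stale state would have to rely on some other
mechanism, such as the version staleness check mentioned in the list.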

I suppose I could make a SIP out of this... albeit maybe the time for that
was years ago when HttpCSP came into existence.  I'm just trying to see
this through to a conclusion.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


Re: Problem with GeoJSON field in SOLR 8.11.1

2024-10-18 Thread Volk, Beatrycze
Hi,

thank you very much for the answer!

Unfortunately, changing the order of elements in the JSON object is not an 
option for us, as they are provided by external clients and we have no 
influence on the file contents.

Do you happen to know exactly how our fieldType configuration should look in 
order to use the suggested workaround with a different GeoJSON reader?

Best regards,
Beatrycze

From: David Smiley 
Sent: Tuesday, 15 October 2024 00:00
To: dev@solr.apache.org; Volk, Beatrycze 
Subject: Re: Problem with GeoJSON field in SOLR 8.11.1

Hi,

This is a known limitation, inside a dependency of Solr's: 
https://github.com/locationtech/spatial4j/issues/227
But as indicated in that issue, you might be able to work-around by using the 
JTS spatial context, which uses a different GeoJSON Reader.  I think, anyway; 
it's been a long time.

~ David

On Mon, Oct 14, 2024 at 8:58 AM Volk, Beatrycze 
<beatrycze.v...@slub-dresden.de> wrote:
Hi,

I am not sure if this is the right mailing list, but I have a problem with 
indexing GeoJSON. It is not exactly a configuration problem, but more a 
question about the implementation.

I am trying to index this GeoJSON document:
{
   "type":"FeatureCollection",
   "features":[
      {
         "type":"Feature",
         "properties":{},
         "geometry":{
            "coordinates":[
               [
                  [12.979074171786522, 51.35455355187656],
                  [12.742678165549023, 51.30187234125975],
                  [12.614561448882142, 51.23774698707524],
                  [12.610428651570203, 51.195810322507896],
                  [12.946838352754384, 51.19322038765756],
                  [13.019575585442539, 51.24136935014246],
                  [13.07247539103443, 51.26361475158893],
                  [13.193979632002566, 51.26154586591673],
                  [13.190673394153208, 51.30445616429853],
                  [12.980727290711684, 51.31840629529299],
                  [12.979074171786522, 51.35455355187656]
               ]
            ],
            "type":"Polygon"
         }
      }
   ]
}

My field is defined as follows:


The error which I am getting:
Apache Solr threw exception: "Solr HTTP error: Bad Request (400)
{
  "responseHeader":{
"status":400,
"QTime":2305},
  "error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","java.text.ParseException"],
"msg":"ERROR: [doc=20452uuid-6613d542-b888-44b3-8496-91283a8267e2] Error 
adding field 
'geom'='{\"type\":\"Feature\",\"properties\":{},\"geometry\":{\"coordinates\":[[[12.979074171786522,51.35455355187656],[12.742678165549023,51.30187234125975],[12.614561448882142,51.23774698707524],[12.610428651570203,51.195810322507896],[12.946838352754384,51.19322038765756],[13.019575585442539,51.24136935014246],[13.07247539103443,51.26361475158893],[13.193979632002566,51.26154586591673],[13.190673394153208,51.30445616429853],[12.980727290711684,51.31840629529299],[12.979074171786522,51.35455355187656]]],\"type\":\"Polygon\"}}'
 msg=Unable to parse shape given formats \"lat,lon\", \"x y\" or as GeoJSON 
because java.text.ParseException: Unable to make shape type: Feature input: 
{\"type\":\"Feature\",\"properties\":{},\"geometry\":{\"coordinates\":[[[12.979074171786522,51.35455355187656],[12.742678165549023,51.30187234125975],[12.614561448882142,51.23774698707524],[12.610428651570203,51.195810322507896],[12.946838352754384,51.19322038765756],[13.019575585442539,51.24136935014246],[13.07247539103443,51.26361475158893],[13.193979632002566,51.26154586591673],[13.190673394153208,51.30445616429853],[12.980727290711684,51.31840629529299],[12.979074171786522,51.35455355187656]]],\"type\":\"Polygon\"}}",
"code":400}}

Basically, after some tests I was able to confirm that the problem is caused by 
the ordering of the type elements in the JSON. If the type "Polygon" is placed 
first, the document is indexed correctly. But the ordering of the elements 
should not matter here, as the GeoJSON validator accepts the order of the 
fields in the JSON presented above. The documents are generated automatically 
by GeoJSON-generating software.
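
As an illustration of the reordering described above, here is a minimal Python 
sketch. It assumes a preprocessing step is possible before documents are sent 
to Solr; the function names type_first and reorder_geojson_text are 
hypothetical and are not part of Solr or of any GeoJSON library. Whether 
putting "type" first avoids the parse error on a given Solr version is only 
what the tests described above suggest.

```python
import json


def type_first(node):
    """Return a copy of a parsed JSON value with "type" moved to the front
    of every object, at any nesting depth. All other keys keep their order."""
    if isinstance(node, dict):
        reordered = {}
        if "type" in node:
            reordered["type"] = type_first(node["type"])
        for key, value in node.items():
            if key != "type":
                reordered[key] = type_first(value)
        return reordered
    if isinstance(node, list):
        return [type_first(item) for item in node]
    return node


def reorder_geojson_text(text):
    """Parse GeoJSON text, move "type" first, and serialize it compactly.

    Python dicts preserve insertion order, so the serialized output follows
    the key order built by type_first.
    """
    return json.dumps(type_first(json.loads(text)), separators=(",", ":"))


if __name__ == "__main__":
    sample = '{"coordinates":[12.97,51.35],"type":"Point"}'
    print(reorder_geojson_text(sample))
```

Only the key order changes; the parsed content is identical, so the result is 
the same GeoJSON as far as a validator is concerned.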

I would like to ask if maybe this problem was already addressed in some of the 
newer SOLR version. If not, are there ar