Re: Future of the cross-dc work

2023-09-11 Thread Mark Miller
FYI, there are two branches that we have used. The crossdc-wip branch has a fully single-threaded polling loop and batches updates to the queue, and those batches are then also batched by the queue. This branch has been used on clusters of billion-document scale and in a production environment f

Re: Future of the cross-dc work

2023-09-11 Thread Mark Miller
The only real complexity around it is properly dealing with the queue in a large-scale production environment, and none of that is code complexity. CrossDC is a critical feature for many; the problem with the previous iteration was that it tried to be the queuing system and was obviously never going to

Re: Future of the cross-dc work

2023-09-10 Thread Andrzej Białecki
I can assure you that this project is interesting to other parties, too :) I’ve been spending some time testing it and making improvements, and it seems to work quite well. Removal of CDCR left a gap that users will try to fill in various haphazard ways. The impact of this hasn’t been fully fel

Re: Future of the cross-dc work

2023-09-09 Thread Mark Miller
It has integration tests (Kafka has an embedded version for tests), just no CI setup currently. If it comes into Solr, it will just pick up Solr’s CI. The design can work with any queuing system, but due to the various intricacies involved in the different queue implementations, adding support is b

Re: Future of the cross-dc work

2023-09-09 Thread Ishan Chattopadhyaya
Hi Anshum, I mostly agree with you. My point is that we shouldn't *release* it until the broader community has had a chance to weigh in, both development and testing. Towards that, I like Jan's idea of using PRs etc. for more code reviews, before merging it to some official branch. I'm not comfortable

Re: Future of the cross-dc work

2023-09-09 Thread Jan Høydahl
Hi, There has for sure been interest since CDCR disappeared. Given the close integration with Solr and its small footprint, I’m supportive of this becoming a Solr module with URP and a standalone app in the main repo. So the code can be cleaned up, documented and migrated over in a few PRs as a

Re: Future of the cross-dc work

2023-09-09 Thread Anshum Gupta
I wouldn't term people who work at a particular company as committers. I'm reasonably sure when folks contribute to the project they wear their ASF hat. If they don't, they should. Like most other features, this one also is a result of the need by users of the project. I completely agree it's bee

Re: Future of the cross-dc work

2023-09-09 Thread Ishan Chattopadhyaya
So far, this project has been an experimental project, mainly something that's being used and developed by Apple committers. I would be hesitant to support an official Apache release for the same without testing or interest by the broader community. Towards that, can we invite community members to

Re: Future of the cross-dc work

2023-09-08 Thread Mark Miller
I think the main motivation would be cost savings. The main thing I like about keeping it separate is the ability to have an independent release cycle. I initially preferred a separation due to that. But the cost for what it actually is, is high. It essentially consists of two fairly simple part

Re: Future of the cross-dc work

2023-09-08 Thread Houston Putman
I always assumed after the CDCR stuff was removed from Solr, the idea was to provide a better first-party solution one day. To me, first party means not "experimental" or "sandbox", so it makes sense to live in the main Solr repo. In general, I agree with Eric, nothing should "live" in the solr-sa

Re: Future of the cross-dc work

2023-09-08 Thread Eric Pugh
My perspective of Solr-sandbox is that it’s an area for ideas to be worked on, but with no real promises. They might be abandoned at any moment, or have issues. No real expectation of docs or any kind of support. It’s meant for solr committers to collaborate with other solr committers on new t

Re: Future of the cross-dc work

2023-09-08 Thread David Smiley
Hi Houston, Can you please elaborate on the purpose: "and moving it into Solr would allow others to use and collaborate on it easier." How is that? I am guessing another motivation may be visibility / awareness. If that is a motivation, I think that can be addressed with prominent references in