I wanted to follow up and share that I was able to get all of the compression 
logic working.  

See 
https://github.com/apache/solr/pull/2298/commits/ee8657438799c07baadd8efa0f272bb57380d772

I *believe* that we could make the compression logic an option on 
SolrZkClient.Builder(), maybe 
SolrZkClient.Builder().withStateFileCompression(minStateByteLenForCompression, 
compressor)

And that would simplify the code…?
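
A rough, self-contained sketch of that idea might look like the following. It
is not the code in the PR and not Solr's actual API: ToyZkClient, the
Compressor interface, ZlibCompressor, and the in-memory map standing in for
ZooKeeper are all hypothetical names made up for illustration. Only the
withStateFileCompression(minStateByteLenForCompression, compressor) shape comes
from the suggestion above. The "compress when longer than the threshold" rule
and the "path ends with state.json" check are assumptions as well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

class StateCompressionSketch {

  /** Hypothetical stand-in for the "compressor" argument; not Solr's real interface. */
  interface Compressor {
    boolean isCompressed(byte[] data);
    byte[] compress(byte[] data);
    byte[] decompress(byte[] data);
  }

  /** zlib-format compressor built only on java.util.zip. */
  static final class ZlibCompressor implements Compressor {
    public boolean isCompressed(byte[] data) {
      // Heuristic: a zlib stream starts with a 2-byte header whose first byte has
      // low nibble 8 (deflate) and whose 16-bit value is a multiple of 31.
      // JSON starting with '{' (0x7B) never matches; arbitrary bytes could.
      if (data == null || data.length < 2) return false;
      int cmf = data[0] & 0xFF;
      int flg = data[1] & 0xFF;
      return (cmf & 0x0F) == 8 && ((cmf << 8) | flg) % 31 == 0;
    }

    public byte[] compress(byte[] data) {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      try (DeflaterOutputStream zip = new DeflaterOutputStream(out)) {
        zip.write(data);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      return out.toByteArray();
    }

    public byte[] decompress(byte[] data) {
      try (InflaterInputStream in = new InflaterInputStream(new ByteArrayInputStream(data))) {
        return in.readAllBytes(); // Java 9+
      } catch (IOException e) {
        throw new UncheckedIOException("not valid zlib data", e);
      }
    }
  }

  /** Toy client: a HashMap stands in for ZooKeeper. */
  static final class ToyZkClient {
    private final Map<String, byte[]> store = new HashMap<>();
    private final int minStateByteLenForCompression;
    private final Compressor compressor; // null means compression is disabled

    private ToyZkClient(Builder b) {
      this.minStateByteLenForCompression = b.minLen;
      this.compressor = b.compressor;
    }

    static final class Builder {
      private int minLen = -1;
      private Compressor compressor;

      Builder withStateFileCompression(int minStateByteLenForCompression, Compressor compressor) {
        this.minLen = minStateByteLenForCompression;
        this.compressor = compressor;
        return this;
      }

      ToyZkClient build() {
        return new ToyZkClient(this);
      }
    }

    // Write side: the client was told at build time whether and when to compress.
    void setData(String path, byte[] data) {
      boolean compress = compressor != null
          && path.endsWith("state.json")
          && data.length > minStateByteLenForCompression; // assumed threshold rule
      store.put(path, compress ? compressor.compress(data) : data);
    }

    // Read side: the stored bytes themselves reveal whether they are compressed.
    byte[] getData(String path) {
      byte[] raw = store.get(path);
      boolean compressed = compressor != null && compressor.isCompressed(raw);
      return compressed ? compressor.decompress(raw) : raw;
    }

    byte[] rawBytes(String path) {
      return store.get(path);
    }
  }

  public static void main(String[] args) {
    ToyZkClient client = new ToyZkClient.Builder()
        .withStateFileCompression(16, new ZlibCompressor())
        .build();
    String path = "/collections/collection1/state.json";
    byte[] state = "{\"collection1\":{\"shards\":{\"shard1\":{\"replicas\":{}}}}}"
        .getBytes(StandardCharsets.UTF_8);
    client.setData(path, state);
    System.out.println("stored bytes: " + client.rawBytes(path).length);
    System.out.println(new String(client.getData(path), StandardCharsets.UTF_8));
  }
}
```

With something along these lines, a tool like ZkCpTool would only need to
decide how to build the client. setData and getData would then handle
compression symmetrically without the caller reading solr.xml itself.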



> On Mar 14, 2024, at 2:44 PM, Eric Pugh <ep...@opensourceconnections.com> 
> wrote:
> 
> Justin, I went back and crawled through with fresh eyes based on what you 
> shared ;-).   Now I understand that SOLRHOME variable in ZkCLI.
> 
> Okay, so….  It appears that the logic to know if compression is enabled 
> REQUIRES you to have a solr_home, which means the compressed put only works 
> if you are on the same box as the solr_home?
> 
> Unless, can I get it from ZK, so that we don’t have to be on the same box? 
> What if we assume that this property is defined in solr.xml in ZK, and just 
> look there?   Oh wait, Ref Guide says:
> 
> Loading solr.xml from Zookeeper is deprecated, and will not be supported in a 
> future version. Being the node config of Solr, this file must be available at 
> early startup and also be allowed to differ between nodes.
> 
> Do we have an API that provides access to Solr’s solr.xml settings?   
> 
> I *think* this is hard to solve cleanly because the zk sub commands don’t 
> interact with Solr; instead they go directly to ZooKeeper.  If they were 
> mediated by Solr, then the logic about compression choices could remain 
> purely on the server side and not touch the client….
> 
> 
> Eric
>  
> 
>> On Mar 5, 2024, at 9:39 AM, Eric Pugh <ep...@opensourceconnections.com> 
>> wrote:
>> 
>> Thanks for sharing this…..   
>> 
>> So, maybe the least bad option is to just copy the logic, maybe into SolrCLI.java?  
>>  
>> 
>>> On Mar 5, 2024, at 9:29 AM, Justin Sweeney <justin.sweene...@gmail.com> 
>>> wrote:
>>> 
>>> The tricky part of setData as compared to getData is that in getData we can
>>> tell the data is compressed based on the initial bytes read. For setData
>>> the only way to know if we should compress the provided data is if we read
>>> the solr.xml to know if compression is enabled or by adding an argument to
>>> classes like ZkCpTool.
>>> 
>>> Putting an uncompressed state.json into a cluster where compression is used
>>> should still work fine, so at this point I've erred on the side of expecting
>>> people who use these tools outside of the Solr cluster to know whether they
>>> have set up compression when adding data into the cluster. I'm not opposed
>>> to adding more arguments for some classes, but we don't have a way to just
>>> handle compression in setData in the same way we do with getData.
>>> 
>>> On Sat, Mar 2, 2024 at 11:23 AM Eric Pugh <ep...@opensourceconnections.com>
>>> wrote:
>>> 
>>>> Looking at this, when we use the ZkCpTool to upload content, I’m not sure
>>>> it goes through the compression step?
>>>> 
>>>> Looks like eventually we get to SolrZkClient.setData() and I don’t see
>>>> anything about compression..    Unlike the SolrZkClient.getData()…
>>>> 
>>>> Am I reading this right?   Shouldn’t setData mimic getData in handling
>>>> compression?
>>>> 
>>>> Thanks for looking at this!
>>>> 
>>>> 
>>>> 
>>>>> On Feb 29, 2024, at 12:29 PM, Justin Sweeney <justin.sweene...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>> I actually think that use case should just work since the SolrZkClient can
>>>>> already handle compressed state.json, assuming you are just using the
>>>>> default ZLib implementation of compression. When getting data it looks like
>>>>> the ZkCpTool calls SolrZkClient.getData() which is able to check if the
>>>>> data is compressed.
>>>>> 
>>>>> On Thu, Feb 29, 2024 at 8:58 AM Eric Pugh <ep...@opensourceconnections.com>
>>>>> wrote:
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> I am poking around ZkCLI.java, and noticed that the compression logic for
>>>>>> a “state.json” file is in this file.   I’m realizing that the existing
>>>>>> bin/solr zk cp command knows nothing about a “state.json” file being
>>>>>> compressed or not, and so if you do
>>>>>> 
>>>>>>       bin/solr zk cp my_local_state.json zk:/state.json -z localhost:9983
>>>>>> 
>>>>>> Then I think you don’t get the compression aspect kicking in.
>>>>>> 
>>>>>> I could copy that logic into the ZkCpTool.java, but wondering if there is
>>>>>> a better refactoring?   Could this logic live in either SolrZkClient.java
>>>>>> or ZkMaintenanceUtils.java ?
>>>>>> 
>>>>>> Thoughts?
>>>>>> 
>>>>>> Eric
>>>>>> _______________________
>>>>>> Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 |
>>>>>> http://www.opensourceconnections.com | My Free/Busy <http://tinyurl.com/eric-cal>
>>>>>> Co-Author: Apache Solr Enterprise Search Server, 3rd Ed
>>>>>> <https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
>>>>>> 
>>>>>> This e-mail and all contents, including attachments, is considered to be
>>>>>> Company Confidential unless explicitly stated otherwise, regardless of
>>>>>> whether attachments are marked as such.
>>>> 
>> 
>> 
> 
> 

_______________________
Eric Pugh | Founder & CEO | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com <http://www.opensourceconnections.com/> | 
My Free/Busy <http://tinyurl.com/eric-cal>  
Co-Author: Apache Solr Enterprise Search Server, 3rd Ed 
<https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-enterprise-search-server-third-edition-raw>
    
This e-mail and all contents, including attachments, is considered to be 
Company Confidential unless explicitly stated otherwise, regardless of whether 
attachments are marked as such.
