+Casey — who might have some insight

> On Aug 31, 2022, at 5:45 AM, Alex Hussein-Kershaw (HE/HIM) 
> <[email protected]> wrote:
> 
> Hi Eric, 
> 
> Thanks for your response. Answers below.
> 
> Is it the case that the object does not appear when you list the RGW bucket 
> it was in?
> - The reason our client finds it again is because we do list all the objects 
> in the bucket and cache the keys nightly. 
> - I think it is returned as part of the API call to list all objects in the 
> bucket.
> - I can see the "list objects" operation happening in HAProxy logs.    " 
> cephs3/S3:10.245.0.20 72/0/34/361/503 200 33819 - - ---- 23/13/0/1/0 0/0 
> {lusrebuild} "GET /edin2z6-scsdata/ HTTP/1.1""
> - That's the mechanism that results in us doing a GET on the object directly 
> days after we should have forgotten about it. 
> - It's strange that it takes a few days for this to reproduce - the object 
> doesn't show up in the next nightly list operation, but a subsequent one. 
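> 
> For reference, the nightly listing pass is essentially the following (a 
> sketch - the endpoint is the same one as in my get-bucket-versioning 
> example below; our real client uses the SDK rather than the CLI):
> 
> ```shell
> # Sketch of the nightly "list all objects" pass that rediscovers the key.
> aws s3api --endpoint=http://127.3.3.3:7480 list-objects-v2 \
>     --bucket edin2z6-scsdata --query 'Contents[].Key' --output text
> ```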
> 
> You referred to "one side of my cluster”. Does that imply you’re using 
> multisite?
> - Yes (I think you missed that in the first line of my last email) - we are 
> using multisite 😊 
> 
> We are using bucket versioning for this bucket (note also that the object has 
> a VersionId in my initial email - I believe that would show as 'null' if 
> versioning were disabled - although I appreciate that sharing output from an 
> unfamiliar tool maybe wasn't that helpful / clear):
> [root@edin2z6 edin2z6_sdc] ~> aws s3api --endpoint=http://127.3.3.3:7480 
> get-bucket-versioning --bucket edin2z6-scsdata
> {
>    "Status": "Enabled",
>    "MFADelete": "Disabled"
> }
> 
> We don't actually have a use for bucket versioning (and have turned it off 
> for our deployments since this system was deployed - this is just a remnant). 
> If you think this might be the cause of the problem I can disable it and see 
> if the issue still reproduces. 
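> 
> If it helps, here's roughly what I'd run to suspend versioning and check for 
> lingering versions or delete markers (a sketch using the standard aws s3api 
> commands; note that versioning can only be suspended, never fully disabled, 
> once enabled):
> 
> ```shell
> # Suspend versioning on the bucket.
> aws s3api --endpoint=http://127.3.3.3:7480 put-bucket-versioning \
>     --bucket edin2z6-scsdata --versioning-configuration Status=Suspended
> 
> # List all versions and delete markers for the affected key.
> aws s3api --endpoint=http://127.3.3.3:7480 list-object-versions \
>     --bucket edin2z6-scsdata \
>     --prefix 84/40/20220815042412F3DB2300000018-Subscriber
> ```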
> 
> I'm using the aws s3api CLI tool to get the info I've shared.   
> 
> Thanks for the info regarding the multi-part and tail objects, good to know 
> that this won't be the cause.  
> 
> Kindest regards,
> Alex
> 
> -----Original Message-----
> From: J. Eric Ivancich <[email protected]> 
> Sent: Tuesday, August 30, 2022 5:19 PM
> To: Alex Hussein-Kershaw (HE/HIM) <[email protected]>
> Cc: Ceph Users <[email protected]>
> Subject: [EXTERNAL] Re: [ceph-users] S3 Object Returns Days after Deletion
> 
> A couple of questions, Alex.
> 
> Is it the case that the object does not appear when you list the RGW bucket 
> it was in?
> 
> You referred to "one side of my cluster". Does that imply you're using 
> multisite?
> 
> And just for completeness, this is not a versioned bucket?
> 
> With a size of 6252 bytes, it wouldn’t be a multi-part upload or require tail 
> objects.
> 
> So during a delete the bucket index shard is modified to remove the entry and 
> the head object (which in your case is the only object) is deleted from 
> rados. If there were tail objects, they’d generally get cleaned up over time 
> via RGW’s garbage collection mechanism.
> 
> It should also be noted that the bucket index does not need to be consulted 
> during a GET operation.
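> 
> If you want to compare the two directly, something like the following should 
> work (a sketch - check radosgw-admin(8) on your release for the exact 
> options):
> 
> ```shell
> # What the bucket index shards think exists.
> radosgw-admin bi list --bucket=edin2z6-scsdata
> 
> # Whether the head object itself is still present, index aside.
> radosgw-admin object stat --bucket=edin2z6-scsdata \
>     --object=84/40/20220815042412F3DB2300000018-Subscriber
> ```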
> 
> I looked for the string “SSECustomerAlgorithm” in the ceph source code and 
> couldn’t find it. Which tool is generating your “details about the object”?
> 
> Eric
> (he/him)
> 
>> On Aug 30, 2022, at 4:35 AM, Alex Hussein-Kershaw (HE/HIM) 
>> <[email protected]> wrote:
>> 
>> Hi Ceph-Users,
>> 
>> I'm running Ceph 15.2.13 with RGWs and multisite. I've got some odd S3 
>> object behaviour I'm really baffled by. Hoping to get some debugging advice.
>> 
>> The problem I have is that I delete an object, then attempt some reads of 
>> the deleted object and get hit with 404s (totally reasonable - it no longer 
>> exists, right?). However, a few days later the object seems to be magically 
>> recreated, and a subsequent GET request that I'd expect to return a 404 
>> returns a 200. Looking on the cluster, I can see the object genuinely does 
>> still exist.
>> 
>> I have a single client on my Storage Cluster. It contacts the cluster via 
>> HAProxy; I've pasted some of its logs below showing the described behaviour.
>> 
>> [15/Aug/2022:05:24:40.612] s3proxy cephs3/S3:10.245.0.23 49/0/0/55/104 204 
>> 234 - - ---- 22/15/0/1/0 0/0 {WSD} "DELETE 
>> /edin2z6-scsdata/84/40/20220815042412F3DB2300000018-Subscriber HTTP/1.1"
>> [15/Aug/2022:05:24:40.650] s3proxy cephs3/S3:10.245.0.21 69/0/0/22/107 404 
>> 455 - - ---- 22/15/0/1/0 0/0 {ISSMgr} "GET 
>> /edin2z6-scsdata/84/40/20220815042412F3DB2300000018-Subscriber HTTP/1.1"
>> [15/Aug/2022:05:24:53.549] s3proxy cephs3/S3:10.245.0.20 12/0/0/20/90 404 
>> 455 - - ---- 37/16/2/1/0 0/0 {S3Mgr} "GET 
>> /edin2z6-scsdata/84/40/20220815042412F3DB2300000018-Subscriber HTTP/1.1"
>> [15/Aug/2022:05:24:53.635] s3proxy cephs3/S3:10.245.0.21 0/0/31/17/63 404 
>> 455 - - ---- 37/16/0/1/0 0/0 {S3Mgr} "GET 
>> /edin2z6-scsdata/84/40/20220815042412F3DB2300000018-Subscriber HTTP/1.1"
>> [15/Aug/2022:05:24:53.699] s3proxy cephs3/S3:10.245.0.21 1/0/0/19/35 404 455 
>> - - ---- 37/16/0/1/0 0/0 {S3Mgr} "GET 
>> /edin2z6-scsdata/84/40/20220815042412F3DB2300000018-Subscriber HTTP/1.1"
>> [15/Aug/2022:05:24:53.733] s3proxy cephs3/S3:10.245.0.23 4/0/0/19/39 404 455 
>> - - ---- 38/16/1/1/0 0/0 {ISSMgr} "GET 
>> /edin2z6-scsdata/84/40/20220815042412F3DB2300000018-Subscriber HTTP/1.1"
>> [15/Aug/2022:05:24:53.772] s3proxy cephs3/S3:10.245.0.23 62/0/0/19/98 404 
>> 455 - - ---- 39/16/1/1/0 0/0 {S3Mgr} "GET 
>> /edin2z6-scsdata/84/40/20220815042412F3DB2300000018-Subscriber HTTP/1.1"
>> [15/Aug/2022:05:24:53.871] s3proxy cephs3/S3:10.245.0.23 1/0/0/22/39 404 455 
>> - - ---- 39/16/0/1/0 0/0 {S3Mgr} "GET 
>> /edin2z6-scsdata/84/40/20220815042412F3DB2300000018-Subscriber HTTP/1.1"
>> [15/Aug/2022:05:24:53.899] s3proxy cephs3/S3:10.245.0.20 53/0/0/30/100 404 
>> 455 - - ---- 40/16/0/1/0 0/0 {ISSMgr} "GET 
>> /edin2z6-scsdata/84/40/20220815042412F3DB2300000018-Subscriber HTTP/1.1"
>> [17/Aug/2022:13:43:20.861] s3proxy cephs3/S3:10.245.0.23 699/0/0/33/749 200 
>> 6815 - - ---- 17/12/0/1/0 0/0 {other} "GET 
>> /edin2z6-scsdata/84/40/20220815042412F3DB2300000018-Subscriber HTTP/1.1"
>> 
>> Some details about the object:
>> 
>> {
>>   "AcceptRanges": "bytes",
>>   "ContentType": "binary/octet-stream",
>>   "LastModified": "Mon, 15 Aug 2022 04:24:40 GMT",
>>   "ContentLength": 6252,
>>   "Expires": "Thu, 01 Jan 1970 00:00:00 UTC",
>>   "SSECustomerAlgorithm": "AES256",
>>   "VersionId": "muroZd4apIM6RIkNpSEQfh8ZrYfFJWs",
>>   "ETag": "\"0803e745fbeac3be88e82adf2ef6240b\"",
>>   "SSECustomerKeyMD5": "2b6tFaOW0qSq1FOhX+WgZw==",
>>   "Metadata": {}
>> }
>> 
>> The only logical explanation I could come up with for this object still 
>> existing is that it was recreated (my client shouldn't have done so - but 
>> obviously not impossible). However, the LastModified date above rules this 
>> out, so I think it must be a Ceph thing.
>> 
>> How can the delete succeed, the object be temporarily deleted, and then pop 
>> back into existence?
>> 
>> Not sure if it's relevant, but one side of my cluster reports healthy, while 
>> the other side reports some large omap objects:
>> 
>> $ ceph health detail
>> HEALTH_WARN 5 large omap objects
>> [WRN] LARGE_OMAP_OBJECTS: 5 large omap objects
>>   5 large objects found in pool 'siteB.rgw.buckets.index'
>>   Search the cluster log for 'Large omap object found' for more details.
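>> 
>> For completeness, roughly how I dug into that warning (the log path is an 
>> assumption for my deployment):
>> 
>> ```shell
>> # Find which index objects triggered the large-omap warning.
>> grep 'Large omap object found' /var/log/ceph/ceph.log
>> 
>> # Check per-bucket object counts against the index shard limits.
>> radosgw-admin bucket limit check
>> ```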
>> 
>> Thanks,
>> 
>> Alex Kershaw
>> Software Engineer
>> Office: 01316 500883
>> [email protected]<mailto:[email protected]>
>> 
>> 
> 

_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
