danielcweeks commented on PR #15428:
URL: https://github.com/apache/iceberg/pull/15428#issuecomment-4014002209
> > The only two requests that should be cached are HEAD and GET.
>
> But this rule isn't _enforced_. Nothing in the specs prevents a server
> from sending a `Cache-Control: private` header for other methods. And by doing
> so, it would break the client. I'm not saying it makes sense to do so, I'm
> saying that it's not fair for a server to break the client so easily. If the
> client's cache is designed to only handle these 2 methods and nothing else, I
> think the client should make sure to filter out other methods. It seems a bit
> pointless to me to require servers to send the `Cache-Control` header when
> the client already knows what requests it can and cannot cache.

I believe the default is that the client doesn't cache unless told to do so,
which makes caching a server responsibility. While it might make sense to
limit caching to just the two methods we expect, it shouldn't be the client's
responsibility to fix a bad server implementation. Yes, the client would
break, but it's really the server that needs to be fixed.
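
To make the two positions concrete, here is a minimal sketch of a client-side check. It assumes the server opts in with `Cache-Control: private`, as discussed above; `should_cache`, `CACHEABLE_METHODS`, and `restrict_methods` are hypothetical names for illustration, not the Iceberg client API.

```python
from typing import Mapping

# Hypothetical names for illustration; this is not the Iceberg client API.
CACHEABLE_METHODS = frozenset({"GET", "HEAD"})


def should_cache(
    method: str,
    response_headers: Mapping[str, str],
    restrict_methods: bool = True,
) -> bool:
    """Return True only if the server opted in to caching via Cache-Control."""
    # Header names are case-insensitive, so normalize them before the lookup.
    headers = {name.lower(): value for name, value in response_headers.items()}
    # Split "private, max-age=60" into directive names: {"private", "max-age"}.
    directives = {
        part.strip().lower().split("=", 1)[0]
        for part in headers.get("cache-control", "").split(",")
        if part.strip()
    }
    # Default: without an explicit opt-in from the server, nothing is cached.
    if "private" not in directives or "no-store" in directives:
        return False
    # Optional defensive filter: ignore the opt-in for unexpected methods.
    if restrict_methods and method.upper() not in CACHEABLE_METHODS:
        return False
    return True
```

With `restrict_methods=False` the decision rests entirely on the server's header, which is the "server responsibility" position. The default adds the method filter suggested in the quoted comment.
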
> Yes but in fact, the most problematic scenario for me is a `GET` request
> with a `range` header. If a server decides to sign the `range` header (which is
> imho totally valid), the client would break. The prevailing philosophy is that
> "the server decides what to sign," but in reality, the server's control
> appears limited due to potential client-side cache issues. Again, it appears to
> me that, if the client already knows that it would break if the server signs
> some header, it's best for the client to proactively remove that header from
> the request to sign.

I think that's putting too much control in the client's hands and limits the
server's ability to decide what to sign. If a client "hides"
the range header, the server would only have the option to sign for everything or
nothing. While in practice I don't know of any implementation that protects
ranges of files, it is entirely feasible, and since it's the server's
responsibility to protect the data, it should have the final say on what it
allows to be read.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]