Hi Eugen,
After reviewing the code, I don't think it is limited to the official
'stretch' mode. Hopefully the devs can confirm that.
Now, I'm wondering how rados_replica_read_policy compares to
rbd_read_from_replica_policy. Do they work in exactly the same way, with
rados_replica_read_policy limited to librados clients (e.g. RGW) and
rbd_read_from_replica_policy limited to RBD clients (krbd, librbd)?
In any case, it seems that rados_replica_read_policy = localize might require
the same crush_location to be set on the client side, just like
rbd_read_from_replica_policy. See 'man 8 rbd' or [1]:
crush_location=x - Specify the location of the client in terms of CRUSH
hierarchy (since 5.8). This is a set of key-value pairs separated from each
other by '|', with keys separated from values by ':'. Note that '|' may need to
be quoted or escaped to avoid it being interpreted as a pipe by the shell. The
key is the bucket type name (e.g. rack, datacenter or region with default
bucket types) and the value is the bucket name. For example, to indicate that
the client is local to rack "myrack", data center "mydc" and region "myregion":
crush_location=rack:myrack|datacenter:mydc|region:myregion
Each key-value pair stands on its own: "myrack" doesn't need to reside in
"mydc", which in turn doesn't need to reside in "myregion". The location is not
a path to the root of the hierarchy but rather a set of nodes that are matched
independently, owing to the fact that bucket names are unique within a CRUSH
map. "Multipath" locations are supported, so it is possible to indicate
locality for multiple parallel hierarchies:
crush_location=rack:myrack1|rack:myrack2|datacenter:mydc
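For what it's worth, the format above is trivial to parse. A minimal sketch
(the function name is mine, not anything from librbd) that turns a
crush_location string into (bucket type, bucket name) pairs, keeping multipath
entries:

```python
def parse_crush_location(s):
    """Split a crush_location string ('type:name|type:name|...') into
    (bucket_type, bucket_name) pairs. Multipath entries (the same bucket
    type appearing more than once) are legal, so a list is returned
    rather than a dict."""
    pairs = []
    for item in s.split('|'):
        bucket_type, _, bucket_name = item.partition(':')
        pairs.append((bucket_type, bucket_name))
    return pairs

# The multipath example from the man page:
print(parse_crush_location('rack:myrack1|rack:myrack2|datacenter:mydc'))
# [('rack', 'myrack1'), ('rack', 'myrack2'), ('datacenter', 'mydc')]
```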
If you happen to test rados_replica_read_policy = localize, let us know how it
works. ;-)
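In case it helps anyone testing, the client side would look something like the
sketch below. The pool, image and bucket names are placeholders, and since
rados_replica_read_policy is the upcoming Tentacle option, that line is
speculative:

```shell
# Client-side ceph.conf fragment (read by librbd/librados clients):
#   [client]
#   crush_location = rack:myrack|datacenter:mydc
#   rbd_read_from_replica_policy = localize
#   # once Tentacle lands, presumably also:
#   # rados_replica_read_policy = localize

# krbd takes the equivalent settings as map options; note the quoting so
# the shell does not interpret '|' as a pipe:
rbd map testpool/testimage \
    -o 'read_from_replica=localize,crush_location=rack:myrack|datacenter:mydc'
```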
Cheers,
Frédéric.
[1] https://github.com/ceph/ceph/blob/main/doc/man/8/rbd.rst
----- On 13 Jun 25, at 10:56, Eugen Block [email protected] wrote:
> And a follow-up question:
> The description only states:
>
>> If set to ``localize``, read operations will be sent to the closest
>> OSD as determined by the CRUSH map.
>
> But how does the client determine where the nearest OSD is? Will there
> be some sort of score similar to the MON connection score? I'd
> appreciate any insights.
>
> Quoting Eugen Block <[email protected]>:
>
>> Hi *,
>>
>> I have a question regarding the upcoming feature to optimize read
>> performance [0] by reading from the nearest OSD, especially in a
>> stretch cluster across two sites (or more). Anthony pointed me to
>> [1], looks like a new config option will be introduced in Tentacle:
>>
>> rados_replica_read_policy
>>
>> Will this config option be limited to the "official" stretch mode?
>> Or will it be possible to utilize it independent of the cluster
>> layout?
>>
>> Thanks!
>> Eugen
>>
>> [0] https://ceph.io/en/news/blog/2025/stretch-cluuuuuuuuusters-part2/
>> [1]
>> https://github.com/ceph/ceph/blob/d28e5fe890016235e302122f955fc910c96f2d43/src/common/options/global.yaml.in#L6504
>
>
> _______________________________________________
> ceph-users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]