attilakreiner commented on code in PR #10681: URL: https://github.com/apache/iceberg/pull/10681#discussion_r1678355098
########## .baseline/checkstyle/checkstyle.xml: ########## @@ -281,16 +280,13 @@ </module> <module name="InnerAssignment"/> <!-- Java Coding Guidelines: Inner assignments: Not used --> <module name="MemberName"> <!-- Java Style Guide: Non-constant field names --> - <property name="format" value="^[a-z][a-zA-Z0-9]+$"/> - <message key="name.invalidPattern" value="Member name ''{0}'' must match pattern ''{1}''."/> + <property name="format" value="^[a-z][a-zA-Z0-9]++$"/> Review Comment: The `*+` and `++` are _possessive_ quantifiers (as opposed to `*` and `+` which are _greedy_; and `*?` and `+?` which are _lazy_). I found the explanation on this page very useful: https://www.rexegg.com/regex-quantifiers.php#possessive My understanding is that in cases when we can use either _greedy_ and _possessive_ quantifiers, _possessive_ are faster. Quote form the link above: > This behavior is particularly useful when you know there is no valid reason why the engine should ever backtrack into a section of matched text, as you can save the engine a lot of needless work. This seems a common regex feature, but honestly speaking I wasn't familiar with in until @findepi pointed it out (thx for that) in the review of #10673. I understand using _possessive_ quantifiers is good for performance, OTOH it's possible that many people are unfamiliar with this syntax, so it may be a tradeoff in terms of readability (and also a deviation from checkstyle standards). WDYT? ########## .baseline/checkstyle/checkstyle.xml: ########## @@ -281,16 +280,13 @@ </module> <module name="InnerAssignment"/> <!-- Java Coding Guidelines: Inner assignments: Not used --> <module name="MemberName"> <!-- Java Style Guide: Non-constant field names --> - <property name="format" value="^[a-z][a-zA-Z0-9]+$"/> - <message key="name.invalidPattern" value="Member name ''{0}'' must match pattern ''{1}''."/> + <property name="format" value="^[a-z][a-zA-Z0-9]++$"/> Review Comment: The `*+` and `++` are _possessive_ quantifiers (as opposed to `*` and `+` which are _greedy_; and `*?` and `+?` which are _lazy_). I found the explanation on this page very useful: https://www.rexegg.com/regex-quantifiers.php#possessive My understanding is that in cases when we can use either _greedy_ or _possessive_ quantifiers, _possessive_ are faster. Quote form the link above: > This behavior is particularly useful when you know there is no valid reason why the engine should ever backtrack into a section of matched text, as you can save the engine a lot of needless work. This seems a common regex feature, but honestly speaking I wasn't familiar with in until @findepi pointed it out (thx for that) in the review of #10673. I understand using _possessive_ quantifiers is good for performance, OTOH it's possible that many people are unfamiliar with this syntax, so it may be a tradeoff in terms of readability (and also a deviation from checkstyle standards). WDYT? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org