Hi Brent,

Perhaps the wording would be better as "boundary from non-uppercase character to uppercase character", i.e., numbers and emoji are treated the same as lowercase characters and stay attached to the word they appear in. The following are [unit test cases from the associated PR](https://github.com/apple/swift/pull/12779/files#diff-26b09c16508c21f9f59dcf6c7a41d4b4R422), which should indicate the behavior implemented here:

```swift
let toSnakeCaseTests = [
  ("simpleOneTwo", "simple_one_two"),
  ("myURL", "my_url"),
  ("singleCharacterAtEndX", "single_character_at_end_x"),
  ("thisIsAnXMLProperty", "this_is_an_xml_property"),
  ("single", "single"), // no underscore
  ("", ""), // don't die on empty string
  ("a", "a"), // single character
  ("aA", "a_a"), // two characters
  ("version4Thing", "version4_thing"), // numerics
  ("partCAPS", "part_caps"), // only insert underscore before first all caps
  ("partCAPSLowerAGAIN", "part_caps_lower_again"), // switch back and forth caps.
  ("manyWordsInThisThing", "many_words_in_this_thing"), // simple lowercase underscore more
  ("asdfĆqer", "asdf_ćqer"),
  ("already_snake_case", "already_snake_case"),
  ("dataPoint22", "data_point22"),
  ("dataPoint22Word", "data_point22_word"),
  ("_oneTwoThree", "_one_two_three"),
  ("oneTwoThree_", "one_two_three_"),
  ("__oneTwoThree", "__one_two_three"),
  ("oneTwoThree__", "one_two_three__"),
  ("_oneTwoThree_", "_one_two_three_"),
  ("__oneTwoThree", "__one_two_three"),
  ("__oneTwoThree__", "__one_two_three__"),
  ("_test", "_test"),
  ("_test_", "_test_"),
  ("__test", "__test"),
  ("test__", "test__"),
("m͉̟̹y̦̳G͍͚͎̳r̤͉̤͕ͅea̲͕t͇̥̼͖U͇̝̠R͙̻̥͓̣L̥̖͎͓̪̫ͅR̩͖̩eq͈͓u̞e̱s̙t̤̺ͅ", "m͉̟̹y̦̳_g͍͚͎̳r̤͉̤͕ͅea̲͕t͇̥̼͖_u͇̝̠r͙̻̥͓̣l̥̖͎͓̪̫ͅ_r̩͖̩eq͈͓u̞e̱s̙t̤̺ͅ"), // because Itai wanted to test this
  ("🐧🐟", "🐧🐟") // fishy emoji example?
]
```
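To make the rule concrete, here is a minimal sketch of the "non-uppercase to uppercase" boundary. It is not the code from the PR: `sketchToSnakeCase` is a hypothetical name, and it assumes Swift 5 or later for `Character.isUppercase` and `Character.isLowercase`. It also ends a run of capitals at the last capital that is followed by a lowercase letter, which is what the `thisIsAnXMLProperty` case suggests.

```swift
/// Illustrative only: a hypothetical helper, not the implementation in the PR.
func sketchToSnakeCase(_ key: String) -> String {
    let chars = Array(key)
    var result = ""
    for (i, ch) in chars.enumerated() {
        if i > 0 && ch.isUppercase {
            let previous = chars[i - 1]
            // True when this capital starts a new word after an acronym,
            // e.g. the "P" in "XMLProperty".
            let nextIsLowercase = i + 1 < chars.count && chars[i + 1].isLowercase
            // Boundary: non-uppercase -> uppercase, or the end of a run of capitals.
            if !previous.isUppercase || nextIsLowercase {
                result.append("_")
            }
        }
        // Digits, emoji and underscores pass through lowercased() unchanged.
        result.append(contentsOf: ch.lowercased())
    }
    return result
}
```

With this sketch, `myURL` comes out as `my_url` and `version4Thing` as `version4_thing`, matching the table above. Leading and trailing underscores need no special handling because only uppercase characters ever trigger an insertion.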

And for completeness, the [complementary test cases](https://github.com/apple/swift/pull/12779/files#diff-26b09c16508c21f9f59dcf6c7a41d4b4R540):

```swift
let fromSnakeCaseTests = [
  ("", ""), // don't die on empty string
  ("a", "a"), // single character
  ("ALLCAPS", "ALLCAPS"), // If no underscores, we leave the word as-is
  ("ALL_CAPS", "allCaps"), // Conversion from screaming snake case
  ("single", "single"), // do not capitalize anything with no underscore
  ("snake_case", "snakeCase"), // capitalize a character
  ("one_two_three", "oneTwoThree"), // more than one word
  ("one_2_three", "one2Three"), // numerics
  ("one2_three", "one2Three"), // numerics, part 2
  ("snake_Ćase", "snakeĆase"), // do not further modify a capitalized diacritic
  ("snake_ćase", "snakeĆase"), // capitalize a diacritic
  ("alreadyCamelCase", "alreadyCamelCase"), // do not modify already camel case
  ("__this_and_that", "__thisAndThat"),
  ("_this_and_that", "_thisAndThat"),
  ("this__and__that", "thisAndThat"),
  ("this_and_that__", "thisAndThat__"),
  ("this_aNd_that", "thisAndThat"),
  ("_one_two_three", "_oneTwoThree"),
  ("one_two_three_", "oneTwoThree_"),
  ("__one_two_three", "__oneTwoThree"),
  ("one_two_three__", "oneTwoThree__"),
  ("_one_two_three_", "_oneTwoThree_"),
  ("__one_two_three", "__oneTwoThree"),
  ("__one_two_three__", "__oneTwoThree__"),
  ("_test", "_test"),
  ("_test_", "_test_"),
  ("__test", "__test"),
  ("test__", "test__"),
  ("_", "_"),
  ("__", "__"),
  ("___", "___"),
("m͉̟̹y̦̳G͍͚͎̳r̤͉̤͕ͅea̲͕t͇̥̼͖U͇̝̠R͙̻̥͓̣L̥̖͎͓̪̫ͅR̩͖̩eq͈͓u̞e̱s̙t̤̺ͅ", "m͉̟̹y̦̳G͍͚͎̳r̤͉̤͕ͅea̲͕t͇̥̼͖U͇̝̠R͙̻̥͓̣L̥̖͎͓̪̫ͅR̩͖̩eq͈͓u̞e̱s̙t̤̺ͅ"), // because Itai wanted to test this
  ("🐧_🐟", "🐧🐟") // fishy emoji example?
]
```
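The reverse direction can be sketched in the same spirit. Again, this is an illustration and not the PR's code: `sketchFromSnakeCase` is a hypothetical name, and the behavior is inferred from the test cases above (leading and trailing underscores preserved, keys without an interior underscore left alone, the first word lowercased, later words capitalized).

```swift
/// Illustrative only: a hypothetical helper, not the implementation in the PR.
func sketchFromSnakeCase(_ key: String) -> String {
    // Leading and trailing underscores are kept verbatim.
    let leading = key.prefix(while: { $0 == "_" })
    // An empty key, or one made only of underscores, is returned unchanged.
    guard leading.count < key.count else { return key }
    let trailingCount = key.reversed().prefix(while: { $0 == "_" }).count
    let core = key.dropFirst(leading.count).dropLast(trailingCount)

    // split drops empty pieces, so "this__and__that" yields three words.
    let words = core.split(separator: "_")
    // No interior underscore: leave the key as-is (it may already be camel case).
    guard words.count > 1 else { return key }

    let head = words[0].lowercased()
    let tail = words.dropFirst().map { word in
        word.prefix(1).uppercased() + word.dropFirst().lowercased()
    }
    let trailing = String(repeating: "_", count: trailingCount)
    return String(leading) + head + tail.joined() + trailing
}
```

For example, `ALL_CAPS` becomes `allCaps` and `this_and_that__` becomes `thisAndThat__`, while `ALLCAPS` and `_test_` are returned untouched.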

— Itai

On 9 Nov 2017, at 5:57, Brent Royal-Gordon via swift-evolution wrote:

> > On Nov 6, 2017, at 12:54 PM, Tony Parker via swift-evolution <[email protected]> wrote:
> >
> > Converting from camel case to snake case:
> >
> > 1. Splits words at the boundary of lower-case to upper-case
> > 2. Inserts `_` between words
> > 3. Lowercases the entire string
> > 4. Preserves starting and ending `_`.
> >
> > For example, `oneTwoThree` becomes `one_two_three`. `_oneTwoThree_` becomes `_one_two_three_`.
>
> My first thought was "are you handling `valueAsHTML` correctly?", but it looks like you are with the "boundary of lower-case to upper-case" wording. But what do you plan to do for numbers? Characters in caseless scripts? Emoji (which are valid in Swift identifiers)? I don't necessarily have strong opinions about the right answer—just want to make sure you do *something* about it.
>
> --
> Brent Royal-Gordon
> Architechies


_______________________________________________
swift-evolution mailing list
[email protected]
https://lists.swift.org/mailman/listinfo/swift-evolution