Next plate is (1a) incidental whitespace.
Having decided that we are content with "fat" delimiters (""") for multi-line
strings, we have some more choices to make regarding multi-line strings.
(We're not going to talk about "raw" strings yet; let's finish the multi-line
course first.)
Multi-line strings are different from single-line strings in a number of ways,
so let's get clear on what we want "multi-line" to mean.
Line terminators: When strings span lines, they do so using the line
terminators present in the source file, which may vary depending on the
operating system on which the file was authored. Should this be an aspect of
multi-line-ness, or should we normalize these to a standard line terminator?
It seems a little weird to treat string literals quite so literally; the choice
of line terminator is surely an incidental one. I think we're all comfortable
saying "these should be normalized", but it's worth bringing this up because it
is merely one way in which incidental artifacts of how the string is embedded
in the source program force us to interpret what the user meant. Which brings
us to the next incidental aspect...
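To make "normalized" concrete, here is a minimal sketch of what line
terminator normalization could look like as a library routine. The class and
method names are hypothetical, and this only assumes that \r\n and a lone \r
should both become \n; it is not a statement of what the compiler would do.

```java
class LineTerminators {
    // Hypothetical helper: map Windows (\r\n) and old Mac (\r) terminators to \n.
    // The \r\n pair is replaced first so it does not turn into two newlines.
    static String normalize(String s) {
        return s.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String authoredOnWindows = "line one\r\nline two\r\n";
        String authoredOnUnix = "line one\nline two\n";
        // After normalization the two literals denote the same string.
        System.out.println(normalize(authoredOnWindows).equals(normalize(authoredOnUnix)));
    }
}
```

With this in place, the same source file checked out on different platforms
would yield the same string value.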
Whitespace: A multi-line string is nestled in the context of a Java source
program. It is likely (though not guaranteed) that the indentation of lines
has been distorted by the desire to make the embedded snippet align with the
enclosing lines. Most of the time, there is some combination of incidental
whitespace and intended whitespace. There are a number of algorithms by which
we could try to intuit which the user intended. Which brings us to ask:
- Assuming the existence of a reasonable algorithm for re-aligning text, what
should the _default_ be for the language? Should it assume the user wants
re-alignment, or make the user explicitly opt in?
- If the choice is "automatically align", how would we indicate the desire to
opt out?
- Should we limit what we do automatically to only what can be done by an
equivalent library routine?
(Again, let's focus on the requirements and semantics and defaults first,
before we bikeshed the syntax.)
It's hard to answer the above without a clear understanding of the use cases.
So, here's a partial catalog of examples; let's play "what was the user
thinking", and see if we can agree on that.
Examples;
String a = """
+--------+
| text |
+--------+
"""; // first characters in first column?
String b = """
+--------+
| text |
+--------+
"""; // first characters in first column or indented four spaces?
String c = """
+--------+
| text |
+--------+
"""; // first characters in first column or indented several?
String d = """
+--------+
| text |
+--------+
"""; // first characters in first column or indented four?
String e =
"""
+--------+
| text |
+--------+
"""; // heredoc?
String f = """
+--------+
| text |
+--------+
"""; // one or all leading or trailing blank lines stripped?
String g = """
+--------+
| text |
+--------+"""; // Last \n dropped
String h = """+--------+
| text |
+--------+"""; // determine indent of first line using scanner knowledge?
String i = """ "nested" """; // strip leading/trailing space?
String j = ("""
public static void """ + name + """(String... args) {
System.out.println(String.join(args));
}
""").align(); // how do we handle expressions with multi-line strings?
String k = """
public static void %s(String... args) {
System.out.println(String.join(args));
}
""".format(name); // is this the answer to multi-line string expressions?
As we can see, there were a lot of cases where the user _probably_ wanted one
thing, but _might have_ wanted another. What control knobs do we have, that we
could assign meaning to, that would let the user choose either way? Candidates
include:
- The opening line (is it blanks followed by a newline, or are there
non-whitespace characters?)
- The position of the close delimiter (is it on its own line, or not?)
Similarly, we have a number of policy choices:
- Do we allow content on the same lines as the delimiters?
- Should we always add a final newline?
- Should we strip blank lines? Only on the first and last? All leading and
trailing?
- How do we interpret auto-alignment on single-line strings? Strip?
- Should we right strip lines?
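To make the blank-line question concrete, here is a rough sketch of the most
aggressive option: strip every leading and trailing blank line and keep one
final newline. The class and method names are hypothetical, the input is
assumed to already use \n terminators, and String::isBlank needs Java 11 or
later.

```java
class BlankLineStripping {
    // Hypothetical policy: drop all leading and trailing blank lines,
    // keep interior lines untouched, and terminate every kept line with \n.
    static String stripBlankLines(String s) {
        String[] lines = s.split("\n", -1);
        int first = 0;
        int last = lines.length - 1;
        while (first <= last && lines[first].isBlank()) {
            first++;
        }
        while (last >= first && lines[last].isBlank()) {
            last--;
        }
        StringBuilder out = new StringBuilder();
        for (int i = first; i <= last; i++) {
            out.append(lines[i]).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String padded = "\n\n+--+\n|  |\n+--+\n\n\n";
        String stripped = stripBlankLines(padded);
        // A user who really wanted the padding would have to add it back by hand.
        String restored = "\n".repeat(2) + stripped + "\n".repeat(2);
        System.out.println(restored.equals(padded));
    }
}
```

The cost of this policy shows up in main: intended blank lines have to be
re-created explicitly.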
And some syntax choices (not to be discussed now):
- How do we indicate opt-out?
Comments?
Examples narrative. Don’t peek yet. Stop and comment first.
Unlike most other Java constructs, multi-line strings force us to look at
coding style "square on". Keep in mind that we are often guilty of making
assumptions about developer coding style. For instance, we may assume that
multi-line strings tend to be large elements. We may also assume that
developers will declare static final String variables to keep multi-line
strings from messing up their code. All very neat and tidy, but... we know
from experience that developers will use multi-line strings everywhere, as they
have with array initialization and large lambda bodies.
From this, we recommend that multi-line string fat delimiters should follow the
brace pattern used in array initialization, lambdas and other Java constructs.
The open delimiter should end the current line. Content follows on separate
lines, indented one level. The close delimiter starts a new line, indented
back one level, followed by the continuation of the enclosing expression.
So as in this brace pattern;
int[] ia = new int[] {
1,
2,
3
};
we have the fat delimiter pattern;
String d = """
+--------+
| text |
+--------+
""";
and;
String.format("""
public static void %s(String... args) {
System.out.println(String.join(args));
}
""", name);
The fat delimiter pattern also significantly helps with future editing in and
around the multi-line string. For example, changing the length of the variable
name in the above "String d =" example doesn't affect the positioning of the
string content or the close delimiter.
If we adopt this style, some of the answers to the incidentals questions become
easier or even moot. Other styles are still valid, but the result of automatic
incidental handling may be surprising.
Note that fat delimiters can be used on single lines. What are the semantics
for auto-alignment in that case? The question of stripping whitespace and
newlines is not really about alignment. It's about what the rules are for
handling incidental characters in a fat delimiter string.
Continuing with the examples, let's assume some (negotiable) auto-alignment
basic rules;
1. All content lines are uniformly right stripped. Whitespace at the end of
lines is not something that is consistently managed by IDEs/editors.
2. End of lines are always translated to \n.
3. If the content after the open delimiter is empty then the first end of line
is discarded.
4. Content is left justified while preserving relative indentation.
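The four rules above can be sketched as a plain library routine. This is only
an illustration under stated assumptions: the class and method names are
hypothetical, the input is taken to be the raw characters between the
delimiters, indentation is counted in characters (tabs are not treated
specially), and String::stripTrailing and String::stripLeading need Java 11
or later. It is not the String::align that was proposed.

```java
import java.util.ArrayList;
import java.util.List;

class AlignSketch {
    // Hypothetical routine applying rules 1-4 to the raw delimiter content.
    static String align(String raw) {
        // Rule 2: translate every line terminator to \n.
        String text = raw.replace("\r\n", "\n").replace('\r', '\n');

        // Rule 1: right strip every line.
        List<String> lines = new ArrayList<>();
        for (String line : text.split("\n", -1)) {
            lines.add(line.stripTrailing());
        }

        // Rule 3: if nothing follows the open delimiter, discard that first line.
        int start = (lines.size() > 1 && lines.get(0).isEmpty()) ? 1 : 0;

        // Rule 4: find the smallest indentation among non-empty lines...
        int min = Integer.MAX_VALUE;
        for (int i = start; i < lines.size(); i++) {
            String line = lines.get(i);
            if (!line.isEmpty()) {
                min = Math.min(min, line.length() - line.stripLeading().length());
            }
        }

        // ...and remove exactly that much from each, preserving relative indentation.
        StringBuilder out = new StringBuilder();
        for (int i = start; i < lines.size(); i++) {
            String line = lines.get(i);
            if (i > start) {
                out.append('\n');
            }
            if (!line.isEmpty()) {
                out.append(line.substring(min));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Raw content of a fat delimiter pattern string, close delimiter on its own line.
        String raw = "\n    +--+\n    |  |\n    +--+\n    ";
        System.out.print(align(raw));
    }
}
```

Note that in this sketch the position of the close delimiter has no effect on
indentation, because the whitespace in front of it is right stripped before
the minimum is computed. It only decides whether the result ends in \n.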
And as a reminder, in the last round we introduced or attempted to introduce
the following String methods;
- String::indent(n) - used to change indentation, line by line (in JDK 12)
- String::align() and String::align(n) - used to manage incidental indentation
(didn't make it)
- String::format as an instance method (resolution issues YTBD)
__________________________________________________________________________________________________
String a = """
+--------+
| text |
+--------+
"""; // first characters in first column?
RESULT:
+--------+\n
| text |\n
+--------+\n
The problem with this example is that it is not following the fat delimiter
pattern. Let's change the variable name "a" to "something".
String something = """
.......... +--------+
.......... | text |
.......... +--------+
.......... """; // first characters in first column?
The "." characters indicate all the places where we had to add whitespace to
maintain the pattern used.
__________________________________________________________________________________________________
String b = """
+--------+
| text |
+--------+
"""; // first characters in first column or indented four?
RESULT:
+--------+\n
| text |\n
+--------+\n
Same maintenance problem as example (a).
Still works, but the question here is: do we give meaning to indentation
relative to the close delimiter? Did we want the following instead?
+--------+\n
| text |\n
+--------+\n
It's a nice trick but we sabotage the fat delimiter pattern. We would always
get at least one level of indentation, whether we wanted it or not. Maybe
better to code as;
String b = """
+--------+
| text |
+--------+
""".indent(4);
So the question here is: should it be possible to specify "extra" indentation
through the positioning of quotes, or are we better off saying that any extra
indentation should be done through library calls? Note also that the library
calls might be subject to compile-time folding.
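For reference, here is what the library route looks like today with an
ordinary string literal standing in for the multi-line string. The class name
is just for illustration; String::indent itself is a real method and needs
Java 12 or later.

```java
class IndentExample {
    // Adds four spaces in front of every line of an already left-justified box.
    static String indentedBox() {
        String box = "+--------+\n"
                   + "| text   |\n"
                   + "+--------+\n";
        return box.indent(4);
    }

    public static void main(String[] args) {
        System.out.print(indentedBox());
    }
}
```

The extra indentation is explicit in the code, so it survives any reformatting
of the surrounding source.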
__________________________________________________________________________________________________
String c = """
+--------+
| text |
+--------+
"""; // first characters in first column or indented several?
RESULT:
+--------+\n
| text |\n
+--------+\n
The amount of indentation is not a problem, just an aesthetic issue.
__________________________________________________________________________________________________
String d = """
+--------+
| text |
+--------+
"""; // first characters in first column or indented four?
RESULT:
+--------+\n
| text |\n
+--------+\n
Textbook fat delimiter pattern.
__________________________________________________________________________________________________
String e =
"""
+--------+
| text |
+--------+
"""; // heredoc?
RESULT:
+--------+\n
| text |\n
+--------+\n
Just an aesthetic issue.
__________________________________________________________________________________________________
String f = """
+--------+
| text |
+--------+
"""; // one or all leading or trailing blank lines stripped?
As-is would generate;
\n
\n
+--------+\n
| text |\n
+--------+\n
\n
\n
\n
If we stripped away all leading and trailing blank lines, we would then have
to code this as;
String f = "\n".repeat(2) + """
+--------+
| text |
+--------+
""" + "\n".repeat(2);
__________________________________________________________________________________________________
String g = """
+--------+
| text |
+--------+"""; // Last \n dropped
RESULT:
+--------+\n
| text |\n
+--------+
This one is likely okay. It's not the fat delimiter pattern, but the oddity
makes it clear we mean something different; we want to drop the last \n.
__________________________________________________________________________________________________
String h = """+--------+
| text |
+--------+"""; // determine indent of first line using scanner knowledge?
RESULT:
+--------+\n
| text |\n
+--------+
We can do this because the compiler's scanner can determine the indentation on
the open delimiter line. However, this one is problematic if we require a
String method to duplicate the compiler's algorithm (String::align). Tool
vendors may also find this one problematic.
__________________________________________________________________________________________________
String i = """ "nested" """; // strip leading/trailing space?
RESULT:
"nested"
This one still follows the rules; left and right stripped.
__________________________________________________________________________________________________
String j = ("""
public static void """ + name + """(String... args) {
System.out.println(String.join(args));
}
""").align(); // how do we handle expressions with multi-line strings?
Mid-string substitution gets messy fast. Let's break the example down to the
following (without align).
String j = """
public static void """ + name + """(String... args) {
System.out.println(String.join(args));
}
""";
This is the same as
String j =
"""
public static void """
+ name +
"""(String... args) {
System.out.println(String.join(args));
}
""";
This works fine if we say there is no \n when the close delimiter is on the
same line as content. The other requirement is that each multi-line string
component ends up with a common indentation. The odds of that happening are
poor.
Guess we're stuck with parentheses and String::align. Unless...
__________________________________________________________________________________________________
String k = """
public static void %s(String... args) {
System.out.println(String.join(args));
}
""".format(name); // is this the answer to multi-line string expressions?
RESULT:
public static void methodName(String... args) {
System.out.println(String.join(args));
}
Maybe a better substitution solution.
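Until an instance format method is resolved, the same substitution can be
sketched with the existing static String.format. The class and method names
below are hypothetical, an ordinary literal stands in for the multi-line
string, and the generated call is written as String.join(" ", args), with a
delimiter added as an assumption so that the emitted snippet would compile.

```java
class FormatSubstitution {
    // Template is already left justified, so no alignment step is needed
    // before or after the substitution.
    static String methodSource(String name) {
        String template =
              "public static void %s(String... args) {\n"
            + "    System.out.println(String.join(\" \", args));\n"
            + "}\n";
        return String.format(template, name);
    }

    public static void main(String[] args) {
        System.out.print(methodSource("methodName"));
    }
}
```

Because the whole template is one string, the common-indentation problem from
example (j) does not arise.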
__________________________________________________________________________________________________