Hi devs,
tl;dr: Is there any (efficient) way to get the String consumed by a `reads`?
I'm stuck in thinking about a fix for #19746. Happily, the problem is simple
enough that I could assign it in the first few weeks of a Haskell course... and
yet I can't find a good solution! So I pose it here for inspiration.
The high-level problem: Assign correct source spans to options within an
OPTIONS_GHC pragma.
Current approach: The payload of an OPTIONS_GHC pragma gets turned into a
String and then processed by GHC.Utils.Misc.toArgs :: String -> Either String
[String]. The result of toArgs is either an error string (the Left result) or a
list of lexed options (the Right result).
A little-known fact is that Haskell strings can be put in an OPTIONS_GHC pragma.
So I can write both {-# OPTIONS_GHC -funbox-strict-fields #-} and {-#
OPTIONS_GHC "-funbox-strict-fields" #-}. Even stranger, I can write {-#
OPTIONS_GHC ["-funbox-strict-fields"] #-}, where GHC will understand a list of
strings. While I don't really understand the motivation for this last feature
(I posted #19750 about this), the middle option, with the quotes, seems like it
might be useful.
Desired approach: change toArgs to have this type: RealSrcLoc -> String ->
Either String [Located String], where the input RealSrcLoc is the location of
the first character of the input String. Then, as toArgs processes the input,
it advances the RealSrcLoc (with advanceSrcLoc), allowing us to create correct
SrcSpans for each String.
Annoying fact: Not all characters advance the source location by one column.
Tabs and newlines don't. Perhaps some other characters don't, either.
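To make the bookkeeping concrete, here is a minimal, self-contained sketch of advancing a location one character at a time. Loc, advance, and advanceOver are hypothetical stand-ins for RealSrcLoc and advanceSrcLoc, not GHC's actual definitions, and the tab handling assumes tab stops every 8 columns.

```haskell
import Data.List (foldl')

-- Hypothetical stand-in for RealSrcLoc: a 1-based line and column.
data Loc = Loc { locLine :: !Int, locCol :: !Int }
  deriving (Eq, Show)

-- Advance a location past one character.
-- Assumption: tab stops sit every 8 columns (columns 1, 9, 17, ...).
advance :: Loc -> Char -> Loc
advance (Loc l _) '\n' = Loc (l + 1) 1                       -- new line, column resets
advance (Loc l c) '\t' = Loc l (((c - 1) `div` 8 + 1) * 8 + 1) -- jump to next tab stop
advance (Loc l c) _    = Loc l (c + 1)                       -- ordinary character

-- Advance a location past every character of a consumed string.
advanceOver :: Loc -> String -> Loc
advanceOver = foldl' advance
```

The point of the sketch is that the location of the next option depends on exactly which characters were consumed, not just on how many.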
Central stumbling block: toArgs uses `reads` to parse strings. This makes great
sense, because `reads` already knows how to convert Haskell String syntax into
a proper String. The problem is that we have no idea what characters were
consumed by `reads`. And, short of looking at the length of the remainder
string in `reads` and comparing it to the length of the input string, there
seems to be no way to recreate this lost information. Note that comparing
lengths is slow, because we're dealing with Strings here. Once we know what was
consumed by `reads`, then we can just repeatedly call advanceSrcLoc, and away
we go.
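For reference, the length-comparison workaround could look roughly like the sketch below. readsWithConsumed is a hypothetical name; the two calls to length each walk a whole String, which is where the cost comes from when this is repeated for every option.

```haskell
-- Hypothetical helper: run `reads` at type String and recover the consumed
-- prefix by comparing the length of the input with the length of the rest.
-- Returns (parsed value, consumed characters, remaining input).
readsWithConsumed :: String -> Maybe (String, String, String)
readsWithConsumed input =
  case reads input :: [(String, String)] of
    [(value, rest)] ->
      let n = length input - length rest   -- both traversals are linear
      in  Just (value, take n input, rest)
    _ -> Nothing                           -- no parse, or an ambiguous one
```

The consumed prefix includes the quotes and any escape sequences as written in the source, which is what the location needs to be advanced over.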
Ideas to get unblocked:
1. Just do the slow (quadratic in the number of options) thing, looking at the
lengths of strings often.
2. Reimplement reading of strings to return both the result and the characters
consumed.
3. Incorporate the parsing of OPTIONS_GHC right into the lexer.
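A possible variant of idea 2 that avoids hand-writing a string lexer is sketched below, using `gather` from Text.ParserCombinators.ReadP in base, which returns the characters a ReadP parser consumed alongside its result. readStringConsumed is a hypothetical name. `gather` is documented to fail at runtime on parsers built with readS_to_P, so this assumes the Read instance for String does not use it, and whether this is fast enough in practice is untested here.

```haskell
import Text.ParserCombinators.ReadP (gather, readP_to_S)
import Text.ParserCombinators.ReadPrec (readPrec_to_P)
import Text.Read (readPrec)

-- Hypothetical helper: parse a Haskell string literal the same way `reads`
-- does, but also return the exact characters consumed.
-- Each result has the shape ((consumed, value), rest).
readStringConsumed :: String -> [((String, String), String)]
readStringConsumed = readP_to_S (gather (readPrec_to_P readPrec 0))
```

With the consumed characters in hand, no length comparison is needed: the location can be advanced over them directly.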
It boggles me that there isn't a better solution here. Do you see one?
Thanks,
Richard