On Fri, Feb 6, 2026 at 11:46 AM Markus Armbruster <[email protected]> wrote:
> > In particular, because parse_array() and parse_object() can assume
> > that the opening bracket/brace is not anymore in the token stream, it
> > is now apparent that the first entry of an array incorrectly used the
> > '[' character (stored in "token") as its location.
>
> For a value of "apparent". Not sure I would've seen it without your
> heads up.
Me neither - I found it only while reviewing my own changes. :)
> I believe you're talking about
>
> obj = parse_value(ctxt);
> if (obj == NULL) {
> parse_error(ctxt, token, "expecting value");
> goto out;
> }
>
> where @token indeed holds the array's '['.
Correct.
> parse_error(ctxt, NULL, "premature EOI");
>
> These bad arguments are all masked by parse_error() completely ignoring
> its argument! We tell users what's wrong with the input, but telling
> them where it's wrong has been TODO since the initial commit in 2009[*].
> I see your PATCH 5 finally takes care of it.
Yep, the clearer distinction between the parser and streamer helps a
little in that respect.
> > - key_obj = parse_value(ctxt);
> > + key_obj = parse_value(ctxt, token);
> > key = qobject_to(QString, key_obj);
> > if (!key) {
> > - parse_error(ctxt, peek, "key is not a string in object");
> > + parse_error(ctxt, token, "key is not a string in object");
> > goto out;
> > }
> >
>
> @peek isn't exactly a great name, but @token tells me even less.
> @first_token?
Possibly - though it all goes away since the parser becomes a state
machine, and the only separate functions that survive are for
one-token literals; which is why I just went simply with "token"
everywhere (as in, "next token" or "lookahead that led the parser to
call this function").
> > static QObject *parse_object(JSONParserContext *ctxt)
> > {
> > QDict *dict = NULL;
> > - JSONToken *token, *peek;
> > -
> > - token = parser_context_pop_token(ctxt);
> > - assert(token && token->type == JSON_LCURLY);
> > + const JSONToken *token;
>
> Before the patch, parse_object() parses the entire object, as its name
> indicates, with the precondition that the lookahead is '{'.
>
> After the patch, it parses the part of the object after the '{'. Hmm.
>
> We could pass it the first token, like we do for all the other parsing
> functions except this one and parse_array(), and keep the assertion.
> I'd prefer that, but it's a matter of taste.
I see, yes we could. I can do this in v2, since I'm sure you'll
(rightly) find something in the other patches that warrants a resend.
> This is the only call of parser_context_pop_token() that isn't
> immediately followed by "if it returned null, report "premature EOI".
> I went "let's factor this out", then realized the next patch replaces
> all this. Nevermind!
Yeah, not an excuse for being sloppy but perhaps an excuse for not
being too picky. :)
> * parse_pair() and parse_value() now receive the first token as
> argument. Popping the first token moves to their callers.
>
> * parse_keyword(), parse_interpolation(), and parse_literal() now
> receive the token as argument. Popping it moves to their callers.
> parse_string() works like this even before the patch, and is not
> changed.
>
> * parse_object() and parse_array() now parse the part after the object's
> / array's first token.
>
> Works.
Good :)
Paolo