On Fri, Feb 6, 2026 at 11:46 AM Markus Armbruster <[email protected]> wrote:
> > In particular, because parse_array() and parse_object() can assume
> > that the opening bracket/brace is not anymore in the token stream, it
> > is now apparent that the first entry of an array incorrectly used the
> > '[' character (stored in "token") as its location.
>
> For a value of "apparent".  Not sure I would've seen it without your
> heads up.

Me neither - I found it only while reviewing my own changes. :)

> I believe you're talking about
>
>         obj = parse_value(ctxt);
>         if (obj == NULL) {
>             parse_error(ctxt, token, "expecting value");
>             goto out;
>         }
>
> where @token indeed holds the array's '['.

Correct.

>         parse_error(ctxt, NULL, "premature EOI");
>
> These bad arguments are all masked by parse_error() completely ignoring
> its argument!  We tell users what's wrong with the input, but telling
> them where it's wrong has been TODO since the initial commit in 2009[*].
> I see your PATCH 5 finally takes care of it.

Yep, the clearer distinction between the parser and streamer helps a
little in that respect.

> > -    key_obj = parse_value(ctxt);
> > +    key_obj = parse_value(ctxt, token);
> >      key = qobject_to(QString, key_obj);
> >      if (!key) {
> > -        parse_error(ctxt, peek, "key is not a string in object");
> > +        parse_error(ctxt, token, "key is not a string in object");
> >          goto out;
> >      }
> >
>
> @peek isn't exactly a great name, but @token tells me even less.
> @first_token?

Possibly - though it all goes away since the parser becomes a state
machine, and the only separate functions that survive are for
one-token literals; which is why I just went simply with "token"
everywhere (as in, "next token" or "lookahead that led the parser to
call this function").

> >  static QObject *parse_object(JSONParserContext *ctxt)
> >  {
> >      QDict *dict = NULL;
> > -    JSONToken *token, *peek;
> > -
> > -    token = parser_context_pop_token(ctxt);
> > -    assert(token && token->type == JSON_LCURLY);
> > +    const JSONToken *token;
>
> Before the patch, parse_object() parses the entire object, as its name
> indicates, with the precondition that the lookahead is '{'.
>
> After the patch, it parses the part of the object after the '{'.  Hmm.
>
> We could pass it the first token, like we do for all the other parsing
> functions except this one and parse_array(), and keep the assertion.
> I'd prefer that, but it's a matter of taste.

I see, yes we could. I can do this in v2, since I'm sure you'll
(rightly) find something in the other patches that warrants a resend.

> This is the only call of parser_context_pop_token() that isn't
> immediately followed by "if it returned null, report "premature EOI".
> I went "let's factor this out", then realized the next patch replaces
> all this.  Nevermind!

Yeah, not an excuse for being sloppy but perhaps an excuse for not
being too picky. :)

> * parse_pair() and parse_value() now receive the first token as
>   argument.  Popping the first token moves to their callers.
>
> * parse_keyword(), parse_interpolation(), and parse_literal() now
>   receive the token as argument.  Popping it moves to their callers.
>   parse_string() works like this even before the patch, and is not
>   changed.
>
> * parse_object() and parse_array() now parse the part after the object's
>   / array's first token.
>
> Works.

Good :)

Paolo


Reply via email to