#25705: Parameters are not adapted or quoted in Query.__str__
-------------------------------------+-------------------------------------
Reporter: Dmitry Dygalo | Owner: Alex
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Alex):
Replying to [comment:16 Simon Charette]:
> > I wouldn't consider someone doing str(qs.query) to then pass in to
raw() (which the docs already warn of SQL injections) a real use case that
anyone would do.
>
> I don't agree on that point. From helping users through various forums
in most cases the reason why they were bringing up the issue of improper
quoting was when they tried executing the query as is. The duplicates of
this ticket seem to support this position (see #17741 and #25092
[https://stackoverflow.com/questions/31287184/django-orm-magic-bug-
or-a-feature/31287932#31287932 SO]) and I don't think would be hard to
find other examples.
If the worry is `raw(str(qs.query))`, then not having the quoting is a
security issue.
In fact I can actually do an SQL injection abusing this lack of quoting.
{{{
qs = User.objects.filter(first_name='"test")--', is_staff=True).only('id')
>>> <QuerySet []>
list(User.objects.raw(str(qs.query)))
>>> [<User: test>]
list(User.objects.raw(str(qs.query)))
>>> 'SELECT "auth_user"."id" FROM "auth_user" WHERE
("auth_user"."first_name" = "test")-- AND "auth_user"."is_staff")'
}}}
With proper quoting, as the one done when executing normally, the query is
`SELECT "auth_user"."id" FROM "auth_user" WHERE ("auth_user"."first_name"
= \'"test")--\' AND "auth_user"."is_staff")`
So if `raw(str(qs.query))` is a risk, then quoting the parameters would
fix it.
> > Also my recommended choice was not trying to emulate parameter
quoting, but to use the database library mogrify method in 3 of the 5
supported backends.
>
> You second alternative mentioned
>
> > Use the mogrify function in the first three backends, and manually
quote the parameters in the other two.
>
> I ''think'' that would qualify as a form of emulation or best-effort
wouldn't it? What guarantees do we have that the quoting is appropriate
for all datatypes that these backends support as parameters? Are we
feeling confident, knowing that users will attempt to pipe `str(qs.query)`
into `raw` and that `Query.__str__` always use `DEFAULT_DB_ALIAS` for
compilation #25947 (even if the proper backend can often only be
determined as execution time) that we are exposing the right tools to
users and we can commit to them being safe for the years to come?
And as I said just after that, that alternative was rejected in the past,
would be too much effort and I was only mentioning it as the only way to
fix it in every backend. I should have been clearer that I wasn't pushing
for this option.
> IMO the usage of `raw(str(qs.query))`, and the main motive for this
ticket, is a symptom of a lack of documented way for safely building and
executing the SQL and parameters from a `QuerySet` object which makes me
believe the focus should be on documenting `queryset.qs.sql_with_params()`
first instead.
That idea was rejected [https://code.djangoproject.com/ticket/34636 14
months ago], but it could be reconsidered in light of this discussion.
Also, I didn't actually find `qs.query` being documented, mostly this
[https://docs.djangoproject.com/en/5.0/ref/models/querysets/#pickling-
querysets mention] about pickling querysets.
> As for the printing and copy-pasting into a shell use case we've reached
an agreement that the only way to see the query with the parameters binded
correctly is after executing it. Knowing that, and the fact
`Query.__str__` is still pretty legible even without the exact mogrifying
that only leaves the copy-pasting into a shell use case. Should we make it
easier to retrieve the exact SQL associated with a particular queryset
instead? Something like `QuerySet.executed_query: str | None` that is only
present on ''evaluated'' querysets.
I would be happy with this option. An alternative would be an error if the
queryset hasn't been evaluated instead of None. But I'm not that big of a
fan of the error idea.
> These appears like solutions that prevent misuse of `Query.__str__` and
avoid maintaining correct and safe x-backend mogrifying solutions?
However, my initial investigation was not deep enough and I found more
reasons to discard the mogrify option:
- In mysqlclient, mogrify was added in version
[https://github.com/PyMySQL/mysqlclient/blob/main/HISTORY.rst#whats-new-
in-220 2.2.0] and django supports
[https://docs.djangoproject.com/en/5.0/ref/databases/#mysqlclient 1.4.3]
onwards, so there would have to be an additional check and a difference in
behavior.
- In the MySQL Connector/Python driver I couldn't find a mogrify method.
Not the recommended driver, but still supported.
- In psicopg3, `mogrify` is only on the
[https://www.psycopg.org/psycopg3/docs/api/cursors.html#psycopg.ClientCursor.mogrify
ClientCursor] class. While that's the default in Django, server side is
also [https://docs.djangoproject.com/en/5.0/ref/databases/#server-side-
parameters-binding supported]. Again, an additional check. Also, I'm not
sure if this means that these usages of mogrify
[https://code.djangoproject.com/ticket/25705?replyto=16#comment:12 I
mentioned] in the postgres backend don't work with server side parameter
binding.
> Django is already using the postgres mogrify in its own compose_sql
function in the
[https://github.com/django/django/blob/f302343380c77e1eb5dab3b64dd70895a95926ca/django/contrib/postgres/search.py#L317
search backend],
[https://github.com/django/django/blob/f302343380c77e1eb5dab3b64dd70895a95926ca/django/db/backends/postgresql/schema.py#L46
schema queries] and
[https://github.com/django/django/blob/f302343380c77e1eb5dab3b64dd70895a95926ca/django/db/backends/postgresql/base.py#L98
ensuring the role of the connection]
My proposal is to close this as a wontfix and work on the
`QuerySet.executed_query` idea and maybe reconsider documenting
`sql_with_params`
--
Ticket URL: <https://code.djangoproject.com/ticket/25705#comment:17>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
You received this message because you are subscribed to the Google Groups
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-updates/01070190b7c4320d-058c250f-9228-4666-abc0-2d5f2184b4b9-000000%40eu-central-1.amazonses.com.