Hey all,

On Mon, Feb 26, 2018 at 1:06 AM, Curtis Maloney <cur...@tinbrain.net> wrote:
> As discussed on IRC, I think the wording here is a bit weak... "it might be"
> probably ought be "it is probably".

Sure, sgtm.

On Mon, Feb 26, 2018 at 1:22 AM, Tom Forbes <t...@tomforb.es> wrote:
> Personally I’m quite against changing that warning. I have only ever seen
> one application where the use of an in-database file is appropriate and they
> were using the FILESTREAM type in SQL Server, which offers some pretty
> advanced semantics compared to other databases (more akin to Django’s file
> storage than a BLOB column).

I think you might be underestimating some good use cases for storing
binary files in a database. Off the top of my head, some random examples:

- PDFs attached to objects (identification forms, etc.)
- small spreadsheets to review
- Word documents
- user-submitted source code

Those are small files that will have a small performance impact on
your database, and that will greatly benefit from having ACID
guarantees and no need for consistency synchronisation between the
database and the filesystem.
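
To make the consistency argument concrete, here is a minimal sketch using
Python's built-in sqlite3 module rather than Django. The table names, the
save_form() helper and the file contents are all made up for illustration.
It only shows that a row and its attached file are committed or rolled
back together, so there is never a stray file to clean up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE form (id INTEGER PRIMARY KEY, owner TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE attachment ("
    " form_id INTEGER NOT NULL REFERENCES form(id),"
    " filename TEXT NOT NULL,"
    " content BLOB NOT NULL)"
)


def save_form(owner, filename, content):
    """Insert the form row and its file in a single transaction."""
    # The connection context manager commits on success and rolls back
    # if an exception escapes the block.
    with conn:
        cur = conn.execute("INSERT INTO form (owner) VALUES (?)", (owner,))
        conn.execute(
            "INSERT INTO attachment (form_id, filename, content) VALUES (?, ?, ?)",
            (cur.lastrowid, filename, content),
        )


# Succeeds: both rows are stored.
save_form("alice", "id_form.pdf", b"%PDF-1.4 placeholder bytes")

# Fails on the second INSERT (NOT NULL violation), so the form row
# inserted just before it is rolled back as well.
try:
    save_form("bob", "broken.pdf", None)
except sqlite3.IntegrityError:
    pass

form_count = conn.execute("SELECT COUNT(*) FROM form").fetchone()[0]
attachment_count = conn.execute("SELECT COUNT(*) FROM attachment").fetchone()[0]
```

With the file on the filesystem instead, the failed case would need
explicit cleanup code to avoid leaving an orphaned file or an orphaned row.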

> I’ve seen a lot of beginners use BLOB/byte fields where it’s really not
> needed and struggle with some insane performance issues due to it -
> especially with Django fetching all columns in a model by default.

That's a really good point I hadn't considered. Django fetching every
column by default is a big pitfall that requires extra care, and
people should definitely be made aware of it.

> Also the
> link you gave (and thanks for linking, it’s an interesting read) is
> obviously Postgres specific, the issues you might face doing this are very
> vendor specific and non-portable - sqlite recommends against storing
> anything larger than 100kb in a row for example.

Sure, but it wouldn't be the only thing in the docs with a warning
that states "it might be a bad idea depending on your database
vendor". I don't think we should deter people from doing the things
their vendor empowers them to do, but rather make them aware of the
differences.

> I feel like the warning should implicitly say “do not do this, really don’t,
> but if you’re super super super sure you 100% need to then you’re going to
> disregard this warning anyway”, which the current one does quite well. To
> put it another way, if you’re at the point where you need to do this you’re
> way past reading the warning in the Django docs, and we should deter people
> who might make the wrong choice at the start.

But my whole point is that there *are* cases where beginners might be
scared by that warning although it would be easier and better for them
to use a database for what they are doing.

On Mon, Feb 26, 2018 at 2:22 AM, Adam Johnson <m...@adamj.eu> wrote:
> Did you know Facebook store their assets in MySQL, because it's the fastest
> replicated super-reliable thing to put them in?
> https://secure.phabricator.com/book/phabflavor/article/soon_static_resources/
> (near the end of 'Caches and Serving Content')

Oh, that's neat. Although I don't really expect beginners to have
Facebook-scale problems distributing their assets. It definitely
shows that it's not always a bad idea, though.

> The nice thing about leaving the warning as stern as it is is that
> anybody who is absolutely sure that they need to store files this way
> isn’t going to stop because of the warning to begin with;

Again, the point is that sometimes people are *not* absolutely sure,
but they would greatly benefit from doing what their DB vendor
empowers them to do. This is also the case for experienced people, and
I think we want to point them in the right direction to learn more
rather than just dismiss the idea altogether.

> while
> weakening the warning will most assuredly lead to “Django is Slow” posts
> by newcomers that didn’t know SELECT * would be slow when there’s
> 3-5MB of data per row.

Well, sure, if your goal is to stop having people bug you, being
misleadingly dismissive is always a good idea. I'm not sure that's
what we should aim for, though. I'd rather inform people of the
drawbacks so that they can make an informed choice, even if that means
some people will still make the wrong choice despite good advice.

After reading all your (very interesting) comments, I'd like to update
my documentation suggestion:

> Although you might think about storing files in the database, consider
> that it is probably a bad design choice. This field is not a
> replacement for proper static files handling.
>
> There might be some edge cases where you do want the guarantees that
> the database offers you for small binary files, depending on your
> database vendor. Be sure to be aware of the general and
> vendor-specific trade-offs and limitations[1][2] before you decide to
> do so.

> You should also consider another performance pitfall: Django fetches
> all columns of a model by default, and thus requires extra care if you
> create large rows. You should properly limit the amount of data
> fetched from the table by using values() when needed.

> [1]: https://wiki.postgresql.org/wiki/BinaryFilesInDB
> [2]: https://www.sqlite.org/intern-v-extern-blob.html

Cheers,

-- 
Antoine Pietri

