My Take on Django Channels

Mark Lavin Thu, 05 May 2016 12:34:39 -0700


After somewhat hijacking another thread 
https://groups.google.com/d/msg/django-developers/t_zuh9ucSP4/eJ4TlEDMCAAJ 
I thought it was best to start fresh and clearly spell out my feelings 
about the Channels proposal. To start, this discussion of “Django needs a 
websocket story” reminds me very much of the discussions about NoSQL 
support. There were proofs of concept made and sky-is-falling arguments
about how Django would fail without MongoDB support. But in the end the 
community concluded that `pip install pymongo` was the correct way to 
integrate MongoDB into a Django project. In that same way, it has been 
possible for quite some time to incorporate websockets into a Django 
project by running a separate server dedicated to handling those
connections in a framework such as Twisted, Tornado, aiohttp, etc., and
establishing a clear means by which the two servers communicate with one 
another as needed by the application. Now this is quite vague and ad-hoc 
but it does work. To me this is the measuring stick by which to judge 
Channels. In what ways is it better or worse than running a separate server 
process for long-lived vs short-lived HTTP connections?

At the application development level, Channels has the advantage of a
clearly defined interprocess communication which would otherwise need to be
written. However, the Channel API is built around a simple queue/list
rather than a full messaging layer. The choices of backends are currently
limited to in-memory (not suitable for production), the ORM DB (not
suitable for production), and Redis. While Redis PUB/SUB is nice for
fanout/broadcast messaging, it isn’t a proper message queue. It also
doesn’t support TLS out of the box. For groups/broadcast the Redis Channel
backend also doesn’t use PUB/SUB but instead emulates that feature. It
likely can’t use PUB/SUB due to the choice of sharding. This seemingly
ignores robust existing solutions like Kombu, which is designed around AMQP
concepts. Kombu supports far more transports than the Channel backends
while emulating the same features, such as groups/fanout, and more such as
topic exchanges, QoS, message acknowledgement, compression, and additional
serialization formats.
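
To make the "emulated groups" point concrete, here is a rough sketch of
fanout built on top of plain per-channel lists. This is NOT the actual
Channels or Redis backend code; the class `ListBackend` and its method
names are made up for illustration, and it runs entirely in memory. The
point is only that a group send turns into one write per member, which a
native PUB/SUB or an AMQP fanout exchange would handle in the broker.

```python
from collections import defaultdict, deque


class ListBackend:
    """Toy in-memory stand-in for a list-based channel backend (hypothetical)."""

    def __init__(self):
        self.channels = defaultdict(deque)  # channel name -> FIFO of messages
        self.groups = defaultdict(set)      # group name -> member channel names

    def send(self, channel, message):
        self.channels[channel].append(message)

    def receive(self, channel):
        pending = self.channels[channel]
        return pending.popleft() if pending else None

    def group_add(self, group, channel):
        self.groups[group].add(channel)

    def send_group(self, group, message):
        # Emulated fanout: the sender loops over members and writes the
        # message once per member channel.
        members = sorted(self.groups[group])
        for channel in members:
            self.send(channel, message)
        return len(members)  # number of individual writes performed


backend = ListBackend()
backend.group_add("chat", "client.1")
backend.group_add("chat", "client.2")
writes = backend.send_group("chat", {"text": "hello"})
```

With two members, `writes` is 2, and the cost of a broadcast grows with
the size of the group.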

Architecturally, both of these approaches require running two processes.
The current solution would run a WSGI server for short lived connections
and an async server for long lived connections. Channels runs a front-end
interface server, daphne, and the back-end worker servers. Which is more
scalable? That’s hard to say. They both scale the same way: add more
processes. In my experience, long-lived and short-lived HTTP
connections have different scaling needs, so it is helpful to be able to
scale them independently, as one might do without Channels. That distinction
can’t be made with Channels since all HTTP connections are handled by the
interface servers. Channels has an explicit requirement of a backend/broker
server which requires its own resources. While not required in the separate
server setup, it’s likely that there is some kind of message broker between
the servers so at best we’ll call this a wash in terms of resources.
However, the same is not true for latency. Channels will handle the same
short-lived HTTP connections by serializing the request, putting it into
the backend, deserializing the request, processing it in the worker,
serializing the response, putting it into the backend, deserializing the
response, and sending it to the client. This is a fair bit of extra work
for no real gain since there is no concept of priority or backpressure.
This latency also exists for the websocket message handling. While Channels
may try to claim that it’s more resilient/fault tolerant because of this
messaging layer, it claims “at most once” delivery which means that a
message might never be delivered. I don’t think that claim has much merit.
As noted in previous discussions, sending all HTTP requests unencrypted
through the Channel backend (such as Redis) raises a number of potential
security/regulatory issues which have yet to be addressed.
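
The extra hops described above can be counted with a small sketch. It
uses `queue.Queue` and `json` from the standard library as stand-ins for
the backend and its serialization; the names `request_channel`,
`reply_channel`, `view`, `handle_via_backend`, and `handle_directly` are
all assumptions for illustration, not anything from Channels itself.

```python
import json
import queue

request_channel = queue.Queue()  # stand-in for the backend's request channel
reply_channel = queue.Queue()    # stand-in for the backend's reply channel


def view(request):
    """Hypothetical application code that builds a response."""
    return {"status": 200, "body": "Hello " + request["path"]}


def handle_via_backend(raw_request):
    """Route one request through the queues, recording each step."""
    steps = []
    payload = json.dumps(raw_request)
    steps.append("serialize request")
    request_channel.put(payload)
    steps.append("put request into backend")
    request = json.loads(request_channel.get_nowait())
    steps.append("deserialize request")
    response = view(request)
    steps.append("process in worker")
    payload = json.dumps(response)
    steps.append("serialize response")
    reply_channel.put(payload)
    steps.append("put response into backend")
    response = json.loads(reply_channel.get_nowait())
    steps.append("deserialize response")
    return response, steps


def handle_directly(raw_request):
    """The same request handled in-process, as a WSGI server would."""
    return view(raw_request), ["process in server"]


queued_response, queued_steps = handle_via_backend({"path": "/home"})
direct_response, direct_steps = handle_directly({"path": "/home"})
```

Both paths return the same response, but the queued path records seven
steps where the direct path records one, and in a real deployment four
of those seven are network round trips to the broker.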

One key difference to me is that pushing Channels as the new Django
standard makes Django’s default deployment story much more complicated.
Currently this complication is the exception not the rule. Deployment is a
frequent complaint, not just from people new to Django. Deployment of
Python apps is a pain and this requires running two of them even if you
aren’t using websockets. To me that is a huge step in the wrong direction
for Django in terms of ease of deployment and required system resources.

Channels claims to have a better zero-downtime deployment story. However,
in practice I’m not convinced that will be true. A form of graceful reload
is supported by the most popular WSGI servers so it isn’t really better
than what we currently have. The Channel docs note that you only need to
restart the workers when deploying new code so you won’t drop HTTP
connections. But the interface application definition and the worker code
live in the same code base. It will be difficult to determine whether
you need to restart the interface on a given deployment, so many
people will likely err on the side of restarting the interface as well.
With a separate async server, likely in a separate code base, it would be
easy to deploy them independently and only restart the websocket
connections when needed. Also, it’s better if your application can
gracefully handle disconnections/reconnections for the websocket case
anyway since
you’ll have to deal with that reality on mobile data connections and
terrible wifi.

There is an idea floating around of using Channels for background
jobs/Celery replacement. It is not one and should not be treated as one.
The message delivery is
not guaranteed and there is no retry support. This is explicitly outside of
the stated design goals of the project. Allowing this idea to continue in
any form does a disservice to the Django community who may use Channels in
this way. It’s also a slap in the face to the Celery authors who’ve worked
for years to build a robust system which is superior to this naive
implementation.
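
The difference between "at most once" delivery and an acknowledge/retry
model can be shown with a toy example. This is only a sketch of the two
delivery models in plain Python; it does not use the Celery, Kombu, or
Channels APIs, and `FlakyHandler`, `drain_at_most_once`, and
`drain_with_ack` are hypothetical names.

```python
from collections import deque


class FlakyHandler:
    """Hypothetical worker that fails its first `failures` calls."""

    def __init__(self, failures):
        self.failures = failures
        self.processed = []

    def __call__(self, message):
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError("worker crashed")
        self.processed.append(message)


def drain_at_most_once(messages, handler):
    """Pop each message before handling it; a failure loses the message."""
    pending = deque(messages)
    while pending:
        message = pending.popleft()  # already removed from the queue
        try:
            handler(message)
        except RuntimeError:
            pass  # nothing to retry from: the message is gone


def drain_with_ack(messages, handler, max_attempts=3):
    """Requeue a message until the handler succeeds or attempts run out."""
    pending = deque((message, 1) for message in messages)
    while pending:
        message, attempt = pending.popleft()
        try:
            handler(message)  # returning normally acts as the acknowledgement
        except RuntimeError:
            if attempt < max_attempts:
                pending.append((message, attempt + 1))  # retry later


lossy = FlakyHandler(failures=1)
drain_at_most_once(["a", "b", "c"], lossy)

acked = FlakyHandler(failures=1)
drain_with_ack(["a", "b", "c"], acked)
```

In the first run the job "a" is silently dropped; in the second it is
retried and completes after "b" and "c". A background job system needs
the second behavior.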

So Channels is at best on par with the existing available approaches and at
worst adds a bunch of latency, potentially dropped messages, and new points
of failure while taking up more resources and locking everyone into using
Redis. It does provide a clear message framework but in my opinion it’s too
naive to be useful. Given the complexity in the space I don’t trust
anything built from the ground up without having a meaningful production
deployment to prove it out. It has taken Kombu many years to mature and I
don’t think it can be rewritten easily.

I see literally no advantage to pushing all HTTP requests and responses
through Redis. What this does enable is that you can continue to write
synchronous code. To me that’s based around some idea that async code is
too hard for the average Django dev to write or understand. Or that nothing
can be done to make parts of Django play nicer with existing async
frameworks which I also don’t believe is true. Python 3.4 makes writing
async Python pretty elegant and async/await in 3.5 makes that even better.
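
As a small illustration of how readable this style is, here is a
minimal sketch of two slow "connections" handled concurrently with
async/await. The handler `handle_connection` and the delays are made up
for the example, and it uses `asyncio.run`, which was added later
(Python 3.7); on 3.5 the same coroutines would be driven with
`loop.run_until_complete`.

```python
import asyncio


async def handle_connection(name, delay):
    # Stand-in for waiting on a slow client; other connections keep
    # running while this one is suspended at the await.
    await asyncio.sleep(delay)
    return name + " done"


async def main():
    # Run both handlers concurrently; gather returns results in the
    # order the awaitables were passed in, not completion order.
    return await asyncio.gather(
        handle_connection("ws-1", 0.02),
        handle_connection("ws-2", 0.01),
    )


results = asyncio.run(main())
```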

Sorry this is so long. Those who saw the DjangoCon author’s panel know that
quickly writing walls of unreadable text is my forte. It’s been building
for a long time. I have an unsent draft to Andrew from when he wrote his
first blog post about this idea. I deeply regret not sending it and
beginning to engage in this discussion earlier. It’s hard for me to
separate this work from the process by which it was created. Russ touched
on my previous experience with the DEP process and I will admit that has
jaded many of my interactions with the core team. Building consensus is
hard and I’m posting this to help work towards the goal of community
consensus. Thanks for taking the time to read this all the way through and
I welcome any feedback.

Best,

Mark Lavin

