After somewhat hijacking another thread 
https://groups.google.com/d/msg/django-developers/t_zuh9ucSP4/eJ4TlEDMCAAJ 
I thought it was best to start fresh and clearly spell out my feelings 
about the Channels proposal. To start, this discussion of “Django needs a 
websocket story” reminds me very much of the discussions about NoSQL 
support. There were proofs of concept made and sky-is-falling arguments 
about how Django would fail without MongoDB support. But in the end the 
community concluded that `pip install pymongo` was the correct way to 
integrate MongoDB into a Django project. In that same way, it has been 
possible for quite some time to incorporate websockets into a Django 
project by running a separate server dedicated to handling those 
connections in a framework such as Twisted, Tornado, aiohttp, etc., and 
establishing a clear means by which the two servers communicate with one 
another as needed by the application. Now this is quite vague and ad-hoc 
but it does work. To me this is the measuring stick by which to judge 
Channels. In what ways is it better or worse than running a separate server 
process for long-lived vs short-lived HTTP connections?

At the application development level, Channels has the advantage of a 
clearly defined interprocess communication which would otherwise need to be 
written. However, the Channel API is built more around a simple queue/list 
rather than a full messaging layer. The choices of backends are currently 
limited to in-memory (not suitable for production), the ORM DB (not 
suitable for production), and Redis. While Redis PUB/SUB is nice for 
fanout/broadcast messaging, it isn’t a proper message queue. It also 
doesn’t support TLS out of the box. For groups/broadcast the Redis Channel 
backend also doesn’t use PUB/SUB but instead emulates that feature. It 
likely can’t use PUB/SUB due to the choice of sharding. This seemingly 
ignores robust existing solutions like Kombu, which is designed around AMQP 
concepts. Kombu supports far more transports than the Channel backends 
while emulating the same features, such as groups/fanout, and more such as 
topic exchanges, QoS, message acknowledgement, compression, and additional 
serialization formats.
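
To make the queue/list point concrete, here is a minimal sketch of a 
list-based channel layer in which groups are emulated by copying the 
message into every member's channel. It is not the Channels or Redis 
backend code; the class and method names (`ListChannelLayer`, `group_send`, 
and so on) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


class ListChannelLayer:
    """Toy in-memory layer (hypothetical): every channel is a plain FIFO list."""

    def __init__(self):
        self.channels = defaultdict(deque)  # channel name -> queued messages
        self.groups = defaultdict(set)      # group name -> member channel names

    def send(self, channel, message):
        self.channels[channel].append(message)

    def receive(self, channel):
        # Pop-and-forget: once popped, nothing tracks whether the message
        # was actually processed (no acknowledgement, no redelivery).
        pending = self.channels[channel]
        return pending.popleft() if pending else None

    def group_add(self, group, channel):
        self.groups[group].add(channel)

    def group_send(self, group, message):
        # Emulated fanout: one copy per member channel, rather than a
        # broker-level broadcast such as PUB/SUB or a fanout exchange.
        for channel in sorted(self.groups[group]):
            self.send(channel, message)


layer = ListChannelLayer()
layer.group_add("chat", "client.1")
layer.group_add("chat", "client.2")
layer.group_send("chat", {"text": "hello"})
```

Everything a fuller messaging layer offers on top of this (acknowledgement, 
QoS, topic routing) would have to be added by hand in such a design.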

Architecturally, both of these approaches require running two processes. 
The current solution would run a WSGI server for short lived connections 
and an async server for long lived connections. Channels runs a front-end 
interface server, daphne, and the back-end worker servers. Which is more 
scalable? That’s hard to say. They both scale the same way: add more 
processes. It’s my experience that long-lived and short-lived HTTP 
connections have different scaling needs, so it is helpful to be able to 
scale them independently, as one might do without Channels. That distinction 
can’t be made with Channels since all HTTP connections are handled by the 
interface servers. Channels has an explicit requirement of a backend/broker 
server which requires its own resources. While not required in the separate 
server setup, it’s likely that there is some kind of message broker between 
the servers so at best we’ll call this a wash in terms of resources. 
However, the same is not true for latency. Channels will handle the same 
short-lived HTTP connections by serializing the request, putting it into 
the backend, deserializing the request, processing it in the worker, 
serializing the response, putting it into the backend, deserializing the 
response, and sending it to the client. This is a fair bit of extra work 
for no real gain since there is no concept of priority or backpressure. 
This latency also exists for the websocket message handling. While Channels 
may try to claim that it’s more resilient/fault tolerant because of this 
messaging layer, it claims “at most once” delivery which means that a 
message might never be delivered. I don’t think that claim has much merit. 
As noted in previous discussions, sending all HTTP requests unencrypted 
through the Channel backend (such as Redis) raises a number of potential 
security/regulatory issues which have yet to be addressed.
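
The round trip described above can be sketched roughly as follows. This is 
an illustration only: `queue.Queue` stands in for the channel backend, JSON 
stands in for whatever serialization is used, and all function names are 
hypothetical. There is no real network hop here, so the sketch understates 
the cost.

```python
import json
import queue

# Stand-ins for the channel backend (e.g. Redis); assumptions for illustration.
request_channel = queue.Queue()
response_channel = queue.Queue()


def hello_view(request):
    """The actual application work: build a response from a request."""
    return {"status": 200, "body": "hello " + request["path"]}


def interface_send(request):
    # Steps 1-2: serialize the request and put it into the backend.
    request_channel.put(json.dumps(request))


def worker_step(view):
    # Step 3: deserialize the request; step 4: process it in the worker.
    request = json.loads(request_channel.get())
    response = view(request)
    # Steps 5-6: serialize the response and put it into the backend.
    response_channel.put(json.dumps(response))


def interface_receive():
    # Step 7: deserialize the response before sending it to the client.
    return json.loads(response_channel.get())


def handle_via_backend(request):
    interface_send(request)
    worker_step(hello_view)
    return interface_receive()


direct = hello_view({"path": "/about/"})
routed = handle_via_backend({"path": "/about/"})
```

Both paths produce the same response; the routed one simply adds two 
serialization passes and two trips through the backend.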

One key difference to me is that pushing Channels as the new Django 
standard makes Django’s default deployment story much more complicated. 
Currently this complication is the exception not the rule. Deployment is a 
frequent complaint, not just from people new to Django. Deployment of 
Python apps is a pain and this requires running two of them even if you 
aren’t using websockets. To me that is a huge step in the wrong direction 
for Django in terms of ease of deployment and required system resources.

Channels claims to have a better zero-downtime deployment story. However, 
in practice I’m not convinced that will be true. A form of graceful reload 
is supported by the most popular WSGI servers so it isn’t really better 
than what we currently have. The Channel docs note that you only need to 
restart the workers when deploying new code so you won’t drop HTTP 
connections. But the interface application definition and the worker code 
live in the same code base. It will be difficult to determine whether you 
need to restart the interface on a given deployment, so many 
people will likely err on the side of restarting the interface as well. 
With a separate async server, likely in a separate code base, it would be 
easy to deploy them independently and only restart the websocket 
connections when needed. Also, it’s better if your application can 
gracefully handle disconnections/reconnections for the websocket case anyway since 
you’ll have to deal with that reality on mobile data connections and 
terrible wifi.
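
One common way to handle reconnections gracefully is a retry loop with 
exponential backoff on the connecting side. The sketch below assumes a 
hypothetical `connect` callable that raises `ConnectionError` on failure; 
the delay values are arbitrary, and the `sleep` argument defaults to a 
no-op so the example runs instantly (pass `time.sleep` in real use).

```python
import random


def backoff_delays(attempts, base=0.5, cap=30.0, jitter=0.0, rng=random.random):
    """Return the wait (in seconds) before each retry: base * 2**n, capped."""
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * (2 ** attempt))
        # Optional random jitter so many clients do not reconnect in lockstep.
        delay += delay * jitter * rng()
        delays.append(delay)
    return delays


def connect_with_retry(connect, attempts=5, sleep=lambda seconds: None):
    """Call `connect` until it succeeds, sleeping between failed attempts."""
    for delay in backoff_delays(attempts):
        try:
            return connect()
        except ConnectionError:
            sleep(delay)
    raise ConnectionError("gave up after %d attempts" % attempts)


# Hypothetical flaky connection: fails twice, then succeeds.
calls = []


def flaky_connect():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("dropped")
    return "connected"


result = connect_with_retry(flaky_connect)
```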

There is an idea floating around of using Channels for background 
jobs/Celery replacement. It is not one and should not be treated as one. 
Message delivery is not guaranteed and there is no retry support. This is 
explicitly outside of 
the stated design goals of the project. Allowing this idea to continue in 
any form does a disservice to the Django community who may use Channels in 
this way. It’s also a slap in the face to the Celery authors who’ve worked 
for years to build a robust system which is superior to this naive 
implementation.
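
The difference between "at most once" delivery and acknowledgement with 
retry can be shown with a small sketch. This is plain Python, not the 
Celery, Kombu, or Channels API; the function names and the `max_retries` 
parameter are hypothetical.

```python
from collections import deque


def run_at_most_once(messages, handler):
    """Pop each message before handling it; a failure loses the message."""
    pending = deque(messages)
    done = []
    while pending:
        message = pending.popleft()  # gone from the queue before processing
        try:
            done.append(handler(message))
        except Exception:
            pass  # nothing is requeued, so this message is never processed
    return done


def run_with_ack_and_retry(messages, handler, max_retries=3):
    """Treat success as an ack; requeue failures until max_retries is hit."""
    pending = deque((message, 0) for message in messages)
    done, dead = [], []
    while pending:
        message, tries = pending.popleft()
        try:
            done.append(handler(message))
        except Exception:
            if tries + 1 < max_retries:
                pending.append((message, tries + 1))  # redeliver later
            else:
                dead.append(message)  # give up, but keep a record of it
    return done, dead


def make_flaky_handler():
    """Handler that fails the first time it sees the message "b"."""
    seen = set()

    def handler(message):
        if message == "b" and message not in seen:
            seen.add(message)
            raise RuntimeError("transient failure")
        return message.upper()

    return handler


lossy = run_at_most_once(["a", "b", "c"], make_flaky_handler())
reliable, dead = run_with_ack_and_retry(["a", "b", "c"], make_flaky_handler())
```

With one transient failure, the first runner silently drops "b" while the 
second processes it on redelivery.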

So Channels is at best on par with the existing available approaches and at 
worst adds a bunch of latency, potentially dropped messages, and new points 
of failure while taking up more resources and locking everyone into using 
Redis. It does provide a clear message framework but in my opinion it’s too 
naive to be useful. Given the complexity in the space I don’t trust 
anything built from the ground up without having a meaningful production 
deployment to prove it out. It has taken Kombu many years to mature and I 
don’t think it can be rewritten easily.

I see literally no advantage to pushing all HTTP requests and responses 
through Redis. What this does enable is that you can continue to write 
synchronous code. To me that’s based around some idea that async code is 
too hard for the average Django dev to write or understand. Or that nothing 
can be done to make parts of Django play nicer with existing async 
frameworks which I also don’t believe is true. Python 3.4 makes writing 
async Python pretty elegant and async/await in 3.5 makes that even better.
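
As a small illustration of the async/await style mentioned above, the 
sketch below interleaves two long-lived "connections" in a single process. 
The handler and client names are hypothetical, and `asyncio.sleep(0)` is 
only a stand-in for awaiting real network I/O. Note that `asyncio.run` 
requires Python 3.7 or later; on 3.5 the event loop would be driven with 
`loop.run_until_complete` instead.

```python
import asyncio


async def handle_connection(name, messages):
    """Echo each message back, yielding to the event loop between messages."""
    replies = []
    for message in messages:
        await asyncio.sleep(0)  # stand-in for awaiting a socket read
        replies.append("%s: echo %s" % (name, message))
    return replies


async def serve_all():
    # Both handlers run concurrently on one event loop, with no threads
    # and no intermediate message backend.
    return await asyncio.gather(
        handle_connection("client-1", ["ping", "hello"]),
        handle_connection("client-2", ["ping"]),
    )


results = asyncio.run(serve_all())
```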

Sorry this is so long. Those who saw the DjangoCon author’s panel know that 
quickly writing walls of unreadable text is my forte. It’s been building 
for a long time. I have an unsent draft to Andrew from when he wrote his 
first blog post about this idea. I deeply regret not sending it and 
beginning to engage in this discussion earlier. It’s hard for me to 
separate this work from the process by which it was created. Russ touched 
on my previous experience with the DEP process and I will admit that has 
jaded many of my interactions with the core team. Building consensus is 
hard and I’m posting this to help work towards the goal of community 
consensus. Thanks for taking the time to read this all the way through and 
I welcome any feedback.

Best,

Mark Lavin
