Hi gabor,
I've put up some patches to help with the unicode conversion of
django. We have a site which is shortly going to production where we
actually have to handle multiple unicode scripts including some which
have characters that do not fall into iso-8859-1.
Since I'm pretty lazy and I'm not
Adrian Holovaty wrote:
> On 8/8/06, gabor <[EMAIL PROTECTED]> wrote:
>> i think unicodizing django can be done in 4 easily separated steps/parts:
>>
>> 1. request/response
>> 2. templating-system
>> 3. database-system
>> 4. "overall unicode-conversion". this is mostly about replacing
>> bytestring
On 20-aug-2006, at 8:55, Malcolm Tredinnick wrote:
>> 5. Internally, work with unicode strings exclusively (after
>> transcoding the request and the template). Response should be python
>> unicode as well up until the moment it gets sent out.
>
> That's the idea.
Not so fast.
You want to be li
On 8/20/06, Malcolm Tredinnick <[EMAIL PROTECTED]> wrote:
> Metaphorically cutting off both our arms so that we appear
> more aerodynamic is probably not a gain worth making.
That's going in my quotes file.
--
"May the forces of evil become confused on the way to your house."
-- George Carlin
Malcolm Tredinnick wrote:
> Metaphorically cutting off both our arms so that we appear
> more aerodynamic is probably not a gain worth making.
This is the explanation! :-)
>> 5. Internally, work with unicode strings exclusively (after
>> transcoding the request and the template). Response shou
On Sun, 2006-08-20 at 07:15 +0200, Julian 'Julik' Tarkhanov wrote:
>
> On 17-aug-2006, at 1:08, Bill de hÓra wrote:
>
> > like wanting to serve utf8 rss feeds, but have latin1 come
> > in and out of mysql.
>
> Might seem very extreme, but I would love to chime in. Maybe it would
> be wise to
On 17-aug-2006, at 1:08, Bill de hÓra wrote:
> like wanting to serve utf8 rss feeds, but have latin1 come
> in and out of mysql.
Might seem very extreme, but I would love to chime in. Maybe it would
be wise to go even further, whereby:
1. Hardcode Django to output and input UTF-8 as the most
In China GB18030 is required to be used by law, and most sites just
assume the browser uses that as the default, so they don't even specify
a character encoding.
Your likely setup for international web sites is to have Unicode in the
database (since databases have special support for it and it is
gabor wrote:
>
> currently my plan is to have the following behaviour:
>
> 1. i assume that every GET/POST param comes in encoded as
> settings.DEFAULT_CHARSET, and will decode it accordingly. if it fails,
> then it fails.
Assuming "you got served" with settings.DEFAULT_CHARSET, then sure.
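A minimal sketch of point 1, written for present-day Python 3 rather than the Python 2 this thread was about. The helper name decode_params is hypothetical and its charset argument stands in for settings.DEFAULT_CHARSET; none of this is taken from gabor's patch. Strict error handling gives the "if it fails, then it fails" behaviour.

```python
from urllib.parse import parse_qsl


def decode_params(raw_query, charset="utf-8"):
    """Decode a percent-encoded query string, assuming it was sent in `charset`.

    `charset` is a stand-in for settings.DEFAULT_CHARSET (an assumption).
    errors="strict" raises UnicodeDecodeError instead of guessing.
    """
    return parse_qsl(
        raw_query,
        keep_blank_values=True,
        encoding=charset,
        errors="strict",
    )


# Bytes that are valid in the assumed charset decode cleanly.
params = decode_params("name=G%C3%A1bor")

# The same name sent as iso-8859-1 is rejected under a utf-8 assumption.
try:
    decode_params("name=G%E1bor")
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

Failing loudly keeps the guesswork out of the request layer, at the cost of rejecting clients that ignore the charset the page was served in.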
On 8/16/06, gabor <[EMAIL PROTECTED]> wrote:
> 3. will assume the database is in DEFAULT_CHARSET
> - maybe can we somehow ask the db for its charset?
I think you really have to allow for different charset in the DB--
legacy integration, remember.
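The mixed-charset case (latin1 in the database, utf-8 on the wire) could look roughly like the sketch below, in present-day Python 3. DB_CHARSET and DEFAULT_CHARSET are used here as plain module constants and the two helper functions are hypothetical; this is an illustration of the idea, not Django's API.

```python
DB_CHARSET = "latin-1"     # assumed charset of the legacy database
DEFAULT_CHARSET = "utf-8"  # assumed charset of the HTTP response


def from_db(raw):
    """Decode bytes read from the database into text."""
    return raw.decode(DB_CHARSET)


def to_response(text):
    """Encode text for output only at the last moment."""
    return text.encode(DEFAULT_CHARSET)


# Bytes as a latin1 database would hand them over.
row = "Bill de hÓra".encode("latin-1")

title = from_db(row)       # text inside the application
body = to_response(title)  # utf-8 bytes for the feed
```

Keeping the two settings separate means the application only ever sees text, and each boundary applies its own codec.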
Jeremy Dunck wrote:
> I hereby decree that all strings in computing should have a charset
> associated with them.
>
> ...
>
> Damn, it didn't work.
ROTFL! On a more positive note, kudos to Gábor for looking at this.
Gábor, if you get a dev branch, I'll be happy to work against it.
cheers
Bill
gabor wrote:
> 3. will assume the database is in DEFAULT_CHARSET
> - maybe can we somehow ask the db for its charset?
>
> so, what do you think?
> or should we make it possible to have a system with mixed charsets?
> (well, maybe having a different DB_CHARSET and a DEFAULT_CHARSET could
Jeremy Dunck wrote:
> On 8/16/06, Bill de hÓra <[EMAIL PROTECTED]> wrote:
>> Now. Most (all?) browser UAs sniff the content to second guess the media
>> type. They don't much pay attention to Content-Type (I think maybe IE
>> ignores it altogether). The problem for this example is they might be
>>
On 8/9/06, gabor <[EMAIL PROTECTED]> wrote:
> hmmm.. are you sure that the situation with unicode-aware editors is so bad?
>
> could you name some non-unicode-aware editors?
> for me it seems that from notepad through vim to eclipse everything does
> unicode fine...
On Windows, I used UltraEdit,
On 8/16/06, Bill de hÓra <[EMAIL PROTECTED]> wrote:
> Now. Most (all?) browser UAs sniff the content to second guess the media
> type. They don't much pay attention to Content-Type (I think maybe IE
> ignores it altogether). The problem for this example is they might be
> doing something similar f
Gábor Farkas wrote:
> for example, using this html file:
>
> http://localhost:7000";>
>
>
>
> (+ additional xhtml-headers, http-equiv-content-type=utf-8 etc)
>
> firefox submits this:
>
>
> POST / HTTP/1.1
> Host: localhost:7000
> User-Agent: Mozilla/5.0 (X11; U; Linux i686; en
Bill de hÓra wrote:
> gabor wrote:
>
>> so what do you think about the following approach:
>>
>> try ascii-decoding
>> if fails, try utf8-decoding
>> if fails do iso-8859-1-decoding (this cannot fail).
>>
>> ?
>
> Dumb question maybe. How do you know this encoding ladder will work?
it depends o
Malcolm Tredinnick wrote:
> On Wed, 2006-08-09 at 21:51 +0200, gabor wrote:
> [...]
>> phew... the immortal
>> how-tolerant-we-should-be-when-doing-unicode-conversion problems :-)
>
> Agreed. This is much easier on my side of the fence (lobbing problems),
> than your side (solving them).
> [...]
gabor wrote:
> so what do you think about the following approach:
>
> try ascii-decoding
> if fails, try utf8-decoding
> if fails do iso-8859-1-decoding (this cannot fail).
>
> ?
Dumb question maybe. How do you know this encoding ladder will work?
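For reference, the ladder quoted above can be sketched as follows in present-day Python 3. The function name decode_ladder is hypothetical. The last assertion-style example shows why the question is a fair one: the final iso-8859-1 step never raises, but that only means it always returns something, not that the result is right.

```python
from typing import Tuple


def decode_ladder(raw: bytes) -> Tuple[str, str]:
    """Try ascii, then utf-8, then fall back to iso-8859-1.

    Returns the decoded text and the name of the codec that accepted it.
    """
    for encoding in ("ascii", "utf-8"):
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Every byte value maps to a character in iso-8859-1, so this cannot fail.
    return raw.decode("iso-8859-1"), "iso-8859-1"


# Russian text sent as koi8-r is not valid utf-8, so it reaches the last rung
# and comes back as the wrong characters without any error being raised.
text, used = decode_ladder("привет".encode("koi8-r"))
```

So the ladder is safe against exceptions but can silently produce garbage for any input that is neither ASCII, UTF-8 nor genuinely iso-8859-1.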
> but imho this should happen only in "specia
On 8/8/06, gabor <[EMAIL PROTECTED]> wrote:
> i think unicodizing django can be done in 4 easily separated steps/parts:
>
> 1. request/response
> 2. templating-system
> 3. database-system
> 4. "overall unicode-conversion". this is mostly about replacing
> bytestrings with u"bla" in the code, and s
On 8/10/06, Ivan Sagalaev <[EMAIL PROTECTED]> wrote:
>
> Malcolm Tredinnick wrote:
> > I completely agree this is painful and normally I would punt. But my
> > crystal ball tells me that you will then get bug reports from Mr
> > Sagalaev, who is generally both very diligent in his debugging and li
gabor wrote:
> hmmm.. are you sure that the situation with unicode-aware editors is so bad?
>
> could you name some non-unicode-aware editors?
> for me it seems that from notepad through vim to eclipse everything does
> unicode fine...
Ok, I should rephrase it. Even if most editors do support u
Malcolm Tredinnick wrote:
> I completely agree this is painful and normally I would punt. But my
> crystal ball tells me that you will then get bug reports from Mr
> Sagalaev, who is generally both very diligent in his debugging and likes
> to use some language with a funny alphabet. If whatever y
On Wed, 2006-08-09 at 21:51 +0200, gabor wrote:
[...]
> phew... the immortal
> how-tolerant-we-should-be-when-doing-unicode-conversion problems :-)
Agreed. This is much easier on my side of the fence (lobbing problems),
than your side (solving them).
> i generally prefer to do as little guesswo
Malcolm Tredinnick wrote:
> A couple of comments on the patch itself. I realise it's only a proof of
> concept at the moment, so take as more things to think about when you
> want to tidy it up:
>
> (1) A docstring like """needed to workaround the cgi.parse_sql
> unicode-problem""" is not very fu
Ivan Sagalaev wrote:
> First of all, Gabor, thank you very much for doing this!
>
thanks :)
> gabor wrote:
>> today i experimented a little with the django source code,
>> and here are the results.
>>
>> if you apply a very small patch (65 lines, attached), you can write a view
>> completely in
First of all, Gabor, thank you very much for doing this!
gabor wrote:
> today i experimented a little with the django source code,
> and here are the results.
>
> if you apply a very small patch (65 lines, attached), you can write a view
> completely in unicode.
> means:
> - GET/POST contains uni
Shouldn't the UTF-8 encoding be also defined in all files as described
here: http://www.python.org/dev/peps/pep-0263/ ?
That is using
#!/usr/bin/python
# -*- coding: UTF-8 -*-
at the beginning of python code files.
This works pretty well at least when you need to create new instances
of models
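A small sketch of what the PEP 263 declaration does, in present-day Python 3. It compiles a source file held in memory rather than on disk; the file contents and the variable name are made up for illustration. In the Python 2 of this thread the default source encoding was ASCII, so any non-ASCII literal needed the declaration; Python 3 defaults to UTF-8, so the declaration only matters for other encodings.

```python
# A tiny "file" saved as iso-8859-1. The declaration must sit on the
# first or second line for the compiler to honour it.
source = (
    "# -*- coding: iso-8859-1 -*-\n"
    "name = 'G\u00e1bor'\n"
).encode("iso-8859-1")

namespace = {}
# compile() reads the coding declaration when given bytes, so the
# single byte 0xE1 in the source is interpreted as the letter a-acute.
exec(compile(source, "<pep263-demo>", "exec"), namespace)

name = namespace["name"]
```

The declaration only tells the compiler how to read the file; it does not change how strings are encoded when they leave the program.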
Hey Gabor,
On Wed, 2006-08-09 at 01:03 +0200, gabor wrote:
> today i experimented a little with the django source code,
> and here are the results.
>
> if you apply a very small patch (65 lines, attached), you can write a view
> completely in unicode.
> means:
> - GET/POST contains unicode data
>
today i experimented a little with the django source code,
and here are the results.
if you apply a very small patch (65 lines, attached), you can write a view
completely in unicode.
means:
- GET/POST contains unicode data
- request.META contains unicode data
- you can put unicode text into the Htt