Steven D'Aprano writes:
> Of course you're right, but I have understood the above as being a
> sketch and not real code. (E.g. does "header" really mean the literal
> string "header", or does it stand in for something which is a header?)
> In real code, one would need to have some way of te
On 01/12/2014 04:02 PM, Stephen J. Turnbull wrote:
So when you talk about "we", I suspect you are not the "we" everybody
else is arguing with. In particular, AIUI your use case is not
included in the use cases most of us -- including Steven -- are
thinking about.
Ah, so even in the minority I
Ethan Furman writes:
> > This kind of subtlety is precisely why MAL warned about use of latin1
> > to smuggle bytes.
>
> And why I've been fighting Steven D'Aprano on it.
No, I think you haven't been fighting Steven d'A on "it". You're
talking about parsing and generating structured binary
On Mon, Jan 13, 2014 at 07:31:16AM +0900, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
>
> > then the name is horribly misleading, and it is best handled like this:
> >
> > content = '\n'.join([
> > 'header',
> > 'part 2 %.3f' % number,
> > binary_image_d
On 01/12/2014 02:31 PM, Stephen J. Turnbull wrote:
This corrupts binary_image_data. Each byte > 127 will be replaced by
two bytes. In the second case, you can use latin1 to encode, if it
gives you what you want.
This kind of subtlety is precisely why MAL warned about use of latin1
to smuggle
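
A minimal sketch of the corruption described above, assuming a four-byte stand-in for binary_image_data (the sample values are hypothetical). It decodes the bytes with latin-1 and then compares encoding the result with latin-1 against encoding it with utf-8:

```python
# Stand-in for real image data; the name follows the thread's example.
binary_image_data = bytes([0x00, 0x7F, 0x80, 0xFF])

# Smuggle the bytes into a str: one code point per byte.
smuggled = binary_image_data.decode('latin-1')

# Encoding with latin-1 gives the original bytes back.
round_tripped = smuggled.encode('latin-1')

# Encoding with utf-8 turns each code point above U+007F into two bytes.
corrupted = smuggled.encode('utf-8')

print(round_tripped == binary_image_data)
print(len(binary_image_data), len(corrupted))
```

Only the final encoding step differs between the two results, which is why the choice of codec at that point matters.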
Steven D'Aprano writes:
> then the name is horribly misleading, and it is best handled like this:
>
> content = '\n'.join([
> 'header',
> 'part 2 %.3f' % number,
> binary_image_data.decode('latin-1'),
> utf16_string, # Misleading name, actually Unicode
On Mon, Jan 13, 2014 at 4:57 AM, Juraj Sukop wrote:
> On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano
> wrote:
>> First, "utf16_string" confuses me. What is it? If it is a Unicode
>> string, i.e.:
>
> It is a Unicode string which happens to contain code points outside U+00FF
> (as with the TTF e
On 01/12/2014 01:59 PM, Mark Shannon wrote:
Why not just use six.byte_format(fmt, *args)?
It works on both Python2 and Python3 and accepts the numerical format
specifiers, plus '%b' for inserting bytes and '%a'
for converting text to ascii.
Sounds like the second best option!
Admittedly it
Why not just use six.byte_format(fmt, *args)?
It works on both Python2 and Python3 and accepts the numerical format
specifiers, plus '%b' for inserting bytes and '%a' for converting text
to ascii.
Admittedly it doesn't exist yet,
but it could and it would save a lot of arguing :)
(Apologies t
On 01/12/2014 12:39 PM, Stephen J. Turnbull wrote:
Daniel Holth writes:
> -1 on adding more surrogateescapes by default. It's a pain to track
> down where the encoding errors came from.
What do you mean "by default"? It was quite explicit in the code I
posted, and it's the only reasonable t
Daniel Holth writes:
> -1 on adding more surrogateescapes by default. It's a pain to track
> down where the encoding errors came from.
What do you mean "by default"? It was quite explicit in the code I
posted, and it's the only reasonable thing to do with "text data
without known (but ASCII com
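
A small sketch of the explicit surrogateescape usage being discussed, assuming a made-up header line whose encoding is unknown but ASCII compatible (the sample bytes and variable names are hypothetical, not taken from the code posted in the thread):

```python
# Bytes that are ASCII-compatible text in an unknown encoding (hypothetical sample).
raw = b'Subject: caf\xe9\r\n'

# Undecodable bytes become lone surrogates U+DC80..U+DCFF instead of raising.
text = raw.decode('ascii', errors='surrogateescape')

# The ASCII parts can be manipulated as ordinary text.
field, _, value = text.partition(': ')

# Encoding with the same error handler restores the original bytes exactly.
restored = text.encode('ascii', errors='surrogateescape')

print(field)
print(restored == raw)
```

The error handler has to be named at both the decode and the encode step, so nothing here happens "by default".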
Wait a second, this is how I understood it but what Nick said made me think
otherwise...
On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano wrote:
> On Sun, Jan 12, 2014 at 12:52:18PM +0100, Juraj Sukop wrote:
> > On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano >wrote:
> >
> > Just to check I und
On Sun, Jan 12, 2014 at 11:16:37PM +1000, Nick Coghlan wrote:
> > content = '\n'.join([
> > 'header',
> > 'part 2 %.3f' % number,
> > binary_image_data.decode('latin-1'),
> > utf16_string.encode('utf-16be').decode('latin-1'),
> > 'trailer']).encode('lati
On Sun, Jan 12, 2014 at 12:52:18PM +0100, Juraj Sukop wrote:
> On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano wrote:
>
> > On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote:
> >
> > > AFAIK (and just for the record), there could be both Latin1 text and UTF-16
> > > in a PDF (a
On Sun, Jan 12, 2014 at 2:16 PM, Nick Coghlan wrote:
> Why are you proposing to do the *join* in text space? Encode all the parts
> separately, concatenate them with b'\n'.join() (or whatever separator is
> appropriate). It's only the *text formatting operation* that needs to be
> done in text sp
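
A sketch of the approach suggested above, under the assumption that the parts look like the ones in the earlier example (number, binary_image_data and unicode_text are hypothetical stand-ins). Only the formatting step happens in text space; each part is encoded on its own and the join happens in bytes:

```python
number = 3.14159                     # hypothetical value to format
binary_image_data = b'\x89PNG\xff'   # hypothetical stand-in for image bytes
unicode_text = '\u03a9'              # text with a code point above U+00FF

parts = [
    b'header',
    ('part 2 %.3f' % number).encode('ascii'),  # format as text, then encode
    binary_image_data,                          # already bytes, used as-is
    unicode_text.encode('utf-16-be'),           # each text part gets its own encoding
    b'trailer',
]
content = b'\n'.join(parts)
print(content)
```

No latin-1 round trip is needed here, because the binary data never passes through str.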
On 12 Jan 2014 21:53, "Juraj Sukop" wrote:
>
>
>
>
> On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano wrote:
>>
>> On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote:
>>
>> > AFAIK (and just for the record), there could be both Latin1 text and UTF-16
>> > in a PDF (and other encodin
On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano wrote:
> On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote:
>
> > AFAIK (and just for the record), there could be both Latin1 text and UTF-16
> > in a PDF (and other encodings too), depending on the font used:
> [...]
> > In Python2
On Sun, 12 Jan 2014 17:51:41 +1000, Nick Coghlan wrote:
> On 12 January 2014 04:38, R. David Murray wrote:
> > But! Our goal should be to help people convert to Python3. So how can
> > we find out what the specific problems are that real-world programs are
> > facing, look at the *actual code*,
On 12 January 2014 04:38, R. David Murray wrote:
> But! Our goal should be to help people convert to Python3. So how can
> we find out what the specific problems are that real-world programs are
> facing, look at the *actual code*, and help that project figure out the
> best way to make that cod
On 12 January 2014 02:33, M.-A. Lemburg wrote:
> On 11.01.2014 16:34, Nick Coghlan wrote:
>> While that was an *expedient* (and, in fact, necessary) solution at
>> the time, the fact it is still thoroughly confusing people 13 years
>> later shows it is not a *comprehensible* solution.
>
> FWIW: I
On 01/11/2014 06:29 PM, Steven D'Aprano wrote:
On Sat, Jan 11, 2014 at 11:05:36AM -0800, Ethan Furman wrote:
On 01/11/2014 10:36 AM, Steven D'Aprano wrote:
On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote:
unicode to bytes
bytes to unicode using latin1
unicode to bytes
On 12 Jan 2014 03:29, "Ethan Furman" wrote:
>
> On 01/11/2014 12:43 AM, Nick Coghlan wrote:
>>
>>
>> In particular, the bytes type is, and always will be, designed for
>> pure binary manipulation [...]
>
>
> I apologize for being blunt, but this is a lie.
>
> Let's take a look at the methods define
On Sat, Jan 11, 2014 at 11:05:36AM -0800, Ethan Furman wrote:
> On 01/11/2014 10:36 AM, Steven D'Aprano wrote:
> >On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote:
> >>
> >> unicode to bytes
> >> bytes to unicode using latin1
> >> unicode to bytes
> >
> >Where do you get this from
On 11Jan2014 13:15, Juraj Sukop wrote:
> On Sat, Jan 11, 2014 at 5:14 AM, Cameron Simpson wrote:
> > data = b' '.join( bytify( [ 10, 0, obj, binary_image_data, ... ] ) )
>
> Thanks for the suggestion! The problem with "bytify" is that some items
> might require different formatting than other
On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote:
> AFAIK (and just for the record), there could be both Latin1 text and UTF-16
> in a PDF (and other encodings too), depending on the font used:
[...]
> In Python2, txt is just a str, but in Python3 handling everything as latin1
> st
On Sat, Jan 11, 2014 at 04:28:34PM -0500, Terry Reedy wrote:
> The problem with some criticisms of using 'unicode in Python 3' is that
> there really is no such thing. Unicode in 3.0 to 3.2 used the old
> internal model inherited from 2.x. Unicode in 3.3+ uses a different
> internal model that
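
A rough way to observe the 3.3+ internal model mentioned above, assuming CPython 3.3 or later. The exact sizes are an implementation detail and vary between versions, so the sketch only compares them:

```python
import sys

ascii_text = 'a' * 1000            # code points below U+0080
latin1_text = '\xe9' * 1000        # code points below U+0100
bmp_text = '\u0394' * 1000         # code points below U+10000
astral_text = '\U0001F600' * 1000  # code points above U+FFFF

# On CPython 3.3+ the storage per character grows from 1 to 2 to 4 bytes,
# depending on the widest code point in the string.
sizes = [sys.getsizeof(s) for s in (ascii_text, latin1_text, bmp_text, astral_text)]
print(sizes)
```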
On 2014-01-11, 18:09 GMT, you wrote:
>> We are NOT going back to the confusing incoherent mess that
>> is the Python 2 model of bolting Unicode onto the side of
>> POSIX . . .
>
> We are not asking for that.
Yes, you do. Maybe not you personally, bu
On Sat, Jan 11, 2014 at 07:22:30PM +, MRAB wrote:
> >with open("outfile.pdf", "w", encoding="latin-1") as f:
> > f.write(pdf)
> >
> [snip]
> The second example won't work because you're forgetting about the
> handling of line endings in text mode.
So I did! Thank you for the correction.
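
A sketch of the line-ending issue raised above, assuming a short latin-1-safe string as the pdf content (hypothetical data). Passing newline='\r\n' forces the translation that text mode applies by default on Windows, so the effect shows on any platform; newline='' switches translation off:

```python
import os
import tempfile

pdf = 'line one\nline two\n'  # hypothetical latin-1-safe content

with tempfile.TemporaryDirectory() as tmp:
    translated_path = os.path.join(tmp, 'translated.pdf')
    exact_path = os.path.join(tmp, 'exact.pdf')

    # newline='\r\n' rewrites every '\n' on output, changing the byte count.
    with open(translated_path, 'w', encoding='latin-1', newline='\r\n') as f:
        f.write(pdf)

    # newline='' turns translation off, so '\n' is written as a single byte.
    with open(exact_path, 'w', encoding='latin-1', newline='') as f:
        f.write(pdf)

    with open(translated_path, 'rb') as f:
        translated = f.read()
    with open(exact_path, 'rb') as f:
        exact = f.read()

print(translated)
print(exact)
```

For a format with byte offsets, such as PDF, the extra '\r' bytes in the first file would shift every offset after the first line break.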
On Fri, Jan 10, 2014 at 9:13 PM, Juraj Sukop wrote:
>
>
>
> On Sat, Jan 11, 2014 at 12:49 AM, Antoine Pitrou wrote:
>
>> Also, when you say you've never encountered UTF-16 text in PDFs, it
>> sounds like those people who've never encountered any non-ASCII data in
>> their programs.
>
>
> Let me
On 01/11/2014 12:45 PM, Donald Stufft wrote:
FWIW as one of the people who it took Python3 to finally figure out how to
actually use unicode, it was the absence of encode on bytes and decode on
str that actually did it. Giving bytes a format method would not have affected
that either way I don’t
On Sat, Jan 11, 2014 at 4:28 PM, Terry Reedy wrote:
> On 1/11/2014 1:44 PM, Stephen J. Turnbull wrote:
>
>> We already *have* a type in Python 3.3 that provides text
>> manipulations on arrays of 8-bit objects: str (per PEP 393).
>>
>> > BTW: I don't know why so many people keep asking for use c
On 1/11/2014 1:44 PM, Stephen J. Turnbull wrote:
We already *have* a type in Python 3.3 that provides text
manipulations on arrays of 8-bit objects: str (per PEP 393).
> BTW: I don't know why so many people keep asking for use cases.
> Isn't it obvious that text data without known (but ASCI
On Jan 11, 2014, at 10:34 AM, Nick Coghlan wrote:
> Yes, it bloody well does. The number of people who have told me that
> using Python 3 is what allowed them to finally understand how Unicode
> works vastly exceeds the number of wire protocol and file format devs
> that have complained about wo
On Sat, 11 Jan 2014 11:54:26 -0800, Ethan Furman wrote:
> On 01/11/2014 11:49 AM, Stephen J. Turnbull wrote:
> > MRAB writes:
> >
> > > > with open("outfile.pdf", "w", encoding="latin-1") as f:
> > > > f.write(pdf)
> > > >
> > > [snip]
> > > The second example won't work because you
On 01/11/2014 11:49 AM, Stephen J. Turnbull wrote:
MRAB writes:
> > with open("outfile.pdf", "w", encoding="latin-1") as f:
> > f.write(pdf)
> >
> [snip]
> The second example won't work because you're forgetting about the
> handling of line endings in text mode.
Not so fast! F
MRAB writes:
> > with open("outfile.pdf", "w", encoding="latin-1") as f:
> > f.write(pdf)
> >
> [snip]
> The second example won't work because you're forgetting about the
> handling of line endings in text mode.
Not so fast! Forgot, yes (me too!), but not work? Not quite:
with o
On 01/11/2014 10:36 AM, Steven D'Aprano wrote:
On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote:
unicode to bytes
bytes to unicode using latin1
unicode to bytes
Where do you get this from? I don't follow your logic. Start with a text
template:
template = """\xDE\xAD\xBE\
On 2014-01-11 05:36, Steven D'Aprano wrote:
[snip]
Latin-1 has the nice property that every byte decodes into the character
with the same code point, and vice versa. So:
for i in range(256):
    assert bytes([i]).decode('latin-1') == chr(i)
    assert chr(i).encode('latin-1') == bytes([i])
pa
On Sat, Jan 11, 2014 at 05:33:17PM +0100, M.-A. Lemburg wrote:
> FWIW: I quite liked the Python 2 model, but perhaps that's because
> I already knew how Unicode works, so could use it to make my
> life easier ;-)
/incredulous
I would really love to see you justify that claim. How do you use the
On Sat, Jan 11, 2014 at 04:15:35PM +0100, M.-A. Lemburg wrote:
> I think we need to step back a little from the purist view
> of things and give more emphasis on the "practicality beats
> purity" Zen.
>
> I completely agree with Stephen that bytes are in fact often
> an encoding of text. If that t
M.-A. Lemburg writes:
> I completely agree with Stephen that bytes are in fact often
> an encoding of text. If that text is ASCII compatible, I don't
> see any reason why we should not continue to expose the C lib
> standard string APIs available for text manipulations on bytes.
We already *ha
tl;dr: At the end I'm volunteering to look at real code that is having
porting problems.
On Sat, 11 Jan 2014 17:33:17 +0100, "M.-A. Lemburg" wrote:
> asciistr is interesting in that it coerces to bytes instead
> of to Unicode (as is the case in Python 2).
>
> At the moment it doesn't cover the m
On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote:
> On 01/11/2014 07:38 AM, Steven D'Aprano wrote:
> >
> >The point that I am making is that many people want to add formatting
> >operations to bytes so they can put ASCII strings inside bytes. But (as
> >far as I can tell) they don't nee
On 01/11/2014 07:34 AM, Nick Coghlan wrote:
On 12 January 2014 01:15, M.-A. Lemburg wrote:
We don't have to be pedantic about the bytes/text separation.
It doesn't help in real life.
Yes, it bloody well does. The number of people who have told me that
using Python 3 is what allowed them to fi
On 01/11/2014 12:43 AM, Nick Coghlan wrote:
In particular, the bytes type is, and always will be, designed for
pure binary manipulation [...]
I apologize for being blunt, but this is a lie.
Let's take a look at the methods defined by bytes:
dir(b'')
['__add__', '__class__', '__contains__', '
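
A few of those methods in action, as a sketch with a made-up header value (the sample data is hypothetical). The text-like methods on bytes treat only the ASCII range as text and leave every other byte alone:

```python
data = b'Content-Type: text/plain; caf\xe9'

upper = data.upper()                    # only ASCII letters a-z are changed
name, _, value = data.partition(b': ')  # splitting works on byte separators
starts = data.startswith(b'Content')
title = b'hello world'.title()

print(upper)
print(name, starts, title)
```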
On 2014-01-11, 10:56 GMT, you wrote:
> I don't know what the fuss is about.
I just cannot resist:
When you are calm while everybody else is in the state of
panic, you haven’t understood the problem.
-- one of many collections of Murphy’
On 11.01.2014 16:34, Nick Coghlan wrote:
> On 12 January 2014 01:15, M.-A. Lemburg wrote:
>> On 11.01.2014 14:54, Georg Brandl wrote:
>>> On 11.01.2014 14:49, Georg Brandl wrote:
On 11.01.2014 10:44, Stephen Hansen wrote:
> I mean, it's not like the "bytes" type lacks knowledge of
On 01/11/2014 07:38 AM, Steven D'Aprano wrote:
The point that I am making is that many people want to add formatting
operations to bytes so they can put ASCII strings inside bytes. But (as
far as I can tell) they don't need to do this, because they can treat
Unicode strings containing code point
On Sun, 12 Jan 2014 01:34:26 +1000
Nick Coghlan wrote:
>
> Yes, it bloody well does. The number of people who have told me that
> using Python 3 is what allowed them to finally understand how Unicode
> works vastly exceeds the number of wire protocol and file format devs
> that have complained ab
On Sat, Jan 11, 2014 at 01:56:56PM +0100, Juraj Sukop wrote:
> On Sat, Jan 11, 2014 at 6:36 AM, Steven D'Aprano wrote:
> > If you consider PDF as binary with occasional pieces of ASCII text, then
> > working with bytes makes sense. But I wonder whether it might be better
> > to consider PDF as mos
On 12 January 2014 01:15, M.-A. Lemburg wrote:
> On 11.01.2014 14:54, Georg Brandl wrote:
>> On 11.01.2014 14:49, Georg Brandl wrote:
>>> On 11.01.2014 10:44, Stephen Hansen wrote:
>>>
I mean, it's not like the "bytes" type lacks knowledge of the subset of bytes
that happen to b
On Sat, 11 Jan 2014 08:26:57 +0100
Georg Brandl wrote:
> On 11.01.2014 03:04, Antoine Pitrou wrote:
> > On Fri, 10 Jan 2014 20:53:09 -0500
> > "Eric V. Smith" wrote:
> >>
> >> So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
> >> 3982. See for example http://bugs.python.or
On 11.01.2014 14:54, Georg Brandl wrote:
> On 11.01.2014 14:49, Georg Brandl wrote:
>> On 11.01.2014 10:44, Stephen Hansen wrote:
>>
>>> I mean, it's not like the "bytes" type lacks knowledge of the subset of bytes
>>> that happen to be 7-bit ascii-compatible and can't perform text-ish
>>> oper
On 11.01.2014 14:49, Georg Brandl wrote:
> On 11.01.2014 10:44, Stephen Hansen wrote:
>
>> I mean, it's not like the "bytes" type lacks knowledge of the subset of bytes
>> that happen to be 7-bit ascii-compatible and can't perform text-ish
>> operations on them--
>>
>> Python 3.3.3 (v3.3
On 11.01.2014 10:44, Stephen Hansen wrote:
> I mean, it's not like the "bytes" type lacks knowledge of the subset of bytes
> that happen to be 7-bit ascii-compatible and can't perform text-ish operations
> on them--
>
> Python 3.3.3 (v3.3.3:c3896275c0f6, Nov 18 2013, 21:18:40) [MSC v.1600 32 b
On 11.01.2014 09:43, Nick Coghlan wrote:
> On 11 January 2014 12:28, Ethan Furman wrote:
>> On 01/10/2014 06:04 PM, Antoine Pitrou wrote:
>>>
>>> On Fri, 10 Jan 2014 20:53:09 -0500
>>> "Eric V. Smith" wrote:
So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
>>
On Sat, Jan 11, 2014 at 6:36 AM, Steven D'Aprano wrote:
>
> I'm sorry, I don't understand what you mean here. I'm honestly not
> trying to be difficult, but you sound confident that you understand what
> you are doing, but your description doesn't make sense to me. To me, it
> looks like you are c
On Sat, Jan 11, 2014 at 5:14 AM, Cameron Simpson wrote:
>
> Hi Juraj,
>
Hello Cameron.
> data = b' '.join( bytify( [ 10, 0, obj, binary_image_data, ... ] ) )
>
Thanks for the suggestion! The problem with "bytify" is that some items
might require different formatting than other items. For ex
ting adults.
-Original Message-
From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames@python.org]
On Behalf Of Nick Coghlan
Sent: 11 January 2014 08:43
To: Ethan Furman
Cc: python-dev@python.org
Subject: Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args)
to
On 1/11/2014 1:44 AM, Stephen Hansen wrote:
There's been a number of examples given: PDF, HTTP, network streams
that switch inline from text-ish to binary and back-again.. But, we
can focus that down to a very narrow and not at all uncommon situation
in the latter.
PDF has been mentioned a fe
For not caring much, your own stubbornness is quite notable throughout this
discussion. Stones and glass houses. :)
That said:
Twisted and Mercurial aren't the only ones who are hurt by this, at all.
I'm aware of at least two other projects who are actively hindered in their
support or migration
On 11 January 2014 12:28, Ethan Furman wrote:
> On 01/10/2014 06:04 PM, Antoine Pitrou wrote:
>>
>> On Fri, 10 Jan 2014 20:53:09 -0500
>> "Eric V. Smith" wrote:
>>>
>>>
>>> So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
>>> 3982. See for example http://bugs.python.org/issue
On 11 January 2014 08:58, Ethan Furman wrote:
> On 01/10/2014 02:42 PM, Antoine Pitrou wrote:
>>
>> On Fri, 10 Jan 2014 17:33:57 -0500
>> "Eric V. Smith" wrote:
>>>
>>> On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 12:56:19 -0500
"Eric V. Smith" wrote:
>
On 11.01.2014 03:04, Antoine Pitrou wrote:
> On Fri, 10 Jan 2014 20:53:09 -0500
> "Eric V. Smith" wrote:
>>
>> So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
>> 3982. See for example http://bugs.python.org/issue3982#msg180432 .
I agree.
> Then we might as well not do an
On Fri, Jan 10, 2014 at 06:17:02PM +0100, Juraj Sukop wrote:
> As you may know, PDF operates over bytes and an integer or floating-point
> number is written down as-is, for example "100" or "1.23".
I'm sorry, I don't understand what you mean here. I'm honestly not
trying to be difficult, but you
On 11Jan2014 00:43, Juraj Sukop wrote:
> On Fri, Jan 10, 2014 at 11:12 PM, Victor Stinner
> wrote:
> > Why not build "10 0 obj ... stream" and "endstream endobj" in
> > Unicode and then encode to ASCII? Example:
> >
> > data = b''.join((
> > ("%d %d obj ... stream" % (10, 0)).encode('ascii')
To avoid implicit conversion between str and bytes, I propose adding only
limited %-format,
not .format() or .format_map().
"limited %-format" means:
%c accepts an integer or a bytes object of length one.
%r is not supported.
%s accepts only bytes.
%a is the only format that accepts an arbitrary object.
And other fo
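
For comparison, a minimal sketch of how bytes % args behaves in Python 3.5 and later, where PEP 461 added it. This shows the behaviour that eventually shipped, which is an assumption relative to the proposal quoted above and may differ from it in detail (for instance, %s there also accepts objects that define __bytes__):

```python
# Requires Python 3.5 or later; earlier 3.x versions raise TypeError on bytes % args.
from_int = b'%c' % 65            # integer in range(256)
from_byte = b'%c' % b'A'         # bytes of length one
from_bytes = b'%s' % b'abc'      # bytes are inserted unchanged
from_repr = b'%a' % 'caf\xe9'    # ascii() of any object, encoded to ASCII
from_number = b'%d %.2f' % (10, 1.5)

try:
    b'%s' % 'text'               # str is rejected: no implicit encoding
except TypeError:
    str_rejected = True
else:
    str_rejected = False

print(from_int, from_byte, from_bytes, from_repr, from_number, str_rejected)
```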
On 01/10/2014 06:39 PM, Antoine Pitrou wrote:
I know what a network protocol with ill-defined encodings
looks like.
For the record, I've been (and I suspect Eric and some others have also been) talking about well-defined encodings. For
the DBF files that I work with, there is binary, ASCII,
On 01/10/2014 06:39 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 18:28:41 -0800
Ethan Furman wrote:
Is it safe to assume you don't use Python for the use-cases under discussion?
You know, I've done quite a bit of network programming.
No, I didn't, that's why I asked.
I've also done an ex
On Fri, 10 Jan 2014 18:28:41 -0800
Ethan Furman wrote:
>
> Is it safe to assume you don't use Python for the use-cases under discussion?
You know, I've done quite a bit of network programming. I've also done
an experimental port of Twisted to Python 3. I know what a network
protocol with ill-def
On 01/10/2014 06:04 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 20:53:09 -0500
"Eric V. Smith" wrote:
So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
3982. See for example http://bugs.python.org/issue3982#msg180432 .
Then we might as well not do anything, since any at
On Fri, 10 Jan 2014 20:53:09 -0500
"Eric V. Smith" wrote:
>
> So, I'm -1 on the PEP. It doesn't address the cases laid out in issue
> 3982. See for example http://bugs.python.org/issue3982#msg180432 .
Then we might as well not do anything, since any attempt to advance
things is met by stubborn o
On 1/10/2014 8:12 PM, Antoine Pitrou wrote:
> On Fri, 10 Jan 2014 16:23:53 -0800
> Ethan Furman wrote:
>> On 01/08/2014 02:42 PM, Antoine Pitrou wrote:
>>>
>>> With Victor's consent, I overhauled PEP 460 and made the feature set
>>> more restricted and consistent with the bytes/str separation.
>>
On Fri, 10 Jan 2014 16:23:53 -0800
Ethan Furman wrote:
> On 01/08/2014 02:42 PM, Antoine Pitrou wrote:
> >
> > With Victor's consent, I overhauled PEP 460 and made the feature set
> > more restricted and consistent with the bytes/str separation.
>
> From the PEP:
> =
> > Python 3 gen
On 01/08/2014 02:42 PM, Antoine Pitrou wrote:
With Victor's consent, I overhauled PEP 460 and made the feature set
more restricted and consistent with the bytes/str separation.
From the PEP:
=
Python 3 generally mandates that text be stored and manipulated as
unicode (i.e. str ob
On Sat, Jan 11, 2014 at 12:49 AM, Antoine Pitrou wrote:
> Also, when you say you've never encountered UTF-16 text in PDFs, it
> sounds like those people who've never encountered any non-ASCII data in
> their programs.
Let me clarify: one does not think in "writing text in Unicode"-terms in
PDF.
On Fri, Jan 10, 2014 at 3:40 PM, Juraj Sukop wrote:
> What this all means is that the PDF objects are expressed in ASCII,
> "stream" objects like images and fonts may have a binary part and I never
> saw those UTF-16 strings.
>
hmm -- I wonder if they are out there in the wild, though
> u
On Sat, 11 Jan 2014 00:43:39 +0100
Juraj Sukop wrote:
> Basically, to ".encode('ascii')" every possible
> number is not exactly simple or pretty.
Well it strikes me that the PDF format itself is not exactly simple or
pretty. It might be convenient that Python 2 allows you, in certain
cases, to "i
On Fri, Jan 10, 2014 at 11:12 PM, Victor Stinner
wrote:
>
> Why not build "10 0 obj ... stream" and "endstream endobj" in
> Unicode and then encode to ASCII? Example:
>
> data = b''.join((
> ("%d %d obj ... stream" % (10, 0)).encode('ascii'),
> binary_image_data,
> ("endstream endobj").e
On Fri, Jan 10, 2014 at 10:52 PM, Chris Barker wrote:
> On Fri, Jan 10, 2014 at 9:17 AM, Juraj Sukop wrote:
>
>> As you may know, PDF operates over bytes and an integer or floating-point
>> number is written down as-is, for example "100" or "1.23".
>>
>
> Just to be clear here -- is PDF specifical
On Fri, 10 Jan 2014 18:14:45 -0500
"Eric V. Smith" wrote:
>
> >> Because embedding the ASCII equivalent of ints and floats in byte streams
> >> is a common operation?
> >
> > Again, if you're representing "ASCII", you're representing text and
> > should use a str object.
>
> Yes, but is there e
On 1/10/2014 6:02 PM, Antoine Pitrou wrote:
> On Fri, 10 Jan 2014 14:58:15 -0800
> Ethan Furman wrote:
>> On 01/10/2014 02:42 PM, Antoine Pitrou wrote:
>>> On Fri, 10 Jan 2014 17:33:57 -0500
>>> "Eric V. Smith" wrote:
On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
> On Fri, 10 Jan 2014 12:5
On Fri, 10 Jan 2014 14:58:15 -0800
Ethan Furman wrote:
> On 01/10/2014 02:42 PM, Antoine Pitrou wrote:
> > On Fri, 10 Jan 2014 17:33:57 -0500
> > "Eric V. Smith" wrote:
> >> On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
> >>> On Fri, 10 Jan 2014 12:56:19 -0500
> >>> "Eric V. Smith" wrote:
>
>
On 01/10/2014 02:42 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 17:33:57 -0500
"Eric V. Smith" wrote:
On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
On Fri, 10 Jan 2014 12:56:19 -0500
"Eric V. Smith" wrote:
I agree. I don't see any reason to exclude int and float. See Guido's
messages http:/
On Fri, 10 Jan 2014 17:33:57 -0500
"Eric V. Smith" wrote:
> On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
> > On Fri, 10 Jan 2014 12:56:19 -0500
> > "Eric V. Smith" wrote:
> >>
> >> I agree. I don't see any reason to exclude int and float. See Guido's
> >> messages http://bugs.python.org/issue3982#
On Fri, 10 Jan 2014 17:20:32 -0500
"Eric V. Smith" wrote:
>
> Isn't the point of the PEP to make it easier to port 2.x code to 3.5?
> Is there really existing code like this in 2.x?
No, but so what? The point of the PEP is not to allow arbitrary
Python 2 code to run without modification under
On 1/10/2014 5:29 PM, Antoine Pitrou wrote:
> On Fri, 10 Jan 2014 12:56:19 -0500
> "Eric V. Smith" wrote:
>>
>> I agree. I don't see any reason to exclude int and float. See Guido's
>> messages http://bugs.python.org/issue3982#msg180423 and
>> http://bugs.python.org/issue3982#msg180430 for some ju
On Fri, 10 Jan 2014 12:56:19 -0500
"Eric V. Smith" wrote:
>
> I agree. I don't see any reason to exclude int and float. See Guido's
> messages http://bugs.python.org/issue3982#msg180423 and
> http://bugs.python.org/issue3982#msg180430 for some justification and
> discussion.
If you are represent
On 1/10/2014 5:12 PM, Victor Stinner wrote:
> 2014/1/10 Juraj Sukop :
>> In the case of PDF, the embedding of an image into PDF looks like:
>>
>> 10 0 obj
>> << /Type /XObject
>> /Width 100
>> /Height 100
>> /Alternates 15 0 R
>> /Length 2167
>> >
2014/1/10 Juraj Sukop :
> In the case of PDF, the embedding of an image into PDF looks like:
>
> 10 0 obj
> << /Type /XObject
> /Width 100
> /Height 100
> /Alternates 15 0 R
> /Length 2167
> >>
> stream
> ...binary image data...
> ends
On Fri, Jan 10, 2014 at 9:17 AM, Juraj Sukop wrote:
> As you may know, PDF operates over bytes and an integer or floating-point
> number is written down as-is, for example "100" or "1.23".
>
Just to be clear here -- is PDF specifically bytes+ascii?
Or could there be some-other-encoding unicode
On 10.01.2014 18:56, Eric V. Smith wrote:
> On 1/10/2014 12:17 PM, Juraj Sukop wrote:
>> (Sorry if this messes up the thread order, it is meant as a reply to the
>> original RFC.)
>>
>> Dear list,
>>
>> newbie here. After much hesitation I decided to put forward a use case
>> which bothers me a
On 06/01/2014 13:24, Victor Stinner wrote:
Hi,
bytes % args and bytes.format(args) are requested by Mercurial and
Twisted projects. The issue #3982 was stuck because nobody proposed a
complete definition of the "new" features. Here is a try as a PEP.
Apologies if this has already been said, b
On 1/10/2014 12:17 PM, Juraj Sukop wrote:
> (Sorry if this messes up the thread order, it is meant as a reply to the
> original RFC.)
>
> Dear list,
>
> newbie here. After much hesitation I decided to put forward a use case
> which bothers me about the current proposal. Disclaimer: I happen to
>
(Sorry if this messes up the thread order, it is meant as a reply to the
original RFC.)
Dear list,
newbie here. After much hesitation I decided to put forward a use case
which bothers me about the current proposal. Disclaimer: I happen to write
a library which is directly influenced by this.
As
On Fri, 10 Jan 2014 11:32:05 +1000
Nick Coghlan wrote:
> >
> > It's consistent with bytearray.join's behaviour:
> >
> > >>> x = bytearray()
> > >>> x.join([b"abc"])
> > bytearray(b'abc')
> > >>> x
> > bytearray(b'')
>
> Yeah, I guess I'm OK with us being consistent on that one. It's still
> weird
On 10 Jan 2014 03:32, "Antoine Pitrou" wrote:
>
> On Fri, 10 Jan 2014 05:26:04 +1000
> Nick Coghlan wrote:
> >
> > We should probably include format_map for consistency with the str API.
>
> Yes, you're right.
>
> > >However, I
> > > also added bytearray into the mix, as bytearray objects should
On Fri, 10 Jan 2014 05:26:04 +1000
Nick Coghlan wrote:
>
> We should probably include format_map for consistency with the str API.
Yes, you're right.
> >However, I
> > also added bytearray into the mix, as bytearray objects should
> > generally support the same operations as bytes (and they can
On 9 Jan 2014 06:43, "Antoine Pitrou" wrote:
>
>
> Hi,
>
> With Victor's consent, I overhauled PEP 460 and made the feature set
> more restricted and consistent with the bytes/str separation.
+1
I was initially dubious about the idea, but the proposed semantics look
good to me.
We should probab