It looks like both recfromtxt and loadtxt are flexible enough to
handle string/bytes en/decoding, with a bit of work and enough
information:
>>> dtype=[('f0', [clip]
>>> data = numpy.recfromtxt(open('Õscar_3.txt',"rb"), dtype=dtype,
...     delimiter=',',converters={3:lambda x: x.decode('utf8')})
>>>
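The idea in the call above (read the file as bytes, then decode only the text column) can be sketched without numpy. This is a minimal illustration, not what recfromtxt does internally; the sample rows, the file name and the column layout are assumptions made up for the example.

```python
import os
import tempfile

# Hypothetical sample data: three numeric columns and one UTF-8 text column.
rows = ["1,2.5,3,Õscar", "4,5.5,6,Zoë"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf8") as fh:
        fh.write("\n".join(rows) + "\n")

    records = []
    # Read raw bytes, the same way open(..., "rb") hands them to the parser.
    with open(path, "rb") as fh:
        for line in fh:
            fields = line.rstrip(b"\r\n").split(b",")
            # int() and float() accept ASCII bytes directly; only the text
            # column (index 3) needs an explicit decode, like the converter.
            records.append((int(fields[0]), float(fields[1]),
                            int(fields[2]), fields[3].decode("utf8")))

print(records)
```

The explicit decode on column 3 plays the role of the `converters={3: ...}` argument: the caller, not the parser, states which encoding the text column uses.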
On Fri, Jan 17, 2014 at 5:30 PM, Chris Barker wrote:
> Folks,
>
> I've been blathering away on the related threads a lot -- sorry if it's
> too much. It's gotten a bit tangled up, so I thought I'd start a new one to
> address this one question (i.e. don't bring up genfromtxt here):
>
> Would it b
On Fri, Jan 17, 2014 at 4:43 PM, wrote:
> On Fri, Jan 17, 2014 at 4:20 PM, Chris Barker
> wrote:
> > On Fri, Jan 17, 2014 at 12:36 PM, wrote:
> >>
> >> > ('S' ?) -- which is probably not what you want particularly if you
> >> > specify
> >> > an encoding. Though I can't figure out at the moment
Folks,
I've been blathering away on the related threads a lot -- sorry if it's too
much. It's gotten a bit tangled up, so I thought I'd start a new one to
address this one question (i.e. don't bring up genfromtxt here):
Would it be a good thing for numpy to have a one-byte-per-character string
t
On Fri, Jan 17, 2014 at 1:43 PM, wrote:
> > 2) Either:
> > a) open as a binary file and use bytes for anything that doesn't
> parse
> > as text -- this means that the user will need to do the conversion to
> text
> > themselves
> >
> > b) decode as latin-1: this would work well for ascii an
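Option (b) relies on a property of latin-1 that is easy to check: every byte value maps to exactly one code point, so decoding never fails and re-encoding returns the original bytes. The sketch below uses only the standard library; the sample string is an assumption chosen for illustration.

```python
# Every possible byte value decodes under latin-1, and the round trip is exact.
raw = bytes(range(256))
text = raw.decode("latin-1")
round_trip_ok = text.encode("latin-1") == raw

# UTF-8 data decoded as latin-1 shows the wrong characters but raises no error,
# and the original text is still recoverable by reversing the step.
utf8_bytes = "Õscar".encode("utf8")
mojibake = utf8_bytes.decode("latin-1")
recovered = mojibake.encode("latin-1").decode("utf8")

print(round_trip_ok, len(utf8_bytes), len(mojibake), recovered)
```

The trade-off is that non-latin-1 content is displayed incorrectly until someone re-encodes and decodes it with the right codec, but no data is lost along the way.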
On Fri, Jan 17, 2014 at 4:20 PM, Chris Barker wrote:
> On Fri, Jan 17, 2014 at 12:36 PM, wrote:
>>
>> > ('S' ?) -- which is probably not what you want particularly if you
>> > specify
>> > an encoding. Though I can't figure out at the moment why the previous
>> > one
>> > failed -- where did the
On Fri, Jan 17, 2014 at 12:36 PM, wrote:
> > ('S' ?) -- which is probably not what you want particularly if you
> specify
> > an encoding. Though I can't figure out at the moment why the previous one
> > failed -- where did the bytes object come from when the encoding was
> > specified?
>
> Yes,
Small note:
> Being an English speaker I don't normally use non-ascii characters in
> filenames but my system (Ubuntu Linux) still uses utf-8 rather than
> latin-1 or
> (and rightly so!).
just to be really clear -- encoding for filenames and encoding for
file content have nothing to do with each-o
On Fri, Jan 17, 2014 at 3:17 PM, Chris Barker wrote:
> >>> numpy.recfromtxt(open('Õscar_3.txt',"r", encoding='utf8'),
> delimiter=',')
>>
>> Traceback (most recent call last):
>> File "", line 1, in
>> numpy.recfromtxt(open('Õscar_3.txt',"r", encoding='utf8'),
>> delimiter=',')
>> File "
On Fri, Jan 17, 2014 at 5:18 AM, Freddie Witherden wrote:
> In terms of HDF5 it is interesting to look at how h5py -- which has to
> go between NumPy types and HDF5 conventions -- handles the problem as
> described here:
>
> http://www.h5py.org/docs/topics/strings.html
from that:
"""All strin
>>> numpy.recfromtxt(open('Õscar_3.txt',"r", encoding='utf8'),
delimiter=',')
> Traceback (most recent call last):
> File "", line 1, in
> numpy.recfromtxt(open('Õscar_3.txt',"r", encoding='utf8'),
> delimiter=',')
> File "C:\Programs\Python33\lib\site-packages\numpy\lib\npyio.py",
> lin
On Fri, Jan 17, 2014 at 1:38 AM, Julian Taylor <
jtaylor.deb...@googlemail.com> wrote:
>
> This thread is getting a little out of hand which is my fault for
> initially mixing different topics in one mail,
>
still a bit mixed ;-) -- but I think the loadtxt issue requires a lot
less discussion,
On Fri, Jan 17, 2014 at 2:18 PM, Julian Taylor
wrote:
> On 17.01.2014 15:12, Julian Taylor wrote:
>> On Fri, Jan 17, 2014 at 2:40 PM, Oscar Benjamin
>> <oscar.j.benja...@gmail.com> wrote:
>>
>> On Fri, Jan 17, 2014 at 02:10:19PM +0100, Julian Taylor wrote:
>> > On Fri, Jan 17, 2014
On 17.01.2014 15:12, Julian Taylor wrote:
> On Fri, Jan 17, 2014 at 2:40 PM, Oscar Benjamin
> <oscar.j.benja...@gmail.com> wrote:
>
> On Fri, Jan 17, 2014 at 02:10:19PM +0100, Julian Taylor wrote:
> > On Fri, Jan 17, 2014 at 1:44 PM, Oscar Benjamin
> > mailto:oscar.j.benja...@gm
17.01.2014 15:09, Aldcroft, Thomas wrote:
[clip]
> I've been playing around with porting a stack of analysis libraries
> to Python 3 and this is a very timely thread and comment. What I
> discovered right away is that all the string data coming from
> binary HDF5 files show up (as expected) as
On Fri, Jan 17, 2014 at 10:58:25AM -0500, josef.p...@gmail.com wrote:
> On Fri, Jan 17, 2014 at 10:26 AM, Oscar Benjamin
> wrote:
> > On Fri, Jan 17, 2014 at 03:12:32PM +0100, Julian Taylor wrote:
> >
> > You don't show how you created the file. I think that in your case the
> > content of 'filena
On Fri, Jan 17, 2014 at 10:26 AM, Oscar Benjamin
wrote:
> On Fri, Jan 17, 2014 at 03:12:32PM +0100, Julian Taylor wrote:
>> On Fri, Jan 17, 2014 at 2:40 PM, Oscar Benjamin
>> wrote:
>>
>> > On Fri, Jan 17, 2014 at 02:10:19PM +0100, Julian Taylor wrote:
>> > >
>> > > no, the right solution is to ad
On Fri, Jan 17, 2014 at 03:12:32PM +0100, Julian Taylor wrote:
> On Fri, Jan 17, 2014 at 2:40 PM, Oscar Benjamin
> wrote:
>
> > On Fri, Jan 17, 2014 at 02:10:19PM +0100, Julian Taylor wrote:
> > >
> > > no, the right solution is to add an encoding argument.
> > > It's a 4 line patch for python2 and
On Fri, Jan 17, 2014 at 2:40 PM, Oscar Benjamin
wrote:
> On Fri, Jan 17, 2014 at 02:10:19PM +0100, Julian Taylor wrote:
> > On Fri, Jan 17, 2014 at 1:44 PM, Oscar Benjamin
> > wrote:
> >
> > > On Fri, Jan 17, 2014 at 10:59:27AM +, Pauli Virtanen wrote:
> > > > Julian Taylor googlemail.com> wr
On Fri, Jan 17, 2014 at 8:40 AM, Oscar Benjamin
wrote:
> On Fri, Jan 17, 2014 at 02:10:19PM +0100, Julian Taylor wrote:
>> On Fri, Jan 17, 2014 at 1:44 PM, Oscar Benjamin
>> wrote:
>>
>> > On Fri, Jan 17, 2014 at 10:59:27AM +, Pauli Virtanen wrote:
>> > > Julian Taylor googlemail.com> writes:
On Fri, Jan 17, 2014 at 02:10:19PM +0100, Julian Taylor wrote:
> On Fri, Jan 17, 2014 at 1:44 PM, Oscar Benjamin
> wrote:
>
> > On Fri, Jan 17, 2014 at 10:59:27AM +, Pauli Virtanen wrote:
> > > Julian Taylor googlemail.com> writes:
> > > [clip]
> >
>
> > > > For backward compatibility we *ca
On Fri, Jan 17, 2014 at 2:10 PM, Julian Taylor <
jtaylor.deb...@googlemail.com> wrote:
> On Fri, Jan 17, 2014 at 1:44 PM, Oscar Benjamin <
> oscar.j.benja...@gmail.com> wrote:...
> ...
> No latin1 de/encoding is required for anything, I don't know why you would
> want to do that in this context.
>
On 17/01/14 13:09, Aldcroft, Thomas wrote:
> I've been playing around with porting a stack of analysis libraries to
> Python 3 and this is a very timely thread and comment. What I
> discovered right away is that all the string data coming from binary
> HDF5 files show up (as expected) as 'S' type,
On Fri, Jan 17, 2014 at 1:44 PM, Oscar Benjamin
wrote:
> On Fri, Jan 17, 2014 at 10:59:27AM +, Pauli Virtanen wrote:
> > Julian Taylor googlemail.com> writes:
> > [clip]
>
> > > For backward compatibility we *cannot* change S.
>
> Do you mean to say that loadtxt cannot be changed from decodi
On Fri, Jan 17, 2014 at 5:59 AM, Pauli Virtanen wrote:
> Julian Taylor googlemail.com> writes:
> [clip]
> > - inconvenience in dealing with strings in python 3.
> >
> > bytes are not strings in python3 which means ascii data is either a byte
> > array which can be inconvenient to deal with or 4
On Fri, Jan 17, 2014 at 10:59:27AM +, Pauli Virtanen wrote:
> Julian Taylor googlemail.com> writes:
> [clip]
> > - inconvenience in dealing with strings in python 3.
> >
> > bytes are not strings in python3 which means ascii data is either a byte
> > array which can be inconvenient to deal wi
On Fri, Jan 17, 2014 at 5:59 AM, Pauli Virtanen wrote:
> Julian Taylor googlemail.com> writes:
> [clip]
>> - inconvenience in dealing with strings in python 3.
>>
>> bytes are not strings in python3 which means ascii data is either a byte
>> array which can be inconvenient to deal with or 4 byte
Julian Taylor googlemail.com> writes:
[clip]
> For backward compatibility we *cannot* change S.
> Maybe we could change the meaning of 'a' but it would be safer
> to add a new dtype, possibly 'S' can be deprecated in favor
> of 'B' when we have a specific encoding dtype.
Note that the rename 'S'
Julian Taylor googlemail.com> writes:
[clip]
> - inconvenience in dealing with strings in python 3.
>
> bytes are not strings in python3 which means ascii data is either a byte
> array which can be inconvenient to deal with or 4 byte unicode which
> wastes space.
>
> A proposal to fix this would
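The space cost mentioned in the quote can be made concrete with the standard library alone. This is only a sketch of the size argument: it compares a one-byte-per-character encoding with a fixed four-byte one (UTF-32, the same width numpy's 'U' dtype uses per character); the sample string is an arbitrary assumption.

```python
text = "numpy"

# One byte per character, as an ASCII/latin-1 byte string would store it.
one_byte = text.encode("latin-1")

# Four bytes per character, the fixed width of a UCS-4/UTF-32 representation.
# The "-le" variant is used so that no byte-order mark is prepended.
four_byte = text.encode("utf-32-le")

ratio = len(four_byte) // len(one_byte)
print(len(one_byte), len(four_byte), ratio)
```

For pure ASCII data the fixed-width unicode form is four times larger, which is the overhead the proposal in this thread is trying to avoid.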
This thread is getting a little out of hand which is my fault for initially
mixing different topics in one mail, so let me try to summarize:
We have three issues here:
- a loadtxt bug when loading strings in python3
this has nothing to do with encodings or dtypes it is a bug that should be
fixed.
On 2014-01-17 00:28, Stephan Hoyer wrote:
> There was a discussion last year about slicing along specified axes in
> numpy arrays:
> http://mail.scipy.org/pipermail/numpy-discussion/2012-April/061632.html
> [1]
>
> I'm finding that slicing along specified axes is a common task for me
> when writin
31 matches