[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-30 Thread Eric V. Smith
Eric V. Smith added the comment: I've created issue40825 for adding a "strict" parameter.

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-29 Thread Terry J. Reedy
Terry J. Reedy added the comment: I make 5 core developers who agree that csv should definitely *not* assume that bytes given to it represent encoded text, reverting to the confusion of Python 1 and 2. (And even if it did, it should not assume that the encoding of the given to it and the enc

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-27 Thread Skip Montanaro
Skip Montanaro added the comment: I would also add that tweaking Python to make this work with no change in Pandas would be a case of the tail wagging the dog. A big tail, but a tail nonetheless.

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-27 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Since Pandas opens the output file, it has control over what encoding is used. It is Pandas' responsibility to decode bytes, or raise an exception, or just ignore the problem if it is a pretty uncommon case. Pandas already has complex code for formatting ou
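The caller-side decoding described in this comment can be sketched as follows. This is a minimal illustration, not Pandas code: the helper name `decode_bytes_fields` is hypothetical, and the choice of UTF-8 as the default is an assumption, since only the caller knows how its bytes were encoded.

```python
import csv
import io


def decode_bytes_fields(row, encoding="utf-8"):
    """Hypothetical helper: decode bytes fields to str, leave other fields alone.

    The encoding is an assumption supplied by the caller; csv cannot know it.
    """
    return [
        field.decode(encoding) if isinstance(field, (bytes, bytearray)) else field
        for field in row
    ]


# An in-memory text buffer stands in for a file opened in text mode.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(decode_bytes_fields([b"A", 1, "text"]))

written = buffer.getvalue()
print(repr(written))
```

A field that is not valid in the assumed encoding makes `bytes.decode` raise `UnicodeDecodeError`, which corresponds to the "raise an exception" option mentioned in the comment.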

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-27 Thread Skip Montanaro
Skip Montanaro added the comment: This likely worked in the past because bytes == str in Python 2.x. This is just a corner case people porting from 2 to 3 need to address in their code. Papering over it so people using Pandas don't have to do the right thing is no reason to make changes. Byt

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-27 Thread Eric V. Smith
Eric V. Smith added the comment:
> Short of outright banning the use of bytes (raise a TypeError), I think the current behaviour is least-worst.
I agree. I'd like to see the TypeError raised for everything that's not a string or number (since the docs say that's what's accepted), but at th
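A strict check of this kind can be approximated today in user code. The sketch below is an assumption about what "strict" could mean, not the behaviour of the csv module or of the parameter proposed in issue40825; `checked_row` is a hypothetical name. It also rejects `None`, which csv.writer normally writes as an empty string.

```python
import csv
import io
import numbers


def checked_row(row):
    """Hypothetical strict check: allow only str and numeric fields."""
    for field in row:
        if not isinstance(field, (str, numbers.Number)):
            raise TypeError(
                f"unsupported field type for CSV output: {type(field).__name__}"
            )
    return row


buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(checked_row(["a", 1, 2.5]))  # accepted

try:
    writer.writerow(checked_row([b"A"]))  # bytes are rejected before writing
except TypeError as exc:
    print("rejected:", exc)
```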

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-26 Thread Steven D'Aprano
Steven D'Aprano added the comment: On further thought, no, I don't think it would be a reasonable feature. User opens the CSV file, probably using the default encoding (UTF-8?) but potentially in anything. They collect some data as bytes. Those bytes could be from any unknown encoding. When

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-26 Thread Steven D'Aprano
Steven D'Aprano added the comment: The csv file object knows the encoding it was opened with, I think? If so, we could add an enhancement that bytes objects are first decoded to str using the same encoding the file was opened with. That seems like a reasonable new feature to me. Everything

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-26 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: It would be confusing. There would be the encoding argument for open() to encode strings to bytes and the encoding argument for csv.writer() to decode bytes to strings. The latter will be ignored in all normal cases (for strings and numbers).

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-26 Thread Rémi Lapeyre
Rémi Lapeyre added the comment: I don't think this would be accepted, but you could propose it on the python-ideas mailing list to get some feedback. -- nosy: +skip.montanaro

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-25 Thread Sidhant Bansal
Sidhant Bansal added the comment: Hi Remi, I understand your concerns with the current approach to resolve this issue. I would like to propose a new/different change to the way `csv.writer` works. I am putting here the diff of how the updated docs (https://docs.python.org/3/library/csv.htm

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-25 Thread Rémi Lapeyre
Rémi Lapeyre added the comment:
> in real-life that b-prefixed string is just not readable by another program in an easy way
If another program opens this CSV file, it will read the string "b'A'" which is what this field actually contains. Everything that is not a number or a string gets c
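The round trip described here can be reproduced with a short sketch. It uses an in-memory `io.StringIO` buffer instead of the "abc.csv" file from the report, which is an assumption made only to keep the example self-contained.

```python
import csv
import io

# Write one row mixing bytes, a number, and a string.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow([b"A", 1, "x"])

# Read the same text back, as another program would.
buffer.seek(0)
row = next(csv.reader(buffer))

print(row)  # every field comes back as str
```

The first field read back is the four-character string `b'A'`, that is, the `str()` form of the bytes object, not the single character `A`.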

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-25 Thread Sidhant Bansal
Sidhant Bansal added the comment: Hi Remi, Currently, code like this: ``` with open("abc.csv", "w", encoding='utf-8') as f: data = [b'\x41'] w = csv.writer(f) w.writerow(data) with open("abc.csv", "r") as f: rows = csv.reader(f) for row in rows: print(row[0]) # pri

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-25 Thread Rémi Lapeyre
Rémi Lapeyre added the comment:
> As an example, if I write character "A" as a byte, i.e. b'A' in a csv file
But you can't write b'A' in a csv file; what you can do is write `b'a'.decode()` or `b'a'.decode('latin1')` or `b'a'.decode('whatever')` but the string representation of a byte string

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-25 Thread Sidhant Bansal
Sidhant Bansal added the comment: Yes, I do recognise that the current doc states that csv only supports strings and numbers. However, the use case here arises when the user wants to write bytes mixed with numbers and strings to a CSV file. Currently, providing bytes to write to a CSV passes

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-25 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: According to the documentation (https://docs.python.org/3/library/csv.html#writer-objects): """ A row must be an iterable of strings or numbers for Writer objects and a dictionary mapping fieldnames to strings or numbers (by passing them through str() fir

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-24 Thread Sidhant Bansal
Change by Sidhant Bansal: -- keywords: +patch pull_requests: +19634 stage: -> patch review pull_request: https://github.com/python/cpython/pull/20371

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-24 Thread Sidhant Bansal
Sidhant Bansal added the comment: The following code
```
import csv
with open("abc.csv", "w") as f:
    data = [b'\xc2a9', b'\xc2a9']
    w = csv.writer(f)
    w.writerow(data)
```
writes "b'\xc2a9',b'\xc2a9'" in "abc.csv", i.e. the b-prefixed byte string instead of the actual bytes. Although on

[issue40762] Writing bytes using CSV module results in b prefixed strings

2020-05-24 Thread Sidhant Bansal
New submission from Sidhant Bansal: The following code
```
import csv
with open("abc.csv", "w") as f:
    data = [b'\xc2a9', b'\xc2a9']
    w = csv.writer(f)
    w.writerow(data)
```
writes "b'\xc2a9',b'\xc2a9'" in "abc.csv", i.e. the b-prefixed byte string instead of the actual bytes. Although