Hi,
I use the following code to process TSV input.
$ printf '%s\t%s\n' {1..10} | ./main.py
['1', '2']
['3', '4']
['5', '6']
['7', '8']
['9', '10']
$ cat main.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:
import sys
for line in sys.stdin:
fields=line.rstrip('\n').split('\t')
print fields
But I am not sure it will process utf-8 input correctly. Thus, I come
up with this code. However, I am not sure if this is really necessary
as my impression is that utf-8 character should not contain the ascii
code for TAB. Is it so? Thanks.
$ cat main1.py
#!/usr/bin/env python
# vim: set noexpandtab tabstop=2 shiftwidth=2 softtabstop=-1 fileencoding=utf-8:
import sys
for line in sys.stdin:
#fields=line.rstrip('\n').split('\t')
fields=line.rstrip('\n').decode('utf-8').split('\t')
print [x.encode('utf-8') for x in fields]
$ printf '%s\t%s\n' {1..10} | ./main1.py
['1', '2']
['3', '4']
['5', '6']
['7', '8']
['9', '10']
--
Regards,
Peng
--
https://mail.python.org/mailman/listinfo/python-list