Re: Python read text file columnwise
[email protected] wrote:

> Hello
>>
>> I'm very new in python. I have a file in the format:
>>
>> 2018-05-31 16:00:00 28.90 81.77 4.3
>> 2018-05-31 20:32:00 28.17 84.89 4.1
>> 2018-06-20 04:09:00 27.36 88.01 4.8
>> 2018-06-20 04:15:00 27.31 87.09 4.7
>> 2018-06-28 04.07:00 27.87 84.91 5.0
>> 2018-06-29 00.42:00 32.20 104.61 4.8
>
> I would like to read this file in python column-wise.
>
> I tried this way but not working
> event_list = open('seismicity_R023E.txt',"r")
> info_event = read(event_list,'%s %s %f %f %f %f\n');

There is actually a library that implements a C-like scanf. You can install it with

$ pip install scanf

After that:

$ cat read_table.py
from scanf import scanf

with open("seismicity_R023E.txt") as f:
    for line in f:
        print(
            scanf("%s %s %f %f %f\n", line)
        )
$ cat seismicity_R023E.txt
2018-05-31 16:00:00 28.90 81.77 4.3
2018-05-31 20:32:00 28.17 84.89 4.1
2018-06-20 04:09:00 27.36 88.01 4.8
2018-06-20 04:15:00 27.31 87.09 4.7
2018-06-28 04.07:00 27.87 84.91 5.0
2018-06-29 00.42:00 32.20 104.61 4.8
$ python read_table.py
('2018-05-31', '16:00:00', 28.9, 81.77, 4.3)
('2018-05-31', '20:32:00', 28.17, 84.89, 4.1)
('2018-06-20', '04:09:00', 27.36, 88.01, 4.8)
('2018-06-20', '04:15:00', 27.31, 87.09, 4.7)
('2018-06-28', '04.07:00', 27.87, 84.91, 5.0)
('2018-06-29', '00.42:00', 32.2, 104.61, 4.8)
$

However, in the long term you may be better off with a tool like pandas:

>>> import pandas
>>> pandas.read_table(
...     "seismicity_R023E.txt", sep=r"\s+",
...     names=["date", "time", "foo", "bar", "baz"],
...     parse_dates=[["date", "time"]]
... )
             date_time    foo     bar  baz
0  2018-05-31 16:00:00  28.90   81.77  4.3
1  2018-05-31 20:32:00  28.17   84.89  4.1
2  2018-06-20 04:09:00  27.36   88.01  4.8
3  2018-06-20 04:15:00  27.31   87.09  4.7
4  2018-06-28 04:00:00  27.87   84.91  5.0
5  2018-06-29 00:00:00  32.20  104.61  4.8

[6 rows x 4 columns]
>>>

It will be harder in the beginning, but if you work with tabular data regularly it will pay off.

--
https://mail.python.org/mailman/listinfo/python-list
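For readers who want to stay within the standard library, here is a minimal sketch of reading the same data column-wise. It assumes the fields are separated by whitespace, as the scanf output above suggests; the helper name read_columns is made up for illustration, the column names foo, bar and baz are the placeholders used in the pandas example, and io.StringIO stands in for the real file.

```python
import io

# Sample rows copied from the thread; io.StringIO stands in for the real
# file so the sketch runs without seismicity_R023E.txt on disk.
SAMPLE = """\
2018-05-31 16:00:00 28.90 81.77 4.3
2018-05-31 20:32:00 28.17 84.89 4.1
2018-06-20 04:09:00 27.36 88.01 4.8
"""


def read_columns(stream):
    """Return the columns (date, time, three numbers) as five tuples."""
    rows = []
    for line in stream:
        fields = line.split()  # split on any run of whitespace
        if not fields:  # skip blank lines
            continue
        date, time, *numbers = fields
        rows.append((date, time, *map(float, numbers)))
    # zip(*rows) transposes the list of rows into a sequence of columns
    return tuple(zip(*rows))


dates, times, foo, bar, baz = read_columns(io.StringIO(SAMPLE))
print(dates)
print(foo)
```

With a real file, the same function could be called as read_columns(open("seismicity_R023E.txt")), ideally inside a with block.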
Silent data corruption in pandas, was Re: Python read text file columnwise
Peter Otten wrote:

> [email protected] wrote:
>
>> Hello
>>>
>>> I'm very new in python. I have a file in the format:
>>>
>>> 2018-05-31 16:00:00 28.90 81.77 4.3
>>> 2018-05-31 20:32:00 28.17 84.89 4.1
>>> 2018-06-20 04:09:00 27.36 88.01 4.8
>>> 2018-06-20 04:15:00 27.31 87.09 4.7
>>> 2018-06-28 04.07:00 27.87 84.91 5.0
>>> 2018-06-29 00.42:00 32.20 104.61 4.8
>>
>> I would like to read this file in python column-wise.

> However, in the long term you may be better off with a tool like pandas:
>
> >>> import pandas
> >>> pandas.read_table(
> ...     "seismicity_R023E.txt", sep=r"\s+",
> ...     names=["date", "time", "foo", "bar", "baz"],
> ...     parse_dates=[["date", "time"]]
> ... )
>              date_time    foo     bar  baz
> 0  2018-05-31 16:00:00  28.90   81.77  4.3
> 1  2018-05-31 20:32:00  28.17   84.89  4.1
> 2  2018-06-20 04:09:00  27.36   88.01  4.8
> 3  2018-06-20 04:15:00  27.31   87.09  4.7
> 4  2018-06-28 04:00:00  27.87   84.91  5.0
> 5  2018-06-29 00:00:00  32.20  104.61  4.8
>
> [6 rows x 4 columns]
>
> It will be harder in the beginning, but if you work with tabular data
> regularly it will pay off.

After posting the above I noted that the malformed time in the last two rows was silently botched. So I just spent an insane amount of time to try and fix this from within pandas:

import datetime

import numpy
import pandas


def parse_datetime(dt):
    return datetime.datetime.strptime(
        dt.replace(".", ":"), "%Y-%m-%d %H:%M:%S"
    )


def date_parser(dates, times):
    return numpy.array([
        parse_datetime(date + " " + time)
        for date, time in zip(dates, times)
    ])


df = pandas.read_table(
    "seismicity_R023E.txt", sep=r"\s+",
    names=["date", "time", "foo", "bar", "baz"],
    parse_dates=[["date", "time"]], date_parser=date_parser
)
print(df)

There's probably a better way as I am only a determined amateur...
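One way to keep malformed times from slipping through unnoticed is to check every row with a strict parse before deciding how to repair it. The following is only a sketch using the standard library rather than pandas; the names parse_strict and find_malformed are hypothetical, and the sample pairs are taken from the rows quoted in the thread.

```python
import datetime

FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_strict(date, time):
    """Parse one date/time pair, raising ValueError on any deviation."""
    return datetime.datetime.strptime(date + " " + time, FORMAT)


def find_malformed(pairs):
    """Return the (row_index, date, time) entries that fail strict parsing."""
    bad = []
    for index, (date, time) in enumerate(pairs):
        try:
            parse_strict(date, time)
        except ValueError:
            bad.append((index, date, time))
    return bad


pairs = [
    ("2018-06-20", "04:15:00"),
    ("2018-06-28", "04.07:00"),  # "." where ":" is expected
    ("2018-06-29", "00.42:00"),  # "." where ":" is expected
]
malformed = find_malformed(pairs)
print(malformed)
```

Reporting the offending rows first makes the repair (here, replacing "." with ":") an explicit decision instead of something the parser guesses at.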
Re: Silent data corruption in pandas
Peter Otten wrote:
[Practising the bad habit of public soliloquy]
> def parse_datetime(dt):
>     return datetime.datetime.strptime(
>         dt.replace(".", ":"), "%Y-%m-%d %H:%M:%S"
>     )
>
>
> def date_parser(dates, times):
>     return numpy.array([
>         parse_datetime(date + " " + time)
>         for date, time in zip(dates, times)
>     ])
This can be rewritten:
@numpy.vectorize
def date_parser(date, time):
    return datetime.datetime.strptime(
        date + " " + time.replace(".", ":"),
        "%Y-%m-%d %H:%M:%S"
    )
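To make the shape of that rewrite concrete without depending on numpy, here is a small sketch that applies the same scalar function pairwise with the built-in map(). This swaps in a standard-library technique for numpy.vectorize; for two equal-length inputs the pairing is the same, but the result is a plain list rather than a numpy array. The sample dates and times come from the thread.

```python
import datetime


def date_parser(date, time):
    # Same scalar logic as in the post: repair "." in the time, then parse.
    return datetime.datetime.strptime(
        date + " " + time.replace(".", ":"),
        "%Y-%m-%d %H:%M:%S"
    )


dates = ["2018-06-20", "2018-06-28"]
times = ["04:15:00", "04.07:00"]

# map() with two iterables calls date_parser(dates[i], times[i]) for each i.
parsed = list(map(date_parser, dates, times))
print(parsed)
```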
Re: Python read text file columnwise
On 12/01/19 1:03 PM, Piet van Oostrum wrote:
> [email protected] writes:
>
>> Hello
>>>
>>> I'm very new in python. I have a file in the format:
>>>
>>> 2018-05-31 16:00:00 28.90 81.77 4.3
>>> 2018-05-31 20:32:00 28.17 84.89 4.1
>>> 2018-06-20 04:09:00 27.36 88.01 4.8
>>> 2018-06-20 04:15:00 27.31 87.09 4.7
>>> 2018-06-28 04.07:00 27.87 84.91 5.0
>>> 2018-06-29 00.42:00 32.20 104.61 4.8
>>
>> I would like to read this file in python column-wise.
>>
>> I tried this way but not working
>> event_list = open('seismicity_R023E.txt',"r")
>> info_event = read(event_list,'%s %s %f %f %f %f\n');

To the OP:

Python's standard I/O is based around data "streams". Whilst there is a concept of "lines" and thus an end-of-line character, there is no idea of a record in the sense of fixed-length fields, and thus no way of defining and distinguishing data items by position.

Accordingly, whilst the formatting specification of strings and floats might work for output, there is no equivalent for accepting input data.

Please re-read refs on file, read, readline, etc.

> Why would you think that this would work?

To the PO:

Because in languages/libraries built around fixed-length files this is how one specifies the composition of fields making up a record - a data structure which dates back to FORTRAN and Assembler on mainframes and other magtape-era machines.

Whilst fixed-length records/files are, by definition, less flexible than the more free-form data input Python accepts, they are more efficient and faster in situations where the data (format) is entirely consistent - such as the OP is describing!

--
Regards =dn
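For comparison, a fixed-length record of the kind described above can still be taken apart in Python by slicing on character positions. The sketch below assumes a hypothetical layout in which every field is padded to a fixed width; the offsets in FIELDS, the helper name parse_record and the padded sample lines are illustrative assumptions, not the OP's actual file.

```python
# Hypothetical fixed-width layout (an assumption, not the OP's real file):
# each field lives at a fixed character offset within the record.
FIELDS = {
    "date": slice(0, 10),
    "time": slice(11, 19),
    "foo": slice(20, 25),
    "bar": slice(26, 32),
    "baz": slice(33, 36),
}


def parse_record(line):
    """Cut one fixed-width record into its fields by position."""
    record = {name: line[span].strip() for name, span in FIELDS.items()}
    for name in ("foo", "bar", "baz"):
        record[name] = float(record[name])
    return record


# Sample records padded so that every line is exactly 36 characters wide.
lines = [
    "2018-05-31 16:00:00 28.90  81.77 4.3",
    "2018-06-29 00:42:00 32.20 104.61 4.8",
]
records = [parse_record(line) for line in lines]
print(records[1])
```

The approach only works if every record really does follow the layout; a value that is one character wider than expected will be cut in the wrong place without any error.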
RE: Python read text file columnwise
-----Original Message-----
From: Avi Gross
Sent: Saturday, January 12, 2019 8:26 PM
To: 'DL Neil'
Subject: RE: Python read text file columnwise

I am not sure what the big deal is here. If the data is consistently formatted you can read in a string per line and use offsets as in line[0:8] and so on, then call the right transformations to convert them to dates and so on. If it is delimited by something consistent like spaces or tabs or commas, we have all kinds of solutions ranging from splitting the line on the delimiter to using the kind of functionality that reads in such files into a pandas DataFrame.

In the latter case, you get the columns already. In the former, there are well known ways to extract the info such as:

[row[0] for row in listofrows]

And repeat for additional items.

Or am I missing something and there is no end of line and you need to read in the entire file and split it into size N chunks first? Still fairly straightforward.

-----Original Message-----
From: Python-list On Behalf Of DL Neil
Sent: Saturday, January 12, 2019 4:48 PM
To: [email protected]
Subject: Re: Python read text file columnwise

On 12/01/19 1:03 PM, Piet van Oostrum wrote:
> [email protected] writes:
>
>> Hello
>>>
>>> I'm very new in python. I have a file in the format:
>>>
>>> 2018-05-31 16:00:00 28.90 81.77 4.3
>>> 2018-05-31 20:32:00 28.17 84.89 4.1
>>> 2018-06-20 04:09:00 27.36 88.01 4.8
>>> 2018-06-20 04:15:00 27.31 87.09 4.7
>>> 2018-06-28 04.07:00 27.87 84.91 5.0
>>> 2018-06-29 00.42:00 32.20 104.61 4.8
>>
>> I would like to read this file in python column-wise.
>>
>> I tried this way but not working
>> event_list = open('seismicity_R023E.txt',"r")
>> info_event = read(event_list,'%s %s %f %f %f %f\n');

To the OP:

Python's standard I/O is based around data "streams". Whilst there is a concept of "lines" and thus an end-of-line character, there is no idea of a record in the sense of fixed-length fields, and thus no way of defining and distinguishing data items by position.

Accordingly, whilst the formatting specification of strings and floats might work for output, there is no equivalent for accepting input data.

Please re-read refs on file, read, readline, etc.

> Why would you think that this would work?

To the PO:

Because in languages/libraries built around fixed-length files this is how one specifies the composition of fields making up a record - a data structure which dates back to FORTRAN and Assembler on mainframes and other magtape-era machines.

Whilst fixed-length records/files are, by definition, less flexible than the more free-form data input Python accepts, they are more efficient and faster in situations where the data (format) is entirely consistent - such as the OP is describing!

--
Regards =dn
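The "no end of line" case raised in the reply at the top of this message can be sketched as follows: cut the text into chunks of N characters, split each chunk on whitespace, and then pull out columns with the list comprehension quoted there. RECORD_SIZE, the sample blob and the name last_column are assumptions made up for the illustration.

```python
RECORD_SIZE = 36  # assumed width of one record, in characters

# Two 36-character records run together with no newline between them.
blob = (
    "2018-05-31 16:00:00 28.90  81.77 4.3"
    "2018-06-29 00:42:00 32.20 104.61 4.8"
)

# Step through the text RECORD_SIZE characters at a time.
chunks = [
    blob[start:start + RECORD_SIZE]
    for start in range(0, len(blob), RECORD_SIZE)
]

# Split each record on whitespace, then pull out columns by index.
listofrows = [chunk.split() for chunk in chunks]
dates = [row[0] for row in listofrows]
last_column = [float(row[4]) for row in listofrows]
print(dates, last_column)
```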
