Re: [Tutor] Fwd: how to sort the file out
On Wed, Sep 7, 2011 at 2:25 PM, Hugo Arts wrote:
> forgot to forward this to list, sorry.
>
> -- Forwarded message --
> From: Hugo Arts
> Date: Wed, Sep 7, 2011 at 8:24 AM
> Subject: Re: [Tutor] how to sort the file out
> To: lina
>
> On Wed, Sep 7, 2011 at 8:16 AM, lina wrote:
>> On Wed, Sep 7, 2011 at 1:28 PM, Hugo Arts wrote:
>>> I was assuming that the numbers were field 1, and the letter/number
>>> combinations were field 2. If I understand him correctly, he wants the
>>
>> Yes.
>>
>>> lines in file 2 to be arranged such that the order of field two is the
>>> same as it is in file 1. In that case, you can do it with one sexy
>>> sort() call (and a little preprocessing to load the files), but I
>>> don't want to get all into explain and then realize he wants something
>>> totally different.
>>
>> You understand right.
>
> Well, it is fairly simple. You grab field2 from file1, and put all the
> items into a list (should be easy right? open(), readlines(), split(),
> grab second item). Then, you grab file2 as a list of tuples (same
> procedure as the first file, but you just put the results of split()
> right into the list). Then you sort with a key like so:
>
> list2.sort(key=lambda x: list1.index(x[1]))
>
> You see? For each item in the second list, we grab field 2, and look
> up what index it has in list1. Then we use that index as the sort key.
> That way they'll be sorted like file1 is.

not easy to me.

Sigh ... I may do it by hand, cut and pasta following the orders.

> After that it's just a matter of writing back to the file, easy peasy.
>
> HTH,
> Hugo

--
Best Regards,

lina
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
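Hugo's recipe (load field 2 of file 1 into a list, load file 2 as split rows, sort with list1.index as the key) can be sketched roughly as below. The sample lines are made up for illustration and stand in for the contents of the two files; the names file1_lines, file2_lines and sorted_lines are assumptions, not part of the original message.

```python
# Stand-ins for the lines of the two files (hypothetical sample data).
file1_lines = ["1 C1\n", "2 O1\n", "3 C2\n"]
file2_lines = ["7 C2\n", "8 C1\n", "9 O1\n"]

# Field 2 of every line in file 1, kept in file order.
list1 = [line.split()[1] for line in file1_lines]

# File 2 as a list of [field1, field2] rows, straight from split().
# (split() returns lists rather than tuples; that makes no difference here.)
list2 = [line.split() for line in file2_lines]

# Sort the rows of file 2 by the position of their second field in list1.
list2.sort(key=lambda x: list1.index(x[1]))

# Turn the rows back into lines that could be written to a file.
sorted_lines = [" ".join(fields) + "\n" for fields in list2]
print(sorted_lines)
```

Note that list1.index() scans the list on every call and raises ValueError when a field is missing from file 1, so this is best suited to small files where every field in file 2 also appears in file 1.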
Re: [Tutor] Fwd: how to sort the file out
On Wed, Sep 7, 2011 at 2:57 PM, lina wrote:
> On Wed, Sep 7, 2011 at 2:25 PM, Hugo Arts wrote:
>> forgot to forward this to list, sorry.
>>
>> -- Forwarded message --
>> From: Hugo Arts
>> Date: Wed, Sep 7, 2011 at 8:24 AM
>> Subject: Re: [Tutor] how to sort the file out
>> To: lina
>>
>> On Wed, Sep 7, 2011 at 8:16 AM, lina wrote:
>>> On Wed, Sep 7, 2011 at 1:28 PM, Hugo Arts wrote:
>>>> I was assuming that the numbers were field 1, and the letter/number
>>>> combinations were field 2. If I understand him correctly, he wants the
>>>
>>> Yes.
>>>
>>>> lines in file 2 to be arranged such that the order of field two is the
>>>> same as it is in file 1. In that case, you can do it with one sexy
>>>> sort() call (and a little preprocessing to load the files), but I
>>>> don't want to get all into explain and then realize he wants something
>>>> totally different.
>>>
>>> You understand right.
>>
>> Well, it is fairly simple. You grab field2 from file1, and put all the
>> items into a list (should be easy right? open(), readlines(), split(),
>> grab second item). Then, you grab file2 as a list of tuples (same
>> procedure as the first file, but you just put the results of split()
>> right into the list). Then you sort with a key like so:
>>
>> list2.sort(key=lambda x: list1.index(x[1]))
>>
>> You see? For each item in the second list, we grab field 2, and look
>> up what index it has in list1. Then we use that index as the sort key.
>> That way they'll be sorted like file1 is.
>
> not easy to me.
>
> Sigh ... I may do it by hand, cut and pasta following the orders.

After overcoming the nearly-give-up frustration, I finally see the
deadly-beautiful results.

It's so BEAUTIFUL.

amazing ...

>> After that it's just a matter of writing back to the file, easy peasy.
>>
>> HTH,
>> Hugo

--
Best Regards,

lina
Re: [Tutor] how to sort the file out
lina wrote:

> HI, I have two files, one is reference file, another is waiting for adjust
> one,
>
> File 1:
>
> 1 C1
> 2 O1
[...]
> 33 C19
> 34 O5
> 35 C21
>
> File 2:
> 3 H16
> 4 H5
[...]
> 39 H62
> 40 O2
> 41 H22
>
> I wish the field 2 from file 2 arranged the same sequence as the field
> 2 of file 1.
>
> Thanks for any suggestions,
>
> I drove my minds into nuts already, three hours passed and I still
> failed to achieve this.

You could have written the above after three minutes. To get the most out of
this mailing list you should give some details of what you tried and how it
failed. This gives us valuable information about your level of knowledge and
confidence that you are trying to learn rather than get solutions on the
cheap.

However, I'm in the mood for some spoonfeeding:

indexfile = "tmp_index.txt"
datafile = "tmp_data.txt"
sorteddatafile = "tmp_output.txt"

def make_lookup(lines):
    r"""Build a dictionary that maps the second column to the line number.

    >>> make_lookup(["aaa bbb\n", "ccc ddd\n"]) == {'bbb': 0, 'ddd': 1}
    True
    """
    position_lookup = {}
    for lineno, line in enumerate(lines):
        second_field = line.split()[1]
        position_lookup[second_field] = lineno
    return position_lookup

with open(indexfile) as f:
    position_lookup = make_lookup(f)

# With your sample data the global position_lookup dict looks like this now:
# {'C1': 0, 'O1': 1, 'C2': 2,... , 'O5': 33, 'C21': 34}

def get_position(line):
    r"""Extract the second field from the line and look up the
    associated line number in the global position_lookup dictionary.

    Example:
    get_position("15 C2\n")
    The line is split into ["15", "C2"]
    The second field is "C2"
    Its associated line number in position_lookup: 2
    --> the function returns 2
    """
    second_field = line.split()[1]
    return position_lookup[second_field]

with open(datafile) as f:
    # sort the lines in the data file using the line number in the index
    # file as the sort key
    lines = sorted(f, key=get_position)

with open(sorteddatafile, "w") as f:
    f.writelines(lines)
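One thing the script above does not cover is a field in the data file that never appears in the index file; position_lookup[second_field] would then raise KeyError. The variant below is only a sketch of one way to handle that case, under the assumption that such lines should simply go to the end. The function name sort_by_reference and the sample lines are made up for illustration.

```python
def make_lookup(lines):
    """Map the second column of each line to its line number."""
    return {line.split()[1]: lineno for lineno, line in enumerate(lines)}


def sort_by_reference(data_lines, reference_lines):
    """Sort data_lines by where their second field occurs in reference_lines.

    Lines whose second field is unknown sort after all known ones
    (an assumption; the original script would raise KeyError instead).
    """
    lookup = make_lookup(reference_lines)
    missing = float("inf")  # compares greater than every real line number

    def position(line):
        return lookup.get(line.split()[1], missing)

    # sorted() is stable, so unknown lines keep their relative order.
    return sorted(data_lines, key=position)


# Hypothetical sample data standing in for the two files.
reference = ["1 C1\n", "2 O1\n", "3 C2\n"]
data = ["5 H9\n", "6 C2\n", "7 C1\n"]
result = sort_by_reference(data, reference)
print(result)
```

The dictionary lookup costs roughly constant time per line, which is why this approach scales better than calling list.index() inside the sort key.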
Re: [Tutor] how to sort the file out
On Wed, Sep 7, 2011 at 4:52 PM, Peter Otten <__pete...@web.de> wrote:
> lina wrote:
>
>> HI, I have two files, one is reference file, another is waiting for adjust
>> one,
>>
>> File 1:
>>
>> 1 C1
>> 2 O1
> [...]
>> 33 C19
>> 34 O5
>> 35 C21
>>
>> File 2:
>> 3 H16
>> 4 H5
> [...]
>> 39 H62
>> 40 O2
>> 41 H22
>>
>> I wish the field 2 from file 2 arranged the same sequence as the field
>> 2 of file 1.
>>
>> Thanks for any suggestions,
>>
>> I drove my minds into nuts already, three hours passed and I still
>> failed to achieve this.
>
> You could have written the above after three minutes. To get the most out of
> this mailing list you should give some details of what you tried and how it
> failed. This gives us valuable information about your level of knowledge and
> confidence that you are trying to learn rather than get solutions on the
> cheap.
>
> However, I'm in the mood for some spoonfeeding:

LOL ... thanks.

I am very very low level in python, many times I just gave up due to
frustration in using it, and escaped back to bash, awk.

> indexfile = "tmp_index.txt"
> datafile = "tmp_data.txt"
> sorteddatafile = "tmp_output.txt"
>
> def make_lookup(lines):
>     r"""Build a dictionary that maps the second column to the line number.
>
>     >>> make_lookup(["aaa bbb\n", "ccc ddd\n"]) == {'bbb': 0, 'ddd': 1}
>     True
>     """
>     position_lookup = {}
>     for lineno, line in enumerate(lines):
>         second_field = line.split()[1]
>         position_lookup[second_field] = lineno
>     return position_lookup
>
> with open(indexfile) as f:
>     position_lookup = make_lookup(f)
>
> # With your sample data the global position_lookup dict looks like this now:
> # {'C1': 0, 'O1': 1, 'C2': 2,... , 'O5': 33, 'C21': 34}
>
> def get_position(line):
>     r"""Extract the second field from the line and look up the
>     associated line number in the global position_lookup dictionary.
>
>     Example:
>     get_position("15 C2\n")
>     The line is split into ["15", "C2"]
>     The second field is "C2"
>     Its associated line number in position_lookup: 2
>     --> the function returns 2
>     """
>     second_field = line.split()[1]
>     return position_lookup[second_field]
>
> with open(datafile) as f:
>     # sort the lines in the data file using the line number in the index
>     # file as the sort key
>     lines = sorted(f, key=get_position)
>
> with open(sorteddatafile, "w") as f:
>     f.writelines(lines)

It's an amazing opportunity to learn. I will try it now.

--
Best Regards,

lina
[Tutor] how obsolete is 2.2?
I found a book at the local library that covers python but it's 2.2.
I already have been using 2.7 for basic stuff and would like to know if it's
worth my time to read this book.
Are there any glaring differences that would be easy to point out, or is it
too convoluted?

Also, am I correct in thinking that 3.0 will always be called 3.0 but will
change over time and will always include experimental features, while 2.x
will gradually increase the 'x' and the highest 'x' will indicate the most
current, stable release?

oh, and a question on 'pickle':
can pickle deserialize things that were not serialized by python?
can it convert data into a python data type regardless of whether it was
originally a 'c array' or something else that python doesn't support?
[Tutor] how obsolete is 2.2, and a pickle question
I found a book at the local library that covers python but it's 2.2.
I already have been using 2.7 for basic stuff and would like to know if it's
worth my time to read this book.
Are there any glaring differences that would be easy to point out, or is it
too convoluted?

Also, am I correct in thinking that 3.0 will always be called 3.0 but will
change over time and will always include experimental features, while 2.x
will gradually increase the 'x' and the highest 'x' will indicate the most
current, stable release?

oh, and a question on 'pickle':
can pickle deserialize things that were not serialized by python?
can it convert data into a python data type regardless of whether it was
originally a 'c array' or something else that python doesn't support?
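On the pickle part of the question, the short version is that pickle only reads the byte format that pickle itself writes, so it cannot interpret arbitrary binary data such as a raw C array. For that kind of data the standard library's struct module is one option. The sketch below illustrates both under stated assumptions: the dictionary contents and the "three little-endian 32-bit integers" layout are made up for illustration and do not come from the thread.

```python
import pickle
import struct

# pickle round trip: the bytes being loaded were produced by pickle.dumps().
original = {"name": "C1", "position": 0}
restored = pickle.loads(pickle.dumps(original))

# Raw binary data laid out the way a C program might write an int array.
# The layout (three little-endian 32-bit signed ints) is an assumption and
# must be known in advance; nothing in the bytes themselves describes it.
raw = struct.pack("<3i", 10, 20, 30)

# struct.unpack() converts those bytes into a tuple of Python ints.
values = struct.unpack("<3i", raw)

print(restored, values)
```

pickle.loads() should also never be fed data from an untrusted source, since unpickling can execute arbitrary code.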
Re: [Tutor] how obsolete is 2.2?
Hi,

please don't repost your question before waiting at least a little for an
answer.

c smith, 08.09.2011 05:26:
> I found a book at the local library that covers python but it's 2.2.

That's way old then. It won't teach you anything about the really
interesting and helpful things in Python, such as generators, itertools or
the "with" statement, extended APIs and stdlib modules, and loads of other
goodies and enhanced features, such as metaclasses and interpreter
configuration stuff.

> I already have been using 2.7 for basic stuff and would like to know if it's
> worth my time to read this book.

Likely not. Better read a recent tutorial or spend your time getting used
to the official Python documentation.

> Are there any glaring differences that would be easy to point out, or is it
> too convoluted?

Tons of them, too many to even get started. You might want to take a look
at the "what's new" pages in the Python documentation. That will give you a
pretty good idea of major advances.

> Also, am I correct in thinking that 3.0 will always be called 3.0

No. It's called Python 3 (sometimes historically named Python 3k or Python
3000), with released versions being 3.0, 3.1.x and 3.2.x and the upcoming
release being 3.3.

> but will change over time and will always include experimental features

Well, it's the place where all current development happens, be it
experimental or not.

> while 2.x will gradually increase the 'x'

Nope. 'x' is fixed at 7, Py2.7 is the officially last release series of
Python 2, although with an extended maintenance time frame of several years.

> and the highest 'x' will indicate the most current, stable release?

That's right, both for the Py2.x and Py3.x releases.

> oh, and a question on 'pickle':

Let's keep that in your other post, to let it serve a purpose.

Stefan
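To give a feel for three of the features Stefan names (generators, itertools and the "with" statement), here is a small sketch that uses all of them together. The function name numbered, the sample lines and the temporary file are made up for illustration; they are not from the thread.

```python
import itertools
import os
import tempfile


def numbered(lines):
    """Generator: lazily yield (line number, stripped line) pairs."""
    for lineno, line in enumerate(lines, start=1):
        yield lineno, line.strip()


# Throwaway file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# "with" closes the file automatically, even if an exception is raised.
with open(path, "w") as f:
    f.writelines(["1 C1\n", "2 O1\n", "3 C2\n"])

with open(path) as f:
    # itertools.islice() takes the first two items without reading the rest.
    first_two = list(itertools.islice(numbered(f), 2))

print(first_two)
```

The same idioms appear in Peter Otten's sorting script earlier in this archive, which opens its files with "with" and passes the file object directly to sorted().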