Thank you very much for the help.

First I wanted to count by city and year:

City  year  count
Xc1   2001  1
Xc1   2002  3
Yv1   2001  1
Yv2   2002  4

This worked fine!

Now I want to count by city only:

City  Count
Xc1   4
Yv2   5

Then I want to combine these two objects with the original data and
write the result to a file called "detout" with these columns:

"City", "year", "x", "cycount", "citycount"

I tried to count only by city and to combine the three objects together.

Many thanks again.

Sent from my iPad

> On Mar 10, 2016, at 3:11 AM, Jussi Piitulainen
> <[email protected]> wrote:
>
> Val Krem writes:
>
>> Hi all,
>>
>> I am a new learner about python (moving from R to python) and trying
>> read and count the number of observation by year for each city.
>>
>>
>> The data set look like
>> city year x
>>
>> XC1 2001 10
>> XC1 2001 20
>> XC1 2002 20
>> XC1 2002 10
>> XC1 2002 10
>>
>> Yv2 2001 10
>> Yv2 2002 20
>> Yv2 2002 20
>> Yv2 2002 10
>> Yv2 2002 10
>>
>> out put will be
>>
>> city
>> xc1 2001 2
>> xc1 2002 3
>> yv1 2001 1
>> yv2 2002 3
>>
>>
>> Below is my starting code
>> count=0
>> fo=open("dat", "r+")
>> str = fo.read();
>> print "Read String is : ", str
>>
>> fo.close()
>
> Below's some of the basics that you want to study. Also look up the csv
> module in Python's standard library. You will want to learn these things
> even if you end up using some sort of third-party data-frame library (I
> don't know those but they exist).
>
> from collections import Counter
>
> # collections.Counter is a special dictionary type for just this
> counts = Counter()
>
> # with statement ensures closing the file
> with open("dat") as fo:
>     # file object provides lines
>     next(fo) # skip header line
>     for line in fo:
>         # test requires non-empty string, but lines
>         # contain at least newline character so ok
>         if line.isspace(): continue
>         # .split() at whitespace, omits empty fields
>         city, year, x = line.split()
>         # collections.Counter has default 0,
>         # key is a tuple (city, year), parentheses omitted here
>         counts[city, year] += 1
>
> print("city")
> for city, year in sorted(counts): # iterate over keys
>     print(city.lower(), year, counts[city, year], sep = "\t")
>
> # Alternatively:
> # for cy, n in sorted(counts.items()):
> #     city, year = cy
> #     print(city.lower(), year, n, sep = "\t")
> --
> https://mail.python.org/mailman/listinfo/python-list
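
For reference, the by-city count and the combined "detout" output described above could be sketched roughly as follows. This is only a minimal sketch: it assumes Python 3, a whitespace-separated input file named "dat" with one header line (as in the quoted code), and tab-separated output, which the request does not specify. The function names add_counts and write_detout are made up for illustration.

```python
from collections import Counter


def add_counts(lines):
    """Return (city, year, x, cycount, citycount) tuples for data lines.

    Hypothetical helper: `lines` is any iterable of whitespace-separated
    "city year x" strings; blank lines are skipped.
    """
    rows = []
    for line in lines:
        if not line or line.isspace():
            continue
        city, year, x = line.split()
        rows.append((city, year, x))

    # One counter keyed by (city, year), one keyed by city alone.
    cycount = Counter((city, year) for city, year, _ in rows)
    citycount = Counter(city for city, _, _ in rows)

    # Attach both counts to every original row.
    return [(city, year, x, cycount[city, year], citycount[city])
            for city, year, x in rows]


def write_detout(in_path="dat", out_path="detout"):
    """Read in_path, skip its header line, write the combined table."""
    with open(in_path) as fo:
        next(fo)  # skip header line
        rows = add_counts(fo)
    with open(out_path, "w") as out:
        # Tab-separated output is an assumption, not part of the request.
        print("City", "year", "x", "cycount", "citycount", sep="\t", file=out)
        for row in rows:
            print(*row, sep="\t", file=out)
```

With a "dat" file in the working directory, calling write_detout() should produce "detout" with one line per original observation, each carrying its city-and-year count and its city count.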
