Paul Kraus wrote:
> Here is the code that I used. It's functional and it works, but there has got
> to be some better ways to do a lot of this. Traversing the data structure
> still seems like I have to be doing it the hard way.
>
> The input data file has fixed-width fields that are delimited by pipes.
> So depending on the justification for each field it will either have leading
> or trailing whitespace.
> #############################
> import re
> import string
> results = {}
> def format_date(datestring):
>     (period,day,year) = map(int,datestring.split('/') )
>     period += 2
>     if period == 13: period = 1; year += 1
>     if period == 14: period = 2; year += 1

These two tests can be written as one:

    if period > 12: period -= 12; year += 1

>     if year > 80:
>         year = '19%02d' % year
>     else:
>         year = '20%02d' % year
>     return (year,period)
>
> def format_qty(qty,credit,oreturn):
>     qty = float(qty)
>     if credit == 'C' or oreturn == 'Y':
>         return qty * -1
>     else:
>         return qty
>
> textfile = open('orders.txt','r')
> for line in textfile:
>     fields = map( string.strip, line.split( '|' ) )
>     fields[4] = format_qty(fields[ 4 ],fields[ 1 ], fields[ 2 ] )

    qty = format_qty(fields[ 4 ],fields[ 1 ], fields[ 2 ] )

would be clearer in subsequent code.

>     (year, period) = format_date( fields[7] )
>     for count in range(12):
>         if count == period:
>             if results.get( ( year, fields[6], count), 0):
>                 results[ year,fields[6], count] += fields[4]
>             else:
>                 results[ year,fields[6],count] = fields[4]

The loop on count is not doing anything; you can use period directly. And the
test on results.get() is not needed, since it is safe to always add:

    key = (year, fields[6], period)
    results[key] = results.get(key, 0) + qty

>
> sortedkeys = results.keys()
> sortedkeys.sort()
>
> for keys in sortedkeys:
>     res_string = keys[0]+'|'+keys[1]
>     for count in range(12):
>         if results.get((keys[0],keys[1],count),0):
>             res_string += '|'+str(results[keys[0],keys[1],count])
>         else:
>             res_string += '|0'
>     print res_string

This will give you duplicate outputs if you ever have more than one period for
a given year and fields[6] (whatever that is...), because every key in
sortedkeys prints a full row. OTOH if you just show the existing keys you will
not have entries for the 0 keys. So maybe you should go back to your original
idea of using a 12-element list for the counts.

Anyway, in the above code the test on results.get() is not needed, since you
just use the default value in the else:

    res_string += '|'+str(results.get((keys[0],keys[1],count),0))

_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
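For what it's worth, here is a rough sketch of how the 12-element-list idea might look once the suggestions above are folded in. It is only an illustration, not a drop-in replacement: it is written for Python 3 (the quoted code is Python 2), the field positions are taken from the quoted code, the sample lines and the names summarize() and report() are made up, and it assumes periods run 1 to 12 and map to list slots 0 to 11, which differs from the range(12) loops in the quoted code. format_date() here returns the year as an int rather than a string.

```python
from collections import defaultdict


def format_date(datestring):
    """Return (year, period) with the period shifted forward by two."""
    period, day, year = map(int, datestring.split('/'))
    period += 2
    if period > 12:
        period -= 12
        year += 1
    # Two-digit years above 80 are read as 19xx, the rest as 20xx.
    year += 1900 if year > 80 else 2000
    return year, period


def format_qty(qty, credit, oreturn):
    """Credits and returns count as negative quantities."""
    qty = float(qty)
    return -qty if credit == 'C' or oreturn == 'Y' else qty


def summarize(lines):
    """Accumulate quantities into one 12-slot list per (year, fields[6])."""
    totals = defaultdict(lambda: [0.0] * 12)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        fields = [f.strip() for f in line.split('|')]
        qty = format_qty(fields[4], fields[1], fields[2])
        year, period = format_date(fields[7])
        # Assumption: period is 1..12, stored in slot period - 1.
        totals[year, fields[6]][period - 1] += qty
    return totals


def report(totals):
    """Build one pipe-delimited row per (year, fields[6]) key, in sorted order."""
    rows = []
    for year, item in sorted(totals):
        counts = [str(q) if q else '0' for q in totals[year, item]]
        rows.append('|'.join([str(year), item] + counts))
    return rows


# Hypothetical sample input; only fields 1, 2, 4, 6 and 7 are used.
sample = [
    "1001 | N | N | x |   5.0 | y | WIDGET | 01/15/05",
    "1002 | C | N | x |   2.0 | y | WIDGET | 01/20/05",
    "1003 | N | N | x |   3.0 | y | WIDGET | 11/02/04",
]

for row in report(summarize(sample)):
    print(row)
```

Because each (year, fields[6]) pair owns exactly one list, every pair prints once, and the empty periods show up as 0 without any extra test.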