Hi, I'm learning Python so I can take advantage of the really cool stuff in the Natural Language Toolkit. But I'm having problems with some basic file manipulation stuff. My basic question: How do I read data in from a csv, manipulate it, and then add it back to the csv in new columns (keeping the manipulated data in the "right row")? Here's an example of what my data looks like ("test-8-29-10.csv"):
MyWord      Category  Ct     CatCt
!           A         2932   456454
!           B         2109   64451
a           C         7856   90000
a           A         19911  456454
abnormal    C         174    90000
abnormally  D         5      77777
cats        E         1999   886454
cat         B         160    64451

I want to read in the MyWord for each row, process it, and add the results in new columns. Specifically, I want to "lemmatize" and "stem", which basically means I'll turn "abnormally" into "abnormal" and "cats" into "cat".

    import nltk
    wnl = nltk.WordNetLemmatizer()
    porter = nltk.PorterStemmer()
    text = nltk.word_tokenize(TheStuffInMyWordColumn)
    textlemmatized = [wnl.lemmatize(t) for t in text]
    textPort = [porter.stem(t) for t in text]

This creates the right info, but I don't really want "textlemmatized" and "textPort" to be independent lists; I want them inside the csv in new columns.

If I didn't want to keep the information in the Category and Counts columns, I would probably do something like this:

    for word in text:
        word2 = wnl.lemmatize(word)
        word3 = porter.stem(word)
        print word + ";" + word2 + ";" + word3 + "\r\n"

Looking through some of the older discussions about the csv module, I found that this code helps identify headers, but I'm still not sure how to use them--or how to word the for-loop correctly so that I iterate through each row in the csv file.

    import csv
    fp = open(r'c:test-8-29-10.csv', 'r')
    inputfile = csv.DictReader(fp)
    for record in inputfile:
        print record
    fp.close()

Output:

    {'Category': 'A', 'CatCt': '456454', 'MyWord': '!', 'Ct': '2932'}
    {'Category': 'B', 'CatCt': '64451', 'MyWord': '!', 'Ct': '2109'}
    ...

So I feel like I have *some* of the pieces, but I'm just missing a bunch of little connections. Any and all help would be much appreciated!

Tyler
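
One way the pieces above might fit together is to read each row with csv.DictReader, add the new values to that row's dictionary, and write the row out with csv.DictWriter, so every derived value stays in the row it came from. The following is only a minimal sketch under several assumptions: it uses Python 3 syntax (on Python 2 the files would be opened in 'rb'/'wb' mode instead of passing newline=""), it writes to a separate output file rather than modifying the input in place, and the function name add_columns and the new column names are hypothetical, chosen for illustration.

```python
import csv


def add_columns(in_path, out_path, transforms, source_column="MyWord"):
    """Copy a CSV file, appending one new column per entry in `transforms`.

    `transforms` maps a new column name to a function that takes the
    string found in `source_column` and returns the value to store.
    """
    with open(in_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        # Keep the original columns in order, then add the new ones.
        fieldnames = list(reader.fieldnames) + list(transforms)
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            # Each row is a dict, so new values stay with their own row.
            for new_column, func in transforms.items():
                row[new_column] = func(row[source_column])
            writer.writerow(row)


# Hypothetical usage with NLTK (requires nltk and its WordNet data):
#
#   import nltk
#   wnl = nltk.WordNetLemmatizer()
#   porter = nltk.PorterStemmer()
#   add_columns("test-8-29-10.csv", "test-with-new-columns.csv",
#               {"Lemma": wnl.lemmatize, "Stem": porter.stem})
```

Because each MyWord cell here holds a single word, the sketch passes the cell straight to the lemmatizer and stemmer instead of tokenizing it first. If a cell could contain several words, the transform functions would need to tokenize and join the results themselves.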