> Ok.  I see something suspicious here.  The for loop:
>
> ######
> for l in xx:
>     train_tokens.append(l)
> ######
>
> assumes that we get tokens from the 'xx' token.  Is this true?  Are you
> sure you don't have to specifically say:
>
> ######
> for l in xx['SUBTOKENS']:
>     ...
> ######

Hi Enas,

I'm taking python-list out of CC again; my apologies to the others for
not catching that sooner.

Enas, please do not crosspost to multiple mailing lists.  You have been
doing this since at least June:

    http://mail.python.org/pipermail/python-list/2005-June/287371.html
    http://mail.python.org/pipermail/tutor/2005-June/039351.html
    http://mail.python.org/pipermail/tutor/2005-July/039642.html
    http://mail.python.org/pipermail/python-list/2005-July/289505.html
    http://mail.python.org/pipermail/tutor/2005-October/042155.html
    http://mail.python.org/pipermail/python-list/2005-October/303805.html

If you crosspost, we at Tutor won't be able to see responses that go to
python-list, and vice versa.  The end result clutters both lists and
isn't friendly to either community.  Please read:

    http://www.gweep.ca/~edmonds/usenet/ml-etiquette.html

and try to change your habits in this area.


Anyway, just as a concrete example of what I mean about 'SUBTOKENS':

######
>>> from nltk.tokenizer import *
>>> text_token = Token(TEXT='hello world this is a test')
>>> text_token.keys()
['TEXT']
>>> WhitespaceTokenizer().tokenize(text_token)
>>> text_token.keys()
['TEXT', 'SUBTOKENS']
>>> text_token['SUBTOKENS']
[<hello>, <world>, <this>, <is>, <a>, <test>]
>>> type(text_token['SUBTOKENS'][0])
<class 'nltk.token.Token'>
######

Do you understand this code, or is there something here that you're not
familiar with?


Good luck to you.

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor