> Ok. I see something suspicious here. The for loop:
>
> ##
> for l in xx:
> train_tokens.append(l)
> ##
>
> assumes that iterating over the 'xx' token yields its tokens. Is this
> true? Are you sure you don't have to say explicitly:
>
> ##
> for l in xx['SUBTOKENS']:
> ...
> ##
On Mon, 14 Nov 2005, enas khalil wrote:
> hello all
[program cut]
Hi Enas,
You may want to try talking with the NLTK folks about this, as what you're
dealing with is a specialized subject. Also, have you gone through the
tokenization tutorial at:
http://nltk.sourceforge.net/tutorial/tok
hello all, when I run the code:

##
# -*- coding: cp1256 -*-
from nltk.tagger import *
from nltk.corpus import brown
from nltk.tokenizer import WhitespaceTokenizer

# Tokenize ten texts from the Brown Corpus
train_tokens = []
xx = Token(TEXT=open('fataha2.txt').read())
WhitespaceTokenizer().tokenize(xx)
for l in xx:
    train_tokens.append(l)
##

[rest of message cut]