I wrote code to add 'like' after every third word in an NLTK text, as follows:
>>> text = nltk.corpus.brown.words(categories='news')
>>> def hedge(text):
    for i in range(3, len(text), 4):
        new_text = text.insert(i, 'like')
    return new_text[:50]
>>> hedge(text)
Traceback (most recent call last):
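The traceback is cut off above, but two things go wrong in this code: a corpus view is read-only (it has no insert method), and even on a plain list, insert mutates in place and returns None, so new_text never holds the words. A minimal working sketch, assuming the goal is to insert 'like' after every third word:

>>> import nltk
>>> text = list(nltk.corpus.brown.words(categories='news'))  # copy into a plain, mutable list
>>> def hedge(text):
    new_text = list(text)  # work on a copy; insert() returns None
    for i in range(3, len(new_text), 4):  # step 4: each insert shifts later words right by one
        new_text.insert(i, 'like')
    return new_text[:50]
>>> hedge(text)[:8]
['The', 'Fulton', 'County', 'like', 'Grand', 'Jury', 'said', 'like']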
On page 77 of the book Natural Language Processing with Python, there is this exercise: The polysemy of a word is the number of senses it has. Using WordNet, we can determine that the noun dog has seven senses with len(wn.synsets('dog', 'n')). Compute the average polysemy of nouns, verbs, adjectives and adverbs.
I don't know why lemma_list = [synset.lemma_names for synset in synset_list] leads to such an error.
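As the replies further down confirm, that comprehension produces a list of lists, because each synset contributes its entire list of lemma names. (Note that in current NLTK releases lemma_names is a method and needs parentheses.) A quick way to see the nesting:

>>> from nltk.corpus import wordnet as wn
>>> synset_list = list(wn.all_synsets('n'))
>>> [synset.lemma_names() for synset in synset_list[:3]]  # one inner list per synset
[['entity'], ['physical_entity'], ['abstraction', 'abstract_entity']]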
I have to use extend to solve the problem for lemma_list. The following code works; take all the nouns as an example:
>>> def average_polysemy(pos):
    synset_list = list(wn.all_synsets(pos))
Is there a method in Python to get the unique elements of a list? Thanks.
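The usual answer, and the one suggested later in this thread, is set(); if the original order matters, dict.fromkeys also deduplicates while preserving insertion order (Python 3.7+):

>>> items = ['dog', 'cat', 'dog', 'fish', 'cat']
>>> sorted(set(items))            # unique elements; a set itself has no defined order
['cat', 'dog', 'fish']
>>> list(dict.fromkeys(items))    # unique elements in first-seen order (Python 3.7+)
['dog', 'cat', 'fish']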
> Try backdenting that statement. You're currently doing it at every
> iteration of the loop - that's why it's so much slower.
Thanks. It works now.
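The quoted post doesn't show which statement was over-indented, but the pattern being described looks like this: a statement indented into the loop body runs on every iteration, while the "backdented" version runs once after the loop (a hypothetical illustration, not the poster's actual code):

# inside the loop: rebuilt on every pass, quadratic work
lemma_list = []
for synset in synset_list:
    lemma_list.extend(synset.lemma_names())
    unique = set(lemma_list)

# backdented: built once, after the loop finishes
lemma_list = []
for synset in synset_list:
    lemma_list.extend(synset.lemma_names())
unique = set(lemma_list)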
>>> def average_polysemy(pos):
    synset_list = list(wn.all_synsets(pos))
    sense_number = 0
    lemma_list = []
    for synset in synset_list:
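The rest of the definition is cut off here; a minimal sketch of the approach the thread converges on (extend to flatten the lemma names, a set to deduplicate, then average the per-lemma sense counts) might look like this, noting again that lemma_names() takes parentheses in current NLTK:

>>> from nltk.corpus import wordnet as wn
>>> def average_polysemy(pos):
    lemma_list = []
    for synset in wn.all_synsets(pos):
        lemma_list.extend(synset.lemma_names())   # extend flattens; append would nest
    unique_lemmas = set(lemma_list)               # count each lemma name once
    sense_number = sum(len(wn.synsets(lemma, pos)) for lemma in unique_lemmas)
    return sense_number / len(unique_lemmas)
>>> average_polysemy('n')   # the exact value depends on the WordNet version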
Thanks. I tried to use set() as you suggested, but it was not successful. Please see:
>>> synsets = list(wn.all_synsets('n'))
>>> synsets[:5]
[Synset('entity.n.01'), Synset('physical_entity.n.01'),
Synset('abstraction.n.06'), Synset('thing.n.12'), Synset('object.n.01')]
>>> lemma_set = set()
>>> for
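The loop is cut off above; the usual failure with set() here is calling add() with a whole list of lemma names (lists are unhashable, so that raises a TypeError). A sketch that works, building on the synsets list just shown:

>>> lemma_set = set()
>>> for synset in synsets:
    lemma_set.update(synset.lemma_names())   # update() adds each string; add() on a list raises TypeError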
Thanks very much for all of your tips. Take nouns as an example. First, I need to find all the lemma_names in all the synsets whose pos is 'n'. Second, for each lemma_name, I will check its number of senses.
1) Surely, we can get the number of synsets whose pos is noun with
>>> len([synset for synset in wn.all_synsets('n')])
> In fact, I'm guessing that's your problem. I think you're ending up
> with a list of lists of strings, when you think you're getting a list of
> strings.
Thanks. You guessed right. It turns out that lemma_list is a list of lists, as I tested in the previous post.
> structures are simple, just plain print will work, but for more
> complicated structures, pprint.pprint() is a life saver.
I did try it. However:
>>> pprint.pprint(lemma_list)
Traceback (most recent call last):
  File "<pyshell#...>", line 1, in <module>
    pprint.pprint(lemma_list)
NameError: name 'pprint' is not defined
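The NameError simply means the module was never imported; pprint is in the standard library, so an import fixes it:

>>> import pprint
>>> pprint.pprint(lemma_list)   # now prints the nested lists readably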
Thanks. By the way, is there a list of explanations of error messages? If so, whenever we come across an error message, we can refer to it and solve the problem accordingly.
In order to solve the following question from http://nltk.googlecode.com/svn/trunk/doc/book/ch02.html:
★ Use one of the predefined similarity measures to score the similarity of each of the following pairs of words. Rank the pairs in order of decreasing similarity. How close is your ranking to the order given in the book?
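A minimal sketch of one way to do this, using path_similarity as the predefined measure and sorting with operator.itemgetter (the pair list here is shortened for illustration):

>>> from nltk.corpus import wordnet as wn
>>> from operator import itemgetter
>>> pairs = [('car', 'automobile'), ('gem', 'jewel'), ('journey', 'voyage')]
>>> scored = []
>>> for w1, w2 in pairs:
    s1 = wn.synset(w1 + '.n.01')
    s2 = wn.synset(w2 + '.n.01')
    scored.append((w1 + '-' + w2, s1.path_similarity(s2)))
>>> ranked = sorted(scored, key=itemgetter(1), reverse=True)   # most similar first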
Yes, thanks for all your tips. I did try sorted with itemgetter. However, the sorted results are the same whether I set reverse=True or reverse=False. Isn't that strange? Thanks.
>>> import nltk
>>> from nltk.corpus import wordnet as wn
>>> pairs = {'car':'automobile', 'gem':'jewel', 'journey':'voyage'}
Dear all, the problem has been solved as follows. Thanks anyway:
>>> import nltk
>>> from nltk.corpus import wordnet as wn
>>> pairs = {'car':'automobile', 'gem':'jewel', 'journey':'voyage'}
>>> list_simi=[]
>>> for key in pairs:
    word1 = wn.synset(str(key) + '.n.01')
    word2 = wn.synset(str(pairs[key]) + '.n.01')
Thanks indeed for all your suggestions. When I try the above code, what puzzles me is that as the data in the dictionary increase, some data go missing from the sorted result. Quite odd. In pairs we have {'journey':'voyage'}, but in the sorted result there is no ('journey-voyage', 0.25), which did
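The post is cut off, but a plausible cause, given the follow-up about tuples versus dictionaries, is collecting the results in a dict keyed by the similarity score: dictionary keys are unique, so pairs with equal scores silently overwrite one another, while a list of tuples keeps every pair (the second pair below is illustrative):

>>> scores = {}
>>> scores[0.25] = 'journey-voyage'
>>> scores[0.25] = 'brother-monk'    # same key: the previous entry is replaced
>>> scores
{0.25: 'brother-monk'}
>>> scored = [('journey-voyage', 0.25), ('brother-monk', 0.25)]   # tuples keep both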
Thanks indeed for your tips. Now I understand the difference between tuples and dictionaries more deeply.