On 2 June 2016 at 19:19, Olaoluwa Thomas <thomasolaol...@gmail.com> wrote:
> Thanks, everyone, for your help.
>
> The objective was to extract all words from each line and place them in a
> list IF they didn't already exist in it.
> I sorted it out by adding little bits of everyone's suggestions.

Well done. It looks like you have it working now which is good.

Now that you've told us about the extra bit that you only wanted to store *unique* words I thought I'd tell you about Python's set data type. A set is a container similar to a list but different. You can create a set by using curly brackets {} rather than square brackets [] for a list e.g.:

    >>> myset = {3,2,6}
    >>> myset
    set([2, 3, 6])

Notice first that when we print the set out it doesn't print the elements in the same order that we put them in. This is because a set doesn't care about the order of the elements. Also a set only ever stores one copy of each item that you add so

    >>> myset.add(-1)
    >>> myset
    set([2, 3, -1, 6])
    >>> myset.add(6)
    >>> myset
    set([2, 3, -1, 6])

The add method adds an element but when we add an element that's already in the set it doesn't get added a second time: a set only stores unique elements (which is what you want to do).

So you wrote:

> lst = list()
> for line in fhand:
>     words = line.split()
>     for word in words:
>         if word in lst:
>             continue
>         else:
>             lst.append(word)
> lst.sort()
> print lst

Using a set we could instead write:

    unique_words = set()
    for line in fhand:
        words = line.split()
        for word in words:
            unique_words.add(word)
    lst = sorted(unique_words)
    print lst

This is made simpler because we didn't need to check if the word was already in the set (the set.add method takes care of this for us). However since a set doesn't have an "order" it doesn't have a sort method. If we want a sorted list we can use the sorted function to get that from the set.

Also there is a set method "update" which can add many elements at once given e.g. a list like words so we can do:

    unique_words = set()
    for line in fhand:
        words = line.split()
        unique_words.update(words)
    lst = sorted(unique_words)
    print lst

I've mentioned two big differences between a set and a list: a set is unordered and only stores unique elements.

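If you want to try the set-based loop without needing a file, here is a small self-contained sketch. The sample lines are made up for illustration (they stand in for fhand), and unique_sorted_words is just a name I picked for the example. I've written print with parentheses so that it behaves the same under Python 2 and Python 3.

```python
def unique_sorted_words(lines):
    """Return the distinct whitespace-separated words in lines, sorted."""
    unique_words = set()
    for line in lines:
        # update() adds every word from the list; duplicates are ignored
        unique_words.update(line.split())
    return sorted(unique_words)


# Made-up sample data standing in for the open file handle (fhand)
sample_lines = [
    "the quick brown fox",
    "jumps over the lazy dog",
    "the dog barks",
]

lst = unique_sorted_words(sample_lines)
print(lst)
# ['barks', 'brown', 'dog', 'fox', 'jumps', 'lazy', 'over', 'quick', 'the']
```

Although "the" appears three times and "dog" twice in the sample lines, each shows up only once in the result.
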
There is another significant difference which is about how long it takes the computer to perform certain operations with sets vs lists. In particular when we do this

    if word in lst:
        continue
    else:
        lst.append(word)

Testing if word is "in" a list can take a lot longer than testing if it is "in" a set. If you need to test many times whether different objects are "in" a list then it can often make your program a lot slower than if you used a set. You can understand this intuitively by thinking that "if word in lst" requires the computer to loop through all items of the list comparing them with word. With sets the computer has a cleverer way of doing this that is much faster when the set/list is large (look up hash tables if you're interested).

--
Oscar
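
P.S. If you'd like to see the speed difference between "in" on a list and "in" on a set for yourself, here is a rough sketch using the standard library's timeit module. The sizes (10000 items, 200 repetitions) are arbitrary numbers I picked, and the exact timings will vary from machine to machine, so treat it as an illustration rather than a proper benchmark.

```python
import timeit

# Arbitrary sizes chosen purely for illustration
n = 10000
items_list = list(range(n))
items_set = set(items_list)

# A value that is not present, so the list has to be scanned end to end
missing = -1

# Time 200 membership tests against each container
list_time = timeit.timeit(lambda: missing in items_list, number=200)
set_time = timeit.timeit(lambda: missing in items_set, number=200)

print("list: %.6f s" % list_time)
print("set:  %.6f s" % set_time)
```

On a typical machine the set figure should come out far smaller than the list figure, and the gap grows as n gets larger.
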