Greetings again Hal,

Thank you for posting your small amounts of code and results inline. Thanks for also including clear questions. Your "surface" still seems to add extra space, so, if you could trim that, you may get even more responses from others who are on the Tutor mailing list.

Now, on to your question.

fname = raw_input("Enter file name: ")
if len(fname) < 1 : fname = "mbox-short.txt"
fh = open(fname)
count = 0
for line in fh:
   if not line.startswith('From'): continue
   line2 = line.strip()
   line3 = line2.split()
   line4 = line3[1]
   addresses = set()
   addresses.add(line4)
   count = count + 1
   print addresses
print "There were", count, "lines in the file with From as the first word"

The code produces the following output:

In [15]: %run _8_5_v_13.py
Enter file name: mbox-short.txt
set(['stephen.marqu...@uct.ac.za'])

  [ ... snip ... ]

set(['c...@iupui.edu'])

Question no. 1: Is there a built-in function for set that parses the data for duplicates?

The problem is not with the data structure called set(), which already ignores duplicate entries on its own.

Your program is not bad at all.

I would suggest making two small changes to it.

I think I have seen a pattern in the samples of code you have been sending: you re-create the same variable inside a loop, and then do not understand why it is not collecting (or accumulating) all of the results.

Here's your program.  I have moved two lines.  The idea here is to initialize
the 'addresses' variable before the loop begins (exactly as you do with the
'count' variable).  Then, after the loop completes (and you have processed
all of your input and accumulated all of the desired data), you can print
out the contents of the set variable called 'addresses'.

Try this out:

  fname = raw_input("Enter file name: ")
  if len(fname) < 1 : fname = "mbox-short.txt"
  fh = open(fname)
  count = 0
  addresses = set()
  for line in fh:
     if not line.startswith('From'): continue
     line2 = line.strip()
     line3 = line2.split()
     line4 = line3[1]
     addresses.add(line4)
     count = count + 1
  print "There were", count, "lines in the file with From as the first word"
  print addresses
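
If it helps to see the difference in isolation, here is a small sketch using made-up addresses.  The 'samples' list and the names 'inside' and 'outside' are mine, purely for illustration; they are not from your program.  I have written the print() calls so that they should behave the same under Python 2 and Python 3.

```python
# Hypothetical sample data standing in for the addresses in the mbox file.
samples = ['a@example.com', 'b@example.com', 'a@example.com']

# Creating the set inside the loop: every pass throws away the old set,
# so only the most recent address is left when the loop ends.
for item in samples:
    inside = set()
    inside.add(item)

# Creating the set once, before the loop: it accumulates across passes,
# and adding an address that is already present changes nothing.
outside = set()
for item in samples:
    outside.add(item)

print(len(inside))   # 1
print(len(outside))  # 2
```

The second loop is the same shape as the corrected program: one set, created before the loop, that every pass adds to.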


Question no. 2: Why is there not a built-in function for append?

Question no. 3: If all else fails, i.e., append & set, is my only option to slice the data set?

I do not understand these two questions.

Good luck.

-Martin

P.S. By the way, Alan Gauld has also responded to your message, with
  a differently-phrased answer, but, fundamentally, he and I are
  saying the same thing.  Think about where you are initializing
  your variables, and know that 'addresses = set()' in the middle
  of the code is re-initializing the variable and throwing away
  anything that was there before.

--
Martin A. Brown
http://linux-ip.net/