Query regarding set([])?

2009-07-10 Thread vox
Hi,
I'm constructing a simple compare script and thought I would use
set([]) to generate the difference output. But I'm obviously doing
something wrong.

file1 contains 410 rows.
file2 contains 386 rows.
I want to know what rows are in file1 but not in file2.

This is my script:
s1 = set(open("file1"))
s2 = set(open("file2"))
s3 = set([])
s1temp = set([])
s2temp = set([])

s1temp = set(i.strip() for i in s1)
s2temp = set(i.strip() for i in s2)
s3 = s1temp-s2temp

print len(s3)

Output is 119. AFAIK 410-386=24. What am I doing wrong here?

BR,
Andy
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Query regarding set([])?

2009-07-10 Thread vox
On Jul 10, 2:04 pm, Peter Otten <[email protected]> wrote:
> You are probably misinterpreting len(s3). s3 contains lines occurring in
> "file1" but not in "file2". Duplicate lines are only counted once, and the
> order doesn't matter.
>
> So there are 119 lines that occur at least once in "file1", but not in
> "file2".
>
> If that is not what you want you have to tell us what exactly you are
> looking for.
>
> Peter
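
A minimal sketch of why the set difference need not equal the difference
in row counts. The two lists are hypothetical stand-ins for file1 and
file2, not the poster's data, and print() is written with one argument so
it also runs on the Python 2 used in the original script:

```python
# Hypothetical in-memory stand-ins for the two files.
file1_lines = ["a\n", "b\n", "b\n", "c\n", "d\n"]  # 5 rows, 4 distinct
file2_lines = ["a\n", "x\n", "y\n"]                # 3 rows, 3 distinct

s1 = set(line.strip() for line in file1_lines)
s2 = set(line.strip() for line in file2_lines)

# Lines present in file1 but absent from file2; duplicates collapse.
only_in_file1 = s1 - s2

print(len(file1_lines) - len(file2_lines))  # 2
print(sorted(only_in_file1))                # ['b', 'c', 'd']
print(len(only_in_file1))                   # 3
```

The row counts differ by 2, yet three distinct lines are missing from the
second list, because that list also holds lines the first one lacks.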

Hi,
Thanks for the answer.

I am looking for a script that compares file1 and file2: for each line
in file1, it checks whether that line is present in file2. If the line
from file1 is not present in file2, it prints that line or writes it to
file3, because I have to know which lines to add to file2.

BR,
Andy





Re: Query regarding set([])?

2009-07-10 Thread vox
On Jul 10, 4:17 pm, Dave Angel  wrote:
> vox wrote:
> > On Jul 10, 2:04 pm, Peter Otten <[email protected]> wrote:
>
> >> You are probably misinterpreting len(s3). s3 contains lines occurring in
> >> "file1" but not in "file2". Duplicate lines are only counted once, and the
> >> order doesn't matter.
>
> >> So there are 119 lines that occur at least once in "file1", but not in
> >> "file2".
>
> >> If that is not what you want you have to tell us what exactly you are
> >> looking for.
>
> >> Peter
>
> > Hi,
> > Thanks for the answer.
>
> > I am looking for a script that compares file1 and file2: for each line
> > in file1, it checks whether that line is present in file2. If the line
> > from file1 is not present in file2, it prints that line or writes it to
> > file3, because I have to know which lines to add to file2.
>
> > BR,
> > Andy
>
> There's no more detail in that response.  To the level of detail you
> provide, the program works perfectly.  Just loop through the set and
> write the members to the file.
>
> But you have some unspecified assumptions:
>     1) order doesn't matter
>     2) duplicates are impossible in the input file, or at least not
> meaningful.  So the correct output file could very well be smaller than
> either of the input files.
>
> And a few others that might matter:
>     3) the two files are both text files, with identical line endings
> matching your OS default
>     4) the two files are ASCII, or at least 8 bit encoded, using the
> same encoding  (such as both UTF-8)
>     5) the last line of each file DOES have a trailing newline sequence
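
A rough sketch of how assumptions 2 to 5 could be relaxed, if they turn
out not to hold. The helper names read_lines and missing_with_counts are
hypothetical, and UTF-8 is only an assumed default encoding; nothing in
the thread says which encoding the files use:

```python
import io
from collections import Counter


def read_lines(path, encoding="utf-8"):
    """Read a text file into a list of lines without their line endings."""
    # io.open works on Python 2 and 3. In text mode with the default
    # newline handling, "\r\n" and "\r" are translated to "\n", so mixed
    # line endings compare equal. rstrip("\n") also copes with a final
    # line that has no trailing newline.
    with io.open(path, encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]


def missing_with_counts(lines1, lines2):
    """Count how many more times each line occurs in lines1 than in lines2."""
    # Counter subtraction keeps only positive counts, so duplicates in
    # lines1 are treated as meaningful instead of being collapsed.
    return Counter(lines1) - Counter(lines2)
```

Unlike the set version, a line that occurs twice in file1 and once in
file2 is reported here with a count of 1.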

Thanks all for the input!
I guess I have to think it through a couple more times. :)

BR,
Andy