Re: binary file compare...

2009-04-18 Thread Piet van Oostrum
> Adam Olsen (AO) wrote: >AO> The Wayback Machine has 150 billion pages, so 2**37. Google's index >AO> is a bit larger at over a trillion pages, so 2**40. A little closer >AO> than I'd like, but that's still 56294995000 to 1 odds of having >AO> *any* collisions between *any* of the file

Re: binary file compare...

2009-04-17 Thread Steven D'Aprano
On Fri, 17 Apr 2009 11:19:31 -0700, Adam Olsen wrote: > Actually, *cryptographic* hashes handle that just fine. Even for files > with just a 1 bit change the output is totally different. This is known > as the Avalanche Effect. Otherwise they'd be vulnerable to attacks. > > Which isn't to say

Re: binary file compare...

2009-04-17 Thread Lawrence D'Oliveiro
In message , Nigel Rantor wrote: > Adam Olsen wrote: > >> The chance of *accidentally* producing a collision, although >> technically possible, is so extraordinarily rare that it's completely >> overshadowed by the risk of a hardware or software failure producing >> an incorrect result. > > Not

Re: binary file compare...

2009-04-17 Thread Adam Olsen
On Apr 17, 9:59 am, SpreadTooThin wrote: > You know this is just insane.  I'd be satisfied with a CRC16 or > something in the situation i'm in. > I have two large files, one local and one remote.  Transferring every > byte across the internet to be sure that the two files are identical > is just n
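For the local-versus-remote situation described in this post, one common approach is to hash each file where it lives and send only the digests across the network. The following is a minimal sketch, not something taken from the thread: the helper name `file_digest_hex` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib

def file_digest_hex(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the empty bytes object returned at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Each side would run the helper on its own copy and the two hex strings would then be compared. Equal digests make identical files overwhelmingly likely but, as the thread argues at length, do not prove it.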

Re: binary file compare...

2009-04-17 Thread Adam Olsen
On Apr 17, 9:59 am, norseman wrote: > The more complicated the math the harder it is to keep a higher form of > math from checking (or improperly displacing) a lower one.  Which, of > course, breaks the rules.  Commonly called improper thinking. A number > of math teasers make use of that. Of cou

Re: binary file compare...

2009-04-17 Thread Adam Olsen
On Apr 17, 5:30 am, Tim Wintle wrote: > On Thu, 2009-04-16 at 21:44 -0700, Adam Olsen wrote: > > The Wayback Machine has 150 billion pages, so 2**37.  Google's index > > is a bit larger at over a trillion pages, so 2**40.  A little closer > > than I'd like, but that's still 56294995000 to 1 od

Re: binary file compare...

2009-04-17 Thread SpreadTooThin
On Apr 17, 4:54 am, Nigel Rantor wrote: > Adam Olsen wrote: > > On Apr 16, 11:15 am, SpreadTooThin wrote: > >> And yes he is right CRCs hashing all have a probability of saying that > >> the files are identical when in fact they are not. > > > Here's the bottom line.  It is either: > > > A) Sever

Re: binary file compare...

2009-04-17 Thread norseman
Adam Olsen wrote: On Apr 16, 11:15 am, SpreadTooThin wrote: And yes he is right CRCs hashing all have a probability of saying that the files are identical when in fact they are not. Here's the bottom line. It is either: A) Several hundred years of mathematics and cryptography are wrong. The

Re: binary file compare...

2009-04-17 Thread Tim Wintle
On Thu, 2009-04-16 at 21:44 -0700, Adam Olsen wrote: > The Wayback Machine has 150 billion pages, so 2**37. Google's index > is a bit larger at over a trillion pages, so 2**40. A little closer > than I'd like, but that's still 56294995000 to 1 odds of having > *any* collisions between *any* o

Re: binary file compare...

2009-04-17 Thread Nigel Rantor
Adam Olsen wrote: On Apr 16, 11:15 am, SpreadTooThin wrote: And yes he is right CRCs hashing all have a probability of saying that the files are identical when in fact they are not. Here's the bottom line. It is either: A) Several hundred years of mathematics and cryptography are wrong. The

Re: binary file compare...

2009-04-17 Thread Nigel Rantor
Adam Olsen wrote: On Apr 16, 4:27 pm, "Rhodri James" wrote: On Thu, 16 Apr 2009 10:44:06 +0100, Adam Olsen wrote: On Apr 16, 3:16 am, Nigel Rantor wrote: Okay, before I tell you about the empirical, real-world evidence I have could you please accept that hashes collide and that no matter ho

Re: binary file compare...

2009-04-16 Thread Adam Olsen
On Apr 16, 4:27 pm, "Rhodri James" wrote: > On Thu, 16 Apr 2009 10:44:06 +0100, Adam Olsen wrote: > > On Apr 16, 3:16 am, Nigel Rantor wrote: > >> Okay, before I tell you about the empirical, real-world evidence I have > >> could you please accept that hashes collide and that no matter how many

Re: binary file compare...

2009-04-16 Thread Adam Olsen
On Apr 16, 11:15 am, SpreadTooThin wrote: > And yes he is right CRCs hashing all have a probability of saying that > the files are identical when in fact they are not. Here's the bottom line. It is either: A) Several hundred years of mathematics and cryptography are wrong. The birthday problem
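The birthday problem mentioned here can be put into numbers with the usual approximation p ≈ 1 - exp(-n(n-1) / (2·2^b)) for n items and a b-bit hash. The sketch below assumes hash outputs behave like uniform random values; the function name `collision_probability` is hypothetical and the code is not from the thread.

```python
import math

def collision_probability(num_items, hash_bits):
    """Approximate chance that at least two of num_items random hashes collide."""
    pairs = num_items * (num_items - 1) // 2
    space = 2 ** hash_bits
    # 1 - exp(-pairs/space); expm1 keeps precision when the result is tiny.
    return -math.expm1(-pairs / space)
```

Under that assumption, `collision_probability(2**40, 128)` comes out at roughly 2**-49, about 1.8e-15, for a 128-bit hash such as MD5 applied to 2**40 files.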

Re: binary file compare...

2009-04-16 Thread Rhodri James
On Thu, 16 Apr 2009 10:44:06 +0100, Adam Olsen wrote: On Apr 16, 3:16 am, Nigel Rantor wrote: Okay, before I tell you about the empirical, real-world evidence I have could you please accept that hashes collide and that no matter how many samples you use the probability of finding two files th

Re: binary file compare...

2009-04-16 Thread Adam Olsen
On Apr 16, 8:59 am, Grant Edwards wrote: > On 2009-04-16, Adam Olsen wrote: > > I'm afraid you will need to back up your claims with real files. > > Although MD5 is a smaller, older hash (128 bits, so you only need > > 2**64 files to find collisions), > > You don't need quite that many to have a

Re: binary file compare...

2009-04-16 Thread SpreadTooThin
On Apr 16, 3:16 am, Nigel Rantor wrote: > Adam Olsen wrote: > > On Apr 15, 12:56 pm, Nigel Rantor wrote: > >> Adam Olsen wrote: > >>> The chance of *accidentally* producing a collision, although > >>> technically possible, is so extraordinarily rare that it's completely > >>> overshadowed by the

Re: binary file compare...

2009-04-16 Thread Grant Edwards
On 2009-04-16, Adam Olsen wrote: > The chance of *accidentally* producing a collision, although > technically possible, is so extraordinarily rare that it's > completely overshadowed by the risk of a hardware or software > failure producing an incorrect result. Not when

Re: binary file compare...

2009-04-16 Thread Nigel Rantor
Adam Olsen wrote: On Apr 16, 3:16 am, Nigel Rantor wrote: Adam Olsen wrote: On Apr 15, 12:56 pm, Nigel Rantor wrote: Adam Olsen wrote: The chance of *accidentally* producing a collision, although technically possible, is so extraordinarily rare that it's completely overshadowed by the risk

Re: binary file compare...

2009-04-16 Thread Adam Olsen
On Apr 16, 3:16 am, Nigel Rantor wrote: > Adam Olsen wrote: > > On Apr 15, 12:56 pm, Nigel Rantor wrote: > >> Adam Olsen wrote: > >>> The chance of *accidentally* producing a collision, although > >>> technically possible, is so extraordinarily rare that it's completely > >>> overshadowed by the

Re: binary file compare...

2009-04-16 Thread Nigel Rantor
Adam Olsen wrote: On Apr 15, 12:56 pm, Nigel Rantor wrote: Adam Olsen wrote: The chance of *accidentally* producing a collision, although technically possible, is so extraordinarily rare that it's completely overshadowed by the risk of a hardware or software failure producing an incorrect resu

Re: binary file compare...

2009-04-16 Thread Adam Olsen
On Apr 15, 12:56 pm, Nigel Rantor wrote: > Adam Olsen wrote: > > The chance of *accidentally* producing a collision, although > > technically possible, is so extraordinarily rare that it's completely > > overshadowed by the risk of a hardware or software failure producing > > an incorrect result.

Re: binary file compare...

2009-04-15 Thread Nigel Rantor
Adam Olsen wrote: The chance of *accidentally* producing a collision, although technically possible, is so extraordinarily rare that it's completely overshadowed by the risk of a hardware or software failure producing an incorrect result. Not when you're using them to compare lots of files. Tr

Re: binary file compare...

2009-04-15 Thread Adam Olsen
On Apr 15, 11:04 am, Nigel Rantor wrote: > The fact that two md5 hashes are equal does not mean that the sources > they were generated from are equal. To do that you must still perform a > byte-by-byte comparison which is much less work for the processor than > generating an md5 or sha hash. > > I

Re: binary file compare...

2009-04-15 Thread SpreadTooThin
On Apr 15, 8:04 am, Grant Edwards wrote: > On 2009-04-15, Martin wrote: > > > > > Hi, > > > On Mon, Apr 13, 2009 at 10:03 PM, Grant Edwards wrote: > >> On 2009-04-13, SpreadTooThin wrote: > > >>> I want to compare two binary files and see if they are the same. > >>> I see the filecmp.cmp functi

Re: binary file compare...

2009-04-15 Thread Nigel Rantor
Grant Edwards wrote: We all rail against premature optimization, but using a checksum instead of a direct comparison is premature unoptimization. ;) And more than that, will provide false positives for some inputs. So, basically it's a worse-than-useless approach for determining if two files

Re: binary file compare...

2009-04-15 Thread Nigel Rantor
Martin wrote: On Wed, Apr 15, 2009 at 11:03 AM, Steven D'Aprano wrote: The checksum does look at every byte in each file. Checksumming isn't a way to avoid looking at each byte of the two files, it is a way of mapping all the bytes to a single number. My understanding of the original question

Re: binary file compare...

2009-04-15 Thread Grant Edwards
On 2009-04-15, Martin wrote: > On Wed, Apr 15, 2009 at 11:03 AM, Steven D'Aprano > I'd still say rather burn CPU cycles than development hours (if I got > the question right), _Hours_? Calling the file compare module takes _one_line_of_code_. Implementing a file compare from scratch takes abo
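To give a sense of how small a from-scratch comparison is, here is a minimal sketch of a chunked byte-by-byte compare. It is an illustration only: the name `files_identical` and the chunk size are assumptions, and it is not the code the poster had in mind.

```python
import os

def files_identical(path_a, path_b, chunk_size=64 * 1024):
    """Compare two files byte for byte, stopping at the first difference."""
    # Files of different sizes can never be identical, so skip reading them.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files reached EOF together
                return True
```

Unlike a checksum, this stops reading as soon as the first differing chunk is found.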

Re: binary file compare...

2009-04-15 Thread Grant Edwards
On 2009-04-15, Martin wrote: > Hi, > > On Mon, Apr 13, 2009 at 10:03 PM, Grant Edwards wrote: >> On 2009-04-13, SpreadTooThin wrote: >> >>> I want to compare two binary files and see if they are the same. >>> I see the filecmp.cmp function but I don't get a warm fuzzy feeling >>> that it is doin

Re: binary file compare...

2009-04-15 Thread Martin
On Wed, Apr 15, 2009 at 11:03 AM, Steven D'Aprano wrote: > The checksum does look at every byte in each file. Checksumming isn't a > way to avoid looking at each byte of the two files, it is a way of > mapping all the bytes to a single number. My understanding of the original question was a way t

Re: binary file compare...

2009-04-15 Thread Steven D'Aprano
On Wed, 15 Apr 2009 07:54:20 +0200, Martin wrote: >> Perhaps I'm being dim, but how else are you going to decide if two >> files are the same unless you compare the bytes in the files? > > I'd say checksums, just about every download relies on checksums to > verify you do have indeed the same fil

Re: binary file compare...

2009-04-14 Thread Martin
Hi, On Mon, Apr 13, 2009 at 10:03 PM, Grant Edwards wrote: > On 2009-04-13, SpreadTooThin wrote: > >> I want to compare two binary files and see if they are the same. >> I see the filecmp.cmp function but I don't get a warm fuzzy feeling >> that it is doing a byte by byte comparison of two files

Re: binary file compare...

2009-04-14 Thread Adam Olsen
On Apr 13, 8:39 pm, Grant Edwards wrote: > On 2009-04-13, Peter Otten <[email protected]> wrote: > > > But there's a cache. A change of file contents may go > > undetected as long as the file stats don't change: > > Good point.  You can fool it if you force the stats to their > old values after you

Re: binary file compare...

2009-04-13 Thread Grant Edwards
On 2009-04-13, Peter Otten <[email protected]> wrote: > But there's a cache. A change of file contents may go > undetected as long as the file stats don't change: Good point. You can fool it if you force the stats to their old values after you modify a file and you don't clear the cache. -- Gra
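The caching concern raised here can be sidestepped in current Python versions. The sketch below assumes Python 3.4 or later, where `filecmp.clear_cache()` is available; the wrapper name `same_contents` is hypothetical.

```python
import filecmp

def same_contents(path_a, path_b):
    """Compare file contents, ignoring any earlier cached result."""
    # clear_cache() (Python 3.4+) drops results remembered from previous calls.
    filecmp.clear_cache()
    # shallow=False asks for a content comparison rather than trusting os.stat().
    return filecmp.cmp(path_a, path_b, shallow=False)
```

Clearing the cache on every call is deliberately conservative. It costs little and removes the case where the stats are forced back to their old values after a modification.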

Re: binary file compare...

2009-04-13 Thread Dave Angel
SpreadTooThin wrote: On Apr 13, 2:37 pm, Grant Edwards wrote: On 2009-04-13, Grant Edwards wrote: On 2009-04-13, SpreadTooThin wrote: I want to compare two binary files and see if they are the same. I see the filecmp.cmp function but I don't get a warm fuzzy feeling that i

Re: binary file compare...

2009-04-13 Thread Steven D'Aprano
On Mon, 13 Apr 2009 15:03:32 -0500, Grant Edwards wrote: > On 2009-04-13, SpreadTooThin wrote: > >> I want to compare two binary files and see if they are the same. I see >> the filecmp.cmp function but I don't get a warm fuzzy feeling that it >> is doing a byte by byte comparison of two files t

Re: binary file compare...

2009-04-13 Thread Peter Otten
Grant Edwards wrote: > On 2009-04-13, Grant Edwards wrote: >> On 2009-04-13, SpreadTooThin wrote: >> >>> I want to compare two binary files and see if they are the same. >>> I see the filecmp.cmp function but I don't get a warm fuzzy feeling >>> that it is doing a byte by byte comparison of two

Re: binary file compare...

2009-04-13 Thread SpreadTooThin
On Apr 13, 2:37 pm, Grant Edwards wrote: > On 2009-04-13, Grant Edwards wrote: > > > > > On 2009-04-13, SpreadTooThin wrote: > > >> I want to compare two binary files and see if they are the same. > >> I see the filecmp.cmp function but I don't get a warm fuzzy feeling > >> that it is doing a by

Re: binary file compare...

2009-04-13 Thread Grant Edwards
On 2009-04-13, Grant Edwards wrote: > On 2009-04-13, SpreadTooThin wrote: > >> I want to compare two binary files and see if they are the same. >> I see the filecmp.cmp function but I don't get a warm fuzzy feeling >> that it is doing a byte by byte comparison of two files to see if they >> are t

Re: binary file compare...

2009-04-13 Thread SpreadTooThin
On Apr 13, 2:03 pm, Grant Edwards wrote: > On 2009-04-13, SpreadTooThin wrote: > > > I want to compare two binary files and see if they are the same. > > I see the filecmp.cmp function but I don't get a warm fuzzy feeling > > that it is doing a byte by byte comparison of two files to see if they

Re: binary file compare...

2009-04-13 Thread Grant Edwards
On 2009-04-13, SpreadTooThin wrote: > I want to compare two binary files and see if they are the same. > I see the filecmp.cmp function but I don't get a warm fuzzy feeling > that it is doing a byte by byte comparison of two files to see if they > are they same. Perhaps I'm being dim, but how el

Re: binary file compare...

2009-04-13 Thread SpreadTooThin
On Apr 13, 2:00 pm, Przemyslaw Kaminski wrote: > SpreadTooThin wrote: > > I want to compare two binary files and see if they are the same. > > I see the filecmp.cmp function but I don't get a warm fuzzy feeling > > that it is doing a byte by byte comparison of two files to see if they > > are they

Re: binary file compare...

2009-04-13 Thread Przemyslaw Kaminski
SpreadTooThin wrote: > I want to compare two binary files and see if they are the same. > I see the filecmp.cmp function but I don't get a warm fuzzy feeling > that it is doing a byte by byte comparison of two files to see if they > are they same. > > What should I be using if not filecmp.cmp? W

Re: Binary file Pt 1 - Only reading some

2008-02-05 Thread Gabriel Genellina
En Tue, 05 Feb 2008 11:50:25 -0200, Mastastealth <[EMAIL PROTECTED]> escribió: > On Feb 5, 1:17 am, Gabriel Genellina <[EMAIL PROTECTED]> wrote: >> Using the struct module  http://docs.python.org/lib/module-struct.html >> >> import struct >> data = info.read(15) >> str1, str2, blank, height, wid


Re: Binary file Pt 1 - Only reading some

2008-02-05 Thread Mastastealth
On Feb 5, 8:50 am, Mastastealth <[EMAIL PROTECTED]> wrote: > What is this value for? "6s3s1cBBBh" and why is my unpack limited to a > length of "16"? > > Unfortunately it seems my understanding of binary is way too basic for > what I'm dealing with. Can you point me to a simple guide to > explainin
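The question about a length of 16 is most likely explained by alignment padding. Without a byte-order prefix, `struct` uses native alignment, so the trailing `h` (a C short) is aligned to a 2-byte boundary. The sketch below compares the two sizes with `struct.calcsize`. The `<` prefix is an assumption for illustration, since the snippet does not state the file's byte order.

```python
import struct

NATIVE_FORMAT = "6s3s1cBBBh"     # format string quoted in the thread
PACKED_FORMAT = "<6s3s1cBBBh"    # same fields, little-endian, no padding

# 6 + 3 + 1 + 1 + 1 + 1 + 2 = 15 bytes when no alignment padding is added.
packed_size = struct.calcsize(PACKED_FORMAT)

# In native mode the trailing short is aligned like a C short, which on
# common platforms inserts one pad byte before it.
native_size = struct.calcsize(NATIVE_FORMAT)

print(packed_size, native_size)
```

With an explicit prefix such as `<` or `>`, the format describes exactly 15 bytes, which matches the `info.read(15)` call quoted earlier in the thread.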

Re: Binary file Pt 1 - Only reading some

2008-02-05 Thread Mastastealth
On Feb 5, 1:17 am, Gabriel Genellina <[EMAIL PROTECTED]> wrote: > Using the struct module  http://docs.python.org/lib/module-struct.html > > import struct > data = info.read(15) > str1, str2, blank, height, width, num2, num3 = > struct.unpack("6s3s1cBBBh", data) > > Consider this like a "first atte

Re: Binary file Pt 1 - Only reading some

2008-02-04 Thread Gabriel Genellina
On 5 feb, 01:51, Mastastealth <[EMAIL PROTECTED]> wrote: > I'm trying to create a program to read a certain binary format. I have > the format's spec which goes something like: > > First 6 bytes: String > Next 4 bytes: 3 digit number and a blank byte > --- > Next byte: Height (Number up to 255) > N

Re: Binary file Pt 1 - Only reading some

2008-02-04 Thread Jared Grubb
You should look into the struct module. For example, you could do the same thing via (using the variable names you used before): header_str = info.read(13) a,b,c,d,e = struct.unpack("6s4sBBB", header_str) After that, you will probably be able to get the integers by (doing it one at a time... read'
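The suggestion above can be tried without a real file by unpacking from an in-memory stream. This is a minimal sketch: the sample bytes are made up, and `read_header` is a hypothetical helper name, not something from the post.

```python
import io
import struct

HEADER_FORMAT = "6s4sBBB"  # format string from the post above
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 13: these codes need no padding

def read_header(stream):
    """Read and unpack one header from a binary stream."""
    raw = stream.read(HEADER_SIZE)
    if len(raw) != HEADER_SIZE:
        raise ValueError("stream ended before a full header was read")
    return struct.unpack(HEADER_FORMAT, raw)

# Made-up sample bytes, for illustration only.
sample = io.BytesIO(b"MAGIC!123\x00\x10\x20\x03")
fields = read_header(sample)
```

Checking the length before unpacking turns a truncated file into a clear error rather than a `struct.error` raised from inside `unpack`.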

Re: Binary file output using python

2007-04-19 Thread Nick Craig-Wood
Peter Otten <[EMAIL PROTECTED]> wrote: > Chi Yin Cheung wrote: > > > Is there a way in python to output binary files? I need to python to > > write out a stream of 5 million floating point numbers, separated by > > some separator, but it seems that all python supports natively is string > > infor

Re: Binary file output using python

2007-04-17 Thread Peter Otten
Chi Yin Cheung wrote: > Is there a way in python to output binary files? I need to python to > write out a stream of 5 million floating point numbers, separated by > some separator, but it seems that all python supports natively is string > information output, which is extremely space inefficient.
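One way to write a large run of floats compactly is the standard `array` module, which stores each value as a fixed-width machine double. The sketch below is an illustration under stated assumptions rather than the answer given in the thread: the function names are hypothetical, and the file is written in the machine's native byte order.

```python
from array import array

def write_doubles(path, values):
    """Write floats to a file as raw 8-byte doubles in native byte order."""
    with open(path, "wb") as handle:
        array("d", values).tofile(handle)

def read_doubles(path):
    """Read back every double stored by write_doubles()."""
    data = array("d")
    with open(path, "rb") as handle:
        data.frombytes(handle.read())
    return list(data)
```

Because every value occupies the same number of bytes, no separator is needed; five million doubles take 40,000,000 bytes on disk.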

Re: Binary file output using python

2007-04-17 Thread bvukov
On Apr 17, 10:30 pm, Thomas Dybdahl Ahle <[EMAIL PROTECTED]> wrote: > Den Tue, 17 Apr 2007 11:07:38 -0700 skrev kyosohma: > > > On Apr 17, 12:41 pm, Chi Yin Cheung <[EMAIL PROTECTED]> wrote: > >> Hi, > >> Is there a way in python to output binary files? I need to python to > >> write out a stream o

Re: Binary file output using python

2007-04-17 Thread Thomas Dybdahl Ahle
Den Tue, 17 Apr 2007 11:07:38 -0700 skrev kyosohma: > On Apr 17, 12:41 pm, Chi Yin Cheung <[EMAIL PROTECTED]> wrote: >> Hi, >> Is there a way in python to output binary files? I need to python to >> write out a stream of 5 million floating point numbers, separated by >> some separator, but it seem

Re: Binary file output using python

2007-04-17 Thread Michael Hoffman
Michael Hoffman wrote: > Chi Yin Cheung wrote: >> Hi, >> Is there a way in python to output binary files? I need to python to >> write out a stream of 5 million floating point numbers, separated by >> some separator, but it seems that all python supports natively is >> string information output,

Re: Binary file output using python

2007-04-17 Thread Michael Hoffman
Chi Yin Cheung wrote: > Hi, > Is there a way in python to output binary files? I need to python to > write out a stream of 5 million floating point numbers, separated by > some separator, but it seems that all python supports natively is string > information output, which is extremely space inef

Re: Binary file output using python

2007-04-17 Thread kyosohma
On Apr 17, 12:41 pm, Chi Yin Cheung <[EMAIL PROTECTED]> wrote: > Hi, > Is there a way in python to output binary files? I need to python to > write out a stream of 5 million floating point numbers, separated by > some separator, but it seems that all python supports natively is string > information

Re: Binary File Reading : Metastock

2006-05-03 Thread malv
Jack wrote: > Hi > > I am having a little trouble trying to read a binary file, I would like > to write an ascii to Metastock converter in python but am not having a > lot of success. > > The file formats are > > http://sf.gds.tuwien.ac.at/00-pdf/m/mstockfl/MetaStock.pdf > > > If any one can point

Re: Binary File Reading : Metastock

2006-05-03 Thread bruno at modulix
Jack wrote: > Hi > > I am having a little trouble trying to read a binary file, I would like > to write an ascii to Metastock converter in python but am not having a > lot of success. > > The file formats are > > http://sf.gds.tuwien.ac.at/00-pdf/m/mstockfl/MetaStock.pdf > > > If any one can p

Re: binary file

2005-06-20 Thread Scott David Daniels
Kent Johnson wrote: > Nader Emami wrote: >> Kent Johnson wrote: >>> Nader Emami wrote: I have used the profile module to measure some thing as the next command: profile.run('command', 'file') ...How can I read (or convert) the binary file to an ascii file? >>> Use an instance o

Re: binary file

2005-06-20 Thread Kent Johnson
Nader Emami wrote: > Kent Johnson wrote: > >> Nader Emami wrote: >> >>> L.S., >>> >>> I have used the profile module to measure some thing as the next >>> command: >>> >>> profile.run('command', 'file') >>> >>> But this make a binary file! How can I write the result of 'profile' >>> in a ascii f

Re: binary file

2005-06-20 Thread Nader Emami
Kent Johnson wrote: > Nader Emami wrote: > >> L.S., >> >> I have used the profile module to measure some thing as the next command: >> >> profile.run('command', 'file') >> >> But this make a binary file! How can I write the result of 'profile' >> in a ascii file? Others how can I read (or convert

Re: binary file

2005-06-20 Thread Kent Johnson
Nader Emami wrote: > L.S., > > I have used the profile module to measure some thing as the next command: > > profile.run('command', 'file') > > But this make a binary file! How can I write the result of 'profile' in > a ascii file? Others how can I read (or convert) the binary file to am > asc
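The binary file produced by `profile.run('command', 'file')` is meant to be read with the `pstats` module, which can print it as plain text. The sketch below assumes Python 3 and swaps in `cProfile`, which offers the same `run()` interface as `profile`; the helper name `profile_to_text` is hypothetical.

```python
import cProfile
import io
import pstats

def profile_to_text(statement, stats_path):
    """Profile a statement, save binary stats, and return them as plain text."""
    cProfile.run(statement, stats_path)        # writes the binary stats file
    buffer = io.StringIO()
    stats = pstats.Stats(stats_path, stream=buffer)
    stats.sort_stats("cumulative").print_stats()
    return buffer.getvalue()
```

The returned string can be written to an ordinary text file, which gives the ASCII report the original question asks for.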