Raimo Niskanen <[email protected]> wrote:

> Rodrigo,
>
> was there anything wrong with my answer below (and others equal),
> apart from it not being the one you wanted, since you keep repeating
> the same question over and over again?
>
> Do you have a better answer?  Please share it for us to check.

Raimo, if people believe that hash(A)=hash(B) implies A=B, and believe
it so strongly that they rely on it in their programs and insult those
who contradict them, then my answer will not convince them, because I
say that hash(A)=hash(B) is very far from implying A=B. If you work out
the answer on your own, it will be different: you will believe it. Just
imagine the hash function as a function from a set X with cardinality m
to a set Y with cardinality n, where m is much bigger than n: what do
the sets of elements of X that are mapped to a single y in Y look like?
Imagine the partition of X into these sets X_1, ..., X_n with
cardinalities m_1, ..., m_n and try to compute conditional
probabilities. But surely there is a more direct way. Just play with it.
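
To play with the partition idea, here is a minimal sketch. It assumes
that A and B are drawn independently and uniformly from X, which is a
modelling assumption and not a statement about real files, and it uses
a hypothetical stand-in hash (x mod n) purely for illustration. Under
those assumptions, P(A=B | hash(A)=hash(B)) = m / (m_1^2 + ... + m_n^2).

```python
from collections import Counter
from fractions import Fraction

def toy_hash(x, n):
    """Hypothetical stand-in hash: maps the integer x onto n buckets."""
    return x % n

def prob_equal_given_same_hash(m, n):
    """P(A == B | hash(A) == hash(B)) for A, B uniform and independent
    on X = {0, ..., m-1}, with the toy hash above onto n values."""
    # m_1, ..., m_n: sizes of the sets X_i mapped to each hash value
    sizes = Counter(toy_hash(x, n) for x in range(m))
    # ordered pairs (A, B) that share a hash value
    same_hash_pairs = sum(s * s for s in sizes.values())
    # ordered pairs with A == B: exactly m of them
    return Fraction(m, same_hash_pairs)

p = prob_equal_given_same_hash(m=102_400, n=256)
print(p)  # equals n/m when every X_i has the same size
```

When all the m_i are equal this reduces to n/m, which is small whenever
m is much bigger than n.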

If you have one, two, three, a handful of 4TB files and calculate
their hashes, the probability that two hashes coincide is surely remote.
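
A rough way to put a number on the handful-of-files case is the usual
birthday approximation. This sketch assumes an idealized hash whose
outputs are uniform over 2^bits values; the figure of 128 bits and the
function name are illustrative choices, not taken from any particular
tool.

```python
import math

def collision_probability(k, bits):
    """Birthday approximation to the probability that at least two of k
    distinct inputs share a hash value, assuming an ideal uniform hash
    with 2**bits possible outputs (an assumption, not a measurement)."""
    pairs = k * (k - 1) / 2
    # 1 - exp(-pairs / 2**bits), computed without losing tiny values
    return -math.expm1(-pairs / 2.0**bits)

p_handful = collision_probability(k=5, bits=128)
print(p_handful)
```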

If rsync takes two 4TB files, one on the client and one on the server,
divides them into a lot of 500-byte blocks (the one on the server into
a lot more blocks than the one on the client), calculates hashes
here and there, and compares the hashes here with the hashes there, then we
have something completely different.
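
To see how different the counts are, here is a back-of-the-envelope
sketch. It assumes 4TB means 4*10^12 bytes, that one side is cut into
fixed 500-byte blocks while the other side is examined at every byte
offset, and that the hash is an ideal uniform one with 2^bits outputs.
All of these are assumptions made for illustration, and the function
name is hypothetical.

```python
def block_comparisons(file_bytes, block_bytes, bits):
    """Count (block, window) pairs whose hashes could be compared, and
    the expected number of accidental matches under an ideal uniform
    hash with 2**bits outputs (assumed, not measured)."""
    blocks = file_bytes // block_bytes        # fixed blocks on one side
    windows = file_bytes - block_bytes + 1    # every byte offset on the other
    pairs = blocks * windows
    return pairs, pairs / 2**bits

pairs, false_32 = block_comparisons(4 * 10**12, 500, bits=32)
_, false_128 = block_comparisons(4 * 10**12, 500, bits=128)
print(pairs, false_32, false_128)
```

The number of pairs is on the order of 10^22, compared with the few
pairs you get from a handful of whole-file hashes.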

From time to time I think I should follow Kenneth Westerback's recommendation
and go to a math-for-idiots list, for example the Usenet group "sci.math",
and then post a link to this thread on gmane: they will surely admire Marc
Espie's wisdom and his efforts at teaching idiots like me.

Rodrigo.
