What is a hash function?
A hash function takes a string of any length as input and computes a fixed-size fingerprint, called a hash.
The hashes of two equal strings are always equal. The converse does not hold: if the hashes of two strings are equal, the two strings are not necessarily equal.
When two different strings have the same hash, it is called a collision.
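The three points above can be sketched in a few lines of Python. This is only an illustration: SHA-256 (from the standard hashlib module) is chosen here as an example of a real hash function, and toy_hash is a hypothetical, deliberately weak function invented for this sketch so that a collision is easy to show.

```python
import hashlib

def toy_hash(data: bytes) -> int:
    """Deliberately weak hash (illustration only): sum of the bytes modulo 256."""
    return sum(data) % 256

# A real hash function returns a digest of the same size whatever the input length.
short_digest = hashlib.sha256(b"a").digest()
long_digest = hashlib.sha256(b"a" * 1_000_000).digest()
print(len(short_digest), len(long_digest))  # 32 32

# Equal inputs always give equal hashes.
same = hashlib.sha256(b"hello").digest() == hashlib.sha256(b"hello").digest()
print(same)  # True

# Two different inputs can share a hash: "ab" and "ba" collide under toy_hash,
# because the sum of the bytes does not depend on their order.
collide = toy_hash(b"ab") == toy_hash(b"ba")
print(collide)  # True
```

Collisions also exist for SHA-256, since there are more possible inputs than 32-byte outputs, but unlike with toy_hash, nobody is known to have found one.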
What are hash functions used for?
Hash functions are used to generate a fingerprint of a string that is shorter and easier to manipulate than the string itself. If two hashes are different, then the two input strings are necessarily different.
For example, to check whether two big files are different, it is easier to compare their hashes than their contents. If the hashes are different, the files are different. If the hashes are equal, the files are almost certainly identical, provided the hash function is collision-resistant.
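As a minimal sketch of the file comparison described above, the following Python code hashes files chunk by chunk and compares the digests. The choice of SHA-256, the helper names file_digest and write_file, the chunk size, and the temporary file names are all assumptions made for this illustration.

```python
import hashlib
import os
import tempfile

def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it chunk by chunk."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large file never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def write_file(path: str, content: bytes) -> None:
    """Write bytes to a file (helper for the demo below)."""
    with open(path, "wb") as f:
        f.write(content)

# Demo on small temporary files standing in for big ones.
with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.bin")
    path_b = os.path.join(tmp, "b.bin")
    path_c = os.path.join(tmp, "c.bin")
    write_file(path_a, b"x" * 200_000)
    write_file(path_b, b"x" * 200_000)            # same content as a.bin
    write_file(path_c, b"x" * 199_999 + b"y")     # last byte differs
    same_ab = file_digest(path_a) == file_digest(path_b)
    same_ac = file_digest(path_a) == file_digest(path_c)

print(same_ab)  # True
print(same_ac)  # False
```

Comparing hashes is most useful when the files are on different machines or when the hash of one file was computed in advance, since only the short digest has to be stored or transmitted.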
What are the properties of a secure hash function (from a cryptographic point of view)?
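From a cryptographic point of view, a secure hash function has the following properties:
- the function must be collision-resistant, i.e. it must be infeasible to find two different strings having the same hash value (collisions necessarily exist, because there are more possible inputs than hashes, but nobody should be able to find one);
- the function must have the hiding property, i.e. given a hash, it is infeasible to find the initial string having this hash, provided that string was drawn from a large and unpredictable set of possible inputs (otherwise an attacker can simply hash every candidate);
- the function must be puzzle-friendly, i.e. given a target hash, there is no way to build a string having this hash that is significantly faster than trying candidate strings one by one.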
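To make "infeasible" more concrete, the sketch below searches for a collision on a hash truncated to 16 bits. The truncation, the helper names truncated_hash and find_collision, and the use of SHA-256 are assumptions made for this illustration only. With only 65,536 possible outputs, a collision is guaranteed within 65,537 attempts and typically appears after a few hundred. With a full 256-bit output, the same search would need on the order of 2^128 attempts, which is out of reach in practice.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """Keep only the first n_bytes of SHA-256: a deliberately weakened hash."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Hash the messages b"0", b"1", b"2", ... until two of them share a hash."""
    seen = {}  # maps each hash seen so far to the message that produced it
    for i in count():
        msg = str(i).encode()
        h = truncated_hash(msg, n_bytes)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg

m1, m2, attempts = find_collision()
print(m1, m2, attempts)
print(truncated_hash(m1) == truncated_hash(m2))  # True
```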
Examples of hashes
We will use the MD5 algorithm for the following examples.
Let's first create a file hello-world.txt containing the text "Hello World !!!".
To compute the hash, we execute the following command:
openssl dgst -md5 hello-world.txt
The result is:
If we change the text in the file hello-world.txt from "Hello World !!!" to "hello World !!!", the hash becomes:
The MD5 algorithm is no longer used in cryptography, because practical methods for creating collisions are known.
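The experiment above can also be reproduced without openssl. The following Python sketch hashes the two texts directly with hashlib.md5 and counts how many hexadecimal digits differ. It hashes the raw strings without a trailing newline, which is an assumption: if hello-world.txt ends with a newline, as many editors add one, the digests printed here will differ from the openssl output. Some restricted Python builds may also disable MD5.

```python
import hashlib

original = b"Hello World !!!"
modified = b"hello World !!!"  # only the first letter changes

h1 = hashlib.md5(original).hexdigest()
h2 = hashlib.md5(modified).hexdigest()
print(h1)
print(h2)

# Count the positions where the two 32-digit hex strings disagree.
differing = sum(a != b for a, b in zip(h1, h2))
print(differing, "of", len(h1), "hex digits differ")
```

A one-character change in the input produces a hash that looks unrelated to the previous one, which is why a hash works well as a fingerprint.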