checksums explained
sha1 and md5 hashing algorithms..

A checksum is an advanced form of redundancy check, a one-way "digital fingerprint", or more correctly, the output of a one-way cryptographic hash function. Essentially, a checksum is a practically unique signature, created by performing lots (and lots) of one-way manipulations on some data (aka. "the message"), eventually producing the fixed-length string we know as a "hash", aka. "message digest".

Crucially, the steps taken to compute this signature are well-known, and can be re-calculated relatively quickly, anywhere, at any time in the future, producing the exact same hash. Because the hashing algorithm is well-known and 100% pre-determined, any change in the computed hash indicates that the message itself MUST have changed. Due to what's known as the "avalanche effect", even a minute change in the data results in a completely different hash.
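
Both properties are easy to see for yourself. What follows is only a minimal sketch, using Python's standard hashlib module; the helper names md5_hex and differing_bits, and the sample sentences, are made up for this illustration. It hashes the same message twice, then hashes a copy with a single letter changed, and counts how many of the 128 bits came out different.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


original = b"The quick brown fox jumps over the lazy dog"
altered = b"The quick brown fox jumps over the lazy cog"  # one letter changed

first = md5_hex(original)
second = md5_hex(original)  # recomputed from the same bytes
changed = md5_hex(altered)

print(first == second)  # True: same input, same hash, every time
print(first)
print(changed)
print(differing_bits(first, changed), "of 128 bits differ")
```

With a well-behaved hash you would expect roughly half of the bits to flip, which is why the two hex strings look completely unrelated.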

Hashing functions suitable for cryptographic purposes must fulfil two important criteria..
  1. It must be computationally infeasible to derive the original data from the hash, and..
  2. It must be computationally infeasible to create another file with the same hash (aka, "a hash collision")
Note, I don't say "impossible", because any cryptographic hash function is susceptible to brute-force attack; the feasibility depends on how much computational time would be required to successfully "break" a given hash and find collisions. For MD5, this time has decreased dramatically over the last few years, though for SHA1, at the time of writing, it can still be measured in the realms of "all the computers on the planet working together for X number of years", which is clearly beyond the resources of most people.
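
To get a feel for how "feasibility" scales with hash length, here is a toy sketch in Python. It is NOT an attack on real MD5: it deliberately throws away most of each digest and keeps only the first few bits, which is purely an assumption made for demonstration. The function names truncated_md5 and find_collision are hypothetical, too. Brute force finds collisions almost instantly at these tiny sizes, and the work grows quickly as bits are added.

```python
import hashlib
from itertools import count


def truncated_md5(data: bytes, bits: int) -> int:
    """Return only the top `bits` bits of the MD5 digest, as an integer."""
    full = int.from_bytes(hashlib.md5(data).digest(), "big")
    return full >> (128 - bits)


def find_collision(bits: int):
    """Hash the messages b"0", b"1", b"2", ... until two share a truncated hash.

    Returns (first_message, second_message, messages_tried).
    """
    seen = {}
    for n in count():
        message = str(n).encode()
        key = truncated_md5(message, bits)
        if key in seen:
            return seen[key], message, n + 1
        seen[key] = message


for bits in (8, 16, 24):
    first, second, tries = find_collision(bits)
    print(f"{bits:2d} bits: {first!r} and {second!r} collide after {tries} messages")
```

For a generic brute-force collision search like this one, the expected work is on the order of 2^(bits/2) messages, so every two extra bits roughly doubles the effort. At the full 128 or 160 bits that number becomes astronomically large, which is why practical attacks look for weaknesses in the algorithm instead.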

However, efforts are underway to do exactly this, mainly utilizing large distributed computing networks such as BOINC.

For file verification purposes (checksum's main use) the MD5 algorithm is perfect, mainly because of its good speed. However, if there is the potential for intentional file tampering, SHA1 is the preferred algorithm because, as yet, there is no known way to compute a useful hash collision in a practical time-frame. Before that becomes a reality, checksum will likely have other algorithms available. For now, these two work great..
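
As a rough idea of what file verification involves, here is a minimal Python sketch using the standard hashlib module. It is not how checksum itself is implemented; the function names file_hash and verify_file, the 64 KB chunk size, and the file name in the usage note are all assumptions for illustration. The file is read in chunks, so even very large files can be hashed without loading them into memory.

```python
import hashlib


def file_hash(path: str, algorithm: str = "md5", chunk_size: int = 65536) -> str:
    """Hash a file in fixed-size chunks and return the hex digest."""
    hasher = hashlib.new(algorithm)  # e.g. "md5" or "sha1"
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_file(path: str, expected_hex: str, algorithm: str = "md5") -> bool:
    """Return True if the file's current hash matches the recorded one."""
    return file_hash(path, algorithm) == expected_hex.strip().lower()
```

Usage would look something like verify_file("some-download.iso", recorded_hash, "sha1"), where recorded_hash is the value published alongside the file. A result of False means the file is not byte-for-byte what was originally hashed.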

MD5

MD5 (aka. "Message-Digest Algorithm 5"), one of the most commonly used cryptographic hash functions, was invented in 1991 by Ronald Rivest at MIT (before this, he had developed MD4). An MD5 hash is a 128-bit value, typically represented as a 32-character hexadecimal number, e.g. d24c7f0e7bc6d4cb9dacb0ff5027cc98
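
The relationship between those 128 bits and the 32 hex characters can be checked with a few lines of Python (a small sketch using the standard hashlib module; the input "abc" is simply one of the test strings from the MD5 specification, RFC 1321). Each hexadecimal character encodes 4 bits, so 16 raw bytes become 32 characters.

```python
import hashlib

digest = hashlib.md5(b"abc")

raw = digest.digest()       # the hash as raw bytes
text = digest.hexdigest()   # the same value written out as hexadecimal text

print(len(raw) * 8)  # 128 (bits)
print(len(text))     # 32 (hex characters, 4 bits each)
print(text)          # 900150983cd24fb0d6963f7d28e17f72
```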

In 1996, a weakness was found in MD5's compression function, and in 2004 researchers demonstrated practical collisions for the full algorithm. By no means does this make MD5 useless; rather, it is no longer recommended for situations where security is the prime concern.

SHA1

The SHA1 (aka. "Secure Hash Algorithm 1", specified in FIPS PUB 180-1) cryptographic hash function was created by the NSA, and first published by NIST in 1995. SHA1 computes a message digest that is 160 bits long, represented as a 40-character hexadecimal number, e.g. 77e0c5a57709fbaa65e21cb7aa22184a99536df5.
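
Because the two algorithms produce hashes of different lengths, a tool can make a reasonable guess at which one produced a given hash string. The sketch below is only that, a guess: the function name guess_algorithm and the lookup table are hypothetical, and length alone proves nothing, since other algorithms share these sizes (MD4 is also 128 bits, RIPEMD-160 is also 160 bits).

```python
import string

# Hex-string lengths for the two algorithms discussed here (4 bits per character).
HEX_LENGTHS = {32: "md5", 40: "sha1"}


def guess_algorithm(hash_text: str):
    """Guess "md5" or "sha1" from a hex string's length, or return None."""
    candidate = hash_text.strip()
    if not candidate or any(ch not in string.hexdigits for ch in candidate):
        return None  # empty, or contains something that is not a hex digit
    return HEX_LENGTHS.get(len(candidate))


print(guess_algorithm("d24c7f0e7bc6d4cb9dacb0ff5027cc98"))          # md5
print(guess_algorithm("77e0c5a57709fbaa65e21cb7aa22184a99536df5"))  # sha1
print(guess_algorithm("not a hash"))                                # None
```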