More information about md5deep can be found at http://md5deep.sourceforge.net
What are checksums?
A checksum, or “hash”, is a string of characters computed from a file's contents. The hash stays the same until the file changes, which makes it useful in long-term preservation as a way to detect the degradation of files. This tutorial details how to generate and use checksums with md5deep.
How do I install md5deep?
The installation procedure varies by operating system. Consult the md5deep manual for your particular OS.
Which OS is best?
Md5deep is installed. Now what?
The checksum process works through Unix commands in Terminal (a program that comes pre-installed on the Mac OS X and Linux operating systems). To generate a checksum for a file, first navigate to its directory by typing cd [directory name]
Once in the correct directory, simply typing “md5deep” and a filename will generate a hash. An asterisk after md5deep will designate all files in that folder. In this example, the hash is the string of characters starting with ff229...
The real power of md5deep comes with adding letters, or “flags”, to the command line. These flags perform different operations such as matching, recursively generating hashes, and estimating the time needed to generate large sets of hashes. For the DAMS's purposes, only a few flags (-r, -x, and -m) are used frequently.
Writing checksums to a file
Before discussing the flags, it’s important to know how to write a command’s output to a file. Users can either designate an existing file or let Terminal create the file for them.
There are three files on my Desktop I would like to generate hashes for:
By typing "md5deep *" I generated these three hashes.
To write the hashes to a file, simply type md5deep * >> [filename]. The >> operator creates the file if it does not exist and adds to the end of it if it does.
This command does not display any output in Terminal unless there are errors. The file should appear in your folder like so:
The -r flag allows md5deep to hash the contents of sub-directories, including any directories nested within them. For example, on my desktop there's a directory called “checksum_test.” Within that directory there are four more sub-directories.
Simply typing md5deep checksum_test will not return any hashes. Instead, it will say:
In order to give md5deep the permission to go into a folder (and that folder’s folders), the recursive flag needs to be added:
Md5deep also allows users to generate hashes and compare them to a pre-established list of hashes. There are two types of matching: positive and negative. Positive matching shows filenames with hashes that DO match, and negative matching shows filenames with hashes that DO NOT match. This is where you can see if the hash (and, most importantly, the file) has changed.
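The syntax for positive matching is: md5deep -m [file of known hashes] [files to check]
The syntax for negative matching is: md5deep -x [file of known hashes] [files to check]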
Remember that an asterisk signifies all files in that directory.
So let's say someone added something to the “test.txt” file after I wrote its hash to the “known_hashes.csv” file. When running a positive match, “test.txt” will not appear in the list of matches.
Conversely, if a negative match is run, “test.txt” will be the only file to appear. Notice the tilde. This symbol denotes the old version of the file.
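The scenario above can be reproduced with the following sketch, assuming md5deep is installed. The names test.txt and known_hashes.csv come from the example; notes.txt, the scratch folders, and the output files matched.txt and changed.txt are assumptions made for illustration. The list of known hashes is kept in a separate folder so that it is not hashed along with the files being checked.

```shell
# Two scratch folders: one for the files, one for the lists (hypothetical).
workdir="$(mktemp -d)"
listdir="$(mktemp -d)"
cd "$workdir"
printf 'first file\n'  > notes.txt
printf 'second file\n' > test.txt

# Record the known-good hashes.
md5deep * > "$listdir/known_hashes.csv"

# Simulate someone adding to test.txt afterwards.
printf 'an unexpected edit\n' >> test.txt

# Positive matching (-m): lists files whose hashes are still in the list.
# md5deep may exit with a non-zero status when something fails to match,
# so "|| true" keeps a stricter script from stopping here.
md5deep -m "$listdir/known_hashes.csv" * > "$listdir/matched.txt" || true

# Negative matching (-x): lists files whose hashes are not in the list.
md5deep -x "$listdir/known_hashes.csv" * > "$listdir/changed.txt" || true

echo "Unchanged:"; cat "$listdir/matched.txt"
echo "Changed:";   cat "$listdir/changed.txt"
```

In this sketch notes.txt should be listed as unchanged and test.txt as changed.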
If you wish to show the hashes next to the file names, simply capitalize the flag: -M for positive matching, -X for negative matching.
Matching can also be done recursively. Just be sure to put the recursive flag first.