Hashbert – file verification by maintaining a file of hash-codes
Recalculating all the hash codes of your files may take a really long time.
Hashbert lets you skip files whose last-modified time and size haven't changed.
Assume you have a hierarchy of files, such as a backup hard drive.
Go to the root level of that hierarchy and run:
hashbert sync
A file called hashcodes.txt will be created that looks something like this:
d404401c8c6495b206fc35c95e55a6d5 1474896174 3 a.txt
bfcc9da4f2e1d313c63cd0a4ee7604e9 1474896180 3 dir1/b.txt
Each row represents a file in your file hierarchy. The columns are: hashcode modTime filesize filename
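If you want to read hashcodes.txt from your own scripts, the row format above can be parsed along these lines. This is only a sketch: it assumes the columns are separated by single spaces and that the filename is everything after the third space. The names HashRow and parse_row are made up for this example and are not part of hashbert.

```python
from typing import NamedTuple


class HashRow(NamedTuple):
    """One row of hashcodes.txt (hypothetical helper type, not part of hashbert)."""
    hashcode: str
    mod_time: int
    size: int
    filename: str


def parse_row(line: str) -> HashRow:
    # Split on the first three spaces only, so that a filename
    # containing spaces stays in one piece (assumed layout).
    hashcode, mod_time, size, filename = line.rstrip("\n").split(" ", 3)
    return HashRow(hashcode, int(mod_time), int(size), filename)


row = parse_row("bfcc9da4f2e1d313c63cd0a4ee7604e9 1474896180 3 dir1/b.txt")
```

Here row.filename is "dir1/b.txt" and row.size is the integer 3.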
As files are added, changed, or deleted, run hashbert sync again
and hashcodes.txt will be updated. This time it will go much faster, since all files whose modTime and size haven't changed are skipped.
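The skipping idea can be illustrated with a short sketch. It is not hashbert's actual code (hashbert is written in C++). It assumes the hash is MD5, which is only a guess based on the 32 hex digits in the example rows, and the helper names file_hash and sync_entry are invented for this example.

```python
import hashlib
import os


def file_hash(path):
    """Hash a file in chunks. MD5 is an assumption, not confirmed by the README."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def sync_entry(path, old):
    """Return (entry, rehashed) for one file.

    old is the (hashcode, mod_time, size) tuple from the previous run,
    or None if the file was not listed before.
    """
    st = os.stat(path)
    mod_time, size = int(st.st_mtime), st.st_size
    if old is not None and old[1] == mod_time and old[2] == size:
        # modTime and size are unchanged: reuse the old hash, skip the read.
        return old, False
    # New or changed file: the expensive hash has to be recalculated.
    return (file_hash(path), mod_time, size), True
```

The trade-off of this approach is that a file whose content changes while its modTime and size stay the same is not noticed by the sync step; only a full recalculation would catch that.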
Later, when you want to check the files, run:
hashbert check
The hash codes are then recalculated to see whether any of the files have become corrupt.
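As a rough illustration of what such a check involves, here is a minimal sketch that recomputes the hash of each listed file and collects the ones that differ. It again assumes MD5, and it treats a missing or unreadable file as a mismatch, which is a choice made for this example and may not match what hashbert reports. The names md5_of and check_rows are hypothetical.

```python
import hashlib
import os


def md5_of(path):
    """Hash a file in chunks (MD5 is assumed, not confirmed by the README)."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def check_rows(rows, root="."):
    """rows: iterable of (hashcode, filename) pairs taken from hashcodes.txt.

    Returns the filenames whose recomputed hash differs from the recorded one.
    """
    bad = []
    for hashcode, filename in rows:
        try:
            actual = md5_of(os.path.join(root, filename))
        except OSError:
            actual = None  # missing/unreadable: counted as a mismatch in this sketch
        if actual != hashcode:
            bad.append(filename)
    return bad
```

Unlike the sync step, this reads every file in full, so it takes about as long as the very first run.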
How to obtain:
(Note: only tested on Linux.)
- Download the files.
- Compile with g++ -o hashbert hashbert.cc -lcrypto -std=gnu++17 -O3
- Note: the -lcrypto option requires an extra library to be installed:
- On Debian 8.3 I needed to run sudo apt-get install libssl-dev
- On Manjaro 17 I needed to install the boost-package.
- Make sure the hashbert executable can be found, i.e. that it is in a folder on your execution path (on Debian/Manjaro I copy the file with the command: sudo cp hashbert /usr/local/bin/)
usage: hashbert <command> [<args>]
sync: Synchronize (or create) hashcodes.txt.
check: Goes through hashcodes.txt row by row, recalculates each hashcode, and reports any mismatch.
Display link to this page
Specify HASHCODEFILE (default hashcodes.txt)
directory (containing the files that will end up in hashcodes.txt). (default: '.') If you use relative search paths (not starting with "/"), you must make sure you are in the right directory when you run the program. (If you are uncertain, open hashcodes.txt and see where the filenames start.)
args (available for "check" only):
Start on row N.
If the "sync" process is interrupted, there will be a file with the suffix ".new.tmp" (hashcodes.txt.new.tmp) containing the hashcodes calculated up to the point of interruption. You can rename that file to hashcodes.txt and restart the process, and it will continue where it left off.
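The rename can of course be done by hand. If you want to script it, a minimal sketch could look like the following. The function name promote_partial is made up for this example. Note that the rename overwrites any existing hashcodes.txt in the same directory.

```python
import os


def promote_partial(directory=".", name="hashcodes.txt"):
    """Rename <name>.new.tmp to <name> if the partial file exists.

    Returns True if a rename happened, False if there was nothing to do.
    """
    tmp = os.path.join(directory, name + ".new.tmp")
    if not os.path.exists(tmp):
        return False
    # os.replace overwrites the destination if it already exists.
    os.replace(tmp, os.path.join(directory, name))
    return True
```

After calling promote_partial(), run hashbert sync again to continue.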