Hashbert – file verification by maintaining a file of hash-codes

Problem: Recalculating all the hash-codes of your files may take a really long time.

Solution: Hashbert allows you to skip files whose last-mod-time and size haven't changed.

hashbert sync

Assume you have a hierarchy of files, such as a backup hard drive.

Go to the root level of that hierarchy and run:

hashbert sync

A file called hashcodes.txt will be created that looks something like this:

d404401c8c6495b206fc35c95e55a6d5 1474896174 3 a.txt
bfcc9da4f2e1d313c63cd0a4ee7604e9 1474896180 3 dir1/b.txt

Each row represents one file in your file hierarchy. The columns are: hashcode modTime filesize filename
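To make the row layout concrete, here is a minimal Python sketch that splits one row into its four columns. It is not taken from Hashbert's source: the names `Row` and `parse_row` are hypothetical, and splitting on only the first three spaces (so that filenames containing spaces stay intact) is an assumption about the format.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One line of hashcodes.txt (hypothetical helper type)."""
    hashcode: str
    mod_time: int
    size: int
    filename: str


def parse_row(line: str) -> Row:
    # Assumption: columns are separated by single spaces and the filename
    # is the remainder of the line, so we split at most three times.
    hashcode, mod_time, size, filename = line.rstrip("\n").split(" ", 3)
    return Row(hashcode, int(mod_time), int(size), filename)
```

For example, `parse_row("bfcc9da4f2e1d313c63cd0a4ee7604e9 1474896180 3 dir1/b.txt")` yields a `Row` whose `filename` is `dir1/b.txt` and whose `size` is `3`.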

As files are added, changed, or deleted, run hashbert sync again and hashcodes.txt will be updated. This time it will go much faster, since all files whose modTime and size haven't changed are skipped.
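The skip rule can be sketched in a few lines of Python. This is only an illustration of the idea, not Hashbert's actual code: the function name `needs_rehash` is hypothetical, and comparing the modification time in whole seconds is an assumption based on the sample rows.

```python
import os


def needs_rehash(path: str, recorded_mod_time: int, recorded_size: int) -> bool:
    """Return True if the file's modTime or size differs from the recorded row."""
    st = os.stat(path)
    # Assumption: modTime is stored as whole seconds since the epoch,
    # so the fractional part of st_mtime is dropped before comparing.
    return int(st.st_mtime) != recorded_mod_time or st.st_size != recorded_size
```

A file for which this returns False would keep its existing row; only the remaining files need their hashcode recalculated.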

hashbert check

When you later want to verify the files, run:

hashbert check

The hashcodes are then recalculated and compared against hashcodes.txt to see whether any of the files have become corrupt.
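As a rough picture of what such a check involves, the following Python sketch walks a hashcodes file row by row and collects the filenames whose recalculated hashcode differs. It rests on several assumptions that the text above does not confirm: that the hashcode is MD5 (suggested only by the 32 hex digits in the sample rows), that a missing file counts as a mismatch, and that the row format is space-separated. The names `file_hash` and `find_mismatches` are hypothetical.

```python
import hashlib
import os


def file_hash(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory."""
    digest = hashlib.md5()  # assumption: Hashbert's hashcode is MD5
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(hashcode_file, root="."):
    """Return the filenames whose current hash differs from the recorded one."""
    mismatches = []
    with open(hashcode_file, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue
            recorded, _mod_time, _size, filename = line.split(" ", 3)
            path = os.path.join(root, filename)
            # Assumption: a file that no longer exists is reported too.
            if not os.path.isfile(path) or file_hash(path) != recorded:
                mismatches.append(filename)
    return mismatches
```

Unlike the sync step, this deliberately ignores modTime and size: a corrupted file may keep both unchanged, so every file is read in full.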

How to obtain:

(Note: only tested on Linux)


usage: hashbert <command> [<args>]


sync: Synchronize (or create) hashcodes.txt.

check: Goes through hashcodes.txt row by row, recalculates each hashcode, and reports any mismatch.


--help Display link to this page

-f HASHCODEFILE Specify HASHCODEFILE (default hashcodes.txt)

-d DIRECTORY Directory containing the files that will end up in hashcodes.txt (default: '.'). If you use relative paths (not starting with "/"), make sure you are in the right directory when you run the program. (If you are uncertain, open hashcodes.txt and see where the filenames start.)

args (available for "check" only):

--start N Start on row N.

Example of how I personally use the program (in combination with rsync)

Other notes

If the "sync" process is interrupted, a file with the suffix ".new.tmp" (hashcodes.txt.new.tmp) is left behind, containing the hashcodes calculated up to the point of interruption. You can rename that file to hashcodes.txt and restart the process, and it will continue where it left off.
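The rename step can be done by hand, or scripted as in this small Python sketch. The function name `promote_partial` is hypothetical; note that `os.replace` overwrites any existing hashcodes.txt, which is the intent here but worth knowing before you run it.

```python
import os


def promote_partial(hashcode_file="hashcodes.txt"):
    """Rename the leftover .new.tmp file to the hashcode file, if present."""
    partial = hashcode_file + ".new.tmp"
    if not os.path.exists(partial):
        return False  # nothing to recover
    os.replace(partial, hashcode_file)  # overwrites an existing hashcode file
    return True
```

After this returns True, running hashbert sync again picks up from the recovered rows.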

Issues and bugs

Source code