A small tool that processes text files and picks out strings that look like cryptographic hashes, GUIDs and the like.
This isn't achieved just through fixed regular expressions for standard hash types. Rather, it uses trigrams to model roughly what English text looks like, and picks out substrings that deviate wildly from that. (As a result, it may produce more false positives on text in other languages.)
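To make the trigram idea concrete, here is a minimal sketch, not the tool's actual implementation. It assumes character trigrams, a tiny sample corpus standing in for a real English model, and a simple "fraction of unfamiliar trigrams" score. The names build_model, unfamiliar_fraction and looks_like_hash, and the threshold parameter, are all hypothetical.

```rust
use std::collections::HashSet;

/// Split a string into lowercase character trigrams (sliding window of 3).
fn trigrams(s: &str) -> Vec<[char; 3]> {
    let chars: Vec<char> = s.to_lowercase().chars().collect();
    chars.windows(3).map(|w| [w[0], w[1], w[2]]).collect()
}

/// Hypothetical model: the set of trigrams seen in some English sample text.
fn build_model(corpus: &str) -> HashSet<[char; 3]> {
    trigrams(corpus).into_iter().collect()
}

/// Fraction of a token's trigrams that never occur in the model.
/// Tokens shorter than three characters have no trigrams and score 0.0.
fn unfamiliar_fraction(token: &str, model: &HashSet<[char; 3]>) -> f64 {
    let grams = trigrams(token);
    if grams.is_empty() {
        return 0.0;
    }
    let unseen = grams.iter().filter(|g| !model.contains(*g)).count();
    unseen as f64 / grams.len() as f64
}

/// Assumed decision rule: flag a token when most of its trigrams are unfamiliar.
fn looks_like_hash(token: &str, model: &HashSet<[char; 3]>, threshold: f64) -> bool {
    unfamiliar_fraction(token, model) > threshold
}
```

With a model built from ordinary prose, a word like "documentation" scores low because its trigrams are common, while something like "9f86d081884c7d65" scores high. A real model would presumably be trained on far more text and use frequencies rather than a plain seen/unseen set.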
By default the tool just highlights these, but if you pass in the --replace argument, it randomizes all the hashes it finds. This can be handy for things like constructing documentation without leaking secrets. By default this just writes the new file to stdout, but --in-place (which implies --replace) does a destructive in-place edit of all the provided files.
The randomization preserves character classes, so for example:
W6B43240-ad76s==62231DH00
might get randomized to
Q1L83073-mn13q==03510AP62
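One way such class-preserving randomization could work is sketched below. This is an illustration under assumptions, not the tool's code: it assumes three classes (ASCII digits, lowercase letters, uppercase letters), leaves every other character untouched, and uses a small hand-rolled xorshift generator (the hypothetical XorShift64) only so the example needs no external crates.

```rust
/// Tiny xorshift64 generator so the sketch has no dependencies.
/// The seed must be non-zero; not suitable for security-sensitive use.
struct XorShift64(u64);

impl XorShift64 {
    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Value in 0..n (the slight modulo bias does not matter for a sketch).
    fn below(&mut self, n: u64) -> u64 {
        self.next() % n
    }
}

/// Replace each character with a random one from the same assumed class:
/// digit -> digit, lowercase -> lowercase, uppercase -> uppercase.
/// Punctuation such as '-' and '=' is copied through unchanged.
fn randomize_preserving_classes(input: &str, rng: &mut XorShift64) -> String {
    input
        .chars()
        .map(|c| match c {
            '0'..='9' => (b'0' + rng.below(10) as u8) as char,
            'a'..='z' => (b'a' + rng.below(26) as u8) as char,
            'A'..='Z' => (b'A' + rng.below(26) as u8) as char,
            other => other,
        })
        .collect()
}
```

Calling randomize_preserving_classes("W6B43240-ad76s==62231DH00", &mut XorShift64(0x9E3779B97F4A7C15)) yields a string of the same length and shape, with the '-' and '==' in the same positions. The exact output depends on the generator and seed.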
Debian packages, and standalone binaries for common platforms, are available on the release page.
If you use Nix then just use this repo's flake.nix or import its default.nix; in addition, if you have Cachix then cachix use simonchatts gives you access to the binaries pre-built by GitHub (x86_64 Linux/macOS).
Otherwise, build from source with cargo build --release.
Colour highlighting is done if (and only if) stdout is a terminal.
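A terminal check of this kind can be done with the standard library alone. The sketch below is an assumption about how it might look rather than the tool's actual code: the function names and the choice of red as the highlight colour are hypothetical, and it relies on std::io::IsTerminal (Rust 1.70 or later).

```rust
use std::io::{self, IsTerminal};

/// True when stdout is attached to a terminal rather than a pipe or file.
fn stdout_wants_colour() -> bool {
    io::stdout().is_terminal()
}

/// Wrap text in ANSI escape codes (red here, as an arbitrary choice)
/// only when colour is wanted; otherwise return it unchanged.
fn paint(text: &str, use_colour: bool) -> String {
    if use_colour {
        format!("\x1b[31m{text}\x1b[0m")
    } else {
        text.to_string()
    }
}
```

A caller would write something like paint(hash, stdout_wants_colour()), so that output piped to a file or another program stays free of escape sequences.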
The --debug flag might be handy if looking at the implementation: it highlights in green those substrings that pass the basic pre-filter, but that aren't categorized as hashes by the actual trigram algorithm.