Four secret scanners compared on four codebases

Trestle, Gitleaks, TruffleHog, and detect-secrets were run with default settings over four public codebases. Every scanner catches an obvious API key. The differences show up with the secrets that don't look like secrets: hashed passwords, weak passwords, credit card numbers, and values that get sent to the browser.


What every scanner finds

Most secret scanners find the same easy things. Point any of the four tools in this comparison at a repository and they will report the private keys, the cloud and API keys, and the session tokens. All four did that on every codebase we tested.

The differences appear with the secrets that don't look like secrets. A hashed password is just a long string of letters and numbers. A database password can be an ordinary word with nothing random about it. A credit card number, a crypto wallet recovery phrase, a configuration value that ends up bundled into a web page: none of these match the format of a known token, and none of them stand out as random text. Whether a scanner reports them depends on how it works.

Three of these tools, Gitleaks, TruffleHog, and detect-secrets, scan the raw text. They match it against patterns for known secret formats, and some also flag strings that look random enough to be a key. The fourth, Trestle, reads the structure of each file the way the program that loads the file would. It sees each value together with the field it belongs to, so it can recognize a password by that field's name, even when the value looks like an ordinary word. This comparison measures what that difference is worth, using each tool's default settings on public code anyone can download and check.
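To make the two approaches concrete, here is a minimal sketch in Python. It is not how any of the four tools is implemented. The regular expression, the list of field names, and the sample config are all assumptions chosen for illustration. The text approach only sees the token whose format it knows, while the structure approach parses the file and judges each value by the field it sits in.

```python
import json
import re

# Assumed pattern for one well-known format: AWS access key IDs start with
# "AKIA" followed by 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

# Hypothetical list of field-name fragments that suggest a credential.
SENSITIVE_FIELDS = ("password", "passwd", "secret", "token")


def scan_text(raw: str) -> list[str]:
    """Text approach: report substrings that match a known token format."""
    return AWS_KEY_ID.findall(raw)


def scan_structure(raw: str) -> list[tuple[str, str]]:
    """Structure approach: parse the file, then judge values by field name."""
    findings = []

    def walk(node, path=""):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{path}.{key}" if path else key)
        elif isinstance(node, list):
            for i, item in enumerate(node):
                walk(item, f"{path}[{i}]")
        elif isinstance(node, str) and node:
            # Only the last path segment is the field name.
            field = path.rsplit(".", 1)[-1].lower()
            if any(name in field for name in SENSITIVE_FIELDS):
                findings.append((path, node))

    walk(json.loads(raw))
    return findings


# Made-up config: an ordinary-word password and AWS's documented example key.
config = '{"db": {"user": "app", "password": "sunshine"}, "aws_key": "AKIAIOSFODNN7EXAMPLE"}'
print(scan_text(config))       # only the AWS-style key matches the pattern
print(scan_structure(config))  # only the password field is found, by its name
```

In this toy case each approach misses what the other finds, which is why a real tool would combine format patterns with whatever else it knows about the file.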

How the test was run

The four codebases are public and were chosen to be different from one another. Two are insecure demo projects, one is a set of credentials committed on purpose, and one is a clean software release that should contain no secrets at all. Together they test two separate things: how much real material a scanner finds, and how much unrelated noise it produces when there is little or nothing to find.

Each tool ran with its default settings against the full set of files: no exclusions, no allow-lists, and no saved baseline. That is what a developer gets the first time they run it. The finding counts come straight from each tool's own output, and whether a finding is real, low-value, or a false alarm was decided by checking it against the known contents of the repository. Where to get each repository, the commands, and the exact output of every run are gathered at the end of the article.

What each tool catches

The first three rows are the common ground: every scanner finds private keys, cloud and API keys, and session tokens. The rows below are where the tools differ. Each one is a type of secret that turned up in at least one codebase and that only some tools reported. The pattern is steady. Once a secret stops matching a recognizable format, the text-scanning tools tend to miss it.

| Type of secret | Trestle | Gitleaks | TruffleHog | detect-secrets |
| --- | --- | --- | --- | --- |
| How it works | reads code structure and values | matches text patterns | matches patterns, then checks if live | matches patterns and randomness |
| Private keys | yes | yes | yes | yes |
| Cloud and API keys (Amazon, Google, Slack, Stripe, and so on) | yes | yes | yes | yes |
| Session tokens | yes | yes | yes | yes |
| Hashed passwords | yes | no | no | no |
| Plain passwords (ordinary words, nothing random to catch) | yes | some | no | some |
| Credit card numbers and personal data | yes | no | no | no |
| Crypto wallet recovery phrase | yes | no | no | no |
| Secrets sent to the browser (a config value bundled into the web page) | yes | no | no | no |
| Checks whether a key is still active | yes | no | yes | no |

Below the first three rows, Trestle was the only tool that reported hashed passwords, card and personal data, wallet recovery phrases, and secrets that get sent to the browser. It was also the only one that reported plain-word passwords consistently; Gitleaks and detect-secrets caught some of them. Checking whether a key is still active is the one extra capability it shares with TruffleHog. The sections below show where each of these findings came from.

Results, one codebase at a time

In each chart, the length of the bar is the number of findings. The color shows what those findings turned out to be once checked. A long bar is not automatically good: on a clean codebase it just means more false alarms to sort through. Where detect-secrets produced far more findings than the others, its bar is cut off and faded, with the real number shown beside it.

leaky-repo

planted credentials

A test of how much of a known set each tool finds. Every tool reached the obvious keys and tokens. The difference was in the hashed and weak passwords stored in the same files.

Findings per tool:
Trestle: 71 (all real)
detect-secrets: 46 (39 distinct)
Gitleaks: 22
TruffleHog: 11 (1 false)

All four tools found the keys and cloud credentials. Only Trestle also reported the hashed passwords and the weak passwords stored alongside them. The counts are also affected by how each tool reports: Trestle lists related warnings separately, and detect-secrets reported several lines twice under different rules, giving 46 findings but 39 distinct ones.
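The gap between 46 findings and 39 distinct ones comes down to how you count. The sketch below shows one way to do it, assuming the report has the shape that `detect-secrets scan` prints: a `results` map from file name to a list of findings, each with a `type` and a `line_number`. The sample entries are made up for illustration and are not taken from the leaky-repo run.

```python
import json

# Made-up sample in the assumed report shape; the entries are illustrative only.
sample = json.loads("""
{
  "results": {
    ".env": [
      {"type": "Secret Keyword", "line_number": 3},
      {"type": "Base64 High Entropy String", "line_number": 3},
      {"type": "Secret Keyword", "line_number": 7}
    ]
  }
}
""")


def count_findings(report: dict) -> tuple[int, int]:
    """Return (total findings, distinct file-and-line locations)."""
    total = 0
    locations = set()
    for filename, findings in report["results"].items():
        for finding in findings:
            total += 1
            # Two rules firing on the same line collapse into one location.
            locations.add((filename, finding["line_number"]))
    return total, len(locations)


total, distinct = count_findings(sample)
print(total, distinct)  # 3 2
```

Counting by file and line is a choice. Counting by the secret value itself would give a different number again if the same credential appears in more than one place.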

spectral-goat

insecure demo project

This project includes data-science notebooks, and those notebooks are full of long random-looking text. It is the clearest single look at how much noise a randomness check produces. Bars are scaled to the range of the other three tools; detect-secrets runs well past it.

Findings per tool:
detect-secrets: 7,700 (~850x)
Trestle: 9 (all real)
Gitleaks: 9 (1 false)
TruffleHog: 6 (2 false)

Trestle's nine findings were all real and included a hashed password and a database password that the other tools missed. One of Gitleaks's nine was a placeholder Slack token whose value was clearly a dummy, which Trestle left out. TruffleHog reported two false alarms from one of the notebooks. detect-secrets returned about 7,700 findings, nearly all of them long random-looking text from the notebooks, which buries the handful that matter.

WordPress 7.0

clean release

This release contains no real secrets, so anything reported is a false alarm. Bars are scaled to the range of the other three tools; detect-secrets runs well past it.

Findings per tool:
detect-secrets: 180 (all false)
Gitleaks: 2 (all false)
TruffleHog: 2 (all false)
Trestle: 0 (nothing found)

Only Trestle reported nothing, which is the correct result. Gitleaks raised two false alarms (a placeholder value and a theme name), TruffleHog two (an example web address from the documentation), and detect-secrets 180.

OWASP Juice Shop

real app, all files included

With no files excluded, the raw counts here are mostly test files and older copies kept in the project's history, and that is true for every tool. The total is therefore misleading. What matters is which kinds of secret each tool reported. Bars are scaled to the range of the other three tools; detect-secrets runs well past it.

Findings per tool:
detect-secrets: 2,114 (~19x)
Trestle: 110 (~25 real)
Gitleaks: 59 (~5 in src)
TruffleHog: 10 (1 in src)

Setting aside the test files and history that pad all four counts, the difference is in the kinds of secret found. Trestle's findings included a private key, credit card numbers, login and seed credentials, a wallet recovery phrase, and a setting that gets sent to the browser. TruffleHog found one real key in the source and missed the rest. The card numbers, the recovery phrase, and the browser exposure appeared only in Trestle's results. The 2,114 findings from detect-secrets were mostly long random-looking strings from the compiled front-end files, not test files.
Chart legend: real or high-value; test file, history, or low-value; false alarm or random-string noise.

False alarms and speed

Finding secrets is only half of the job. The other half is how much unrelated material you read through to act on the real findings. The clean WordPress release is the clearest measure of this. With no secrets to find, every result is a false alarm, and the range ran from zero for Trestle to 180 for detect-secrets. The noise comes from one practice: flagging anything that looks random. That accounts for almost all of detect-secrets' 7,700 findings on the demo project and 2,114 on Juice Shop, because the long encoded strings in notebooks and compiled files look random whether or not they hold a secret.
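A randomness check can be sketched in a few lines, which also shows why it misfires in both directions. The example below is a simplified illustration, not the check any of these tools actually runs. The threshold of 4.0 bits per character is a hypothetical cutoff, and both sample values are made up. It measures Shannon entropy per character for an ordinary-word password and for a base64 blob that holds no secret at all.

```python
import base64
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Average bits of information per character in text."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical cutoff; real tools choose their own limits.
THRESHOLD = 4.0

# A real secret with nothing random about it.
weak_password = "sunshine"
# Not a secret: bytes 0..255, base64-encoded the way embedded data often is.
embedded_blob = base64.b64encode(bytes(range(256))).decode()

for label, value in [("weak password", weak_password), ("embedded blob", embedded_blob)]:
    h = shannon_entropy(value)
    print(f"{label}: {h:.2f} bits/char, flagged={h > THRESHOLD}")
```

The password scores low and is not flagged, while the harmless blob scores high and is. Files full of encoded data multiply the second case.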

On speed, Trestle was the fastest on three of the four codebases and nearly tied on the fourth, the 66 KB planted-credentials set, where Gitleaks was faster by about 0.02 seconds. The gap grows with size. On the WordPress release, Trestle finished in about 0.7 seconds against roughly 13 seconds for Gitleaks. detect-secrets was the slowest on three of the four codebases, taking up to about 51 seconds on Juice Shop; on leaky-repo, TruffleHog was the slowest at about 7 seconds.

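| Scan time, in seconds | Trestle | Gitleaks | TruffleHog | detect-secrets |
| --- | --- | --- | --- | --- |
| leaky-repo | 0.10 | 0.09 | 7.07 | 1.35 |
| spectral-goat | 0.11 | 0.90 | 3.25 | 5.11 |
| WordPress 7.0 | 0.73 | 13.42 | 5.18 | 38.68 |
| Juice Shop | 0.42 | 5.78 | 4.06 | 50.88 |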
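The scan times above are easier to compare as ratios. The short script below copies the numbers from the table and works out, for each codebase, which tool was fastest, which was slowest, and how far apart they were. The function name is made up for this sketch. The times are single measurements, so treat the ratios as rough.

```python
# Scan times in seconds, copied from the table above.
times = {
    "leaky-repo": {"Trestle": 0.10, "Gitleaks": 0.09, "TruffleHog": 7.07, "detect-secrets": 1.35},
    "spectral-goat": {"Trestle": 0.11, "Gitleaks": 0.90, "TruffleHog": 3.25, "detect-secrets": 5.11},
    "WordPress 7.0": {"Trestle": 0.73, "Gitleaks": 13.42, "TruffleHog": 5.18, "detect-secrets": 38.68},
    "Juice Shop": {"Trestle": 0.42, "Gitleaks": 5.78, "TruffleHog": 4.06, "detect-secrets": 50.88},
}


def spread(row: dict[str, float]) -> tuple[str, str, float]:
    """Return (fastest tool, slowest tool, slowest time / fastest time)."""
    fastest = min(row, key=row.get)
    slowest = max(row, key=row.get)
    return fastest, slowest, row[slowest] / row[fastest]


for repo, row in times.items():
    fastest, slowest, ratio = spread(row)
    print(f"{repo}: fastest {fastest}, slowest {slowest}, spread {ratio:.0f}x")
```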

Which tool to use

If the secrets you care about are keys, cloud credentials, and tokens, any of these four tools will find them. Choose based on speed, ease of use, and how the output fits your workflow. On the clean release, Gitleaks and TruffleHog stayed nearly silent, with two false alarms each. detect-secrets is built to be used with a saved baseline that you review and approve once; run without one, its randomness checks produce a lot of noise.

Most codebases hold more than keys. They hold hashed passwords, database passwords that are ordinary words, payment data in test files, recovery phrases, and settings that get sent to the browser. These are the types a text-pattern scanner tends not to recognize and a randomness check tends not to catch. Across the codebases in this test, Trestle was the only tool that reported all of them, while staying silent on the clean release and finishing fastest on the larger sets. That is the practical effect of reading values out of the code instead of scanning text.

The results are worth checking rather than taking on trust. The repositories are public, the commands are four lines, and the points where the tools disagree are the most useful part. Download them and run the test.

How to reproduce this

This section has everything needed to repeat the comparison: where to get each repository, the commands, and a sample of the output of every run. Run the four commands at the top of each repository and compare against the output shown.

# the same four commands, run at the top of each repository
trestle --explain
gitleaks dir -v
trufflehog filesystem .
detect-secrets scan

--explain is a Pro flag, and it is not required to reproduce these results. It adds a plain-language explanation and fix steps to each finding. Plain trestle reports the same findings without it.
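If you want to keep the output of every run for comparison, a small wrapper helps. The script below is one possible sketch, not part of any of the tools. The function name, the output directory, and the file naming are assumptions. It runs the four commands shown above in a repository, skips any tool that is not installed, and saves what each one printed. It does not treat a non-zero exit code as an error, since scanners may use one to signal that they found something.

```python
import shutil
import subprocess
from pathlib import Path

# The four commands from this article, split into argument lists.
COMMANDS = {
    "trestle": ["trestle"],  # plain trestle; --explain is optional
    "gitleaks": ["gitleaks", "dir", "-v"],
    "trufflehog": ["trufflehog", "filesystem", "."],
    "detect-secrets": ["detect-secrets", "scan"],
}


def run_all(repo: Path, out_dir: Path, commands: dict = COMMANDS) -> dict[str, str]:
    """Run each installed scanner at the top of repo and save its output."""
    out_dir.mkdir(parents=True, exist_ok=True)
    status = {}
    for name, argv in commands.items():
        if shutil.which(argv[0]) is None:
            status[name] = "not installed"
            continue
        result = subprocess.run(argv, cwd=repo, capture_output=True, text=True)
        # Keep both streams; some tools log findings to stderr.
        (out_dir / f"{name}.txt").write_text(result.stdout + result.stderr)
        status[name] = f"exit code {result.returncode}"
    return status
```

For example, `run_all(Path("leaky-repo"), Path("scan-output/leaky-repo"))` would leave one text file per tool in `scan-output/leaky-repo`, assuming the repository was cloned into `leaky-repo`.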

The four repositories

Planted credentials

leaky-repo

github.com/Plazmaz/leaky-repo

commit 2e95135

Credentials committed on purpose: keys, tokens, hashed passwords, and config files. A test of how much of a known set each tool finds.

Insecure demo project

spectral-goat

github.com/SpectralOps/spectral-goat

commit c72fdbb

An insecure demo project that also ships data-science notebooks, which are full of long random-looking strings. A good test of how many false alarms a tool raises.

Clean release

WordPress 7.0

github.com/WordPress/WordPress

commit b16cd68

A standard release download, 3,578 files. The secret values a site creates at install time don't exist yet, so there is nothing real to find. The correct result is to report nothing.

Real app with planted secrets

OWASP Juice Shop

github.com/juice-shop/juice-shop

commit f356a09

A realistic web application with known secrets across its source, configuration, and tests. A test of finding real secrets inside the kind of code people ship.

What each command printed

The full output of each tool for each repository.