Reversing hashes from the Pwned Passwords API using the number of breaches

I was recently working on a requirement to log, when customers were registering, the number of breaches their chosen password had appeared in (if that password had been breached at all).
Importantly, we are not logging the breached password itself (nor the hash of the password), just the number of breaches that particular password appeared in (as per the Pwned Passwords data set).

So, to log this, I’m raising a custom Application Insights event using the client-side JavaScript SDK.

Pwned Passwords implements a k-anonymity approach to protect the hashed passwords, so you never need to call the API with the full password hash. In fact, you can’t look up an entire hash by calling the API with it.
Instead, you call the API with only the first 5 characters of the hash.

For example, the SHA1 hash of ‘password1’ is `e38ad214943daad1d64c102faec29de4afe9da3d`
So, to test if ‘password1’ has been breached, we call the Pwned Passwords API, specifying the first 5 characters of that hash as a parameter:

https://api.pwnedpasswords.com/range/e38ad
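
As a minimal sketch of that step: the snippet below assumes Node.js and its built-in `crypto` module for the hashing (in the browser you would swap in a SHA1 library or Web Crypto). The helper names `sha1Hex` and `splitForRangeQuery` are mine, purely for illustration.

```javascript
// Sketch assuming Node.js; sha1Hex and splitForRangeQuery are hypothetical helper names.
const { createHash } = require('crypto');

// SHA1 of a UTF-8 string, as 40 lowercase hex characters.
function sha1Hex(text) {
  return createHash('sha1').update(text, 'utf8').digest('hex');
}

// Split a hash into the 5-character prefix that is sent to the API
// and the 35-character suffix that never leaves the client.
function splitForRangeQuery(hash) {
  return { prefix: hash.substring(0, 5), suffix: hash.substring(5) };
}

const parts = splitForRangeQuery(sha1Hex('password1'));
const rangeUrl = 'https://api.pwnedpasswords.com/range/' + parts.prefix;

console.log(rangeUrl);            // https://api.pwnedpasswords.com/range/e38ad
console.log(parts.suffix.length); // 35
```

Only `parts.prefix` ever goes over the wire; the suffix is kept locally and compared against the returned list.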

That returns some 486 results in total. In other words, 486 breached password hashes start with `e38ad`.
The results are the suffixes of those hashes, each with the number of breaches that hash appeared in.
This way, we can check whether the password hash appears in the breached list without giving away the password, or even the full hashed password: we search the list for the original password hash minus its first five characters.
[Screenshot: Pwned Passwords range results for e38ad]

We can see that the suffix of our hash (everything after `e38ad`) is in that list:
e38ad214943daad1d64c102faec29de4afe9da3d

So far, so good. We can perform this client-side quite easily and send our custom event to Application Insights.

Here’s a documented, cut-down version of the code I was using to do that:


    // this would come from our user-submitted form
    var $input = 'password1';

    // sha1 function out of scope for this example, just know that we SHA1 hash the input :)
    // upper-cased because the API returns its suffixes in upper-case hex
    var inputHash = sha1($input).toUpperCase();

    // PwnedPasswords uses the first 5 chars of the hash to perform a range query
    var hashSub = inputHash.substring(0, 5);

    $.get("https://api.pwnedpasswords.com/range/" + hashSub, function (data) {

        // one entry per line; lines may end in \r\n
        var breachedItems = data.split(/\r?\n/);

        // iterate over the breachedItems returned from pwnedpasswords
        // and look for the entry whose suffix matches the suffix of our hash
        for (var index = 0; index < breachedItems.length; index++) {

            // this is typically in the format of: C2D18A7D49B0D4260769EB03D027066D29A:181 - or <hash suffix>:<number of breaches>
            var breachedItemParts = breachedItems[index].trim().split(":");

            var breachedItemHash = breachedItemParts[0];
            var numberOfBreaches = breachedItemParts[1];

            // compare the breachedItemHash (which is the last 35 characters of the hash)
            // with the last 35 characters of our inputHash
            if (breachedItemHash.toUpperCase() === inputHash.substring(5)) {

                window.appInsights.trackEvent("PwnedPasswordUsed",
                    {
                        message: "This password appears on " + numberOfBreaches + " breached sites.",
                        numberOfBreachedSites: numberOfBreaches
                    });
                break; // no need to keep scanning once we have a match
            }
        }
    });

This worked great.
The problem, however, is that Application Insights, by default, also logs dependencies and other telemetry.

If we take a look at the End-to-end transaction details of our logged custom event, we can see that before our CUSTOMEVENT is logged, we have a dependency: an Ajax call to `api.pwnedpasswords.com/range/e38ad`

[Screenshot: App Insights showing the AJAX call to the Pwned Passwords API]

Our custom event, by design, contains the number of breaches.

In a lot of cases, this breach count was unique within the returned range, so the logged prefix plus the logged count is enough to single out one suffix.
In the case of ‘password1’ (at the time of writing) the count was 2391888.

When we plain-text search the range results for ‘2391888’ (the count recorded in our custom event), we can see the matching line:
[Screenshot: Pwned Passwords range results for e38ad, with the line for count 2391888 highlighted]

Join the prefix to the returned suffix and we know the full hash is:
e38ad214943daad1d64c102faec29de4afe9da3d
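
To make the join concrete, here is a rough sketch of what someone with access to both pieces of telemetry could do. `candidateHashes` is a hypothetical helper, and the sample response is partly invented: the first line uses the suffix and count quoted in this post, while the other two lines are made up to show the ambiguous case.

```javascript
// Hypothetical helper: list every full hash in a range response whose
// breach count equals the count recorded in the custom event.
function candidateHashes(prefix, rangeBody, loggedCount) {
  return rangeBody
    .split(/\r?\n/)
    .map(function (line) { return line.trim().split(':'); })
    .filter(function (parts) {
      return parts.length === 2 && parseInt(parts[1], 10) === loggedCount;
    })
    .map(function (parts) { return (prefix + parts[0]).toLowerCase(); });
}

// First line: suffix and count quoted in this post.
// Remaining lines: made-up suffixes sharing a low count, for illustration only.
const sampleBody = [
  '214943DAAD1D64C102FAEC29DE4AFE9DA3D:2391888',
  '0'.repeat(34) + '1:2',
  '0'.repeat(34) + '2:2'
].join('\r\n');

// A distinctive count pins down exactly one hash.
console.log(candidateHashes('e38ad', sampleBody, 2391888));

// A low count leaves more than one candidate.
console.log(candidateHashes('e38ad', sampleBody, 2).length);
```

The more distinctive the count, the fewer candidates come back, which is exactly why the high-count (most commonly used) passwords are the easiest to pin down.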

Because these are generally ‘weak’ or breached passwords, we can try a SHA1 reverse tool, such as https://isc.sans.edu/tools/reversehash.html, which gives us the correct, original password: password1

The security risk here is incredibly low…

For a start, access to our Application Insights instance is itself protected.
Secondly, not every password can be reversed. Also, the count may be something quite low, like 1 or 2, and hundreds of breached passwords returned in the range query may share a breach count of 1 or 2.

However small the risk, I wanted to eliminate it while still using the client-side JavaScript SDK for Application Insights.
I’ve written a separate post on how to Hide Sensitive Data with Application Insights JavaScript SDK using a Telemetry Initializer.
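
For a flavour of the idea, here is a rough sketch rather than the code from that post. It assumes an SDK version where telemetry initializers receive an envelope with `baseType` and `baseData`, and where dependency calls carry their URL in `name`, `target` or `data`; older SDK versions shape the envelope differently, so treat the field names as assumptions. `redactPwnedPrefix` is a hypothetical name.

```javascript
// Matches the 5-character hash prefix in a Pwned Passwords range URL.
var RANGE_PREFIX = /(api\.pwnedpasswords\.com\/range\/)[0-9a-f]{5}/gi;

// Hypothetical initializer: masks the hash prefix in dependency telemetry.
// Assumes the envelope exposes baseType and baseData (an assumption, see above).
function redactPwnedPrefix(envelope) {
  if (envelope.baseType !== 'RemoteDependencyData' || !envelope.baseData) {
    return true; // not a dependency call, leave it untouched
  }
  ['name', 'target', 'data'].forEach(function (field) {
    var value = envelope.baseData[field];
    if (typeof value === 'string') {
      envelope.baseData[field] = value.replace(RANGE_PREFIX, '$1*****');
    }
  });
  return true; // keep the item; returning false would drop it instead
}

// Registration, assuming the SDK in use exposes addTelemetryInitializer:
// appInsights.addTelemetryInitializer(redactPwnedPrefix);

// Quick check against a hand-built envelope:
var fakeEnvelope = {
  baseType: 'RemoteDependencyData',
  baseData: { name: 'GET https://api.pwnedpasswords.com/range/e38ad' }
};
redactPwnedPrefix(fakeEnvelope);
console.log(fakeEnvelope.baseData.name);
```

With the prefix masked, the logged breach count on its own no longer points at a single range, so there is nothing to join it to.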

Another possible solution…

This would require the Pwned Passwords API to change slightly: instead of returning the whole suffix of the hash, perhaps only the last n characters could be returned.
By omitting 5 characters from the start of the suffix, the collision rate would still be sufficiently low, but the full hash could no longer be rebuilt from the prefix and a returned line, which would make reversing it via a lookup far harder.
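
Purely as a thought experiment, client-side matching against such a response might look like the sketch below. The response format (`<last 30 hex characters>:<count>`) is hypothetical, since the real API does not return this, and `findCountByTail` is an invented name. Dropping 5 hex characters leaves 16^5 = 1,048,576 possible full hashes for every returned line.

```javascript
// Hypothetical: assumes each response line is "<last N hex chars of the hash>:<count>",
// which is NOT what the real Pwned Passwords API returns today.
function findCountByTail(fullHash, truncatedBody, tailLength) {
  var tail = fullHash.slice(-tailLength).toUpperCase();
  var lines = truncatedBody.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var parts = lines[i].trim().split(':');
    if (parts[0].toUpperCase() === tail) {
      return parseInt(parts[1], 10);
    }
  }
  return 0; // our hash is not in the returned range
}

var fullHash = 'e38ad214943daad1d64c102faec29de4afe9da3d';
// Made-up response line: the last 30 characters of the hash above, with the count from this post.
var truncatedBody = '3DAAD1D64C102FAEC29DE4AFE9DA3D:2391888';

console.log(findCountByTail(fullHash, truncatedBody, 30)); // 2391888
```

The client still knows its own full hash, so matching on the tail works as before; anyone holding only the prefix and a returned line is left with a 5-character gap in the middle.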

Just a thought…