Pragmatism in the real world

Migrating to password_verify

I’ve recently been updating a website that was written a long time ago that has not been touched in a meaningful way in many years. In addition to the actual work I was asked to do, I took the opportunity to update the password hashing routines.

This site is so old that the passwords are stored using MD5 hashes and that’s not really good enough today, so I included updating to bcrypt hashing with password_hash() and password_verify() in my statement of work.

I’ve done this process before, but don’t seem to have documented it, so I thought I’d write up the steps I took in case it helps anyone else.

Updating existing passwords in the database

The first thing I did was hash all the passwords in the database to bcrypt with password_hash. As the current passwords are stored in hashed form, we don’t have the original plain-text passwords, so we end up with bcrypt hashes containing the MD5 hashes. This is okay as we can handle this in the login process.

This update is a one-off PHP script:

// Fetch every user's existing (MD5) password hash
$sql = 'SELECT id, password FROM user';
$rs = $database->Execute($sql);
$rows = $rs->GetArray();

// Wrap each MD5 hash in a password_hash() hash and store it
$sql = 'UPDATE user SET password = ? WHERE id = ?';
foreach ($rows as $row) {
    $database->Execute($sql, [
        password_hash($row['password'], PASSWORD_DEFAULT),
        $row['id'],
    ]);
}
echo "Passwords updated\n";

This website uses ADOdb so I just continued using it. The principles apply regardless of whether you’re using PDO or any other database abstraction library.
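For comparison, here is a minimal sketch of the same one-off script using PDO. The in-memory SQLite database, the sample row and the exact table layout are assumptions made so that the example runs on its own; they are not taken from the real site.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE user (id INTEGER PRIMARY KEY, password VARCHAR(255))');

// Hypothetical legacy row: the stored value is the MD5 of "secret".
$insert = $pdo->prepare('INSERT INTO user (id, password) VALUES (?, ?)');
$insert->execute([1, md5('secret')]);

// Read every row first, then wrap each stored MD5 hash with password_hash().
$rows = $pdo->query('SELECT id, password FROM user')->fetchAll(PDO::FETCH_ASSOC);
$update = $pdo->prepare('UPDATE user SET password = ? WHERE id = ?');
foreach ($rows as $row) {
    $update->execute([
        password_hash($row['password'], PASSWORD_DEFAULT),
        $row['id'],
    ]);
}
echo "Passwords updated\n";
```

Preparing the UPDATE statement once and executing it per row is the usual PDO idiom; the logic is otherwise the same as the ADOdb version.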

I also had to update the database schema and change the password column from varchar(32) to varchar(255). A length of 255 characters is recommended by the PHP manual page for password_hash() as it leaves room for the default algorithm to change in the future.
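To see why the old column is too small, this short sketch compares the lengths involved. PASSWORD_BCRYPT is requested explicitly here so that the length is predictable, and the sample password is made up.

```php
<?php
$md5 = md5('secret');
$bcrypt = password_hash('secret', PASSWORD_BCRYPT);

echo strlen($md5), "\n";    // 32: fits the old varchar(32) column
echo strlen($bcrypt), "\n"; // 60: would not fit in varchar(32)
```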

Updating login

The authentication code needs updating to deal with bcrypt passwords. It currently looks like this:

$email = $_POST['email_address'];
$password = $_POST['password'];

$sql = "SELECT * FROM user where email = ? and password = ?";
$rs = $database->Execute($sql, array($email, md5($password)));
if ($rs->RecordCount() == 1) {
    // valid user
    $_SESSION['user'] = $rs->FetchRow();
}

In this code, a single query retrieves the user only if both the email address and the MD5 of the plain text password match the database record. If precisely one record is returned, it is assigned to the session.

To use password_verify(), we need a two step process:

  1. Retrieve the user via email address
  2. Check the retrieved hashed password against the password the user has supplied

Step 1

For the first step, I can retrieve the user by removing the password check from the SQL query:

$sql = "SELECT * FROM user where email = ?";
$rs = $database->Execute($sql, array($email));
if ($rs->RecordCount() == 1) {
    // ...
}

Step 2

I now need to check the password, which I do with password_verify():

if ($rs->RecordCount() == 1) {
    $user = $rs->FetchRow();
    $validPassword = password_verify($password, $user['password']);
    if ($validPassword) {
        // valid user
        $_SESSION['user'] = $user;
    }
}

This works great for all users whose stored password is a direct hash of their plain text password, but none of my existing users can log in! This is because their stored bcrypt hashes were created from the MD5 hash of their plain text password, not from the plain text itself.
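A small sketch makes the failure visible. The password "secret" is an invented example, and $legacyHash imitates what the one-off script stores for an existing user.

```php
<?php
// What the one-off script stored: a password_hash() of the old MD5 hash.
$legacyHash = password_hash(md5('secret'), PASSWORD_DEFAULT);

// What a new or changed password is stored as: a password_hash() of the plain text.
$newHash = password_hash('secret', PASSWORD_DEFAULT);

var_dump(password_verify('secret', $newHash));         // bool(true)
var_dump(password_verify('secret', $legacyHash));      // bool(false)
var_dump(password_verify(md5('secret'), $legacyHash)); // bool(true)
```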

To allow all users to log in, we need to also check for an MD5 hash if the password_verify() fails:

    $validPassword = password_verify($password, $user['password']);
    if (!$validPassword) {
        // check for a legacy password
        $validPassword = password_verify(md5($password), $user['password']);
    }
    if ($validPassword) {
        // valid user
        $_SESSION['user'] = $user;
    }

In this code, we MD5 the password supplied by the user and check again with password_verify against the database record. If it succeeds this time, then the credentials are verified.

Now all our users can successfully log in.

In-place migration

As the login process is the only time when we have the user’s plain text password available to us, this is the ideal time to migrate the user’s password in the database from a hashed MD5 string to a hashed plain text password.

I did this in the code where we checked for the MD5 version, but only if the check was successful:

    $validPassword = password_verify($password, $user['password']);
    if (!$validPassword) {
        // check for a legacy password
        $validPassword = password_verify(md5($password), $user['password']);
        if ($validPassword) {
            // migrate user's record to bcrypt
            $sql = 'UPDATE user SET password = ? WHERE id = ?';
            $hashedPassword = password_hash($password, PASSWORD_DEFAULT);
            $database->Execute($sql, [$hashedPassword, $user['id']]);
        }
    }

Now, every time a user logs in with an MD5 hashed password, we will automatically re-hash their plain text password to bcrypt.
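Putting the pieces together, the whole login flow might look something like the sketch below. It uses PDO with an in-memory SQLite database so that it can run on its own; the login() function name, the table layout and the sample user are assumptions for illustration, not the site's real code. It mirrors the steps above unchanged, so the concern raised in the comments about a leaked pre-migration MD5 hash being accepted as a password still applies to it.

```php
<?php
/**
 * Sketch of the combined login flow.
 * Returns the user row on success, or null if the credentials are not valid.
 */
function login(PDO $pdo, string $email, string $password): ?array
{
    // Step 1: retrieve the user by email address only.
    $select = $pdo->prepare('SELECT * FROM user WHERE email = ?');
    $select->execute([$email]);
    $users = $select->fetchAll(PDO::FETCH_ASSOC);
    if (count($users) !== 1) {
        return null;
    }
    $user = $users[0];

    // Step 2: check the supplied password against the stored hash.
    if (password_verify($password, $user['password'])) {
        return $user;
    }

    // Legacy fallback: the stored value is a hash of the old MD5 hash.
    if (password_verify(md5($password), $user['password'])) {
        // Migrate the record to a hash of the plain text password.
        $user['password'] = password_hash($password, PASSWORD_DEFAULT);
        $update = $pdo->prepare('UPDATE user SET password = ? WHERE id = ?');
        $update->execute([$user['password'], $user['id']]);
        return $user;
    }

    return null;
}

// Assumption: in-memory SQLite with one hypothetical legacy user.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT, password VARCHAR(255))');
$insert = $pdo->prepare('INSERT INTO user (id, email, password) VALUES (?, ?, ?)');
$insert->execute([1, 'rob@example.com', password_hash(md5('secret'), PASSWORD_DEFAULT)]);

$user = login($pdo, 'rob@example.com', 'secret');
var_dump($user !== null); // bool(true)

// After the first successful login the stored hash matches the plain text directly.
$stored = $pdo->query('SELECT password FROM user WHERE id = 1')->fetchColumn();
var_dump(password_verify('secret', $stored)); // bool(true)
```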

Updating password creation

Finally, I went through and fixed all the code that created a password in the database. This was in the user admin section and the user’s change-password and reset-password pages.

In all cases, I changed:

$password = md5($new_password);

to

$password = password_hash($new_password, PASSWORD_DEFAULT);

password_hash() requires a second parameter which is the algorithm to use. Unless you have a specific reason not to, use PASSWORD_DEFAULT.
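Because the default algorithm and cost may change over time, PHP also provides password_needs_rehash() to check whether a stored hash is out of date. The sketch below is only an illustration: it uses PASSWORD_BCRYPT with explicit, deliberately low costs so that the result is predictable and the example runs quickly. In real code you would normally pass PASSWORD_DEFAULT and your chosen options instead.

```php
<?php
// A deliberately low cost keeps the example fast; it is not a recommendation.
$hash = password_hash('secret', PASSWORD_BCRYPT, ['cost' => 5]);

// Same algorithm and cost as the stored hash: nothing to do.
var_dump(password_needs_rehash($hash, PASSWORD_BCRYPT, ['cost' => 5])); // bool(false)

// A higher target cost means the stored hash is out of date.
var_dump(password_needs_rehash($hash, PASSWORD_BCRYPT, ['cost' => 6])); // bool(true)

// On a successful login the plain text password is available, so re-hash it then.
if (password_verify('secret', $hash)
    && password_needs_rehash($hash, PASSWORD_BCRYPT, ['cost' => 6])
) {
    $hash = password_hash('secret', PASSWORD_BCRYPT, ['cost' => 6]);
}
var_dump(password_get_info($hash)['options']['cost']); // int(6)
```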

That’s it

Those are all the steps that I went through. I would expect that most, if not all, actively maintained applications have been updated by now, as PHP 5.5 came out in 2013! However, it wouldn’t surprise me if there are many sites out there that were built by an agency in the past where the client doesn’t actively maintain them, but only asks for updates when changes are required – as in this case.

11 thoughts on “Migrating to password_verify”

  1. It was pointed out to me on Twitter (Thanks Ashley!) that you may also need to update your password_hash'd password to change the cost or algorithm.

    The same basic process is used, where you can use password_needs_rehash to see if the password in the database needs updating or not.

  2. I would add one note to this, and that is to say that the REASON you've included in-place migration from the migrated md5+bcrypt to pure bcrypt is that the md5 variants have a lower entropy limit (128 bits max) than an unhashed user-supplied password (184 bits for bcrypt). So replacing the rough upgrade with a user-driven post-update gives us higher security over time.

  3. Hi Rob,

    On this line:
    "I now need to check the password, which I do with password_hash():"

    I think you mean "password_verify" ?

    Also, it's interesting that you migrate users' passwords as they log in. Is the hope that someday you'll find that no legacy passwords are in the database, and you'll be able to remove all the special-case code? That raises some interesting questions about purging if someone hasn't logged in in a long time, etc.

  4. This introduces a new vulnerability. The reason we prefer bcrypt to md5 is because it's more secure in the case that an attacker gets hold of a copy of one or more of the hashes from the database. With bcrypt it's a lot harder for them to find out what the original password was than it would be with md5.

    But with the scheme above the attacker doesn't necessarily need to know the original password. If they get a copy of the database from before the one-off script was run then they can just present one of those hashes as if it was a password. password_verify(md5($password), $user['password']) will return true and the attacker will be logged in to the victim's account.

    You need to store information about which way the password was hashed – the offline process will generate hashes as bcrypt(md5(password)) and the new passwords will be stored as bcrypt(password). There's no way to distinguish these by looking at the output, but you need to just check against the same algorithm when someone tries to log in.

    There are some examples of how to do this at https://paragonie.com/blog/2016/02/how-safely-store-password-in-2016#legacy-hashes and https://www.michalspacek.com/upgrading-existing-password-hashes

      1. You can prevent the vulnerability described above by introducing a $secret value in the hash of the old passwords: password_hash($row['password'] . $secret, PASSWORD_DEFAULT).
        In this way, if an attacker has a copy of the old database with md5($password), he/she cannot use the hash to pass the first validation with $validPassword = password_verify($password, $user['password']);. For the second validation, you need to use $validPassword = password_verify(md5($password) . $secret, $user['password']);. You can generate the $secret using random_bytes() and store the output in a configuration file. I suggest using a long $secret, so that the MD5 hash plus the secret cannot be submitted through the password field of the HTML form. Usually, passwords are limited to a maximum size, e.g. 20 characters.

  5. And there is a probability that bcrypt() will be broken at some point in the future. So then legacy passwords will have to be updated to new_algorithm(bcrypt(md5(password))). So this approach is not fully future-proof.

    We've "allowed" the legacy hashes to remain in the DB for a while (not good, I know), and have updated hashes for legacy users after login. Now we are forcing password resets on users who have not logged in for a few months and are hoping the likelihood of user annoyance is low.

    Probably would have been best to flag legacy users in the DB and convert their passwords to bcrypt(md5()), then used the approach you've suggested and deflag user after their password hash had been updated. Run with that for a while to get active users updated, and then force the password change on flagged users who are more likely to be dormant. Legacy password code can then be removed.

  6. Hi Rob,
    I am in exactly the same boat, in that I have taken on a website that was built years ago, has multiple users, but the DB is still using md5 and I want to upgrade to current standards. I’m going to be giving this a go tmw and hopefully I can get it to work, as it’s been hurting my brain past few days

  7. I'm currently in a similar situation.
    I've got an application to maintain and there is a plan to develop it further. There is an old authentication mechanism that involves SHA1 passwords. However, there is salt in place and each hash is done 1000 times before saving to the DB and checking with the provided password.

    My question is: is this enough by today's standards, or is there a strong need to introduce password_verify and password_hash? I understand that bcrypt is great for passwords because it's slow to compute, so an attacker can only brute-force hashes slowly (and will probably give up in the process). But here, with hashing 1000 times with SHA1, is it also slow to brute-force? (I assume it takes 1000 times longer than a simple SHA1 hash.)

    I would really appreciate your feedback, Rob.

    Thanks!
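The flag-based scheme suggested in comments 4 and 5 could be sketched roughly as follows. The "legacy" key stands for a hypothetical flag column set by the one-off script, and passwordMatches() is an invented helper name; neither is part of the code in the article. The idea is to check only the scheme recorded for the user, so an old MD5 hash presented as a password is hashed again and no longer matches.

```php
<?php
/**
 * Check a password using only the scheme recorded for the user.
 * The "legacy" key is a hypothetical flag column set by the one-off script.
 */
function passwordMatches(array $user, string $password): bool
{
    $candidate = $user['legacy'] ? md5($password) : $password;

    return password_verify($candidate, $user['password']);
}

// What a leaked pre-migration database would contain for this user.
$oldMd5 = md5('secret');

// The user as stored after the one-off script, flagged as legacy.
$legacyUser = [
    'password' => password_hash($oldMd5, PASSWORD_DEFAULT),
    'legacy'   => true,
];

var_dump(passwordMatches($legacyUser, 'secret')); // bool(true)
var_dump(passwordMatches($legacyUser, $oldMd5));  // bool(false)

// After a successful login the record is re-hashed and the flag cleared.
$migratedUser = [
    'password' => password_hash('secret', PASSWORD_DEFAULT),
    'legacy'   => false,
];

var_dump(passwordMatches($migratedUser, 'secret')); // bool(true)
var_dump(passwordMatches($migratedUser, $oldMd5));  // bool(false)
```

Once no flagged rows remain, the legacy branch can be removed, which is the clean-up path comment 5 describes.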
