Password Hashing. What it is and how to make it not suck.

The PlayStation Blog recently posted that all our passwords on PSN were not encrypted, causing a crazy amount of panic across the internet. However, yesterday Sony clarified that while they were not encrypted, they were hashed.

“Phew!” was my first thought, but I’m guessing most people were thinking “Eh?” So I put together this description of why password hashing by itself sucks and how to improve it significantly. I’m going to take a guess and assume that the PSN has been using version 2, and will from now on be using version 3.

When you register for websites or online services, you have to set a password so that you can log in again in the future. Your username and password need to be stored in a database so that when you ask to log in, the server can verify your details are correct and allow you access.

Let’s look at the basic way of doing this (by the way, the WRONG way) and then work our way up to how most websites (should) be storing your password.

(comic from XKCD.com)

Version 1 – Plain-text

Joe has registered on my website and I have chosen to store his password in “plain-text”. This means I store his password with no other security measures than normal.

So in my database I store:

Username: Joe
Email: Joe@bloggs.com
Password: 12345

Yes, it’s a bad password. But you’d be surprised how many people use that one. (see top passwords on Gawker leak: http://blogs.wsj.com/digits/2010/12/13/the-top-50-gawker-media-passwords/)

Now when Joe tries to log into my website, I look at the password he gave me and compare it to the one in my database. Let’s say Joe gives me his password “12345” – Hurrah! It matches! I can let him log in and access my lovely website.
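As a rough sketch of Version 1, the snippet below keeps Joe’s record in a Python dictionary standing in for the database. The names `users` and `login_plaintext` are made up for illustration, and this is the WRONG way of doing things, shown only to make the comparison step concrete.

```python
# Hypothetical in-memory "database": one record per username.
users = {
    "Joe": {"email": "Joe@bloggs.com", "password": "12345"},
}

def login_plaintext(username, password):
    """Return True when the supplied password equals the stored one."""
    record = users.get(username)
    if record is None:
        return False
    # Direct comparison against the stored plain-text password.
    return record["password"] == password

print(login_plaintext("Joe", "12345"))    # the password Joe registered with
print(login_plaintext("Joe", "letmein"))  # some other password
```

Note that anything able to read `users` can read Joe’s password directly.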

Where are the problems with this? First, anybody running the website can easily look into their database and read all the passwords for all their users. Ideally you want even the admins on the website to not be able to know your password. Secondly, all the security is based on the database. If somebody managed to break into the website, they may be able to break into the database and download all your usernames, emails and passwords.

We need a better form of security.

Version 2 – Password Hashing

Now we are going to secure our passwords with something called “hashing”. We use a mathematical equation called a “hash function” to turn your password into a piece of nonsensical data. There are many different types of hash functions we could use, however they ideally need to have these properties:

  • One-way only
    • This means if we take a password and run it through a hash function, we cannot reverse the process. You can’t take the password hash, run it through a modified version of the hash function and get the original password.
    • This requires some complex mathematics to ensure that reversing the hash function is not practically feasible.
  • No collisions
    • We don’t want two passwords resulting in the same password hash. For example, if “12345” and “password” resulted in the same password hash, people would be able to log in with either of these passwords.
    • This will make more sense after an example.

So, for this example we’re going to use a famous hash function called MD5. Researchers have demonstrated practical collisions in MD5 and there are better functions available now, but for this example we’ll use a popular one.

When Joe registers, instead of storing his password in plain-text, we store the result of the hash function.

Username: Joe
Email: Joe@bloggs.com
Password: 827ccb0eea8a706c4c34a16891f84e7b

You can see that the result of “12345” is a long piece of text that is impossible to understand.
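To see where that value comes from, here is a minimal sketch using Python’s standard `hashlib` module. The helper name `md5_hex` is my own, and encoding the password as UTF-8 before hashing is an assumption (for “12345” it makes no difference).

```python
import hashlib

def md5_hex(password):
    """Hash a password with MD5 and return the result as 32 hex characters."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

print(md5_hex("12345"))  # 827ccb0eea8a706c4c34a16891f84e7b
print(md5_hex("12346"))  # a one-character change gives an unrelated-looking hash
```

The first line printed matches the value stored for Joe above.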

Now, when Joe tries to log in, we take the password he gives us, run the hash function on it and compare the two hashes instead. If he gives us “12345”, we run it through the hash function and check the resulting password hash; if it matches the hash we have in the database – Hurrah! We have logged Joe into the site again.
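The login check for Version 2 could be sketched like this. The dictionary `users` and the function `login_hashed` are hypothetical names standing in for a real database and login handler; `hmac.compare_digest` is used for the comparison as a cautious habit, which is my choice rather than something the scheme requires.

```python
import hashlib
import hmac

# Hypothetical in-memory "database" holding only the MD5 hash of the password.
users = {
    "Joe": {
        "email": "Joe@bloggs.com",
        "password_hash": "827ccb0eea8a706c4c34a16891f84e7b",
    },
}

def login_hashed(username, password):
    """Hash the supplied password and compare it with the stored hash."""
    record = users.get(username)
    if record is None:
        return False
    candidate = hashlib.md5(password.encode("utf-8")).hexdigest()
    # compare_digest compares the two hex strings in constant time.
    return hmac.compare_digest(candidate, record["password_hash"])

print(login_hashed("Joe", "12345"))     # the password Joe registered with
print(login_hashed("Joe", "password"))  # some other password
```

The plain password only exists for the moment it takes to hash it; it is never written to the database.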

But is this really safe enough?

Note that this time, we never store the plain password. So an admin can’t look through the database and read everyone’s passwords. But, there is still a flaw in this system.

What if we built a massive database of every single possible combination of letters, numbers and symbols, ran the same MD5 hash function over every possibility and saved the results? It would take a very, very long time to calculate, but people have done exactly this. They have created databases where you can type in a password hash, and it will search through their massive databases trying to find the password that originally created it.
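A toy version of such a database can be sketched in a few lines. To keep it quick I am assuming passwords made only of digits, up to five characters long; the function name `build_lookup` is made up, and real lookup databases cover far larger character sets and use cleverer storage.

```python
import hashlib
import string
from itertools import product

def build_lookup(alphabet, max_length):
    """Map the MD5 hash of every string up to max_length back to that string."""
    table = {}
    for length in range(1, max_length + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            digest = hashlib.md5(candidate.encode("utf-8")).hexdigest()
            table[digest] = candidate
    return table

# Every digits-only password from 1 to 5 characters long.
table = build_lookup(string.digits, 5)

stolen_hash = "827ccb0eea8a706c4c34a16891f84e7b"
print(table.get(stolen_hash))  # 12345
```

Once the table has been built, looking up a stolen hash is a single dictionary access, which is why unsalted hashes of common passwords fall so quickly.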

This is the problem with everybody using the same hash functions, and there are very few available that are secure and strong enough to choose from.

However, there is a solution to this problem too.

Version 3 – Salted hashes

Salting is almost exactly the same as password hashing, but with one minor difference. We add a new piece of data to each user in our database. For this example, I’m going to generate a random piece of text for Joe using a random text generator.

For Joe, we generated a random piece of text “b5h64h0c78FbXWJHKl7DDKKE35d6SO”. We shall call this his “password salt”. We store this alongside his username and email address in the database.

Now, instead of storing the hash of only his password, we also add our salt to his password. Instead of performing the hash function on “12345” we perform the hash function on “12345b5h64h0c78FbXWJHKl7DDKKE35d6SO”. Notice it starts with Joe’s normal password, but we add our salt onto the end. This gives us a new password hash to store.

Username: Joe
Email: Joe@bloggs.com
Password: f88378f45a99a13be6f42cefbd80e976
Salt: b5h64h0c78FbXWJHKl7DDKKE35d6SO
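Here is a minimal sketch of Version 3 in Python. The function names (`make_salt`, `salted_md5`, `register`, `verify`) and the 30-character letters-and-digits salt are assumptions chosen to mirror the example, and MD5 is used only because this article uses it.

```python
import hashlib
import hmac
import secrets
import string

SALT_ALPHABET = string.ascii_letters + string.digits

def make_salt(length=30):
    """Generate a random salt using a cryptographically secure generator."""
    return "".join(secrets.choice(SALT_ALPHABET) for _ in range(length))

def salted_md5(password, salt):
    """Hash the password with the salt added onto the end."""
    return hashlib.md5((password + salt).encode("utf-8")).hexdigest()

def register(password):
    """Build the fields we would store for a new user."""
    salt = make_salt()
    return {"salt": salt, "password_hash": salted_md5(password, salt)}

def verify(record, password):
    """Re-hash the supplied password with the stored salt and compare."""
    candidate = salted_md5(password, record["salt"])
    return hmac.compare_digest(candidate, record["password_hash"])

joe = register("12345")
print(verify(joe, "12345"))  # the password Joe registered with
print(verify(joe, "54321"))  # some other password
```

The salt is stored in the clear next to the hash. It does not need to be secret; it only needs to be different for every user.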

So now, we have made Joe’s password very long. It would take way too long for somebody to precompute every single possibility up to the length of a 35-character password, which is what the salt we added on turns it into. This is why it’s vital that websites add a salt for each user: it makes pre-calculating password hashes impractical. Since every user has a completely different salt, a lookup database built for one user is useless for the next, and it would take centuries of computation to get anywhere close to covering them all.
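The effect on a precomputed database can be sketched as follows. The tiny `table` of common passwords, the second user and her salt `salt_ann` are all invented for illustration; only Joe’s salt comes from the example above.

```python
import hashlib

def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# A (very small) precomputed lookup table of unsalted hashes.
common_passwords = ["12345", "password", "qwerty"]
table = {md5_hex(p): p for p in common_passwords}

salt_joe = "b5h64h0c78FbXWJHKl7DDKKE35d6SO"
salt_ann = "Zq81LmT0xR4vYc7NpW2sKd9HgB3fJu"  # hypothetical second user

unsalted = md5_hex("12345")
salted_joe = md5_hex("12345" + salt_joe)
salted_ann = md5_hex("12345" + salt_ann)

print(table.get(unsalted))       # the unsalted hash is found in the table
print(table.get(salted_joe))     # the salted hash is not in the table
print(salted_joe == salted_ann)  # same password, different salts

# Number of 35-character strings over letters and digits alone, i.e. how
# many entries a table would need to cover every password-plus-salt.
print(62 ** 35)
```

Two users with the same bad password end up with different hashes, so a table built against one of them does nothing for the other.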

Recently, Gawker (a website network including Fleshbot, Deadspin, Lifehacker, Gizmodo, io9, Kotaku, Jalopnik and Jezebel) was hacked and their database was compromised. They did not use password salting. Millions of passwords were instantly looked up in large password hash databases. It’s hard to know how many other websites out there don’t salt their password hashes.

We’ve glanced over a lot of password security, but I thought it would be helpful to essentially explain how your data is secure. After the recent media hype over hacked systems, people actually suddenly seem to care about their online information. Just wait until people get into your Facebook. If you don’t want people to know about it, don’t put it online.

How can we improve this further? Look up Two-Factor Authentication – banks (and recently Gmail http://googleblog.blogspot.com/2011/02/advanced-sign-in-security-for-your.html) implement it and it will keep your account significantly more secure: http://en.wikipedia.org/wiki/Two-factor_authentication

Cross-posted from my blog, http://www.cubehouse.org/blog/2011/05/02/password-hashing-how-to-make-it-not-suck-a-basic-guide/

May 4th, 2011 by | 9 comments
Cubehouse is a very British computer scientist who loves everything PlayStation Home and serves as HSM's webmaster. He lives and works in Bristol, UK and holds a Masters Degree in Computer Science. Cubehouse is working in the internet security and gaming industries.


9 Responses to “Password Hashing. What it is and how to make it not suck.”

  1. johneboy1970 says:

    Thanks Cube! Very timely and informative.

    • Keara22hi says:

      Say what???? I long for the old days of rotary dial phones and 8-track stereo systems. This new stuff is so scary. Makes one want to move to a very remote island in the middle of the ocean and hide from the rest of civilization. Oh, wait -- I already did that.

  2. Queen_Eli says:

    My hash is always very salty, I believe it’s the corned beef that makes it so.

  3. NorseGamer says:

    That stick-figure featured image, by the way, is *still* cracking me up.

  4. Katsuune says:

    Thanks for the timely summary on hash functions, cube. It’s interesting that SCE, in their initial description of how user data had been compromised, did not mention that the stolen passwords were hashed. Presumably their security advisors had been quick to point out that with Version 2 hashes, the existence of precomputed crypto libraries could make it laughably easy to recover the more frequently-used passwords.

    This discussion is all the more appropriate, seeing that Sony is just one of many companies scrambling to deal with threats to users’ security. Just this morning it was announced that the web-based password management service ‘LastPass’ was hacked 2 days ago, and in their case, “analysis of the outbound data transfer from the server [was] large enough to have included people’s email addresses, the server salt and their salted password hashes from the database.”

    Let’s note that even though LastPass was storing salted passwords they are still requiring users to change their master passwords. And like Sony, they are not exactly sure at this point what exactly was intercepted from their database.

    My view on this is that no matter how tight the security is on a data center, it still becomes a giant liability the moment that someone steals the data. IMO the very best thing for Sony to do would be not just to beef up their procedures, but to make it a point to collect less personally-identifying data on users. For example, why ask us for a DOB at all, when simply asking for year of birth (or an age range) would do just as well to ensure that someone’s old enough to play online?! Minimizing the personal data collected makes it that much less valuable to thieves looking to harvest credit card information. It makes no sense for SCE to make themselves a more inviting target for hackers, by demanding user information beyond what is absolutely necessary to fulfill customer service concerns.

    • Cubehouse says:

      The normal hashing technique was actually perfectly fine years ago, because computing power wasn’t all that powerful and the internet was mainly for “technologically advanced” users who understood the idea of a secure password.
      Nowadays, there are so many people using the internet who don’t understand the issues with password security and have completely awful passwords like “Password” and “123456”. I know that none of the passwords I have ever used is available in those hash databases because I made them secure.
      Salting passwords is there to protect people with bad passwords.

      And LastPass is just one of a long long string of attacks. It’s inevitable. Just wait until your (or your friend’s) Facebook gets attacked. How bad is that? With Facebook, you have to trust every single one of your friends to also be secure to avoid your data getting out. My old school friends on Facebook are idiots. I am worried.
