In his interview with John Oliver, Edward Snowden said something that many computer-savvy people already know: modern computers can sweep through the entire set of possible eight-character passwords in less than a second, and that is how hackers break into systems. He suggested that rather than using complicated, hard-to-remember combinations of characters, we should think in terms of long phrases that are easy for each user to remember but unlikely to be found in written form anywhere.
But what I don’t understand is how that is used to break into systems. While you can sweep through all the possibilities quickly, you don’t know which is the correct one and isn’t that determined by trial and error? Usually when you log into a system, you type in your username and password and it is only if the password matches the username that you are allowed in. Otherwise you get an error message. That alone takes a few seconds. And in many systems, after a few failed attempts, you are locked out and have to contact the system administrator to get a new password.
So why aren’t hackers stopped after the first few guesses? I did a little searching on the internet but couldn’t find a good answer. Anyone here know?
Some Old Programmer says
Well, one class of dangerous attacks comes from unauthorized access to the password hash result.
(For non-programmers, you might want to look at Tom Scott’s YouTube video “How NOT to store passwords!” on the Computerphile channel).
The tldr: whatever you type in as your password goes through a one-way process, “hash”, that results in a string of gibberish. A well designed password system will not store your in-the-clear password, but the hash. If the result of the hash is compromised, it’s straightforward for someone to use their own computer resources to crack the password without involving the target site.
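To make the one-way idea concrete, here is a minimal sketch of storing and checking a hash rather than the password. SHA-256, the example passphrase, and the function names are my assumptions for illustration; a real site should use a salted, deliberately slow scheme rather than a single fast hash.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the hex SHA-256 digest of a password (illustration only)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# What the site keeps on disk: the gibberish, never the password itself.
stored_hash = hash_password("correct horse battery staple")


def check_login(attempt: str) -> bool:
    """Hash the attempt the same way and compare the two digests."""
    return hmac.compare_digest(hash_password(attempt), stored_hash)


print(stored_hash)                                    # 64 hex characters
print(check_login("correct horse battery staple"))    # matches
print(check_login("123456"))                          # does not match
```

Anyone who obtains `stored_hash` can run `hash_password` on their own machine as often as they like, which is the offline attack described here.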
Note that I glossed over a lot, and this is not intended to be a tutorial or exhaustive explanation of how passwords are cracked.
Kengi says
The password file on a computer can only be so secure. After all, before you log in, the process running to log you in must have access to the password file to check to see if you entered the correct username and matching password. So this can’t be buried behind too much of a firewall, or the process used to log you in wouldn’t have access to that file.
Thus, sometimes when a hacker “attacks” a system, it can be to exploit a weakness which simply allows access to the area of the file system (or, indeed, RAM) the password file is stored in.
As Some Old Programmer pointed out, this file is one-way encrypted, so you can’t just read it to see the passwords. However, once a hacker has this file, they can run the same process on their own computer (kind of simulating the login process) and look to see if any entry matches any of the hashed passwords after running a word through the login encryption process. In effect, they get to try logins as fast as their local computer can process the encryption over and over.
There have long been dictionary files of common passwords (college sports team names, for example) as well as every word in most languages. Routines add numbers to the beginning and ends, plus replace numbers with letters, and try each one at great speed. Special computers which are good at trying things in parallel (often utilizing several video cards for such parallel number crunching) can burn through the common words in a fraction of a second, and then try every combination of every letter and number up to a certain number of characters.
Of course, since you need to encrypt each try, then compare it to the hash (or hashes) in the password file, the time it takes is limited by the complexity of the hash algorithm. Each additional character multiplies the number of attempts needed to brute-force the password by the size of the character set, so the work grows exponentially with length.
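The offline process described above can be sketched in a few lines. Everything specific here is an assumption made for illustration: the use of unsalted MD5, the tiny word list, the two made-up accounts, and the helper names. Real cracking tools are far more elaborate, but the loop is the same: hash a guess, look for it in the stolen file.

```python
import hashlib
from itertools import product


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical stolen password file: hash -> account name.
leaked = {md5_hex("tigers1"): "alice", md5_hex("zx9"): "bob"}


def dictionary_guesses(words):
    """Each word, plus the word with one digit added at either end."""
    for word in words:
        yield word
        for digit in "0123456789":
            yield word + digit
            yield digit + word


def brute_force_guesses(alphabet, max_len):
    """Every string over the alphabet, up to max_len characters."""
    for length in range(1, max_len + 1):
        for combo in product(alphabet, repeat=length):
            yield "".join(combo)


def crack(hashes, guesses):
    """Hash each guess locally; no login attempt ever reaches the site."""
    found = {}
    for guess in guesses:
        account = hashes.get(md5_hex(guess))
        if account is not None:
            found[account] = guess
    return found


print(crack(leaked, dictionary_guesses(["bears", "tigers"])))
print(crack(leaked, brute_force_guesses("abcdefghijklmnopqrstuvwxyz0123456789", 3)))
```

The dictionary pass recovers the word-plus-digit password, and the exhaustive pass recovers the short random one. Neither triggers a lockout, because the target site is never contacted.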
JohnE says
One way is that the attackers can copy off the password details and do the brute-force attack offline. The passwords are stored as a “hash” as above, and cracked by doing repeated hashing of trial passwords until they find one that matches.
The password details are typically stored in an SQL database along with the rest of the site data and can be accessed by things like SQL injection attacks, where an SQL command to unload this data is embedded into the incoming message.
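As a rough illustration of how an injected command can unload the hash column, here is a sketch using an in-memory SQLite database. The table layout, the placeholder hash strings, and the function names are all hypothetical; the point is only the difference between pasting input into the query text and passing it as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, pw_hash TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "hash-of-alice-password"), ("bob", "hash-of-bob-password")],
)


def lookup_unsafe(name: str):
    # Vulnerable: the incoming text becomes part of the SQL statement.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def lookup_safe(name: str):
    # Parameterized: the incoming text is only ever treated as a value.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


# Input that closes the quote and appends a second SELECT.
payload = "x' UNION SELECT pw_hash FROM users --"

print(lookup_unsafe(payload))  # rows from the pw_hash column
print(lookup_safe(payload))    # no rows: nobody has that literal name
```

With the unsafe version the attacker walks away with every stored hash and can start the offline attack at leisure.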
moarscienceplz says
I think you’re probably right, Mano. However, I think that people are prone to use certain words that are either common, or can be guessed at if you know something about the person you are trying to imitate, so it’s not necessary to try every possible combo. I think that’s the main reason why you are often required to use upper and lower-case letters and numerals, not so much because it increases the total universe of combinations (letting you choose whether or not to include extra sets actually would make even more combinations possible) but because it forces you to broaden your personal universe of combos.
Maybe what Snowden was thinking of is administrator-level passwords. It would be tricky to limit these to three strikes because if you lock out the admin, who is going to let him/her back in?
Some Old Programmer says
On a more basic note, good computer security doesn’t assume that you can keep a given set of data perfectly secure. A reason passwords aren’t kept in the clear? Assume that someone will get unauthorized access to it--eventually you’ll be right, particularly if you consider that IT personnel need wide ranging access. Similarly, don’t assume you can keep your hash function perfectly secure; programmers write it, programmers can access it. Thus good computer security typically has multiple layers of barriers and safeguards.
Oh, and a disclaimer: I won’t call myself a security expert.
Scr... Archivist says
Kengi @2,
Do they do this to reverse-engineer the encryption process and thus decrypt all of the passwords in the file? Or do they only figure out the test passwords that their system tries and finds a match for?
In other words, is my super-strong password in danger because it happens to be in a list with someone’s “123456”?
Alan says
Crypto not exactly my thing, but it depends on what you are trying to attack. For a website, there’s a limit to how frequently you can attempt dictionary word guesses based on the connection speed and any limitations put on incorrect attempts by the site in question. For an OS, if you have access to the password hashes -- in the form of a file you can work on offline -- it is probably trivial to find “solutions”, since for any given hash there may be more than one method of generating it. So the problem is reduced to “what string, when hashed, gives me this hashed string” so all you need to do is generate lots of hashes and work backwards to do a hash lookup to get the original string. It may or may not be the original password, but the cracker doesn’t care.
Kengi says
Scr… Archivist@6,
The encryption algorithms for most systems are all based upon standards and well known, so there is no need to reverse-engineer the process. Since it is a one-way process (based upon the mathematics), there is no way to “de-crypt” the file. Hackers can only keep trying a set of characters, encrypt those characters, then look at the file to see if what comes out matches anything in the password file. If there’s a match, they know the input for that single hash, which is the password for a single account in the file.
So no, they don’t get the entire list by finding one password.
However, this brings up another issue, which is to never use the same password for most or all of your accounts. Since many people re-use passwords (and account names, or email addresses as account names), once a hacker has your password from one system, they often try to login using the same credentials on others.
Thus, if someone hacks the FTB servers and brute forces your password, they may try the same combination at gmail to see if it works. And once they have gmail, they move on to banking systems.
To avoid this problem, you should use a different password for each account, and use two-factor authentication as well if available (which it is at most large banks and gmail).
Each password should be at least 16-18 characters and not be based upon a word at all. Phrases are better, but dictionary files of common phrases are becoming more common. The best is to have each password be a set of nonsense characters, numbers, and symbols.
To keep track of them you should use a password manager. Thus, you keep a single, complex pass phrase which unlocks each unique password when needed. I use a manager called 1Password, but there are several good ones on the market.
Here’s an overview from Ars Technica:
http://arstechnica.com/information-technology/2013/06/the-secret-to-online-safety-lies-random-characters-and-a-password-manager/
doublereed says
Schneier has a good explanation on secure passwords. He first explains how passwords are broken, and then explains how to pick good ones.
It sounded to me like Snowden was assuming the attacker has access to the hash of the password, which would be an offline password attack (like if an attacker gets access to the database of hashed passwords). The attacker just puts the hash into the password cracker and cranks it out offline. You’re describing issues with an online password attack, which is much slower (because it runs over a network or over the internet), and can also be prevented by locking accounts, like you say. Generally, online attacks only get really easy passwords.
sc_770d159609e0f8deaa72849e3731a29d says
Is this true?
Using the shift key I could produce over ninety different characters. Eight randomly chosen characters offer much more than 4,300,000,000,000,000 possible eight-character passwords. Can a computer go through that many possibilities that fast?
Dunc says
@10: It depends on exactly what search space you’re looking at. In reality, most people only use lower-case letters for passwords, which reduces the search space considerably.
Some Old Programmer says
@10: I don’t know if it’s true, but it’s unlikely to be necessary. If someone is looking for access using any of the accounts, it’s likely easy enough to find a weak password among the set of accounts. If they’re targeting a specific account, there are online databases that they can query to find a cleartext password that will give a particular hash result for a given hash function. These databases (per the aforementioned Computerphile video on YouTube) contain the common hash results for entire language dictionaries and iconic phrases.
Mano Singham says
sc….
Actually, Snowden said it could be done in less than a second, making the challenge even greater. I made a mistake and corrected it. I will take Snowden’s word for it that it can be done.
Kengi says
@10,
It depends upon the algorithm being used, the attack method, and the hardware used for attack. Back in 2012, a cheap cracker could run 8.2 billion password combinations each second in just plain brute force mode. That could easily be scaled up using better hardware.
However, that speed could be multiplied many fold by using modified attacks which do things like pre-calculate some of the data needed (using what are called rainbow tables), and by using combinations of dictionary and brute-force attacks. By pre-calculating possible character combinations computers only need a fast system of comparing tables of characters to each other, but do need extremely large storage systems to do it. It would take more than 3,000 TB of space to store every possible combination of 10 character passwords, but only 167 GB of space to store a rainbow table expressing almost all of those combinations. That’s why even ten-character passwords are now useless against this type of attack.
Hackers also use their own clouds of systems to speed up the attacks.
I really couldn’t say if a system could now brute-force all eight-character password possibilities in less than a second if it had to calculate each result using a common system such as MD5, but using rainbow tables it sounds easy enough. And with each new generation of processors, video cards (which are used because of their massive parallel processing power), and high density storage, the process gets even faster. As of a year or two ago, twelve character passwords were considered to be barely adequate.
Thankfully, the problem for hackers goes up exponentially with each additional character.
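A back-of-the-envelope sketch of that growth, using the 8.2 billion guesses per second quoted above and the roughly ninety typeable characters mentioned @10. The function name is made up, and the result is the time to try every combination by plain brute force, with no rainbow tables or dictionary shortcuts.

```python
GUESSES_PER_SECOND = 8_200_000_000  # the 2012 figure quoted above
ALPHABET_SIZE = 90                  # roughly what a keyboard offers with shift


def seconds_to_exhaust(alphabet_size: int, length: int,
                       rate: float = GUESSES_PER_SECOND) -> float:
    """Time to try every password of exactly this length at the given rate."""
    return alphabet_size ** length / rate


# Lower-case only, eight characters: a matter of seconds.
print(f"26^8: {seconds_to_exhaust(26, 8):.1f} seconds")

# Full keyboard: each extra character multiplies the time by 90.
for length in range(8, 13):
    days = seconds_to_exhaust(ALPHABET_SIZE, length) / 86_400
    print(f"90^{length}: {days:,.1f} days")
```

At that rate the lower-case case finishes in under half a minute, while eight characters from the full keyboard take about six days. That is nowhere near one second on this hardware, which is why the rate and the attack method matter so much to the claim.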
Additionally, as some have already pointed out, this is only one attack method. That’s why you should never re-use the same password across systems even if you are confident it can’t be easily attacked using this method. And since long, random strings of characters are the best solution to this attack, password managers become a must, and regular part of daily life. The next problem, of course, is keeping password managers secure…
Kengi says
Here’s an article written by a journalist who tried his hand at password cracking:
http://arstechnica.com/security/2013/03/how-i-became-a-password-cracker/
He used a file with 17,000 passwords and his goal was to see how many he could crack in under 30 minutes. He got 4900 in a single three-second initial run, and eventually got 8059.
Dunc says
Backup and redundancy become issues too… I keep everything in a password manager, so I no longer even know the password to my primary email address. If I were to ever lose access to my password database, I’d be in a bit of a bind… So I have it synched to two different computers (using Google Drive) and I have a backup on a USB stick on my keyring.
Reginald Selkirk says
25-GPU cluster cracks every standard Windows password in <6 hours
That was in 2012, with a system a dedicated person could construct on their own. I’m sure the NSA would have access to better resources. Password cracking is a perfect application for massively parallel computing.
Reginald Selkirk says
In other words, your vital information is out there on Teh Interwebs. I’m not sure why this would make you feel safe.
Kengi says
@18,
Security is always a trade-off with ease-of-use.
The password file of the typical password manager uses even stronger encryption than most login systems, since it doesn’t have to process a lot of logins per second. So long as a good, long, difficult pass phrase is used, it would be much more difficult to brute force than a typical password file.
Of course, that’s assuming the password manager properly implements the algorithms, and there are no other unknown flaws.
No system can be perfectly secure, but proper use of a password manager, even when that password file is backed up into a cloud, is still considered more secure than using a small handful of memorable passwords (or even pass phrases) for dozens of accounts.
Of course, as more people start using password managers, those systems will become a focus for attack…
One thing to consider is how you would be able to reset your various passwords if you lost your password manager file. Most of my accounts can be reset by having access to one of my email accounts (through gmail). I make sure that account uses a pass phrase I can remember without aid of the manager (yet is still long and somewhat complex), and then use two-factor authentication for that account (through an iPhone app which produces a new code every 30 seconds).
If I lose my phone as well as my password manager at the same time, I have a paper pad of five codes which can be used to authorize a new computer access to my email (which would still need the pass phrase) which I keep in my wallet. I also keep a copy of the paper pad in a safe deposit box. If my wallet is ever stolen, I can de-authorize those codes and generate a new set. And anyone who steals my wallet would still need my pass phrase.
Security can be complex, but with a few simple adjustments to certain accounts, and some planning, it becomes part of your daily routine.
Who Cares says
First, Snowden was talking about an offline attack. Sending that many guesses to a server in under a second would amount to a DoS attack, assuming the server doesn’t throttle the connection because of the number of requests or lock the attacker out after several failed attempts.
Offline, a dedicated machine can do over 10 billion attempts/second these days.
Bakunin says
xkcd’s take
The argument for length is that 6 lower-case letters has 26^6 possibilities. Every additional letter increases the exponent, and 16 letters is easier for people to remember than 8 characters of nonsense. 8 characters with one capital, one special character, and one numeral has 80•26^6 possibilities (assuming 8 allowed special characters).
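For anyone who wants to check the arithmetic, here is a quick sketch. Reading the 80•26^6 figure as 26 capitals × 8 specials × 10 numerals × 26^5 remaining lower-case letters is my interpretation of the count above; like the original, it ignores where in the password the three special slots fall.

```python
LOWER = 26      # lower-case letters
UPPER = 26      # capitals
SPECIAL = 8     # allowed special characters, as assumed above
DIGITS = 10     # numerals

six_lower = LOWER ** 6
eight_mixed = UPPER * SPECIAL * DIGITS * LOWER ** 5   # equals 80 * 26**6
sixteen_lower = LOWER ** 16

print(f"6 lower-case letters:      {six_lower:,}")
print(f"8 chars, mixed as above:   {eight_mixed:,}")
print(f"16 lower-case letters:     {sixteen_lower:,}")
print(f"16 letters vs 8 mixed:     {sixteen_lower // eight_mixed:,} times larger")
```

The mixed eight-character scheme is only 80 times larger than six plain letters, whereas sixteen plain letters is larger by a factor of about 10^12.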
doublereed says
@21 Bakunin
The schneier article I posted earlier explains why the xkcd method does not work.
Bakunin says
The core idea I took from the xkcd method is that length trumps complexity. The classic 8 character, alphanumeric, caps mixed with punctuation style isn’t as memorable or secure as sheer length. Also, Schneier’s comments on the xkcd method in the linked article are fallacious. The passwords Schneier criticizes like “iloveyousomuch” and “i hate hackers” aren’t random words, they’re phrases. Conflating English phrases with strings of random words is logically invalid.
Marcus Ranum says
Password cracking parallelizes perfectly.
One machine that can try 100 million passwords per hour is one thing.
1 million machines that can try 10 million passwords per hour is quite another.
The NSA used to have its own chip fab plant, operated by Fairchild; if they wanted their own silicon password cracking engines, they could have them. Nowadays, though, there are FPGA arrays that make it even easier. Then it’s just a matter of adding electricity and money.
The bombe brute-force decryptors Bletchley Park used against the German Enigma were basically the same principle: try multiple key-settings at a time. When the correct key is in any of the machines, your ciphertext suddenly spits out a recognizable message. It’s even easier if the messages include some kind of integrity checksum, which basically acts as a flag that the decryption was correct.
Well-designed authentication systems make it impossible for the attacker to check authentications quickly, and slow down proportionally to how many bad guesses they get over time. The internet, however, is not full of well-designed authentication systems.
Marcus Ranum says
Edit to my #24 above: the reason well-designed authentication systems slow down proportionally with bad guesses is because it removes the parallelism of attacks.
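One way to picture that slow-down is a per-account delay that grows with each bad guess. This is only a sketch under my own assumptions (the class name, the one-second base delay, and linear growth are all invented for illustration), and it only helps against online guessing; it does nothing once the hashes have been copied.

```python
class LoginThrottle:
    """Make each failed guess on an account cost more waiting time."""

    def __init__(self, base_delay: float = 1.0):
        self.base_delay = base_delay
        self.failures = {}       # account -> number of bad guesses so far
        self.blocked_until = {}  # account -> time the next attempt is allowed

    def allowed(self, account: str, now: float) -> bool:
        return now >= self.blocked_until.get(account, 0.0)

    def record_failure(self, account: str, now: float) -> None:
        count = self.failures.get(account, 0) + 1
        self.failures[account] = count
        # The delay is proportional to the number of bad guesses.
        self.blocked_until[account] = now + self.base_delay * count

    def record_success(self, account: str) -> None:
        self.failures.pop(account, None)
        self.blocked_until.pop(account, None)


throttle = LoginThrottle()
throttle.record_failure("alice", now=0.0)   # 1st miss: wait 1 second
throttle.record_failure("alice", now=1.0)   # 2nd miss: wait 2 seconds
print(throttle.allowed("alice", now=2.5))   # still blocked
print(throttle.allowed("alice", now=3.0))   # allowed again
```

Because the delay is tied to the account rather than to the connection, a million machines guessing against one account gain nothing over a single machine.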
Marcus Ranum says
There is another technique that can be used to attack some authentication systems, namely to precompute all the possible hashes (it’s just a bunch of hard drive space, right?) then “cracking” becomes a matter of a table lookup. If you think that sounds impossible, consider Google’s search index.
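A toy version of the precompute-then-look-up idea, assuming unsalted SHA-1 and passwords of at most three lower-case letters so the whole table fits in memory. Those choices and the names are mine. Note this is a plain lookup table; rainbow tables are a cleverer variant that stores far less by recomputing chains of hashes on demand.

```python
import hashlib
import string
from itertools import product


def sha1_hex(text: str) -> str:
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Pay the hashing cost once, up front, for every candidate password.
table = {}
for length in range(1, 4):
    for combo in product(string.ascii_lowercase, repeat=length):
        candidate = "".join(combo)
        table[sha1_hex(candidate)] = candidate


def lookup(stolen_hash: str):
    """'Cracking' is now a single dictionary lookup."""
    return table.get(stolen_hash)


print(len(table))                   # 26 + 26**2 + 26**3 entries
print(lookup(sha1_hex("cat")))      # found instantly
print(lookup(sha1_hex("longer")))   # None: outside the precomputed space
```

The table is reusable against every site that uses the same unsalted hash function, which is what makes the storage cost worth paying.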
Ben Finney says
doublereed, #9:
That is a good article. It is mistaken about the XKCD 936 scheme though; as explored in the comments on Schneier’s article (and I don’t think Schneier has responded), the article mistakenly assumes the user will be choosing the passphrase.
No, XKCD 936 does *not* allow users to choose the words; its claimed entropy is because it uses words picked by a *random* process. Human choices are terrible at simulating randomness, but that’s not what the scheme relies on. Schneier attacks only a straw man.
So, four words chosen *randomly* from a 2000-word dictionary of common words has about 44 bits of entropy. If the human restricts the space of possible passphrases, then yes it will be weaker. So don’t do that.
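The entropy figure is easy to check, and the random selection is easy to do properly. In this sketch the tiny word list is a stand-in I made up (a real list would have about 2000 entries), and `secrets.choice` is used because it draws from the operating system’s cryptographic random source rather than from a human.

```python
import math
import secrets


def entropy_bits(dictionary_size: int, word_count: int) -> float:
    """Bits of entropy when each word is drawn uniformly at random."""
    return word_count * math.log2(dictionary_size)


def random_passphrase(wordlist, word_count: int = 4) -> str:
    """Pick the words by machine, not by taste."""
    return " ".join(secrets.choice(wordlist) for _ in range(word_count))


print(f"{entropy_bits(2000, 4):.1f} bits")   # about 44, as stated above
print(f"{entropy_bits(2048, 4):.1f} bits")   # exactly 44 with a 2048-word list

# Hypothetical miniature list; far too small for real use.
demo_words = ["correct", "horse", "battery", "staple", "cloud", "river", "maple", "tiger"]
print(random_passphrase(demo_words))
```

The calculation only holds if the output is used as generated; rerolling until the phrase “looks nice” is the human restriction warned about above.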
fentex says
The not so obvious reason why having your password cracked after a hashed database has been compromised is that we have need for so many passwords in our lives we repeat them.
So if a small company with unpatched servers and poor professionalism gets its password data stolen then the attacker can probably identify you by email in attached records, crack your password, then randomly try services that use your name or email address as your identity (such as Facebook) and once they’re in because you re-use your password use credentials and information gathered from there to gradually escalate their access -- eventually by social engineering attacks against your banks or other services to get at money.
And this becomes increasingly likely as more and more services use Facebook sign ins to authorize users -- it’s convenient but also centralizes where attacks have to succeed to compromise you.
fentex says
I meant; a not so obvious reason why it’s a problem to have one password recovered from a stolen database of hashed results is that we re-use passwords, so having what seems a minor service compromised can easily lead to greater damage.
Alan says
A posting referencing this article came up today in Slashdot and provides more information on the statistical approach to password cracking.
cantataprofana says
One thing to remember is that the absolute time taken to crack passwords is not really that important -- what matters is: will the attacker be able to successfully crack any passwords before the organisation in question finds out that a breach has happened (and takes remedial action)?
Knowing that security breaches often go *months* without even being detected, never mind fixed, the hashing method used needs to ensure that an attacker will need to spend (at least) months before successfully cracking any password hashes. And this requirement is *in addition* to the need for users to choose complex passwords.
Figure 5 in this report might help visualise the issue: http://www.verizonenterprise.com/DBIR/2015/insiders/?utm_source=pr&utm_medium=pr&utm_campaign=dbir2015
Marcus Ranum @26:
Sounds like you mean “rainbow tables”. Using proper salting along with hashing should make use of rainbow tables infeasible. But as you point out, there are plenty of places on the internet where this won’t be done properly.
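Here is a minimal sketch of what salting plus a slow hash can look like, using PBKDF2 from Python’s standard library. The iteration count, salt length, and function names are assumptions chosen for illustration, not a recommendation for any particular site.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; higher means slower guessing


def make_record(password: str):
    """Return (salt, digest) to store for a new password."""
    salt = os.urandom(16)  # fresh random salt for every account
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)


salt_a, digest_a = make_record("123456")
salt_b, digest_b = make_record("123456")

print(digest_a != digest_b)                  # same password, different stored values
print(verify("123456", salt_a, digest_a))    # correct password checks out
print(verify("password", salt_a, digest_a))  # wrong password does not
```

Because every account has its own salt, a table precomputed for one salt is useless for the next, and the iteration count makes each individual guess cost the attacker real time.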
Dunc says
@18:
Well, my Google Drive password is extremely strong, and my Google account uses two-factor authentication. The passphrase for my password manager is even stronger, and protected by the strongest algorithm currently available. Plus the password manager implements some additional tricks to make brute-forcing impractical…
As someone else noted, security is always a trade-off with useability. Since losing access to my password manager would be very, very bad indeed, it’s extremely important for me to have a reliable backup and synchronisation mechanism. Given the way I’ve got it set up, about the only viable attack profile (other than good old-fashioned “rubber hose cryptanalysis”) is if a very powerful attacker (e.g. the NSA) gets access to my Google Drive from the inside, and then either dedicates an absurd amount of computing power to brute-forcing the encryption of the database, or knows of a major undiscovered flaw in AES. That’s not a particularly plausible scenario, in my estimation, and if you’re going to start worrying about that level of attack, it’s time to give up and go live in a shack in the woods, because they either already own all of your accounts anyway, or they can just do an end-run around authentication entirely (as they would have to do to get access to my Google Drive without me knowing about it).
I also change my more important passwords on a regular basis, which limits the timescale for an attacker to manage that brute-force attack, should they somehow get access to my password database.
DsylexicHippo says
A password of sufficient length and complexity that is widely considered as strong today will eventually be regarded as weak over time and at some point in the future as a joke. Who could have even guessed the sort of computing power we now have in 2015 back in the eighties? My point is that we should always retire passwords after a set time and keep creating newer and more complex ones to keep up with the increasing muscle of brute force crackers.
lanir says
Several commenters have called out Schneier on his “XKCD method” criticism. It only took me a moment to see what his point was.
If you have a good idea that words are being used, you can shift your thinking away from atomic units of one alphanumeric digit and into one word at a time (including common misspellings, abbreviations and letter substitutions). You’ve got significantly more words than there are alphanumeric characters but the number of slots is going to be much more limited. How many words will be in most passphrases? Four? Six? More? You’re already over 20 characters, probably by a good margin at that point.
I’m not a security expert so I can’t say whether this is practical in the real world. But the logic of how to do it is not hard to string together and if you’re still focused primarily on character length some pieces of the process will be much simpler than you’d expect.