What are rainbow tables?
What sounds like a fun children’s game is actually a powerful tool for decrypting passwords. A relatively large group of people on both sides of the law pour lots of energy into cracking passwords – either because they stand to profit from the criminal process or because they’re experts who regularly check security standards for effectiveness. Rainbow tables make it possible under certain circumstances to find out passwords within seconds.
Even if you’re not up to any criminal activity, it’s worth understanding the process. With this knowledge in the back of your mind, you can better understand as a user why complex passwords are necessary, and as a provider of a web offer what must be considered to secure your passwords.
Why do you need rainbow tables?
Passwords these days are (hopefully) no longer stored online without encryption. When users set a password for their account on an online platform, the string doesn’t appear in plaintext on any database or server. This method would be far too unsafe: a hacker would only have to gain access to the database and could then immediately break into the accounts of each user.
For e-commerce, online banking, or online government services, these would be fatal cases. Instead, online services use various cryptographic mechanisms to encrypt their users’ passwords: only the hash value of the password appears in the database itself.
The password can’t be directly identified from the hash value, either – even if you know the crypto function. There’s no way to recalculate the operation. Hackers use brute force attacks instead: with this, a computer program keeps guessing until it’s found the correct character sequence.
This method can also be combined with password dictionaries. These files – which can be obtained via the internet – contain numerous passwords that are either very popular or were captured during a previous attack on another system. This saves hackers time with the decryption: First, they try all passwords in the dictionary. Depending on the complexity of the passwords (length and used characters), this can cost some time and computing power, though.
Don’t use any simple terms for your passwords, as they make it much too easy for attackers. In our article on safe passwords you can learn how to design a good password.
Rainbow tables go a step further than password dictionaries, as they can also be found online, and can be used to crack passwords. These files, some of which can be multiple hundred gigabytes large, contain passwords together with their hash values along with the encryption algorithm used. It’s not complete, though. Instead, certain chains are created from which the actual values can easily be calculated, which reduces the storage requirements of the still massive tables. With rainbow tables, hash values found in a database can be used to sort your passwords into plaintext.
How do rainbow tables work?
To understand how rainbow tables function, you have to understand the basic workings of crypto-algorithms. This makes it easier to see the benefits of the pre-made tables and understand the time-memory tradeoff.
Encryption technology
Since cryptographic hash functions started being used in encryption, the corresponding algorithms have been changed time and time again. Standards that were uncrackable 10 years ago are now considered a serious breach of security. What they all have in common, though, is that the content to be encrypted still runs through algorithms and a hash value is generated at the end. This hash value is generally a hexadecimal number with a specified length. It doesn’t matter how long the original content is, because at the end it’s always a 128-bit hash value, for example. Three properties are decisive for encryption:
- The same input always generates the same hash value, and is the only way that the value can function as a checksum. Is the entered password identical to what’s saved in the database? The system may only grant access if both hash values are the same.
- A hash value should always be unique, so different entries can’t generate the same hash value. Only in this way can the function make sure that the correct password was also entered. Since the number of possible hash values is limited, but the number of possible entries isn’t, such collisions can’t be excluded. Modern hash functions and hashes with a sufficient length minimize the risk as much as possible.
- Hash values can’t be recalculated: original content can never be derived from the hash value itself. This is why hash values can’t also be decrypted, as is sometimes vaguely claimed. Instead, hash values can only be comprehended.
- Hash functions have to be relatively complex – but not too complex: to ensure security, an algorithm can’t work too quickly, because that would also make the work easier for attackers. The conversion also shouldn’t be too complex, as it does still need to be applied in practice.
Hash values aren’t only used to encrypt passwords. The functions also serve as checksum for complete programs, for example: the algorithms generate a hash value from the entire source code. This ensures, for example, that the version of the program downloaded from the internet is identical to the original and hasn’t been replaced by malware.
Reduction functions
The hash values contained in rainbow tables aren’t created with an attack, but already exist beforehand, meaning attackers can obtain rainbow tables and use them to find out passwords. These files are very large, though. So that the required storage space doesn’t get out of hand, rainbow tables use a reduction function that change the hash value into plaintext. Important: The reduction function doesn’t reverse the hash value, so it doesn’t output the original plaintext (i.e. the password) – because this isn’t possible – but instead outputs a completely new one.
A new hash value is them generated from this text. In a rainbow table, this takes place not only one time, but many times, resulting in a chain. In the final table, however, only the first password and the last hash value of a chain appear. Based on this information and taking the reduction functions into account, all other values can also be determined. The hash value to be cracked is then reduced again according to the same rules and hashed, and each intermediate result is compared with the values in the table.
The challenge of creating a table is that the source content, which represents the beginning of a new chain, can’t have already appeared as plaintext in a previous chain. With this technology, the size of such tables can be extremely reduced, and yet they still are several hundred gigabytes large.
Time-memory tradeoff
A time-memory tradeoff is basically when you accept a longer runtime in favor of fewer memory requirements – or the other way around. A brute force attack takes up very little storage space, since the cryptographic calculations for each attack are performed anew. A table, on the other hand, in which billions of passwords are presented together with their hash values, takes up an enormous amount of storage space, but can very quickly run decryptions. Rainbow tables represent a compromise of both. In principle, they also perform real-time calculations, but to a lesser extent, and so save a lot of storage space compared to complete tables.
Procedure within rainbow tables
The initial situation: You have a specific hash value and would like to discover the actual password behind it. First, search through the list for the hash value. If it’s found either at the beginning or the end of a chain, then the password can be found relatively quickly, so you only have to decipher the repetitions of the chain to get the desired result. But what happens when you don’t find the hash value in the table?
In this case, you start with a reduction of the hash value using the same function that was used to create the chains. The result then passes through the hash function. Repeat this until you find the hash value in one of the end points. This doesn’t give you the password you’re looking for, though. But you’ve now found the corresponding chain for the hash value. So, you now start at the beginning of the chain and carry out the reductions and hashing alternately until you reach the desired hash value and the plaintext of the password.
What do tables have to do with rainbows?
At the end of the day, you may very well ask yourself what these tables have to do with rainbows. In practice, you can use not only a reduction function, but also a different one in each step. This provides better reduction results and avoids the repetition of hash values in the table, but also has the disadvantage that finding combinations of hash values and passwords in the chain is somewhat more complex.
The reductions then must be gone through in order: if you assume that the chain was built with the reductions R1, R2, and R3, then you would start the search with function R3. If this doesn’t return any results, then start with R2 and then go to R3, and so on. Within the table, the different reduction functions can be marked with colors, which leads to a colorful rainbow with a corresponding number of iterations – and, the name.
Rainbow tables explained: an example
The best way to understand rainbow tables is to see an example of the process. But we won’t use the popular hash functions for password security for this, since they are much too complex for a simple example. Instead, we’ll use a much simpler function: multiplication.
The explanation: The entered password is k. In this case, m is any multiplier (2000 in this example). Usually a quantity of the golden ratio (0,618) is set for A. Modulo (mod) extracts the remainder of a division, performed in this case by 1. The Gaussian brackets round off the result to a whole integer at the end, if necessary. The result h(k) is then the hash value h for input k.
If you want to ty out the function in Excel yourself, you can use the BORDER function for rounding off and the REST function for modulo. So: =BORDER(REST(A1*0.618;1)*2000;1)
As possible passwords, we assume a character set with only numbers and only two places, so 00-99. This holds the table to a manageable range, and letters would first have to be translated into numerical values anyway. For the password 78 it then follows that:
Since our hash value consists of four digits, we fill out the beginning with a 0: 0408.
Password | Hash value |
---|---|
Null | Null |
01 | 1236 |
02 | 0472 |
03 | 1708 |
… | … |
78 | 0408 |
… | … |
99 | 0364 |
In a rainbow table for this hash function, reduction functions now need to be run. One very simple option for reducing the hash value is, for example, to use only the last two digits. So, in the case of the password 78 and the corresponding hash value 0408, the reduction is 08. A hash value is formed again from this with the help of the presented function, and so on.
The frequency of repetitions is up to you. The more often you run a repetition, the less storage space the rainbow table needs – but the processing time increases. In this example, we run a reduction three times.
p1 | h1 | p2 | h2 | p3 | h3 | p4 | h4 |
---|---|---|---|---|---|---|---|
Null | Null | Null | Null | Null | Null | Null | Null |
01 | 1236 | 36 | 0496 | 96 | 0656 | 56 | 1215 |
02 | 0472 | 72 | 0992 | 92 | 1712 | 12 | 0832 |
03 | 1708 | 08 | 1888 | 88 | 0768 | 68 | 0048 |
04 | 0944 | 44 | 0384 | 84 | 1824 | 24 | 1664 |
05 | 0180 | 80 | 0879 | 79 | 1644 | 44 | 0384 |
The above table shows the complete chain with the results of the hash and reduction functions. The goal of a rainbow table, though, is to shorten the range. That’s why, in the finished rainbow table, only the left and right columns of the table are included. All other values can be derived from these.
p1 | h4 |
---|---|
Null | Null |
01 | 1215 |
02 | 0832 |
03 | 0048 |
04 | 1664 |
05 | 0384 |
06 | 1260 |
07 | 0656 |
09 | 0944 |
10 | 0607 |
11 | 0539 |
13 | 0607 |
14 | 1824 |
17 | 0272 |
18 | 0651 |
19 | 1104 |
20 | 1664 |
21 | 0204 |
22 | 1552 |
25 | 0944 |
26 | 1215 |
27 | 0832 |
29 | 1664 |
30 | 0384 |
31 | 1260 |
33 | 0272 |
34 | 0944 |
37 | 0992 |
38 | 0656 |
39 | 1824 |
40 | 1440 |
41 | 0159 |
42 | 0272 |
43 | 0651 |
45 | 1824 |
46 | 0204 |
47 | Null |
49 | 0384 |
50 | Null |
53 | 0048 |
54 | 1664 |
55 | 0384 |
57 | 0656 |
58 | 1328 |
59 | 0651 |
61 | 0539 |
62 | 0992 |
63 | 0656 |
65 | 1440 |
66 | Null |
69 | 1104 |
70 | 1664 |
71 | 0204 |
73 | 1712 |
74 | 0384 |
77 | 0832 |
78 | 0048 |
81 | 1260 |
82 | 1712 |
83 | 0272 |
85 | 0428 |
86 | 1484 |
89 | 1824 |
90 | 0384 |
93 | 0700 |
94 | 1552 |
95 | 1824 |
97 | 1552 |
98 | 1036 |
99 | 0384 |
In this example, the size of the rainbow table only decreased slightly from the original table: 140 entries compared to 200. This is due to the already small range, the less complex hash and reduction functions, and the small number of reductions. This rainbow table then makes a very suitable example.
Now not all hash values are available in the table. If, for example, you knew that there was a password behind the hash value 1888, you would now search the created rainbow table and find that the value doesn’t appear in the table, but is hidden in a chain. Now you need to reduce the value, which results in 88. This value is also not part of the table, so you’ll recalculate the hash value (0768) and the reduction (68). The corresponding hash value 0048 is in the third line. But the password in the same line (03) doesn’t belong directly to the hash value, so it’s only the beginning of the chain.
It provides a good starting point for the next calculation, though: from 03 you can calculate the hash value 1708. This reduces to 08 and builds a new hash value: 1888 – the target hash value. So, the password 08 belongs to this value.
Measures against rainbow tables
Now that you understand how attackers can break into user accounts using rainbow tables, it should also be clear that you need to use suitable defense mechanisms. Both users and those responsible for the website can take measures to prevent such attacks, or at least make them more difficult.
User
For users themselves, the following generally applies: passwords should be as long as possible and contain upper-case letters, lower-case letters, numbers, and other characters – this helps against brute force attacks and rainbow tables, since it makes decryption too elaborate. The length of the password also exponentially increases the size of the required table. It’s also recommended to not use any real words, but instead use random strings of characters to protect against attacks based on dictionaries. A password manager could help with this.
It’s also very important – regardless of which form of attack is being used – to not use passwords more than once. Once someone has managed to corrupt a database, decrypt the passwords, and access personal data, it’s easy to try out the exact same password on all other online accounts.
Administrator
But server operators can also do a lot to protect users. To start with, you should obviously try to not let attackers access the databases with the hash values in the first place. But that’s easier said than done, as is proven by the large number of attacks on the servers of big companies. You’ll inevitably need to secure the hash value. That begins by not using outdated algorithms.
Both MD5 and SHA-1 have long been considered unsafe, and corresponding rainbow tables are very easy to find online. Passwords that have been hashed this way can be uncovered within seconds. This is why it’s essential to keep yourself informed as to whether there are new algorithms or how safe the used hash function still is. SHA-2 and its best-known variant SHA-256 are still considered safe, but SHA-3 is now also available, which promises even longer safety.
To make decryption using rainbow tables a bit more difficult, you can use something called salt: when a user sets a password, the system also creates a random value, the salt. This value flows together with the password into the hash function and so generates a different value than the password alone.
Salt and hash value are stored together in the database. This can be confusing: attackers who receive the contents of the database have the username, hash value, and corresponding salt. Brute force and dictionary attacks can’t be avoided, but the additional measure particularly helps against rainbow tables. Such a table is created in advance, based on a hash algorithm and independent of the database being used. The salts also can’t be contained in rainbow tables, since the creator of the table didn’t know the salt yet.
Another benefit of salt is that the user can’t make note of it. This means they can be completely chaotic and incredibly long. It makes the hash values themselves so complex that working with them is more difficult. A salt also prevents two or more people from entering the same password and so writing the same hash value into the database. Finally, salt helps with the problem that users insist on using the same password for various services. The stored hash value is different for each service, based on the salt.
In addition to salt, there’s also pepper: this complicates attacks with brute force or dictionaries. Pepper is also a random character sequence incorporated with the password in the hash value – best if combined with salt. As opposed to salt, pepper isn’t saved together with other login data in a database, but instead is separated and stored in as secure a location as possible.
Oftentimes, a set character string is used for all passwords on the platform. So, pepper doesn’t help with the fact that several users have the same password, because they also use the same pepper – which leads to identical hash values. Administrators should select a combination of salt and pepper as a result.