Verifying session tokens without storing them

Yegor Sak

Most multi-user systems rely on session tokens. When you log in to a system with your username and password, you are given a token that grants you access to resources based on your permission level. This is usually handled behind the scenes by the website or application, which attaches the session token to every subsequent request it makes to a central API to fetch the resources it needs (account status, server locations, notifications, etc.).

Once a session token is created, it’s usually stored in a central database operated by the service you’re using. Every request you make attaches this token, which is then checked against the database to make sure it’s valid and hasn’t expired. This is a simple system that works well, but it comes at a cost: every single login has to be recorded in a database. The operator can then refer to this database and see exactly how many times you logged in, when you did it, and any other information attached to the token. For most services this makes complete sense, but given the nature of the service that Windscribe provides, we aim to do better. Our goal is to provide the best privacy solution we can, which means we want to store as little information on you as possible, so we came up with an alternative solution. This next bit is a little more technical, but should be pretty easy to follow.


A session token has to have 3 qualities:

  1. Uniquely identify your account
  2. Be really hard to guess
  3. Be revocable

Point #2 is why session/access tokens are usually fairly long, and consist of random letters and numbers, like so:

32b226a4961e95687b5004c5b080a84c6a07e144

This token would be stored in a database, along with your user ID. Any request that supplies this token is assumed to have originated from the user whose ID is attached to the token. If this token is deleted, you no longer have access to the service and will usually have to log in again.
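For comparison with what follows, the conventional stored-token approach can be sketched roughly as below. This is a minimal illustration, not any particular service's implementation: `session_table` is a hypothetical in-memory stand-in for the database, and the function names are made up for this example.

```python
import secrets

# Hypothetical in-memory stand-in for the session table: token -> user ID.
session_table = {}

def create_session(user_id):
    # 20 random bytes rendered as 40 hex characters, like the token shown above.
    token = secrets.token_hex(20)
    session_table[token] = user_id
    return token

def lookup_session(token):
    # Returns the user ID, or None if the token is unknown or was deleted.
    return session_table.get(token)

def revoke_session(token):
    # Deleting the row is what logs the user out.
    session_table.pop(token, None)
```

Every login adds a row to `session_table`, which is exactly the record this article is trying to avoid keeping.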

We wanted to design a system which has the same 3 qualities, but without actually storing any session tokens in a database. This also has a benefit for the operator, since the infrastructure to store session tokens is no longer needed. The solution is surprisingly simple and it involves digital signatures.

The following is a simplified example for illustration purposes only.

Let’s say you made an account on Windscribe with the username “garry”. We run a hash function over your username. In the example below we use SHA1, but you probably want to use something like SHA512 or BCrypt if the stakes are high. It returns a cryptographic hash (which we’ll call a signature) that looks like this:

sha1(garry) = 0ab32e193e58c743deddac0c7187002d0e6744bd

We can then append it to the username to create a simple session token:

garry-0ab32e193e58c743deddac0c7187002d0e6744bd

If we just did that, this would be a pretty terrible system, since it has only 1 of the 3 qualities we want. It wouldn’t take a lot of trial and error to figure out that you just have to run SHA1 over the username to get the signature, and there is no way to invalidate this session.
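To make the weakness concrete, here is a small sketch of that naive scheme using Python's standard `hashlib` module. The function name `naive_token` is invented for this illustration.

```python
import hashlib

def naive_token(username):
    # Unkeyed hash of the username: anyone can repeat this computation.
    digest = hashlib.sha1(username.encode("utf-8")).hexdigest()
    return f"{username}-{digest}"

# An attacker needs nothing but the username to produce a valid-looking token.
forged = naive_token("garry")
```

Because no secret is involved, a forged token is indistinguishable from a real one.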

We have to introduce a secret into the equation. A secret is exactly what it sounds like: a secret word, number, string of emojis, or a combination of all 3 that is extremely hard to guess. Think of it as a really secure password that is only known to the operator of the service. Let’s pretend our secret is “hunter2”, but in reality it’s going to be a lot longer and a lot more random than that. So now our signature generation algorithm would look like this:

sha1(garry . hunter2) = 815b84de64aa1b79508eff62e53e41d7b87fcf88

So your new session token would look like this:

garry-815b84de64aa1b79508eff62e53e41d7b87fcf88

As you can see, the signature looks completely different, and it would be significantly harder to guess it and access the “garry” account: you would have to know or guess the secret, which is infeasible if the secret is long and random enough.
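A rough sketch of this single-secret scheme follows. It mirrors the article's `sha1(username . secret)` notation literally; the names `sign`, `make_token` and `verify_token` are assumptions for this example, and “hunter2” is only the placeholder secret from the text. A production system would more likely use a keyed construction such as HMAC (available in Python's `hmac` module) with a stronger hash, rather than hashing a plain concatenation.

```python
import hashlib
import hmac

SERVER_SECRET = "hunter2"  # placeholder from the article; use a long random value

def sign(username, secret):
    # Mirrors sha1(username . secret): plain concatenation, then SHA-1.
    return hashlib.sha1((username + secret).encode("utf-8")).hexdigest()

def make_token(username):
    return f"{username}-{sign(username, SERVER_SECRET)}"

def verify_token(token):
    # Split on the last hyphen so usernames containing "-" still parse.
    username, sep, signature = token.rpartition("-")
    if not sep:
        return None
    expected = sign(username, SERVER_SECRET)
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(signature.encode(), expected.encode()):
        return username
    return None
```

Verification recomputes the signature from the username in the token and the server's secret, so no per-session record is needed.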

This however suffers from 2 problems:

  1. You’re using the same secret for all users. In the unlikely event of you cracking it, you would be able to access the information of all users.
  2. You cannot invalidate the token. If someone gets this token, they can permanently access your account and there is nothing you can do about it other than delete the account and make a new one.

To fix these problems, we have to add a 2nd signature, which uses a unique secret specific to each user. When you sign up, we generate a long, random, unique secret on your behalf, and store it in the database along with your account data (username, password, email, etc). Let’s pretend the secret for the “garry” user is “totallyasecret”.

So when a session token is generated, we perform 2 hash functions:

sha1(garry . hunter2) = 815b84de64aa1b79508eff62e53e41d7b87fcf88
sha1(garry . totallyasecret) = 5d6815ef456c8e48f69db720982b87a68d9a70cb

Then we concatenate it all together to create a session token:

garry-815b84de64aa1b79508eff62e53e41d7b87fcf88-5d6815ef456c8e48f69db720982b87a68d9a70cb

When the above session token is supplied via an API call, the server will perform the same hash functions it did to generate the session token when you logged in, and compare the signatures. If they match, it means that this is a legitimate session for “garry”, and access can be granted to whatever resources that (s)he has access to.
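Issuing and checking a two-signature token could look roughly like this. It is a simplified sketch in the same spirit as the example above, not Windscribe's actual code: `USER_SECRETS` is a hypothetical stand-in for the accounts table, the function names are invented, and the secrets are the placeholders from the text.

```python
import hashlib
import hmac

SERVER_SECRET = "hunter2"                   # global secret (placeholder)
USER_SECRETS = {"garry": "totallyasecret"}  # stand-in for the accounts table

def sign(username, secret):
    # Mirrors sha1(username . secret) from the text.
    return hashlib.sha1((username + secret).encode("utf-8")).hexdigest()

def issue_token(username):
    sig1 = sign(username, SERVER_SECRET)
    sig2 = sign(username, USER_SECRETS[username])
    return f"{username}-{sig1}-{sig2}"

def verify_token(token):
    # Split from the right so the two hex signatures come off cleanly.
    parts = token.rsplit("-", 2)
    if len(parts) != 3:
        return None
    username, sig1, sig2 = parts
    user_secret = USER_SECRETS.get(username)
    if user_secret is None:
        return None
    ok1 = hmac.compare_digest(sig1.encode(), sign(username, SERVER_SECRET).encode())
    ok2 = hmac.compare_digest(sig2.encode(), sign(username, user_secret).encode())
    return username if (ok1 and ok2) else None
```

Note that verification reads the per-user secret from the account record, which already exists, but never reads or writes a per-session row.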

The 2nd signature solves the above 2 problems because:

  1. The attacker would have to brute force 2 signatures for each user in order to gain access.
  2. The user specific secret can be changed, which would effectively invalidate all previously created sessions for that specific account.
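Revocation by rotating the user-specific secret can be sketched as follows. As before, `USER_SECRETS` is a hypothetical stand-in for the accounts table, and `revoke_all_sessions` is a name invented for this illustration.

```python
import hashlib
import secrets

USER_SECRETS = {"garry": "totallyasecret"}  # stand-in for the accounts table

def user_signature(username):
    # Second signature from the text: sha1(username . user_secret).
    data = username + USER_SECRETS[username]
    return hashlib.sha1(data.encode("utf-8")).hexdigest()

def revoke_all_sessions(username):
    # Replacing the secret changes the expected signature for this user only.
    USER_SECRETS[username] = secrets.token_hex(32)

old_sig = user_signature("garry")
revoke_all_sessions("garry")
new_sig = user_signature("garry")
# Tokens that carry old_sig no longer match what the server recomputes.
```

Other users are unaffected, since their secrets and therefore their signatures do not change.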

We can get even fancier and include other data in the session token, like the session creation timestamp, which would also be included in the hash function input. Then you can automatically expire session tokens based on TTLs without any external database look-ups.
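One way the timestamp idea might look is sketched below. The token layout, the `|` separator inside the hash input, the one-hour `TTL_SECONDS`, and the optional `now` parameter (used to make the example testable) are all assumptions made for this illustration.

```python
import hashlib
import hmac
import time

SERVER_SECRET = "hunter2"                   # placeholder secrets from the text
USER_SECRETS = {"garry": "totallyasecret"}  # stand-in for the accounts table
TTL_SECONDS = 3600                          # hypothetical session lifetime

def sign(username, issued_at, secret):
    # The timestamp is part of the signed data, so it cannot be altered.
    data = f"{username}|{issued_at}|{secret}"
    return hashlib.sha1(data.encode("utf-8")).hexdigest()

def issue_token(username, now=None):
    issued_at = str(int(time.time() if now is None else now))
    sig1 = sign(username, issued_at, SERVER_SECRET)
    sig2 = sign(username, issued_at, USER_SECRETS[username])
    return f"{username}-{issued_at}-{sig1}-{sig2}"

def verify_token(token, now=None):
    parts = token.rsplit("-", 3)
    if len(parts) != 4:
        return None
    username, issued_at, sig1, sig2 = parts
    user_secret = USER_SECRETS.get(username)
    if user_secret is None:
        return None
    ok1 = hmac.compare_digest(sig1.encode(), sign(username, issued_at, SERVER_SECRET).encode())
    ok2 = hmac.compare_digest(sig2.encode(), sign(username, issued_at, user_secret).encode())
    if not (ok1 and ok2):
        return None
    # Signatures are valid, so issued_at is the value the server wrote.
    current = time.time() if now is None else now
    if current - int(issued_at) > TTL_SECONDS:
        return None
    return username
```

Because the timestamp is covered by both signatures, a client cannot extend its own session by editing the timestamp in the token.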

We’re already testing this system on our network. If you log in right now, you will get a procedurally generated session token that we don’t have to store, which also gives you greater privacy while using Windscribe.

