Design Dig - Oauth 2.0 | Pankaj Baagwan Engineering Blog

I have seen a lot of confusion among people, when term “OAuth” pops up. Some are of the opinion that it is login flow and other people opines it as a account security thing that is better than existing one.

Here; we are going to walk you through OAuth and explain its working. And hopefully, I would leave you with a better understanding of OAuth, what, how and where of it

Introduction

Putting it simple, OAuth is not a service; neither it is an API. It is just a standard to be followed for Centralized authentication mechanism.

An open protocol to allow secure authorization in a simple and standard method from web, mobile and desktop applications.
OAuth.net

More intrinsically, OAuth is a standard that apps, services, websites can use to provide their users with secure yet delegated access to their services. Oauth works over HTTPS (SSL/TLS Layer) and authorizes device(s), Service(s), API(s), Servers and Applications with a unique access token rather than credentials.

As of today; there are two variants/versions of OAuth are in use OAuth 1.0a (with security fix to 1.0) and OAuth 2.0. These are not progressive versions but rather two completely different implementation from One another, hence no backward compatibility or re-use of architecture. OAuth 2.0 is more widely used now a days since it is newer more secure version (comparatively). There are many legacy application which may still be using OAuth 1.0 or upgraded that to OAuth 1.0a. So whenever OAuth is mentioned in this article it is referring to OAuth 2.0 implicitly

Why OAuth?

OAuth was created as a response to the direct authentication pattern in which user creates an account assigning itself a username and password (referred as credentials). This way user has to create an account on any private, customisable web application out there to avail there services. It may or may not be using HTTPS (SSL/TLS) protocol for it (i.e Gmail, Github, Facebook etc). This is called password anti-pattern.

In API access, applications still use HTTP Basic Authentication mechanism, where consumer of API has to pass on credentials (APIK Key or secret) on every request made to API service provider to authenticate itself.

To implement a better system for the web, Sederated Identity was created for single sign-on (SSO). In this scenario, an end-user talks to their identity provider, and the identity provider generates a cryptographically signed token which it hands off to the application to authenticate the end-user. The application trusts the identity provider. As long as that trust relationship works with the signed assertion, you’re good to go. See following diagram

Federated identity was gained popularity by SAML 2.0, an OASIS Standard released on March 15, 2005. It’s a large spec but the main two components are its authentication request protocol (aka Web SSO) and the way it packages identity attributes and signs them, called SAML assertions. This may contain user information to identify end-user for application (application keep this for record) and that it is coming from Identity provider which has signed the token as generated

An quick Introduction to SAML

SAML is basically a session cookie in your browser that gives you access to webapps. It’s limited in the kinds of device profiles and scenarios you might want to do outside of a web browser.

When SAML 2.0 was launched in 2005, it made sense then. But, a lot has changed in internet landscapes since then. Now we have modern web and native application development platforms (mobile, pad or native desktop apps). There are Single Page Applications (SPAs) like Facebook, and Twitter. They have different behaviors than your traditional web application, because they make AJAX calls (background HTTP calls) to APIs. Mobile phones make API calls too, as do Smart TVs, gaming consoles, and IoT devices. SAML SSO isn’t particularly good at any of this.

OAuth and APIs

A lot has changed with the way we build APIs too. In 2005, people were invested in WS-* for building web services. Now a days, most applications have moved to RESTful APIs. RESTful is, in a nutshell, HTTP designed around resources. Remaining are moving to GraphQL APIs, GraphQL is nutshell is get only information that is needed for consumer application.

Developers build a lot of APIs. The API Economy is a common buzzword you might hear in boardrooms today. Companies need to protect their APIs in a way that allows many devices to access them. In the good old days, user would enter username/password and the app would login directly as user. This gave rise to the delegated authorization problem.

“How can I use application/services without creating/giving a credentials to it?”

OAuth is a delegated authorization framework for APIs. It enables apps to obtain limited access (scopes) to a user’s data without giving away a user’s credentials. It decouples authentication from authorization and supports multiple use cases addressing different device capabilities. It supports server-to-server apps, browser-based apps, mobile/native apps, and consoles/TVs.

You can think of this like hotel key cards, but for apps. If you have a hotel key card, you can get access to your room. How do you get a hotel key card? You have to do an authentication process at the front desk to get it. After authenticating and obtaining the key card, you can access resources across the hotel.

To break it down simply, OAuth is where:

App requests authorization from User
User authorizes App and delivers proof
App presents proof of authorization to server to get a Token
Token is restricted to only access what the User authorized for the specific App

OAuth Central Components

OAuth is built on the following central components:

Scopes and Consent
Actors
Clients
Tokens
Authorization Server
Flows

OAuth Scopes

Scopes are what you see on the authorization screens when an app requests permissions. They’re bundles of permissions asked for by the client when requesting a token. These are coded by the application developer when writing the application.

In screen above, one can see that application/service is only asking permission to read tweets and see who you follow.

Scopes decouple authorization policy decisions from enforcement. This is the first key aspect of OAuth. The permissions are front and center. They’re not hidden behind the app layer that you have to reverse engineer. They’re often listed in the API docs: here are the scopes that this app requires.

You have to capture this consent. This is called trusting on first use. It’s a pretty significant user experience change on the web. Most people before OAuth were just used to name and password dialog boxes. Now you have this new screen that comes up and you have to train users to use. Retraining the internet population is difficult. There are all kinds of users from the tech-savvy young folk to grandparents that aren’t familiar with this flow. It’s a new concept on the web that’s now front and center. Now you have to authorize and bring consent.

The consent can vary based on the application. It can be a time-sensitive range (day, weeks, months), but not all platforms allow you to choose the duration. One thing to watch for when you consent is that the app can do stuff on your behalf - e.g. LinkedIn spamming everyone in your network.

OAuth is an internet-scale solution because it’s per application. You often have the ability to log in to a dashboard to see what applications you’ve given access to and to revoke consent.

OAuth Actors

The actors in OAuth flows are as follows:

Resource Owner: owns the data in the resource server. For example, I’m the Resource Owner of my Facebook profile.
Resource Server: The API which stores data the application wants to access
Client: the application that wants to access your data
Authorization Server: The main engine of OAuth

The resource owner is a role that can change with different credentials. It can be an end user, but it can also be a company.

Clients can be public and confidential. There is a significant distinction between the two in OAuth nomenclature. Confidential clients can be trusted to store a secret. They’re not running on a desktop or distributed through an app store. People can’t reverse engineer them and get the secret key. They’re running in a protected area where end users can’t access them.

Public clients are browsers, mobile apps, and IoT devices.

Client registration is also a key component of OAuth. It’s like the DMV of OAuth. You need to get a license plate for your application. This is how your app’s logo shows up in an authorization dialog.

OAuth Tokens

Access tokens are the token the client uses to access the Resource Server (API). They’re meant to be short-lived. Think of them in hours and minutes, not days and month. You don’t need a confidential client to get an access token. You can get access tokens with public clients. They’re designed to optimize for internet scale problems. Because these tokens can be short lived and scale out, they can’t be revoked, you just have to wait for them to time out.

The other token is the refresh token. This is much longer-lived; days, months, years. This can be used to get new tokens. To get a refresh token, applications typically require confidential clients with authentication.

Refresh tokens can be revoked. When revoking an application’s access in a dashboard, you’re killing its refresh token. This gives you the ability to force the clients to rotate secrets. What you’re doing is you’re using your refresh token to get new access tokens and the access tokens are going over the wire to hit all the API resources. Each time you refresh your access token you get a new cryptographically signed token. Key rotation is built into the system.

The OAuth spec doesn’t define what a token is. It can be in whatever format you want. Usually though, you want these tokens to be JSON Web Tokens (a standard). In a nutshell, a JWT (pronounced “jot”) is a secure and trustworthy standard for token authentication. JWTs allow you to digitally sign information (referred to as claims) with a signature and can be verified at a later time with a secret signing key. To learn more about JWTs, see A Beginner’s Guide to JWTs in Java.

Tokens are retrieved from endpoints on the authorization server. The two main endpoints are the authorize endpoint and the token endpoint. They’re separated for different use cases. The authorize endpoint is where you go to get consent and authorization from the user. This returns an authorization grant that says the user has consented to it. Then the authorization is passed to the token endpoint. The token endpoint processes the grant and says “great, here’s your refresh token and your access token”.

You can use the access token to get access to APIs. Once it expires, you’ll have to go back to the token endpoint with the refresh token to get a new access token.

The downside is this causes a lot of developer friction. One of the biggest pain points of OAuth for developers is you having to manage the refresh tokens. You push state management onto each client developer. You get the benefits of key rotation, but you’ve just created a lot of pain for developers. That’s why developers love API keys. They can just copy/paste them, slap them in a text file, and be done with them. API keys are very convenient for the developer, but very bad for security.

There’s a pay to play problem here. Getting developers to do OAuth flows increases security, but there’s more friction. There are opportunities for toolkits and platforms to simplify things and help with token management. Luckily, OAuth is pretty mature these days, and chances are your favorite language or framework has tools available to simplify things.

We’ve talked a bit about the client types, the token types, and the endpoints of the authorization server and how we can pass that to a resource server. I mentioned two different flows: getting the authorization and getting the tokens. Those don’t have to happen on the same channel. The front channel is what goes over the browser. The browser redirected the user to the authorization server, the user gave consent. This happens on the user’s browser. Once the user takes that authorization grant and hands that to the application, the client application no longer needs to use the browser to complete the OAuth flow to get the tokens.

The tokens are meant to be consumed by the client application so it can access resources on your behalf. We call that the back channel. The back channel is an HTTP call directly from the client application to the resource server to exchange the authorization grant for tokens. These channels are used for different flows depending on what device capabilities you have.

For example, a Front Channel Flow where you authorize via user agent might look as follows:

Resource Owner starts flow to delegate access to protected resource
Client sends authorization request with desired scopes via browser redirect to the Authorize Endpoint on the Authorization Server
Authorization Server returns a consent dialog saying “do you allow this application to have access to these scopes?” Of course, you’ll need to authenticate to the application, so if you’re not authenticated to your Resource Server, it’ll ask you to login. If you already have a cached session cookie, you’ll just see the consent dialog box. View the consent dialog, and agree.
The authorization grant is passed back to the application via browser redirect. This all happens on the front channel.

There’s also a variance in this flow called the implicit flow. We’ll get to that in a minute.

This is what it looks like on the wire.

Request

GET https://accounts.google.com/o/oauth2/auth?scope=gmail.insert gmail.send
&redirect_uri=https://app.example.com/oauth2/callback
&response_type=code&client_id=812741506391
&state=af0ifjsldkj

This is a GET request with a bunch of query params (not URL-encoded for example purposes). Scopes are from Gmail’s API. The redirect_uri is the URL of the client application that the authorization grant should be returned to. This should match the value from the client registration process (at the DMV). You don’t want the authorization being bounced back to a foreign application. Response type varies the OAuth flows. Client ID is also from the registration process. State is a security flag, similar to XRSF. To learn more about XRSF, see DZone’s “Cross-Site Request Forgery explained”.

Response

HTTP/1.1 302 Found
Location: https://app.example.com/oauth2/callback?
code=MsCeLvIaQm6bTrgtp7&state=af0ifjsldkj

The code returned is the authorization grant and state is to ensure it’s not forged and it’s from the same request.

After the Front Channel is done, a Back Channel Flow happens, exchanging the authorization code for an access token.

The Client application sends an access token request to the token endpoint on the Authorization Server with confidential client credentials and client id. This process exchanges an Authorization Code Grant for an Access Token and (optionally) a Refresh Token. Client accesses a protected resource with Access Token.

Below is how this looks in raw HTTP.

Request

POST /oauth2/v3/token HTTP/1.1
Host: www.googleapis.com
Content-Type: application/x-www-form-urlencoded

code=MsCeLvIaQm6bTrgtp7&client_id=812741506391&client_secret={client_secret}&redirect_uri=https://app.example.com/oauth2/callback&grant_type=authorization_code

The grant_type is the extensibility part of OAuth. It’s an authorization code from a precomputed perspective. It opens up the flexibility to have different ways to describe these grants. This is the most common type of OAuth flow.

Response

{
  "access_token": "2YotnFZFEjr1zCsicMWpAA",
  "token_type": "Bearer",
  "expires_in": 3600,
  "refresh_token": "tGzv3JOkF0XG5Qx2TlKWIA"
}

The response is JSON. You can be reactive or proactive in using tokens. Proactive is to have a timer in your client. Reactive is to catch an error and attempt to get a new token then.

Once you have an access token, you can use the access token in an Authentication header (using the token_type as a prefix) to make protected resource requests.

curl -H "Authorization: Bearer 2YotnFZFEjr1zCsicMWpAA" \
  https://www.googleapis.com/gmail/v1/users/1444587525/messages

So now you have a front channel, a back channel, different endpoints, and different clients. You have to mix and match these for different use cases. This up-levels the complexity of OAuth and it can get confusing.

OAuth Flows

The very first flow is what we call the Implicit Flow. The reason it’s called the implicit flow is because all the communication is happening through the browser. There is no backend server redeeming the authorization grant for an access token. An SPA is a good example of this flow’s use case. This flow is also called 2 Legged OAuth.

Implicit flow is optimized for browser-only public clients. An access token is returned directly from the authorization request (front channel only). It typically does not support refresh tokens. It assumes the Resource Owner and Public Client are on the same device. Since everything happens on the browser, it’s the most vulnerable to security threats.

The gold standard is the Authorization Code Flow, aka 3 Legged, that uses both the front channel and the back channel. This is what we’ve been talking about the most in this article. The front channel flow is used by the client application to obtain an authorization code grant. The back channel is used by the client application to exchange the authorization code grant for an access token (and optionally a refresh token). It assumes the Resource Owner and Client Application are on separate devices. It’s the most secure flow because you can authenticate the client to redeem the authorization grant, and tokens are never passed through a user-agent. There’s not just Implicit and Authorization Code flows, there are additional flows you can do with OAuth. Again, OAuth is more of a framework.

For server-to-server scenarios, you might want to use a Client Credential Flow. In this scenario, the client application is a confidential client that’s acting on its own, not on behalf of the user. It’s more of a service account type of scenario. All you need is the client’s credentials to do the whole flow. It’s a back channel only flow to obtain an access token using the client’s credentials. It supports shared secrets or assertions as client credentials signed with either symmetric or asymmetric keys.

Symmetric-key algorithms are cryptographic algorithms that allow you to decrypt anything, as long as you have the password. This is often found when securing PDFs or .zip files.

Public key cryptography, or asymmetric cryptography, is any cryptographic system that uses pairs of keys: public keys and private keys. Public keys can be read by anyone, private keys are sacred to the owner. This allows data to be secure without the need to share a password.

There’s also a legacy mode called Resource Owner Password Flow. This is very similar to the direct authentication with username and password scenario and is not recommended. It’s a legacy grant type for native username/password apps such as desktop applications. In this flow, you send the client application a username and password and it returns an access token from the Authorization Server. It typically does not support refresh tokens and it assumes the Resource Owner and Public Client are on the same device. For when you have an API that only wants to speak OAuth, but you have old-school clients to deal with.

A more recent addition to OAuth is the Assertion Flow, which is similar to the client credential flow. This was added to open up the idea of federation. This flow allows an Authorization Server to trust authorization grants from third parties such as SAML IdP. The Authorization Server trusts the Identity Provider. The assertion is used to obtain an access token from the token endpoint. This is great for companies that have invested in SAML or SAML-related technologies and allow them to integrate with OAuth. Because SAML assertions are short-lived, there are no refresh tokens in this flow and you have to keep retrieving access tokens every time the assertion expires.

Not in the OAuth spec, is a Device Flow. There’s no web browser, just a controller for something like a TV. A user code is returned from an authorization request that must be redeemed by visiting a URL on a device with a browser to authorize. A back channel flow is used by the client application to poll for authorization approval for an access token and optionally a refresh token. Also popular for CLI clients.

We’ve covered six different flows using the different actors and token types. They’re necessary because of the capabilities of the clients, how we needed to get consent from the client, who is making consent, and that adds a lot of complexity to OAuth.

When people ask if you support OAuth, you have to clarify what they’re asking for. Are they asking if you support all six flows, or just the main ones? There’s a lot of granularity available between all the different flows.

Security and the Enterprise

There’s a large surface area with OAuth. With Implicit Flow, there’s lots of redirects and lots of room for errors. There’s been a lot of people trying to exploit OAuth between applications and it’s easy to do if you don’t follow recommended Web Security 101 guidelines. For example:

Always use CSRF token with the state parameter to ensure flow integrity
Always whitelist redirect URIs to ensure proper URI validations
Bind the same client to authorization grants and token requests with a client ID
For confidential clients, make sure the client secrets aren’t leaked. Don’t put a client secret in your app that’s distributed through an App Store!
The biggest complaint about OAuth in general comes from Security people. It’s regarding the Bearer tokens and that they can be passed just like session cookies. You can pass it around and you’re good to go, it’s not cryptographically bound to the user. Using JWTs helps because they can’t be tampered with. However, in the end, a JWT is just a string of characters so they can easily be copied and used in an Authorization header.

OAuth is not an Authentication Protocol

To summarize some of the misconceptions of OAuth 2.0: it’s not backwards compatible with OAuth 1.0. It replaces signatures with HTTPS for all communication. When people talk about OAuth today, they’re talking about OAuth 2.0.

Because OAuth is an authorization framework and not a protocol, you may have interoperability issues. There are lots of variances in how teams implement OAuth and you might need custom code to integrate with vendors.

OAuth 2.0 is not an authentication protocol. Event its official article says so.

We’ve been talking about delegated authorization this whole time. It’s not about authenticating the user, and this is key. OAuth 2.0 alone says absolutely nothing about the user. You just have a token to get access to a resource.

There’s a huge number of additions that’ve happened to OAuth in the last several years. These add complexity back on top of OAuth to complete a variety of enterprise scenarios. For example, JWTs can be used as interoperable tokens that can be signed and encrypted.

Pseudo-Authentication

Login with OAuth was made famous by Facebook Connect and Twitter. In this flow, a client accesses a /me endpoint with an access token. All it says is that the client has access to the resource with a token. People invented this fake endpoint as a way of getting back a user profile with an access token. It’s a non-standard way to get information about the user. There’s nothing in the standards that say everyone has to implement this endpoint. Access tokens are meant to be opaque. They’re meant for the API, they’re not designed to contain user information.

What you’re really trying to answer with authentication is who the user is, when did the user authenticate, and how did the user authenticate. You can typically answer these questions with SAML assertions, not with access tokens and authorization grants. That’s why we call this pseudo authentication.

Stay tuned <3, Signing off for RAAM

Credits

Image Credits:

About The Author

I am Pankaj Baagwan, a System Design Architect. A Computer Scientist by heart, process enthusiast, and open source author/contributor/writer. Advocates Karma. Love working with cutting edge, fascinating, open source technologies.

To consult Pankaj Bagwan on System Design, Cyber Security and Application Development, SEO and SMO, please reach out at me[at]bagwanpankaj[dot]com
For promotion/advertisement of your services and products on this blog, please reach out at me[at]bagwanpankaj[dot]com

Stay tuned <3. Signing off for RAAM

Design Dig - Oauth 2.0

Categories

Tags