But this is an OAuth, is it not?

This is the talk that I presented at the PyCon Germany / PyData Berlin conference in April 2022. While I will try not to mention any companies by name, all the examples in this talk are either something that happened to me or something that I personally witnessed.

A disclaimer is needed. I have never developed an OAuth API myself. All my experience comes from consuming dozens of OAuth APIs so far. So while I do have experience with what I am talking about, I might be missing some pieces, since I have not experienced all aspects of it.

Why OAuth

Let us first quickly go through why OAuth even exists. In the not so distant past, if a service wanted to use the data from another service, it would require the username and password for that service. And let's be clear here, this is sometimes still the only way to get the data from some services.

But there is a big problem with giving somebody your password. This person can then do anything to your account, as if they were you. Not the best situation from the security perspective, especially for an account that you depend on.

OAuth was one of the solutions for this. It is user friendly, so it does not demand a lot of technical knowledge or browsing through settings. It also has a common way to limit the permissions to just the ones that the app needs.

What is an OAuth

OAuth is a system where the authentication is outsourced to a service dedicated to authentication. This means that the permission settings, login security and so on can be controlled in just one place. In exchange, the API does not need to know who the person using the API is. It just needs to know whether the person accessing the API has permission to access the requested data.

In a way, it works like a key card in a hotel or like the name tag at this conference. It does not matter what is written on it, but if you do not have it, you will not be allowed in. I experienced this myself, since the hole in my name tag broke and I could not wear it around my neck. I could exchange the old one for a new one, though - which reminded me of the OAuth refresh tokens.

OAuth components

OAuth has a couple of main components. Here are the four components that need to be kept in mind as a developer consuming OAuth APIs. Below, I am going to describe the most frequent form of each. There are cases where some of them are missing or where there are some additional ones.

The first is the OAuth application. This is the part where one usually identifies their application with a name and, in some cases, with a logo and additional information. The other important piece of information set here is the callback URL. Sometimes there is an additional process involved, ranging from testing the code to documentation, marketing materials and more. In exchange, we then get the client ID and client secret, which are used in the code.

The second is the authorization screen. This is the screen where people check the required permissions and decide whether they want to grant them. It is usually reached through a URL that combines the client ID, permissions, callback URL and state. Some of them have additional elements as well.
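As a minimal sketch of how such a URL is usually put together: the endpoint, client ID, callback URL and scope names below are made up for illustration, and the parameter names follow the common authorization code grant conventions, which a specific API may deviate from.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorize_endpoint, client_id, callback_url, scopes, state=None):
    """Return (url, state) for sending a user to the authorization screen."""
    # A random state lets the callback check that the response belongs to this request.
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",       # authorization code grant
        "client_id": client_id,
        "redirect_uri": callback_url,  # has to match the callback URL of the OAuth application
        "scope": " ".join(scopes),     # assumption: space-separated scopes
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


# Hypothetical values, for illustration only.
url, state = build_authorization_url(
    "https://auth.example.com/authorize",
    client_id="my-client-id",
    callback_url="https://app.example.com/oauth/callback",
    scopes=["read:metrics", "read:profile"],
    state="fixed-state-for-demo",
)
params = parse_qs(urlparse(url).query)
print(url)
```

The state value has to be stored somewhere (for example in the user session), so it can be compared once the user comes back.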

The third one is the code dealing with tokens. No matter which flow is used, a token for accessing the API is returned. The most frequent way is that the authorization code is sent to the callback URL, and the code there exchanges it for an access token and a refresh token. The access token is then used to access the API, while the refresh token is used to get new access and refresh tokens.
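Both token calls tend to be plain form-encoded POST requests to a token endpoint. The sketch below only builds the requests and does not send them. The endpoint and credentials are placeholders, and sending the client ID and secret in the form body is an assumption, since some APIs want them in an HTTP Basic header instead.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request


def build_token_request(token_endpoint, client_id, client_secret, *,
                        code=None, callback_url=None, refresh_token=None):
    """Build the POST request for either the code exchange or the refresh flow."""
    if code is not None:
        form = {
            "grant_type": "authorization_code",
            "code": code,
            "redirect_uri": callback_url,
        }
    elif refresh_token is not None:
        form = {"grant_type": "refresh_token", "refresh_token": refresh_token}
    else:
        raise ValueError("either code or refresh_token is required")
    form["client_id"] = client_id
    form["client_secret"] = client_secret
    return Request(
        token_endpoint,
        data=urlencode(form).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical values, for illustration only.
exchange = build_token_request(
    "https://auth.example.com/token", "my-client-id", "my-client-secret",
    code="code-from-callback", callback_url="https://app.example.com/oauth/callback",
)
refresh = build_token_request(
    "https://auth.example.com/token", "my-client-id", "my-client-secret",
    refresh_token="stored-refresh-token",
)
exchange_body = parse_qs(exchange.data.decode("ascii"))
refresh_body = parse_qs(refresh.data.decode("ascii"))
print(exchange_body["grant_type"], refresh_body["grant_type"])
```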

The fourth component is the code used to access the API.
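In most APIs I have seen, this comes down to sending the access token in the Authorization header. A tiny sketch, assuming a bearer token and a made-up endpoint, and again only building the request without sending it:

```python
from urllib.request import Request


def build_api_request(url, access_token):
    """Attach the access token as a bearer token to an API request."""
    return Request(url, headers={
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/json",
    })


# Hypothetical endpoint and token, for illustration only.
api_request = build_api_request("https://api.example.com/v1/metrics", "access-token-123")
print(api_request.get_header("Authorization"))
```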

These are all the components one needs to keep in mind when connecting to OAuth APIs.

Let's now move the focus to the developer experience

So now that we have a basic understanding of how the pieces fit together, let us move to the possible problems that one can have when consuming OAuth APIs. All of these were collected from personal experience.

App creation

Use an account under your control

We will start with the possible problems with the application creation.

One of the first things to keep in mind is to always use contact details that are under your control. Not just now, but also in the future. Let me explain this with a couple of examples.

In the first case, we had OAuth applications connected to the email of a specific person. That person then decided to leave the company. We were discussing the handover of responsibilities. Since that person did not mind us getting access to his work email, we left most of the OAuth applications as they were.

Well, first it took a while for the email redirect to be established. Then, in quite a few cases, the password reset did not work for some reason. I suspect that in at least one case the problem was a license that expired in the meantime. And since we only had a redirect of the email, we could not send emails to support from his address.

In the end, we needed to create new OAuth applications for these.

A similar case happened when we used the account of a person outside of our team. At the time, I could not find a way to create the application without having a license for the product. But some people outside of development at the company had a license. So problem solved, just use one of their accounts to create the OAuth application. Except, once they let the license expire a couple of years later, we also lost the OAuth application. When they checked again, there was no trace of it in the account.

By that time, the service had added the option of development accounts, where you could create an OAuth application for other people without having a license. So while we lost the old one, at least we had an easier time creating a new one.

The third one happened when our email accounts lost the ability to receive and send emails. About a year ago, we were acquired by a German company. But while we got new email accounts, we still kept the old ones as well. We were even aware that we needed to move all the OAuth applications to the new domain eventually. So we kept looking, asking and checking how much time we still had.

Until, without any advance warning, the old addresses could no longer receive emails. So if any of these services detected something as suspicious activity, we would lose access. Which happened. So we had to create new OAuth applications for some of the services.

But there was a case that was more baffling than this. I was once asked if it would be alright to use one of our customers' accounts to create the OAuth application that we would use. We did have their username and password, since we used to have a scraping integration implemented. This OAuth application would then be used to connect the credentials of other customers.

I needed some time to understand what I was reading.

I treated this as a miscommunication on my side and explained what I really wanted. We did not go further with this. But never, ever, ever do something like that.

So don't be like that, and make sure you are using contact details and credentials that are under your control now and in the future.

Screw the branding

Connected to the point above, we were also 'asked' to create new versions of all OAuth applications after the acquisition. The reason was to differentiate between the old company name and the new company name. They also wanted us to do this when they changed the name of the product. Well, I do have to admit that we only did it in the first case, but not the second. So there could still be some remains of the old product name on some OAuth permission screens.

Approval process

The next point is about the approval process. In some cases, the services will require some sort of approval process, which can range from completely reasonable and quick, to opaque and frustrating. Let me tell some of the stories, as I experienced them.

From the developer perspective, the best approval process is probably no process. But if this is not possible, then the second best is the one where they let you check your code. In these cases, they give you access to a testing environment. There, they judge the correctness of the code based on the HTTP calls that they receive, and they only approve you once this reaches some level. For example, when all the requested permissions are used and the percentage of failing HTTP calls is below some threshold.

In some cases, the approval is connected to the documentation we provide about the integration. I remember one approval process where they gave us back pages of comments on what they wanted to see in our documentation. There were so many points that we ended up building the integration for a different credentials type. I recently wanted to find this report, since based on my memory, we could now cover the points more easily. But I could not find it.

There was also another case where they wanted to talk to us first. They brought up a lot of valid points about data quality that I agreed with. I had also noticed the same ones that we talked about. But then, during the research at the company, I realized that at the time we did not communicate these problems to the customers at all.

This made me finish another documentation project first, and now I can slowly move back to this integration.

The same can be said for the marketing materials. We once had an approval process where one of their complaints was the use of the old logo on the webpage. So we changed it to the new one. But since, like every responsible citizen, we implement caching, the person on the other side could not see the change.

In the end, it was easier to break caching on our end for one picture than to teach this person about the forced refresh.

But all of these are processes that I could understand. I do remember one approval process that took over half a year, and I still shudder when I remember it. It was a process with a really bad feedback loop, requiring the privacy policy in a specific place on the website - where they could not find it for a while - domain verification, a video explaining the permissions and the data collected, and probably some more that I have forgotten.

There were also some, where we gave up on the approval process in the beginning stages. Some of the processes are just... developer unfriendly, if I need to be polite.

Connection phase

Permissions

The next thing is the permissions that we are requesting from the user.

From a strictly developer perspective, all is well if three things hold. The documentation clearly delineates which permissions are needed for which call. There are no conditional permissions, where the call works but only shows part of the data unless other conditions or permissions are met, or if there are, they are described as well. And the permissions stay stable, so old permissions keep working for years.

If you think that this should be the basics, that just means that the basics are not always met. For each of the points above, I know an example of an OAuth API that breaks it.

Since the APIs are not always clear about this in their documentation, the universal way to check for permissions is to call the endpoint that you want to use. You cannot depend on getting back the expected HTTP status code, but you will only get back the expected data if the permissions check out.
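A rough sketch of such a check, which deliberately ignores the status code and only looks at whether the expected data came back. The `fetch` callable, the field names and the fake responses are all hypothetical stand-ins for a real API call.

```python
def has_permission(fetch, required_fields):
    """Probe an endpoint and report whether the expected data came back.

    `fetch` is any callable returning (status_code, parsed_json_body).
    The status code is ignored on purpose, since it cannot be relied on.
    """
    try:
        _status, payload = fetch()
    except Exception:
        # Network errors, invalid JSON and so on all count as "not usable".
        return False
    if not isinstance(payload, dict):
        return False
    return all(payload.get(field) is not None for field in required_fields)


# Fake responses, for illustration only.
def granted():
    return 200, {"followers": 10, "views": 5}


def denied_but_still_200():
    return 200, {"error": "insufficient_scope"}


def partial_data():
    return 200, {"followers": 10, "views": None}


print(has_permission(granted, ["followers", "views"]))
print(has_permission(denied_but_still_200, ["followers", "views"]))
```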

The more problematic part is when you need to explain to other people why some permissions are needed. For some reason, people have their own ideas of what data should be accessible with each permission. And they generally do not care about the technical details that make their idea inaccurate. Not that we have any influence on it. The people writing the OAuth APIs are the ones that decide this. Even they probably have somebody above them who influences these decisions.

Callback URLs

Let us now spend a bit of time on the callback URL part of OAuth. The callback URL is the URL where we get back the authorization code. This is the part of the OAuth flow that lowers the ability of a third party to impersonate the connection and get the credentials.
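On our side, the code behind the callback URL mostly has to pull the authorization code out of the query string and check that the state matches the one we sent. A small sketch, assuming the usual `code`, `state` and `error` query parameters and a made-up callback URL:

```python
import hmac
from urllib.parse import parse_qs, urlparse


def extract_authorization_code(callback_request_url, expected_state):
    """Return the authorization code from a callback request, or raise ValueError."""
    params = parse_qs(urlparse(callback_request_url).query)
    if "error" in params:
        raise ValueError(f"authorization failed: {params['error'][0]}")
    returned_state = params.get("state", [""])[0]
    # Constant-time comparison of the state we sent with the state we got back.
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch, ignoring this callback")
    code = params.get("code", [""])[0]
    if not code:
        raise ValueError("no authorization code in callback")
    return code


# Hypothetical callback request, for illustration only.
callback = "https://app.example.com/oauth/callback?code=abc123&state=expected-state"
code = extract_authorization_code(callback, "expected-state")
print(code)
```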

For some reasons, either to separate the testing and production environments, or because the product is targeting enterprises and needs to be present in different regions, you will sometimes want to have multiple callback URLs. But only some services allow multiple callback URLs to be present on the same OAuth application.

If they do not, this means keeping track of different OAuth applications with their respective callback URLs on both sides. Which leads me to the next case.
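Keeping track of several OAuth applications for the same API could look roughly like the sketch below. The provider name, environments and values are placeholders, and in real code the secrets would come from a secret store rather than from source code. The point is only that the key includes the environment, so the entries are never treated as copies of each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OAuthApp:
    client_id: str
    client_secret: str
    callback_url: str


# Placeholder values, for illustration only.
OAUTH_APPS = {
    ("example-api", "staging"): OAuthApp(
        "staging-client-id", "staging-client-secret",
        "https://staging.example.com/oauth/callback",
    ),
    ("example-api", "production"): OAuthApp(
        "production-client-id", "production-client-secret",
        "https://app.example.com/oauth/callback",
    ),
}


def get_oauth_app(provider, environment):
    """Look up the OAuth application registered for one provider and environment."""
    try:
        return OAUTH_APPS[(provider, environment)]
    except KeyError:
        raise LookupError(f"no OAuth application for {provider} in {environment}") from None


print(get_oauth_app("example-api", "staging").callback_url)
```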

Saving client IDs and client secrets

So, because of the way we approached our architecture at the time, there was a period when we had to keep track of different OAuth applications, because we required different callback URLs.

Because of the same architectural reason, there were also other pieces of information that were duplicated and saved in multiple separate places. In order to save us work in the future, they decided that one of them is the truth and the rest are copies that can be overwritten at any time. Including some of the data that was not an exact copy, like our different client IDs and client secrets for the same OAuth APIs with different callback URLs.

I saw their decision, the feedback phase and the implementation of it once I returned from my vacation. That was an unwanted surprise, since it meant we had to spend time during the next couple of weeks fixing this. That was not a fun bug to fix, especially since some OAuth APIs require the client secret for the refresh flow, but do not allow you to copy it from the OAuth application later.

Luckily, at the time, we were still using dumps from the production database in the local environment, so everything could eventually be restored.

We have since changed the architecture a bit, so that it allows us to use just one callback URL per OAuth API, so hopefully this one will not be repeated.

Coding

OAuth has different flows

Now we are finally getting to the part where we are writing actual code for each OAuth integration. Which is what most people imagine when they think about OAuth.

This was also the point that convinced me to start researching for this talk, and it is where the talk got its name.

Up until now, we have assumed that OAuth is actually OAuth with the authorization code grant type. But OAuth 2.0 also supports other grant types: implicit, client credentials (client ID and client secret), and also password (username and password).

OAuth 2.1, which I think is still in development (?), removes some of them and adds some. So the password and implicit flows are no longer included. Some new ones were added, like the device grant. But there are still multiple grants, no matter which version of OAuth we are looking at.

So this also means that APIs using any other grant type can declare themselves an OAuth API. Even if they are misusing the grant.

Let me give you an example. There was one API where, at the top of their public Swagger documentation, they declared themselves an OAuth API. But looking at their documentation and API calls, what they were using was the client credentials grant type - so they were asking the user to use their client ID and client secret to create a time-limited access token.

Now, the client credentials grant type is used to:

obtain an access token outside of the context of a user. This is typically used by clients to access resources about themselves rather than to access a user's resources.

Well, it is true that they wanted each user to create their own app with their own client ID and client secret, so at least it is data about themselves. Sort of. I don't know. It also has a different flow than the authorization code grant. The simple click on the authorize button is replaced with creating the app in the settings and then using the credentials obtained there.
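For comparison, the client credentials grant boils down to a single token request. There is no authorization screen, no callback and usually no refresh token. When the access token runs out, the same request is simply sent again. The sketch below only builds the request body. The credential values are placeholders, and putting the credentials in the form body is an assumption.

```python
from urllib.parse import parse_qs, urlencode


def client_credentials_form(client_id, client_secret, scopes=()):
    """Build the form body for a client credentials token request."""
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scopes:
        form["scope"] = " ".join(scopes)
    return urlencode(form)


def token_is_stale(expires_at, now, safety_margin=60):
    """True when the access token should be requested again (times in seconds)."""
    return now >= expires_at - safety_margin


# Placeholder credentials that the user would copy from the app settings.
body = client_credentials_form("users-own-client-id", "users-own-client-secret")
fields = parse_qs(body)
print(fields["grant_type"])
```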

But the reason why I remember this example so well is the comment that I got on the pull request. The person asked me why there is no refresh flow in the code if the API is advertised as OAuth. Explaining that was interesting. Afterwards, I started to refer to this as OAuth marketing.

These same people later informed us that the API we are using is the old one and that we should use the new one. But we could not find the documentation for the new one, so... you can imagine which one we are still using.

Refresh tokens

Now let us forget all the rest of the grants, and let us go back to the authorization code grant only. Now we are going to be talking about the refresh tokens.

Now, here I could use the same example as in the point above to show that people generally expect OAuth APIs to use the refresh grant to update the access tokens. But this is not always the case, and going from my feeling alone, an authorization code grant without a refresh flow is more frequent than the use of other grants - at least among the APIs that declare that they use OAuth.

I do remember an interesting case where the API documentation claimed that there is no refresh flow. But the refresh flow worked, so we implemented it. It still worked months later. And I know I would never have noticed this if not for the expectation of another person involved in the pull request that there should be a refresh flow.

But this bit is solvable with a bit of education. There is another problem with refresh tokens that generally trips people up: the non-repeatability of the refresh flow and the time limit of the refresh tokens.

Let me start with the non-repeatability. What I mean here is that in some cases, running the same code twice can lead to different results. The main reason is that some refresh tokens are single use only and cannot be reused. In this case, the first call returns the access token and a new refresh token, but the second call with the same refresh token returns an invalid credentials error.

This has tripped up people I know multiple times. You cannot just rerun the refresh flow locally without saving the credentials. Or at least, if you make this mistake, then don't act surprised multiple times. It is still the same problem. I sometimes do not get other programmers.

This is why one needs to make sure the new refresh token is saved, even if the rest of the code does not follow the happy path.
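One way to get there is to persist the new tokens as the very first thing after the refresh call, before any other work that could fail. The sketch below uses a dictionary as the store and a fake provider with single-use refresh tokens. Both are stand-ins I made up for illustration, not any real API.

```python
class FakeProvider:
    """Stand-in for a token endpoint that issues single-use refresh tokens."""

    def __init__(self):
        self.counter = 1
        self.valid_refresh_token = "refresh-1"

    def refresh(self, refresh_token):
        if refresh_token != self.valid_refresh_token:
            raise PermissionError("invalid_grant")
        self.counter += 1
        self.valid_refresh_token = f"refresh-{self.counter}"
        return {
            "access_token": f"access-{self.counter}",
            "refresh_token": self.valid_refresh_token,
        }


def refresh_and_store(store, credential_id, request_new_tokens):
    """Refresh the tokens and save them before anything else can go wrong."""
    old = store[credential_id]
    new = request_new_tokens(old["refresh_token"])
    store[credential_id] = {
        "access_token": new["access_token"],
        # Some APIs do not rotate the refresh token, so keep the old one as fallback.
        "refresh_token": new.get("refresh_token", old["refresh_token"]),
    }
    return store[credential_id]["access_token"]


provider = FakeProvider()
store = {"customer-42": {"access_token": "access-1", "refresh_token": "refresh-1"}}
first = refresh_and_store(store, "customer-42", provider.refresh)
second = refresh_and_store(store, "customer-42", provider.refresh)

# Reusing the original refresh token fails, just like rerunning old code locally.
try:
    provider.refresh("refresh-1")
    reuse_worked = True
except PermissionError:
    reuse_worked = False
print(first, second, reuse_worked)
```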

The one that trips up even me is the time limit. If the refresh token is not used within some specific time frame, it automatically expires. The duration can be from a week to a month or even longer, depending on the API. But that also means that if the code stops executing because of a bug, and the bug is not fixed within this time frame, all the credentials are lost. This has happened multiple times before, with me being personally responsible for some of them.

I still need to think of a way to get these bugs prioritized. Hmm.
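To at least make that time frame visible, one could list the credentials whose refresh token is close to its deadline. This is only a sketch of the idea. The 14 day lifetime and the 4 day warning window are made-up numbers, and it assumes we record when each refresh token was last used.

```python
from datetime import datetime, timedelta, timezone


def refresh_tokens_at_risk(last_refreshed, lifetime, warn_before, now):
    """Return (credential_id, deadline) pairs that expire soon, most urgent first."""
    at_risk = []
    for credential_id, refreshed_at in last_refreshed.items():
        deadline = refreshed_at + lifetime
        if now >= deadline - warn_before:
            at_risk.append((credential_id, deadline))
    return sorted(at_risk, key=lambda item: item[1])


# Made-up data, for illustration only.
now = datetime(2022, 4, 12, tzinfo=timezone.utc)
last_refreshed = {
    "customer-1": now - timedelta(days=1),
    "customer-2": now - timedelta(days=11),
    "customer-3": now - timedelta(days=13),
}
at_risk = refresh_tokens_at_risk(
    last_refreshed,
    lifetime=timedelta(days=14),
    warn_before=timedelta(days=4),
    now=now,
)
for credential_id, deadline in at_risk:
    print(credential_id, "expires", deadline.date())
```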

Changing APIs

And now for the last point, the consequences of APIs changing. Even if APIs are treated as more stable than scraping, and they do change less frequently than webpages, they still change all the time. The changes can range from changes in the endpoint URLs and the return structure to changes in the permissions and changes in the entire flow.

The changes in the URLs and return structure are the easiest to adapt to. So these ones generally do not cause a lot of problems, assuming this is documented somewhere and the same data can still be retrieved or changed with the same credentials. This is not always true.

For example, we were once using a certain endpoint to retrieve some aggregate metrics. At one point, the retrieved metric became 0 for some of the users, even though we could see from the rest of the aggregated data that it should not be 0. It also did not help that this did not happen to every user, nor did it happen to all the affected users at the same time.

This API did have a way to inform you if the data was incomplete, but there was no message in this regard. There was also nothing on the documentation site about it. No changelog was found. Search engines were unsurprisingly unhelpful. So in the end, we simply stopped aggregating this for everybody. It is not like we knew what needed to be true to get this data.

For the last example, I want to talk about a change in an API that stopped us from using it at all. We had an integration that worked well, and suddenly all the credentials started failing. What we could discover is that the refresh flow stopped giving us new refresh tokens, and the one we got at the start expired after two weeks.

Looking at the documentation, we tried multiple things, but none of them worked. The API in question also no longer allowed the creation of new OAuth applications that could access the same API.

For a while, we moved to scraping, which used the username and password to authenticate. But eventually the form also became too complicated, and we dropped support for this integration.

OAuth is still one of the better ways to deal with user data

Now, based on this presentation, some people might get the opinion that I am against OAuth. I actually love OAuth, and still think it is the best way to authenticate with an API from the user perspective. It is also quite good from the developer perspective. I actually want more APIs to support OAuth authentication.