Junior Developer Interaction with API Documentation (API Days Speech)
I am presenting at the API Days Helsinki & North Online conference tomorrow. Here is the material that I am going to present. If the video is ever released, I will also link it here.
The idea for the speech came from a conversation that I had at work. One of my coworkers on the team noted an interesting thing: multiple teammates had asked him about the difference between GET and POST calls to the API. His tone suggested that he found the question weird, as if they should already know it.
I am proud that my team is one that predominantly employs juniors, so I decided to dedicate the speech to some of the things that I noticed my team encountering.
The first point that I would like to make is about the location of the documentation. In most cases, as long as it is public, it should be fine. If it is under the same domain (subdomains also count), that is an extra plus.
Once, I was looking for the API of one service. They advertised having an API, so I figured the documentation had to be somewhere. I was really happy when I found the Swagger file in the site map. So I looked it over and implemented the solution based on it.
Imagine my surprise when I got back the message that this documentation was outdated, and that I should use the one accessible after logging in.
Well, except that I could not find this supposed documentation. We ended up sharing credentials, in case some of us had just missed it. And since it was behind the account wall, search engines could not help with finding it either. In the end we gave up.
There was also a case where a company advertised an API. Again, no documentation. But what I could find was that they did have an API on the API subdomain, one that they were not using for the website. I am pretty sure about that, since they were using backend rendering.
They also had something that looked like an API key, but it did not help me. I tried some of the frequent API authentication schemes: the Authorization header (raw, as a Bearer token, and as Basic, with the key either raw, as the username, or as the password), the api and key query variables, and the x-token header.
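The variants tried above can be written down as data before any request is sent. This is only a sketch: the function names are made up for illustration, the api, key, and x-token names come from the list above, and reading "Basic as raw" as the key placed directly after the word Basic is my assumption.

```python
import base64


def basic_value(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def candidate_auth_attempts(key: str) -> list:
    """List the header and query-parameter variants to try for one key."""
    return [
        {"headers": {"Authorization": key}, "params": {}},              # raw key
        {"headers": {"Authorization": f"Bearer {key}"}, "params": {}},
        {"headers": {"Authorization": f"Basic {key}"}, "params": {}},   # assumption: raw key after "Basic"
        {"headers": {"Authorization": basic_value(key, "")}, "params": {}},  # key as username
        {"headers": {"Authorization": basic_value("", key)}, "params": {}},  # key as password
        {"headers": {}, "params": {"api": key}},
        {"headers": {}, "params": {"key": key}},
        {"headers": {"x-token": key}, "params": {}},
    ]


attempts = candidate_auth_attempts("abc")
```

Each entry can then be passed to whatever HTTP client the codebase already uses. Eight guesses for a single key is a good illustration of how much one sentence of documentation would have saved.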
I would like to say that situations where people advertise an API but do not provide the documentation are rare, but they really are not. At one point I started writing emails to each of them, asking for documentation, but I never heard back. Now I just assume it is a very weird kind of marketing.
If you have an API, please make sure that the documentation is publicly available.
Don't assume all domain knowledge
This one is usually a problem only with some enterprise services. I have seen documentation where there was an endpoint to access any table in the database, with absolutely no way to get the list of all tables and no information about which tables are available.
I would like to add that a GraphQL API can sometimes hide information in a way that ends up with almost the same result.
How do we get to this? Well, in some cases the GraphQL documentation does not even provide information about the objects, and the connections between them, that are accessible through the API.
I know that there is supposedly a way to get the structure by using the API itself, but I have yet to see it work anywhere. Maybe the problem is that I have never seen it documented correctly?
But honestly, most GraphQL calls get created by trial and error. You start with something that works: either an example from the documentation or, if there is none, a query taken from the internet. Then you change it until you get something that more or less does what you want.
Examples here are a lot more important than in REST-type APIs.
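The way to get the structure through the API itself is GraphQL introspection. Below is a minimal sketch of what such a call could look like. The endpoint URL and the helper names are hypothetical, and some servers disable introspection, which would be consistent with it not working in practice.

```python
import json
from urllib.request import Request

# Standard introspection fields: every type, its kind, and its field names.
INTROSPECTION_QUERY = """
{
  __schema {
    types {
      name
      kind
      fields { name }
    }
  }
}
"""


def build_introspection_request(endpoint: str) -> Request:
    """Prepare (but do not send) the POST that asks the server for its schema."""
    body = json.dumps({"query": INTROSPECTION_QUERY}).encode("utf-8")
    return Request(endpoint, data=body,
                   headers={"Content-Type": "application/json"}, method="POST")


def object_fields(response: dict) -> dict:
    """Map each non-internal object type to the names of its fields."""
    types = response["data"]["__schema"]["types"]
    return {
        t["name"]: [f["name"] for f in (t["fields"] or [])]
        for t in types
        if t["kind"] == "OBJECT" and not t["name"].startswith("__")
    }


# Hypothetical endpoint and a hand-written response, for illustration only.
request = build_introspection_request("https://api.example.com/graphql")
sample = {"data": {"__schema": {"types": [
    {"name": "Order", "kind": "OBJECT", "fields": [{"name": "id"}, {"name": "customer"}]},
    {"name": "String", "kind": "SCALAR", "fields": None},
    {"name": "__Type", "kind": "OBJECT", "fields": [{"name": "name"}]},
]}}}
schema_overview = object_fields(sample)
```

Even a short, working example like this in the documentation would replace a lot of the trial and error.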
SDK vs. HTTP calls
One thing that actually surprised me is how SDKs usually provide a more confusing experience, at least compared to raw HTTP calls.
This shows in a preference for working on the HTTP part of the codebase over the SDK part. I was wondering why. Shouldn't SDKs be easier to deal with than raw HTTP calls?
I think the reason is that HTTP calls are similar to each other. The way to deal with URLs, query parameters, the body, and headers is always the same.
But each SDK has a different syntax. Because of this, it is basically impossible to simply copy a solution from one part of the codebase to another. Instead, each integration problem needs to be tackled as if it were a new problem.
Some would be surprised how many junior developers simply start by copying the structure and code and then changing it until it works. Not that non-juniors do not copy code as well. But in the latter case the copied code is generally not immediately recognizable as copied.
So, taking into account that with SDKs they do not have code they can use as a base, HTTP calls suddenly become the simpler solution.
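The "always the same" shape of an HTTP call can be sketched as one small helper. This is an illustration, not our actual codebase: the helper name, the hosts, and the paths are all made up.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(method, base_url, path, params=None, headers=None, json_body=None):
    """Assemble any HTTP call from the same four parts: URL, query, headers, body."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    if params:
        url += "?" + urlencode(params)
    all_headers = dict(headers or {})
    data = None
    if json_body is not None:
        data = json.dumps(json_body).encode("utf-8")
        all_headers.setdefault("Content-Type", "application/json")
    return Request(url, data=data, headers=all_headers, method=method)


# Two unrelated (hypothetical) integrations, one identical shape.
list_items = build_request("GET", "https://api.example.com", "/v1/items",
                           params={"page": 2})
search = build_request("POST", "https://other.example.org/", "search",
                       headers={"x-token": "secret"}, json_body={"q": "invoices"})
```

Copying the second call from the first only means changing the arguments, which is exactly the kind of copy-and-adjust work that is hard to do across two different SDKs.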
Before going into a more HTTP-specific point, I would like to point out the problem with base URLs. In some cases, there does not seem to be a better way to figure out which URL to use as the base URL than to try all of them.
If there is a way, please just spell it out in a way that people can understand. In a couple of cases, I have asked people which region they were in, or which plan they had, both of which are frequent ways to determine the base URL. And they could not tell me, since they did not know.
If there is a place where people could check, even if it is just the URL at which they access the frontend of the app, please write this down in the documentation.
I mentioned above that HTTP methods are usually not well understood by juniors when they join.
We generally only have authentication and read calls (because of the nature of what my team is doing), so the question is normally just about the POST and GET verbs, since these are the only ones that we use in our codebase.
If we go with the definitions from MDN, they read like this:
GET: The GET method requests a representation of the specified resource. Requests using GET should only retrieve data.
POST: The POST method submits an entity to the specified resource, often causing a change in state or side effects on the server.
The definitions sometimes included in documentation are also in this spirit.
OK, then the follow-up question is why we are using POST if we only make read calls. Part of the answer is that OAuth authentication calls are not really read calls, since they do sort of change the access (and refresh) tokens on the other side.
The not really accurate description that works in practice is that GET calls do not have a request body, but POST calls do.
I think this is a part of the documentation that is not strictly needed, even for juniors. Still, it generally helps with the confusion, especially since the verbs are not always used the way they are described in their definitions.
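The body-versus-no-body rule of thumb is, incidentally, how Python's standard library picks a default method when none is given. A small sketch, with a hypothetical search endpoint that only reads data in both variants:

```python
from urllib.request import Request

URL = "https://api.example.com/search"  # hypothetical endpoint

# The same read-only search, once with the filter in the URL, once in a body.
read_without_body = Request(URL + "?q=invoices")
read_with_body = Request(URL, data=b'{"q": "invoices"}',
                         headers={"Content-Type": "application/json"})

print(read_without_body.get_method())  # GET
print(read_with_body.get_method())     # POST
```

Neither call changes anything on the server in this scenario, which is why the strict definitions and the practical rule of thumb can disagree.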
The next one is about HTTP status codes. Sometimes they are included in the documentation, but usually only when the API uses the codes in the standard way. They are almost never included when the codes do not correspond to normal usage.
This means that juniors are extra careful in the normal cases, and not in the non-normal cases.
What do I mean by non-normal cases? There are multiple APIs that, for non-authenticated calls, return HTTP code 200 with a login page. There are also multiple cases where non-authenticated calls return HTTP code 400, though a substantial fraction of these will at least provide a descriptive message.
Some weirder cases for non-authenticated calls are returning HTTP codes 404, 502, or 503.
One weird example not connected with authentication was using HTTP code 403 for the rate limit.
Of course, there are also cases where the API always returns HTTP code 200 and then describes the problem in the response body.
If you are using HTTP codes in unusual ways, then please include this information in the documentation. If not, then use your judgment.
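On the consumer side, these deviations end up as per-API special cases. The sketch below shows one way that could look; the class, its field names, and the category labels are all hypothetical, and the quirks have to come from the documentation or from painful experience.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Quirks:
    """Per-API deviations, ideally copied from that API's documentation."""
    rate_limit_codes: frozenset = frozenset({429})
    auth_codes: frozenset = frozenset({401, 403})
    error_key: Optional[str] = None  # JSON key that signals failure despite a 2xx


def classify(status: int, content_type: str, body: str, quirks: Quirks) -> str:
    """Return 'ok', 'auth', 'rate_limit', or 'error' for a JSON API response."""
    if status in quirks.rate_limit_codes:  # checked first so a 403 quirk wins
        return "rate_limit"
    if status in quirks.auth_codes:
        return "auth"
    if status == 200 and "text/html" in content_type.lower():
        return "auth"  # a login page served with 200
    if 200 <= status < 300:
        if quirks.error_key is not None:
            try:
                payload = json.loads(body)
            except ValueError:
                return "error"  # expected JSON but could not parse it
            if isinstance(payload, dict) and quirks.error_key in payload:
                return "error"  # problem described only in the body
        return "ok"
    return "error"


# An API that uses 403 for rate limiting and reports errors inside a 200 body.
odd_api = Quirks(rate_limit_codes=frozenset({403, 429}), error_key="error")
```

Without the documentation spelling out the unusual codes, a junior has no way to know that a `Quirks` entry like `odd_api` is needed at all.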
What is OAuth
This is the next point where there is potential for confusion. Working on a first OAuth integration is usually one of the steps that juniors need to go through. And once they do, there are a couple of confusions connected to it.
The first one is that what juniors imagine as OAuth is an explicit OAuth grant with non-permanent tokens and a refresh flow. But what gets advertised in documentation as OAuth also includes the password and client credentials grants.
So there is always that conversation: they see OAuth in the documentation, and then they ask about the refresh flow. This is how the conversation usually starts.
It is interesting to try to explain why the API is using the credentials grant.
This is the description from the [OAuth site]:
The Client Credentials grant is used when applications request an access token to access their own resources, not on behalf of a user.
So I never have an answer to the question of why this authentication is used.
But my favorite example is one where the documentation claimed that the tokens are permanent and that there is no refresh flow. But one of my teammates showed with an implementation that the refresh flow worked for them.
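For comparison with the flow juniors expect, a client credentials token request is a single POST with no user step and no redirect. A minimal sketch, assuming the client authenticates with HTTP Basic; the token URL, client ID, and secret are placeholders, and the function name is made up.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_client_credentials_request(token_url, client_id, client_secret, scope=None):
    """Prepare (but do not send) an OAuth 2.0 client credentials token request."""
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope
    credentials = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return Request(
        token_url,
        data=urlencode(form).encode("ascii"),
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Placeholder values, for illustration only.
token_request = build_client_credentials_request(
    "https://auth.example.com/oauth/token", "my-client-id", "my-client-secret"
)
```

Seen this way it looks a lot like basic authentication with an extra step, which is probably why the question of why it is used keeps coming up.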
The second-to-last point is about authenticating with the API. There are a lot of different ways to do it. We dealt with OAuth authentication above, but some of the other frequent ones are an API key in either a header or a query parameter, or basic authentication with a username and a password or API key. There is also SOAP, which I hope will eventually fall out of use, since it is a pain in the ass.
But there are also some more unusual ones, like manipulating a personal certificate. In the case that I am thinking of right now, the documentation had non-working sample code for how this is supposed to work. Without a working code example, there is no way a junior developer can handle this without help.
As long as the more normal ways of authentication are used, it is usually all right not to describe them in a lot of detail. Even junior developers can usually figure out how to add an Authorization header or query parameters.
The last point is about something a bit unusual. Something that I have noticed in junior developers is that they will trust the documentation more than they trust the results of the code.
Let me explain. There were quite a lot of cases where the response structure in the documentation and the response structure in the actual calls were not the same. Each time I asked them about it, the answer was usually: the documentation says this.
It can lead to some interesting cases when the API provides more data than the documentation explains. One example was when an API that would normally return real data suddenly started to omit a property from the results. The documentation did not provide any information that would help us explain this. So it was interesting to see people being confused by it.
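One habit that helps here is comparing the documented fields with what a real call returns, instead of trusting either side blindly. A minimal sketch; the function name and the field names are invented for the example.

```python
def diff_fields(documented: set, actual: dict) -> dict:
    """Compare documented top-level fields with those in a real response."""
    actual_keys = set(actual)
    return {
        "missing": sorted(documented - actual_keys),       # documented, not returned
        "undocumented": sorted(actual_keys - documented),  # returned, not documented
    }


# Hypothetical documented fields and a hypothetical real response.
documented_fields = {"id", "name", "price"}
real_response = {"id": 7, "name": "Widget", "currency": "EUR"}

report = diff_fields(documented_fields, real_response)
print(report)  # {'missing': ['price'], 'undocumented': ['currency']}
```

A report like this at least turns "the documentation says this" into a concrete list of differences to ask the API provider about.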
In the end, I would like to say that most of the documentation we deal with does not cause a lot of problems for me. And at least for smaller companies, it is usually very well written.
I am also aware that writing good documentation is hard work, and a lot of the time it is a skill that we just expect people to have. But good documentation is not a simple thing to produce. And I am always thankful for any decently written documentation.
But because it is both a hard and an important topic, I wanted to bring some potential problems that I noticed in my work to the front of other people's minds.