
Resolving an issue with using the iNat API through Kong Gateway
TLDR: The iNaturalist API returns a “not found” error when a request has an “X-Forwarded-Host” header.
This web app has several integrations with other organisations through their public APIs. In fact, I'd say that's probably the core of the entire functionality of On a Map. And one of those integrations is with iNaturalist, which is a citizen science platform that houses the data from millions of observations of plants and animals.
Up until now, On a Map has fetched data directly from iNaturalist’s own API (which I'll call the destination or “upstream”), with nothing in between. I wanted to change that and route all traffic to their API through our API gateway (powered by Kong Gateway). This would provide many benefits, which I'll go into another time.
So there would be essentially 2 sets of requests and responses:
- Stage 1. Between the “client” (user’s browser) ↔ API gateway
- Stage 2. Between the API gateway ↔ destination or “upstream” (iNaturalist)
Each stage has a request and response.
Or you can look at it this way:
Client → API gateway → destination → API gateway → Client
I set this up in the API gateway by creating an iNaturalist “route” (for stage 1) and an iNaturalist “service” (for stage 2).
The problem was that every time I tried a request, I got a 404: Page not found error. And this was despite all my other routes and services working perfectly (for several other integrations).
A 404 implied that either the gateway URL or the iNat URL was wrong.
That didn’t make sense, so I checked the logs. In the gateway, we have an integration plugin for Loggly, which stores log messages for all requests and responses.
For each request that goes via the API gateway, there are 2 log messages (one for each stage).
The upstream_uri was showing the correct path, and the Service details (representing the destination) were also correct.
I suspected that the 404 Error page I was seeing was one generated by Kong Gateway rather than the destination, so I tried several ways of getting the original error page. I tried to log it using the “pre-function” plugin as explained by:
How to Log Request and Response Body With Kong | Red Tomato's Blog
But the log message showed the same body I was already seeing.
I tried investigating if Kong Gateway shows its own Error pages when it detects an error from the upstream, but this didn’t turn up anything.
I then thought that requests from my API gateway server might be blocked, so I SSH’ed into the server, and ran a curl request to iNat’s API, and that worked fine. So that proved that it wasn’t being blocked by something like its IP address being blacklisted.
Since the URL being requested in stage 2 was correct, and requests weren’t being blocked, I figured that something else about the request was causing the issue, perhaps the headers. I needed to see the request headers of the stage 2 request. I didn’t realise at the time that I could check these in the log messages.
So I found an online service called Request Catcher that I could send requests to, and it would tell me info about them, including the headers. When I did that, it showed that the API gateway was sending a lot of headers (23 in total).
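If you would rather not send requests to a third-party service, a small local server can do the same job of showing which headers arrive. This is a minimal sketch using only Python's standard library, not what I actually used. The names HeaderEchoHandler, start_echo_server and fetch_echoed_headers are made up for illustration, and it assumes the gateway can reach the machine it runs on (it binds to 127.0.0.1 here, so you would need to change the bind address for a remote gateway).

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HeaderEchoHandler(BaseHTTPRequestHandler):
    """Replies to any GET with the request headers it received, as JSON."""

    def do_GET(self):
        # Collect every header as it arrived at this server.
        # (Repeated header names collapse to the last value in this sketch.)
        received = {name: value for name, value in self.headers.items()}
        body = json.dumps(received, indent=2).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def start_echo_server(port=0):
    """Start the echo server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HeaderEchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_echoed_headers(url, headers):
    """Send a GET with the given headers and return the headers the server saw."""
    # An empty ProxyHandler makes sure the request goes straight to the server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers=headers)
    with opener.open(request) as response:
        return json.loads(response.read())


if __name__ == "__main__":
    server = start_echo_server()
    port = server.server_address[1]
    seen = fetch_echoed_headers(
        f"http://127.0.0.1:{port}/", {"X-Forwarded-Host": "example.test"}
    )
    print(json.dumps(seen, indent=2))
    server.shutdown()
```

Pointing the gateway's service at this server instead of the real upstream would show the stage 2 headers in the response body. Note that urllib changes the capitalisation of header names it sends, so compare names case-insensitively.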
To test which ones were the problem, I recreated the request in my API manager app, Insomnia (also a product of Kong, as it happens).
When I made the request with no headers, it worked fine (got a valid response). When I tried it with all 23 headers, it got a “Not Found” response. Aha! So one or more of them was definitely the issue. I turned them off one at a time until I discovered the culprit: the “X-Forwarded-Host” header.
I tried various values for that header, but any value I provided made the request fail, and it worked when it wasn’t present.
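The same hunt can be scripted rather than done by hand. The sketch below is a variation on what I did: instead of switching headers off one at a time, it sends each header on its own and records which ones break the request. The function name find_breaking_headers and the send callback are hypothetical; send stands in for whatever makes the real request and returns the HTTP status code. The demo uses a fake send that mimics the behaviour I saw, not the real iNat API.

```python
from typing import Callable, Dict, List


def find_breaking_headers(
    headers: Dict[str, str],
    send: Callable[[Dict[str, str]], int],
) -> List[str]:
    """Return the names of headers that make the request fail on their own.

    `send` takes a dict of headers, makes the request, and returns the
    HTTP status code. It is supplied by the caller (an assumption of this sketch).
    """
    # Baseline: the request must succeed with no headers at all,
    # otherwise the headers are not the (only) cause.
    if send({}) != 200:
        raise ValueError("request fails even with no headers")

    culprits = []
    for name, value in headers.items():
        # Send this header alone so its effect is isolated from the others.
        if send({name: value}) != 200:
            culprits.append(name)
    return culprits


if __name__ == "__main__":
    # Fake upstream that mimics the observed behaviour: any request
    # carrying X-Forwarded-Host gets a 404, everything else gets a 200.
    def fake_send(headers: Dict[str, str]) -> int:
        return 404 if "X-Forwarded-Host" in headers else 200

    gateway_headers = {
        "Accept": "application/json",
        "X-Forwarded-For": "203.0.113.7",
        "X-Forwarded-Host": "gateway.example.test",
    }
    print(find_breaking_headers(gateway_headers, fake_send))
```

Testing each header in isolation will not catch a failure that only happens when two headers are combined, so if it finds nothing, fall back to removing headers from the full set.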
I figured that Kong Gateway must be adding that header, though I’ve since found out that it’s actually nginx adding it. Anyway, to remove that header from future requests, I added the Post-Function plugin, which let me stop headers from being passed to the upstream, as explained by this docs page:
Post-Function: Disable headers from being passed to upstream service - Plugin | Kong Docs
That was it: after disabling that header for all requests to iNat, I can now send requests through the API gateway successfully. This also lets me use Cloudflare to cache the responses, speeding up loading and reducing the load on iNat’s API.