Organizations that experience a data breach can suffer reputational damage and a public relations nightmare. With new regulations such as GDPR, massive fines are added to the mix of consequences. Protecting against such attacks, therefore, is top of mind for today’s digital businesses—and the growth of APIs has made them a primary focus.
The objective of API security is twofold: You seek to prevent data breaches flowing through the API, as well as disruptions to its own operation. An application’s APIs are not the only logical components targeted in a sophisticated data breach, but for most applications, the API is the channel through which data flows.
This applies to mobile apps and web apps, for example. Consider the many incident reports that mention access tokens being compromised. Attackers leverage these tokens to access data that they are not entitled to by calling an API with them. Even if your API is not productized in any way, it is a hacker’s most attractive attack vector and the likely starting point to an attack plan.
This article reflects on 16 years of experience implementing API security at various enterprises. By sharing lessons learned from notorious hacks and exploring the prospect of emerging security patterns leveraging machine learning, it aims to show you how to shore up your API security and help prevent costly data breaches.
NOTE: REST APIs show up everywhere in your architecture, but this article focuses on measures that need to be applied to your first-line API, the one used by the user-facing app. You are only as secure as your weakest component, however, and security best practices need to be applied at all layers of your application. For example, if you fail to properly secure your MongoDB or Elasticsearch backend running on a public cloud, your associated first-line API security measures will count for little.
Lesson 1 - Secure by design, not through obfuscation
The first step to securing your API is to acknowledge its existence. It may be tempting to consider the API as an implementation detail, of which the specifics need to be known only between the members of a scrum team or two. But by acknowledging the existence of your API outside of your tight-knit development team, you open the door to somebody responsible for security to consider the implications of this API, and you are more likely to have a discussion about it. Somebody in quality assurance may have ideas about how to test it against intrusion attacks.
You also open the door for benefits that are not security related. Your product manager or business development team may be inspired to leverage this API for a requirement that the development team may not have been aware of.
Securing beyond obfuscation involves basic API security design and implementation. A token server is needed, end users need to be authenticated, and access control rules need to be defined and enforced. The API metadata that results from this design work, such as a formal definition of each operation’s parameters, also strengthens the API, because it allows you to implement strict input parameter validation.
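As a rough illustration of strict input parameter validation driven by API metadata, here is a minimal Python sketch. The schema format, the ORDER_QUERY_SCHEMA name, and the validate_params function are all hypothetical and invented for this example; in practice the rules would be derived from whatever API description your design process produces.

```python
import re

# Hypothetical parameter metadata for a single operation (say, listing orders).
# The rule format is an assumption made for this sketch.
ORDER_QUERY_SCHEMA = {
    "customer_id": {"type": str, "pattern": r"[A-Za-z0-9]{8}", "required": True},
    "limit": {"type": int, "min": 1, "max": 100, "required": False},
}


def validate_params(params, schema):
    """Return a list of validation errors; an empty list means the input is acceptable."""
    errors = []
    # Reject anything the API design does not declare.
    for name in params:
        if name not in schema:
            errors.append(f"unexpected parameter: {name}")
    for name, rule in schema.items():
        if name not in params:
            if rule.get("required"):
                errors.append(f"missing parameter: {name}")
            continue
        value = params[name]
        # bool is a subclass of int in Python, so it is excluded explicitly.
        if isinstance(value, bool) or not isinstance(value, rule["type"]):
            errors.append(f"wrong type for {name}")
            continue
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"bad format for {name}")
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name} is below {rule['min']}")
        if "max" in rule and value > rule["max"]:
            errors.append(f"{name} is above {rule['max']}")
    return errors
```

Rejecting undeclared parameters, rather than silently ignoring them, is a deliberate choice here: the API accepts only what its design says it accepts.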
Designing for security also means that you build in the ability to monitor and to react to a variety of vulnerabilities and threat scenarios. Your token server needs not only to issue tokens but also to revoke them when required, for specific users or apps. You will rely on this ability once a breach has been identified and remedied: even after the bug is fixed, you still need to revoke the compromised tokens.
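To make the revocation requirement concrete, here is a minimal in-memory sketch in Python of a token server that can revoke by user, by app, or both. The TokenServer class and its method names are assumptions made for this example; a real token server would also persist tokens, expire them, and authenticate the caller asking for revocation.

```python
import secrets


class TokenServer:
    """In-memory sketch only; not a production token server."""

    def __init__(self):
        self._tokens = {}  # token -> (user_id, app_id)

    def issue(self, user_id, app_id):
        """Issue an opaque random token for a given user and app."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user_id, app_id)
        return token

    def validate(self, token):
        """Return (user_id, app_id) for a live token, or None if unknown or revoked."""
        return self._tokens.get(token)

    def revoke(self, user_id=None, app_id=None):
        """Revoke every token matching the given user and/or app; return how many were revoked."""
        if user_id is None and app_id is None:
            raise ValueError("specify a user, an app, or both")
        doomed = [
            token
            for token, (user, app) in self._tokens.items()
            if (user_id is None or user == user_id)
            and (app_id is None or app == app_id)
        ]
        for token in doomed:
            del self._tokens[token]
        return len(doomed)
```

After a bug in one app has been fixed, for instance, a call such as server.revoke(app_id="mobile") would invalidate every token issued to that app while leaving other apps untouched.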
The main reason why hiding your API is not enough is that your client app itself needs to know how to use the API in order to function. And as I explore in the next section, the app can’t keep this information secret.
Lesson 2 - Don’t trust apps to keep secrets
By client-side apps, I mean public clients. Confidential clients (the kind that can keep secrets) also exist, but those are rarely user facing and usually operate behind the scenes.
For over a decade, service providers have searched for ways to ensure that incoming calls to their API originated from their own app, and not some other requestor that happens to know about the existence of the API. Basic attempts to achieve this involve embedding an API key in the app and including it in each API request.
The premise of this security mechanism is that the API key is itself a shared secret between your app and the API. As soon as the app is reverse-engineered, this security mechanism falls apart. The lesson here is to resist the fallacy of client-side secrets and to greatly limit the information accessible through an API whenever an API key is the only credential provided.
In practice, an app presents a token when accessing an API, not just an API key. This token is issued as part of a handshake that authenticates the user of the app. Each instance of the app should hold a token that refers to its local user, so that users have access only to data that they are entitled to.
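The idea that a token refers to one user, and only unlocks that user's data, can be sketched as follows in Python. The TOKENS and ORDERS stores, the exception names, and the get_order function are hypothetical and exist only for this illustration.

```python
# Hypothetical stores: which user each token refers to, and who owns each record.
TOKENS = {"tok-alice": "alice", "tok-bob": "bob"}
ORDERS = {
    "o-1": {"owner": "alice", "total": 42},
    "o-2": {"owner": "bob", "total": 7},
}


class Unauthorized(Exception):
    """The caller presented no valid token."""


class Forbidden(Exception):
    """The caller is authenticated but not entitled to the resource."""


def get_order(token, order_id):
    """Return an order only if the token's user owns it."""
    user = TOKENS.get(token)
    if user is None:
        raise Unauthorized("unknown or revoked token")
    order = ORDERS.get(order_id)
    # "Not found" and "not yours" get the same answer, so that callers
    # cannot probe for valid order IDs belonging to other users.
    if order is None or order["owner"] != user:
        raise Forbidden("no access to this order")
    return order
```

The check uses the user resolved from the token, never a user ID supplied by the app in the request, since the app itself cannot be trusted.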
Tokens become compromised when hackers exploit a bug in an app, insert themselves into the API traffic (which reveals the token), or craft a phishing attack that redirects users to a malicious app posing as a legitimate one and obtains a new token on the user’s behalf. In each case, the malicious party ends up with a token that allows them to retrieve data meant for that user. Recently, Facebook was the subject of an exploit in which its view-as feature, combined with a bug, allowed attackers to get tokens for arbitrary users (ouch).
New developments in token binding standards promise to reduce the exploitability of compromised tokens. With such a mechanism, a token is not useful without an associated key. These mechanisms are not commonly supported yet, and even after client platforms mature to the point of including this ability, the hacker’s job is simply made harder, not impossible.
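The principle that a token is useless without its associated key can be illustrated with a simplified proof-of-possession check in Python. This sketch swaps in a symmetric key and an HMAC signature as a stand-in; it is not the token binding protocol itself, which relies on asymmetric key pairs held by the client platform. All function names and the ISSUED store are hypothetical.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side record of the key bound to each token.
# Assumption: a symmetric key stands in for the client's key pair.
ISSUED = {}


def issue_bound_token():
    """Issue a token together with the key it is bound to."""
    token = secrets.token_urlsafe(16)
    key = secrets.token_bytes(32)
    ISSUED[token] = key
    return token, key


def sign_request(key, method, path):
    """Client side: prove possession of the key by signing the request."""
    message = f"{method} {path}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_request(token, method, path, signature):
    """Server side: accept the token only with a signature made by its bound key."""
    key = ISSUED.get(token)
    if key is None:
        return False
    expected = sign_request(key, method, path)
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(expected, signature)
```

An attacker who captures only the token cannot produce a valid signature. This sketch omits a nonce or timestamp, so a captured request could still be replayed as is; a real scheme would need to address that.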
As an API publisher, you often don’t control what client-side developers do. Would you trust the worst client-side app developer against the smartest hackers out there? As Matt McLarty, VP of The API Academy and co-author of O’Reilly’s Securing Microservice APIs, points out, “We've seen numerous examples of supposed secrets being stored directly in client app code, then ending up on Github and other public repos. As an API provider, revocation is a critical need.”
Even with the right precautions in place, you will never be 100% protected from access tokens being compromised, whether because of a bug in an app or a platform or because of an elaborate phishing attack in which an end user unintentionally allows a token to be granted to a malicious party. And that leads us right to the next lesson:
Lesson 3 - Recruit end-user input
Phishing attacks require the involvement of end users. They don’t participate voluntarily, of course, but they are part of the chain of events that results in a malicious party getting an access token. Because of this, users can be invaluable allies in accelerating the detection of an attack in progress. Service providers should include feedback loops that make it easy to recruit and react to user input. These feedback loops should not be hardcoded to handle a predetermined set of situations; you never know how you’ll want to trigger such events in the future.
Let’s examine a real-world example of such a feedback loop. I sign into Google on a new device, and this results in Google sending me a notification stating that “somebody” signed into Google on a new device using my ID. Google then invites me to take a specific action if I suspect this was illegitimate. The user experience (UX) disruption associated with this notification is minor, and I can just ignore it. Knowing that such a precaution is in place, however, raises my confidence that the service provider is taking my privacy seriously. I am less likely to ignore such a notification if I receive it without having taken prior action.
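A feedback loop of this kind, one that is not hardcoded to a fixed set of situations, might be sketched in Python as a small event hub. The FeedbackLoop class, its method names, and the "new_device_sign_in" event name are assumptions made for this example, and it does not describe how Google implements its notifications.

```python
from collections import defaultdict


class FeedbackLoop:
    """Sketch of a pluggable event hub for user-facing security notifications."""

    def __init__(self):
        self._handlers = defaultdict(list)      # event type -> list of callables
        self._known_devices = defaultdict(set)  # user ID -> device IDs seen so far

    def on(self, event_type, handler):
        """Register a handler; new event types can be added without changing this class."""
        self._handlers[event_type].append(handler)

    def emit(self, event_type, **details):
        """Pass the event details to every handler registered for this event type."""
        for handler in self._handlers[event_type]:
            handler(details)

    def record_sign_in(self, user_id, device_id):
        """Record a sign-in and emit an event when the device is new for this user."""
        is_new = device_id not in self._known_devices[user_id]
        self._known_devices[user_id].add(device_id)
        if is_new:
            self.emit("new_device_sign_in", user_id=user_id, device_id=device_id)
        return is_new
```

A handler registered with on("new_device_sign_in", ...) could send the notification, and a different handler for a user's "this wasn't me" response could trigger token revocation.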
Another form of helpful “UX disruption” is the step-up authentication that occurs when a user is taking an action that is considered high risk. Users should also be provided an easy way to visualize existing permissions across apps and devices. It should be easy for users to revoke any permission (token) themselves. Users will want to do this if the legitimacy of an application is in question or if a device has been lost or otherwise compromised. The application of this pattern at an API provider is known as consent management.
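The step-up authentication decision described above can be sketched in a few lines of Python. The list of high-risk actions, the five-minute freshness window, and the requires_step_up function are all hypothetical values chosen for illustration; each provider would define its own risk rules.

```python
import time

# Hypothetical policy values for this sketch.
HIGH_RISK_ACTIONS = {"change_password", "transfer_funds", "export_data"}
MAX_AUTH_AGE_SECONDS = 300


def requires_step_up(action, last_strong_auth_at, now=None):
    """Return True if the user should re-authenticate before performing the action.

    last_strong_auth_at and now are Unix timestamps in seconds.
    """
    now = time.time() if now is None else now
    if action not in HIGH_RISK_ACTIONS:
        return False
    # High-risk actions are allowed only shortly after a strong authentication.
    return (now - last_strong_auth_at) > MAX_AUTH_AGE_SECONDS
```

Low-risk calls go through undisturbed, so the disruption is limited to the moments where it adds the most protection.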
By tapping into an army of end users, API providers improve their odds in the battle against hackers. The key to effectively implementing feedback loops such as these is to be prepared to act on such feedback. Prompting a user to weigh in on a situation through a feedback loop may seem outside the realm of API security. As we’ll explore in the next section, however, the clues used to detect anomalous or high-risk events that themselves trigger these notifications emerge from the API traffic itself.
Lesson 4 - Classify and detect anomalies
As was discussed in the previous two sections, apps and users can’t be fully trusted. You may be applying all API security principles to their fullest extent, yet there is no foolproof way to ensure that each API request coming into a system is legitimate.
Traditional API security looks at each API call in isolation and applies a series of policies and access control rules. Although best-of-breed API security vendors developed ways to enrich the context of an API request in order to make better-informed decisions, the increasing distribution of systems across multiple microservices, multiple containers and multiple layers introduces new challenges for individual policy enforcement points to get the full picture.
The coordination of all this information across nodes increases complexity. The sea of data that makes up the complete context around an API call cannot be reproduced across each node and analysed thousands of times per second for each API call without slowing down the service. Instead, it needs to be inspected out of band of the direct API traffic, where it can occur without disrupting the service it enables.
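One way to picture out-of-band inspection is a request handler that only drops a small event onto a queue and never waits for analysis. The Python sketch below assumes a single process and an in-memory queue purely for illustration; the EVENTS queue, handle_request, and drain names are hypothetical, and a distributed deployment would more likely ship events to a separate analytics system.

```python
import queue

# Bounded buffer between the API path and the analysis path.
EVENTS = queue.Queue(maxsize=10_000)


def handle_request(user_id, path):
    """In-band path: enforce per-call policy (omitted here), then record the call."""
    try:
        EVENTS.put_nowait({"user": user_id, "path": path})
    except queue.Full:
        # Never block or fail an API call because the analysis side is behind.
        pass
    return {"status": 200}


def drain(batch_size=100):
    """Out-of-band path: pull up to batch_size events for analysis elsewhere."""
    batch = []
    while len(batch) < batch_size:
        try:
            batch.append(EVENTS.get_nowait())
        except queue.Empty:
            break
    return batch
```

Dropping events when the buffer is full trades some visibility for service availability; whether that trade is acceptable depends on the provider.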
As was illustrated by the recent Starwood breach, which started in 2014 and lasted for years, modern attacks do not exist in the scope of single transactions but rather are multi-step and progressive. Although API security needs to continue to operate in the scope of each transaction, advanced API security also needs to be applied out of band.
Imagine an observer who has visibility into all API traffic for an application, who is able to interpret patterns across all API calls, current and past. Such an observer would be able to classify patterns and detect anomalies when attackers poke around, when a progressive brute force attack is in play, or when unusual data is being accessed. Such events would stand out from normal user activity.
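As a very rough stand-in for such an observer, the Python sketch below flags a traffic count that sits far above a client's historical baseline. It is a basic statistical check rather than a trained model, and the is_anomalous function, its threshold, and the sample numbers are assumptions made for this example.

```python
import statistics


def is_anomalous(history, current, threshold=3.0):
    """Flag current if it lies more than threshold standard deviations above the mean of history.

    history is a list of past counts (for example, API calls per hour for one client).
    """
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # A perfectly flat history: anything above it stands out.
        return current > mean
    return (current - mean) / stdev > threshold


# Hypothetical hourly call counts for one client, followed by two new observations.
calls_per_hour = [20, 22, 19, 21, 20, 23, 18]
normal_hour = is_anomalous(calls_per_hour, 22)    # within the usual range
suspect_hour = is_anomalous(calls_per_hour, 400)  # far above the baseline
```

A check like this looks at only one signal for one client; the observer described above would correlate many signals across all API calls, current and past.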
The next frontier for API security lies in the application of machine learning to the vast amount of data that is generated by all API nodes of a service provider, such as API gateways and token servers. APIs are already providing increased visibility beyond what most applications provide today, and do so in a platform-independent way. For this reason, applying machine learning to API traffic is a very promising prospect to increase security against attacks in general, and in particular data breaches. This AI layer, decoupled from the policy enforcement points and other nodes making up your runtime API network, can aggregate and analyse all information available, looking for unusual or suspect patterns for closer inspection, to alert the API provider and trigger remedial actions.
Scott Morrison is the CTO of PHEMI, which builds security and governance solutions for big data, and ex-CTO of Layer 7, a best-of-breed API security solution. Scott writes, “Big Data and Machine Learning are key to achieving the next level of security and protecting against data breaches. By applying these technologies to API traffic, you can potentially detect anomalous behavior and stop attacks in progress.”
Boosting API security
API security should be top of mind for any application provider today. Even if you can’t fathom why a malicious party would target your application, keep in mind that although attack reports point the finger at “unauthorized parties” or “malicious attackers,” the source of an attack can be friendly fire too: a legitimate client “acting up” in unintended ways that overwhelm an API. Whether accidental or malicious, the worst-case scenario is the attack that goes undetected or leaves no trace. Without traceability and training data, applications miss opportunities to become more resilient.