One of the most exciting developments in API security is the growing consensus that API discovery and risk assessment are necessary first steps to improve API security. However, several factors make the API attack surface difficult for an organization to discover and measure without purpose-built tools.

APIs are dark by their very nature. You can’t just fire up a web browser and explore all the APIs you have running in production. Many organizations have experienced a loss of visibility into their operations as they have transitioned from web-based to API-based systems. The biggest drawback to embracing APIs is that few people outside software development and engineering will find it easy to understand, support, or audit them for security or compliance. This creates a need for discovery tools explicitly built for APIs.

APIs also change frequently and unpredictably, especially from the perspective of non-technical staff. Making changes to a hosted API is typically much easier and faster than making the equivalent change to an application with a web or mobile interface. The benefit to API-centric organizations is being more agile and responsive to changing customer requirements and market conditions. The challenge for API security is that API discovery and risk scoring must be done continuously as the APIs change.

Another headwind for API discovery is that APIs are typically intended to be used in a composable way. The trend for many organizations is to externalize a growing number of highly specific APIs, many of which were only available internally before. This yields tremendous flexibility for API consumers, since they can mix APIs from multiple providers as they see fit to assemble a complete application. The challenge for API security is that the ultimate use of an API is less predictable than with vertically integrated components, so API discovery must include user activities as part of measuring the attack surface.

Creating an API inventory

Once an organization decides to create an API inventory, the next decision is what to include and how APIs should be counted. There are several reasonable ways to do this:

  • Counting top-level domains.
  • Counting domain, method, and path combinations.
  • Counting subdomains (x.y.z).
  • Counting OpenAPI and GraphQL specs.
  • Counting API service VMs or containers.
  • Counting integration contracts or SLAs.
  • Logging specific user activities (attackers and valid users).

There are trade-offs depending on what you decide to count:

  • Top-level domains change too infrequently to be a useful measure, and method/path combinations change too frequently, but counting active subdomains works well across both large and small organizations.
  • Discovery based on OpenAPI or GraphQL specifications is very effective when specs are already available from development, or when specs can be generated from runtime monitoring. It also aligns naturally with counting by subdomain.
  • Infrastructure-based discovery helps map the runtime environment, but isn’t informative about what is happening at the API layer. Don’t assume a one-to-one relationship between VMs or containers and APIs. It’s very common for API services to be multi-homed, where each VM or container serves multiple APIs (and even multiple versions of those APIs) simultaneously.
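To make the subdomain-based approach concrete, here is a minimal sketch of counting APIs by hostname from observed traffic. The hostnames and log entries are invented for illustration; a real discovery tool would consume access logs or gateway telemetry rather than a hardcoded list.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical runtime observations: (url, method) pairs captured from real traffic.
observed_calls = [
    ("https://billing.api.example.com/v1/invoices", "GET"),
    ("https://billing.api.example.com/v1/invoices", "POST"),
    ("https://users.api.example.com/v2/accounts", "GET"),
    ("https://billing.api.example.com/v1/refunds", "POST"),
]

def count_by_subdomain(calls):
    """Group observed calls by full hostname (x.y.z), which is coarser than
    method+path combinations but finer than top-level domains."""
    endpoints = defaultdict(set)
    for url, method in calls:
        parsed = urlparse(url)
        endpoints[parsed.hostname].add((method, parsed.path))
    return endpoints

inventory = count_by_subdomain(observed_calls)
print(len(inventory))  # APIs counted by subdomain: 2
for host, operations in sorted(inventory.items()):
    print(host, len(operations))  # distinct method/path operations per API
```

Counting this way keeps the inventory stable as individual endpoints are added or removed, while still detecting when an entirely new API host appears in traffic.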

When initial discovery is complete, an organization should be able to state that X APIs are present, where Y of those need immediate attention. Teams that can confidently track X and Y as KPIs are best positioned to improve their security posture over time.

Tracking API inventory changes

Because APIs tend to change quickly, it’s essential to update the API inventory continuously. A manual change-control process can be used, but this is prone to breakdowns between the development and security teams.

The best way to establish a continuous discovery process is to adopt a runtime monitoring system that discovers APIs from real user traffic, to require the use of an API gateway, or both. Either option yields better oversight than relying on the development team to manually notify the security team as API changes are made.

Because continuous discovery sees changes as they happen, it’s natural to group APIs based on their life cycle and level of support. Most organizations find these common groups to be a good starting point:

  • “Rogue” or “unmanaged” APIs are actively being used, but have not been reviewed or approved by the security team.
  • “Prohibited” or “banned” APIs have been reviewed by the security team, and are not approved for use inside the organization or from its supply chain.
  • “Monitored” or “supported” APIs are actively maintained by the organization and supervised by the security team.
  • “Deprecated” or “zombie” APIs were supported by the organization in the past, but newer versions exist that API consumers should use instead.
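The life-cycle groups above can be sketched as a simple classification over the inventory. The inventory structure, ban list, and hostnames below are all assumptions made for illustration; the point is only that each discovered API resolves to exactly one group.

```python
from enum import Enum

class ApiStatus(Enum):
    ROGUE = "rogue"            # seen in live traffic, never reviewed by security
    PROHIBITED = "prohibited"  # reviewed and banned from use
    MONITORED = "monitored"    # actively maintained and supervised
    DEPRECATED = "deprecated"  # superseded by a newer version

def classify(host, inventory, banned):
    """Assign a life-cycle group to an API discovered at runtime,
    based on a (hypothetical) inventory and ban list."""
    if host in banned:
        return ApiStatus.PROHIBITED
    entry = inventory.get(host)
    if entry is None:
        return ApiStatus.ROGUE        # in traffic, but not in the inventory
    if entry.get("superseded_by"):
        return ApiStatus.DEPRECATED   # consumers should migrate to the newer version
    return ApiStatus.MONITORED

# Illustrative data only.
inventory = {
    "billing.api.example.com": {"superseded_by": None},
    "v1.users.api.example.com": {"superseded_by": "v2.users.api.example.com"},
}
banned = {"tracker.thirdparty.example.net"}

print(classify("billing.api.example.com", inventory, banned).value)  # monitored
print(classify("shadow.api.example.com", inventory, banned).value)   # rogue
```

Re-running a classification like this against each day's discovered traffic is what turns a static inventory into a continuous one: any host that lands in the rogue or prohibited groups represents drift worth investigating.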

Quantifying API risks

When the organization has an API inventory that is kept reliably in sync with its runtime APIs, the final discovery challenge is how to prioritize APIs relative to each other. Given that every security team has finite resources, risk scoring helps focus time and energy on remediations that will have the greatest benefit.

There is no standard way to calculate risk for API calls, but the best approaches are holistic. Threats can arise from outside or inside the organization, via the supply chain, or by attackers who either sign up as paying customers, or take over valid user accounts to stage an attack. Perimeter security products tend to focus on the API request alone, but inspecting API requests and responses together gives insight into additional risks related to security, quality, conformance, and business operations.

There are so many factors involved when considering API risks that reducing this to a single number is helpful, even if the scoring algorithm is relatively simple. Since the result of risk scoring is usually a number in a given range (like 1 to 8), risk scores can be aggregated for any grouping of API calls:

  • The percentage of total risk is a better indicator of which APIs need immediate care, but the average risk score is a better KPI for tracking attack surface improvements over time.
  • Average risk score and percentage of total risk should be summarized per monitored API, to focus the remediation time budgeted for supported APIs.
  • Average risk score and percentage of total risk should be summarized for API groupings like “Rogue APIs” and “Prohibited APIs,” to quantify the extent of policy violations or drift between runtime APIs and the API inventory.
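The two aggregates above are straightforward to compute. This sketch uses made-up per-call scores on a 1-to-8 scale to show how the same raw data yields both metrics: the average tracks posture over time, while the share of total risk shows where remediation effort pays off first.

```python
# Hypothetical per-call risk scores (scale 1 to 8), grouped by API.
risk_scores = {
    "billing.api.example.com": [2, 3, 2, 8],
    "legacy.api.example.com":  [7, 8, 8],
    "users.api.example.com":   [1, 2, 1],
}

# Total risk across all observed calls, used to compute each API's share.
total = sum(sum(scores) for scores in risk_scores.values())

for api, scores in risk_scores.items():
    avg = sum(scores) / len(scores)     # KPI for tracking improvement over time
    share = 100 * sum(scores) / total   # indicator of where remediation helps most
    print(f"{api}: avg={avg:.2f}, share={share:.0f}%")
```

In this fabricated example, the legacy API carries over half the total risk despite producing the fewest calls, which is exactly the kind of prioritization signal that raw averages alone can obscure.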

While there is no single “easy button” for API security, mature tools that are purpose-built for API discovery and risk scoring are now available to help any organization get started. With the proper tools, security teams can quickly discover their API inventory, track how APIs and associated risks change over time, and establish metrics and KPIs the entire organization can understand.

Rob Dickinson is VP of engineering at Graylog, where he is responsible for Graylog API Security.

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.