Relational vs. NoSQL Databases for API Traffic

By Vineet Joshi in GET/technical Posted Jun 7, 2016

API consumption is what makes API traffic data important. Without insight into how your APIs are being consumed, you cannot provide analytics on your customers and their API usage.

API traffic data has a few defining characteristics: high frequency, payload sizes, data structure, tables, volume, and objects. Persisting API traffic is important because most services have some form of rate limiting and different billing tiers for customers based on their usage. There is usually a threshold for alerting and scaling, so if someone is making more API calls than they should, you can partition them. You also want to be able to provide analytics to your internal organization for the different APIs and resources you are exposing. If your customers are exposing an API, it is a good bet that they are integrating with other services as well, which lets them slice and dice their own data alongside the data they are consuming from your product.
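As a minimal sketch of why persisted traffic matters for rate limiting and billing tiers, the snippet below counts stored calls per customer and flags anyone over their tier's quota. The `ApiCall` record, the `TIER_LIMITS` values, and the customer names are all hypothetical and only illustrate the idea.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiCall:
    """One persisted API traffic record (hypothetical shape)."""
    customer_id: str
    method: str
    path: str
    response_bytes: int


# Hypothetical call quotas per billing tier; real limits depend on your product.
TIER_LIMITS = {"free": 3, "pro": 1000}


def over_quota(calls, tiers, limits=TIER_LIMITS):
    """Return {customer_id: call_count} for customers above their tier limit."""
    counts = Counter(call.customer_id for call in calls)
    return {
        customer: count
        for customer, count in counts.items()
        if count > limits[tiers[customer]]
    }


calls = [ApiCall("acme", "GET", "/contacts", 2048) for _ in range(5)]
calls.append(ApiCall("globex", "POST", "/contacts", 512))
tiers = {"acme": "free", "globex": "pro"}

print(over_quota(calls, tiers))  # {'acme': 5}
```

The same per-customer counts can feed alerting thresholds or internal usage reports, which is why the raw calls need to be stored somewhere queryable.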

Typical API request payloads differ from response payloads, and that difference can change what your persistence strategy needs to be. Request payloads are usually moderate to large in size. 80% of API requests are GET requests, and POST requests can also be large. Response payloads are also typically large. The data structures are non-deterministic: when you run analytics on the data, the tables can be large, and the shape of the responses you are slicing and dicing can vary.
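To make the point about variable response structures concrete, here is a small sketch that lists the field paths present in each stored response and finds the ones they share. The sample payloads and the `field_paths` helper are invented for illustration.

```python
import json


def field_paths(value, prefix=""):
    """Yield a dotted path for every leaf field in a decoded JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from field_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for child in value:
            yield from field_paths(child, f"{prefix}[]")
    else:
        yield prefix


# Two hypothetical responses from the same endpoint with different shapes.
responses = [
    '{"id": 1, "name": "Ada"}',
    '{"id": 2, "tags": ["a", "b"], "address": {"city": "Denver"}}',
]

paths_per_response = [set(field_paths(json.loads(r))) for r in responses]
common = set.intersection(*paths_per_response)

print(sorted(common))  # ['id']
```

Only `id` appears in both responses, so a fixed set of columns would either leave most fields out or need many nullable columns.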

Once you have established that persisting the data is the best option, most teams will look toward a SQL database, since they are most likely already using one. The question is whether a NoSQL database should be considered instead. SQL is more than 40 years old, is the primary interface for RDBMSs, and has both commercial and open source implementations, so it is a strong candidate. NoSQL databases, on the other hand, have existed since the 1960s, were the primary storage mechanism before SQL gained popularity, and have gained more traction in the last 10 years.

In the comparison of the two technologies below, the main factors in the decision are data growth, online versus archived data, search filter flexibility, search performance, and clustering and sharding. When deciding which is best for you, consider the volume of your current inbound API traffic, your data retention policy, your estimated data growth, and whether your customers need heavy slicing and dicing.
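A rough back-of-the-envelope sketch can help with the sizing inputs above. The traffic volume, record size, retention window, and growth rate used here are made-up numbers, and the formulas ignore indexes, replication, and compression.

```python
def retained_bytes(calls_per_day: int, avg_record_bytes: int, retention_days: int) -> int:
    """Approximate online storage needed at steady state for one retention window."""
    return calls_per_day * avg_record_bytes * retention_days


def projected_calls_per_day(calls_per_day: float, monthly_growth_rate: float, months: int) -> float:
    """Daily call volume after compounding monthly growth for the given number of months."""
    return calls_per_day * (1 + monthly_growth_rate) ** months


# Hypothetical inputs: 1M calls/day, 4 KiB per stored record, 30-day retention.
today = retained_bytes(1_000_000, 4096, 30)
print(f"{today / 1e9:.2f} GB retained today")

# Same retention policy after a year of assumed 10% month-over-month growth.
future_calls = projected_calls_per_day(1_000_000, 0.10, 12)
future = retained_bytes(round(future_calls), 4096, 30)
print(f"{future / 1e9:.2f} GB retained in 12 months")
```

If the projected figure is several times the current one, horizontal scaling and sharding carry more weight in the decision.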

SQL                                                          | NoSQL
-------------------------------------------------------------+-----------------------------------------------------------------
Relational model with data organized in a tabular structure  | Different models: document, graph, key-value
Pre-defined schema definition                                | Dynamic schema definition
Typically vertically scalable (higher-cost VMs)              | Horizontally scalable (lower-cost VMs)
Powerful and standardized query interface                    | Query interface varies by provider
Most implementations are ACID compliant                      | Follows CAP (Consistency, Availability, Partition tolerance)
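The schema row of the comparison can be sketched as follows. The SQL side uses an in-memory SQLite table from Python's standard library. The "document" side is only a plain list of dictionaries standing in for a document store, not a real NoSQL client, and the table and field names are hypothetical.

```python
import sqlite3

# SQL side: the schema is declared up front and every row must match it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE api_calls ("
    "customer_id TEXT NOT NULL, method TEXT NOT NULL, status INTEGER NOT NULL)"
)
conn.execute("INSERT INTO api_calls VALUES (?, ?, ?)", ("acme", "GET", 200))
conn.execute("INSERT INTO api_calls VALUES (?, ?, ?)", ("acme", "POST", 201))

try:
    # A fourth value does not fit the three-column schema, so SQLite rejects it.
    conn.execute("INSERT INTO api_calls VALUES (?, ?, ?, ?)", ("acme", "POST", 201, "{}"))
    rejected = False
except sqlite3.Error:
    rejected = True

get_count = conn.execute(
    "SELECT COUNT(*) FROM api_calls WHERE method = 'GET'"
).fetchone()[0]

# Document side (stand-in): each record can carry its own extra fields.
documents = [
    {"customer_id": "acme", "method": "GET", "status": 200},
    {"customer_id": "acme", "method": "POST", "status": 201, "body": {"name": "Ada"}},
]
with_body = [doc for doc in documents if "body" in doc]

print(rejected, get_count, len(with_body))  # True 1 1
```

The standardized query interface is visible on the SQL side, while the document side accepts the extra `body` field without a schema change.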

You can use a SQL datastore for:

  • Manageable data sizes
  • Short time-period or size-based retention policy
  • Low usage frequency
  • Lightweight analytics
  • Query interface needs to be standardized

You can use a NoSQL datastore for:

  • Scale and volume are important
  • Deep analytics are required
  • Fast queries are paramount
  • Can live with non-standard query interface
