Managing Latency by Managing Locality
As a payment orchestrator, Juspay strives to make payment experiences as seamless as possible. This means that the interfaces and APIs it provides should be fast, with minimal latency. A core part of the work at Juspay deals with making this possible.
This work branches into many tracks and explores multiple possibilities, but one unifying idea underlies much of it: keep things close to where you use them. The idea applies even in daily life, yet it is often overlooked when designing efficient systems.
This article looks at some of the common ways in which we apply it, along with an in-house system built on the principle.
Idea behind locality management
The principle is that data should sit as physically close as possible to the computation that uses it. With processor advancements tracking Moore's law while memory and network access times improved far more slowly, fetching data from memory or over the network has become increasingly expensive in relative terms. This memory/network 'wall' is managed by keeping data closer to where it is processed, which reduces the time needed to move it. We manage data by managing its locality.
This is also one of the ideas behind caching. Caches were introduced in the 1960s as a small, fast working memory placed in front of slower storage; today the term increasingly covers any store that holds data that 'might' be needed in the near term. Predicting that need requires a measure of omniscience about future accesses. Bélády's algorithm, which evicts the entry whose next use lies farthest in the future, is optimal but hypothetical because it assumes complete knowledge of future needs, and many effective approximations to it have since been devised.
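To make the benchmark concrete, here is a minimal sketch in Python that counts cache misses under Bélády's policy. The function name `belady_misses` and the access trace are illustrative assumptions, not anything from Juspay's systems. Note that the function needs the entire future trace up front, which is exactly why the policy cannot be used in a live system.

```python
def belady_misses(trace, capacity):
    """Count misses for Belady's optimal policy on a fully known access trace."""
    cache = set()
    misses = 0
    for i, key in enumerate(trace):
        if key in cache:
            continue  # hit: nothing to do
        misses += 1
        if len(cache) >= capacity:
            # Find, for each cached key, the position of its next use.
            def next_use(cached_key):
                for j in range(i + 1, len(trace)):
                    if trace[j] == cached_key:
                        return j
                return float("inf")  # never used again

            # Evict the key whose next use is farthest away.
            victim = max(cache, key=next_use)
            cache.remove(victim)
        cache.add(key)
    return misses
```

On the trace `[1, 2, 3, 1, 2, 4, 1, 2]` with a capacity of 3, this reports 4 misses: three cold misses, plus one for key 4, which evicts key 3 because it is never used again.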
Memory hardware, meanwhile, has settled into a hierarchy that runs from the fastest and most expensive memory inside the processor (L1 cache) to slower, larger and more distant devices such as disks. Importantly, what is scarce here is not only the speed of the memory device but also its proximity to the processor.
This is one of the aims of Juspay's performance tracks: recognising the levels of the memory hierarchy and programming so that they are used appropriately. High-level functional programming languages like Haskell, which use the heap extensively, continue to be a mainstay at Juspay, but more focus now goes into doing it the 'right' way, appreciating and using the levels of the memory hierarchy effectively.
The locality principle is also exploited at larger geographic scales, in different ways. We look at a few of them, starting with an in-house system that slashed the number of database calls per payment transaction.
In-memory config systems
A payment transaction involves dozens of database calls, often to a relational database or a Redis instance hosted on a different server. These include creates and reads for payment tracker tables (relating to the ongoing transaction), longer-term objects (e.g. customer info, stored payment methods) and merchant- or gateway-specific configuration. The configuration calls are especially interesting: they constitute a significant fraction of all database accesses, yet the underlying data is infrequently updated. Fetching it from the database added considerable latency to payments, which made these calls ripe for optimization.
The locality idea lends itself well here. Since the configs are relatively static, we can avoid both the network call to the database and the time needed to fetch them from disk by keeping them inside the application pods. They are stored as simple hash maps in the pod's memory: each table object is stored against its primary key, and additional keys map secondary-index values to that entry so that it can be queried by secondary indices.
This feature is built into the framework code, and tables can be configured to be stored in the pod's memory on the go. Entries are initially read from the relational database and then kept in the pod for a configured period of time. Subsequent reads are served from within the pod, which reduces latency for those queries and relieves CPU load on the database. The pod's memory is thus used efficiently to hold a pre-defined number of entries.
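The layout described above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not Juspay's framework code: the class name `ConfigCache`, the `load_row` database loader and the field names are all hypothetical. Rows live in one hash map keyed by primary key with an expiry time, and a second map points secondary-index values at the primary key.

```python
import time


class ConfigCache:
    """In-process table cache with primary-key storage and secondary indices."""

    def __init__(self, load_row, indexed_fields, ttl_seconds, clock=time.monotonic):
        self._load_row = load_row              # hypothetical database loader: pk -> row or None
        self._indexed_fields = tuple(indexed_fields)
        self._ttl = ttl_seconds
        self._clock = clock
        self._rows = {}                        # primary key -> (row, expiry time)
        self._secondary = {}                   # (field name, value) -> primary key

    def _drop_index_keys(self, row):
        for field in self._indexed_fields:
            self._secondary.pop((field, row[field]), None)

    def put(self, pk, row):
        old = self._rows.get(pk)
        if old is not None:
            self._drop_index_keys(old[0])      # avoid stale secondary keys
        self._rows[pk] = (row, self._clock() + self._ttl)
        for field in self._indexed_fields:
            self._secondary[(field, row[field])] = pk

    def get(self, pk):
        entry = self._rows.get(pk)
        if entry is not None and entry[1] > self._clock():
            return entry[0]                    # served from the pod's memory
        row = self._load_row(pk)               # missing or expired: go to the database
        if row is None:
            stale = self._rows.pop(pk, None)
            if stale is not None:
                self._drop_index_keys(stale[0])
            return None
        self.put(pk, row)
        return row

    def get_by(self, field, value):
        # Only resolves rows already cached; a fuller version would fall back
        # to a database query on the secondary index.
        pk = self._secondary.get((field, value))
        return None if pk is None else self.get(pk)
```

A cache built with `ConfigCache(load_row, indexed_fields=["name"], ttl_seconds=60)` would hit the database once per key per minute at most, however many times that key is read.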
Eviction follows an LRU (least recently used) policy: when storage is full, the least recently used entry is replaced by the incoming one. The policy exploits the temporal locality found in computer systems and code: a piece of data fetched recently is more likely to be used again soon than data that has not been touched for a long time. Incidentally, LRU is one of the simplest yet most effective approximations to Bélády's algorithm for predicting future needs.
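As an illustration of the eviction policy, a minimal LRU cache can be written in Python on top of the standard library's `collections.OrderedDict`. The class name `LRUCache` and its interface are assumptions made for this sketch, not the framework's actual implementation.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._entries = OrderedDict()  # oldest use first, most recent use last

    def __contains__(self, key):
        return key in self._entries

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        self._entries.move_to_end(key)  # a read counts as a use
        return self._entries[key]

    def put(self, key, value):
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # drop the least recently used entry
```

With a capacity of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`, because `a` was touched more recently.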
Entries in the pod's memory can also be updated by pushing changes to a common stream. A background process running within the pod polls this stream periodically and applies the changes to the in-memory data, typically within a few seconds. Because changes to config tables are infrequent and minor delays in updates are acceptable, this approach works well, and the system drastically reduced database lookup calls.
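The update path can be sketched as a small polling loop. The article does not name the stream technology, so the `read_stream` callable, the event shape (`op`, `key`, `row`) and the two-second interval below are assumptions for illustration, with a plain dict standing in for the pod's in-memory tables.

```python
import threading


class ConfigUpdater:
    """Background poller that applies streamed changes to an in-memory store."""

    def __init__(self, read_stream, store, interval_seconds=2.0):
        self._read_stream = read_stream   # hypothetical: offset -> list of events after it
        self._store = store               # dict standing in for the pod's memory
        self._interval = interval_seconds
        self._offset = 0                  # how many events have been consumed so far
        self._stop = threading.Event()

    def poll_once(self):
        events = self._read_stream(self._offset)
        for event in events:
            if event["op"] == "delete":
                self._store.pop(event["key"], None)
            else:  # treat anything else as an upsert
                self._store[event["key"]] = event["row"]
        self._offset += len(events)
        return len(events)

    def run(self):
        # Event.wait returns False on timeout, so this polls until stop() is called.
        while not self._stop.wait(self._interval):
            self.poll_once()

    def start(self):
        thread = threading.Thread(target=self.run, daemon=True)
        thread.start()
        return thread

    def stop(self):
        self._stop.set()
```

The polling interval sets the upper bound on how stale a pod's copy can be, which is the trade-off accepted in exchange for not touching the database on every read.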
Leveraging Content Distribution Networks
The locality principle scales right up to the global level. Requests sometimes have to cross continents before they reach the servers that hold the data. A simple way around this is to serve them from a geographically closer server. This is the idea behind Content Distribution Networks (CDNs), which rest on the same fundamentals of proximity and hierarchical storage.
Juspay uses Amazon CloudFront in multiple payment products to speed up the requests it receives. CloudFront serves a request from the nearby edge location that offers the lowest latency; if the content is not cached there, it is fetched from the specified origin and cached. There are levels in the storage hierarchy, with edge locations backed by regional edge caches much as in a memory architecture, and data stays cached longer at each successive level.
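The tiered lookup can be modelled in a few lines of Python. This is a toy model of the idea and not the CloudFront API: the `CacheTier` class, the tier names and the retention times are made up for illustration, and a simple time-to-live stands in for "data stays longer at the next level".

```python
class CacheTier:
    """One level of a cache hierarchy that falls back to a parent on a miss."""

    def __init__(self, name, ttl_seconds, parent):
        self.name = name
        self._ttl = ttl_seconds
        self._parent = parent    # another CacheTier, or a callable origin fetcher
        self._objects = {}       # path -> (body, expiry time)

    def fetch(self, path, now):
        """Return (body, name of the level that actually served the request)."""
        entry = self._objects.get(path)
        if entry is not None and entry[1] > now:
            return entry[0], self.name
        if isinstance(self._parent, CacheTier):
            body, served_by = self._parent.fetch(path, now)
        else:
            body, served_by = self._parent(path), "origin"
        # Fill this level on the way back so the next request stops here.
        self._objects[path] = (body, now + self._ttl)
        return body, served_by
```

Chaining `edge = CacheTier("edge", 60, CacheTier("regional", 3600, origin))` gives the behaviour described above: the first request travels to the origin, repeats within a minute are answered at the edge, and later requests are answered by the regional level without reaching the origin again.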
At Juspay, CloudFront is used to store payment page configs, static web content and merchant-specific payment method information, among others. As with the in-memory config systems, this is readily served content that changes infrequently.
Interestingly, Amazon applies similar techniques to physical inventory as well. The 'anticipatory package shipping' system, for which it holds a patent, in effect 'caches' high-demand items close to the customer for quick retrieval. These are items that have recently been popular in a given area, which are then staged in a warehouse in that area.
A lot of problems in real life find their parallel in computer science!
Use of Redis for databases
Juspay uses Redis for more than caching and session management. We actively use it as a database to absorb high traffic spikes and protect the relational database from overload.
The effectiveness of this approach comes partly from the simplicity of the key-value model compared with a relational database and its complexities. But it is also due in large measure to Redis keeping its data in RAM, whereas conventional relational databases keep theirs on disk. RAM is faster, more expensive memory placed closer to the CPU cores, and raw access times fall from the order of microseconds or more for disk to the order of nanoseconds.
Redis does come with some limitations: it is less flexible for querying than a relational database and more prone to data loss on restart. But with some workarounds for these, we were able to move a much larger share of the database reads and inserts/updates in a payment transaction to Redis. This slashed our query times and showed up in API latencies as well, bringing them down to as low as 30-40% of their previous levels.
A first principles idea
Juspay, as a company, is focussed on solving problems with a first-principles approach. While locality management is a CS 101 idea, it pervades thinking, system design and programming at Juspay. The above are only a few examples that directly or indirectly draw on the idea. More to come as we build more.