As developers and engineers, we are the busy bees of the information age, always striving to build faster solutions and improve the performance of existing ones. In this pursuit, we keep adding new tools to our tool belts that help us send information through the wire faster and more cost-efficiently.
One of these tools is a cache.
It is not only a great tool to improve a solution’s performance; a cache can also reduce the number of requests, the computing time and power they consume, and database costs.
Depending on the requirements, implementing a cache is not too complex a task. The emphasis is on “implementing”; using one well and developing failure-proof caching strategies is a different story.
Before I go ahead and dive into implementing a simple in-memory cache with JavaScript, I want to outline a demo scenario and a bit of the cache’s internals. If you are in a hurry, feel free to jump to the code part and skip the intro.
Scenario
For demo’s sake, we will keep it simple. Just imagine, we have a server that exposes an endpoint that allows users to request some data. Each request to that route triggers a call to a database to gather said data.
Since the database is hosted on the other side of the world and is basically running on a pocket calculator, each request will at least take 5 seconds — what an awful user experience.
Our stakeholder wants us to improve the situation. Specifically, to decrease the response time to provide information for the clients faster.
In this scenario, let’s pretend no one has ever encountered this problem before and no open source caching module is available. It is up to us to implement one ourselves.
Before we get our hands dirty coding, we define the requirements of our cache module:
- key-value: the endpoint is the key, the database’s response will be the value
- set: setting or updating a key and a value
- get: retrieving the value for a key
- delete: deleting a key along with its value
- flush: deleting all keys and their values
Such a module will enable us to store the database response after the initial request returns successfully. Then this cached response is served to all subsequent requests. That will reduce the waiting time for the clients dramatically and take some load off the database.
If the database’s data is updated, we will delete the cache entry for the affected endpoints. As another mechanism to keep the cache up to date, we will introduce a TTL (Time to Live) for cache entries. That way, the cache entry is deleted automatically, in case it outlives its given lifespan. That will help if a data update goes unnoticed.
At this point, we have a solid plan and a good understanding of what we want to achieve. It’s time to code.
Tiny Cache
I will use Deno for the implementation to skip the hassle of setting up a TypeScript environment myself.
# Project structure
./src/cache/
├── _delete.ts
├── _get.ts
├── _initialize.ts
├── _set.ts
├── cache.d.ts
├── cache.test.ts
└── index.ts
Initialization
The initialization is straightforward: call the factory function with the TTL in milliseconds. That’s it.
Internally this will create a Map, which will act as a key-value store for our cached data.
Each value in the store will contain the value itself and the timestamp of its creation.
interface ICacheEntry<T> {
  value: T;
  _created: number;
}
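The `_initialize` module itself is not shown in the article; since initialization just creates a Map, it could be as small as this (a sketch, not necessarily the original file):

```typescript
// _initialize.ts (sketch): creates the empty Map that backs the cache.
const _initialize = <T>(): Map<string, T> => new Map<string, T>();

export default _initialize;
```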
On initialization, an interval triggers that will check the cache entries periodically. An entry that outlived its TTL gets deleted.
import _initialize from "./_initialize.ts";
import _set from "./_set.ts";
import _get from "./_get.ts";
import _delete from "./_delete.ts";
export default <T>(ttl: number = 2000): ICache<T> => {
  const CACHE = _initialize<ICacheEntry<T>>();

  const CLEANUP = setInterval(
    () =>
      CACHE.forEach((entry, key) => {
        if (entry._created + ttl < Date.now()) {
          CACHE.delete(key);
        }
      }),
    ttl
  );

  return {
    set: (key: string, value: T) =>
      _set(CACHE, key, { value, _created: Date.now() }),
    get: (key: string) => {
      const result = _get(CACHE, key);
      if (result && result._created + ttl > Date.now()) {
        return result.value;
      }
      return undefined;
    },
    delete: (key: string) => _delete(CACHE, key),
    flush: () => CACHE.clear(),
    stop: () => {
      clearInterval(CLEANUP);
      CACHE.clear();
    },
  };
};
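The `ICache<T>` return type lives in `cache.d.ts`, which the article doesn’t list. Inferred from the object returned above, it could look like this (a sketch):

```typescript
// cache.d.ts (sketch): the public interface of the cache module,
// inferred from the methods returned by the factory.
export interface ICache<T> {
  set: (key: string, value: T) => void;
  get: (key: string) => T | undefined;
  delete: (key: string) => boolean;
  flush: () => void;
  stop: () => void;
}
```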
This interval will keep the event loop busy and prevent the application from shutting down. It is the stop method’s job to clear that interval and allow the application to exit.
Setting and getting data
These are the bread-and-butter operations of any cache. But that doesn’t mean they have to be complex.
// setting data
export default <T>(cache: Map<string, T>, key: string, value: T) =>
cache.set(key, value);
// getting data
export default <T>(cache: Map<string, T>, key: string) => cache.get(key);
The get function is wrapped with a TTL check during initialization to prevent it from returning a value that has expired but has not been deleted yet.
return {
  ...
  get: (key: string) => {
    const result = _get(CACHE, key);
    if (result && result._created + ttl > Date.now()) {
      return result.value;
    }
    return undefined;
  },
  ...
}
Deleting data and flushing the cache
Removing an entry from the cache or resetting it is just a matter of calling the delete method or clear method of the underlying Map.
// deleting data
export default <T>(cache: Map<string, T>, key: string) => cache.delete(key);
// flushing the cache
return {
  ...
  flush: () => CACHE.clear(),
  ...
};
Tiny cache in action
All that’s left now is to add our cache module to the given scenario and see if we satisfied our stakeholder’s requirements.
To do so, we create a mock for such an endpoint. It calls a function that needs at least 5 seconds to return.
import { ICache } from "../cache/cache.d.ts";
const realSlowDatabaseCall = (): Promise<string> =>
  new Promise((resolve) => setTimeout(() => resolve("Hello World"), 5000));

export default (cache: ICache<string>) =>
  new Promise((resolve) => {
    const result = cache.get("realSlowDatabaseCall");
    if (result) return resolve(result);

    realSlowDatabaseCall().then((result) => {
      cache.set("realSlowDatabaseCall", result);
      resolve(result);
    });
  });
Each time, the cache is checked for a valid entry before the slow function executes. If one is present, the endpoint sends its value as the response. Otherwise, the function executes, and the returned value is stored in the cache and then sent as the response.
import cache from "./cache/index.ts";
import fakeRouter from "./fakeRouter/index.ts";
// creating a cache instance with the default TTL
const tinyCache = cache<string>();

try {
  // Initial request - will not hit the cache and will therefore be slow
  console.time("runExpensiveRequest");
  const resultRequest = await fakeRouter(tinyCache);
  console.log(resultRequest);
  console.timeEnd("runExpensiveRequest");

  // Cache hit - the result is served from the cache
  console.time("runCachedRequest");
  const resultCache = await fakeRouter(tinyCache);
  console.log(resultCache);
  console.timeEnd("runCachedRequest");

  // Waiting for some time to pass
  await new Promise((resolve) => setTimeout(resolve, 6000));

  // Cache miss - the entry is past its TTL
  console.time("runTTLRequest");
  const resultTTL = await fakeRouter(tinyCache);
  console.log(resultTTL);
  console.timeEnd("runTTLRequest");
} finally {
  tinyCache.stop();
}
Hello World
runExpensiveRequest: 5004ms
Hello World
runCachedRequest: 0ms
Hello World
runTTLRequest: 5002ms
What a great success! Our module is working, and we were able to reduce the client’s waiting time. We also decreased the number of requests to the database and reduced the heavy lifting on the database side.
Hopefully, you got a better understanding of how a cache works, why it can be a great tool, and, importantly, that great tools don’t need to come with a lot of complexity.
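As a side note, the project structure lists a `cache.test.ts`. A self-contained sketch of the kind of TTL assertion such a test could make (the cache is inlined and reduced to set/get so the snippet runs on its own; it is not the article’s actual test file):

```typescript
// Inlined miniature of the cache, reduced to set/get,
// so the TTL behavior can be verified in isolation.
const createCache = <T>(ttl = 2000) => {
  const store = new Map<string, { value: T; _created: number }>();
  return {
    set: (key: string, value: T) =>
      store.set(key, { value, _created: Date.now() }),
    get: (key: string) => {
      const entry = store.get(key);
      return entry && entry._created + ttl > Date.now()
        ? entry.value
        : undefined;
    },
  };
};

const testCache = createCache<string>(50); // 50 ms TTL keeps the test fast
testCache.set("greeting", "Hello World");
console.log(testCache.get("greeting")); // "Hello World" - within the TTL
await new Promise((resolve) => setTimeout(resolve, 100));
console.log(testCache.get("greeting")); // undefined - past the TTL
```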
Going one step further
You could improve the above example by:
- generating keys with namespaces tied to models
- deleting namespaces when an update on a model is triggered
- promisifying the methods
- introducing immutability by returning copies of the values
- adding methods to retrieve all entries, the number of entries, etc.
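For example, the immutability point could be addressed by returning a copy of the stored value instead of a reference. A sketch using `structuredClone` (available in Deno and other modern runtimes; the names here are illustrative):

```typescript
// Sketch: a get that returns a defensive copy, so callers
// mutating the returned object can't corrupt the cached entry.
const store = new Map<string, { value: unknown; _created: number }>();

const get = <T>(key: string): T | undefined => {
  const entry = store.get(key);
  // structuredClone produces a deep copy of the cached value.
  return entry ? structuredClone(entry.value as T) : undefined;
};

store.set("user", { value: { name: "Ada" }, _created: Date.now() });
const user = get<{ name: string }>("user");
if (user) user.name = "changed"; // mutate the returned copy
console.log(get<{ name: string }>("user")); // { name: "Ada" } - cache untouched
```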
You can find the complete code here.
Let me end this excursion with a quote for you to meditate on:
There are only two hard things in Computer Science: cache invalidation and naming things.
— Phil Karlton