Feat: Multi-tiered cache for aws #699

Merged
merged 6 commits into from
Jan 15, 2025
Changes from 3 commits
5 changes: 5 additions & 0 deletions .changeset/perfect-coats-tell.md
@@ -0,0 +1,5 @@
---
"@opennextjs/aws": minor
---

Add a new multi-tiered incremental cache
172 changes: 172 additions & 0 deletions packages/open-next/src/overrides/incrementalCache/multi-tier-ddb-s3.ts
@@ -0,0 +1,172 @@
import type { CacheValue, IncrementalCache } from "types/overrides";
import { customFetchClient } from "utils/fetch";
import { debug } from "../../adapters/logger";
import S3Cache, { getAwsClient } from "./s3-lite";

// TTL for the local cache in milliseconds
const localCacheTTL = process.env.OPEN_NEXT_LOCAL_CACHE_TTL
  ? Number.parseInt(process.env.OPEN_NEXT_LOCAL_CACHE_TTL, 10)
  : 0;
// Maximum size of the local cache, in number of entries
const maxCacheSize = process.env.OPEN_NEXT_LOCAL_CACHE_SIZE
  ? Number.parseInt(process.env.OPEN_NEXT_LOCAL_CACHE_SIZE, 10)
  : 1000;

class LRUCache {
  private cache: Map<
    string,
    {
      value: CacheValue<boolean>;
      lastModified: number;
    }
  > = new Map();
  private maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  // isFetch is not used here, only used for typing
  get<T extends boolean = false>(key: string, isFetch?: T) {
    return this.cache.get(key) as {
      value: CacheValue<T>;
      lastModified: number;
    };
  }

  set(key: string, value: any) {
    // Evict the oldest inserted entry, but only when a new key would exceed maxSize.
    // Overwriting an existing key does not grow the map, so nothing is evicted then.
    if (!this.cache.has(key) && this.cache.size >= this.maxSize) {
      const firstKey = this.cache.keys().next().value;
      if (firstKey !== undefined) {
        this.cache.delete(firstKey);
      }
    }
    this.cache.set(key, value);
  }

  delete(key: string) {
    this.cache.delete(key);
  }
}
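
Note on eviction order: the class above never reorders entries on `get`, so the entry it evicts is the oldest inserted one rather than the least recently read one. If read recency matters, one possible variant is sketched below. `RecencyLRU` and its `has` method are hypothetical names for illustration and are not part of this PR.

```typescript
// Hypothetical sketch: a Map-backed cache that refreshes recency on read and write.
class RecencyLRU<V> {
  private entries = new Map<string, V>();
  private maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so the key moves to the end of the Map's iteration order.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    // Deleting first means an overwrite also counts as a fresh use.
    this.entries.delete(key);
    if (this.entries.size >= this.maxSize) {
      // The first key in iteration order is the least recently used one.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }

  // Membership check that does not affect recency.
  has(key: string): boolean {
    return this.entries.has(key);
  }
}

const lru = new RecencyLRU<number>(2);
lru.set("a", 1);
lru.set("b", 2);
lru.get("a"); // "a" becomes the most recently used entry
lru.set("c", 3); // evicts "b", not "a"
```

The trade-off is an extra delete and insert on every read, which may or may not be worth it for a small per-instance cache.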

const localCache = new LRUCache(maxCacheSize);

const awsFetch = (body: RequestInit["body"], type: "get" | "set" = "get") => {
  const { CACHE_BUCKET_REGION } = process.env;
  const client = getAwsClient();
  return customFetchClient(client)(
    `https://dynamodb.${CACHE_BUCKET_REGION}.amazonaws.com`,
    {
      method: "POST",
      headers: {
        "Content-Type": "application/x-amz-json-1.0",
        "X-Amz-Target": `DynamoDB_20120810.${
          type === "get" ? "GetItem" : "PutItem"
        }`,
      },
      body,
    },
  );
};

const buildDynamoKey = (key: string) => {
  const { NEXT_BUILD_ID } = process.env;
  return `__meta_${NEXT_BUILD_ID}_${key}`;
};

/**
 * This cache implementation uses a multi-tier cache with a local cache, a DynamoDB metadata cache and an S3 cache.
 * It uses the same DynamoDB table as the default tag cache and the same S3 bucket as the default incremental cache.
 * It will first check the local cache.
 * If the local cache is expired, it will check the DynamoDB metadata cache to see if the local cache is still valid.
 * Lastly it will check the S3 cache.
 */
const multiTierCache: IncrementalCache = {
  name: "multi-tier-ddb-s3",
  async get(key, isFetch) {
    // First we check the local cache
    const localCacheEntry = localCache.get(key, isFetch);
    if (localCacheEntry) {
      if (Date.now() - localCacheEntry.lastModified < localCacheTTL) {
        debug("Using local cache without checking ddb");
        return localCacheEntry;
      }
      try {
        // Here we'll check ddb metadata to see if the local cache is still valid
        const { CACHE_DYNAMO_TABLE } = process.env;
        const result = await awsFetch(
          JSON.stringify({
            TableName: CACHE_DYNAMO_TABLE,
            Key: {
              path: { S: buildDynamoKey(key) },
              tag: { S: buildDynamoKey(key) },
            },
          }),
        );
        if (result.status === 200) {
          const data = await result.json();
          const hasBeenDeleted = data.Item?.deleted?.BOOL;
          if (hasBeenDeleted) {
            localCache.delete(key);
            return { value: undefined, lastModified: 0 };
          }
          // If the metadata is older than the local cache, we can use the local cache
          // If it's not found we assume that no write has been done yet and we can use the local cache
          const lastModified = data.Item?.revalidatedAt?.N
            ? Number.parseInt(data.Item.revalidatedAt.N, 10)
            : 0;
          if (lastModified <= localCacheEntry.lastModified) {
            debug("Using local cache after checking ddb");
            return localCacheEntry;
          }
        }
      } catch (e) {
        debug("Failed to get metadata from ddb", e);
      }
    }
    // No usable local entry (missing, stale, or the ddb check failed): fall back to S3
    const result = await S3Cache.get(key, isFetch);
    if (result.value) {
      localCache.set(key, {
        value: result.value,
        lastModified: result.lastModified ?? Date.now(),
      });
    }
    return result;
  },
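  async set(key, value, isFetch) {
    const revalidatedAt = Date.now();
    // S3 is written before DDB on purpose: if the DDB write then fails,
    // instances without a local cache entry still read the new data from S3.
    await S3Cache.set(key, value, isFetch);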
Contributor

Use Promise.allSettled() to parallelize?

Contributor Author

This is on purpose, actually. Given how it works, we have three choices for handling a write failure:

  • Do as it is here: set on S3, then set on DDB. If the DDB write fails, instances that don't have a local cache entry will work as expected (but those with a local cache entry will still serve outdated data).
  • Write to DDB first and then to S3. If the S3 write fails, new instances will fetch outdated data, and existing instances will fetch outdated data as well.
  • Use allSettled, but then the behavior is unpredictable if one of the two writes fails.

I should have added a comment explaining this. One other thing we could do is let users choose the behavior they want.

I'll update and merge the PR tomorrow, in case we should choose another option.

Contributor

Oh, you're right 👍
Thanks for explaining it so clearly.
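
To make the trade-off in this thread concrete, here is a minimal sketch using in-memory stand-ins. The names `Store`, `write`, `setSequential`, `setParallel` and `demo` are hypothetical and only simulate the two remote writes; none of this is the PR's code.

```typescript
// Hypothetical in-memory stand-ins for the S3 body store and the DDB metadata table.
type Store = Map<string, string>;

async function write(store: Store, key: string, value: string, fail: boolean): Promise<void> {
  if (fail) throw new Error("simulated write failure");
  store.set(key, value);
}

// Sequential: body first, metadata second. If the metadata write fails,
// the body store is already updated and the error reaches the caller.
async function setSequential(body: Store, meta: Store, key: string, value: string, failMeta: boolean): Promise<void> {
  await write(body, key, value, false);
  await write(meta, key, "revalidated", failMeta);
}

// Parallel: both writes start together. allSettled never rejects, so the
// caller has to inspect the results to learn which side failed.
async function setParallel(body: Store, meta: Store, key: string, value: string, failBody: boolean, failMeta: boolean) {
  return Promise.allSettled([
    write(body, key, value, failBody),
    write(meta, key, "revalidated", failMeta),
  ]);
}

async function demo() {
  const body: Store = new Map([["page", "old"]]);
  const meta: Store = new Map();
  let threw = false;
  try {
    await setSequential(body, meta, "page", "new", true);
  } catch {
    threw = true;
  }
  const settled = await setParallel(new Map(), new Map(), "page", "new", true, false);
  return {
    threw,
    bodyValue: body.get("page"),
    metaWritten: meta.has("page"),
    statuses: settled.map((r) => r.status),
  };
}
```

In the sequential case the failure mode is always the same one (body updated, metadata not), which matches the first option described above. With `allSettled`, either side can be the one that failed.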

    await awsFetch(
      JSON.stringify({
        TableName: process.env.CACHE_DYNAMO_TABLE,
        Item: {
          tag: { S: buildDynamoKey(key) },
          path: { S: buildDynamoKey(key) },
          revalidatedAt: { N: String(revalidatedAt) },
        },
      }),
      "set",
    );
    localCache.set(key, {
      value,
      lastModified: revalidatedAt,
    });
  },
  async delete(key) {
    await S3Cache.delete(key);
    await awsFetch(
      JSON.stringify({
        TableName: process.env.CACHE_DYNAMO_TABLE,
        Item: {
          tag: { S: buildDynamoKey(key) },
          path: { S: buildDynamoKey(key) },
          deleted: { BOOL: true },
        },
      }),
      "set",
    );
    localCache.delete(key);
  },
};

export default multiTierCache;
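
The lookup order described in the doc comment of this file can be summarized with a small, simplified sketch. It is synchronous for brevity and leaves out the `deleted` flag and error handling. `Tiers`, `tieredGet`, `getRevalidatedAt` and `getFromStore` are hypothetical stand-ins for the DynamoDB and S3 calls, not names from the PR.

```typescript
type Entry = { value: string; lastModified: number };
type Source = "local" | "local-validated" | "store";

interface Tiers {
  local: Map<string, Entry>;
  // Stand-in for the DynamoDB metadata lookup (0 would mean "no write recorded").
  getRevalidatedAt: (key: string) => number;
  // Stand-in for the S3 read.
  getFromStore: (key: string) => Entry | undefined;
}

function tieredGet(
  tiers: Tiers,
  key: string,
  ttlMs: number,
  now: number,
): { entry: Entry | undefined; source: Source } {
  const cached = tiers.local.get(key);
  if (cached) {
    // Tier 1: a local entry younger than the TTL is served without any remote call.
    if (now - cached.lastModified < ttlMs) {
      return { entry: cached, source: "local" };
    }
    // Tier 2: the entry is past its TTL, so ask the metadata tier whether a newer write exists.
    if (tiers.getRevalidatedAt(key) <= cached.lastModified) {
      return { entry: cached, source: "local-validated" };
    }
  }
  // Tier 3: read from the backing store and refresh the local tier.
  const fresh = tiers.getFromStore(key);
  if (fresh) tiers.local.set(key, fresh);
  return { entry: fresh, source: "store" };
}

let latestWrite = 50;
const tiers: Tiers = {
  local: new Map([["page", { value: "v1", lastModified: 100 }]]),
  getRevalidatedAt: () => latestWrite,
  getFromStore: () => ({ value: "v2", lastModified: 200 }),
};

const withinTtl = tieredGet(tiers, "page", 500, 300).source;
const validated = tieredGet(tiers, "page", 500, 1000).source;
latestWrite = 200; // simulate another instance writing a newer version
const refetched = tieredGet(tiers, "page", 500, 1000).source;
```

With the default `OPEN_NEXT_LOCAL_CACHE_TTL` of 0 in this file, the first branch never matches, so every hit on a local entry goes through the metadata check.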
@@ -11,7 +11,7 @@ import { parseNumberFromEnv } from "../../adapters/util";

let awsClient: AwsClient | null = null;

-const getAwsClient = () => {
+export const getAwsClient = () => {
  const { CACHE_BUCKET_REGION } = process.env;
  if (awsClient) {
    return awsClient;
7 changes: 6 additions & 1 deletion packages/open-next/src/types/open-next.ts
@@ -136,7 +136,12 @@ export interface MiddlewareResult

export type IncludedQueue = "sqs" | "sqs-lite" | "direct" | "dummy";

-export type IncludedIncrementalCache = "s3" | "s3-lite" | "fs-dev" | "dummy";
+export type IncludedIncrementalCache =
+  | "s3"
+  | "s3-lite"
+  | "multi-tier-ddb-s3"
+  | "fs-dev"
+  | "dummy";

export type IncludedTagCache =
| "dynamodb"
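
For reference, selecting the new override might look like the sketch below. It assumes the usual `default.override.incrementalCache` slot of an `open-next.config.ts`. `SketchConfig` is a hypothetical, trimmed-down stand-in for the real config type, the union is mirrored locally so the snippet compiles on its own, and the environment values are examples only.

```typescript
// Local mirror of the union from the diff above, so this sketch is self-contained.
type IncludedIncrementalCache =
  | "s3"
  | "s3-lite"
  | "multi-tier-ddb-s3"
  | "fs-dev"
  | "dummy";

// Assumed, simplified shape; the real config type has many more fields.
interface SketchConfig {
  default: {
    override: {
      incrementalCache: IncludedIncrementalCache;
    };
  };
}

const config: SketchConfig = {
  default: {
    override: {
      incrementalCache: "multi-tier-ddb-s3",
    },
  },
};

// Environment variables read by multi-tier-ddb-s3.ts (names from the diff, values illustrative).
const exampleEnv = {
  OPEN_NEXT_LOCAL_CACHE_TTL: "5000", // milliseconds
  OPEN_NEXT_LOCAL_CACHE_SIZE: "500", // number of entries
};
```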