Gifabol - Caching for airgapped solutions #10736

mattkrick · 2025-01-24T01:37:45Z

for all deploys, it'd be nice to have the images hosted on our platform vs. hosted on tenor.
for airgapped solutions, we need to create a solution where folks can search.

so, for prod:

when a query of '' comes in & we fetch the featured gifs, we need to save those gifs to S3 as well as write them to a table
we can write them to a bucket subdir so instead of store or build, we'll have gifabol
In PG we need 4 tables: GifabolGif, GifabolTag, GifabolGifTag, GifabolQueryCache. GifabolGif has id, description, urlNano, urlTiny, urlOriginal. GifabolTag is id, tag TEXT UNIQUE. GifabolGifTag is a cross table with a compound PK.
It may also be advantageous to cache the search results without resorting to tags. for example, if someone searches for "food" then we'd have a table with query, startCursor, endCursor, result, cachedAt. The result would be a TEXT[] of the GifabolGif table IDs. This gets tricky with pagination since we don't have per-item cursors, just a next string for the request, which will be the endCursor for that batch and the startCursor for the next batch. alternatively, we could denormalize it to query, gifId, rank, endCursor. this would make it easy to read. to write, we'd need to know how to create rank. rank would be the order of results as they come back from tenor, e.g. 1-20 if there was no start cursor. we'd store the next value as the endCursor, since that's what the following request will send as after. that way, when adding new values for a particular query, we'd query select * from GifabolQueryCache where endCursor = $after order by rank desc limit 1, where $after is the value that came in via graphql (after is the start cursor, next is the end cursor), and keep ranking from the row it returns. we can even index on endCursor where it is not null & only put the cursor on the row with the biggest rank. when a query comes in with no start cursor & overwrites it, it won't push the items down, it'll just overwrite the first page. that way we could still query for the first n items in 1 query. when the 2nd page results come in they'll go right after the first. if a 3rd never comes in, then the ranks will still hold true. there may be dupes, but who cares if it's in the later pages.
if the query cache is empty, then we'll need to search by tag. first, we'll search by exact match. then, we'll search by prefix. foo% => food.
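a rough sketch of the denormalized rank idea, modeled in memory instead of PG so the bookkeeping is easy to follow. the row shape mirrors the query, gifId, rank, endCursor columns above; the function names (getStartRank, writePage, readFirstN) are made up for illustration, and scoping the cursor lookup to the same query is an assumption that isn't in the SQL above.

```typescript
// Hypothetical row shape for the denormalized GifabolQueryCache table.
interface QueryCacheRow {
  query: string
  gifId: string
  rank: number
  // Only set on the highest-ranked row of each page (Tenor's `next` value).
  endCursor: string | null
}

// Mirrors: select * from GifabolQueryCache where endCursor = $after
//          order by rank desc limit 1
// Returns the rank the new page continues from, or null if the cursor is unknown.
const getStartRank = (
  rows: QueryCacheRow[],
  query: string,
  after: string | null
): number | null => {
  if (!after) return 0 // no start cursor: this is the first page
  const match = rows
    .filter((row) => row.query === query && row.endCursor === after)
    .sort((a, b) => b.rank - a.rank)[0]
  return match ? match.rank : null
}

// Writes one page of results. Existing rows with the same (query, rank) are
// overwritten rather than pushed down, so ranks of later pages stay valid.
const writePage = (
  rows: QueryCacheRow[],
  query: string,
  after: string | null,
  gifIds: string[],
  next: string | null
): QueryCacheRow[] => {
  const startRank = getStartRank(rows, query, after)
  if (startRank === null) return rows // unknown cursor: skip caching this page
  const page = gifIds.map((gifId, idx) => ({
    query,
    gifId,
    rank: startRank + idx + 1,
    endCursor: idx === gifIds.length - 1 ? next : null
  }))
  const pageRanks = new Set(page.map((row) => row.rank))
  const kept = rows.filter((row) => row.query !== query || !pageRanks.has(row.rank))
  return [...kept, ...page]
}

// Reads the first n cached gif IDs for a query in rank order.
const readFirstN = (rows: QueryCacheRow[], query: string, n: number): string[] =>
  rows
    .filter((row) => row.query === query)
    .sort((a, b) => a.rank - b.rank)
    .slice(0, n)
    .map((row) => row.gifId)
```

in PG the overwrite step would probably be an upsert on (query, rank), but that's a design guess, not something decided here.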

for all deploys, if we don't like URLs pointing to tenor:

we can't write to S3 faster than we can send a URL to the client, so that means the client is going to get a tenor URL.
when they pick a gif, we can upload that to our own S3 like we do for embedded URLs. embedUserAsset is gonna check out the size, verify that it's a picture, and store it under the User subdir in S3. Ideally, we would store it in the gifabol subdir. By the time they make a selection, it might already be there. if there's a deterministic way to go from the tenor URL to our S3 url, then we can just use that without an extra server call. basically see if the URL starts with the CDN_BASE_URL. if it does, use that. if not, then convert it by using the ID of the gif and the requested size.
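the URL check could look something like this. toHostedGifUrl and the gifabol/<id>/<size>.gif key layout are assumptions for illustration; the real layout would be whatever we settle on for the gifabol subdir. the gif ID is passed in separately so nothing here depends on how tenor formats its URLs.

```typescript
// Sizes matching the urlNano, urlTiny, urlOriginal columns.
type GifSize = 'nano' | 'tiny' | 'original'

// Returns a URL on our CDN for the given gif.
// If the URL already points at our CDN, it is returned untouched.
// Otherwise it is derived from the gif ID and requested size.
// ASSUMPTION: objects live at <cdnBaseUrl>/gifabol/<gifId>/<size>.gif
const toHostedGifUrl = (
  url: string,
  gifId: string,
  size: GifSize,
  cdnBaseUrl: string
): string => {
  const base = cdnBaseUrl.replace(/\/+$/, '') // tolerate a trailing slash
  if (url.startsWith(`${base}/`)) return url
  return `${base}/gifabol/${encodeURIComponent(gifId)}/${size}.gif`
}
```

since the mapping is deterministic, the client can compute it without an extra server call, though the object may not exist in S3 yet if the upload hasn't finished.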
