Caching aspects #9

Open
nyurik opened this issue Dec 24, 2023 · 1 comment

nyurik commented Dec 24, 2023

When working with PMTiles archives, especially remote ones, one may want to cache directory access. I implemented this for the pmtiles crate in stadiamaps/pmtiles-rs#24; you may want to add this capability too.

The basic idea: find_entry_rec, the recursive directory entry searcher, always has a cache instance, and it first tries to get the entry from a cached directory. That instance can be a no-op (no caching, always returns a cache miss), in which case the searcher downloads the directory and finds the entry in it before handing the directory off to the cache.

This approach allows multiple pmtiles instances to share a common cache. One extra thing I added today is the ability to get the size of a cached directory, so the cache can tell how big it is and evict appropriately.
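
To make the idea above concrete, here is a minimal sketch of a pluggable directory cache with a no-op implementation. Every name in it (`DirectoryCache`, `NoCache`, `HashMapCache`, `find_entry`, `size_in_bytes`) is hypothetical and chosen for illustration; this is not the actual API of either crate. The lookup is also simplified to a single directory level with exact tile-id matches, whereas a real searcher would recurse into leaf directories.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified directory entry (real entries carry more fields).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Entry {
    pub tile_id: u64,
    pub offset: u64,
    pub length: u32,
}

/// A parsed directory; entries are assumed to be sorted by `tile_id`.
#[derive(Clone, Debug, Default)]
pub struct Directory {
    pub entries: Vec<Entry>,
}

impl Directory {
    /// Exact-match lookup (a simplification of a real directory search).
    pub fn find(&self, tile_id: u64) -> Option<Entry> {
        self.entries
            .binary_search_by_key(&tile_id, |e| e.tile_id)
            .ok()
            .map(|i| self.entries[i])
    }

    /// Approximate in-memory size, so a cache can decide when to evict.
    pub fn size_in_bytes(&self) -> usize {
        self.entries.len() * std::mem::size_of::<Entry>()
    }
}

/// Result of asking the cache about a directory.
pub enum CacheLookup {
    /// The directory is not cached; the caller has to fetch it.
    NotCached,
    /// The directory is cached; the entry may or may not exist in it.
    Found(Option<Entry>),
}

pub trait DirectoryCache {
    fn get(&self, dir_offset: u64, tile_id: u64) -> CacheLookup;
    fn insert(&mut self, dir_offset: u64, dir: Directory);
}

/// No-op cache: every lookup is a miss and inserts are dropped.
pub struct NoCache;

impl DirectoryCache for NoCache {
    fn get(&self, _dir_offset: u64, _tile_id: u64) -> CacheLookup {
        CacheLookup::NotCached
    }
    fn insert(&mut self, _dir_offset: u64, _dir: Directory) {}
}

/// Unbounded in-memory cache keyed by directory offset (no eviction here).
#[derive(Default)]
pub struct HashMapCache {
    dirs: HashMap<u64, Directory>,
}

impl HashMapCache {
    /// Sum of the sizes of all cached directories.
    pub fn total_size(&self) -> usize {
        self.dirs.values().map(Directory::size_in_bytes).sum()
    }
}

impl DirectoryCache for HashMapCache {
    fn get(&self, dir_offset: u64, tile_id: u64) -> CacheLookup {
        match self.dirs.get(&dir_offset) {
            Some(dir) => CacheLookup::Found(dir.find(tile_id)),
            None => CacheLookup::NotCached,
        }
    }
    fn insert(&mut self, dir_offset: u64, dir: Directory) {
        self.dirs.insert(dir_offset, dir);
    }
}

/// Ask the cache first; on a miss, fetch the directory, search it,
/// and hand it to the cache before returning the entry.
pub fn find_entry<C: DirectoryCache>(
    cache: &mut C,
    dir_offset: u64,
    tile_id: u64,
    fetch: impl FnOnce(u64) -> Directory,
) -> Option<Entry> {
    if let CacheLookup::Found(entry) = cache.get(dir_offset, tile_id) {
        return entry;
    }
    let dir = fetch(dir_offset);
    let entry = dir.find(tile_id);
    cache.insert(dir_offset, dir);
    entry
}
```

With `NoCache`, every call to `find_entry` invokes `fetch`; with `HashMapCache`, only the first lookup for a given directory offset does. Sharing one cache between several archive readers would additionally need interior mutability or a lock around the cache, which this sketch leaves out.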

DerZade (Member) commented Jan 1, 2024

Thanks for the idea, but that wouldn't quite work with the way this crate is currently implemented. Here, all directories are always read in their entirety when opening the archive. The trade-off I'm opting for is that the initial "opening" of the PMTiles archive is more expensive, but retrieving the contents of an individual tile is less complicated, because it is already known exactly where each tile is located in the archive. See the following code:

pmtiles-rs/src/pmtiles.rs

Lines 267 to 283 in a5e7f94

    let tiles = add_await([read_directories(
        &mut input,
        header.internal_compression,
        (header.root_directory_offset, header.root_directory_length),
        header.leaf_directories_offset,
        tiles_filter_range,
    )])?;
    let mut tile_manager = TileManager::new(Some(input));
    for (tile_id, info) in tiles {
        tile_manager.add_offset_tile(
            tile_id,
            header.tile_data_offset + info.offset,
            info.length,
        );
    }

There is a way to load only a specific range of tiles when opening an archive (which may allow skipping entire leaf directories when parsing), but this range cannot be extended at a later point. That is only really helpful if just a single tile or a couple of tiles are needed.
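
As a rough illustration of the eager approach with an optional range filter, the sketch below builds an in-memory index of absolute tile locations once, keeping only the tile ids inside a given range. `TileInfo` and `build_index` are hypothetical names, not this crate's API, and the sketch filters an already-parsed entry list rather than skipping leaf directories during parsing.

```rust
use std::collections::HashMap;
use std::ops::RangeBounds;

/// Hypothetical directory entry; `offset` is relative to the tile data section.
pub struct TileInfo {
    pub tile_id: u64,
    pub offset: u64,
    pub length: u32,
}

/// Build a map from tile id to (absolute offset, length),
/// keeping only the tile ids contained in `filter`.
pub fn build_index(
    entries: &[TileInfo],
    tile_data_offset: u64,
    filter: impl RangeBounds<u64>,
) -> HashMap<u64, (u64, u32)> {
    entries
        .iter()
        .filter(|e| filter.contains(&e.tile_id))
        .map(|e| (e.tile_id, (tile_data_offset + e.offset, e.length)))
        .collect()
}
```

Passing `..` as the filter indexes everything, while something like `1..=2` keeps only those ids. Tiles outside the range are simply absent from the index afterwards, which is why the range cannot be widened later without reading the directories again.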

I'm not saying that this is the best way to go, just that this is how it is currently implemented.
