mpk: optimize layout of protected stripes, again (bytecodealliance#7622)
* mpk: optimize layout of protected stripes, again

This is another attempt at bytecodealliance#7603, attempting to reduce the slab layout
sizes of MPK-protected stripes. While experimenting with the limits of
MPK-protected memory pools, @alexcrichton and I discovered that the
current slab layout calculations were too conservative. This meant that
the memory pool could not pack in as many memories as it should have
been able to: we expected, but did not see, ~15x more memory slots
than in non-MPK memory pools.

This change brings together several fixes:
- it more aggressively divides up the stripes (as in b212152)
- it eliminates an extra stripe (as in 8813a30)
- it replaces some uses of `checked_*` with `saturating_*` (as in
  fb22a20)
- it improves some documentation
- and, crucially, it reports back a larger value for
  `memory_and_guard_size`

The failures observed with bytecodealliance#7603 when run with MPK
(`WASMTIME_TEST_FORCE_MPK=1 cargo test`) were due to `Store::wasm_fault`
not being able to identify which memory an OOB address belonged to.
This is because the `MemoryPool` was underreporting the size of the
region in which OOB accesses would fault. The correct value is provided
by the new `SlabLayout::bytes_to_next_stripe_slot`: any OOB access
within that larger region must fault because (1) the other stripes have
different protection keys and (2) a `Store` can only use one protection
key. We also use (2) to guarantee that `Store::wasm_fault` will be able
to calculate the Wasm address from the raw address.

This change also provides a new `traps` test that will reproduce the
failures from bytecodealliance#7603; if we observe `SIGABRT` from that test, it will be
a regression.

* fix: make test x86-specific
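The key guarantee described above can be condensed into a small standalone sketch. This is hypothetical code with simplified fields mirroring `SlabLayout`, not the actual Wasmtime implementation:

```rust
// Minimal sketch of the striped-slab guarantee: slots of the same stripe
// are `num_stripes` slots apart, so the faulting region reported for each
// memory is the full distance to the next same-stripe slot.
struct SlabLayout {
    slot_bytes: usize,
    num_stripes: usize,
}

impl SlabLayout {
    /// Any OOB access within this many bytes of a slot's base must fault:
    /// (1) the intervening slots belong to other protection keys and
    /// (2) a store only ever uses one key.
    fn bytes_to_next_stripe_slot(&self) -> usize {
        self.slot_bytes * self.num_stripes
    }
}

fn main() {
    // Three stripes of 4 MiB slots: each memory can report a 12 MiB
    // faulting region even though its own slot is only 4 MiB.
    let layout = SlabLayout { slot_bytes: 4 << 20, num_stripes: 3 };
    assert_eq!(layout.bytes_to_next_stripe_slot(), 12 << 20);
}
```

Reporting this larger value as `memory_and_guard_size` is what lets `Store::wasm_fault` map a raw faulting address back to a Wasm address.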
abrown authored Dec 1, 2023
1 parent 3386538 commit 03a14ac
Showing 3 changed files with 99 additions and 56 deletions.
@@ -6,3 +6,4 @@
# everyone who runs the test benefits from these saved cases.
cc 696808084287d5d58b85c60c4720227ab4dd83ada7be6841a67162023aaf4914 # shrinks to c = SlabConstraints { max_memory_bytes: 0, num_memory_slots: 1, num_pkeys_available: 0, guard_bytes: 9223372036854775808 }
cc cf9f6c36659f7f56ed8ea646e8c699cbf46708cef6911cdd376418ad69ea1388 # shrinks to c = SlabConstraints { max_memory_bytes: 14161452635954640438, num_memory_slots: 0, num_pkeys_available: 0, guard_bytes: 4285291437754911178 }
cc 58f42405c4fbfb4b464950f372995d4d08a77d6884335e38c2e68d590cacf7d8 # shrinks to c = SlabConstraints { expected_slot_bytes: 0, max_memory_bytes: 8483846582232735745, num_slots: 0, num_pkeys_available: 0, guard_bytes: 0, guard_before_slots: false }
109 changes: 56 additions & 53 deletions crates/runtime/src/instance/allocator/pooling/memory_pool.rs
@@ -92,21 +92,21 @@ struct Stripe {

/// Represents a pool of WebAssembly linear memories.
///
/// A linear memory is divided into accessible pages and guard pages.
///
/// A diagram for this struct's fields is:
/// A linear memory is divided into accessible pages and guard pages. A memory
/// pool contains linear memories: each memory occupies a slot in an
/// allocated slab (i.e., `mapping`):
///
/// ```text
/// layout.max_memory_bytes layout.slot_bytes
/// | |
/// <-----+----> <-----------+----------->
/// +-----------+------------+-----------+ +--------+---+-----------+-----------+
/// | PROT_NONE | | PROT_NONE | ... | | PROT_NONE | PROT_NONE |
/// +-----------+------------+-----------+ +--------+---+-----------+-----------+
/// | |<------------------+----------------------------------> <----+---->
/// \ | \ |
/// mapping | `layout.num_slots` memories layout.post_slab_guard_size
/// /
/// ◄─────┴────► ◄───────────┴──────────►
/// ┌───────────┬────────────┬───────────┐ ┌───────────┬───────────┬───────────┐
/// | PROT_NONE | | PROT_NONE | ... | | PROT_NONE | PROT_NONE |
/// └───────────┴────────────┴───────────┘ └───────────┴───────────┴───────────┘
/// | |◄──────────────────┬─────────────────────────────────► ◄────┬────►
/// | | | |
/// mapping | `layout.num_slots` memories layout.post_slab_guard_size
/// |
/// layout.pre_slab_guard_size
/// ```
#[derive(Debug)]
@@ -275,7 +275,7 @@ impl MemoryPool {
{
match plan.style {
MemoryStyle::Static { bound } => {
if u64::try_from(self.layout.pages_to_next_stripe_slot()).unwrap() < bound {
if self.layout.pages_to_next_stripe_slot() < bound {
bail!(
"memory size allocated per-memory is too small to \
satisfy static bound of {bound:#x} pages"
@@ -341,9 +341,7 @@ impl MemoryPool {
// but double-check here to be sure.
match memory_plan.style {
MemoryStyle::Static { bound } => {
let pages_to_next_stripe_slot =
u64::try_from(self.layout.pages_to_next_stripe_slot()).unwrap();
assert!(bound <= pages_to_next_stripe_slot);
assert!(bound <= self.layout.pages_to_next_stripe_slot());
}
MemoryStyle::Dynamic { .. } => {}
}
@@ -375,7 +373,7 @@ impl MemoryPool {
base_ptr,
base_capacity,
slot,
self.layout.slot_bytes,
self.layout.bytes_to_next_stripe_slot(),
unsafe { &mut *request.store.get().unwrap() },
)
})() {
@@ -595,17 +593,16 @@ struct SlabLayout {
/// guard region after the memory to catch OOB access. On these guard
/// regions, note that:
/// - users can configure how aggressively (or not) to elide bounds checks
/// via `Config::static_memory_guard_size`
/// via `Config::static_memory_guard_size` (see also:
/// `memory_and_guard_size`)
/// - memory protection keys can compress the size of the guard region by
/// placing slots from a different key (i.e., a stripe) in the guard
/// region; this means the slot itself can be smaller and we can allocate
/// more of them.
slot_bytes: usize,
// The maximum size that can become accessible, in bytes, for each linear
// memory. Guaranteed to be a whole number of wasm pages.
/// The maximum size that can become accessible, in bytes, for each linear
/// memory. Guaranteed to be a whole number of Wasm pages.
max_memory_bytes: usize,
/// The total number of bytes to allocate for the memory pool slab.
// total_slab_bytes: usize,
/// If necessary, the number of bytes to reserve as a guard region at the
/// beginning of the slab.
pre_slab_guard_bytes: usize,
@@ -632,7 +629,7 @@ impl SlabLayout {
.ok_or_else(|| anyhow!("total size of memory reservation exceeds addressable memory"))
}

/// Returns the number of Wasm pages from the beginning of one slot to the
/// Returns the number of Wasm bytes from the beginning of one slot to the
/// next slot in the same stripe--this is the striped equivalent of
/// `static_memory_bound`. Recall that between slots of the same stripe we
/// will see a slot from every other stripe.
@@ -641,13 +638,20 @@
/// from the beginning of slot 1 to slot 4, which are of the same stripe:
///
/// ```text
/// ◄────────────────────►
/// ┌────────┬──────┬──────┬────────┬───┐
/// │*slot 1*│slot 2│slot 3│*slot 4*│...|
/// └────────┴──────┴──────┴────────┴───┘
/// ```
fn pages_to_next_stripe_slot(&self) -> usize {
let slot_pages = self.slot_bytes / WASM_PAGE_SIZE as usize;
slot_pages * self.num_stripes
fn bytes_to_next_stripe_slot(&self) -> usize {
self.slot_bytes * self.num_stripes
}

/// Same as `bytes_to_next_stripe_slot` but in Wasm pages.
fn pages_to_next_stripe_slot(&self) -> u64 {
let bytes = self.bytes_to_next_stripe_slot();
let pages = bytes / WASM_PAGE_SIZE as usize;
u64::try_from(pages).unwrap()
}
}

@@ -671,21 +675,20 @@ fn calculate(constraints: &SlabConstraints) -> Result<SlabLayout> {
// To calculate the slot size, we start with the default configured size and
// attempt to chip away at this via MPK protection. Note here how we begin
// to define a slot as "all of the memory and guard region."
let slot_bytes = expected_slot_bytes
let faulting_region_bytes = expected_slot_bytes
.max(max_memory_bytes)
.checked_add(guard_bytes)
.unwrap_or(usize::MAX);
.saturating_add(guard_bytes);

let (num_stripes, slot_bytes) = if guard_bytes == 0 || max_memory_bytes == 0 || num_slots == 0 {
        // In the uncommon case where the memory/guard regions are empty or
        // no slots are needed, we will not need any stripes: we just lay out
        // the slots back-to-back using a single stripe.
(1, slot_bytes)
(1, faulting_region_bytes)
} else if num_pkeys_available < 2 {
// If we do not have enough protection keys to stripe the memory, we do
// the same. We can't elide any of the guard bytes because we aren't
// overlapping guard regions with other stripes...
(1, slot_bytes)
(1, faulting_region_bytes)
} else {
// ...but if we can create at least two stripes, we can use another
// stripe (i.e., a different pkey) as this slot's guard region--this
@@ -702,21 +705,22 @@ fn calculate(constraints: &SlabConstraints) -> Result<SlabLayout> {
// pool is configured with only three slots (`num_memory_slots =
// 3`), we will run into failures if we attempt to set up more than
// three stripes.
let needed_num_stripes =
slot_bytes / max_memory_bytes + usize::from(slot_bytes % max_memory_bytes != 0) + 1;
let needed_num_stripes = faulting_region_bytes / max_memory_bytes
+ usize::from(faulting_region_bytes % max_memory_bytes != 0);
assert!(needed_num_stripes > 0);
let num_stripes = num_pkeys_available.min(needed_num_stripes).min(num_slots);

// Next, we try to reduce the slot size by "overlapping" the
// stripes: we can make slot `n` smaller since we know that slot
// `n+1` and following are in different stripes and will look just
// like `PROT_NONE` memory.
let next_slots_overlapping_bytes = max_memory_bytes
.checked_mul(num_stripes - 1)
.unwrap_or(usize::MAX);
let needed_slot_bytes = slot_bytes
.checked_sub(next_slots_overlapping_bytes)
.unwrap_or(0)
// Next, we try to reduce the slot size by "overlapping" the stripes: we
// can make slot `n` smaller since we know that slot `n+1` and following
// are in different stripes and will look just like `PROT_NONE` memory.
// Recall that codegen expects a guarantee that at least
// `faulting_region_bytes` will catch OOB accesses via segfaults.
let needed_slot_bytes = faulting_region_bytes
.checked_div(num_stripes)
.unwrap_or(faulting_region_bytes)
.max(max_memory_bytes);
assert!(needed_slot_bytes >= max_memory_bytes);

(num_stripes, needed_slot_bytes)
};
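The branch above can be summarized as a standalone sketch. This is a hypothetical function, simplified from the real code, and it assumes the striped case already selected by the surrounding `if` (at least two pkeys available, nonzero `max_memory_bytes` and `num_slots`):

```rust
// Hypothetical sketch of the stripe-count/slot-size selection; not the
// actual Wasmtime code. Assumes num_pkeys_available >= 2,
// max_memory_bytes > 0, and num_slots > 0.
fn striped_layout(
    faulting_region_bytes: usize,
    max_memory_bytes: usize,
    num_pkeys_available: usize,
    num_slots: usize,
) -> (usize, usize) {
    // Ceiling division: how many stripes are needed so that slots of the
    // other stripes cover the whole faulting region.
    let needed_num_stripes = faulting_region_bytes / max_memory_bytes
        + usize::from(faulting_region_bytes % max_memory_bytes != 0);
    // We can never use more stripes than we have pkeys or slots.
    let num_stripes = num_pkeys_available.min(needed_num_stripes).min(num_slots);
    // Overlapping stripes lets each slot shrink to its share of the
    // faulting region, but never below the accessible memory size.
    let slot_bytes = (faulting_region_bytes / num_stripes).max(max_memory_bytes);
    (num_stripes, slot_bytes)
}

fn main() {
    // 4 MiB memories with an 8 MiB guard: a 12 MiB faulting region split
    // across 3 stripes needs only 4 MiB per slot.
    let (stripes, slot) = striped_layout(12 << 20, 4 << 20, 15, 100);
    assert_eq!((stripes, slot), (3, 4 << 20));
    // The invariant: same-stripe slots still span the faulting region.
    assert!(slot * stripes >= 12 << 20);
}
```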

@@ -728,13 +732,11 @@ fn calculate(constraints: &SlabConstraints) -> Result<SlabLayout> {
.ok_or_else(|| anyhow!("slot size is too large"))?;

// We may need another guard region (like `pre_slab_guard_bytes`) at the end
// of our slab. We could be conservative and just create it as large as
// `guard_bytes`, but because we know that the last slot already has
// `guard_bytes` factored in to its guard region, we can reduce the final
// guard region by that much.
let post_slab_guard_bytes = guard_bytes
.checked_sub(slot_bytes - max_memory_bytes)
.unwrap_or(0);
// of our slab to maintain our `faulting_region_bytes` guarantee. We could
// be conservative and just create it as large as `faulting_region_bytes`,
// but because we know that the last slot's `slot_bytes` make up the first
// part of that region, we reduce the final guard region by that much.
let post_slab_guard_bytes = faulting_region_bytes.saturating_sub(slot_bytes);

// Check that we haven't exceeded the slab we can calculate given the limits
// of `usize`.
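The final-guard computation above is small enough to sketch on its own (hypothetical helper mirroring the `saturating_sub` in the diff, not the actual Wasmtime code):

```rust
// The slab's trailing guard only needs to top up the last slot to the
// full faulting-region guarantee (hypothetical sketch).
fn post_slab_guard_bytes(faulting_region_bytes: usize, slot_bytes: usize) -> usize {
    faulting_region_bytes.saturating_sub(slot_bytes)
}

fn main() {
    // 12 MiB faulting region, 4 MiB slots: 8 MiB of trailing PROT_NONE.
    assert_eq!(post_slab_guard_bytes(12 << 20, 4 << 20), 8 << 20);
    // If the last slot already covers the region, no extra guard is needed.
    assert_eq!(post_slab_guard_bytes(12 << 20, 16 << 20), 0);
}
```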
@@ -979,13 +981,14 @@ mod tests {

// Check that the memory-striping will not allow OOB access.
// - we may have reduced the slot size from `expected_slot_bytes` to
// `slot_bytes` assuming MPK striping; we check that the expectation
// still holds
// `slot_bytes` assuming MPK striping; we check that our guaranteed
// "faulting region" is respected
// - the last slot won't have MPK striping after it; we check that the
// `post_slab_guard_bytes` accounts for this
assert!(
s.slot_bytes * s.num_stripes >= c.expected_slot_bytes + c.guard_bytes,
"slot may allow OOB access: {c:?} => {s:?}"
s.bytes_to_next_stripe_slot()
>= c.expected_slot_bytes.max(c.max_memory_bytes) + c.guard_bytes,
"faulting region not large enough: {c:?} => {s:?}"
);
assert!(
s.slot_bytes + s.post_slab_guard_bytes >= c.expected_slot_bytes,
45 changes: 42 additions & 3 deletions tests/all/traps.rs
@@ -1349,12 +1349,13 @@ fn wasm_fault_address_reported_by_default() -> Result<()> {

// NB: at this time there's no programmatic access to the fault address
// because it's not always available for load/store traps. Only static
// memories on 32-bit have this information, but bounds-checked memories
// use manual trapping instructions and otherwise don't have a means of
// memories on 32-bit have this information, but bounds-checked memories use
// manual trapping instructions and otherwise don't have a means of
// communicating the faulting address at this time.
//
// It looks like the exact reported fault address may not be deterministic,
// so assert that we have the right error message, but not the exact address.
// so assert that we have the right error message, but not the exact
// address.
let err = format!("{err:?}");
assert!(
err.contains("memory fault at wasm address ")
Expand All @@ -1364,6 +1365,44 @@ fn wasm_fault_address_reported_by_default() -> Result<()> {
Ok(())
}

#[cfg(target_arch = "x86_64")]
#[test]
fn wasm_fault_address_reported_from_mpk_protected_memory() -> Result<()> {
// Trigger the case where an OOB memory access causes a segfault and the
// store attempts to convert it into a `WasmFault`, calculating the Wasm
// address from the raw faulting address. Previously, a store could not do
    // this calculation for MPK-protected memories, causing an abort.
let mut pool = crate::small_pool_config();
pool.total_memories(16);
pool.memory_protection_keys(MpkEnabled::Auto);
let mut config = Config::new();
config.allocation_strategy(InstanceAllocationStrategy::Pooling(pool));
let engine = Engine::new(&config)?;

let mut store = Store::new(&engine, ());
let module = Module::new(
&engine,
r#"
(module
(memory 1)
(func $start
i32.const 0xdeadbeef
i32.load
drop)
(start $start)
)
"#,
)?;
let err = Instance::new(&mut store, &module, &[]).unwrap_err();

// We expect an error here, not an abort; but we also check that the store
// can now calculate the correct Wasm address. If this test is failing with
// an abort, use `--nocapture` to see more details.
let err = format!("{err:?}");
assert!(err.contains("0xdeadbeef"), "bad error: {err}");
Ok(())
}

#[test]
fn trap_with_array_to_wasm_stack_args() -> Result<()> {
let engine = Engine::default();
