Data layout in Hylo #1065

kyouko-taiga · 2023-10-05T11:15:12Z

kyouko-taiga
Oct 5, 2023
Maintainer

Disclaimer: Most of what I know about this topic has been picked up from SO posts and other compiler implementations. If someone thinks they know better, they are probably right!

I originally thought we could rely on LLVM to implement data layout for us, but I was wrong. As it turns out, we probably want more control over the binary layout of our structures so that we can give some guarantees about what MemoryLayout<T> will say about a given T.

By default, LLVM uses a target data layout to determine how the members of a structure are laid out. This layout is required to match what the ultimate code generator expects; it isn't a specification that the frontend (in our case, hc) can dictate.

Swifty-LLVM lets us interact with a target data layout with a handful of methods, notably:

DataLayout.storageSize(of:): the storage size of a type's representation;
DataLayout.preferredAlignment(of:): the preferred alignment for optimizing load/store instructions; and
DataLayout.abiAlignment(of:): the ABI-mandated alignment of a type.

The current implementation uses these methods to generate memory layout data (i.e., the information returned by MemoryLayout). Further, it relies entirely on LLVM to lay out structure elements. Sadly, I realized that there were a few issues with this approach.

Alignment requirements:

The first is that we may want alignment requirements different from (but compatible with) the ones defined by the target data layout. For example, the ABI alignment requirement of i64 is 4 bytes on my Apple M1, while we most likely want i64 to have the same alignment as ptr (i.e., 8 bytes) on a 64-bit machine.

Padding:

The second is that we may want to finely control where padding is inserted to optimize storage and implement efficient union representations. For example, Swift may use the trailing bytes of an inner struct to lay out a member of the outer struct (see details).

Interoperability:

The last issue is that, on top of these constraints, we may need a way to interoperate with C and C++, which will most certainly use different layout algorithms.

I turned to Swift and Rust for answers. For the first issue, it seems like Rust simply defines the layout of built-in types and uses some algorithm to determine the size and alignment of every other type. Only the properties of this algorithm are specified. Swift probably does something similar for built-in types, but I couldn't confirm it in swiftc's sources. However, its layout algorithm has stronger guarantees.

I think we should do the same for Hylo. One open question is whether we can define portable alignment requirements. I suspect we should be fine if we say that a built-in type T is aligned at min(MemoryLayout<T>.size(), MemoryLayout<i64>.size()). Removing the platform specificity would make it easier to write portable code.
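To make the proposed rule concrete, here is a minimal Rust sketch of it. The names `I64_SIZE` and `builtin_alignment` are hypothetical and exist only for illustration; the sketch also assumes that built-in types have power-of-two sizes, which the rule above does not state explicitly.

```rust
/// Size in bytes of `i64`, the cap used by the proposed rule.
const I64_SIZE: usize = 8;

/// Hypothetical helper: alignment of a built-in type of `size` bytes under
/// the proposed rule, i.e. min(size of T, size of i64).
/// Assumption: `size` is a power of two (true for i8, i16, i32, i64, ...).
fn builtin_alignment(size: usize) -> usize {
    assert!(size.is_power_of_two(), "the rule assumes power-of-two sizes");
    size.min(I64_SIZE)
}
```

Under this sketch, a 4-byte built-in is aligned at 4 bytes and an 8-byte one at 8 bytes regardless of the target, while a 16-byte built-in would be capped at 8 bytes.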

AFAICT, the only way to finely control type layouts in LLVM is to use packed structs. For example, all Swift structs are lowered to packed structs in LLVM:

// LLVM <{ i8, [7 x i8], <{ i64, i8 }>, i8 }>
struct S2 {
  var x: UInt8
  var s: (Int, UInt8)
  var y: UInt8
}

We should probably do the same for Hylo, which would solve the second issue. One open question is how we can create pointers into the contents of a struct without causing misalignment if we use this technique. I believe we should be fine if we guarantee that fields are laid out in a way that satisfies their alignment requirements.
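A frontend that emits packed structs has to compute offsets and padding itself. The following Rust sketch shows one way to do that; it is not swiftc's actual algorithm, and `Layout`, `round_up`, and `lay_out` are hypothetical names. It assumes that fields are stored in declaration order, that alignments are powers of two, and that a type's size excludes its trailing padding, which is what lets the last byte of S2 sit right after the inner tuple.

```rust
/// Size and alignment of a type, in bytes. `size` excludes trailing padding.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Layout {
    size: usize,
    align: usize,
}

/// Rounds `offset` up to the next multiple of `align` (a power of two).
fn round_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Computes the offset of each field and the layout of the enclosing struct,
/// placing fields in declaration order.
fn lay_out(fields: &[Layout]) -> (Vec<usize>, Layout) {
    let mut offsets = Vec::with_capacity(fields.len());
    let mut end = 0; // first free byte after the fields placed so far
    let mut align = 1; // strictest alignment among the fields
    for f in fields {
        // Explicit padding is whatever lies between `end` and `start`.
        let start = round_up(end, f.align);
        offsets.push(start);
        end = start + f.size;
        align = align.max(f.align);
    }
    (offsets, Layout { size: end, align })
}
```

Fed with an 8-byte, 8-aligned Int and a 1-byte UInt8, this gives the tuple (Int, UInt8) a size of 9 and an alignment of 8, and places the fields of S2 at offsets 0, 8, and 17 for a total size of 18, which matches the packed LLVM type in the comment above. Because every offset is a multiple of the field's alignment, a pointer to a field is properly aligned as long as the struct itself is allocated at an address that satisfies the struct's alignment.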

To address the third issue, Rust provides a #[repr(C)] attribute. We can probably do something similar in Hylo too.

#[repr(C)]
struct ThreeInts {
  first: i16,
  second: i8,
  third: i32
}
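As a rough check of what #[repr(C)] buys, the sketch below measures the layout of ThreeInts with std::mem::size_of and std::mem::align_of, and derives field offsets from addresses. The helper `field_offsets` is a hypothetical name introduced here, and the expected numbers assume a typical target where i16 is 2-byte aligned and i32 is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
struct ThreeInts {
    first: i16,
    second: i8,
    third: i32,
}

/// Returns the byte offsets of `second` and `third` within a `ThreeInts`,
/// computed from the addresses of the fields of a live value.
fn field_offsets() -> (usize, usize) {
    let v = ThreeInts { first: 0, second: 0, third: 0 };
    let base = &v as *const ThreeInts as usize;
    let second = &v.second as *const i8 as usize - base;
    let third = &v.third as *const i32 as usize - base;
    (second, third)
}

/// Returns the size and alignment of `ThreeInts`, in bytes.
fn size_and_alignment() -> (usize, usize) {
    (size_of::<ThreeInts>(), align_of::<ThreeInts>())
}
```

On such a target, C layout rules put first at offset 0, second at offset 2, one byte of padding at offset 3, and third at offset 4, for a size of 8 and an alignment of 4. A Hylo equivalent of the attribute would need to reproduce exactly these rules rather than the native layout algorithm.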
