Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Image.Extent() interface #38

Merged
merged 9 commits into from
Nov 20, 2024
Merged

Conversation

nirs
Copy link
Member

@nirs nirs commented Oct 26, 2024

Extent() return the next extent in the image, starting at the specified
offset, and limited by the specified length. An extent is a run of
clusters of with the same status.

Using Extent() we can iterate over the image and skip extents that reads
as zeros, instead of reading a zero buffer and detecting that the buffer
is full of zeros.

Benchmarking show 2 orders of magnitude speeded compared with zero
detection.

% go test -bench Benchmark0p/qcow2$/read
Benchmark0p/qcow2/read-12      516     2503752 ns/op      107213.28 MB/s    1050537 B/op    39 allocs/op
    
% go test -bench NextUnallocated
BenchmarkNextUnallocated-12    116    10236186 ns/op    10489666.91 MB/s          0 B/op     0 allocs/op

Ubuntu 24.04 image in qcow2 compressed format:

% ./go-qcow2reader-example convert /var/tmp/images/test.zlib.qcow2 /tmp/tmp.img
6.00 GiB / 6.00 GiB [-------------------------------------] 100.00% 2.86 GiB p/s 

Ubuntu 24.04 image in qcow2 format:

% ./go-qcow2reader-example convert /var/tmp/images/test.qcow2 /tmp/tmp.img
6.00 GiB / 6.00 GiB [------------------------------------] 100.00% 15.73 GiB p/s 

16 TiB empty qcow2 image:

% ./go-qcow2reader-example convert /var/tmp/images/test.0p.qcow2 /tmp/tmp.img
16.00 TiB / 16.00 TiB [---------------------------------] 100.00% 152.46 TiB p/s 

Fixes #32

@nirs nirs force-pushed the block-status branch 2 times, most recently from 6adb42a to e9ada15 Compare October 26, 2024 21:58
image/qcow2/qcow2.go Outdated Show resolved Hide resolved
@nirs nirs force-pushed the block-status branch 3 times, most recently from 55c5240 to 55e97ab Compare October 27, 2024 14:14
image/stub/stub.go Outdated Show resolved Hide resolved
image/raw/raw.go Show resolved Hide resolved
image/qcow2/qcow2.go Outdated Show resolved Hide resolved
@@ -8,10 +8,24 @@ import (
// Type must be a "Backing file format name string" that appears in QCOW2.
type Type string

type Extent struct {
// Offset from start of the image in bytes.
Start int64 `json:"start"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not strictly needed, since when calling Next(start, length) we always return the requested start value.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing Start makes testing easier and eliminate the error of invalid extents list. When consuming the extents we keep current offset externally so we don't really need it. When printing extents having start is helpful. If we don't report it we have to add another type with Start for encoding json.

// Set if this extent is read as zeros.
Zero bool `json:"zero"`
// Set if this extent is compressed.
Compressed bool `json:"compressed"`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allocated and Compressed are not needed for sparse copy, only Zero is needed. They can be useful for other purposes, for example dumping all extents like qemu-img map --output json.

image/qcow2/qcow2.go Outdated Show resolved Hide resolved
image/qcow2/qcow2.go Outdated Show resolved Hide resolved
@nirs nirs force-pushed the block-status branch 4 times, most recently from d005122 to bebc896 Compare October 28, 2024 01:55
image/image.go Outdated Show resolved Hide resolved
image/stub/stub.go Outdated Show resolved Hide resolved
image/image.go Outdated Show resolved Hide resolved
@nirs
Copy link
Member Author

nirs commented Nov 2, 2024

Current version improves the tests but it not ready yet. I did not make the changes we discussed yet.

I extract the changes preparing for this change to #39 so we can merge them early and make this change smaller.

@nirs
Copy link
Member Author

nirs commented Nov 16, 2024

Last push added test for image with backing file. The new tests can be reviewed but I did not handle yet the rest of the needed changes.

I think the last missing test is an image with shorter backing file.

@nirs
Copy link
Member Author

nirs commented Nov 16, 2024

Last push add the last missing test for shorter backing file. The new test can be reviewed but I did not handle yet the rest of the needed changes.

@nirs
Copy link
Member Author

nirs commented Nov 16, 2024

Last push changes:

  • Rename Next() to Extent()
  • Improve documentation
  • Reorder commits to group image and tests changes

Still missing

  • documentation
  • input validation
  • support for progress in convert (ReadeAt and WriteAt cannot be used now)

nirs added 7 commits November 17, 2024 18:46
Extent() return the next extent in the image, starting at the specified
offset, and limited by the specified length. An extent is a run of
clusters of with the same status.

Using Extent() we can iterate over the image and skip extents that reads
as zeros, instead of reading a zero buffer and detecting that the buffer
is full of zeros.

Benchmarking show 2 orders of magnitude speeded compared with zero
detection (10 TiB/s instead of 100 GiB/s)

    % go test -bench Benchmark0p/qcow2$/read
    Benchmark0p/qcow2/read-12      516     2503752 ns/op      107213.28 MB/s    1050537 B/op    39 allocs/op

    % go test -bench NextUnallocated
    BenchmarkNextUnallocated-12    116    10236186 ns/op    10489666.91 MB/s          0 B/op     0 allocs/op

Only qcow2 image has a real implementation. For other image formats we
treat all clusters as allocated, so Extent() always return one allocated
extent. For raw format we can implement Extents() later using SEEK_DATA
and SEEK_HOLE.

Extent() is more strict than ReadAt and fails when offset+length are
after the end of the image. We always know the length of the image so
requesting an extent after the end of the image is likely a bug in the
user code and failing fast may help to debug such issues.

Signed-off-by: Nir Soffer <[email protected]>
Add a helper for creating an image from list of extents. With this we
can create a raw or qcow2 image with allocation described by the
extents.

Add 4 tests, testing different allocation patterns:

- TestExtentsSome: some allocated clusters and some holes
- TestExtentsPartial: writing partial cluster allocates entire cluster
- TestExtentsMerge: consecutive extents of same type are merged
- TestExtentsZero: different extents types that read as zeros

For each test we verify qcow2 and qcow2 compressed (zlib) images.

Signed-off-by: Nir Soffer <[email protected]>
We cannot have useful tests for raw backing file since we have only a
stub implementation reporting that the entire backing file is allocated,
so we test only qcow2 image backing file.

We test backing file in qcow2 and qcow2 compressed. When both images are
uncompressed, allocated extents from top and base are merged. When base
is compressed, allocated extents from top and based cannot be merged,
since all base clusters are compressed.

Compressed image with backing file may be possible to crate with
qemu-io, but is not real use case so we don't test it.

Signed-off-by: Nir Soffer <[email protected]>
When adding a longer qcow2 image on top of an os image, we treat extents
after the end of the backing file as unallocated.

We test also the special case of raw backing file not aligned to cluster
size. When reading a short extent from the end of the backing file we
extend the length to full cluster.

Signed-off-by: Nir Soffer <[email protected]>
This is an example how the extent interface can be used, and useful tool
for debugging, similar to `qemu-img map`.

Example runs:

    % ./go-qcow2reader-example map /var/tmp/images/test.qcow2 | head -5
    {"start":0,"length":65536,"allocated":true,"zero":false,"compressed":false}
    {"start":65536,"length":983040,"allocated":false,"zero":true,"compressed":false}
    {"start":1048576,"length":131072,"allocated":true,"zero":false,"compressed":false}
    {"start":1179648,"length":655360,"allocated":false,"zero":true,"compressed":false}
    {"start":1835008,"length":131072,"allocated":true,"zero":false,"compressed":false}

    % ./go-qcow2reader-example map /var/tmp/images/test.zlib.qcow2 | head -5
    {"start":0,"length":65536,"allocated":true,"zero":false,"compressed":true}
    {"start":65536,"length":983040,"allocated":false,"zero":true,"compressed":false}
    {"start":1048576,"length":131072,"allocated":true,"zero":false,"compressed":true}
    {"start":1179648,"length":655360,"allocated":false,"zero":true,"compressed":false}
    {"start":1835008,"length":131072,"allocated":true,"zero":false,"compressed":true}

Signed-off-by: Nir Soffer <[email protected]>
Using Image.Extent() to skip zero extents instead of reading and
detecting zeros. We still detect zeroes in non-zero extents to convert
areas full of actual zeros to unallocated area in the target image.

This does not change much the performance for the Ubuntu compressed
image since most of the time is spent on reading and decompressing the
actual data.  Converting a large empty image is 2 orders of magnitude
faster, so I'm testing now 1 TiB image instead of 100 GiB image.

Example run with 1 TiB empty image:

    % hyperfine -w3 "qemu-img convert -f qcow2 -O raw -W /tmp/images/test.0p.qcow2 /tmp/tmp.img" \
                    "./go-qcow2reader-example convert /tmp/images/test.0p.qcow2 /tmp/tmp.img"
    Benchmark 1: qemu-img convert -f qcow2 -O raw -W /tmp/images/test.0p.qcow2 /tmp/tmp.img
      Time (mean ± σ):      14.0 ms ±   0.4 ms    [User: 11.8 ms, System: 2.0 ms]
      Range (min … max):    13.5 ms …  17.8 ms    181 runs

    Benchmark 2: ./go-qcow2reader-example convert /tmp/images/test.0p.qcow2 /tmp/tmp.img
      Time (mean ± σ):      20.6 ms ±   0.2 ms    [User: 118.9 ms, System: 2.2 ms]
      Range (min … max):    20.4 ms …  21.7 ms    130 runs

    Summary
      qemu-img convert -f qcow2 -O raw -W /tmp/images/test.0p.qcow2 /tmp/tmp.img ran
        1.48 ± 0.04 times faster than ./go-qcow2reader-example convert /tmp/images/test.0p.qcow2 /tmp/tmp.img

qemu-img is faster but 7 millisecond difference for 1 TiB image is not
very interesting.

Signed-off-by: Nir Soffer <[email protected]>
Previously we always read the entire image so we could report progress
by wrapping the ReaderAt so every read updates the progress indicator.
Now we read only the allocated extents, so we need another way.

To report progress, you need to pass now a type implementing the Updater
interface. We could also use a function but this way seems cleaner.

Signed-off-by: Nir Soffer <[email protected]>
@nirs nirs changed the title WIP: block status interface Add Extent() status interface Nov 17, 2024
Most users will want to integrate a progress bar and this serve as a
good example of how to do this. This also makes it more fun to work on
this project.

Ubuntu 24.04 image in qcow2 compressed format:

    % ./go-qcow2reader-example convert /var/tmp/images/test.zlib.qcow2 /tmp/tmp.img
    6.00 GiB / 6.00 GiB [-------------------------------------] 100.00% 2.86 GiB p/s

Ubuntu 24.04 image in qcow2 format:

    % ./go-qcow2reader-example convert /var/tmp/images/test.qcow2 /tmp/tmp.img
    6.00 GiB / 6.00 GiB [------------------------------------] 100.00% 15.73 GiB p/s

16 TiB empty qcow2 image:

    % ./go-qcow2reader-example convert /var/tmp/images/test.0p.qcow2 /tmp/tmp.img
    16.00 TiB / 16.00 TiB [---------------------------------] 100.00% 152.46 TiB p/s

The pb module brings too many dependencies. Maybe we can find another
library or write a very simple progress instead.

Signed-off-by: Nir Soffer <[email protected]>
// Called from multiple goroutines after a byte range of length was converted.
// If the conversion is successfu, the total number of bytes will be the image
// virtual size.
Update(n int64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can n be negative?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but this is the type the go and the kernel use for file size (off_t). It is easier to use when we can use Extent.Length when updating.

We an add a check for n < 0 in the progress bar example implementation.

@nirs nirs changed the title Add Extent() status interface Add Extent() interface Nov 17, 2024
@nirs nirs changed the title Add Extent() interface Add Image.Extent() interface Nov 17, 2024
@nirs nirs marked this pull request as ready for review November 17, 2024 22:22
@nirs
Copy link
Member Author

nirs commented Nov 17, 2024

Current version addresses all the comments and implements everything.

It is not compatible with lima since convert has additional argument and the reader cannot be used now to report progress. I'll send fix for lima later.

@nirs nirs added the enhancement New feature or request label Nov 19, 2024
@AkihiroSuda AkihiroSuda added this to the v0.5.0 milestone Nov 20, 2024
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@AkihiroSuda AkihiroSuda merged commit 0473ecd into lima-vm:master Nov 20, 2024
2 checks passed
// file full of zeroes. To get a sparse target image, the image must be a new
// empty file, since Convert does not punch holes for zero ranges even if the
// underlying file system supports hole punching.
func (c *Converter) Convert(wa io.WriterAt, img image.Image, size int64, progress Updater) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

progress Updater could be provided in Options?
Then we would not have to change the function signature?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible, but even if we do we accept now an image instead of a ReaderAt and size.

Also progress scope is one convert, while convert instance can convert many images using the same options, so it is better to accept progress per convert operation.

The options are not something you change per image, it may change depending on the machine (e.g number of cores).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another way to accept options to convert(), and use the zero value of the converter:

var c convert.Converter
c.Convert(target, src, convert.Options{Progress: bar})

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There other options that we may want to add in the future that may be per image options, so moving the options to Convert makes sense anyway.

@nirs nirs deleted the block-status branch November 21, 2024 00:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Poor performance compared with qemu-img convert
3 participants