Add Image.Extent() interface #38

nirs · 2024-10-26T17:42:23Z

Extent() return the next extent in the image, starting at the specified
offset, and limited by the specified length. An extent is a run of
clusters of with the same status.

Using Extent() we can iterate over the image and skip extents that reads
as zeros, instead of reading a zero buffer and detecting that the buffer
is full of zeros.

Benchmarking show 2 orders of magnitude speeded compared with zero
detection.

% go test -bench Benchmark0p/qcow2$/read
Benchmark0p/qcow2/read-12      516     2503752 ns/op      107213.28 MB/s    1050537 B/op    39 allocs/op
    
% go test -bench NextUnallocated
BenchmarkNextUnallocated-12    116    10236186 ns/op    10489666.91 MB/s          0 B/op     0 allocs/op

Ubuntu 24.04 image in qcow2 compressed format:

% ./go-qcow2reader-example convert /var/tmp/images/test.zlib.qcow2 /tmp/tmp.img
6.00 GiB / 6.00 GiB [-------------------------------------] 100.00% 2.86 GiB p/s

Ubuntu 24.04 image in qcow2 format:

% ./go-qcow2reader-example convert /var/tmp/images/test.qcow2 /tmp/tmp.img
6.00 GiB / 6.00 GiB [------------------------------------] 100.00% 15.73 GiB p/s

16 TiB empty qcow2 image:

% ./go-qcow2reader-example convert /var/tmp/images/test.0p.qcow2 /tmp/tmp.img
16.00 TiB / 16.00 TiB [---------------------------------] 100.00% 152.46 TiB p/s

Fixes #32

image/qcow2/qcow2.go

image/stub/stub.go

image/raw/raw.go

image/qcow2/qcow2.go

nirs · 2024-10-27T22:28:05Z

image/image.go

@@ -8,10 +8,24 @@ import (
 // Type must be a "Backing file format name string" that appears in QCOW2.
 type Type string

+type Extent struct {
+	// Offset from start of the image in bytes.
+	Start int64 `json:"start"`


This is not strictly needed, since when calling Next(start, length) we always return the requested start value.

Removing Start makes testing easier and eliminate the error of invalid extents list. When consuming the extents we keep current offset externally so we don't really need it. When printing extents having start is helpful. If we don't report it we have to add another type with Start for encoding json.

nirs · 2024-10-27T22:30:01Z

image/image.go

+	// Set if this extent is read as zeros.
+	Zero bool `json:"zero"`
+	// Set if this extent is compressed.
+	Compressed bool `json:"compressed"`


Allocated and Compressed are not needed for sparse copy, only Zero is needed. They can be useful for other purposes, for example dumping all extents like qemu-img map --output json.

image/qcow2/qcow2.go

image/image.go

image/stub/stub.go

convert/convert.go

image/image.go

nirs · 2024-11-02T17:49:10Z

Current version improves the tests but it not ready yet. I did not make the changes we discussed yet.

I extract the changes preparing for this change to #39 so we can merge them early and make this change smaller.

qcow2reader_test.go

nirs · 2024-11-16T19:47:21Z

Last push added test for image with backing file. The new tests can be reviewed but I did not handle yet the rest of the needed changes.

I think the last missing test is an image with shorter backing file.

nirs · 2024-11-16T20:09:20Z

Last push add the last missing test for shorter backing file. The new test can be reviewed but I did not handle yet the rest of the needed changes.

Signed-off-by: Nir Soffer <[email protected]>

nirs · 2024-11-16T21:17:55Z

Last push changes:

Rename Next() to Extent()
Improve documentation
Reorder commits to group image and tests changes

Still missing

documentation
input validation
support for progress in convert (ReadeAt and WriteAt cannot be used now)

Extent() return the next extent in the image, starting at the specified offset, and limited by the specified length. An extent is a run of clusters of with the same status. Using Extent() we can iterate over the image and skip extents that reads as zeros, instead of reading a zero buffer and detecting that the buffer is full of zeros. Benchmarking show 2 orders of magnitude speeded compared with zero detection (10 TiB/s instead of 100 GiB/s) % go test -bench Benchmark0p/qcow2$/read Benchmark0p/qcow2/read-12 516 2503752 ns/op 107213.28 MB/s 1050537 B/op 39 allocs/op % go test -bench NextUnallocated BenchmarkNextUnallocated-12 116 10236186 ns/op 10489666.91 MB/s 0 B/op 0 allocs/op Only qcow2 image has a real implementation. For other image formats we treat all clusters as allocated, so Extent() always return one allocated extent. For raw format we can implement Extents() later using SEEK_DATA and SEEK_HOLE. Extent() is more strict than ReadAt and fails when offset+length are after the end of the image. We always know the length of the image so requesting an extent after the end of the image is likely a bug in the user code and failing fast may help to debug such issues. Signed-off-by: Nir Soffer <[email protected]>

Add a helper for creating an image from list of extents. With this we can create a raw or qcow2 image with allocation described by the extents. Add 4 tests, testing different allocation patterns: - TestExtentsSome: some allocated clusters and some holes - TestExtentsPartial: writing partial cluster allocates entire cluster - TestExtentsMerge: consecutive extents of same type are merged - TestExtentsZero: different extents types that read as zeros For each test we verify qcow2 and qcow2 compressed (zlib) images. Signed-off-by: Nir Soffer <[email protected]>

We cannot have useful tests for raw backing file since we have only a stub implementation reporting that the entire backing file is allocated, so we test only qcow2 image backing file. We test backing file in qcow2 and qcow2 compressed. When both images are uncompressed, allocated extents from top and base are merged. When base is compressed, allocated extents from top and based cannot be merged, since all base clusters are compressed. Compressed image with backing file may be possible to crate with qemu-io, but is not real use case so we don't test it. Signed-off-by: Nir Soffer <[email protected]>

When adding a longer qcow2 image on top of an os image, we treat extents after the end of the backing file as unallocated. We test also the special case of raw backing file not aligned to cluster size. When reading a short extent from the end of the backing file we extend the length to full cluster. Signed-off-by: Nir Soffer <[email protected]>

This is an example how the extent interface can be used, and useful tool for debugging, similar to `qemu-img map`. Example runs: % ./go-qcow2reader-example map /var/tmp/images/test.qcow2 | head -5 {"start":0,"length":65536,"allocated":true,"zero":false,"compressed":false} {"start":65536,"length":983040,"allocated":false,"zero":true,"compressed":false} {"start":1048576,"length":131072,"allocated":true,"zero":false,"compressed":false} {"start":1179648,"length":655360,"allocated":false,"zero":true,"compressed":false} {"start":1835008,"length":131072,"allocated":true,"zero":false,"compressed":false} % ./go-qcow2reader-example map /var/tmp/images/test.zlib.qcow2 | head -5 {"start":0,"length":65536,"allocated":true,"zero":false,"compressed":true} {"start":65536,"length":983040,"allocated":false,"zero":true,"compressed":false} {"start":1048576,"length":131072,"allocated":true,"zero":false,"compressed":true} {"start":1179648,"length":655360,"allocated":false,"zero":true,"compressed":false} {"start":1835008,"length":131072,"allocated":true,"zero":false,"compressed":true} Signed-off-by: Nir Soffer <[email protected]>

Using Image.Extent() to skip zero extents instead of reading and detecting zeros. We still detect zeroes in non-zero extents to convert areas full of actual zeros to unallocated area in the target image. This does not change much the performance for the Ubuntu compressed image since most of the time is spent on reading and decompressing the actual data. Converting a large empty image is 2 orders of magnitude faster, so I'm testing now 1 TiB image instead of 100 GiB image. Example run with 1 TiB empty image: % hyperfine -w3 "qemu-img convert -f qcow2 -O raw -W /tmp/images/test.0p.qcow2 /tmp/tmp.img" \ "./go-qcow2reader-example convert /tmp/images/test.0p.qcow2 /tmp/tmp.img" Benchmark 1: qemu-img convert -f qcow2 -O raw -W /tmp/images/test.0p.qcow2 /tmp/tmp.img Time (mean ± σ): 14.0 ms ± 0.4 ms [User: 11.8 ms, System: 2.0 ms] Range (min … max): 13.5 ms … 17.8 ms 181 runs Benchmark 2: ./go-qcow2reader-example convert /tmp/images/test.0p.qcow2 /tmp/tmp.img Time (mean ± σ): 20.6 ms ± 0.2 ms [User: 118.9 ms, System: 2.2 ms] Range (min … max): 20.4 ms … 21.7 ms 130 runs Summary qemu-img convert -f qcow2 -O raw -W /tmp/images/test.0p.qcow2 /tmp/tmp.img ran 1.48 ± 0.04 times faster than ./go-qcow2reader-example convert /tmp/images/test.0p.qcow2 /tmp/tmp.img qemu-img is faster but 7 millisecond difference for 1 TiB image is not very interesting. Signed-off-by: Nir Soffer <[email protected]>

Previously we always read the entire image so we could report progress by wrapping the ReaderAt so every read updates the progress indicator. Now we read only the allocated extents, so we need another way. To report progress, you need to pass now a type implementing the Updater interface. We could also use a function but this way seems cleaner. Signed-off-by: Nir Soffer <[email protected]>

Most users will want to integrate a progress bar and this serve as a good example of how to do this. This also makes it more fun to work on this project. Ubuntu 24.04 image in qcow2 compressed format: % ./go-qcow2reader-example convert /var/tmp/images/test.zlib.qcow2 /tmp/tmp.img 6.00 GiB / 6.00 GiB [-------------------------------------] 100.00% 2.86 GiB p/s Ubuntu 24.04 image in qcow2 format: % ./go-qcow2reader-example convert /var/tmp/images/test.qcow2 /tmp/tmp.img 6.00 GiB / 6.00 GiB [------------------------------------] 100.00% 15.73 GiB p/s 16 TiB empty qcow2 image: % ./go-qcow2reader-example convert /var/tmp/images/test.0p.qcow2 /tmp/tmp.img 16.00 TiB / 16.00 TiB [---------------------------------] 100.00% 152.46 TiB p/s The pb module brings too many dependencies. Maybe we can find another library or write a very simple progress instead. Signed-off-by: Nir Soffer <[email protected]>

AkihiroSuda · 2024-11-17T22:17:03Z

convert/convert.go

+	// Called from multiple goroutines after a byte range of length was converted.
+	// If the conversion is successfu, the total number of bytes will be the image
+	// virtual size.
+	Update(n int64)


Can n be negative?

No, but this is the type the go and the kernel use for file size (off_t). It is easier to use when we can use Extent.Length when updating.

We an add a check for n < 0 in the progress bar example implementation.

nirs · 2024-11-17T22:24:11Z

Current version addresses all the comments and implements everything.

It is not compatible with lima since convert has additional argument and the reader cannot be used now to report progress. I'll send fix for lima later.

AkihiroSuda

Thanks

AkihiroSuda · 2024-11-21T00:35:28Z

convert/convert.go

+// file full of zeroes. To get a sparse target image, the image must be a new
+// empty file, since Convert does not punch holes for zero ranges even if the
+// underlying file system supports hole punching.
+func (c *Converter) Convert(wa io.WriterAt, img image.Image, size int64, progress Updater) error {


progress Updater could be provided in Options?
Then we would not have to change the function signature?

It is possible, but even if we do we accept now an image instead of a ReaderAt and size.

Also progress scope is one convert, while convert instance can convert many images using the same options, so it is better to accept progress per convert operation.

The options are not something you change per image, it may change depending on the machine (e.g number of cores).

Another way to accept options to convert(), and use the zero value of the converter:

var c convert.Converter c.Convert(target, src, convert.Options{Progress: bar})

There other options that we may want to add in the future that may be per image options, so moving the options to Convert makes sense anyway.

nirs force-pushed the block-status branch 2 times, most recently from 6adb42a to e9ada15 Compare October 26, 2024 21:58