
DeviceBuffer::drop force always succeed ? #48

Open
zeroexcuses opened this issue May 17, 2020 · 4 comments

Comments

@zeroexcuses

Quoting: https://github.com/bheisler/RustaCUDA/blob/master/src/memory/device/device_buffer.rs#L132-L172

    /// Deallocating device memory can return errors from previous asynchronous work. This function
    /// destroys the given buffer and returns the error and the un-destroyed buffer on failure.
...
    pub fn drop(mut dev_buf: DeviceBuffer<T>) -> DropResult<DeviceBuffer<T>> {

The fact that drop can fail is slightly problematic, as I can't figure out how to use it with RAII. In particular, when there are no more references to a DeviceBuffer, I want it to free the GPU memory. However, if this drop can fail, I can't guarantee that the GPU memory is freed.

What is the right way to handle freeing a DeviceBuffer? (Again, the fact that cuFree can fail is very surprising to me.)

@bheisler
Owner

Yeah, CUDA allows the various deallocator functions to fail, so I had to expose that somehow. I don't much like this compromise, but it's the best I have.

If you want to be really sure about it, you'll have to call the DeviceBuffer::drop function manually in a loop, handling errors and trying again until it succeeds. I can't safely implement that in RustaCUDA because I wouldn't know what the application wants to do with the errors.
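The retry-in-a-loop approach could be sketched roughly as follows. This is only an illustration of the pattern: a hypothetical MockBuffer and try_drop stand in for DeviceBuffer and DeviceBuffer::drop, since a real example needs a GPU, and the bounded attempt count and error handling are assumptions about what an application might want, not anything RustaCUDA prescribes.

```rust
// Sketch of "call drop manually in a loop until it succeeds".
// MockBuffer/try_drop are stand-ins for DeviceBuffer/DeviceBuffer::drop;
// like RustaCUDA's DropResult, failure returns the error *and* the
// un-destroyed buffer so the caller can retry.
type DropResult<T, E> = Result<(), (E, T)>;

struct MockBuffer {
    fails_left: u32, // simulate a deallocator that fails a few times
}

fn try_drop(mut buf: MockBuffer) -> DropResult<MockBuffer, &'static str> {
    if buf.fails_left > 0 {
        buf.fails_left -= 1;
        Err(("LaunchFailed", buf)) // error plus the buffer, for retrying
    } else {
        Ok(())
    }
}

// Retry with a bounded attempt count so a persistent failure can't
// loop forever. Returns the number of attempts used on success.
fn drop_with_retry(mut buf: MockBuffer, max_attempts: u32) -> Result<u32, &'static str> {
    for attempt in 1..=max_attempts {
        match try_drop(buf) {
            Ok(()) => return Ok(attempt),
            Err((e, recovered)) => {
                eprintln!("attempt {attempt}: drop failed: {e}");
                buf = recovered; // we got the buffer back; try again
            }
        }
    }
    Err("gave up: buffer could not be deallocated")
}

fn main() {
    let buf = MockBuffer { fails_left: 2 };
    match drop_with_retry(buf, 5) {
        Ok(n) => println!("dropped after {n} attempts"),
        Err(e) => eprintln!("{e}"),
    }
}
```

What to do between attempts (synchronize the context, log, back off, or give up and leak) is exactly the application-specific policy bheisler mentions not being able to decide inside the library.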

You can just let things fall out of scope - those types implement Drop and will attempt to clean up after themselves, but only once. If Drop::drop fails, then they panic. This has been sufficient for my needs (I've never actually seen the deallocators fail in practice), but I can understand that if you're relying on this in production that would be undesirable.

Sorry, I don't think there's much I can do here.

@LutzCle
Contributor

LutzCle commented May 26, 2020

Drop fails frequently during development. If you do illegal memory accesses inside your GPU kernel, you will get something like this:

thread 'main' panicked at 'Failed to deallocate CUDA page-locked memory.: LaunchFailed', $HOME/.cargo/git/checkouts/rustacuda-84d6f0ef4d2f3ecc/cc20ddc/src/memory/locked.rs:263:17

The proper solution would be to use Rust instead of CUDA on the GPU for memory safety ;-)

On a more serious note, this isn't really a CUDA-specific problem. munmap can fail too, and that is a POSIX API. Perhaps there exists a good Rust-y solution for munmap that RustaCUDA could borrow?

@thecog19

👋 We've been able to consistently reproduce this locally; in fact, our code does this regularly. We'd love to help fix this behavior in a way that makes sense, and not deal with a big honking panic in the middle of our code. Unfortunately, our knowledge of cuda/nvidia/gpus is pretty weak. @HiggstonRainbird is going to post an example that we've been using to reproduce it, but it may be hardware-specific. Any pointers on how we can begin to modify this behavior in a way that will let us either deallocate the memory or move on without breaking anything?

@HiggstonRainbird

The consistent reproduction of this issue @thecog19 described can be found here: buffer_drop.zip. We have been attempting to use the NvFBC crate to capture the GPU framebuffer, and then use rustacuda to perform operations on that framebuffer without the information leaving the GPU.

However, despite the framebuffer capture itself working (including successful transfers of the framebuffer to RAM and to disk), it seems that trying to pass the pointer to the framebuffer into a rustacuda DeviceBuffer fails somehow, and CUDA is unable to drop the device buffer's memory once we're done with it.

use std::error::Error;

use nvfbc::{cuda::*, BufferFormat};
use rustacuda::{context::{Context, ContextFlags}, device::Device, memory::DeviceBuffer, CudaFlags};
use rustacuda_core::DevicePointer;

fn main() -> Result<(), Box<dyn Error>> {
	rustacuda::init(CudaFlags::empty())?;
	let device = Device::get_device(0)?;
	let _context = Context::create_and_push(
		ContextFlags::MAP_HOST | ContextFlags::SCHED_AUTO, device)?;

	let mut capturer = CudaCapturer::new()?;

	let status = capturer.status()?;
	if !status.can_create_now {
		panic!("Can't create a CUDA capture session.");
	}

	capturer.start(BufferFormat::Rgb, 30)?;
	let frame_info = capturer.next_frame(CaptureMethod::NoWaitIfNewFrame)?;

	let pointer = frame_info.device_buffer as *mut u64;
	let device_buffer = unsafe { DeviceBuffer::from_raw_parts(
		DevicePointer::wrap(pointer),
		frame_info.device_buffer_len as usize,
	) };

	match DeviceBuffer::drop(device_buffer) {
		Ok(()) => println!("Device_buffer successfully destroyed"),
		Err((e, _buf)) => {
			println!("Failed to destroy device_buffer: {:?}", e);
		},
	}

	Ok(())
}
