Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close logpipe input file descriptor after dup2() #116

Merged
merged 1 commit into from
Oct 16, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 11 additions & 5 deletions android-activity/src/game_activity/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -912,13 +912,19 @@ extern "Rust" {
pub unsafe extern "C" fn _rust_glue_entry(native_app: *mut ffi::android_app) {
abort_on_panic(|| {
// Maybe make this stdout/stderr redirection an optional / opt-in feature?...
let mut logpipe: [RawFd; 2] = Default::default();
libc::pipe(logpipe.as_mut_ptr());
libc::dup2(logpipe[1], libc::STDOUT_FILENO);
libc::dup2(logpipe[1], libc::STDERR_FILENO);

let file = {
let mut logpipe: [RawFd; 2] = Default::default();
libc::pipe2(logpipe.as_mut_ptr(), libc::O_CLOEXEC);
libc::dup2(logpipe[1], libc::STDOUT_FILENO);
libc::dup2(logpipe[1], libc::STDERR_FILENO);
libc::close(logpipe[1]);

File::from_raw_fd(logpipe[0])
};

thread::spawn(move || {
let tag = CStr::from_bytes_with_nul(b"RustStdoutStderr\0").unwrap();
let file = File::from_raw_fd(logpipe[0]);
let mut reader = BufReader::new(file);
let mut buffer = String::new();
loop {
Expand Down
37 changes: 20 additions & 17 deletions android-activity/src/native_activity/glue.rs
Original file line number Diff line number Diff line change
Expand Up @@ -835,28 +835,31 @@ extern "C" fn ANativeActivity_onCreate(
) {
abort_on_panic(|| {
// Maybe make this stdout/stderr redirection an optional / opt-in feature?...
unsafe {
let file = unsafe {
let mut logpipe: [RawFd; 2] = Default::default();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using OwnedFd might be even nicer: it'll close the file descriptors automatically and has a (safe) impl From<OwnedFd> for File.

Even though it is FFI-compatible I'm not sure if we're allowed to rely on the fact that they are initially uninitialized, or perhaps libc::pipe() not initializing them at all or to -1 on error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think we should switch to nix or rustix: no code below is validating the error codes returned by libc, while those APIs do.

On quick inspection I prefer rustix: they return OwnedFd from pipe() rather than nix' returning RawFd.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's pretty cool:

        let file = unsafe {
            let logpipe = rustix::pipe::pipe().unwrap();
            rustix::stdio::dup2_stdout(logpipe.1);
            rustix::stdio::dup2_stderr(logpipe.1);

            File::from(logpipe.0)
        };

Unfortunately converting this whole file to rustix is a hassle :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping @rib :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm half wondering whether we should be setting O_CLOSEXEC via pipe2() also but actually I suppose it would maybe make sense that if you fork + exec for some reason then you'd want stdout/err for the thing you exec to also get directed to this same IO thread.

I guess it'd make sense to use O_CLOSEXEC on the just the read end maybe.

Copy link
Member Author

@MarijnS95 MarijnS95 Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sunfishcode hence perhaps using rustix in a popular boilerplate Rust+Android crate will give it the testing it deserves?

We even used to run ndk-glue in an emulator, but I don't think that was ported to android-activity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, let's move/continue discussions to #139 if there needs to be any.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's more like you had have a ref counted file object and a separate file descriptor - which is relevant here since the flags are associated with the fd, not the thing that's ref-counted.

Exactly, that is why the O_CLOEXEC no longer has an effect when the write-end is dup2()'d into stdout/stderr.

(nit: note that there's no S in there, ie. O_CLOseonEXECute)

thanks, yeah, I guess I forgot how many letters from 'close' to keep - I'd guess even that the 'O' might be for 'on', who knows :)

Since I think dup2 will create new FDs for stdout/err that don't have O_CLOSEXEC by default then it looks like we do now have the final result we want.

dup2() can't, because you're telling the kernel to make the stdout/stderr FD to "point to" the underlying (refcounted) object described/pointed-to by the "new" FD created by pipe(). If it were to create and assign new FDs things would break and pretty much defeat the purpose of dup2()?

hmm, no, dup2() doesn't really make an existing file descriptor point to the same file object - though it does give the illusion that that's what it does.

dup2() here effectively (atomically) close()s the original stdin/err (deleting the original FDs) and then creates new FDs that don't have flags set (such as O_CLOEXEC). The new FD created by dup2 hold a reference to the same file object that the oldfd points at.

So if you think of an FD as only the integer then it does give that illusion that an existing FD is being made to point at another file - but the FD really has some associated state (including the flags such as O_CLOEXEC) so it's quite an important distinction to note that dup2 is creating a new FD and the new stdin/err file descriptors from dup2() won't have O_CLOEXEC set.

This is effectively why there's now a dup3() to close the same race condition that existed with other system calls that create new file descriptors without O_CLOEXEC already set.

From https://man7.org/linux/man-pages/man2/dup.2.html:

The dup2() system call performs the same task as dup(), but instead of using the lowest-numbered unused file descriptor, it uses the file descriptor number specified in newfd. In other words, the file descriptor newfd is adjusted so that it now refers to the same open file description as oldfd.

The wording is very particular here and the devil is in the details. It's saying it "uses the file descriptor number", not the actual, pre-existing file descriptor. An FD is more than just the "number" that gets passed around in userspace and the FD is something that's also separate from the file object that it references.

Some file state is common to the referenced file object and some other state like the O_CLOEXEC is unique to each FD.

dup2() closes one FD, then creates a new FD that will also use the same number (such as stdout/err) that was previously associated with the FD which has been closed - and then makes that new FD reference the same file object as the other FD passed to dup2().

I'd maybe be slightly nervous about how well tested rustix is on Android in terms of being aware of what system calls are allowed on Android. At least with libc it tends to pretty much have a 1:1 mapping and it's easy to avoid system calls that aren't allowed.

It's not like rustix seems to call "arbitrary" syscalls under the hood at all, or abstract them away. Rather, it's providing better type safety around equally-named UNIX sycalls? At least for the ones that I've been looking at.

I guess at least the rustix binding for dup2 looked pretty explicit that it would map to the dup2 system call. I guess I would expect that not every API is implemented with such an explicit 1:1 mapping to system calls though and its those slightly higher-level APIs I'd maybe worry about.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't want to go to the depths of discussing that a light-weight file descriptor integer in userspace must (obviously) have a whole representation on the kernel side before even pointing to any sort of file object.

However, for a completely distinct case, it is perhaps worth mentioning it, starting with this quote from dup3() docs:

dup()/dup2()
...
The two descriptors do not share file descriptor flags (the close-on-exec flag). The close-on-exec flag (FD_CLOEXEC; see fcntl(2)) for the duplicate descriptor is off.

dup3() is the same as dup2(), except that:

  • The caller can force the close-on-exec flag to be set for the new file descriptor by specifying O_CLOEXEC in flags. See the description of the same flag in open(2) for reasons why this may be useful.

We already deduced above that FD flags affect the FD (or FD number as it were), and won't be "copied" over from oldfd to newfd. However, noting that newfd gets its whole, for the theoretical case that stdin/stdout have O_CLOEXEC set, FD replaced without any sharing is still useful: in this case the resulting FD (which the FD number points to) won't have O_CLOEXEC set anymore.


As such, if there is a case where an Android app may set O_CLOEXEC on stdin/stderr before handing control over to android-activity, we'll undo that with our dup2(). Should we have F_GETFD to check if it is set, and if so use dup3() to add it back?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess at least the rustix binding for dup2 looked pretty explicit that it would map to the dup2 system call. I guess I would expect that not every API is implemented with such an explicit 1:1 mapping to system calls though and its those slightly higher-level APIs I'd maybe worry about.

I'm not worried too much if we're translating existing libc:: calls to their 1:1 equivalents in rustix, unless we cannot find that or stumble upon a more high-level abstraction over it, in which case we will have to check whether it does the same.

However, I doubt android-activity does much if anything special with syscalls.

libc::pipe(logpipe.as_mut_ptr());
libc::pipe2(logpipe.as_mut_ptr(), libc::O_CLOEXEC);
libc::dup2(logpipe[1], libc::STDOUT_FILENO);
libc::dup2(logpipe[1], libc::STDERR_FILENO);
std::thread::spawn(move || {
let tag = CStr::from_bytes_with_nul(b"RustStdoutStderr\0").unwrap();
let file = File::from_raw_fd(logpipe[0]);
let mut reader = BufReader::new(file);
let mut buffer = String::new();
loop {
buffer.clear();
if let Ok(len) = reader.read_line(&mut buffer) {
if len == 0 {
break;
} else if let Ok(msg) = CString::new(buffer.clone()) {
android_log(Level::Info, tag, &msg);
}
libc::close(logpipe[1]);

File::from_raw_fd(logpipe[0])
};

std::thread::spawn(move || {
let tag = CStr::from_bytes_with_nul(b"RustStdoutStderr\0").unwrap();
let mut reader = BufReader::new(file);
let mut buffer = String::new();
loop {
buffer.clear();
if let Ok(len) = reader.read_line(&mut buffer) {
Comment on lines +852 to +854
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably drive-by fix the error handling, because this will loop indefinitely as long as read_line() returns an error. For EOF it returns Ok(0) at least 😌

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like that EOF case is handled afterwards with if len == 0 { break; }? aah right, you're saying at least the EOF case is OK but it doesn't handle errors! Yeah that's not great!

At least it looks like ErrorKind::Interrupted is hidden from callers, so I guess probably any io::Error here will be a legitimate error that means we should exit the thread.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's exactly the Ok(0), we handle that, but there's nothing handling Err from read_line(). If that's returned indefinitely, this'll get a CPU core quite busy.

if len == 0 {
break;
} else if let Ok(msg) = CString::new(buffer.clone()) {
android_log(Level::Info, tag, &msg);
}
}
});
}
}
});

log::trace!(
"Creating: {:p}, saved_state = {:p}, save_state_size = {}",
Expand Down
Loading