From 5774b3b9f4a1d196749b89aa2ab2f89feac18df8 Mon Sep 17 00:00:00 2001 From: Marshall Roch Date: Wed, 29 Jun 2022 10:01:40 -0700 Subject: [PATCH] catch EBADF exceptions from Lwt_process.with_process_full Summary: we're seeing exceptions like this: ``` Failed to initialize watchman: Unix.Unix_error(Unix.EBADF, "close", "") Raised by primitive operation at Lwt_unix.self_result in file "src/unix/lwt_unix.cppo.ml", line 237, characters 14-31 Re-raised at Watchman.get_sockname.(fun) in file "flow/src/hack_forked/watchman/watchman.ml", line 381, characters 2-934 Re-raised at FileWatcher.WatchmanFileWatcher.watchman#wait_for_init.go_exn.(fun) in file "flow/src/monitor/fileWatcher.ml", line 534, characters 10-1023 ``` `Watchman.get_sockname` calls `LwtSysUtils.exec` which calls `Lwt_process.with_process_full`. `with_process_full` does an `Lwt.finally` to make sure it calls `process#close`. `close` is throwing because the process's fd is invalid (perhaps already closed). The stack traces aren't much to go on, but I suspect this is happening on exceptions. The file descriptor shouldn't be invalid otherwise. That means there's an underlying exception that's getting hidden by the `finally` also throwing. So, here we ignore `EBADF` exceptions from the `close` and rethrow the original exception. I also filed https://github.com/ocsigen/lwt/issues/956 to discuss doing this upstream. Changelog: [internal] Reviewed By: samwgoldman Differential Revision: D37420401 fbshipit-source-id: de63ab6ed2dd4a43ffe190d31da39bd98633a4fb --- src/common/lwt/lwtSysUtils.ml | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/src/common/lwt/lwtSysUtils.ml b/src/common/lwt/lwtSysUtils.ml index b81a89ad2a6..ff1af260011 100644 --- a/src/common/lwt/lwtSysUtils.ml +++ b/src/common/lwt/lwtSysUtils.ml @@ -50,11 +50,37 @@ let prepare_args cmd args = instead, when we pass "". *) ("", Array.of_list (cmd :: args)) +(** At least as of Lwt 5.5.0, [Lwt_process.with_process_full] tries to close + the process even when [f] fails, and can raise an EBADF that swallows + whatever the original exception was. https://github.com/ocsigen/lwt/issues/956 + + Instead, we will swallow exceptions from [close] and use our [Exception] to + reraise the original exception. We also use ppx_lwt instead of [Lwt.finalize] + to improve backtraces. *) +let with_process_full ?timeout ?env ?cwd cmd f = + let process = Lwt_process.open_process_full ?timeout ?env ?cwd cmd in + let ignore_close process = + try%lwt + let%lwt _ = process#close in + Lwt.return_unit + with + | Unix.Unix_error (Unix.EBADF, _, _) -> Lwt.return_unit + in + let%lwt result = + try%lwt f process with + | e -> + let exn = Exception.wrap e in + let%lwt () = ignore_close process in + Exception.reraise exn + in + let%lwt () = ignore_close process in + Lwt.return result + let exec ?env ?cwd cmd args = - Lwt_process.with_process_full ?env ?cwd (prepare_args cmd args) command_result_of_process + with_process_full ?env ?cwd (prepare_args cmd args) command_result_of_process let exec_with_timeout ~timeout cmd args = - Lwt_process.with_process_full (prepare_args cmd args) (fun process -> + with_process_full (prepare_args cmd args) (fun process -> let timeout_msg = Printf.sprintf "Timed out while running `%s` after %.3f seconds" cmd timeout in