This is a small library that implements epoll on top of kqueue. It has been successfully used to port libinput, libevdev, Wayland and more software to FreeBSD: https://www.freshports.org/devel/libepoll-shim/
It may be useful for porting other software that uses epoll as well.
There are some tests inside test/. They should also compile under Linux and can be used to verify proper epoll behavior.
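As an illustration of the kind of behavior such a test can check, here is a minimal sketch of a level-triggered readiness check on a pipe. It is not taken from test/; the function name check_pipe_readiness is hypothetical, and the code uses only the standard epoll calls, so it should behave the same on Linux and, ideally, on top of this library.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if epoll reports the pipe as expected,
 * -1 on any error or unexpected result. */
int check_pipe_readiness(void)
{
    int fds[2];
    int ep;
    int result = -1;
    char byte = 'x';
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event got;

    if (pipe(fds) < 0)
        return -1;
    ev.data.fd = fds[0];

    ep = epoll_create1(EPOLL_CLOEXEC);
    if (ep >= 0) {
        if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0 &&
            /* Empty pipe: a non-blocking wait must report nothing. */
            epoll_wait(ep, &got, 1, 0) == 0 &&
            write(fds[1], &byte, 1) == 1 &&
            /* One byte pending: the read end must be reported. */
            epoll_wait(ep, &got, 1, 0) == 1 &&
            got.data.fd == fds[0] && (got.events & EPOLLIN) &&
            /* Level-triggered: still reported while data is unread. */
            epoll_wait(ep, &got, 1, 0) == 1)
            result = 0;
        close(ep);
    }
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

A return value of 0 means all three expectations held; anything else points at either a setup failure or a deviation from epoll semantics.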
Sadly, this library contains some very ugly hacks and workarounds. For example:
- When using timerfd, signalfd or eventfd, the system calls read, write and close are redefined as macros to internal helper functions. This is needed as there is some internal context that has to be freed properly. This means that you shouldn't create a timerfd/signalfd in one part of a program and close it in a different part where sys/timerfd.h isn't included. The context would leak. Luckily, software such as libinput behaves very nicely and puts all timerfd related code in a single source file.

  Alternatively, a target/library epoll-shim-interpose is also provided. Instead of redefining those symbols as macros they are provided as "proper" symbols, making use of POSIX dlsym chaining with RTLD_NEXT.

  Which approach is more suitable depends on the application: If the use of epoll is very localized, the macro based approach has less overhead. If the use of those file descriptors is more pervasive, the interposition approach is more robust. It will be a bit less performant because all calls to read/write/close and so on will be routed through epoll-shim.
- There is limited support for file descriptors that lack support for kqueue but are supported by poll(2). This includes graphics or sound devices under /dev. Those descriptors are handled in an outer poll(2) loop. Edge triggering using EPOLLET will not work.
- Shimmed file descriptors cannot be shared between processes. On fork() those fds are closed. When trying to pass a shimmed fd to another process the sendmsg call will return EOPNOTSUPP. In most cases sharing epoll/timerfd/signalfd is a bad idea anyway, but there are some legitimate use cases (for example sharing semaphore eventfds, issue #23). When the OS natively supports eventfds (as is the case for FreeBSD >= 13) this library won't provide eventfd shims or the sys/eventfd.h header.
- There is no proper notification mechanism for changes to the system CLOCK_REALTIME clock on BSD systems. Also, kevent EVFILT_TIMERs use the system monotonic clock as reference. Therefore, in order to implement absolute (TFD_TIMER_ABSTIME) CLOCK_REALTIME timerfds or cancellation support (TFD_TIMER_CANCEL_ON_SET), a thread is spawned that periodically polls the system boot time for changes to the realtime clock.
The library is tested on the following operating systems:
- FreeBSD 12.2, 13.0
- NetBSD 9.1, -current 2022-03-06
- OpenBSD 7.1
- DragonFlyBSD 6.0.1
Be aware of some subtle kqueue bugs that may affect the emulated epoll behavior. I've marked tests that hit those behaviors as "skipped". Have a look at atf_tc_skip() calls in the tests.
Run the following commands to build libepoll-shim:
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build .
To run the tests:
ctest --output-on-failure
To install (as root):
cmake --build . --target install
- Introduce epoll-shim-interpose library. This library provides proper wrapper symbols for read/write/close/poll/ppoll/fcntl. If for some reason the macro based approach of redefining those symbols is not appropriate, using this library instead of epoll-shim might be an alternative.
- More faithful simulation of file descriptor semantics, including reference counting.
- Faster file descriptor lookup, using an array instead of a tree data structure.
- Define wrapper macros as variadic, except when ANSI C is used.
- Fix compiler warning when using shimmed fcntl.
- Allow setting O_NONBLOCK flag with fcntl on created file descriptors.
- Implement TFD_TIMER_CANCEL_ON_SET for timerfd.
- Implement correction of absolute (TFD_TIMER_ABSTIME) CLOCK_REALTIME timerfds when the system time is stepped.
- Fix compilation on FreeBSD < 12 (#28).
- Add O_CLOEXEC handling to created file descriptors (PR #26, thanks arichardson!). Note that the shimmed file descriptors still won't work correctly after exec(3). Therefore, not using EPOLL_CLOEXEC/TFD_CLOEXEC/SFD_CLOEXEC/EFD_CLOEXEC is strongly discouraged.
- Fix compilation on FreeBSD 12.1 (#25).
- signalfd now hooks into the signal disposition mechanism, just like on Linux. Note: poll and ppoll are also shimmed with macros in case sys/signalfd.h is included to support some use cases seen in the wild. Many more ssi_* fields are now set on the resulting struct signalfd_siginfo.
- More accurate timeout calculations for epoll_wait/poll/ppoll.
- Fix integer overflow on timerfd timeout field on 32-bit machines.
- Fix re-arming of timerfd timeouts on BSDs where EV_ADD of an EVFILT_TIMER doesn't do it.
- Add support for native eventfds (provided by FreeBSD >= 13). The sys/eventfd.h header will not be installed in this case.
- Add support for NetBSD 9.1.
- On FreeBSD, add missing sys/signal.h include that resulted in sigset_t errors (#21).
- Lift limit of 32 descriptors in epoll_wait(2).
- Implement EPOLLPRI using EVFILT_EXCEPT, if available. If it is not available, add logic to EVFILT_READ handling that will work if SO_OOBINLINE is set on the socket.
- Implement EPOLLONESHOT.
- Implement edge triggering with EPOLLET.
- Add support for unlimited numbers of poll-only fds per epoll instance.
- Merge EVFILT_READ/EVFILT_WRITE events together to more closely match epoll semantics.
- Add support for NetBSD, OpenBSD and DragonFlyBSD.
- Implement epoll_pwait(2).
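Several of the entries above concern trigger modes (EPOLLET, EPOLLONESHOT). The sketch below shows how those modes differ from the default level-triggered mode, using a pipe with one unread byte. The function name count_reports is hypothetical, and the counts described afterwards follow documented Linux epoll semantics, which this library aims to reproduce; they are not a claim about any particular kqueue backend.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers the read end of a pipe with `flags`,
 * writes one byte without ever reading it, and returns how many of two
 * consecutive non-blocking epoll_wait calls report the descriptor.
 * Returns -1 on error. */
int count_reports(uint32_t flags)
{
    int fds[2];
    int ep;
    int reports = -1;
    char byte = 'x';
    struct epoll_event ev = { .events = flags };
    struct epoll_event got;

    if (pipe(fds) < 0)
        return -1;
    ev.data.fd = fds[0];

    ep = epoll_create1(EPOLL_CLOEXEC);
    if (ep >= 0) {
        if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0 &&
            write(fds[1], &byte, 1) == 1) {
            reports = 0;
            for (int i = 0; i < 2; i++) {
                int n = epoll_wait(ep, &got, 1, 0);
                if (n < 0) {
                    reports = -1;
                    break;
                }
                reports += n;
            }
        }
        close(ep);
    }
    close(fds[0]);
    close(fds[1]);
    return reports;
}
```

With plain EPOLLIN the descriptor is reported by both waits because the byte stays unread. With EPOLLIN | EPOLLET it is reported once, since no new data arrives between the waits. With EPOLLIN | EPOLLONESHOT it is also reported once, and stays disabled until it is re-armed with EPOLL_CTL_MOD.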