This reverts commit e5cb1d3d3d.
Reason: bisecting says it breaks dnstap.
Also revert commit 27134547ff ("kqueue: use EVFILT_USER for async if
available") because otherwise the first commit doesn't revert cleanly,
with enough conflicts in src/unix/async.c that I'm not comfortable
fixing those up manually.
Fixes: https://github.com/libuv/libuv/issues/4584
Route ftruncate() system calls through io_uring instead of the thread
pool when the kernel is new enough to support it (linux >= 6.9).
This commit harmonizes how libuv checks if the kernel is new enough.
Some ops were checking against `uv__kernel_version()` directly while
others stored the result of the version check as a feature flag.
Because the kernel version is cached, and because it is more direct
than a feature flag, I opted for the former approach.
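
A minimal sketch of the direct-comparison style, for illustration only.
The helper names `pack_version()` and `kernel_has_ftruncate_op()` are
hypothetical and the packing scheme is an assumption, not necessarily
what libuv does internally:

```c
#include <assert.h>

/* Hypothetical helper: pack major.minor.patch into one comparable
 * integer. The patch level is clamped so it cannot spill into the
 * minor field. */
static unsigned pack_version(unsigned major, unsigned minor, unsigned patch) {
  assert(minor < 256);
  if (patch > 255)
    patch = 255;
  return major * 65536 + minor * 256 + patch;
}

/* Compare the (cached) kernel version directly at the point of use
 * instead of consulting a separately stored feature flag. */
static int kernel_has_ftruncate_op(unsigned kernel_version) {
  return kernel_version >= pack_version(6, 9, 0);
}
```

With this style the minimum version sits next to the operation that
needs it, so there is no second piece of state to keep in sync.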
Introduced in Linux 6.6, IORING_SETUP_NO_SQARRAY tells the kernel to
omit the sqarray from the ring buffer.
libuv initializes the array once to an identity mapping and then forgets
about it, so not only does it save a little memory (ca. 1 KiB per ring)
but it also makes things more efficient kernel-side because it removes
a level of indirection.
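
To illustrate what an identity mapping means here, a small sketch. The
function names are hypothetical, and the 256-entry ring size used in the
usage note is an assumption chosen because it matches the "ca. 1 KiB"
figure with 32-bit indices:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Slot i of the sqarray points at submission queue entry i, so the
 * array never needs to be touched again after initialization. */
static void sqarray_fill_identity(uint32_t* sqarray, uint32_t entries) {
  uint32_t i;

  assert(sqarray != NULL);
  for (i = 0; i < entries; i++)
    sqarray[i] = i;
}

/* Memory taken up by the array: one 32-bit index per entry. */
static size_t sqarray_bytes(uint32_t entries) {
  return entries * sizeof(uint32_t);
}
```

For an assumed 256-entry ring, `sqarray_bytes(256)` is 1024 bytes, and
every lookup through the array returns the index it was given.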
The SQPOLL io_uring instance behaved inconsistently for users depending
on kernel version, load shape, and so on, creating issues that are
difficult to track down and fix. Don't use this ring by default but allow
enabling it by calling `uv_loop_configure()` with
`UV_LOOP_ENABLE_IO_URING_SQPOLL`.
Register the eventfd with EPOLLET to enable edge-triggered notification,
which lets us eliminate the overhead of reading the eventfd via a
system call on each wakeup event.
When the eventfd counter reaches the maximum value of an unsigned 64-bit
integer, which may never happen during the lifetime of the process, we
rewind the counter and retry.
This optimization saves one system call on each event-loop wakeup,
eliminating the overhead of read(2) as well as the extra latency
for each epoll wakeup.
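
A rough, Linux-only sketch of the technique, not the actual libuv code.
The `wakeup_fd_*` function names are hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create a non-blocking eventfd and register it edge-triggered. */
static int wakeup_fd_create(int epfd) {
  struct epoll_event ev;
  int efd;

  assert(epfd >= 0);
  efd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
  if (efd == -1)
    return -1;

  memset(&ev, 0, sizeof(ev));
  ev.events = EPOLLIN | EPOLLET;
  ev.data.fd = efd;
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev)) {
    close(efd);
    return -1;
  }
  return efd;
}

/* Bump the counter. Only when it is saturated, drain it and retry. */
static int wakeup_fd_signal(int efd) {
  uint64_t one = 1;
  uint64_t drained;

  for (;;) {
    if (write(efd, &one, sizeof(one)) == (ssize_t) sizeof(one))
      return 0;
    if (errno == EINTR)
      continue;
    if (errno != EAGAIN)
      return -1;
    /* Counter is at its maximum: a read resets it to zero. */
    if (read(efd, &drained, sizeof(drained)) == -1)
      if (errno != EAGAIN && errno != EINTR)
        return -1;
  }
}

/* Poll without blocking. Note that the eventfd is never read here. */
static int wakeup_fd_poll(int epfd) {
  struct epoll_event ev;

  return epoll_wait(epfd, &ev, 1, 0);
}
```

Because the registration is edge-triggered, each write produces one
readiness event and the consumer does not need to read(2) the counter
to clear it.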
Perform EPOLL_CTL_DEL immediately instead of going through
io_uring's submit queue, otherwise the file descriptor may
be closed by the time the kernel starts the operation.
Fixes: https://github.com/libuv/libuv/issues/4323
uv_available_parallelism() does not account for container CPU limits
set by systems like Docker or Kubernetes. This patch fixes
this limitation by comparing the number of available CPUs
returned by the syscall with the CPU quota defined
in the cgroup.
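
The comparison can be sketched roughly as follows, assuming the cgroup
v2 `cpu.max` format ("<quota> <period>", or "max <period>" when there is
no limit). The function name `effective_parallelism()` is hypothetical,
and rounding the quota down with a floor of 1 is an assumption of this
sketch:

```c
#include <assert.h>
#include <stdio.h>

/* ncpus: CPU count reported by the affinity syscall.
 * cpu_max: contents of the cgroup's cpu.max file. */
static int effective_parallelism(int ncpus, const char* cpu_max) {
  long long quota;
  long long period;
  long long limit;

  assert(ncpus >= 1);
  assert(cpu_max != NULL);

  /* "max <period>" has no numeric quota, so the parse fails and the
   * syscall's answer is used unchanged. */
  if (sscanf(cpu_max, "%lld %lld", &quota, &period) != 2)
    return ncpus;
  if (quota <= 0 || period <= 0)
    return ncpus;

  limit = quota / period;
  if (limit < 1)
    limit = 1;

  return limit < ncpus ? (int) limit : ncpus;
}
```

For example, a container limited to two CPUs on an eight-core host would
have "200000 100000" in `cpu.max`, and the sketch returns 2 rather than 8.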
Fixes: https://github.com/libuv/libuv/issues/4146
It's been reported that creating many event loops introduces measurable
overhead now that libuv creates an sqpoll-enabled ring.
I don't really see any change in CPU time with or without this change
but deferring ring creation until it's actually used seems like a good
idea, and comes with no downsides that I can think of, so let's do it.
Fixes: https://github.com/libuv/libuv/issues/4308
Since `io_uring` support was added, libuv's signal handler randomly
segfaults on ppc64 when interrupting `epoll_pwait`. Disable it
pending further investigation.
Issue: https://github.com/libuv/libuv/issues/4283
Introduced long ago for old Linux/libc flavors libuv no longer supports.
We include <ifaddrs.h> unconditionally elsewhere so there is no point in
special-casing it here.
Fixes: https://github.com/libuv/libuv/issues/4242
Match the implementation for linux.c to kqueue.c in the code around the
calls to kevent and epoll.
In linux.c the code was made more DRY by moving the nfds check up
(including a comment explaining why it's possible) and combining two
if checks into one.
In kqueue.c the assert that checks the timeout when nfds == 0 has been
moved directly after the EINTR check, since it should always hold
regardless.
Ref: https://github.com/libuv/libuv/pull/3893
Ref: https://github.com/nodejs/node/issues/48490
Work around a poorly understood bug in older kernels where closing a
file descriptor pointing to /foo/bar results in ETXTBSY errors when
trying to execve("/foo/bar") later on.
The bug seems to have been fixed somewhere between 5.15.85 and 5.15.90.
I couldn't pinpoint the responsible commit but good candidates are the
several data race fixes.
Interestingly, it seems to manifest only when running under Docker so
the possibility of a Docker bug can't be completely ruled out either.
This commit moves uv__kernel_version() from fs.c to linux.c because the
latter now uses it more than the former.
Fixes: https://github.com/nodejs/node/issues/48444
Recent versions of gcc have started emitting warnings about the liberal
type casting inside the QUEUE macros. Although the warnings are false
positives, let's use them as the impetus to switch to a type-safer and
arguably cleaner approach.
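
A minimal sketch of what a type-safer intrusive queue can look like. The
`queue_*` names and `struct task` are hypothetical and only illustrate
the general approach of a real struct plus inline functions instead of
casts over a `void*[2]` array:

```c
#include <assert.h>
#include <stddef.h>

struct queue {
  struct queue* next;
  struct queue* prev;
};

static inline void queue_init(struct queue* q) {
  q->next = q;
  q->prev = q;
}

static inline int queue_empty(const struct queue* q) {
  return q == q->next;
}

static inline struct queue* queue_head(const struct queue* q) {
  return q->next;
}

static inline void queue_insert_tail(struct queue* h, struct queue* q) {
  q->next = h;
  q->prev = h->prev;
  q->prev->next = q;
  h->prev = q;
}

static inline void queue_remove(struct queue* q) {
  q->prev->next = q->next;
  q->next->prev = q->prev;
}

/* The only remaining cast: recover the containing struct from a node. */
#define queue_data(ptr, type, field)                                      \
  ((type*) ((char*) (ptr) - offsetof(type, field)))

/* Example of a struct that embeds a queue node. */
struct task {
  int id;
  struct queue node;
};
```

With typed pointers, passing the wrong kind of object to a queue
function becomes a compile-time diagnostic rather than a silent cast.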
Fixes: https://github.com/libuv/libuv/issues/4019
When trying to write more than `IOV_MAX` buffers, the
`IORING_OP_WRITEV` operation will return `EINVAL`. As a temporary fix,
fall back to the old ways. In the future we might implement this by
linking multiple `IORING_OP_WRITEV` requests using `IOSQE_IO_LINK`.
When there are more than 128 concurrent completions, the CQ ring
overflows, as signaled via the `UV__IORING_SQ_CQ_OVERFLOW` flag. If this
happens we have to enter the kernel to get the remaining items.
The worker pool calls all callbacks locally within the queue. So the
value of nevents doesn't properly reflect that case. Increase the number
of events directly from the worker pool's callback to correct this.
In order to properly determine if the events_waiting counter needs to be
incremented, store the timeout value at the time the event provider was
called.
Fix a valgrind warning that only manifested with clang (not gcc!) by
explicitly passing 0L instead of plain 0 as the |sigsz| argument to
io_uring_enter(). That is, pass a long instead of an int.
On x86_64, |sigsz| is passed on the stack (the other arguments are
passed in registers) but where gcc emits a `push $0` that zeroes the
entire stack slot, clang emits a `movl $0,(%rsp)` that leaves the upper
32 bits untouched.
It's academic though since we don't pass IORING_ENTER_EXT_ARG and the
kernel therefore completely ignores the argument.
Refs: https://github.com/libuv/libuv/pull/3952
Add io_uring support for several asynchronous file operations:
- read, write
- fsync, fdatasync
- stat, fstat, lstat
io_uring is used when the kernel is new enough, otherwise libuv simply
falls back to the thread pool.
Performance looks great; an 8x increase in throughput has been observed.
This work was sponsored by ISC, the Internet Systems Consortium.
Fixes: https://github.com/libuv/libuv/issues/1947
The last major distro that supported the oabi calling convention was
Debian 5 (Lenny) and that went out of support in February 2012. It seems
like a fairly safe assumption that nothing speaks oabi anymore in this
day and age.
Fixes: https://github.com/libuv/libuv/issues/3935
Replace the throw-type-safety-to-the-wind CAST() macro with an inline
function that is hopefully harder to misuse. It should make the inotify
code slightly more legible if nothing else.
On Linux, CPUs can come online or go offline while uv_cpu_info() is busy
gathering data. Change uv_cpu_info() in the following ways:
1. Learn online CPUs from /proc/stat
2. Get the model name from /proc/cpuinfo when it has a matching CPU,
or default to "unknown"
3. Get speed from /sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq
when it exists, or default to 0
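
Step 1 can be sketched as below. This is an illustration rather than the
actual implementation: `count_online_cpus()` is a hypothetical name, and
the sketch operates on text already read from /proc/stat:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Count per-CPU "cpuN ..." lines, skipping the aggregate "cpu ..."
 * line at the top. Only CPUs that have a line are counted. */
static unsigned count_online_cpus(const char* stat_text) {
  const char* line;
  unsigned n;

  assert(stat_text != NULL);
  n = 0;
  line = stat_text;

  while (line != NULL && *line != '\0') {
    if (strncmp(line, "cpu", 3) == 0 && isdigit((unsigned char) line[3]))
      n++;
    line = strchr(line, '\n');
    if (line != NULL)
      line++;  /* Move past the newline to the start of the next line. */
  }

  return n;
}
```

Steps 2 and 3 would then look up each counted CPU number in
/proc/cpuinfo and under /sys/devices/system/cpu, falling back to
"unknown" and 0 when there is no match.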
Before this commit, libuv read the speed from /proc/cpuinfo but that
reports the base frequency, not the actual frequency. My system has
two cores running permanently at 3.6 GHz but libuv thought all 12 ran
at 2.2 GHz.
Fixes: https://github.com/libuv/libuv/issues/2351
Fixes: https://github.com/libuv/libuv/issues/3858
The old Linux baseline was essentially RHEL 6 but that distro has been
out of support for two years now. Move to RHEL 7.
This commit also moves FreeBSD to tier 2 because it isn't actually
part of libuv's CI matrix, only Node's.
Fixes: https://github.com/libuv/libuv/issues/3822