Commit Graph

74 Commits

Author SHA1 Message Date
Jameson Nash
ced77975b8
Revert "Revert "linux: eliminate a read on eventfd per wakeup (#4400)" (#4585)"
This reverts commit 18d48bc13c.
2024-10-18 09:51:27 -04:00
Ben Noordhuis
18d48bc13c
Revert "linux: eliminate a read on eventfd per wakeup (#4400)" (#4585)
This reverts commit e5cb1d3d3d.

Reason: bisecting says it breaks dnstap.

Also revert commit 27134547ff ("kqueue: use EVFILT_USER for async if
available") because otherwise the first commit doesn't revert cleanly,
with enough conflicts in src/unix/async.c that I'm not comfortable
fixing those up manually.

Fixes: https://github.com/libuv/libuv/issues/4584
2024-10-17 20:41:38 +02:00
Ben Noordhuis
f806be87d3
linux: use IORING_OP_FTRUNCATE when available (#4554)
Route ftruncate() system calls through io_uring instead of the thread
pool when the kernel is new enough to support it (linux >= 6.9).

This commit harmonizes how libuv checks if the kernel is new enough.
Some ops were checking against `uv__kernel_version()` directly while
others stored the result of the version check as a feature flag.

Because the kernel version is cached, and because it is more direct
than a feature flag, I opted for the former approach.
2024-09-30 21:55:34 +02:00
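Note on f806be87d3: the version comparison described in the message can be sketched as follows. This is an illustration only, not libuv's code: `pack_kernel_version()` and `can_use_uring_ftruncate()` are hypothetical names, and the packing scheme (one byte each for minor and patch) is an assumption chosen so that versions compare with ordinary integer operators.

```c
#include <stdio.h>

/* Hypothetical helper: pack "major.minor.patch" into a single integer,
 * e.g. 6.9.0 -> 0x060900, so that versions compare with plain >= and <.
 * Returns 0 when the string cannot be parsed. */
static unsigned pack_kernel_version(const char *release) {
  unsigned major, minor, patch;

  if (sscanf(release, "%u.%u.%u", &major, &minor, &patch) != 3)
    return 0;  /* Unparseable: treated as older than every threshold. */

  if (patch > 255)
    patch = 255;  /* Keep the patch level inside its one-byte slot. */

  return major * 65536 + minor * 256 + patch;
}

/* Hypothetical check mirroring the message: linux >= 6.9 is required. */
static int can_use_uring_ftruncate(const char *release) {
  return pack_kernel_version(release) >= 0x060900;
}
```

A caller would compute the packed version once, keep it in a static variable, and compare against a per-operation threshold, which is the "check the version directly" approach the message settles on.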
Ben Noordhuis
bcc6d1c1fc
linux: use IORING_SETUP_NO_SQARRAY when available (#4553)
Introduced in Linux 6.6, it tells the kernel to omit the sqarray from
the ring buffer.

Libuv initializes the array once to an identity mapping and then forgets
about it, so not only does it save a little memory (ca. 1 KiB per ring)
but it also makes things more efficient kernel-side because it removes
a level of indirection.
2024-09-30 19:44:27 +02:00
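Note on bcc6d1c1fc: the "level of indirection" can be pictured with plain arrays. The sketch below is not kernel or libuv code; `RING_ENTRIES` and the function names are hypothetical, and it only models how a consumer would locate a submission entry with and without the index array.

```c
#include <stdint.h>

#define RING_ENTRIES 8u  /* Hypothetical ring size; must be a power of two. */

/* Fill the index array once with an identity mapping: slot i -> entry i. */
static void init_identity(uint32_t *sqarray) {
  uint32_t i;

  for (i = 0; i < RING_ENTRIES; i++)
    sqarray[i] = i;
}

/* With an index array, the consumer does one extra lookup per entry. */
static uint32_t sqe_index_with_array(const uint32_t *sqarray, uint32_t head) {
  return sqarray[head & (RING_ENTRIES - 1)];
}

/* Without it, the masked ring position already is the entry index. */
static uint32_t sqe_index_without_array(uint32_t head) {
  return head & (RING_ENTRIES - 1);
}
```

Because the array only ever holds the identity mapping, both lookups return the same index for every ring position, so dropping the array changes nothing observable.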
Santiago Gimeno
e78e29c231
linux: disable SQPOLL io_uring by default (#4492)
The SQPOLL io_uring instance wasn't providing consistent behaviour to
users depending on kernel versions, load shape, ... creating issues
difficult to track and fix. Don't use this ring by default but allow
enabling it by calling `uv_loop_configure()` with
`UV_LOOP_ENABLE_IO_URING_SQPOLL`.
2024-08-06 22:10:13 +02:00
Andy Pan
e5cb1d3d3d
linux: eliminate a read on eventfd per wakeup (#4400)
Register the eventfd with EPOLLET to enable edge-triggered notification
where we're able to eliminate the overhead of reading the eventfd via
system call on each wakeup event.

When the eventfd counter reaches the maximum value of an unsigned 64-bit
integer, which may not happen for the entire lifetime of the process, we
rewind the counter and retry.

This optimization saves one system call on each event-loop wakeup,
eliminating the overhead of read(2) as well as the extra latency
for each epoll wakeup.
2024-07-29 19:59:41 -04:00
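Note on e5cb1d3d3d: a minimal Linux-only sketch of the idea, assuming nothing about libuv's actual implementation. The function name is hypothetical. It registers an eventfd with EPOLLET and shows that every write produces a fresh wakeup even though the counter is never drained with read(2). The counter-overflow handling mentioned in the message is not shown.

```c
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical demo: returns how many wakeups epoll reported for two
 * eventfd writes without ever reading the eventfd, or -1 on failure. */
static int count_wakeups_without_read(void) {
  struct epoll_event ev, out;
  uint64_t one = 1;
  int efd, epfd, wakeups = 0, i;

  efd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
  epfd = epoll_create1(EPOLL_CLOEXEC);
  if (efd == -1 || epfd == -1)
    goto fail;

  memset(&ev, 0, sizeof(ev));
  ev.events = EPOLLIN | EPOLLET;  /* Edge-triggered registration. */
  ev.data.fd = efd;
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev))
    goto fail;

  for (i = 0; i < 2; i++) {
    if (write(efd, &one, sizeof(one)) != (ssize_t) sizeof(one))
      goto fail;
    /* Each write is a new edge, so the fd is reported again even
     * though the counter was never reset by a read. */
    if (epoll_wait(epfd, &out, 1, 0) == 1)
      wakeups++;
    /* No new write since the last wait: nothing is reported. */
    if (epoll_wait(epfd, &out, 1, 0) != 0)
      goto fail;
  }

  close(efd);
  close(epfd);
  return wakeups;

fail:
  if (efd != -1)
    close(efd);
  if (epfd != -1)
    close(epfd);
  return -1;
}
```

With level-triggered registration the second `epoll_wait()` in each iteration would keep reporting the descriptor until it is read, which is the extra system call this commit avoids.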
Viacheslav Muravyev
7c491bde32
unix,win: remove unused req parameter from macros (#4435)
Remove the unused `req` parameter from the uv__req_register and
uv__req_unregister macros.
2024-07-11 21:29:15 +02:00
Ben Noordhuis
3ecce91410
linux: don't delay EPOLL_CTL_DEL operations (#4328)
Perform EPOLL_CTL_DEL immediately instead of going through
io_uring's submit queue, otherwise the file descriptor may
be closed by the time the kernel starts the operation.

Fixes: https://github.com/libuv/libuv/issues/4323
2024-03-21 09:23:08 +01:00
Farzin Monsef
b0816180e3
linux: fix /proc/self/stat executable name parsing (#4353)
- The filename of the executable may contain both spaces and parentheses
- Use uv__slurp instead of open/read/close
2024-03-14 09:35:25 +01:00
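Note on b0816180e3: because the executable name may itself contain spaces and parentheses, a robust parser has to look for the last closing parenthesis rather than the first. The sketch below illustrates that idea on a string; `stat_comm()` is a hypothetical helper, not the function libuv uses, and it assumes the fields after the name never contain a `)`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the executable name out of a
 * /proc/<pid>/stat line of the form "pid (name) state ...".
 * Returns a pointer to the text following the name, or NULL if the
 * line is malformed. The name is truncated to fit `out`. */
static const char *stat_comm(const char *line, char *out, size_t outlen) {
  const char *lparen = strchr(line, '(');   /* First '(' opens the name. */
  const char *rparen = strrchr(line, ')');  /* Last ')' closes it. */
  size_t len;

  if (lparen == NULL || rparen == NULL || rparen < lparen || outlen == 0)
    return NULL;

  len = (size_t) (rparen - lparen - 1);
  if (len >= outlen)
    len = outlen - 1;

  memcpy(out, lparen + 1, len);
  out[len] = '\0';
  return rparen + 1;
}
```

For a line such as `1234 (my (odd) app) S 1 1234`, the name comes out as `my (odd) app` and the returned pointer starts at ` S 1 1234`, so the remaining fields can be parsed as usual.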
Thomas Walter
6b56200cc8
linux: fix uv_available_parallelism using cgroup (#4278)
uv_available_parallelism does not account for container CPU limits
set by systems like Docker or Kubernetes. This patch fixes
this limitation by comparing the number of available CPUs
returned by the syscall with the CPU quota defined
in the cgroup.

Fixes: https://github.com/libuv/libuv/issues/4146
2024-02-28 12:23:23 +01:00
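Note on 6b56200cc8: the comparison between the CPU count and the cgroup quota might look roughly like this. It is a sketch under several assumptions: `clamp_by_cpu_max()` is a hypothetical name, the input is the content of a cgroup v2 `cpu.max` file (`"<quota> <period>"` or `"max <period>"`), and fractional quotas are rounded down with a floor of one CPU. libuv's actual rounding and cgroup v1 handling may differ.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: limit `ncpus` (e.g. the affinity-mask count)
 * by a cgroup v2 cpu.max line. Returns `ncpus` unchanged when no
 * quota is set or the line cannot be parsed. */
static int clamp_by_cpu_max(int ncpus, const char *cpu_max) {
  long long quota, period, limit;

  if (strncmp(cpu_max, "max", 3) == 0)
    return ncpus;  /* No quota configured. */

  if (sscanf(cpu_max, "%lld %lld", &quota, &period) != 2)
    return ncpus;

  if (quota <= 0 || period <= 0)
    return ncpus;

  limit = quota / period;  /* Assumption: round down... */
  if (limit < 1)
    limit = 1;             /* ...but never report fewer than one CPU. */

  return limit < ncpus ? (int) limit : ncpus;
}
```

For example, a container on an 8-CPU host with `cpu.max` set to `200000 100000` would report 2 under these assumptions.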
Ben Noordhuis
507f3046d1
linux: create io_uring sqpoll ring lazily (#4315)
It's been reported that creating many event loops introduces measurable
overhead now that libuv creates an sqpoll-enabled ring.

I don't really see any change in CPU time with or without this change
but deferring ring creation until it's actually used seems like a good
idea, and comes with no downsides that I can think of, so let's do it.

Fixes: https://github.com/libuv/libuv/issues/4308
2024-02-14 11:20:44 +01:00
Brad King
3b6a1a14ca
linux: disable io_uring on ppc64 and ppc64le (#4285)
Since `io_uring` support was added, libuv's signal handler randomly
segfaults on ppc64 when interrupting `epoll_pwait`.  Disable it
pending further investigation.

Issue: https://github.com/libuv/libuv/issues/4283
2024-01-13 12:04:01 +01:00
Santiago Gimeno
160cd5629e
linux: retry fs op if unsupported by io_uring (#4268)
Fall back to the threadpool if io_uring returns `EOPNOTSUPP`.

Fixes: https://github.com/nodejs/node/issues/50876
2024-01-08 22:25:44 +01:00
Ben Noordhuis
a7d5255122
linux: remove HAVE_IFADDRS_H macro (#4243)
Introduced long ago for old Linux/libc flavors libuv no longer supports.

We include <ifaddrs.h> unconditionally elsewhere so there is no point in
special-casing it here.

Fixes: https://github.com/libuv/libuv/issues/4242
2023-12-12 15:13:31 -05:00
matoro
f144429365
linux: disable io_uring on hppa below kernel 6.1.51 (#4224)
The first kernel with support is 6.1, but it was only fully functional from
6.1.51 onwards: https://lore.kernel.org/all/cb912694-b1fe-dbb0-4d8c-d608f3526905@gmx.de/

Co-authored-by: matoro <matoro@users.noreply.github.com>
2023-11-15 23:57:06 +01:00
Ben Noordhuis
a389393ffa
linux: disable io_uring on 32 bits arm systems (#4187)
It's been reported that no released kernels are bug-free enough to use
io_uring without causing regressions.

Fixes: https://github.com/libuv/libuv/issues/4158
2023-10-28 13:18:42 +02:00
Santiago Gimeno
c811169f91
unix: disable io_uring close on selected kernels (#4141)
Specifically on kernels between 5.16.0 (non-longterm) and 6.1.0
(longterm). Starting with longterm 6.1.0, the issue is solved.
2023-09-17 22:09:00 +02:00
Ben Noordhuis
0d78f3c758
unix: get mainline kernel version in Debian (#4131)
In Debian, the mainline kernel version is reported via the `uname()`
`version` field.
2023-09-01 11:24:26 +02:00
Santiago Gimeno
e2c8fed7b3
unix: get mainline kernel version in Ubuntu (#4131)
In Ubuntu, the kernel version reported by `uname()` follows the
versioning format that Ubuntu uses for their kernels which does not have
a direct correspondence with the mainline kernel version they're based
on. Get that version from `/proc/version_signature` as documented in:

https://wiki.ubuntu.com/Kernel/FAQ#Kernel.2FFAQ.2FGeneralVersionRunning.How_can_we_determine_the_version_of_the_running_kernel.3F
2023-09-01 11:24:26 +02:00
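Note on e2c8fed7b3: extracting the mainline version from the signature could be sketched as below. The helper name is hypothetical, and the sketch assumes the file content has three whitespace-separated fields with the mainline version last, for example `Ubuntu 5.15.0-79.86-generic 5.15.111`.

```c
#include <stdio.h>

/* Hypothetical helper: parse the third field of a
 * /proc/version_signature line as "major.minor.patch".
 * Returns 0 on success and -1 if the line does not match. */
static int parse_version_signature(const char *sig,
                                   unsigned *major,
                                   unsigned *minor,
                                   unsigned *patch) {
  /* %*s skips the distro name and the Ubuntu-specific version. */
  if (sscanf(sig, "%*s %*s %u.%u.%u", major, minor, patch) != 3)
    return -1;

  return 0;
}
```

A caller would fall back to the `uname()` release string when the file is missing or the parse fails.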
michalbiesek
65dc822d6c
linux: add missing riscv syscall numbers (#4127)
Signed-off-by: Michal Biesek <michalbiesek@gmail.com>
2023-08-25 21:41:56 +02:00
Trevor Norris
2f82750098
unix: match kqueue and epoll code (#4091)
Match the implementation for linux.c to kqueue.c in the code around the
calls to kevent and epoll.

In linux.c the code was made more DRY by moving the nfds check up
(including a comment explaining why it's possible) and combining two if
checks into one.

In kqueue.c the assert to check the timeout when nfds == 0 has been
moved to be called directly after the EINTR check, since it should
always be true regardless.

Ref: https://github.com/libuv/libuv/pull/3893
Ref: https://github.com/nodejs/node/issues/48490
2023-08-04 14:10:53 -06:00
Ben Noordhuis
30c3ef9f6f
linux: handle UNAME26 personality (#4109)
2023-07-31 23:40:59 +02:00
Ben Noordhuis
50b53cbd0d
linux: don't use io_uring on pre-5.10.186 kernels (#4093)
Those kernels have a known resource consumption bug where the sqpoll
thread busy-loops.

Fixes: https://github.com/libuv/libuv/issues/4089
2023-07-12 23:33:49 +02:00
Shuduo Sang
a939d643dd
linux: fix harmless warn_unused_result warning (#4056)
2023-07-12 23:00:59 +02:00
Ben Noordhuis
1752791c9e
linux: work around io_uring IORING_OP_CLOSE bug (#4059)
Work around a poorly understood bug in older kernels where closing a
file descriptor pointing to /foo/bar results in ETXTBSY errors when
trying to execve("/foo/bar") later on.

The bug seems to have been fixed somewhere between 5.15.85 and 5.15.90.
I couldn't pinpoint the responsible commit but good candidates are the
several data race fixes.

Interestingly, it seems to manifest only when running under Docker so
the possibility of a Docker bug can't be completely ruled out either.

This commit moves uv__kernel_version() from fs.c to linux.c because the
latter now uses it more than the former.

Fixes: https://github.com/nodejs/node/issues/48444
2023-06-20 13:01:12 +02:00
liuxiang88
7ada448d18
unix: add loongarch support (#4054)
Signed-off-by: liuxiang <liuxiang@loongson.cn>
2023-06-16 10:25:25 +02:00
Santiago Gimeno
e7b9633170
linux: fs_read to use io_uring if iovcnt > IOV_MAX (#4023)
Just cap it to `IOV_MAX` as it's already done when performing reads
using the threadpool.
2023-05-25 12:09:51 +02:00
Ben Noordhuis
1b01b786c0
unix,win: replace QUEUE with struct uv__queue (#4022)
Recent versions of gcc have started emitting warnings about the liberal
type casting inside the QUEUE macros. Although the warnings are false
positives, let's use them as the impetus to switch to a type-safer and
arguably cleaner approach.

Fixes: https://github.com/libuv/libuv/issues/4019
2023-05-25 00:04:30 +02:00
Santiago Gimeno
962b8e626c
linux: add some more iouring backed fs ops (#4012)
Specifically: `link`, `mkdir`, `rename`, `symlink` and `unlink`.
2023-05-23 10:42:20 +02:00
Ben Noordhuis
281e6185cc
android: disable io_uring support (#4016)
Android's zealous seccomp filter blocks the io_uring_setup system call.

Fixes: https://github.com/libuv/libuv/issues/4010
2023-05-23 00:25:09 +02:00
Santiago Gimeno
ef6a9a624d
linux: fix WRITEV with lots of bufs using io_uring (#4004)
When trying to write more than `IOV_MAX` buffers, the
`IORING_OP_WRITEV` operation will return `EINVAL`. As a temporary fix,
fall back to the old ways. In the future we might implement this by
linking multiple `IORING_OP_WRITEV` requests using `IOSQE_IO_LINK`.
2023-05-19 11:03:17 +02:00
Ben Noordhuis
d23a20f62c
linux: work around EOWNERDEAD io_uring kernel bug (#4002)
io_uring sometimes erroneously returns EOWNERDEAD when the intention was
to return 0. It's harmless and fixed in linux 5.14 so just ignore the
error.

Fixes: https://github.com/libuv/libuv/issues/4001
Refs: https://github.com/torvalds/linux/commit/21f965221e
2023-05-17 16:54:36 +02:00
Santiago Gimeno
30fc896cc1
unix: handle CQ overflow in iou ring (#3991)
When there are more than 128 concurrent cq completions, the CQ ring
overflows, as signaled via the `UV__IORING_SQ_CQ_OVERFLOW` flag. If this
happens, we have to enter the kernel to get the remaining items.
2023-05-15 10:42:14 +02:00
Tim Besard
6ad347fae4
unix: constrained_memory should return UINT64_MAX (#3753)
Document that we return UINT64_MAX if the cgroup limit is set to the
max. For cgroupv2, that happens if we encounter `max`, while cgroupv1
returns 9223372036854771712 when no limit is set (which according to
[this StackExchange discussion] is derived from LONG_MAX and
PAGE_SIZE). So make sure we also detect this case for cgroupv1.

[this StackExchange discussion]: https://unix.stackexchange.com/questions/420906/what-is-the-value-for-the-cgroups-limit-in-bytes-if-the-memory-is-not-restricte

Addresses: https://github.com/libuv/libuv/pull/3744/files#r974062912
2023-05-12 14:34:20 -04:00
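Note on 6ad347fae4: the two "no limit" encodings can be normalized along these lines. The sketch is illustrative only: `parse_memory_limit()` is a hypothetical name, and the cgroup v1 sentinel assumes a 64-bit `long` and 4096-byte pages, as in the message.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* cgroup v1 "unlimited" value: LONG_MAX rounded down to a multiple of
 * a 4096-byte page (assumes 64-bit long). */
#define CGROUP1_UNLIMITED UINT64_C(9223372036854771712)

/* Hypothetical helper: turn the text of a cgroup memory limit file
 * into a byte count, mapping both "unlimited" encodings to UINT64_MAX. */
static uint64_t parse_memory_limit(const char *text) {
  uint64_t value;

  if (strncmp(text, "max", 3) == 0)
    return UINT64_MAX;  /* cgroup v2 spelling of "no limit". */

  value = strtoull(text, NULL, 10);
  if (value == CGROUP1_UNLIMITED)
    return UINT64_MAX;  /* cgroup v1 spelling of "no limit". */

  return value;
}
```

The sentinel follows from the arithmetic in the message: 9223372036854775807 with its low 12 bits cleared is 9223372036854771712.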
Ben Noordhuis
6e073ef5da
linux: use io_uring to batch epoll_ctl calls (#3979)
This work was sponsored by ISC, the Internet Systems Consortium.
2023-05-01 09:00:08 +02:00
Ben Noordhuis
f272082240
linux: fix logic bug in sqe ring space check (#3980)
Handle wraparound properly, otherwise we may end up overwriting elements
that have not been consumed by the kernel yet.
2023-05-01 06:17:26 +02:00
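Note on f272082240: the message does not show the fix itself, so the following is only a generic sketch of a wraparound-safe fullness check for a ring with free-running 32-bit head and tail counters. The function name and parameters are hypothetical.

```c
#include <stdint.h>

/* Hypothetical check: `head` is advanced by the consumer (the kernel),
 * `tail` by the producer. Both count upwards without masking and are
 * allowed to wrap around at 2^32. */
static int ring_is_full(uint32_t head, uint32_t tail, uint32_t entries) {
  /* Unsigned subtraction yields the number of unconsumed elements
   * even after `tail` has wrapped past zero and `head` has not. */
  uint32_t in_flight = (uint32_t) (tail - head);

  return in_flight >= entries;
}
```

Comparing the raw counters directly, for example with `tail >= head + entries`, gives the wrong answer once `head + entries` overflows, which is the kind of mistake that can overwrite entries the kernel has not consumed yet.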
Trevor Norris
e02642cf3b
src: fix events/events_waiting metrics counter (#3957)
The worker pool calls all callbacks locally within the queue. So the
value of nevents doesn't properly reflect that case. Increase the number
of events directly from the worker pool's callback to correct this.

In order to properly determine if the events_waiting counter needs to be
incremented, store the timeout value at the time the event provider was
called.
2023-04-24 15:29:14 -06:00
Ben Noordhuis
1c935a3445
linux: remove bug workaround for obsolete kernels (#3965)
Libuv no longer supports such kernels so the workaround can be removed.
2023-04-20 12:15:32 +02:00
Ben Noordhuis
dfae365f84
linux: add IORING_OP_CLOSE support (#3964)
2023-04-20 10:44:16 +02:00
Ben Noordhuis
5ca5e475bb
linux: add IORING_OP_OPENAT support (#3963)
2023-04-20 10:17:06 +02:00
Ben Noordhuis
a7ff759ca1
linux: fix academic valgrind warning (#3960)
Fix a valgrind warning that only manifested with clang (not gcc!) by
explicitly passing 0L instead of plain 0 as the |sigsz| argument to
io_uring_enter(). That is, pass a long instead of an int.

On x86_64, |sigsz| is passed on the stack (the other arguments are
passed in registers) but where gcc emits a `push $0` that zeroes the
entire stack slot, clang emits a `movl $0,(%rsp)` that leaves the upper
32 bits untouched.

It's academic though since we don't pass IORING_ENTER_EXT_ARG and the
kernel therefore completely ignores the argument.

Refs: https://github.com/libuv/libuv/pull/3952
2023-04-19 07:39:10 +02:00
Ben Noordhuis
d2c31f429b
linux: introduce io_uring support (#3952)
Add io_uring support for several asynchronous file operations:

- read, write
- fsync, fdatasync
- stat, fstat, lstat

io_uring is used when the kernel is new enough, otherwise libuv simply
falls back to the thread pool.

Performance looks great; an 8x increase in throughput has been observed.

This work was sponsored by ISC, the Internet Systems Consortium.

Fixes: https://github.com/libuv/libuv/issues/1947
2023-04-18 12:32:08 +02:00
Trevor Norris
2f33980a91
src: switch to use C11 atomics where available (#3950)
Switch all code in unix/ to use C11 atomics directly.

Change uv_library_shutdown() to use an exchange instead of load/store.

Unfortunately MSVC only started supporting C11 atomics in VS2022 version
17.5 Preview 2 as experimental. So resort to using the Interlocked API.

Ref: https://devblogs.microsoft.com/cppblog/c11-atomics-in-visual-studio-2022-version-17-5-preview-2/
Fixes: https://github.com/libuv/libuv/issues/3948
2023-04-12 13:54:22 -06:00
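Note on 2f33980a91: the difference between an exchange and a load/store pair can be shown with a run-once guard. This is a sketch in C11, not libuv's source; `was_shutdown`, `cleanup_runs` and `library_shutdown()` are hypothetical names.

```c
#include <stdatomic.h>

static atomic_int was_shutdown;  /* Hypothetical guard flag, starts at 0. */
static int cleanup_runs;         /* Counts how often cleanup really ran. */

/* Hypothetical run-once shutdown routine. */
static void library_shutdown(void) {
  /* atomic_exchange() stores 1 and returns the previous value in one
   * indivisible step, so exactly one caller ever observes 0. With a
   * separate load followed by a store, two threads could both read 0
   * and both run the cleanup. */
  if (atomic_exchange(&was_shutdown, 1))
    return;

  cleanup_runs++;  /* Stand-in for the real cleanup work. */
}
```

Calling `library_shutdown()` any number of times runs the cleanup once.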
Ben Noordhuis
244df24bf4
linux: remove arm oabi support (#3942)
The last major distro that supported the oabi calling convention was
Debian 5 (Lenny) and that went out of support in February 2012. It seems
like a fairly safe assumption that nothing speaks oabi anymore in this
day and age.

Fixes: https://github.com/libuv/libuv/issues/3935
2023-04-02 00:20:26 +02:00
Ben Noordhuis
28b9f1e68b
linux: replace unsafe macro with inline function (#3933)
Replace the throw-type-safety-to-the-wind CAST() macro with an inline
function that is hopefully harder to misuse. It should make the inotify
code slightly more legible if nothing else.
2023-03-31 10:09:48 +02:00
Ben Noordhuis
0c8eccc3fc
linux: remove epoll_pwait() emulation code path (#3936)
This was removed before in 2018 but then reinstated again in 2019 to
fix building with old Android SDKs. Well, time marches on; this time
it's gone for good.

Refs: https://github.com/libuv/libuv/pull/1372
Refs: https://github.com/libuv/libuv/pull/2358
2023-03-28 11:58:56 +02:00
Ben Noordhuis
434eb4b0ac
linux: handle cpu hotplugging in uv_cpu_info() (#3861)
On Linux, CPUs can come online or go offline while uv_cpu_info() is busy
gathering data. Change uv_cpu_info() in the following ways:

1. Learn online CPUs from /proc/stat

2. Get the model name from /proc/cpuinfo when it has a matching CPU,
   or default to "unknown"

3. Get speed from /sys/devices/system/cpu/cpu%u/cpufreq/scaling_cur_freq
   when it exists, or default to 0

Before this commit, libuv read the speed from /proc/cpuinfo but that
reports the base frequency, not the actual frequency. My system has
two cores running permanently at 3.6 GHz but libuv thought all 12 ran
at 2.2 GHz.

Fixes: https://github.com/libuv/libuv/issues/2351
Fixes: https://github.com/libuv/libuv/issues/3858
2023-01-14 05:08:15 +01:00
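Note on 434eb4b0ac: step 1, learning the online CPUs from /proc/stat, could be sketched as below. The helper is hypothetical and operates on text already read into memory. It assumes one `cpuN ...` line per online CPU plus an aggregate `cpu ...` line, and for brevity it only tracks CPUs 0 to 63.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: scan /proc/stat-style text and set bit N in the
 * returned mask for every line that starts with "cpuN". The aggregate
 * "cpu" line has no digit after the prefix and is skipped. */
static unsigned long long online_cpu_mask(const char *text) {
  unsigned long long mask = 0;
  const char *p = text;
  unsigned long n;

  while (p != NULL && *p != '\0') {
    if (strncmp(p, "cpu", 3) == 0 && isdigit((unsigned char) p[3])) {
      n = strtoul(p + 3, NULL, 10);
      if (n < 64)
        mask |= 1ULL << n;
    }

    p = strchr(p, '\n');  /* Advance to the start of the next line. */
    if (p != NULL)
      p++;
  }

  return mask;
}
```

Gaps in the numbering, such as `cpu0` followed by `cpu2`, show up as cleared bits, which is how an offline CPU would appear under these assumptions.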
Ben Noordhuis
a3de1384c3
linux: simplify uv_uptime() (#3859)
Drop support for old kernels. Assume support for CLOCK_BOOTTIME.
2022-12-15 12:57:14 +01:00
Ben Noordhuis
8ddffeeea3
doc: bump min supported linux and freebsd versions (#3830)
The old Linux baseline was essentially RHEL 6 but that distro has been
out of support for two years now. Move to RHEL 7.

This commit also moves FreeBSD to tier 2 because it isn't actually
part of libuv's CI matrix, only Node's.

Fixes: https://github.com/libuv/libuv/issues/3822
2022-11-28 12:00:27 +01:00
Tim Besard
988d225cf0
unix,win: add uv_get_available_memory() (#3754)
2022-11-24 22:09:32 +01:00