Rework the event based handling of transfers and connections to
be "localized" into a single source file with clearer dependencies.
- add multi_ev.c and multi_ev.h
- add docs/internal/MULTI-EV.md to explain the overall workings
- only do event handling book keeping when the socket callback
is set
- add handling for "connection only" event tracking, when internal
easy handles are used that are not really tied to a connection.
Used in connection pool.
- remove transfer member "last_poll" and connections "shutdown_poll"
and keep all that internal to multi_ev.c
- add CURL_TRC_M() for tracing of "multi" related things, including
event handling and connection pool operations. Add new trace
feature "multi" for trace config.
multi traces will show exactly what is going on in regard to
event handling.
- multi: trace transfers "mstate" in every CURL_TRC_M() call
- make internal trace buffer 2048 bytes and end the silliness
with +n here -m there. Adjust test 1652 expectations of resulting
length and input edge cases.
- add trace feature "lib-ids" to perfix libcurl traces with transfer
and connection ids. Useful for debugging libcurl applications.
Closes#16308
- Make curl_multi_waitfds consistent with the documentation.
Issue Addressed:
- The documentation of curl_multi_waitfds indicates that users should
be able to call curl_multi_waitfds with a NULL ufds. However, before
this change, the function would return CURLM_BAD_FUNCTION_ARGUMENT.
- Additionally, the documentation suggests that users can use this
function to determine the number of file descriptors (fds) needed.
However, the function would stop counting fds if the supplied fds
were exhausted.
Changes Made:
- NULL ufds Handling: curl_multi_waitfds can now accept a NULL ufds if
size is also zero.
- Counting File Descriptors: If curl_multi_waitfds is passed a NULL
ufds, or the size of ufds is insufficient, the output parameter
fd_count will return the number of fds needed. This value may be
higher than actually needed but never lower.
Testing:
- Test 2405 has been updated to cover the usage scenarios described
above.
Fixes https://github.com/curl/curl/issues/15146
Closes https://github.com/curl/curl/pull/15155
This is a better match for what they do and the general "cpool"
var/function prefix works well.
The pool now handles very long hostnames correctly.
The following changes have been made:
* 'struct connectdata', e.g. connections, keep new members
named `destination` and ' destination_len' that fully specifies
interface+port+hostname of where the connection is going to.
This is used in the pool for "bundling" of connections with
the same destination. There is no limit on the length any more.
* Locking: all locks are done inside conncache.c when calling
into the pool and released on return. This eliminates hazards
of the callers keeping track.
* 'struct connectbundle' is now internal to the pool. It is no
longer referenced by a connection.
* 'bundle->multiuse' no longer exists. HTTP/2 and 3 and TLS filters
no longer need to set it. Instead, the multi checks on leaving
MSTATE_CONNECT or MSTATE_CONNECTING if the connection is now
multiplexed and new, e.g. not conn->bits.reuse. In that case
the processing of pending handles is triggered.
* The pool's init is provided with a callback to invoke on all
connections being discarded. This allows the cleanups in
`Curl_disconnect` to run, wherever it is decided to retire
a connection.
* Several pool operations can now be fully done with one call.
Pruning dead connections, upkeep and checks on pool limits
can now directly discard connections and need no longer return
those to the caller for doing that (as we have now the callback
described above).
* Finding a connection for reuse is now done via `Curl_cpool_find()`
and the caller provides callbacks to evaluate the connection
candidates.
* The 'Curl_cpool_check_limits()' now directly uses the max values
that may be set in the transfer's multi. No need to pass them
around. Curl_multi_max_host_connections() and
Curl_multi_max_total_connections() are gone.
* Add method 'Curl_node_llist()' to get the llist a node is in.
Used in cpool to verify connection are indeed in the list (or
not in any list) as they need to.
I left the conncache.[ch] as is for now and also did not touch the
documentation. If we update that outside the feature window, we can
do this in a separate PR.
Multi-thread safety is not achieved by this PR, but since more details
on how pools operate are now "internal" it is a better starting
point to go for this in the future.
Closes#14662
Based on the standards and guidelines we use for our documentation.
- expand contractions (they're => they are etc)
- host name = > hostname
- file name => filename
- user name = username
- man page => manpage
- run-time => runtime
- set-up => setup
- back-end => backend
- a HTTP => an HTTP
- Two spaces after a period => one space after period
Closes#14073
When libcurl discards a connection there are two phases this may go
through: "shutdown" and "closing". If a connection is aborted, the
shutdown phase is skipped and it is closed right away.
The connection filters attached to the connection implement the phases
in their `do_shutdown()` and `do_close()` callbacks. Filters carry now a
`shutdown` flags next to `connected` to keep track of the shutdown
operation.
Filters are shut down from top to bottom. If a filter is not connected,
its shutdown is skipped. Notable filters that *do* something during
shutdown are HTTP/2 and TLS. HTTP/2 sends the GOAWAY frame. TLS sends
its close notify and expects to receive a close notify from the server.
As sends and receives may EAGAIN on the network, a shutdown is often not
successful right away and needs to poll the connection's socket(s). To
facilitate this, such connections are placed on a new shutdown list
inside the connection cache.
Since managing this list requires the cooperation of a multi handle,
only the connection cache belonging to a multi handle is used. If a
connection was in another cache when being discarded, it is removed
there and added to the multi's cache. If no multi handle is available at
that time, the connection is shutdown and closed in a one-time,
best-effort attempt.
When a multi handle is destroyed, all connection still on the shutdown
list are discarded with a final shutdown attempt and close. In curl
debug builds, the environment variable `CURL_GRACEFUL_SHUTDOWN` can be
set to make this graceful with a timeout in milliseconds given by the
variable.
The shutdown list is limited to the max number of connections configured
for a multi cache. Set via CURLMOPT_MAX_TOTAL_CONNECTIONS. When the
limit is reached, the oldest connection on the shutdown list is
discarded.
- In multi_wait() and multi_waitfds(), collect all connection caches
involved (each transfer might carry its own) into a temporary list.
Let each connection cache on the list contribute sockets and
POLLIN/OUT events it's connections are waiting for.
- in multi_perform() collect the connection caches the same way and let
them peform their maintenance. This will make another non-blocking
attempt to shutdown all connections on its shutdown list.
- for event based multis (multi->socket_cb set), add the sockets and
their poll events via the callback. When `multi_socket()` is invoked
for a socket not known by an active transfer, forward this to the
multi's cache for processing. On closing a connection, remove its
socket(s) via the callback.
TLS connection filters MUST NOT send close nofity messages in their
`do_close()` implementation. The reason is that a TLS close notify
signals a success. When a connection is aborted and skips its shutdown
phase, the server needs to see a missing close notify to detect
something has gone wrong.
A graceful shutdown of FTP's data connection is performed implicitly
before regarding the upload/download as complete and continuing on the
control connection. For FTP without TLS, there is just the socket close
happening. But with TLS, the sent/received close notify signals that the
transfer is complete and healthy. Servers like `vsftpd` verify that and
reject uploads without a TLS close notify.
- added test_19_* for shutdown related tests
- test_19_01 and test_19_02 test for TCP RST packets
which happen without a graceful shutdown and should
no longer appear otherwise.
- add test_19_03 for handling shutdowns by the server
- add test_19_04 for handling shutdowns by curl
- add test_19_05 for event based shutdowny by server
- add test_30_06/07 and test_31_06/07 for shutdown checks
on FTP up- and downloads.
Closes#13976
`CURLDEBUG` is meant to enable memory tracking, but in a bunch of cases,
it was protecting debug features that were supposed to be guarded with
`DEBUGBUILD`.
Replace these uses with `DEBUGBUILD`.
This leaves `CURLDEBUG` uses solely for its intended purpose: to enable
the memory tracking debug feature.
Also:
- autotools: rely on `DEBUGBUILD` to enable `checksrc`.
Instead of `CURLDEBUG`, which worked in most cases because debug
builds enable `CURLDEBUG` by default, but it's not accurate.
- include `lib/easyif.h` instead of keeping a copy of a declaration.
- add CI test jobs for the build issues discovered.
Ref: https://github.com/curl/curl/pull/13694#issuecomment-2120311894Closes#13718
- add an `id` long to Curl_easy, -1 on init
- once added to a multi (or its own multi), it gets
a non-negative number assigned by the connection cache
- `id` is unique among all transfers using the same
cache until reaching LONG_MAX where it will wrap
around. So, not unique eternally.
- CURLINFO_CONN_ID returns the connection id attached to
data or, if none present, data->state.lastconnect_id
- variables and type declared in tool for write out
Closes#11185
- they are mostly pointless in all major jurisdictions
- many big corporations and projects already don't use them
- saves us from pointless churn
- git keeps history for us
- the year range is kept in COPYING
checksrc is updated to allow non-year using copyright statements
Closes#10205
Add licensing and copyright information for all files in this repository. This
either happens in the file itself as a comment header or in the file
`.reuse/dep5`.
This commit also adds a Github workflow to check pull requests and adapts
copyright.pl to the changes.
Closes#8869
To simplify, and also since the returned name is not the full actual
name used for the check. The port number and zone id is also involved,
so just showing the name is misleading.
Closes#8750
... in most cases instead of 'struct connectdata *' but in some cases in
addition to.
- We mostly operate on transfers and not connections.
- We need the transfer handle to log, store data and more. Everything in
libcurl is driven by a transfer (the CURL * in the public API).
- This work clarifies and separates the transfers from the connections
better.
- We should avoid "conn->data". Since individual connections can be used
by many transfers when multiplexing, making sure that conn->data
points to the current and correct transfer at all times is difficult
and has been notoriously error-prone over the years. The goal is to
ultimately remove the conn->data pointer for this reason.
Closes#6425
More connection cache accesses are protected by locks.
CONNCACHE_* is a beter prefix for the connection cache lock macros.
Curl_attach_connnection: now called as soon as there's a connection
struct available and before the connection is added to the connection
cache.
Curl_disconnect: now assumes that the connection is already removed from
the connection cache.
Ref: #4915Closes#5009
It could accidentally let the connection get used by more than one
thread, leading to double-free and more.
Reported-by: Christopher Reid
Fixes#4544Closes#4557
Only HTTP proxy use where multiple host names can be used over the same
connection should use the proxy host name for bundles.
Reported-by: Tom van der Woerdt
Fixes#3951Closes#3955
Do not assume/store assocation between a given easy handle and the
connection if it can be avoided.
Long-term, the 'conn->data' pointer should probably be removed as it is a
little too error-prone. Still used very widely though.
Reported-by: masbug on github
Fixes#3391Closes#3400
If the lock is released before the dealings with the bundle is over, it may
have changed by another thread in the mean time.
Fixes#2132Fixes#2151Closes#2139
... to make all libcurl internals able to use the same data types for
the struct members. The timeval struct differs subtly on several
platforms so it makes it cumbersome to use everywhere.
Ref: #1652Closes#1693
to allow code to act differently on the situation.
Also added some more info message for the connection re-use function to
make it clearer when connections are not re-used.
All the existing Curl_bundle* functions were only ever used from within
the conncache.c file, so I moved them over and made them static (and
removed the Curl_ prefix).
When checking for a connection to re-use, a proxy-using request must
check for and use a proxy connection and not one based on the host
name!
Added test 1421 to verify
Bug: http://curl.haxx.se/bug/view.cgi?id=1492
Bringing back the old functionality that was mistakenly removed when the
connection cache was remade. When creating a new connection, all the
existing ones are checked and those that are known to be dead get
disconnected for real and removed from the connection cache. It helps
the cache from holding on to very many stale connections and aids in
keeping down the number of system sockets in wait states.
Help-by: Jonatan Vela <jonatan.vela@ergon.ch>
Bug: http://curl.haxx.se/mail/lib-2014-06/0189.html
warning C4267: '=' : conversion from 'size_t' to 'long', possible loss
of data
The member connection_id of struct connectdata is a long (always a
32-bit signed integer on Visual C++) and the member next_connection_id
of struct conncache is a size_t, so one of them should be changed to
match the other.
This patch the size_t in struct conncache to long (the less invasive
change as that variable is only ever used in a single code line).
Bug: http://curl.haxx.se/bug/view.cgi?id=1399
The static connection counter caused a race condition. Moving the
connection id counter into conncache solves it, as well as simplifying
the related logic.
By introducing an internal alternative to curl_multi_init() that accepts
parameters to set the hash sizes, easy handles will now use tiny socket
and connection hash tables since it will only ever add a single easy
handle to that multi handle.
This decreased the number mallocs in test 40 (which is a rather simple
and typical easy interface use case) from 1142 to 138. The maximum
amount of memory allocated used went down from 118969 to 78805.
Remove internal separated behavior of the easy vs multi intercace.
curl_easy_perform() is now using the multi interface itself.
Several minor multi interface quirks and bugs have been fixed in the
process.
Much help with debugging this has been provided by: Yang Tse
This reverts renaming and usage of lib/*.h header files done
28-12-2012, reverting 2 commits:
f871de0... build: make use of 76 lib/*.h renamed files
ffd8e12... build: rename 76 lib/*.h files
This also reverts removal of redundant include guard (redundant thanks
to changes in above commits) done 2-12-2013, reverting 1 commit:
c087374... curl_setup.h: remove redundant include guard
This also reverts renaming and usage of lib/*.c source files done
3-12-2013, reverting 3 commits:
13606bb... build: make use of 93 lib/*.c renamed files
5b6e792... build: rename 93 lib/*.c files
7d83dff... build: commit 13606bbfde follow-up 1
Start of related discussion thread:
http://curl.haxx.se/mail/lib-2013-01/0012.html
Asking for confirmation on pushing this revertion commit:
http://curl.haxx.se/mail/lib-2013-01/0048.html
Confirmation summary:
http://curl.haxx.se/mail/lib-2013-01/0079.html
NOTICE: The list of 2 files that have been modified by other
intermixed commits, while renamed, and also by at least one
of the 6 commits this one reverts follows below. These 2 files
will exhibit a hole in history unless git's '--follow' option
is used when viewing logs.
lib/curl_imap.h
lib/curl_smtp.h
A bundle is a list of all persistent connections to the same host.
The connection cache consists of a hash of bundles, with the
hostname as the key.
The benefits may not be obvious, but they are two:
1) Faster search for connections to reuse, since the hash
lookup only finds connections to the host in question.
2) It lays out the groundworks for an upcoming patch,
which will introduce multiple HTTP pipelines.
This patch also removes the awkward list of "closure handles",
which were needed to send QUIT commands to the FTP server
when closing a connection.
Now we allocate a separate closure handle and use that
one to close all connections.
This has been tested in a live system for a few weeks, and of
course passes the test suite.