luo980/curl - curl - Kebab: Code with Heart

Author	SHA1	Message	Date
Viktor Szakats	1bc69df7b4	tidy-up: use more example domains Also make use of the example TLD: https://en.wikipedia.org/wiki/.example Reviewed-by: Daniel Stenberg Closes #11992	2023-09-29 18:25:56 +00:00
Jay Satiro	7a2421dbb7	escape: replace Curl_isunreserved with ISUNRESERVED - Use the ALLCAPS version of the macro so that it is clear a macro is being called that evaluates the variable multiple times. - Also capitalize macro isurlpuntcs => ISURLPUNTCS since it evaluates a variable multiple times. This is a follow-up to `291d225a` which changed Curl_isunreserved into an alias macro for ISUNRESERVED. The problem is the former is not easily identified as a macro by the caller, which could lead to a bug. For example, ISUNRESERVED(foo++) is easily identifiable as wrong but Curl_isunreserved(foo++) is not even though they both are the same. Closes https://github.com/curl/curl/pull/11846	2023-09-14 03:07:45 -04:00
Daniel Stenberg	887b998e6e	urlapi: setting a blank URL ("") is not an ok URL Test it in 1560 Fixes #11714 Reported-by: ad0p on github Closes #11715	2023-08-23 23:24:16 +02:00
Daniel Stenberg	a281057091	urlapi: return CURLUE_BAD_HOSTNAME if puny2idn encoding fails And document it. Only return out of memory when it actually is a memory problem. Pointed-out-by: Jacob Mealey Closes #11674	2023-08-17 08:21:08 +02:00
Daniel Stenberg	c350069f64	urlapi: CURLU_PUNY2IDN - convert from punycode to IDN name Assisted-by: Jay Satiro Closes #11655	2023-08-13 15:34:38 +02:00
Daniel Stenberg	49e2443186	urlapi: make sure zoneid is also duplicated in curl_url_dup Add several curl_url_dup() tests to the general lib1560 test. Reported-by: Rutger Broekhoff Bug: https://curl.se/mail/lib-2023-07/0047.html Closes #11549	2023-08-01 08:00:28 +02:00
Sergey	a21f318992	urlapi: fix heap buffer overflow `u->path = Curl_memdup(path, pathlen + 1);` accesses bytes after the null-terminator. ``` ==2676==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x04d48c75 at pc 0x0112708a bp 0x006fb7e0 sp 0x006fb3c4 READ of size 78 at 0x04d48c75 thread T0 #0 0x1127089 in __asan_wrap_memcpy D:\a\_work\1\s\src\vctools\asan\llvm\compiler-rt\lib\sanitizer_common\sanitizer_common_interceptors.inc:840 #1 0x1891a0e in Curl_memdup C:\actions-runner\_work\client\client\third_party\curl\lib\strdup.c:97 #2 0x18db4b0 in parseurl C:\actions-runner\_work\client\client\third_party\curl\lib\urlapi.c:1297 #3 0x18db819 in parseurl_and_replace C:\actions-runner\_work\client\client\third_party\curl\lib\urlapi.c:1342 #4 0x18d6e39 in curl_url_set C:\actions-runner\_work\client\client\third_party\curl\lib\urlapi.c:1790 #5 0x1877d3e in parseurlandfillconn C:\actions-runner\_work\client\client\third_party\curl\lib\url.c:1768 #6 0x1871acf in create_conn C:\actions-runner\_work\client\client\third_party\curl\lib\url.c:3403 #7 0x186d8dc in Curl_connect C:\actions-runner\_work\client\client\third_party\curl\lib\url.c:3888 #8 0x1856b78 in multi_runsingle C:\actions-runner\_work\client\client\third_party\curl\lib\multi.c:1982 #9 0x18531e3 in curl_multi_perform C:\actions-runner\_work\client\client\third_party\curl\lib\multi.c:2756 ``` Closes #11560	2023-08-01 07:59:07 +02:00
Daniel Stenberg	dacd25888f	curl_url_set: enforce the max string length check for all parts Update the docs and test 1559 accordingly Closes #11273	2023-06-08 23:40:08 +02:00
Daniel Stenberg	3c9256c8a0	urlapi: have *set(PATH) prepend a slash if one is missing Previously the code would just do that for the path when extracting the full URL, which made a subsequent curl_url_get() of the path (unexpectedly) still return it without the leading slash. Amend lib1560 to verify this. Clarify the curl_url_set() docs about it. Bug: https://curl.se/mail/lib-2023-06/0015.html Closes #11272 Reported-by: Pedro Henrique	2023-06-08 16:08:45 +02:00
Daniel Stenberg	ba669d072d	urlapi: scheme starts with alpha Add multiple tests to lib1560 to verify Fixes #11249 Reported-by: ad0p on github Closes #11250	2023-06-05 16:28:27 +02:00
Daniel Stenberg	6375a65433	urlapi: remove superfluous host name check ... as it is checked later more proper. Closes #11195	2023-05-25 08:30:20 +02:00
Daniel Stenberg	127eb0d83a	misc: fix spelling mistakes Reported-by: musvaage on github Fixes #11171 Closes #11172	2023-05-23 10:42:09 +02:00
Emanuele Torre	eef076baa6	Revert "urlapi: respect CURLU_ALLOW_SPACE and CURLU_NO_AUTHORITY for redirects" This reverts commit `df6c2f7b54`. (It only keeps the test case that checks redirection to an absolute URL without hostname and CURLU_NO_AUTHORITY). I originally wanted to make CURLU_ALLOW_SPACE accept spaces in the hostname only because I thought curl_url_set(CURLUPART_URL, CURLU_ALLOW_SPACE) was already accepting them, and they were only not being accepted in the hostname when curl_url_set(CURLUPART_URL) was used for a redirection. That is not actually the case, urlapi never accepted hostnames with spaces, and a hostname with a space in it never makes sense. I probably misread the output of my original test when I thought they were normally accepted when using CURLU_ALLOW_SPACE, and not redirecting. Some other URL parsers seem to allow space in the host part of the URL, e.g. both python3's urllib.parse module, and Chromium's javascript URL object allow spaces (chromium percent escapes the spaces with %20), (they also both ignore TABs, and other whitespace characters), but those URLs with spaces in the hostname are useless, neither python3's requests module nor Chromium's window.location can actually use them. There is no reason to add support for URLs with spaces in the host, since it was not an inconsistency bug; let's revert that patch before it makes it into release. Sorry about that. I also reverted the extra check for CURLU_NO_AUTHORITY since that does not seem to be necessary, CURLU_NO_AUTHORITY already worked for redirects. Closes #11169	2023-05-21 13:59:04 +02:00
Daniel Stenberg	92772e6d39	urlapi: allow numerical parts in the host name It can only be an IPv4 address if all parts are all digits and no more than four parts, otherwise it is a host name. Even slightly wrong IPv4 will now be passed through as a host name. Regression from `17a15d8846` shipped in 8.1.0 Extended test 1560 accordingly. Reported-by: Pavel Kalyugin Fixes #11129 Closes #11131	2023-05-19 16:01:26 +02:00
Emanuele Torre	df6c2f7b54	urlapi: respect CURLU_ALLOW_SPACE and CURLU_NO_AUTHORITY for redirects curl_url_set(uh, CURLUPART_URL, redirurl, flags) was not respecting CURLU_ALLOW_SPACE and CURLU_NO_AUTHORITY in the host part of redirurl when redirecting to an absolute URL. Closes #11136	2023-05-18 20:52:59 +02:00
Emanuele Torre	f198d33e8d	checksrc: disallow spaces before labels Out of 415 labels throughout the code base, 86 of those labels were not at the start of the line. Which means labels always at the start of the line is the favoured style overall with 329 instances. Out of the 86 labels not at the start of the line: * 75 were indented with the same indentation level of the following line * 8 were indented with exactly one space * 2 were indented with one fewer indentation level than the following line * 1 was indented with the indentation level of the following line minus three spaces (probably unintentional) Co-Authored-By: Viktor Szakats Closes #11134	2023-05-18 20:45:04 +02:00
Emanuele Torre	7f712399d5	checksrc: check for spaces before the colon of switch labels Closes #11047	2023-04-27 23:26:50 +02:00
Daniel Stenberg	d567cca1de	checksrc: fix SPACEBEFOREPAREN for conditions starting with "*" The open paren check wants to warn for spaces before open parenthesis for if/while/for but also for any function call. In order to avoid catching function pointer declarations, the logic allows a space if the first character after the open parenthesis is an asterisk. I also spotted that we did not include "switch" in the check but we should. This check is a little lame, but we reduce this problem by not allowing that space for if/while/for/switch. Reported-by: Emanuele Torre Closes #11044	2023-04-27 17:24:47 +02:00
Daniel Stenberg	b7b1846275	urlapi: make internal function start with Curl_ Curl_url_set_authority() it is. Follow-up to `acd82c8bfd` Closes #11035	2023-04-27 08:36:51 +02:00
Stefan Eissing	acd82c8bfd	tests/http: more tests with specific clients - Makefile support for building test specific clients in tests/http/clients - auto-make of clients when invoking pytest - added test_09_02 for server PUSH_PROMISEs using clients/h2-serverpush - added test_02_21 for lib based downloads and pausing/unpausing transfers curl url parser: - added internal method `curl_url_set_authority()` for setting the authority part of a url (used for PUSH_PROMISE) http2: - made logging of PUSH_PROMISE handling nicer Placing python test requirements in requirements.txt files - separate files to base test suite and http tests since use and module lists differ - using the files in the gh workflows websocket test cases, fixes for ws and bufq - bufq: account for spare chunks in space calculation - bufq: reset chunks that are skipped empty - ws: correctly encode frames with 126 bytes payload - ws: update frame meta information on first call of collect callback that fills user buffer - test client ws-data: some test/reporting improvements Closes #11006	2023-04-26 23:24:46 +02:00
Daniel Stenberg	3f1d89ed24	urlapi: skip a pointless assign It stores a null byte after already having confirmed there is a null byte there. Detected by PVS. Ref: #10929 Closes #10943	2023-04-13 14:36:28 +02:00
Daniel Stenberg	4cfa5bcc9a	urlapi: cleanups - move host checks together - simplify the scheme parser loop and the end of host name parser - avoid intermediate buffer storing in multiple places - reduce scope for several variables - skip the Curl_dyn_tail() call for speed - detect IPv6 earlier and skip extra checks for such hosts - normalize directly in dynbuf instead of intermediate buffer - split out the IPv6 parser into its own function - call the IPv6 parser directly for ipv6 addresses - remove (unused) special treatment of % in host names - junkscan() once in the beginning instead of scattered - make junkscan return error code - remove unused query management from dedotdotify() - make Curl_parse_login_details use memchr - more use of memchr() instead of strchr() and less strlen() calls - make junkscan check and return the URL length An optimized build runs one of my benchmark URL parsing programs ~41% faster using this branch. (compared against the shipped 7.88.1 library in Debian) Closes #10935	2023-04-13 08:41:40 +02:00
Daniel Stenberg	826e8011d5	urlapi: prevent setting invalid schemes with *url_set() A typical mistake would be to try to set "https://" - including the separator - this is now rejected, as it would otherwise cause url_get(... URL...) to extract an invalid URL. Extended test 1560 to verify. Closes #10911	2023-04-09 23:23:54 +02:00
Daniel Stenberg	17a15d8846	urlapi: detect and error on illegal IPv4 addresses Using bad numbers in an IPv4 numerical address now returns CURLUE_BAD_HOSTNAME. I noticed while working on trurl and it was originally reported here: https://github.com/curl/trurl/issues/78 Updated test 1560 accordingly. Closes #10894	2023-04-06 09:02:00 +02:00
Daniel Stenberg	f042e1e75d	urlapi: URL encoding for the URL missed the fragment Meaning that it would wrongly still store the fragment using spaces instead of %20 if allowing space while also asking for URL encoding. Discovered when playing with trurl. Added test to lib1560 to verify the fix. Closes #10887	2023-04-05 08:30:12 +02:00
rcombs	b1d735956f	urlapi: take const args in _dup and _get functions Closes #10708	2023-03-08 15:38:26 +01:00
rcombs	95cb7d3166	urlapi: avoid mutating internals in getter routine This was not intended. Closes #10708	2023-03-08 15:38:18 +01:00
Daniel Stenberg	0a0c9b6dfa	urlapi: '%' is illegal in host names Update test 1560 to verify Ref: #10708 Closes #10711	2023-03-08 15:33:43 +01:00
Brad Spencer	ad4997e5b2	urlapi: parse IPv6 literals without ENABLE_IPV6 This makes the URL parser API stable and working the same way independently of libcurl supporting IPv6 transfers or not. Closes #10660	2023-03-03 10:05:08 +01:00
Daniel Stenberg	8b27799f8c	urlapi: do the port number extraction without using sscanf() - sscanf() is rather complex and slow, strchr() much simpler - the port number function does not need to fully verify the IPv6 address anyway as it is done later in the hostname_check() function and doing it twice is unnecessary. Closes #10541	2023-02-17 16:21:26 +01:00
Pronyushkin Petr	2b46ce0313	urlapi: fix part of conditional expression is always true: qlen Closes #10408	2023-02-06 08:53:07 +01:00
Daniel Stenberg	37554d7c07	urlapi: remove pathlen assignment "Value stored to 'pathlen' is never read" Follow-up to `804d5293f8` Reported-by: Kvarec Lezki Closes #10405	2023-02-03 08:20:21 +01:00
Daniel Stenberg	63c53ea627	urlapi: skip the extra dedotdot alloc if no dot in path Saves an allocation for many/most URLs. Updates test 1395 accordingly Closes #10403	2023-02-02 22:34:32 +01:00
Daniel Stenberg	7305ca63e2	urlapi: avoid Curl_dyn_addf() for hex outputs Inspired by the recent fixes to escape.c, we should avoid calling Curl_dyn_addf() in loops, perhaps in particular when adding something so simple as %HH codes - for performance reasons. This change makes the same thing for the URL parser's two URL-encoding loops. Closes #10384	2023-02-01 23:05:51 +01:00
Daniel Stenberg	804d5293f8	urlapi: skip path checks if path is just "/" As a minuscule optimization, treat a path of the length 1 as the same as non-existing, as it can only be a single leading slash, and that's what we do for no paths as well. Closes #10385	2023-02-01 23:04:45 +01:00
Daniel Stenberg	2bc1d775f5	copyright: update all copyright lines and remove year ranges - they are mostly pointless in all major jurisdictions - many big corporations and projects already don't use them - saves us from pointless churn - git keeps history for us - the year range is kept in COPYING checksrc is updated to allow non-year using copyright statements Closes #10205	2023-01-03 09:19:21 +01:00
Daniel Stenberg	901392cbb7	urlapi: add CURLU_PUNYCODE Allows curl_url_get() to get the punycode version of host names for the host name and URL parts. Extend test 1560 to verify. Closes #10109	2022-12-26 23:29:23 +01:00
Daniel Stenberg	c20b35ddae	urlapi: reject more bad letters from the host name: &+() Follow-up from `eb0167ff7d` Extend test 1560 to verify Closes #10096	2022-12-15 08:23:48 +01:00
Daniel Stenberg	b15ca64bb0	urlapi: remove two variable assigns To please scan-build: urlapi.c:1163:9: warning: Value stored to 'qlen' is never read qlen = Curl_dyn_len(&enc); ^ ~~~~~~~~~~~~~~~~~~ urlapi.c:1164:9: warning: Value stored to 'query' is never read query = u->query = Curl_dyn_ptr(&enc); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Follow-up to `7d6cf06f57` Closes #9777	2022-10-21 11:00:18 +02:00
Daniel Stenberg	7d6cf06f57	urlapi: fix parsing URL without slash with CURLU_URLENCODE When CURLU_URLENCODE is set, the parser would mistreat the path component if the URL was specified without a slash like in http://local.test:80?-123 Extended test 1560 to reproduce and verify the fix. Reported-by: Trail of Bits Closes #9763	2022-10-20 08:56:53 +02:00
12932	ddeec8feba	misc: nitpick grammar in comments/docs because the 'u' in URL is actually a consonant sound it is only correct to write "a URL" sorry this is a bit nitpicky :P https://english.stackexchange.com/questions/152/when-should-i-use-a-vs-an https://www.techtarget.com/whatis/feature/Which-is-correct-a-URL-or-an-URL Closes #9699	2022-10-12 11:32:43 +02:00
John Bampton	e80c4ff3d0	misc: fix spelling in docs and comments also: remove outdated sentence Closes #9644	2022-10-05 16:12:10 +02:00
Daniel Stenberg	eb0167ff7d	urlapi: reject more bad characters from the host name field Extended test 1560 to verify Report from the ongoing source code audit by Trail of Bits. Closes #9608	2022-09-28 08:22:42 +02:00
Patrick Monnerat	9d51329047	setopt: use the handler table for protocol name to number conversions This also returns error CURLE_UNSUPPORTED_PROTOCOL rather than CURLE_BAD_FUNCTION_ARGUMENT when a listed protocol name is not found. A new schemelen parameter is added to Curl_builtin_scheme() to support this extended use. Note that disabled protocols are not recognized anymore. Tests adapted accordingly. Closes #9472	2022-09-16 23:29:01 +02:00
Daniel Stenberg	846678541b	urlapi: detect scheme better when not guessing When the parser is not allowed to guess scheme, it should consider the word ending at the first colon to be the scheme, independently of number of slashes. The parser now checks that the scheme is known before it counts slashes, to improve the error message for URLs with unknown schemes and maybe no slashes. When following redirects, no scheme guessing is allowed and therefore this change effectively prevents redirects to unknown schemes such as "data". Fixes #9503	2022-09-15 09:31:40 +02:00
Daniel Stenberg	f703cf971c	urlapi: leaner with fewer allocs Slightly faster with more robust code. Uses fewer and smaller mallocs. - remove two fields from the URL handle struct - reduce copies and allocs - use dynbuf buffers more instead of custom malloc + copies - uses dynbuf to build the host name in reduces serial alloc+free within the same function. - move dedotdotify into urlapi.c and make it static, not strdup the input and optimize it by checking for . and / before using strncmp - remove a few strlen() calls - add Curl_dyn_setlen() that can "trim" an existing dynbuf Closes #9408	2022-09-07 10:21:45 +02:00
Daniel Stenberg	8dd95da35b	ctype: remove all use of <ctype.h>, use our own versions Except in the test servers. Closes #9433	2022-09-06 08:32:36 +02:00
Viktor Szakats	c9061f242b	misc: spelling fixes Found using codespell 2.2.1. Also delete the redundant protocol designator from an archive.org URL. Reviewed-by: Daniel Stenberg Closes #9403	2022-08-31 14:31:01 +00:00
Pierrick Charron	4bf2c231d7	urlapi: make curl_url_set(url, CURLUPART_URL, NULL, 0) clear all parts As per the documentation : > Setting a part to a NULL pointer will effectively remove that > part's contents from the CURLU handle. But currently clearing CURLUPART_URL does nothing and returns CURLUE_OK. This change will clear all parts of the URL at once. Closes #9028	2022-06-20 08:15:51 +02:00
max.mehl	ad9bc5976d	copyright: make repository REUSE compliant Add licensing and copyright information for all files in this repository. This either happens in the file itself as a comment header or in the file `.reuse/dep5`. This commit also adds a Github workflow to check pull requests and adapts copyright.pl to the changes. Closes #8869	2022-06-13 09:13:00 +02:00
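
Commit 7a2421dbb7 above argues that a macro which evaluates its argument more than once should be visibly a macro, because an argument with side effects behaves differently than it would with a function. The following is a minimal sketch of that hazard, not curl's code: DEMO_ISLOWER, demo_islower_fn, struct reader, and next_char are hypothetical names made up for this illustration.

```c
#include <stddef.h>

/* Hypothetical character-class macro: note that (x) appears twice,
   so the argument expression can be evaluated twice. */
#define DEMO_ISLOWER(x) (((x) >= 'a') && ((x) <= 'z'))

/* The same check as a function: the argument is evaluated exactly once. */
static int demo_islower_fn(int c)
{
  return DEMO_ISLOWER(c);
}

/* A tiny reader whose "get next char" operation has a side effect. */
struct reader {
  const char *s;
  size_t pos;
};

static int next_char(struct reader *r)
{
  return (unsigned char)r->s[r->pos++];
}

/* Returns how many characters were consumed from s when the macro is
   applied directly to an expression with a side effect. */
static size_t consumed_by_macro(const char *s)
{
  struct reader r = { s, 0 };
  int ok = DEMO_ISLOWER(next_char(&r));
  (void)ok;
  return r.pos;
}

/* Same as above, but through the function wrapper. */
static size_t consumed_by_function(const char *s)
{
  struct reader r = { s, 0 };
  int ok = demo_islower_fn(next_char(&r));
  (void)ok;
  return r.pos;
}
```

With the input "ab" the macro variant consumes two characters while the function variant consumes one. How many characters the macro consumes also depends on the data, because && short-circuits when the first comparison fails. An ALLCAPS name gives the caller a hint to avoid passing such expressions.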


120 Commits
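
Commit 92772e6d39 in the log states the rule that a host can only be an IPv4 address if all of its dot-separated parts are all digits and there are no more than four parts; anything else is treated as a host name. The sketch below illustrates only that stated rule. It is not curl's parser, the function name demo_is_ipv4_candidate is hypothetical, and treating empty parts (such as a trailing dot) as "not IPv4" is an assumption made here for simplicity.

```c
#include <stddef.h>

/* Returns 1 if host consists of one to four dot-separated parts that
   are each a non-empty run of decimal digits, otherwise 0.
   Assumption: an empty part makes the string a host name candidate. */
static int demo_is_ipv4_candidate(const char *host)
{
  int parts = 0;
  const char *p = host;

  for(;;) {
    size_t digits = 0;

    /* consume one run of decimal digits */
    while(*p >= '0' && *p <= '9') {
      p++;
      digits++;
    }
    if(!digits)
      return 0; /* empty part or a non-digit character */

    parts++;
    if(parts > 4)
      return 0; /* too many parts for IPv4 */

    if(*p == '\0')
      return 1; /* every part was numeric */
    if(*p != '.')
      return 0; /* a letter or other character: host name */
    p++; /* skip the dot and check the next part */
  }
}
```

Under this rule, "127.0.0.1" is an IPv4 candidate, while "1.2.3.4.5" and "1.2.3.example" fall through to host name handling. Range checks on the numeric values are deliberately left out of the sketch.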