Out of 415 labels throughout the code base, 86 of those labels were
not at the start of the line. Which means labels always at the start of
the line is the favoured style overall with 329 instances.
Out of the 86 labels not at the start of the line:
* 75 were indented with the same indentation level of the following line
* 8 were indented with exactly one space
* 2 were indented with one fewer indentation level then the following
line
* 1 was indented with the indentation level of the following line minus
three space (probably unintentional)
Co-Authored-By: Viktor Szakats
Closes#11134
The open paren check wants to warn for spaces before open parenthesis
for if/while/for but also for any function call. In order to avoid
catching function pointer declarations, the logic allows a space if the
first character after the open parenthesis is an asterisk.
I also spotted what we did not include "switch" in the check but we should.
This check is a little lame, but we reduce this problem by not allowing
that space for if/while/for/switch.
Reported-by: Emanuele Torre
Closes#11044
- Makefile support for building test specific clients in tests/http/clients
- auto-make of clients when invoking pytest
- added test_09_02 for server PUSH_PROMISEs using clients/h2-serverpush
- added test_02_21 for lib based downloads and pausing/unpausing transfers
curl url parser:
- added internal method `curl_url_set_authority()` for setting the
authority part of a url (used for PUSH_PROMISE)
http2:
- made logging of PUSH_PROMISE handling nicer
Placing python test requirements in requirements.txt files
- separate files to base test suite and http tests since use
and module lists differ
- using the files in the gh workflows
websocket test cases, fixes for we and bufq
- bufq: account for spare chunks in space calculation
- bufq: reset chunks that are skipped empty
- ws: correctly encode frames with 126 bytes payload
- ws: update frame meta information on first call of collect
callback that fills user buffer
- test client ws-data: some test/reporting improvements
Closes#11006
- move host checks together
- simplify the scheme parser loop and the end of host name parser
- avoid itermediate buffer storing in multiple places
- reduce scope for several variables
- skip the Curl_dyn_tail() call for speed
- detect IPv6 earlier and skip extra checks for such hosts
- normalize directly in dynbuf instead of itermediate buffer
- split out the IPv6 parser into its own funciton
- call the IPv6 parser directly for ipv6 addresses
- remove (unused) special treatment of % in host names
- junkscan() once in the beginning instead of scattered
- make junkscan return error code
- remove unused query management from dedotdotify()
- make Curl_parse_login_details use memchr
- more use of memchr() instead of strchr() and less strlen() calls
- make junkscan check and return the URL length
An optimized build runs one of my benchmark URL parsing programs ~41%
faster using this branch. (compared against the shipped 7.88.1 library
in Debian)
Closes#10935
A typical mistake would be to try to set "https://" - including the
separator - this is now rejected as that would then lead to
url_get(... URL...) would get an invalid URL extracted.
Extended test 1560 to verify.
Closes#10911
Using bad numbers in an IPv4 numerical address now returns
CURLUE_BAD_HOSTNAME.
I noticed while working on trurl and it was originally reported here:
https://github.com/curl/trurl/issues/78
Updated test 1560 accordingly.
Closes#10894
Meaning that it would wrongly still store the fragment using spaces
instead of %20 if allowing space while also asking for URL encoding.
Discovered when playing with trurl.
Added test to lib1560 to verify the fix.
Closes#10887
- sscanf() is rather complex and slow, strchr() much simpler
- the port number function does not need to fully verify the IPv6 address
anyway as it is done later in the hostname_check() function and doing
it twice is unnecessary.
Closes#10541
Inspired by the recent fixes to escape.c, we should avoid calling
Curl_dyn_addf() in loops, perhaps in particular when adding something so
simple as %HH codes - for performance reasons. This change makes the
same thing for the URL parser's two URL-encoding loops.
Closes#10384
As a miniscule optimization, treat a path of the length 1 as the same as
non-existing, as it can only be a single leading slash, and that's what
we do for no paths as well.
Closes#10385
- they are mostly pointless in all major jurisdictions
- many big corporations and projects already don't use them
- saves us from pointless churn
- git keeps history for us
- the year range is kept in COPYING
checksrc is updated to allow non-year using copyright statements
Closes#10205
To please scan-build:
urlapi.c:1163:9: warning: Value stored to 'qlen' is never read
qlen = Curl_dyn_len(&enc);
^ ~~~~~~~~~~~~~~~~~~
urlapi.c:1164:9: warning: Value stored to 'query' is never read
query = u->query = Curl_dyn_ptr(&enc);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Follow-up to 7d6cf06f57Closes#9777
When CURLU_URLENCODE is set, the parser would mistreat the path
component if the URL was specified without a slash like in
http://local.test:80?-123
Extended test 1560 to reproduce and verify the fix.
Reported-by: Trail of Bits
Closes#9763
This also returns error CURLE_UNSUPPORTED_PROTOCOL rather than
CURLE_BAD_FUNCTION_ARGUMENT when a listed protocol name is not found.
A new schemelen parameter is added to Curl_builtin_scheme() to support
this extended use.
Note that disabled protocols are not recognized anymore.
Tests adapted accordingly.
Closes#9472
When the parser is not allowed to guess scheme, it should consider the
word ending at the first colon to be the scheme, independently of number
of slashes.
The parser now checks that the scheme is known before it counts slashes,
to improve the error messge for URLs with unknown schemes and maybe no
slashes.
When following redirects, no scheme guessing is allowed and therefore
this change effectively prevents redirects to unknown schemes such as
"data".
Fixes#9503
Slightly faster with more robust code. Uses fewer and smaller mallocs.
- remove two fields from the URL handle struct
- reduce copies and allocs
- use dynbuf buffers more instead of custom malloc + copies
- uses dynbuf to build the host name in reduces serial alloc+free within
the same function.
- move dedotdotify into urlapi.c and make it static, not strdup the input
and optimize it by checking for . and / before using strncmp
- remove a few strlen() calls
- add Curl_dyn_setlen() that can "trim" an existing dynbuf
Closes#9408
As per the documentation :
> Setting a part to a NULL pointer will effectively remove that
> part's contents from the CURLU handle.
But currently clearing CURLUPART_URL does nothing and returns
CURLUE_OK. This change will clear all parts of the URL at once.
Closes#9028
Add licensing and copyright information for all files in this repository. This
either happens in the file itself as a comment header or in the file
`.reuse/dep5`.
This commit also adds a Github workflow to check pull requests and adapts
copyright.pl to the changes.
Closes#8869
Previously, the return code CURLUE_MALFORMED_INPUT was used for almost
30 different URL format violations. This made it hard for users to
understand why a particular URL was not acceptable. Since the API cannot
point out a specific position within the URL for the problem, this now
instead introduces a number of additional and more fine-grained error
codes to allow the API to return more exactly in what "part" or section
of the URL a problem was detected.
Also bug-fixes curl_url_get() with CURLUPART_ZONEID, which previously
returned CURLUE_OK even if no zoneid existed.
Test cases in 1560 have been adjusted and extended. Tests 1538 and 1559
have been updated.
Updated libcurl-errors.3 and curl_url_strerror() accordingly.
Closes#8049
Instad of having all callers pass in the maximum length, always use
it. The passed in length is instead used only as the length of the
target buffer for to storing the scheme name in, if used.
Added the scheme max length restriction to the curl_url_set.3 man page.
Follow-up to 45bcb2eaa7Closes#8047
file URLs that are 6 bytes or shorter are not complete. Return
CURLUE_MALFORMED_INPUT for those. Extended test 1560 to verify.
Triggered by #8041Closes#8042
... to let curl_easy_escape() itself do the strlen. This avoids a (false
positive) Coverity warning and it avoids us having to store the strlen()
return value in an int variable.
Reviewed-by: Daniel Gustafsson
Closes#7862
The host name is stored decoded and can be encoded when used to extract
the full URL. By default when extracting the URL, the host name will not
be URL encoded to work as similar as possible as before. When not URL
encoding the host name, the '%' character will however still be encoded.
Getting the URL with the CURLU_URLENCODE flag set will percent encode
the host name part.
As a bonus, setting the host name part with curl_url_set() no longer
accepts a name that contains space, CR or LF.
Test 1560 has been extended to verify percent encodings.
Reported-by: Noam Moshe
Reported-by: Sharon Brizinov
Reported-by: Raul Onitza-Klugman
Reported-by: Kirill Efimov
Fixes#7830Closes#7834