GHA/windows: improve build perf with cmake unity batches

Default curl unity builds make a single unit for each target. It means
all target sources are batched together and built in a single compiler
invocation. With multi-core CPUs this doesn't always result in the best
possible performance. This patch enables smaller batches for jobs where
this resulted in shorter build times. These jobs are Cygwin, MSYS2,
MinGW, running on the Windows runners.

Use batch of 30 (meaning 30 sources batched into units), and 32 for
Cygwin/MSYS2 to avoid a unity fallout that's subject to a different PR.

(CMake allows to set the number of sources per unit, not the number
of units, though the latter may be more practical to max out CPU cores.)

Also override to not batch the `curlu` target because batching lost
a little bit of time there, due to the already existing parallelism when
building the `testdeps` targets.

For jobs on the macOS and Linux runners jobs were already mostly single
digit or below teen seconds, and batching didn't improve on them
noticeably. On VM jobs, the virtual CPUs are limited, so I didn't
make a try. In AppVeyor and GHA vcpkg jobs (using msbuild), batching
didn't result in conclusive or any gains.

Build times in seconds (curl + testdeps):
Job                  |          Before | After w curlu=0 | Gain
:--------------------| :-------------- | :-------------- | :---
cygwin, CM           |   19 + 32 =  51 |  12 +  32 =  44 |    7
msys2, CM            |    7 + 15 =  22 |   5 +  14 =  19 |    3
mingw gcc U, CM      |   19 + 30 =  49 |  13 +  32 =  45 |    4
mingw ucrt, CM       |   32 + 42 =  74 |  15 +  43 =  58 |   16
mingw clang, CM      |   15 + 21 =  36 |   8 +  21 =  29 |    7
mingw uwp, CM        |   30 + 40 =  70 |  14 +  40 =  54 |   16
mingw gcc, CM        |   20 + 31 =  51 |  12 +  31 =  43 |    8
mingw x86, CM        |   35 + 40 =  75 |  15 +  38 =  53 |   22
dl-mingw, CM 9.5.0   |   88 + 99 = 187 |  42 + 101 = 143 |   44
dl-mingw, CM 7.3.0 U |   24 + 32 =  56 |  17 +  35 =  52 |    4
Total                |                 |                 |  131

Total gain per GHA/windows workflow runs: 2m11s

Runs:
Before: https://github.com/curl/curl/actions/runs/13220256084/job/36904342259
After: https://github.com/curl/curl/actions/runs/13220383702/job/36904602981
       https://github.com/curl/curl/actions/runs/13220613141/job/36905170104
       https://github.com/curl/curl/actions/runs/13222019443/job/36908358550
With curlu tweak: https://github.com/curl/curl/actions/runs/13222239255/job/36908782462

Ref: 116950a250 #16265

Closes #16272
This commit is contained in:
Viktor Szakats 2025-02-09 00:50:07 +01:00
parent 48df83a30e
commit 29e4eda631
No known key found for this signature in database
GPG Key ID: B5ABD165E2AEF201
2 changed files with 6 additions and 3 deletions

View File

@ -86,7 +86,7 @@ jobs:
if [ '${{ matrix.build }}' = 'cmake' ]; then
PATH="/usr/bin:$(cygpath "${SYSTEMROOT}")/System32"
cmake -B bld -G Ninja ${options} \
-DCMAKE_UNITY_BUILD=ON -DCURL_TEST_BUNDLES=ON \
-DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_BATCH_SIZE=32 -DCURL_TEST_BUNDLES=ON \
-DCURL_WERROR=ON \
${{ matrix.config }}
else
@ -252,7 +252,7 @@ jobs:
cmake -B bld -G Ninja ${options} \
-DCMAKE_C_FLAGS="${{ matrix.cflags }} ${CFLAGS_CMAKE} ${CPPFLAGS}" \
-DCMAKE_BUILD_TYPE='${{ matrix.type }}' \
-DCMAKE_UNITY_BUILD=ON -DCURL_TEST_BUNDLES=ON \
-DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_BATCH_SIZE=32 -DCURL_TEST_BUNDLES=ON \
-DCURL_WERROR=ON \
${{ matrix.config }}
else
@ -438,7 +438,7 @@ jobs:
cmake -B bld -G 'MSYS Makefiles' ${options} \
-DCMAKE_C_COMPILER=gcc \
-DCMAKE_BUILD_TYPE='${{ matrix.type }}' \
-DCMAKE_UNITY_BUILD=ON -DCURL_TEST_BUNDLES=ON \
-DCMAKE_UNITY_BUILD=ON -DCMAKE_UNITY_BUILD_BATCH_SIZE=30 -DCURL_TEST_BUNDLES=ON \
-DCURL_WERROR=ON \
-DCURL_USE_LIBPSL=OFF \
${{ matrix.config }}

View File

@ -55,6 +55,9 @@ if(CURL_BUILD_TESTING)
)
target_compile_definitions(curlu PUBLIC "UNITTESTS" "CURL_STATICLIB")
target_link_libraries(curlu PRIVATE ${CURL_LIBS})
# There is plenty of parallelism when building the testdeps target.
# Override the curlu batch size with the maximum to optimize performance.
set_target_properties(curlu PROPERTIES UNITY_BUILD_BATCH_SIZE 0)
endif()
if(ENABLE_CURLDEBUG)