llvm-project 16 was just released. I added some lld/ELF notes to https://github.com/llvm/llvm-project/blob/release/16.x/lld/docs/ReleaseNotes.rst. Here I will elaborate on some changes.
- Link speed improved greatly compared with lld 15.0. Notably input section initialization and relocation scanning are now parallel. (D130810) (D133003)
- ELFCOMPRESS_ZSTD compressed input sections are now supported. (D129406)
- --compress-debug-sections=zstd is now available to compress debug sections with zstd (ELFCOMPRESS_ZSTD). (D133548)
- --no-warnings/-w is now available to suppress warnings. (D136569)
- DT_RISCV_VARIANT_CC is now produced if at least one R_RISCV_JUMP_SLOT relocation references a symbol with the STO_RISCV_VARIANT_CC bit. (D107951)
- DT_STATIC_TLS is now set for AArch64/PPC32/PPC64 initial-exec TLS models when producing a shared object.
- --no-undefined-version is now the default; symbols named in version scripts that have no matching symbol in the output will be reported. Use --undefined-version to revert to the old behavior. (D135402)
- -V is now an alias for -v to support gcc -fuse-ld=lld -v on many targets.
- -r no longer defines __global_pointer$ or _TLS_MODULE_BASE_.
- A corner case of mixed GCC and Clang object files (STB_WEAK and STB_GNU_UNIQUE in different COMDATs) is now supported. (D136381)
- The output SHT_RISCV_ATTRIBUTES section now merges all input components instead of picking the first input component. (D138550)
- For x86-32, -fno-plt GD/LD TLS models call *___tls_get_addr@GOT(%reg) are now supported. Previous output might have runtime crash.
- Armv4(T) thunks are now supported. (D139888) (D141272)
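The --no-undefined-version change is the one most likely to affect existing builds, so here is a minimal sketch of how it can show up. The file names, the version node v1, and the symbols foo and bar are all hypothetical, and the link steps only run if clang and ld.lld happen to be on PATH. With lld 16 the first link should be rejected because bar is named in the version script but never defined; the second link opts back into the old behavior.

```shell
#!/bin/sh
# Hypothetical reproduction; file names, v1, foo and bar are made up for illustration.
dir=$(mktemp -d)

# A shared object source that defines only foo.
cat > "$dir/a.c" <<'EOF'
void foo(void) {}
EOF

# The version script also names bar, which a.c does not define.
cat > "$dir/a.ver" <<'EOF'
v1 {
  global: foo; bar;
  local: *;
};
EOF

# Only attempt the links when the toolchain is available (an assumption about the host).
if command -v clang >/dev/null 2>&1 && command -v ld.lld >/dev/null 2>&1; then
  # lld 16 default (--no-undefined-version): expected to report bar and fail.
  clang -fuse-ld=lld -shared -fPIC "$dir/a.c" \
    -Wl,--version-script="$dir/a.ver" -o "$dir/a.so" \
    || echo "default: link failed"

  # Restore the pre-16 behavior explicitly.
  clang -fuse-ld=lld -shared -fPIC "$dir/a.c" \
    -Wl,--version-script="$dir/a.ver" -Wl,--undefined-version -o "$dir/a.so" \
    && echo "--undefined-version: link succeeded"
fi
```

In a real project the cleaner fix is usually to delete the stale name from the version script rather than to pass --undefined-version.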
Speed
Link speed has greatly improved compared to lld 15.0.0.
In this release cycle, I made input section initialization and relocation scanning parallel. (D130810 D133003)
% hyperfine --warmup 1 --min-runs 20 "numactl -C 20-27 "{/tmp/out/custom15,/tmp/out/custom16}"/bin/ld.lld @response.txt --threads=8"
(--threads=4 => 1.38x (0.7009s => 0.5096s))
Linking a -DCMAKE_BUILD_TYPE=Debug build of clang:
% hyperfine --warmup 2 --min-runs 25 "numactl -C 20-27 /tmp/out/custom"{15,16}"/bin/ld.lld @response.txt --threads=8"
(--threads=1 => 1.06x (7.620s => 7.202s), --threads=4 => 1.11x (4.138s => 3.727s))
Linking a default build of chrome:
% hyperfine --warmup 2 --min-runs 25 "numactl -C 20-27 /tmp/out/custom"{15,16}"/bin/ld.lld @response.txt --threads=8"
(--threads=4 => 1.11x (4.387s => 3.940s))
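The "N.NNx" figures quoted in parentheses are old time divided by new time. As a small sanity check, the sketch below recomputes them from the timings given in the text; the speedup helper is a hypothetical name introduced here, and it assumes a POSIX awk is available.

```shell
#!/bin/sh
# speedup OLD NEW: print OLD/NEW rounded to two decimals, e.g. "1.38x".
# Hypothetical helper, not part of hyperfine or lld.
speedup() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2fx\n", old / new }'
}

# Timings (seconds) copied from the parenthesized notes in the text.
speedup 0.7009 0.5096   # first benchmark, --threads=4
speedup 7.620 7.202     # Debug clang, --threads=1
speedup 4.138 3.727     # Debug clang, --threads=4
speedup 4.387 3.940     # chrome, --threads=4
```

The results match the quoted ratios (1.38x, 1.06x, 1.11x, 1.11x), so the figures are wall-time ratios of lld 15 over lld 16 at the same thread count.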