lld provide - provides more symbols than needed #74771

Closed
shankarke opened this issue Dec 7, 2023 · 5 comments
@shankarke
Contributor

shankarke commented Dec 7, 2023

cat > 1.c << \!
int foo = 10;
!

cat > script.t << \!
SECTIONS {
  PROVIDE(f4 = 0x1000);
  PROVIDE(f3 = f4);
  PROVIDE(f2 = f3);
  PROVIDE(f2 = f3);
  PROVIDE(f1 = f2);
  PROVIDE(foo = f1);
}
!

clang -target riscv32 -c 1.c
ld.lld  1.o -T script.t
$ ld.lld --version
LLD 18.0.0 (compatible with GNU linkers)

$ llvm-readelf -s a.out
Symbol table '.symtab' contains 10 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS 1.c
     2: 00000000     0 NOTYPE  LOCAL  DEFAULT    2 $d.0
     3: 00000000     0 NOTYPE  LOCAL  DEFAULT    3 $d.1
     4: 00000000     0 NOTYPE  LOCAL  DEFAULT  ABS $d.2
     5: 00000000     4 OBJECT  GLOBAL DEFAULT    2 foo
     6: 00001000     0 NOTYPE  GLOBAL DEFAULT  ABS f4
     7: 00001000     0 NOTYPE  GLOBAL DEFAULT  ABS f3
     8: 00001000     0 NOTYPE  GLOBAL DEFAULT  ABS f2
     9: 00001000     0 NOTYPE  GLOBAL DEFAULT  ABS f1

Is this a known issue? Why does the GNU linker script PROVIDE directive work this way with lld?

@github-actions github-actions bot added the lld label Dec 7, 2023
@MaskRay
Member

MaskRay commented Dec 9, 2023

In GNU ld, if a symbol is only referenced by the right-hand side of a PROVIDE assignment, it is not considered referenced, and PROVIDE will not define it.
For simplicity, ld.lld does not implement this special rule.

as /dev/null -o a.o
cat > a.t <<e
SECTIONS {
  PROVIDE(f3 = 0x1000);
  PROVIDE(f2 = f3);
  PROVIDE(f1 = f2);
  PROVIDE(foo = f1);
}
e
ld.bfd a.o -T a.t -o a.bfd
ld.lld a.o -T a.t -o a.lld
% readelf -s a.bfd

Symbol table '.symtab' contains 1 entry:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
% readelf -s a.lld

Symbol table '.symtab' contains 4 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000001000     0 NOTYPE  GLOBAL DEFAULT  ABS f3
     2: 0000000000001000     0 NOTYPE  GLOBAL DEFAULT  ABS f2
     3: 0000000000001000     0 NOTYPE  GLOBAL DEFAULT  ABS f1

@shankarke
Contributor Author

Thanks @MaskRay. I don't see this behavior breaking anything. Are there any plans to fix it?

@MaskRay
Member

MaskRay commented Dec 13, 2023

I think implementing this behavior is non-trivial, so I'd ask why it's important to implement it.

The RHS of a symbol assignment does serve the purpose of Undefined, and we implement it with referencedSymbols. referencedSymbols is needed to extract archive members and inform LTO about the uses. One nice aspect of using referencedSymbols is that we reduce position dependence.

For BFD-like PROVIDE behavior, it seems that we cannot unconditionally add Undefined for referencedSymbols. We would need to add back the entries when a PROVIDE gets used. I believe the changes would be very intrusive.
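
As a rough illustration of the behavior discussed above, here is a minimal Python sketch (not lld's actual code) of what happens when every symbol on the right-hand side of an assignment is unconditionally treated as referenced. The function name and the list-of-tuples representation of the script are assumptions made for this example; only the script contents come from the original report.

```python
def provided_with_unconditional_refs(provides, obj_defined, obj_undefined):
    """Hypothetical model: every RHS symbol counts as a reference.

    provides: list of (lhs, [rhs symbols]) in script order.
    obj_defined / obj_undefined: symbols defined / referenced by object files.
    """
    referenced = set(obj_undefined)
    for _lhs, rhs_symbols in provides:
        # RHS symbols are added regardless of whether the PROVIDE is used.
        referenced.update(rhs_symbols)

    result = []
    for lhs, _rhs in provides:
        # PROVIDE defines a symbol only if it is referenced and not defined.
        if lhs in referenced and lhs not in obj_defined and lhs not in result:
            result.append(lhs)
    return result


# The script from the original report; 1.o defines foo and references nothing.
issue_script = [
    ("f4", []),
    ("f3", ["f4"]),
    ("f2", ["f3"]),
    ("f2", ["f3"]),
    ("f1", ["f2"]),
    ("foo", ["f1"]),
]
print(provided_with_unconditional_refs(issue_script, {"foo"}, set()))
# ['f4', 'f3', 'f2', 'f1']
```

In this model, `PROVIDE(foo = f1)` never takes effect because `foo` is already defined, yet its right-hand side still marks `f1` as referenced, which pulls in the rest of the chain.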


GNU ld performs many iterations, but some ports (e.g. x86, but not aarch64/riscv) do not allow too many iterations.

% cat chain.t
PROVIDE(f7 = 0x1000);
PROVIDE(f6 = f7);
PROVIDE(f5 = f6);
PROVIDE(f4 = f5);
PROVIDE(f3 = f4);
PROVIDE(f2 = f3);
PROVIDE(f1 = f2);
PROVIDE(newsym = f1);
% ld.bfd a.o -T chain.t
ld.bfd:chain.t:2: undefined symbol `f7' referenced in expression

FWIW I have improved PROVIDE tests in 215c565

@MaskRay MaskRay added lld:ELF and removed lld labels Dec 13, 2023
@llvmbot
Collaborator

llvmbot commented Dec 13, 2023

@llvm/issue-subscribers-lld-elf

Author: None (shankarke)


partaror added a commit to partaror/llvm-project that referenced this issue Mar 8, 2024
Previously, linker was unnecessarily including a PROVIDE symbol which
was referenced by another unused PROVIDE symbol. For example, if a
linker script contained the below code and 'not_used_sym' provide symbol
is not included, then linker was still unnecessarily including 'foo' PROVIDE
symbol because it was referenced by 'not_used_sym'. This commit fixes
this behavior.

PROVIDE(not_used_sym = foo)
PROVIDE(foo = 0x1000)

This commit fixes this behavior by using dfs-like algorithm to find
all the symbols referenced in provide expressions of included provide
symbols.

Closes llvm#74771
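
A minimal Python sketch of the idea described in the commit message, assuming a simplified model in which each PROVIDE is a (name, referenced symbols) pair. This is not the code from the commit; the function name, the data layout, and the choice to keep the first of any duplicate PROVIDE entries are hypothetical.

```python
def provided_with_dfs(provides, obj_defined, obj_undefined):
    """Hypothetical model: only PROVIDEs reachable from real references count.

    provides: list of (lhs, [rhs symbols]) in script order.
    obj_defined / obj_undefined: symbols defined / referenced by object files.
    """
    # Assumption for this sketch: the first PROVIDE of a name is the one kept.
    rhs_of = {}
    for lhs, rhs_symbols in provides:
        rhs_of.setdefault(lhs, rhs_symbols)

    used = set()
    visited = set()
    stack = list(obj_undefined)  # start from references made by object files
    while stack:
        sym = stack.pop()
        if sym in visited:
            continue
        visited.add(sym)
        if sym in rhs_of and sym not in obj_defined:
            used.add(sym)
            # Only now do the RHS symbols of this PROVIDE become references.
            stack.extend(rhs_of[sym])

    return [lhs for lhs in rhs_of if lhs in used]


commit_script = [
    ("not_used_sym", ["foo"]),
    ("foo", []),
]
print(provided_with_dfs(commit_script, set(), set()))             # []
print(provided_with_dfs(commit_script, set(), {"not_used_sym"}))  # ['not_used_sym', 'foo']
```

With no object file referencing `not_used_sym`, the traversal never reaches `foo`, so neither symbol is defined.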
partaror added a commit to partaror/llvm-project that referenced this issue Mar 11, 2024
Previously, linker was unnecessarily including a PROVIDE symbol which
was referenced by another unused PROVIDE symbol. For example, if a
linker script contained the below code and 'not_used_sym' provide symbol
is not included, then linker was still unnecessarily including 'foo' PROVIDE
symbol because it was referenced by 'not_used_sym'. This commit fixes
this behavior.

PROVIDE(not_used_sym = foo)
PROVIDE(foo = 0x1000)

This commit fixes this behavior by using dfs-like algorithm to find
all the symbols referenced in provide expressions of included provide
symbols.

This commit also fixes the issue of unused section not being garbage-collected
if a symbol of the section is referenced by an unused PROVIDE symbol.

Closes llvm#74771
Closes llvm#84730
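
The garbage-collection side effect mentioned in the commit message can be sketched in the same spirit. The Python below is a deliberately simplified, hypothetical model: a section is kept only if one of its symbols is a root or is referenced by a PROVIDE that is actually used. Real section GC also follows relocations between sections, which this sketch ignores; all names here are invented for illustration.

```python
def live_sections(symbol_section, roots, provides, used_provides):
    """Hypothetical model of which sections survive garbage collection.

    symbol_section: maps a symbol name to the section that defines it.
    roots: symbols that are always live (e.g. the entry point).
    provides: list of (lhs, [rhs symbols]).
    used_provides: names of PROVIDE symbols that are actually defined.
    """
    live_syms = set(roots)
    for lhs, rhs_symbols in provides:
        # An unused PROVIDE must not keep its RHS symbols alive.
        if lhs in used_provides:
            live_syms.update(rhs_symbols)
    return sorted({symbol_section[s] for s in live_syms if s in symbol_section})


symbol_section = {"_start": ".text", "bar": ".text.bar"}
provides = [("unused", ["bar"])]

print(live_sections(symbol_section, {"_start"}, provides, set()))       # ['.text']
print(live_sections(symbol_section, {"_start"}, provides, {"unused"}))  # ['.text', '.text.bar']
```

When `unused` is not defined, `.text.bar` is no longer retained just because `bar` appears on the right-hand side of that PROVIDE.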
partaror added a commit to partaror/llvm-project that referenced this issue Mar 25, 2024
Previously, linker was unnecessarily including a PROVIDE symbol which
was referenced by another unused PROVIDE symbol. For example, if a
linker script contained the below code and 'not_used_sym' provide symbol
is not included, then linker was still unnecessarily including 'foo' PROVIDE
symbol because it was referenced by 'not_used_sym'. This commit fixes
this behavior.

PROVIDE(not_used_sym = foo)
PROVIDE(foo = 0x1000)

This commit fixes this behavior by using dfs-like algorithm to find
all the symbols referenced in provide expressions of included provide
symbols.

This commit also fixes the issue of unused section not being garbage-collected
if a symbol of the section is referenced by an unused PROVIDE symbol.

Closes llvm#74771
Closes llvm#84730

Co-authored-by: Fangrui Song <[email protected]>
MaskRay added a commit that referenced this issue Jul 27, 2024
Extend commit ebb326a for (#74771) to
support quoted names, e.g. `PROVIDE("f1" = f2 + f3);`.