Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add inline attr to CString::into_inner so it can optimize out NonNull checks #85719

Merged
merged 1 commit into from
May 27, 2021

Conversation

elichai
Copy link
Contributor

@elichai elichai commented May 26, 2021

It seems that currently if you convert any of the standard library's container to a pointer and then to a NonNull pointer, all will optimize out the NULL check except CString(https://godbolt.org/z/YPKW9G5xn),
because for some reason CString::into_inner isn't inlined even though it's a private function that should compile into a simple mov instruction.

Adding a simple #[inline] attribute solves this, code example:

use std::ffi::CString;
use std::ptr::NonNull;

pub fn cstring_nonull(mut n: CString) -> NonNull<i8> {
    NonNull::new(CString::into_raw(n)).unwrap()
}

assembly before:

__ZN3wat14cstring_nonull17h371c755bcad76294E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	callq	__ZN3std3ffi5c_str7CString10into_inner17h28ece07b276e2878E
	testq	%rax, %rax
	je	LBB0_2
	popq	%rbp
	retq
LBB0_2:
	leaq	l___unnamed_1(%rip), %rdi
	leaq	l___unnamed_2(%rip), %rdx
	movl	$43, %esi
	callq	__ZN4core9panicking5panic17h92a83fa9085a8f73E
	.cfi_endproc

	.section	__TEXT,__const
l___unnamed_1:
	.ascii	"called `Option::unwrap()` on a `None` value"

l___unnamed_3:
	.ascii	"wat.rs"

	.section	__DATA,__const
	.p2align	3
l___unnamed_2:
	.quad	l___unnamed_3
	.asciz	"\006\000\000\000\000\000\000\000\006\000\000\000(\000\000"

Assembly after:

__ZN3wat14cstring_nonull17h9645eb9341fb25d7E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	movq	%rdi, %rax
	popq	%rbp
	retq
	.cfi_endproc

(Related discussion on zulip: https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/NonNull.20From.3CBox.3CT.3E.3E)

@rust-highfive
Copy link
Collaborator

r? @kennytm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 26, 2021
@m-ou-se
Copy link
Member

m-ou-se commented May 26, 2021

Thanks!

@bors r+

@bors
Copy link
Contributor

bors commented May 26, 2021

📌 Commit 45099e6 has been approved by m-ou-se

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 26, 2021
@m-ou-se m-ou-se assigned m-ou-se and unassigned kennytm May 26, 2021
@m-ou-se m-ou-se added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 26, 2021
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request May 26, 2021
…r=m-ou-se

Add inline attr to CString::into_inner so it can optimize out NonNull checks

It seems that currently if you convert any of the standard library's container to a pointer and then to a NonNull pointer, all will optimize out the NULL check except `CString`(https://godbolt.org/z/YPKW9G5xn),
because for some reason `CString::into_inner` isn't inlined even though it's a private function that should compile into a simple `mov` instruction.

Adding a simple `#[inline]` attribute solves this, code example:
```rust
use std::ffi::CString;
use std::ptr::NonNull;

pub fn cstring_nonull(mut n: CString) -> NonNull<i8> {
    NonNull::new(CString::into_raw(n)).unwrap()
}
```

assembly before:
```asm
__ZN3wat14cstring_nonull17h371c755bcad76294E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	callq	__ZN3std3ffi5c_str7CString10into_inner17h28ece07b276e2878E
	testq	%rax, %rax
	je	LBB0_2
	popq	%rbp
	retq
LBB0_2:
	leaq	l___unnamed_1(%rip), %rdi
	leaq	l___unnamed_2(%rip), %rdx
	movl	$43, %esi
	callq	__ZN4core9panicking5panic17h92a83fa9085a8f73E
	.cfi_endproc

	.section	__TEXT,__const
l___unnamed_1:
	.ascii	"called `Option::unwrap()` on a `None` value"

l___unnamed_3:
	.ascii	"wat.rs"

	.section	__DATA,__const
	.p2align	3
l___unnamed_2:
	.quad	l___unnamed_3
	.asciz	"\006\000\000\000\000\000\000\000\006\000\000\000(\000\000"
```

Assembly after:
```asm
__ZN3wat14cstring_nonull17h9645eb9341fb25d7E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	movq	%rdi, %rax
	popq	%rbp
	retq
	.cfi_endproc
```

(Related discussion on zulip: https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/NonNull.20From.3CBox.3CT.3E.3E)
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this pull request May 27, 2021
…r=m-ou-se

Add inline attr to CString::into_inner so it can optimize out NonNull checks

It seems that currently if you convert any of the standard library's container to a pointer and then to a NonNull pointer, all will optimize out the NULL check except `CString`(https://godbolt.org/z/YPKW9G5xn),
because for some reason `CString::into_inner` isn't inlined even though it's a private function that should compile into a simple `mov` instruction.

Adding a simple `#[inline]` attribute solves this, code example:
```rust
use std::ffi::CString;
use std::ptr::NonNull;

pub fn cstring_nonull(mut n: CString) -> NonNull<i8> {
    NonNull::new(CString::into_raw(n)).unwrap()
}
```

assembly before:
```asm
__ZN3wat14cstring_nonull17h371c755bcad76294E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	callq	__ZN3std3ffi5c_str7CString10into_inner17h28ece07b276e2878E
	testq	%rax, %rax
	je	LBB0_2
	popq	%rbp
	retq
LBB0_2:
	leaq	l___unnamed_1(%rip), %rdi
	leaq	l___unnamed_2(%rip), %rdx
	movl	$43, %esi
	callq	__ZN4core9panicking5panic17h92a83fa9085a8f73E
	.cfi_endproc

	.section	__TEXT,__const
l___unnamed_1:
	.ascii	"called `Option::unwrap()` on a `None` value"

l___unnamed_3:
	.ascii	"wat.rs"

	.section	__DATA,__const
	.p2align	3
l___unnamed_2:
	.quad	l___unnamed_3
	.asciz	"\006\000\000\000\000\000\000\000\006\000\000\000(\000\000"
```

Assembly after:
```asm
__ZN3wat14cstring_nonull17h9645eb9341fb25d7E:
	.cfi_startproc
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	movq	%rdi, %rax
	popq	%rbp
	retq
	.cfi_endproc
```

(Related discussion on zulip: https://rust-lang.zulipchat.com/#narrow/stream/219381-t-libs/topic/NonNull.20From.3CBox.3CT.3E.3E)
This was referenced May 27, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request May 27, 2021
Rollup of 8 pull requests

Successful merges:

 - rust-lang#84221 (E0599 suggestions and elision of generic argument if no canditate is found)
 - rust-lang#84701 (stabilize member constraints)
 - rust-lang#85564 ( readd capture disjoint fields gate)
 - rust-lang#85583 (Get rid of PreviousDepGraph.)
 - rust-lang#85649 (Update cc)
 - rust-lang#85689 (Remove Iterator #[rustc_on_unimplemented]s that no longer apply.)
 - rust-lang#85719 (Add inline attr to CString::into_inner so it can optimize out NonNull checks)
 - rust-lang#85725 (Remove unneeded workaround)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 955e0f4 into rust-lang:master May 27, 2021
@rustbot rustbot added this to the 1.54.0 milestone May 27, 2021
@elichai elichai deleted the cstring-into_inner-inline branch May 27, 2021 08:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants