Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use probe-stack=inline-asm in LLVM 11+ #77885

Merged
merged 1 commit into from Jan 16, 2021
Merged

Conversation

@erikdesjardins
Copy link
Contributor

@erikdesjardins erikdesjardins commented Oct 13, 2020

Fixes (?) #74405, related to #43241

r? @cuviper

@cuviper
Copy link
Member

@cuviper cuviper commented Oct 14, 2020

Thanks! This may have perf implications, so I'm excluding it from rollups.

@bors r+ rollup=never

@bors
Copy link
Contributor

@bors bors commented Oct 14, 2020

📌 Commit 95269c2 has been approved by cuviper

@bors
Copy link
Contributor

@bors bors commented Oct 15, 2020

Testing commit 95269c2 with merge 1272a1a...

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 15, 2020
Use probe-stack=inline-asm in LLVM 11+

Fixes (?) rust-lang#74405, related to rust-lang#43241

r? `@cuviper`
@bors
Copy link
Contributor

@bors bors commented Oct 15, 2020

💔 Test failed - checks-actions

@cuviper
Copy link
Member

@cuviper cuviper commented Oct 15, 2020

The specific failures start here. I'm not sure why this change would cause new stack overflows, but it needs investigation.

failures:

---- [ui] ui/unsized-locals/by-value-trait-object-safety-rpass.rs stdout ----

error: test run failed!
status: signal: 6
command: "/checkout/obj/build/i686-unknown-linux-gnu/test/ui/unsized-locals/by-value-trait-object-safety-rpass/a"
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------

thread 'main' has overflowed its stack
fatal runtime error: stack overflow

------------------------------------------


---- [ui] ui/unsized-locals/by-value-trait-object-safety-withdefault.rs stdout ----

error: test run failed!
status: signal: 6
command: "/checkout/obj/build/i686-unknown-linux-gnu/test/ui/unsized-locals/by-value-trait-object-safety-withdefault/a"
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------

thread 'main' has overflowed its stack
fatal runtime error: stack overflow

------------------------------------------


---- [ui] ui/unsized-locals/autoderef.rs stdout ----

error: test run failed!
status: signal: 6
command: "/checkout/obj/build/i686-unknown-linux-gnu/test/ui/unsized-locals/autoderef/a"
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------

thread 'main' has overflowed its stack
fatal runtime error: stack overflow

------------------------------------------



failures:
    [ui] ui/unsized-locals/autoderef.rs
    [ui] ui/unsized-locals/by-value-trait-object-safety-rpass.rs
    [ui] ui/unsized-locals/by-value-trait-object-safety-withdefault.rs

test result: FAILED. 10827 passed; 3 failed; 109 ignored; 0 measured; 0 filtered out
@erikdesjardins
Copy link
Contributor Author

@erikdesjardins erikdesjardins commented Oct 17, 2020

Looks like a comparison is inverted. This is a code snippet from by-value-trait-object-safety-rpass. Immediately before this, the size of the alloca is loaded into ebx, which in our case is 0 because the unsized trait object has size 0.

	add	ebx, 15
	and	ebx, -16
	mov	eax, esp
	sub	eax, ebx
	mov	dword ptr [ebp - 368], edx
	mov	dword ptr [ebp - 372], esi
	mov	dword ptr [ebp - 376], edi
	mov	dword ptr [ebp - 380], eax
.LBB87_43:
	mov	eax, dword ptr [ebp - 380]
	cmp	eax, esp
	jl	.LBB87_45
	mov	dword ptr [esp], 0
	sub	esp, 4096
	jmp	.LBB87_43
.LBB87_45:
	mov	eax, dword ptr [ebp - 380]
	mov	esp, eax

Due to:

cmp	eax, esp
jl	.LBB87_45

...if ebx is nonzero, the new stack pointer in eax will be below esp, and it will exit the loop without probing.
...if ebx is zero, the new stack pointer in eax will be equal to, and then greater than esp, and it will probe infinitely.
(and that's what happens running it locally under gdb)

The same thing shows up in Clang 11 with -fstack-clash-protection (https://godbolt.org/z/vxbaxd) with just:

int size;

void foo(void*);

int main() {
    foo(alloca(size));
}

I suppose nobody noticed this in C because it's rare to call alloca(0), and alloca(<nonzero>) will still run, it just won't actually probe anything.

Assuming I haven't misinterpreted something, can you report this to LLVM? I don't have an account yet.

@pietroalbini
Copy link
Member

@pietroalbini pietroalbini commented Oct 19, 2020

@bors r-

The PR failed but bors forgot about that during synchronize.

@cuviper
Copy link
Member

@cuviper cuviper commented Oct 19, 2020

Hmm, I thought we fixed that cmp direction in D82867, but it's possible we missed a case...
(edit: actually, I think that change may have mis-corrected the cmp for dynamic alloca... still investigating...)

@erikdesjardins
Copy link
Contributor Author

@erikdesjardins erikdesjardins commented Oct 19, 2020

Yeah, the changes to stack-clash-dynamic-alloca.ll in that revision look wrong; it was correct before (although maybe replace jl .LBB0_3 with jle to avoid probing for zero size allocas).
stack-clash-large.ll looks right if it's guaranteed that we always probe an exact multiple of the page size for static allocas.

@serge-sans-paille
Copy link

@serge-sans-paille serge-sans-paille commented Oct 27, 2020

@erikdesjardins / @cuviper https://reviews.llvm.org/D90216 should do the trick. Thanks for spotting this!

@cuviper
Copy link
Member

@cuviper cuviper commented Oct 27, 2020

Ah, so there are two parts to this...

I suppose nobody noticed this in C because it's rare to call alloca(0),

This actually was already reported in bug 47657 and fixed in D88548.

and alloca(<nonzero>) will still run, it just won't actually probe anything.

This problem remains, but I just confirmed locally that it is fixed by D90216. I stepped through the assembly in gdb just to be sure. 🙂

I think both fixes would be good to have in LLVM 11.0.1. For Rust's part, we can backport the fixes to our bundled LLVM fork, but we should be careful about how this is enabled for external LLVM. I don't think we usually check the patch version for functionality, but maybe this one justifies it -- assuming 11.0.1 does get fixed.

serge-sans-paille pushed a commit to llvm/llvm-project that referenced this pull request Oct 30, 2020
- Perform the probing in the correct direction.
  Related to rust-lang/rust#77885 (comment)

- The first touch on a dynamic alloca cannot use a mov because it clobbers
  existing space. Use a xor 0 instead

Differential Revision: https://reviews.llvm.org/D90216
serge-sans-paille pushed a commit to serge-sans-paille/llvm-project that referenced this pull request Nov 5, 2020
- Perform the probing in the correct direction.
  Related to rust-lang/rust#77885 (comment)

- The first touch on a dynamic alloca cannot use a mov because it clobbers
  existing space. Use a xor 0 instead

Differential Revision: https://reviews.llvm.org/D90216
@crlf0710
Copy link
Contributor

@crlf0710 crlf0710 commented Nov 13, 2020

Triage: So it seems this is blocked on a llvm bug.

tstellar added a commit to tstellar/llvm-project that referenced this pull request Nov 21, 2020
- Perform the probing in the correct direction.
  Related to rust-lang/rust#77885 (comment)

- The first touch on a dynamic alloca cannot use a mov because it clobbers
  existing space. Use a xor 0 instead

Differential Revision: https://reviews.llvm.org/D90216

(cherry picked from commit 0f60bcc)
tstellar added a commit to tstellar/llvm-project that referenced this pull request Nov 25, 2020
- Perform the probing in the correct direction.
  Related to rust-lang/rust#77885 (comment)

- The first touch on a dynamic alloca cannot use a mov because it clobbers
  existing space. Use a xor 0 instead

Differential Revision: https://reviews.llvm.org/D90216

(cherry picked from commit 0f60bcc)
serge-sans-paille pushed a commit to serge-sans-paille/llvm-project that referenced this pull request Dec 3, 2020
- Perform the probing in the correct direction.
  Related to rust-lang/rust#77885 (comment)

- The first touch on a dynamic alloca cannot use a mov because it clobbers
  existing space. Use a xor 0 instead

Differential Revision: https://reviews.llvm.org/D90216

(cherry picked from commit 0f60bcc)
@rust-timer
Copy link
Collaborator

@rust-timer rust-timer commented Jan 8, 2021

Awaiting bors try build completion.

@bors
Copy link
Contributor

@bors bors commented Jan 8, 2021

Trying commit b28b231 with merge 088a806...

bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 8, 2021
Use probe-stack=inline-asm in LLVM 11+

Fixes (?) rust-lang#74405, related to rust-lang#43241

r? `@cuviper`
@bors
Copy link
Contributor

@bors bors commented Jan 8, 2021

☀️ Try build successful - checks-actions
Build commit: 088a806 (088a806ae338b7577aa6f01e5700a94f9feb139c)

@rust-timer
Copy link
Collaborator

@rust-timer rust-timer commented Jan 8, 2021

Queued 088a806 with parent 937f629, future comparison URL.

@rustbot label: +S-waiting-on-perf

@rust-timer
Copy link
Collaborator

@rust-timer rust-timer commented Jan 8, 2021

Finished benchmarking try commit (088a806): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@cuviper
Copy link
Member

@cuviper cuviper commented Jan 13, 2021

Now that #80796 has merged, can you rebase again to separate this PR from that change?

@erikdesjardins erikdesjardins force-pushed the erikdesjardins:probeasm branch from b28b231 to cd25807 Jan 15, 2021
@erikdesjardins
Copy link
Contributor Author

@erikdesjardins erikdesjardins commented Jan 15, 2021

Rebased.

@cuviper
Copy link
Member

@cuviper cuviper commented Jan 15, 2021

@bors r+

@bors
Copy link
Contributor

@bors bors commented Jan 15, 2021

📌 Commit cd25807 has been approved by cuviper

@bors
Copy link
Contributor

@bors bors commented Jan 16, 2021

Testing commit cd25807 with merge 635ccfe...

@bors
Copy link
Contributor

@bors bors commented Jan 16, 2021

☀️ Test successful - checks-actions
Approved by: cuviper
Pushing 635ccfe to master...

@bors bors added the merged-by-bors label Jan 16, 2021
@bors bors merged commit 635ccfe into rust-lang:master Jan 16, 2021
11 checks passed
11 checks passed
PR (mingw-check, ubuntu-latest-xl)
Details
PR (x86_64-gnu-llvm-9, ubuntu-latest-xl)
Details
PR (x86_64-gnu-tools, 1, ubuntu-latest-xl)
Details
auto
Details
try
Details
master
Details
bors build finished
Details
bors build finished
Details
bors build finished
Details
bors build finished
Details
homu Test successful
Details
@rustbot rustbot added this to the 1.51.0 milestone Jan 16, 2021
@erikdesjardins erikdesjardins deleted the erikdesjardins:probeasm branch Jan 16, 2021
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 23, 2021
…iper

Target stack-probe support configurable finely

This adds capability to configure the target's stack probe support in a
more precise manner than just on/off. In particular now we allow
choosing between always inline-asm, always call or either one of those
depending on the LLVM version.

Note that this removes the ability to turn off the generation of the
stack-probe attribute. This is valid to replace it with inline-asm for all targets because
`probe-stack="inline-asm"` will not generate any machine code on targets
that do not currently support stack probes. This makes support for stack
probes on targets that don't have any right now automatic with LLVM
upgrades in the future.

(This is valid to do based on the fact that clang unconditionally sets
this attribute when `-fstack-clash-protection` is used, AFAICT)

cc rust-lang#77885
r? `@cuviper`
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 24, 2021
…iper

Target stack-probe support configurable finely

This adds capability to configure the target's stack probe support in a
more precise manner than just on/off. In particular now we allow
choosing between always inline-asm, always call or either one of those
depending on the LLVM version.

Note that this removes the ability to turn off the generation of the
stack-probe attribute. This is valid to replace it with inline-asm for all targets because
`probe-stack="inline-asm"` will not generate any machine code on targets
that do not currently support stack probes. This makes support for stack
probes on targets that don't have any right now automatic with LLVM
upgrades in the future.

(This is valid to do based on the fact that clang unconditionally sets
this attribute when `-fstack-clash-protection` is used, AFAICT)

cc rust-lang#77885
r? `@cuviper`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

10 participants