Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++][Acero] Use ASAN to replace current temp stack memory poisoning #41460

Closed
zanmato1984 opened this issue Apr 30, 2024 · 1 comment
Closed

Comments

@zanmato1984
Copy link
Collaborator

zanmato1984 commented Apr 30, 2024

Describe the enhancement requested

From the discussion at #41335 (comment), the arrow::util::TempVectorStack memory initialization (by filling 0xFFs [1]) seems not very useful in the sense that:

  1. If it is supposed to be work with the guards [2] at the edge of each stack frame, the guards are only checked in debug mode (note it is using ARROW_DCHECK) [3], making filling 0xFFs unnecessary in non-debug mode. And the debug-only guarding itself can fully leverage more decent memory poisoning strategy such as ASAN [4].
  2. If it is for non-debug mode too, then it might be unnecessary because the stack users are not assuming the stack memory to be initially filled with any particular content.

We may want to refine this by removing unnecessary memory initialization and/or using ASAN's manual memory poisoning.

[1]

std::memset(buffer->mutable_data(), 0xFF, size);

[2]
static constexpr uint64_t kGuard1 = 0x3141592653589793ULL;
static constexpr uint64_t kGuard2 = 0x0577215664901532ULL;

[3]
ARROW_DCHECK(reinterpret_cast<const uint64_t*>(buffer_->mutable_data() + top_)[-1] ==
kGuard2);
ARROW_DCHECK(top_ >= size);
top_ -= size;
ARROW_DCHECK(reinterpret_cast<const uint64_t*>(buffer_->mutable_data() + top_)[0] ==
kGuard1);

[4] https://github.com/google/sanitizers/wiki/AddressSanitizerManualPoisoning

Component(s)

C++

felipecrv pushed a commit that referenced this issue May 18, 2024
### Rationale for this change

See #41460. And reduce the overhead of current manual poisoning (filling the entire stack space with `0xFF`s) that happens even in release mode.

### What changes are included in this PR?

Use ASAN API to replace the current manual poisoning of the temp vector stack memory.

### Are these changes tested?

Wanted to add cases to assert that ASAN poison/unpoison is functioning. However I found it too tricky to catch an ASAN error because ASAN directly uses signals that are hard to interoperate in C++/gtest. So I just manually checked poisoning is working in my local, e.g. by intentionally not unpoisoning the allocated buffer and seeing ASAN unhappy.

Just leveraging existing cases that use temp stack such as acero tests, which should cover this change well.

### Are there any user-facing changes?

None.

* GitHub Issue: #41460

Authored-by: Ruoxi Sun <[email protected]>
Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
@felipecrv felipecrv added this to the 17.0.0 milestone May 18, 2024
@felipecrv
Copy link
Contributor

Issue resolved by pull request 41695
#41695

vibhatha pushed a commit to vibhatha/arrow that referenced this issue May 25, 2024
…ache#41695)

### Rationale for this change

See apache#41460. And reduce the overhead of current manual poisoning (filling the entire stack space with `0xFF`s) that happens even in release mode.

### What changes are included in this PR?

Use ASAN API to replace the current manual poisoning of the temp vector stack memory.

### Are these changes tested?

Wanted to add cases to assert that ASAN poison/unpoison is functioning. However I found it too tricky to catch an ASAN error because ASAN directly uses signals that are hard to interoperate in C++/gtest. So I just manually checked poisoning is working in my local, e.g. by intentionally not unpoisoning the allocated buffer and seeing ASAN unhappy.

Just leveraging existing cases that use temp stack such as acero tests, which should cover this change well.

### Are there any user-facing changes?

None.

* GitHub Issue: apache#41460

Authored-by: Ruoxi Sun <[email protected]>
Signed-off-by: Felipe Oliveira Carvalho <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants