
Allocation Profiler: Types for all allocations #50333

Closed
wants to merge 4 commits

Conversation


@NHDaly NHDaly commented Jun 28, 2023

Before this PR, we were missing the types for allocations in two cases:

  1. allocations from codegen
  2. allocations in gc_managed_realloc_

The second one is easy: those are always used for buffers, right?

For the first one: this PR adds a new exported Julia function, which codegen will call after every allocation, to record the allocation and set its type.

Based on the approach from #33467.


An example of the generated code:

  %ptls_field6 = getelementptr inbounds {}**, {}*** %4, i64 2
  %13 = bitcast {}*** %ptls_field6 to i8**
  %ptls_load78 = load i8*, i8** %13, align 8
  %box = call noalias nonnull dereferenceable(32) {}* @ijl_gc_pool_alloc(i8* %ptls_load78, i32 1184, i32 32) #7
  %14 = bitcast {}* %box to i64*
  %15 = getelementptr inbounds i64, i64* %14, i64 -1
  store atomic i64 4335994192, i64* %15 unordered, align 8
  call void @ijl_maybe_record_alloc_to_profile({}* nonnull %box, i32 16, i64 4335994192)

(Note that we now report the true size of the allocated object, rather than its pool osize, when recording an allocation. Maybe that's not actually desirable? We could instead do the recording separately in the pool-alloc and big-alloc cases to keep the more precise size.)

@NHDaly NHDaly marked this pull request as draft June 28, 2023 19:58

NHDaly commented Jun 28, 2023

Current error:

julia> using Profile
Precompiling Profile
  ✗ Profile
  0 dependencies successfully precompiled in 2 seconds
[ Info: Precompiling Profile [9abbd945-dff8-562f-b5e8-e1ebf5ef1b79]
Call parameter type does not match function signature!
  %box = call noalias nonnull dereferenceable(48) {} addrspace(10)* @ijl_gc_pool_alloc(i8* %34, i32 1232, i32 48) #22
 i64  call void @ijl_maybe_record_alloc_to_profile({} addrspace(10)* %box, i32 40, {}* %30)
in function jfptr_parse_flat_910
LLVM ERROR: Broken function found, compilation aborted!


NHDaly commented Jun 28, 2023

Another issue: this adds an extra function call. (Thanks for pointing that out, @vilterp.) It seems better to keep things the way they were before (where the maybe_record was inlined) and just pass the type through to the jl_gc_pool_alloc call.

Do we know why setting the type tag is done in the generated code, rather than in the C alloc functions?
It seems unnecessarily complicated. Why can't the alloc function just be something like:

JL_DLLEXPORT jl_value_t *jl_gc_pool_alloc_typed(jl_ptls_t ptls, int pool_offset,
                                                int osize, jl_value_t *type)
{
    jl_value_t *val = jl_gc_pool_alloc_inner(ptls, pool_offset, osize);
    jl_set_typeof(val, type);
    maybe_record_alloc_to_profile(val, osize, (jl_datatype_t *)type);
    return val;
}

Is there a reason that things are architected the way they currently are, or is it just historical, and could be changed?

@vchuravy

So I don't like adding this into the generated code. It is another unconditional call (unlike in the runtime, where we are eliding them); I would prefer if we passed the type to the allocation function, which would certainly be less overhead.

It's a bit weird, then, that we do the type-tag setting not in the allocation function but in the generated code. I don't know the actual reason for that behavior.

It might be nice to hide that write since it's technically in "unobservable memory".


NHDaly commented Jun 28, 2023

Here's the other option: moving the type setting into the allocation function:
#50337


NHDaly commented Jun 28, 2023

K, fixed the above issue here too, by adding `builder.CreatePtrToInt(tag, T_size)`.

I still think I prefer the approach in #50337, assuming that it isn't missing any issues, since it's cheaper and I think simpler.

@NHDaly NHDaly requested a review from vchuravy June 28, 2023 22:01

NHDaly commented Jun 29, 2023

> So I don't like adding this into the generated code. It is another unconditional call (unlike in the runtime, where we are eliding them); I would prefer if we passed the type to the allocation function, which would certainly be less overhead.
>
> It's a bit weird, then, that we do the type-tag setting not in the allocation function but in the generated code. I don't know the actual reason for that behavior.
>
> It might be nice to hide that write since it's technically in "unobservable memory".

Oh, thanks Valentin. I missed your reply! We must have cross-posted.

Yes, I agree with you! It seems so much more straightforward to pass the type through! So that's what I did in the other PR.

... I also don't see why we wouldn't just do the actual tag setting there too. I'm curious to understand more about what you mean by "hide the write." What does that mean?

Also, I wish the odd choices like this were documented to explain WHY they are the way they are. :(


NHDaly commented Jun 30, 2023

Closing this PR in favor of #50337.

@NHDaly NHDaly closed this Jun 30, 2023
@NHDaly NHDaly deleted the nhd-alloc-profiler-types branch June 30, 2023 15:50