Skip to content

Commit

Permalink
hot patch for win64 alignment error segfault
Browse files Browse the repository at this point in the history
this makes the taggedvalue struct a multiple of 16-bytes (on 64-bit
platforms) so that the 16-byte gc-alignment is maintained, until we
can fix the gc to hand out better aligned pointers
  • Loading branch information
vtjnash committed Mar 21, 2015
1 parent 7d433e8 commit 0d8cec3
Show file tree
Hide file tree
Showing 2 changed files with 22 additions and 21 deletions.
30 changes: 10 additions & 20 deletions src/gc.c
Original file line number Diff line number Diff line change
Expand Up @@ -2318,7 +2318,7 @@ void *reallocb(void *b, size_t sz)

DLLEXPORT jl_value_t *allocobj(size_t sz)
{
sz += sizeof(void*);
sz += sizeof(jl_taggedvalue_t);
#ifdef MEMDEBUG
return jl_valueof(alloc_big(sz));
#endif
Expand All @@ -2330,39 +2330,29 @@ DLLEXPORT jl_value_t *allocobj(size_t sz)

DLLEXPORT jl_value_t *alloc_1w(void)
{
const int sz = sizeof(jl_taggedvalue_t) + sizeof(void*);
#ifdef MEMDEBUG
return jl_valueof(alloc_big(2*sizeof(void*)));
#endif
#ifdef _P64
return jl_valueof(_pool_alloc(&pools[2], 2*sizeof(void*)));
#else
return jl_valueof(_pool_alloc(&pools[0], 2*sizeof(void*)));
return jl_valueof(alloc_big(sz));
#endif
return jl_valueof(_pool_alloc(&pools[szclass(sz)], sz));
}

DLLEXPORT jl_value_t *alloc_2w(void)
{
const int sz = sizeof(jl_taggedvalue_t) + sizeof(void*) * 2;
#ifdef MEMDEBUG
return jl_valueof(alloc_big(3*sizeof(void*)));
#endif
#ifdef _P64
return jl_valueof(_pool_alloc(&pools[4], 3*sizeof(void*)));
#else
return jl_valueof(_pool_alloc(&pools[1], 3*sizeof(void*)));
return jl_valueof(alloc_big(sz));
#endif

return jl_valueof(_pool_alloc(&pools[szclass(sz)], sz));
}

DLLEXPORT jl_value_t *alloc_3w(void)
{
const int sz = sizeof(jl_taggedvalue_t) + sizeof(void*) * 3;
#ifdef MEMDEBUG
return jl_valueof(alloc_big(4*sizeof(void*)));
#endif
#ifdef _P64
return jl_valueof(_pool_alloc(&pools[6], 4*sizeof(void*)));
#else
return jl_valueof(pool_alloc(&pools[2]));
return jl_valueof(alloc_big(sz));
#endif
return jl_valueof(_pool_alloc(&pools[szclass(sz)], sz));
}

#ifdef GC_FINAL_STATS
Expand Down
13 changes: 12 additions & 1 deletion src/julia.h
Original file line number Diff line number Diff line change
Expand Up @@ -80,14 +80,25 @@ typedef struct {
#endif
};
};
#ifdef _P64
uintptr_t realign16;
#endif
jl_value_t value[0];
} jl_taggedvalue_t;

#define jl_astaggedvalue__MACRO(v) container_of((v),jl_taggedvalue_t,value)
#define jl_typeof__MACRO(v) ((jl_value_t*)(jl_astaggedvalue__MACRO(v)->type_bits&~(size_t)3))
#define jl_astaggedvalue jl_astaggedvalue__MACRO
#define jl_typeof jl_typeof__MACRO
#define jl_set_typeof(v,t) (jl_astaggedvalue(v)->type = (jl_value_t*)(t))
//#define jl_set_typeof(v,t) (jl_astaggedvalue(v)->type = (jl_value_t*)(t))
static inline void jl_set_typeof(void *v, void *t)
{
jl_taggedvalue_t *tag = jl_astaggedvalue(v);
#ifdef _P64
tag->realign16 = 0xA1164A1164A11640ull;
#endif
tag->type = (jl_value_t*)t;
}
#define jl_typeis(v,t) (jl_typeof(v)==(jl_value_t*)(t))

typedef struct _jl_sym_t {
Expand Down

7 comments on commit 0d8cec3

@carnaval
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vtjnash
Heard about this yesterday. What are the precise alignment requirements ?
Is the 16-byte thing only for BLAS due to SSE constraints ?
If I'm not mistaken the GC will hand out 16-aligned addresses as long as the pool's object size is a multiple of 16 (since pool pages are 16k aligned). The GC itself only needs 4-byte alignment to use the 2 low bits of the type tag.

@vtjnash
Copy link
Member Author

@vtjnash vtjnash commented on 0d8cec3 Apr 2, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gc was recently changed to offset the values it hands out by the size of a jl_taggedtype_t. In some cases, the compiler will assume that a struct is aligned to some larger boundary (up to some platform-defined limit -- on most modern platforms that limit being 16-bytes) and emit vmovsd or similar instructions accordingly

@carnaval
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When compiling the C runtime or julia code ?
I'm not sure I understand why it only happens on windows.
Anyway, if we can't explain this 8 mod 16 alignment to the compiler we could still offset it in the gc. It would only waste a page (the first one) of committed memory but for some reason it feels wrong to me.

@vtjnash
Copy link
Member Author

@vtjnash vtjnash commented on 0d8cec3 Apr 3, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the jmp_buf for windows needs to save a lot more registers, including MMX. in linux, it only needs long int alignment to save a few registers. on darwin, it only demands int alignment.

this doesn't seem like it should waste a full page, just 16 - (sizeof(jl_taggedvalue_t) % 16) bytes at the beginning

@carnaval
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this also waste a full page because of data at the end ? For e.g. a lot of 8 x 2k objects (gc) pages you'd either have to restrict every one of those to 7 objects or waste (most of) the last (vm) page right ?
Could we just treat the jmp_buf as a special case ?

@vtjnash
Copy link
Member Author

@vtjnash vtjnash commented on 0d8cec3 Apr 3, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want to shrink the objects by (2048-sizeof(jl_taggedvalue_t)) to accomodate

Could we just treat the jmp_buf as a special case ?

not really, since we want to be able to use avx/altivec instructions on any arbitrary type, such as 4xFloat32 tuples

@JeffBezanson
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not using vector type ABIs yet, so other types aren't a concern. So is it possible just to align jmp_bufs?

Please sign in to comment.