Query 18 most likely dies because our source dataset is unevenly partitioned. Some files take about 50 MB in memory and others about 380 MB. The latter is relatively big for our small machines (8 GB of RAM). Our strategy of combining multiple partitions when we drop columns makes this worse: we end up combining a few large ones, which makes them even bigger.
I don't know exactly how we want to proceed, but the varying partition sizes are probably not very good for what we want to do here.
Edit: This is not compression-related; the difference scales down to the compressed file sizes as well.
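To make the size concern concrete, here is a rough back-of-the-envelope sketch. It is not the actual combining logic: the helper `combine_adjacent`, the factor of 4, and the assumption that adjacent partitions get merged are all hypothetical. Only the 50 MB and 380 MB in-memory sizes and the 8 GB of RAM come from the comment above.

```python
def combine_adjacent(sizes_mb, factor):
    """Merge every `factor` adjacent partitions and return the merged sizes in MB.

    Hypothetical stand-in for the real combining strategy.
    """
    return [sum(sizes_mb[i:i + factor]) for i in range(0, len(sizes_mb), factor)]


# Assumed layout: a run of small files followed by a run of large files.
sizes_mb = [50, 50, 50, 50, 380, 380, 380, 380]

merged = combine_adjacent(sizes_mb, factor=4)
print(merged)  # [200, 1520]

# Share of an 8 GB worker taken by the largest merged partition alone,
# before any intermediate copies made by joins or aggregations.
worker_ram_mb = 8 * 1024
print(f"largest merged partition: {max(merged) / worker_ram_mb:.0%} of worker RAM")
```

Under these assumptions, a single merged partition built from the large files is several times bigger than one built from the small files, so a fixed combine factor hits some workers much harder than others.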
At scale 1000, all of these queries have workers getting restarted after running out of memory. We should investigate the cause and see if we're missing optimizations, have chosen a poor join order, or whether there are any other issues with these queries.
query_9
query_11
query_13
query_17
query_18
query_19
Note that query_21 is excluded from this list due to [TPC-H] Query 21 times out at scale 100 #1362.