Skip to content
This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Fatal Exception: std::out_of_range vector, Regression in 4.0 beta #11538

Closed
parrots opened this issue Mar 27, 2018 · 17 comments
Closed

Fatal Exception: std::out_of_range vector, Regression in 4.0 beta #11538

parrots opened this issue Mar 27, 2018 · 17 comments
Labels
Android Mapbox Maps SDK for Android crash iOS Mapbox Maps SDK for iOS

Comments

@parrots
Copy link

parrots commented Mar 27, 2018

Platform: iOS
Mapbox SDK version: 4.0.0-beta2

This may be related to #10844, which was included and through to be fixed in the 4.0.0-beta 2 release, however running 4.0.0-beta2 shows it isn't fixed.

Because we can not dynamically hide/show annotations without adding and removing them, I'm left rapidly adding and removing annotations. At the same time I'm also changing the coordinate on some annotations. This all happens as the user interacts with a scrollview to browse their day and the paths they took on a ski hill.

Basically I'm causing a lot of redrawing, and sometimes that leads to crashes, more-so than ever with the 4.0.0 beta 2. This has skyrocketed to be my top crasher since deploying 4.0.0 beta (needed heat maps).

screen shot 2018-03-27 at 12 52 31 pm

Steps to trigger behavior

  1. POD Install
  2. Run the attached project on hardware, preferably an older phone (6s, for EX)
  3. It may take a minute, or a relaunch or two, to cause the crash. This seems like a race condition bug, hard to reproduce, but I have now reproduced it multiple times letting this sample project run for a minute or two on an iPhone 6s. If it doesn't trigger within 5 minutes I'd relaunch.
  4. Boom (eventually)

Expected behavior

No boom

Actual behavior

libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: vector.

Full crash log from production: CrashLog.txt

MapboxCrasher.zip

@friedbunny friedbunny added iOS Mapbox Maps SDK for iOS crash labels Mar 27, 2018
@friedbunny
Copy link
Contributor

friedbunny commented Mar 27, 2018

Thanks for the thorough report and reproduction, @parrots — here’s what the crash looks like with a fully-symbolicated debug build of our framework at 7f18b0b:

* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.2
    frame #0: 0x00000001162b4b86 libc++abi.dylib`__cxa_throw
    frame #1: 0x00000001151f3059 libc++.1.dylib`std::__1::__vector_base_common<true>::__throw_out_of_range() const + 71
    frame #2: 0x000000010dd7cf87 Mapbox`std::__1::vector<std::__1::shared_ptr<mbgl::AnnotationTileFeatureData const>, std::__1::allocator<std::__1::shared_ptr<mbgl::AnnotationTileFeatureData const> > >::at(this=0x000060c00069b420 size=1, __n=14) at vector:1513
    frame #3: 0x000000010dd7ca10 Mapbox`mbgl::AnnotationTileLayer::getFeature(this=0x000060400023fd60, i=14) const at annotation_tile.cpp:83
    frame #4: 0x000000010dda0672 Mapbox`mbgl::FeatureIndex::addFeature(this=0x0000608000347900, result=size=1, indexedFeature=0x00007fd6f421b1b0, queryGeometry=0x00007ffee2710508, options=0x00007ffee2713640, geometryTileData=0x0000608000247800, tileID=0x00007ffee271022c, layers=size=1, bearing=0, pixelsToTileUnits=16) const at feature_index.cpp:119
    frame #5: 0x000000010dd9fd0e Mapbox`mbgl::FeatureIndex::query(this=0x0000608000347900, result=size=1, queryGeometry=0x00007ffee2710508, bearing=0, tileSize=512, scale=1, queryOptions=0x00007ffee2713640, geometryTileData=0x0000608000247800, tileID=0x00007ffee2710228, sourceID="com.mapbox.annotations", layers=size=1, collisionIndex=0x00007fd6f043a9b0, additionalQueryRadius=0) const at feature_index.cpp:81
    frame #6: 0x000000010e532e72 Mapbox`mbgl::GeometryTile::queryRenderedFeatures(this=0x00007fd6f05215f0, result=size=1, queryGeometry=0x00007ffee2710508, transformState=0x00007fd6f2a077d0, layers=size=1, options=0x00007ffee2713640, collisionIndex=0x00007fd6f043a9b0) at geometry_tile.cpp:244
    frame #7: 0x000000010e13d6b4 Mapbox`mbgl::TilePyramid::queryRenderedFeatures(this=0x000060c000133948, geometry=0x00007ffee2712ff0, transformState=0x00007fd6f2a077d0, layers=size=1, options=0x00007ffee2713640, collisionIndex=0x00007fd6f043a9b0) const at tile_pyramid.cpp:284
    frame #8: 0x000000010dd8f91b Mapbox`mbgl::RenderAnnotationSource::queryRenderedFeatures(this=0x000060c000133920, geometry=0x00007ffee2712ff0, transformState=0x00007fd6f2a077d0, layers=size=1, options=0x00007ffee2713640, collisionIndex=0x00007fd6f043a9b0) const at render_annotation_source.cpp:69
    frame #9: 0x000000010e0bc956 Mapbox`mbgl::Renderer::Impl::queryRenderedFeatures(this=0x00007fd6f2a07750, geometry=0x00007ffee2712ff0, options=0x00007ffee2713640, layers=size=1) const at renderer_impl.cpp:686
    frame #10: 0x000000010e0ba04f Mapbox`mbgl::Renderer::Impl::queryRenderedFeatures(this=0x00007fd6f2a07750, geometry=0x00007ffee2712ff0, options=0x00007ffee2713640) const at renderer_impl.cpp:674
    frame #11: 0x000000010e09cce4 Mapbox`mbgl::Renderer::queryRenderedFeatures(this=0x000060000001a9b0, box=0x00007ffee27137c8, options=0x00007ffee2713640) const at renderer.cpp:45
    frame #12: 0x000000010e09d7a0 Mapbox`mbgl::Renderer::queryPointAnnotations(this=0x000060000001a9b0, box=0x00007ffee27137c8) const at renderer.cpp:60
    frame #13: 0x000000010dba63fb Mapbox`::-[MGLMapView annotationTagsInRect:](self=0x00007fd6f403e400, _cmd="annotationTagsInRect:", rect=(origin = (x = -750, y = -1624), size = (width = 1875, height = 4060))) at MGLMapView.mm:4226
    frame #14: 0x000000010db9b528 Mapbox`::-[MGLMapView visibleAnnotationsInRect:](self=0x00007fd6f403e400, _cmd="visibleAnnotationsInRect:", rect=(origin = (x = -750, y = -1624), size = (width = 1875, height = 4060))) at MGLMapView.mm:3576
    frame #15: 0x000000010dbb876d Mapbox`::-[MGLMapView updateAnnotationViews](self=0x00007fd6f403e400, _cmd="updateAnnotationViews") at MGLMapView.mm:5676
    frame #16: 0x000000010dbb7b8c Mapbox`::-[MGLMapView mapViewDidFinishRenderingFrameFullyRendered:](self=0x00007fd6f403e400, _cmd="mapViewDidFinishRenderingFrameFullyRendered:", fullyRendered=NO) at MGLMapView.mm:5607
    frame #17: 0x000000010dbc14cf Mapbox`MBGLView::onDidFinishRenderingFrame(this=0x00006000000591a0, mode=Partial) at MGLMapView.mm:6183
    frame #18: 0x000000010dbc17ef Mapbox`non-virtual thunk to MBGLView::onDidFinishRenderingFrame(this=0x00006000000591a0, mode=Partial) at MGLMapView.mm:0
    frame #19: 0x000000010de2737c Mapbox`mbgl::Map::Impl::onDidFinishRenderingFrame(this=0x00007fd6f0605210, renderMode=Partial, needsRepaint=true) at map.cpp:201
    frame #20: 0x000000010de2784e Mapbox`non-virtual thunk to mbgl::Map::Impl::onDidFinishRenderingFrame(this=0x00007fd6f0605210, renderMode=Partial, needsRepaint=true) at map.cpp:0
    frame #21: 0x000000010e0b4f54 Mapbox`mbgl::Renderer::Impl::render(this=0x00007fd6f2a07750, updateParameters=0x00007fd6f04194e8) at renderer_impl.cpp:644
    frame #22: 0x000000010e09ca78 Mapbox`mbgl::Renderer::render(this=0x000060000001a9b0, updateParameters=0x00007fd6f04194e8) at renderer.cpp:33
    frame #23: 0x000000010db75ef4 Mapbox`MGLRenderFrontend::render(this=0x0000604000462940) at MGLRendererFrontend.h:52
    frame #24: 0x000000010db75dce Mapbox`::-[MGLMapView glkView:drawInRect:](self=0x00007fd6f403e400, _cmd="glkView:drawInRect:", view=0x00007fd6f062d170, rect=(origin = (x = 0, y = 0), size = (width = 375, height = 812))) at MGLMapView.mm:971
    frame #25: 0x00000001118aaf73 GLKit`-[GLKView _display:] + 310
    frame #26: 0x000000010db773c0 Mapbox`::-[MGLMapView updateFromDisplayLink](self=0x00007fd6f403e400, _cmd="updateFromDisplayLink") at MGLMapView.mm:1104
    frame #27: 0x00000001156a32b8 QuartzCore`CA::Display::DisplayLink::dispatch_items(unsigned long long, unsigned long long, unsigned long long) + 684
    frame #28: 0x00000001157e3894 QuartzCore`display_timer_callback(__CFMachPort*, void*, long, void*) + 248
    frame #29: 0x0000000110ae7899 CoreFoundation`__CFMachPortPerform + 169
    frame #30: 0x0000000110ae77d9 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 41
    frame #31: 0x0000000110ae7741 CoreFoundation`__CFRunLoopDoSource1 + 465
    frame #32: 0x0000000110adf524 CoreFoundation`__CFRunLoopRun + 2596
    frame #33: 0x0000000110ade889 CoreFoundation`CFRunLoopRunSpecific + 409
    frame #34: 0x000000011793b9c6 GraphicsServices`GSEventRunModal + 62
    frame #35: 0x00000001119cc5d6 UIKit`UIApplicationMain + 159
    frame #36: 0x000000010d515b47 MapboxCrasher`main at AppDelegate.swift:12
    frame #37: 0x000000011642ad81 libdyld.dylib`start + 1

/cc @jfirebaugh @ChrisLoer

@ChrisLoer
Copy link
Contributor

Yeah, thanks for the great report @parrots!

The crash is happening because a layer in the tile's AnnotationTileData has fewer features than the index of a feature for that layer in the accompanying FeatureIndex. A simple workaround to prevent the crash would be to just check the bounds in AnnotationTileLayer::getFeature and skip the feature if the index isn't valid -- however, that would just hide the underlying data problem.

GeometryTile::commitFeatureIndex is responsible for keeping the FeatureIndex for a GeometryTile in sync with the GeometryTileData that was used to create the index (in this case, the GeometryTileData is an AnnotationTileData):

void GeometryTile::commitFeatureIndex() {
if (pendingFeatureIndex) {
featureIndex = std::move(pendingFeatureIndex);
}
if (pendingData) {
data = std::move(pendingData);
}
}
)

It seems like somehow we're ending up in a situation where the FeatureIndex doesn't match the stored data. I'm looking at annotation_tile.cpp and I see at least one concerning thing -- the clone() method just uses the default copy constructor:

std::unique_ptr<GeometryTileData> AnnotationTileData::clone() const {
return std::make_unique<AnnotationTileData>(*this);
}

And the default copy constructor will end up copying shared pointers:

std::unordered_map<std::string, std::shared_ptr<AnnotationTileLayerData>> layers;

So we get in a weird situation where AnnotationTileLayerData ownership is shared between the background GeometryTileWorker and the foreground GeometryTile, which doesn't really seem like the intent of:

parent.invoke(&GeometryTile::onLayout, GeometryTile::LayoutResult {
std::move(buckets),
std::move(featureIndex),
*data ? (*data)->clone() : nullptr,
}, correlationID);

That said, I haven't actually found a bug in the logic yet: it doesn't look like we modify any of the AnnotationTileLayerData objects after we create them.

I'll keep investigating tomorrow.

ChrisLoer added a commit that referenced this issue Mar 28, 2018
First half of fix for issue #11538.

Testing `if (pendingData)` didn't work if there _was_ a data update, but the update was "the data for this tile is now null". In that case, the tile's FeatureIndex would be updated, but the tile's data member would remain unchanged.

In the case of a tile's data being deleted, the matching FeatureIndex would have an empty set of bucketLayerIDs, but the _old_ data would still be in place for querying, and the symbolBuckets might not be updated yet (pending onPlacement). In this case `bucketLayerIDs.at(indexedFeature.bucketName)` could throw an out-of-range exception.
ChrisLoer added a commit that referenced this issue Mar 28, 2018
Second half of fix for issue #11538.

If a global placement took place between the time a tile's non-symbol layout updated and the time new symbol buckets arrived, the tile's new FeatureIndex would be committed, but the global CollisionIndex would be generated against the old symbolBuckets. The mismatch could cause the CollisionIndex to contain indices that didn't exist in the new FeatureIndex, and attempting to access them would generate an out-of-bounds exception.
@ChrisLoer
Copy link
Contributor

I believe I've found the cause and have two proposed fixes in #11551. In the end, the broken logic I found was entirely in the shared GeometryTile logic. The only thing that made annotations special was that adding and removing annotations rapidly stress tests symbol querying against tiles whose feature set is frequently changing. This bug might also surface in an app that did symbol querying against a rapidly changing GeoJSON source (or even against plain vector tiles, but it would take very bad luck for a symbol query against a tile to happen at almost the exact moment it loaded a new version from the server).

ChrisLoer added a commit that referenced this issue Mar 29, 2018
First half of fix for issue #11538.

Testing `if (pendingData)` didn't work if there _was_ a data update, but the update was "the data for this tile is now null". In that case, the tile's FeatureIndex would be updated, but the tile's data member would remain unchanged.

In the case of a tile's data being deleted, the matching FeatureIndex would have an empty set of bucketLayerIDs, but the _old_ data would still be in place for querying, and the symbolBuckets might not be updated yet (pending onPlacement). In this case `bucketLayerIDs.at(indexedFeature.bucketName)` could throw an out-of-range exception.
ChrisLoer added a commit that referenced this issue Mar 29, 2018
Second half of fix for issue #11538.

If a global placement took place between the time a tile's non-symbol layout updated and the time new symbol buckets arrived, the tile's new FeatureIndex would be committed, but the global CollisionIndex would be generated against the old symbolBuckets. The mismatch could cause the CollisionIndex to contain indices that didn't exist in the new FeatureIndex, and attempting to access them would generate an out-of-bounds exception.
@ChrisLoer
Copy link
Contributor

This should be fixed in release-boba with #11551. However since I had trouble reproducing, it's definitely possible that there's another crash that happens to happen around the same time and with a similar signature. @parrots, any chance you can make a custom build of the framework off the current head of release-boba and test it with your reproduction case to make sure it resolves the problem?

@ChrisLoer ChrisLoer added the Android Mapbox Maps SDK for Android label Mar 29, 2018
@friedbunny friedbunny added this to the ios-v4.0.0 milestone Mar 29, 2018
@friedbunny
Copy link
Contributor

@ChrisLoer I’m still able to exactly reproduce #11538 (comment) after #11551. I’ll work on bringing the reproduction into our project and push a branch, so it’s less work to reproduce/iterate.

@friedbunny friedbunny reopened this Mar 29, 2018
@ChrisLoer
Copy link
Contributor

Darn. 😞 OK back to the drawing board.

@friedbunny
Copy link
Contributor

Interesting wrinkle: this crash appears to require the presence of annotation views — commenting out -[MGLMapViewDelegate mapView:viewForAnnotation:] (effectively converting the MGLAnnotationView to MGLAnnotationImage) makes this crash go away.

@julianrex
Copy link
Contributor

julianrex commented Mar 29, 2018

Side note & wrinkle to your wrinkle: If I let @parrots' test run and run (after commenting out mapView:viewForAnnotation:) I eventually see a crash in GeometryTile::GeometryTile in ThreadLocal<T>::get() - with an invalid impl.

Edit: This is with excessive printf logging. Without the logging, I don't see a crash.

@ChrisLoer
Copy link
Contributor

So many wrinkles! @julianrex Can you post a symbolicated call stack from the ThreadLocal<T>::get() crash?

friedbunny added a commit that referenced this issue Mar 30, 2018
@friedbunny
Copy link
Contributor

@ChrisLoer check out fb-11538-reproduction for a version of iosapp that integrates the original reproduction. ReproductionViewController.swift loads automatically, but you can also access it from the folder toolbar button in the main screen of iosapp.

@ChrisLoer
Copy link
Contributor

Awesome, I can reproduce pretty easily running this version in the debugger.

ChrisLoer added a commit that referenced this issue Mar 30, 2018
Follow up fix for issue #11538.
This closes a hole in the logic where:
- Layout result 1 arrived
- Symbols arrives for result 1, it was marked ready for commit
- Layout result 2 arrived, clearing "ready for commit" state
- Global placement ran on symbols for result 1, but layout result 1 data wasn't committed because it had been replaced with layout result 2.
ChrisLoer added a commit that referenced this issue Mar 30, 2018
Modest simplification refactoring (issue #10457).
Also, fixes issue #11538, which was caused in part by a hole in the vestigial two-phase loading.
ChrisLoer added a commit that referenced this issue Apr 2, 2018
Modest simplification refactoring (issue #10457).
Also, fixes issue #11538, which was caused in part by a hole in the vestigial two-phase loading.
@ChrisLoer
Copy link
Contributor

Fixed again with #11575. @parrots we really owe you for the reproduction case you put together -- it turns out to be a great stress-tester for a part of our code with complex timing-dependent logic.

The additional problem was a hole in the logic where:

  • Layout result 1 arrived
  • Symbols arrives for result 1, it was marked ready for commit
  • Layout result 2 arrived, clearing "ready for commit" state
  • Global placement ran on symbols for result 1, but layout result 1 data wasn't committed because it had been replaced with layout result 2.

I originally prepared #11570 as a patch for the race condition, but we decided instead to pursue a larger fix that hopefully simplifies the logic in a way that will make it easier to prevent this kind of problem in the future.

@ghost
Copy link

ghost commented Apr 23, 2018

Not fixed.

Platform: iOS
Mapbox SDK version: 4.0.0

Still getting error:

libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: vector

While rapidly removing and adding multiple annotation views.
Same code running on 3.7.0 has no problems.

@friedbunny
Copy link
Contributor

@l2comdev That’s a bummer — please post any stack traces and reproduction code that you can provide.

@parrots
Copy link
Author

parrots commented Apr 23, 2018

screen shot 2018-04-23 at 5 50 08 pm

Can confirm I'm still seeing them too, but it is at a _much_ lower rate than pre-4.0. (note the small blips in jan / feb -- I've always gotten 1 or 2 of these a month before one release made it really bad)

@ghost
Copy link

ghost commented May 17, 2018

Just updated to 4.0.1, its rare but I'd say after 5 minutes of map panning/zooming I can generate the crash.

libc++abi.dylib: terminating with uncaught exception of type std::out_of_range: vector

UIKitUIApplicationMain:
0x18e0116a0 <+0>: stp x24, x23, [sp, #-0x40]!
0x18e0116a4 <+4>: stp x22, x21, [sp, #0x10]
0x18e0116a8 <+8>: stp x20, x19, [sp, #0x20]
0x18e0116ac <+12>: stp x29, x30, [sp, #0x30]
0x18e0116b0 <+16>: add x29, sp, #0x30 ; =0x30
0x18e0116b4 <+20>: mov x20, x3
0x18e0116b8 <+24>: mov x21, x1
0x18e0116bc <+28>: mov x22, x0
0x18e0116c0 <+32>: mov x0, x2
0x18e0116c4 <+36>: bl 0x18acc032c
0x18e0116c8 <+40>: mov x19, x0
0x18e0116cc <+44>: mov x0, x20
0x18e0116d0 <+48>: bl 0x18acc032c
0x18e0116d4 <+52>: mov x20, x0
0x18e0116d8 <+56>: mov w0, #0x168
0x18e0116dc <+60>: movk w0, #0x2b87, lsl #16
0x18e0116e0 <+64>: mov w1, #0x32
0x18e0116e4 <+68>: mov x2, #0x0
0x18e0116e8 <+72>: mov x3, #0x0
0x18e0116ec <+76>: mov x4, #0x0
0x18e0116f0 <+80>: bl 0x18acc0d4c
0x18e0116f4 <+84>: orr w0, wzr, #0x3
0x18e0116f8 <+88>: orr w1, wzr, #0x3
0x18e0116fc <+92>: mov x2, #-0x1
0x18e011700 <+96>: orr x4, xzr, #0x8000000000000000
0x18e011704 <+100>: mov w3, #0x0
0x18e011708 <+104>: bl 0x18acc1db8
0x18e01170c <+108>: adrp x23, 165009
0x18e011710 <+112>: add x23, x23, #0x1c8 ; =0x1c8
0x18e011714 <+116>: ldr w8, [x23]
0x18e011718 <+120>: cbnz w8, 0x18e011730 ; <+144>
0x18e01171c <+124>: adrp x8, 165006
0x18e011720 <+128>: ldr x8, [x8, #0x848]
0x18e011724 <+132>: cmn x8, #0x1 ; =0x1
0x18e011728 <+136>: b.ne 0x18e0117b4 ; <+276>
0x18e01172c <+140>: ldr w8, [x23]
0x18e011730 <+144>: lsr w8, w8, #8
0x18e011734 <+148>: cmp w8, #0x201 ; =0x201
0x18e011738 <+152>: b.lo 0x18e011764 ; <+196>
0x18e01173c <+156>: bl 0x18acc0354
0x18e011740 <+160>: mov x23, x0
0x18e011744 <+164>: mov x0, x22
0x18e011748 <+168>: mov x1, x21
0x18e01174c <+172>: mov x2, x19
0x18e011750 <+176>: mov x3, x20
0x18e011754 <+180>: bl 0x18e11c39c ; _UIApplicationMainPreparations
0x18e011758 <+184>: mov x0, x23
0x18e01175c <+188>: bl 0x18ea7f8a4 ; symbol stub for: -[UIWKSelectionView touchChanged:forHandle:]
0x18e011760 <+192>: b 0x18e011778 ; <+216>
0x18e011764 <+196>: mov x0, x22
0x18e011768 <+200>: mov x1, x21
0x18e01176c <+204>: mov x2, x19
0x18e011770 <+208>: mov x3, x20
0x18e011774 <+212>: bl 0x18e11c39c ; _UIApplicationMainPreparations
0x18e011778 <+216>: adrp x8, 165009
0x18e01177c <+220>: ldr x0, [x8, #0x1c0]
0x18e011780 <+224>: adrp x8, 159652
0x18e011784 <+228>: ldr x1, [x8, #0xb20]
0x18e011788 <+232>: bl 0x18acc02fc
0x18e01178c <+236>: mov x0, x20
0x18e011790 <+240>: bl 0x18acc0328
0x18e011794 <+244>: mov x0, x19
0x18e011798 <+248>: bl 0x18acc0328
0x18e01179c <+252>: mov w0, #0x0
0x18e0117a0 <+256>: ldp x29, x30, [sp, #0x30]
0x18e0117a4 <+260>: ldp x20, x19, [sp, #0x20]
0x18e0117a8 <+264>: ldp x22, x21, [sp, #0x10]
0x18e0117ac <+268>: ldp x24, x23, [sp], #0x40
0x18e0117b0 <+272>: ret
0x18e0117b4 <+276>: adrp x0, 165006
0x18e0117b8 <+280>: add x0, x0, #0x848 ; =0x848
0x18e0117bc <+284>: adrp x1, 137848
0x18e0117c0 <+288>: add x1, x1, #0x648 ; =0x648
0x18e0117c4 <+292>: bl 0x18ea7f31c ; symbol stub for: -[UILayoutGuide _stashedLayoutVariableObservations]
0x18e0117c8 <+296>: b 0x18e01172c ; <+140>`

0x18e01178c <+236>: mov x0, x20 Thread 1: signal SIGABRT

@ghost
Copy link

ghost commented May 22, 2018

Updated to ios-v4.1.0-alpha.1 - No crashes since!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Android Mapbox Maps SDK for Android crash iOS Mapbox Maps SDK for iOS
Projects
None yet
Development

No branches or pull requests

4 participants