Batching API for sensor readings #171
Batching API shall be used for collecting sensor readings that are lost between `onchange` calls when the sensor polling frequency is higher than the current frame rate. This API is already a subtopic of #98, but being itself a rather big item it deserves a dedicated issue, so please use this issue to discuss the use cases and proposals for a sensor readings batch API.

Comments
I would like to propose the following approach; example usage:
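(A sketch only: `getReadingSize()` and `buffer` follow the names used later in this thread; everything else, including `consume`, is illustrative.)

```js
// Sketch: the consumer brings their own typed array and the sensor
// fills it with the readings recorded since the last notification.
let gyro = new Gyroscope({ frequency: 120 });
// Room for up to 8 readings; each reading occupies getReadingSize() slots.
let buffer = new Float64Array(gyro.getReadingSize() * 8);
gyro.buffer = buffer;
gyro.onreading = () => {
  consume(buffer); // hypothetical consumer function
};
gyro.start();
```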
Also, I would drop the `onchange` event for motion sensors, as it does not bring much benefit there; on the other hand, it is called too often, draining CPU and battery. |
Thanks for getting us started on this key conversation.
We want to make sure the API design doesn't force the implementation to preemptively store all intermediary samples just in case. I think that's what you're suggesting here, but that's unclear from the API name alone. It's something we need to express clearly through the API names. Worth checking other BYOB (bring-your-own-buffer) APIs on the Web for ideas here. What the use cases point towards so far is an API where the consumer can bring their own buffer and have it filled with the readings they need.
That requires removing them from the internal store. Another option would be to clearly distinguish between the two behaviors. |
I thought that if the actual number of readings exceeds the given buffer capacity, the older records would be dropped and newer ones still added, so that the buffer always contains the latest recorded readings. |
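What's described here is essentially a circular (ring) buffer. A minimal sketch of those semantics, purely for illustration:

```js
// Ring buffer: once capacity is reached, the oldest reading is
// overwritten, so the newest N readings are always retained.
class RingBuffer {
  constructor(capacity) {
    this.data = new Array(capacity);
    this.capacity = capacity;
    this.next = 0; // index where the next reading is written
    this.size = 0; // number of readings currently stored
  }
  push(reading) {
    this.data[this.next] = reading;
    this.next = (this.next + 1) % this.capacity;
    this.size = Math.min(this.size + 1, this.capacity);
  }
  latest() {
    // Returns the stored readings, oldest to newest.
    const out = [];
    for (let i = 0; i < this.size; i++) {
      out.push(this.data[(this.next - this.size + i + 2 * this.capacity) % this.capacity]);
    }
    return out;
  }
}
```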
That's not what I meant. Let me try and clarify. The API name suggested above gives the impression you'll be able to fill up the buffer with existing readings pre-stored somewhere and read from it in the same event turn.
That said, that wouldn't meet the Navisens use case for which dropping samples is a deal breaker. |
Agreed, this use case won't be met: a long enough array could solve the problem for samples reported while an animation frame is being painted, but some samples could still be lost in between notifications. To meet this use case we need uninterrupted recording all the time; note that samples can be lost even during the notification that the given buffer is full and while refreshing the buffer. |
Would you write to the buffer directly as you get the samples from the HAL (or equivalent), or would you write to it with each animation frame? If it's the former, we indeed need a buffer swapping mechanism of some sort. If it's the latter, the buffer can be swapped synchronously by the API consumer during rAF (or right before? I'm not sure about the particulars of this) if they're advised the buffer is about to be filled, no?

```js
let gyro = new Gyroscope();
const n = 4; // number of readings each buffer can hold
// Note: typed arrays must be constructed with `new`.
let bufferA = new Float64Array(gyro.getReadingSize() * n);
let bufferB = new Float64Array(gyro.getReadingSize() * n);
gyro.buffer = bufferA;
gyro.onbufferfull = _ => {
  // Swap to the spare buffer so recording continues uninterrupted.
  gyro.buffer = (gyro.buffer === bufferA) ? bufferB : bufferA;
};
```
|
I was planning to have an internal data structure that is filled periodically (using a timer) with the N latest readings, starting from the moment the sensor is started. In Chromium, sensor data is fetched from the underlying system in one process and then put into a shared memory buffer; JS runs in a different process. Considering that the JS engine can arbitrarily pause or defer script execution, I doubt we can completely avoid losing samples, whatever API we invent. |
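On the implementation side that could look something like this (JS used purely for illustration; in Chromium the structure would live outside the JS process, and `platformSource.drain()` is a made-up stand-in for reading the shared memory buffer):

```js
// A timer periodically drains newly arrived platform-side samples
// into a structure keeping only the N latest readings.
const N = 16;
const ring = new RingBuffer(N); // see the ring-buffer sketch above
const timer = setInterval(() => {
  for (const reading of platformSource.drain()) {
    ring.push(reading);
  }
}, 1000 / 120); // poll at roughly the sensor frequency (120 Hz here)
```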
That seems perfectly reasonable. Would the buffer get overwritten each turn? Or just used once? Would you handle the use case of collecting large amounts of data (e.g. 500 samples at 120 Hz) using the same API? If so would we fire specific events when the buffer is full? What would happen to extra samples? Would they stay in your data structure until a new buffer came along?
I guess in practice that depends on the size of the underlying shared buffer, no? I think we should be fine if we can limit this to being really rare and/or fire an event when it happens. |
Only once per frame.
The given array will contain only the readings recorded while a single frame is being painted, meaning that for a 120 Hz polling frequency (and assuming 60 fps) space for 2 readings should suffice.
Inside the internal data structure the oldest readings will be replaced by the new ones, so if the given buffer is shorter than needed it will contain the latest readings and the older ones will be lost. Any new event will be delivered in a different call chain (at an undefined future moment), which means that some samples would be lost in the meanwhile. Instead, maybe we could require an array of size [READING_SIZE * COUNT + 1], where the last element is used as a flag indicating overflow, so that the user can allocate a bigger buffer next time.
JS is not aware that it got suspended or deferred, so there is always a likelihood of lost samples. |
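The overflow-flag layout suggested above could look like this (a sketch; all names are illustrative):

```js
// Proposed layout: READING_SIZE * COUNT data slots plus one trailing
// element acting as an overflow flag.
const READING_SIZE = 3; // e.g. x, y, z for a gyroscope
const COUNT = 8;
let buffer = new Float64Array(READING_SIZE * COUNT + 1);

function afterDelivery(buffer) {
  const overflowed = buffer[buffer.length - 1] !== 0;
  if (overflowed) {
    // Some readings were dropped; allocate a bigger buffer next time.
    return new Float64Array(READING_SIZE * COUNT * 2 + 1);
  }
  return buffer;
}
```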
What happens when we fall below 60 Hz, though? I think we need to account for that in how we size buffers (or the underlying data structure used to populate them).
I imagine that if we can write this info to the consumer's buffer, we can equally write it somewhere else and use that info to fire a "bufferoverflow" event, no? Seems more web-y. Events are synchronous, so the consumer would be informed within the same AF.
So that doesn't fit our Navisens use case. Would be interesting to think about it separately, maybe even in terms of a ServiceWorker-based API that would emit events at much longer intervals, once a large buffer is full. |
The previous proposal looks a bit clumsy and insufficient for some use cases (#171 (comment)), so here is another approach:
For implementing this we could start a new thread collecting readings in the background until the required amount of data is collected. @tobie, PTAL |
As mentioned before, I think we need to do a bit more research into use cases, requirements, and how similar cases are handled across the platform before we write down APIs. |
Did anyone look at whether the new SharedArrayBuffer might make sense? https://github.com/tc39/ecmascript_sharedmem/blob/master/TUTORIAL.md |
Vaguely. I don't think it matches our use case at all, but it's probably easy to prove me wrong. |
Recently I studied possible ways to implement reliable delivery of sensor readings received at a rate higher than the current frame rate. From what I learned, the only feasible way is to collect readings and make batches on the platform side (in the same context where these readings were obtained) and then send the filled batches to the JS side sequentially, making sure nothing is missed. This approach implies a significant latency. On the other hand, the existing Sensor API is designed to fetch the most recent data with minimal latency, so it looks quite problematic to integrate both approaches in the same API. This means that the batching API should be split from the existing Sensor API. |
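A sketch of what such a split could look like (the `GyroscopeRecorder`, `onbatch`, and `process` names are pure invention for illustration, not part of any proposal):

```js
// Hypothetical dedicated batching interface, separate from the
// low-latency Sensor API: batches are assembled on the platform side
// and delivered sequentially, trading latency for completeness.
let recorder = new GyroscopeRecorder({ frequency: 120, batchSize: 32 });
recorder.onbatch = event => {
  for (const reading of event.readings) {
    process(reading.timestamp, reading.x, reading.y, reading.z);
  }
};
recorder.start();
```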
Just to clarify: it seems, from the use cases gathered so far, that we're fine getting those batches delivered on rAF. How would that increase the latency of the freshest of these samples compared to the current situation? Basically, what we want to support is polling at a greater frequency than the frame rate, not reporting this data at a greater frequency than the frame rate. (Or at least that's the plan given the technical constraints and the use cases gathered so far.) |
Right, we propagate data in sync with rAF in any case, but currently we propagate the freshest data immediately obtained from the platform side (via a shared buffer), with very low latency. |
I see. Care to elaborate on the distinctions between the two IPC mechanisms and their respective limitations? Basically, why can't we just get a larger shared buffer? To collect maybe 4 samples instead of 1? |
We can. Then we'll always provide the 4 latest readings, but if the frame rate goes down for some reason and, for example, 6 readings are missed (i.e. not propagated), then only the latest four will be delivered and the previous two lost. Is that acceptable? |
In the best-case scenario the extra latency is 2 ms (on a Nexus), but it can increase if the context running JS slows down for some reason and suspends handling of the received batches. |
Well, that's an interesting tradeoff I frankly don't have a good answer to until we have clearer use cases and requirements. How large can such a buffer be? i.e. can you get some extra space to account for, say, the 95th percentile of frame delays? Going back to one of your previous comments, I agree we might also want another API that trades latency for integrity, perhaps for bigger sample sets. The requirements I'm roughly seeing so far are (take those with a grain of salt for now):
1. realtime, lossy: only the freshest reading, delivered at most once per rAF, with minimal latency;
2. realtime, lossless: all readings gathered since the previous frame, delivered on rAF;
3. batched, lossless: large sets of readings, delivered at much longer intervals, where higher latency is acceptable.
I'm wondering whether (1) and (2) could be combined in a single API, though (2) requires you to somehow integrate timestamps into the output, which (1) can (possibly) handle differently. |
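For instance, timestamps could simply be interleaved with the sample components in a flat buffer (a sketch under that assumption; this layout isn't specced anywhere):

```js
// Assumed layout: [t0, x0, y0, z0, t1, x1, y1, z1, ...]
const SLOT = 4; // timestamp + 3 components, e.g. for a gyroscope
function readingAt(buffer, i) {
  const base = i * SLOT;
  return {
    timestamp: buffer[base],
    x: buffer[base + 1],
    y: buffer[base + 2],
    z: buffer[base + 3],
  };
}
```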
Yikes! We certainly don't want to ruin latency for VR by doing this. Thanks for these data points, that's super useful. |
Yes, I do not see any issues from the implementation side, but we should figure out the required number of latest readings to keep and explain it somehow (i.e. why 4 and not 40 :-) ) |
From an implementor's perspective, would it make sense and be useful to have this be author-settable (and capped to a max value, obviously)? |
It might be hard to find a good estimate; we could base it on 60 Hz as an assumed frame rate, but the actual frame rate varies. |
Another thing is that we have multiple sensor instances using the same buffer infrastructure, and new instances can appear arbitrarily at runtime, so the buffer size should not be something settable by the user. |
Sorry, edited my comment in the meantime, which is generally a bad idea. I'm making the (perhaps incorrect) assumption that authors (god—I hate that name) would be in the right position to know the number of readings required to run whatever computation they want to in each new frame. |
I was imagining that an approach that would allow BYOB would be much more sound. It also seems we were converging on TypedArrays for other reasons. As expressed in #153 (comment) (which I re-opened), I'm still fairly unconvinced by the testing done so far that we can avoid triggering GC pauses in real-life scenarios. The GC sweep times mentioned in the related blink-dev thread are also a concern. |
In another blink-dev thread, which I started to figure out the implementation approach for the sensors batch API, I was pointed to the PointerEvent.getCoalescedEvents API, which returns a sequence of objects. We could do something similar here. I understand the performance concerns, but on the other hand, considering that the user cannot predict the required size of the buffer, the BYOB concept might be too complex here. Maybe we could start with a user-friendly API and optimize it if needed? It's usually harder to make an "optimized" API more user-friendly than doing the opposite :-) |
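Something along these lines, loosely mirroring getCoalescedEvents (`getCoalescedReadings()` and `integrate` are illustrative names, not a real API):

```js
let gyro = new Gyroscope({ frequency: 120 });
gyro.onreading = () => {
  // Hypothetical: every reading gathered since the previous
  // notification, oldest first, as plain objects.
  for (const r of gyro.getCoalescedReadings()) {
    integrate(r.timestamp, r.x, r.y, r.z);
  }
};
gyro.start();
```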
btw I'm not proposing to reconsider the existing sensor getters. |
Oh! That's interesting. |
On the call we had today, @pozdnyakov suggested we completely drop the API that collects more than one reading per rAF in realtime. iirc we had use cases and requirements for this API, but I wasn't sure what they were precisely. @kenchris, @rwaldron, do you have any in mind, or is it OK to limit the number of readings exposed in realtime to a maximum of 1 reading per rAF? |
For IoT devices we would need some kind of batching in order to send high-frequency events from the sensors. Simply polling at, say, 60 Hz on the Arduino 101 takes too many instructions to send each individual sample over Bluetooth, limiting us to 20-30 Hz sampling at most. With some batching we would probably reach 60 Hz or more (I think we can poll at 200 Hz even with JS on the board). |
Batching and sending over Bluetooth seems to be a candidate for option (3) above, with more relaxed latency requirements. What I'm really interested in is browser-based, rAF-tied use cases with low-latency requirements and more than 1 reading per frame. Afaik, some of the newer work on VR has such requirements (i.e. they're using all readings between frames, not just the freshest one). As this has API design impact and buffer structure impact, it's important that we look at it now. |
No such use case comes to mind. For VR use cases rAF should run at a higher frequency as well. Of course, the higher the frequency, the better you can smooth out noise and the like, but whether that really makes a difference for most use cases, I am not sure. |
Are there plans for rAF to go beyond 60 Hz?
With setup (1), higher frequency would only reduce latency, not offer you more samples than one per rAF. So let's say you poll at 200 Hz, you're still only going to get one sample per rAF, even if rAF is a steady 60 Hz. |
The newly proposed getCoalescedEvents PointerEvent extension is very similar to our option (2) here. Worth digging deeper. |
I would assume that browsers targeting VR headsets might have rAF beyond 60 Hz. Would be nice to hear from people at Samsung, Oculus, etc. @torgo |
Yeah, a getCoalescedEvents-like approach is most likely the solution for batching for us (and this is what I concluded from my investigation, pls see #171 (comment)). My point is that the batching API should be an extension to the existing sensor properties, i.e. we should not try to substitute the existing ones. |
I'd say it's rather closer to (3): the whole sequence is kept, nothing is lost. |
@pozdnyakov, from our previous conversations, the latency issue only appeared if we added the requirement not to lose any intermediary reading to our API. (2) above is clearly distinct in that regard, and it seems that there are potential use cases for this fast, more-than-1-reading-per-rAF scenario. Additionally, creating this API at a later stage does not absolve us from the necessity of designing our API today so that it is compatible with that future design and stays developer friendly. For example, if such an API is buffer-based, we'll need a way to express timestamps within buffers. Maybe this is something worth considering upfront. |
You're right. So I don't think it's a good model for our use cases. The use cases I imagined for (3) were those which involved over-the-network processing, tolerated much higher latency, and expected much larger batches of sensor readings. Scenario (2) really looks at > 60 readings per second realtime use cases (where you're not increasing polling speed with the sole goal of reducing latency). |
(3) should fit #98, right? We could combine (2) and (3) if needed, but then again there should be two separate pieces of API for each, i.e. coalesced events for (3) and something buffer-based for (2).
Absolutely agree! I did not mean we should postpone working on it. |
Yes, (3) is for the #98 use case. Given those are essentially going to be sent over the network for further analysis, it might be worth aiming for something as fetch/stream-compatible as possible (which afaik typed arrays are pretty good for too). We'll probably want reporting frequency to be very different from polling frequency in such cases, so maybe a completely different API for this? For (2), on the other hand, we want something closer to (1) in API terms, so you can really easily move from dealing with a single data point to multiple ones. |
Great to know we're on the same page here. That was one of my biggest concerns. |
this sounds so much like USB :-) |
Note that in the near future we'll also have: (4) collect samples in the background, where (4) is probably very close to (3) except with much larger data sets. |
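A tentative recap of the four scenarios as discussed so far (the boundaries are still fuzzy):

| Scenario | Delivery | Sample loss | Latency |
| --- | --- | --- | --- |
| (1) realtime | freshest reading only, once per rAF | tolerated | minimal |
| (2) coalesced | all readings since last frame, on rAF | none | low |
| (3) batched | large batches, e.g. for over-the-network processing (#98) | none | high |
| (4) background | very large sets collected in the background | none | very high |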
I like having names for these :-) Makes it a bit easier to talk about than referring to a number :)
A bit of info about the USB stuff here in my presentation: https://youtu.be/9mXTaIr8OHw?t=17m47s |
Batched events are supported by the Fitbit implementation: