-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Spike] render_nunjucks error spike #9150
Comments
Spent the afternoon looking into this and collected my observations on the notion document: https://www.notion.so/pixiebrix/Nunjucks-Error-Spike-0283abe59c6a48ad96f5f84715516ec3?showMoveTo=true&saveParent=true I have not been able to identify root cause, but I propose creating a PR tomorrow that
More involved solution/future work would be to come up with a different solution for rendering nunjucks templates, an approach I have not researched and have little context on atm |
@mnholtz that sounds good I appreciate you looking into it. Re logging – is it possible to have visibility into the success rate of the mod run when the error is thrown? |
Discussed over zoom - the answer is that how much of the mod is able to be run/will run on a simple page refresh/button click/etc. is determined on a per-mod basis based on the mod definition. Generally speaking this error is very disruptive to mod execution and would usually result in the user modifying their workflow in some way to overcome the issue |
…o 5s (#9168) * increase pRetry timeout in postMessage and injectIframe from 3s -> 5s * introduce SandboxInjectionError * throw SandboxInjectionError from getSandbox helper * add constant for MAX_RETRIES * refaxtor SandboxInjectionError * refactor rename SandboxInjectionError -> IframeInjectionError
Tentatively putting this back into the backlog until we can get more logging visibility/information from #9168 |
It looks like increasing the timeout to 5s did not make any impact on the errors: https://app.datadoghq.com/logs?query=issue.id%3A6f1c3b50-4bb1-11ef-a033-da7ad0900002%20%2A%3A%22%22%20&agg_m=count&agg_m_source=base&agg_t=count&cols=host%2Cservice%2C%40error.causes%5B1%5D&fromUser=true&messageDisplay=inline&refresh_mode=sliding&storage=hot&stream_sort=service%2Casc&viz=stream&from_ts=1726583812591&to_ts=1727188612591&live=true And the cause is still mixed; doesn't seem to be an issue with the iframe getting injected onto the page (not a single one of these errors was caused by the new IframeInjectionError), but loading. |
After further investigation with @grahamlangford, we've identified the following additional hypotheses (both of which are documented on the original notion document):
Following this, we have identified the following next steps for a follow-up PR:
|
* move postMessage to sandbox directory * add try/catch to postMessage * introduce SandboxTimeoutError * refactor SandboxTimeoutMessage * fix constructor * bust sandbox reference cache on failed ping * add sandbox timeout error to cache bust logic * remove sandbox from dom if bad sandbox ping * add comment * remove iframe if load timeout occurs in loadSandbox * fix unit tests * add unit tests for SandboxTimeoutError * refactor last test * add jest.runAllTimers() before promise assertion * fix postMessage test
Adding re-injection in #9208 did not seem to have an effect. Additional information added to error telemetry did not come through to datadog due to the way datadog selects error properties to report (oversight on my part). Will be fixing this in a follow-up commit. |
* introduce widgets/ directory * introduce demo/ directory * refactor rename FormBuilder -> FormBuilderDemo * fix sass component reference
One avenue for us to pursue is to move the sandbox to the offscreen page to avoid issues related to injecting the iframe during brick execution runtime. POC: |
Another avenue that requires less of an architecture shift is to pre-emptively inject the sandbox onto the page upon content script initialization rather than waiting until the point where a brick requiring it is executing |
I merged in some pre-work in the above PR which will make it easier to extend the offscreen page to inject sandbox execution. Pausing on this for now until we see how the current release does with render_nunjucks errors |
Before you moved to the offscreen document, have you seen any initialization errors in the sidebar as well? During the |
I haven't seen anything in our logging, and we haven't had any issues reported in a long time |
Spike details: https://www.notion.so/pixiebrix/Nunjucks-Error-Spike-0283abe59c6a48ad96f5f84715516ec3?showMoveTo=true&saveParent=true
Datadog dashboards:
test edit
The text was updated successfully, but these errors were encountered: