Long validation times, explore using fastjsonschema #190
While I'm not opposed to the idea, validating a 33MB notebook raises the question of why the notebook is that large. Other mechanisms start failing for notebooks larger than 10MB (the browser crashes, transport mechanisms time out, etc.).
@MSeal it was an example to really expose the problem.
That is a different issue.
Also a different issue.
Great, I have an open PR; I need to generalize it for use with the other available libraries, and then it will be ready for review.
Understood. I'll try to take a look -- pro tip: if you reference this issue in the PR, it will generate a link between them and post here that they are linked.
Yes, I forgot 🙃 Thanks for the feedback @MSeal.
Hello :-)
In some cases using JLab, when outputs are collected from several nodes (using Dask, for example) and errors are found, many tracebacks can populate the output of a given cell. In these cases, where the output is a large list of tracebacks, the validation step can take a significant amount of time.
This is a synthetic notebook, but it illustrates the problem.
50000-errors.ipynb.zip
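(For context only: the attached notebook is not reproduced here, but a notebook of this shape could be generated with nbformat's v4 helpers along these lines; the error count, messages, and cell source below are made up for illustration.)

```python
# Sketch: build a notebook whose single cell carries many error outputs,
# similar in shape to the attached example (sizes/messages are arbitrary).
import nbformat
from nbformat.v4 import new_notebook, new_code_cell, new_output

errors = [
    new_output(
        output_type="error",
        ename="RuntimeError",
        evalue=f"worker {i} failed",
        traceback=["Traceback (most recent call last):", f"RuntimeError: worker {i} failed"],
    )
    for i in range(50_000)
]

nb = new_notebook(cells=[new_code_cell("run_on_cluster()", outputs=errors)])
nbformat.write(nb, "50000-errors.ipynb")
```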
A script to test this.
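(The original script isn't reproduced here; a minimal timing test along these lines, assuming nbformat's `read`/`validate` API and the attachment's file name, would illustrate the point.)

```python
# Sketch: time nbformat.read (which already validates) and an explicit
# validate call on the large notebook; the file name is assumed from the attachment.
import time
import nbformat

t0 = time.perf_counter()
nb = nbformat.read("50000-errors.ipynb", as_version=4)
t1 = time.perf_counter()
nbformat.validate(nb)
t2 = time.perf_counter()

print(f"read (includes validation): {t1 - t0:.2f}s")
print(f"explicit validate:          {t2 - t1:.2f}s")
```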
read already performs validation, so the extra validate call was just for testing purposes.
This yields, in seconds:
Could another validation library such as https://github.com/horejsek/python-fastjsonschema be considered, to improve validation performance for cases like the one described?
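(For context on why that library helps: fastjsonschema compiles a JSON Schema into generated Python code once, so repeated validations avoid re-interpreting the schema. A minimal sketch follows; the schema is a toy stand-in, not the actual nbformat schema.)

```python
# Sketch: compile a schema once with fastjsonschema, then reuse the
# generated validator; the schema below is a toy example.
import fastjsonschema

validate = fastjsonschema.compile({
    "type": "object",
    "properties": {"cells": {"type": "array"}},
    "required": ["cells"],
})

validate({"cells": []})  # passes
# validate({})           # would raise fastjsonschema.JsonSchemaException
```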
Thanks!
Pinging @mlucool, @echarles