-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Problem with converting the OpenGL glDrawElements function to WebGL in emscripten #4214
Comments
I checked glDrawElements in library_gl.js and it looks good. Can you put here a small example which you use for rendering? |
An example would also be useful as a testcase for the test suite, assuming this turns out to be a bug. |
Sorry I haven't had a chance to reply till now. Hopefully this works for the example you requested: Below is a snipet of opengl code that should work to test against. It simply produces (or is supposed to produce) a red triangle in the middle of the screen with a black background. If you use the variables vVertices and ndxArray for the data for the vertex array and the index array, then it works fine in opengl, but fails (as in, the triangle doesn't appear as it is supposed to) when using the current emscripten to generate the js/webGL version (though it worked for me using emscripten when I tested this with the fix I posted above). If you use the variables vVertices2 and ndxArray2 for the data, then it works fine for both emscripten and for the traditional opengl. Testing wise, in this example the size of the vertex array that is made in the GL.preDrawHandleClientVertexAttribBindings function should contain 7 generic vertex attribute values (so 7 values for vertex attribs * 3 components--x,y,z--for each attribute = 21 values in the vVertices array * 4 bytes for floats = a total of 84 bytes) for the vVertices/ndxArray set of data and 3 generic vertex attribute values (for a total of 36 bytes) for the vVertices2/ndxArray2 set of data. So, in the GL.preDrawHandleClientVertexAttribBindings function, the variable 'size' should equal 84 bytes when the vVertices/ndxArray data is used and 34 bytes when the vVertices2/ndxArray2 is used. (though, of course, you could test it in a few other ways too)
One other thing: A college of mine just told me about opengl's primitive restart functionality, which I've never used before. It's apparently used to sort-of restart when you're drawing a triangle strip, by inserting a specific number (such as the max int allowed) into the index array, which lets opengl know to start a new triangle strip. This would potentially mess up the example fix that I posted here. I am very fuzzy on how it works or what versions of openGL actually support this feature. From basic googling, it looks to me like it's possibly in/going to be in WebGL2, ES 3, and OpenGL 4.3. I've no idea how emscripten is handling / will handle that, or precisely what to do with this new information in regards to the bug I'm seeing in the glDrawElements function, as I'm not totally clear on how to see what the current primitive restart index would be or how emscripten handles this now. Any thoughts or ideas? Thank you both for all your help on this, and please let me know if you need anything else from me. |
Thanks for the testcase. This does look like a problem. Let's see what @juj thinks (might be delayed, he's on vacation). |
It does and in such a important function. glDrawRangeElements can be a workaround without losing performance but this function is available only in OpenGL ES 3.0 |
The problem can be solved by using vbo because glDrawElements doesn't have to calculate buffer for loading. This example works fine for me.
However glDrawElements has to change in case if data is on client side. Here is important to optimize the function because of performance and detect case when max element should be calculated . draft
|
commit a4fcb6b |
Awesome. Thanks for all your help! Do you know when/approximately when this fix will make it into a release? |
I'm confused, how does that commit show up as part of this repo? I don't see a PR for it, and it doesn't look merged into any branch? I also don't see it in my local git clone. |
Yeah, me too. This is just a commit to my fork, i don't why it looks like this. |
Weird... might be a github bug? Anyhow, if that code is ready, please open a PR with it. And thanks for looking into this :) |
Agreed that there is a problem that the In GLES2, the function The proposed PR #4238 looks like it has extremely high performance footprint, since it does O(#of rendered indices) amount of work to figure out the max index, and allocates a new JS object at each index. I'd like this to be solved in a different way that does not impact performance and is optional to only be applied in codebases that know they need this fix so they don't receive the performance impact needlessly:
@eukarpov: How does that sound? |
How I mentioned before glDrawRangeElements is available only in OpenGL ES 3.0
I was thinking about other solution, but not sure if it's possible to implement. |
When people use FULL_ES2, they already know there is a performance impact involved. |
For the record, the fix proposed in PR #4238 also seems to solve my problem, but I agree that there could be a significant performance impact. When can we expect a proper fix for this to be included in Master branch? |
I'm not a GL expert, but based on @juj's last comment here, I think we can add this if it's a new option (so it doesn't add overhead). Or if we show benchmark numbers that prove it has no new overhead, we might not need a new option. |
@kripken I'm sure it's fine to add as a new option, but to be clear, this is definitely a bug. The function doesn't work as advertised, in its current incarnation. Seems strange to have to use a special option in order to have the function "work." I implemented the fix locally, but as pointed out above, there are likely better / more efficient ways of doing it. |
Yeah, agreed it's a bug, but if it's one that is rarely encountered and has severe perf impact to properly fix, then we need to weigh those factors too, I think that's what was mentioned before. Alternatively maybe the fixed/slow state should be the default, and an option added to speed things up unsafely with the bug. I guess it depends on the speed difference here, we should measure that. |
I think this problem is critical for people who just started working with openGL and emscripten. |
Makes sense. Sounds like most people here would prefer this be fixed by default + add an option to speed things up unsafely? (and add a warning somewhere, perhaps) |
The bug here affects a specific set type of usage, specifically those codebases that refer to indices >= the size of the index buffer. Everyone agrees that this is a bug, but the appropriate method to fix was not yet agreed above. If this was fixed with a method that causes large performance impact for existing developers, then those developers would likely complain a performance regression bug against Emscripten. That is why I proposed a second flag, and a development time assertions check, so that developers can investigate whether they need the more involved emulation or not. I know a number of developers are currently shipping production code with It would be best to use a C side unrolled loop to compute the max index element so that it does not generate garbage. Using a technique similar to used in the PR #4898 it's possible to mix C and JS code to implement this emulation. Happy to review if someone wants to write a PR to do that, or I can also put this on my queue if that would be preferable? |
@juj It sounds like you have the best handle on this, and it would make sense for you to implement an efficient fix, but of course that's not my call. Thanks again to all of you for re-engaging on this. |
@juj How to write performance and garbage test for that code? Just i wanted to be sure that it is so bad and C implementation is needed there |
Any solution here? I got bit by this again, because I had updated Emscripten and forgot to apply my patch. Could we not just implement this with another flag? |
@dwhipps it looks like @juj doesn't have time to resolve this. I can try to help out here. Reading the above discussion, I'm not sure what you mean by "my patch" - is it the PR linked, or the patch pasted in a comment here, or something else? (I'm trying to read the code with the best solution for the problem, which is what I assume is in that patch?) |
Hmm, it doesn't look that slow to me - those loops should be fast? I don't see concrete measurements in the discussion above, that might be worth doing. Anyhow, I see no risk in adding this behind a flag, that way at least you don't need to maintain a patch. But hopefully we can land it enabled by default if it's fast enough. |
As a starting point, if you want to open a PR with that as an option, that would be good I think. |
Ok. Might just be my code that is slow... I haven't done any performance testing on that specific method, just my use of it. Not sure I'm comfortable adding a new flag, and the code is literally there to copy paste. Would prefer someone with a bit more experience contributing to handle it, but happy to help out if you guys don't have the bandwidth. |
As I said before i guess the performance is fine here and complicity is just O(n). |
@eukarpov thanks! that sounds good. For tests, maybe a good relevant one that uses FULL_ES2 is My feeling is that this behavior should be correct by default, so no need for a new flag. But we can optionally add a flag to disable it, if people want. Could be left to a followup later. |
Actually, it might be nice to add that flag with that PR, and add a test for what happens without that flag set, that is, doing the assumption that leads to incorrect results. Those two tests together would give better testing overall, I think. |
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant. |
Hi, faced this issue, even this issue is reported long time ago it was hard to find for me. In my case game rendering works in 99% of cases, only some objects are missing. I read documentation many times but didn't pay attention to issue#4214 note 🥇 . Anyway after digging and debugging I also realized that size in @juj What do you think if 1st. Add in emscripten.h: void emscripten_glVertexAttribPointer (
GLuint index,
GLint size, GLenum type,
GLboolean normalized,
GLsizei stride,
const void *pointer,
GLuint exact_size // NEW
); 2st. Change library_webgl.js like this: glVertexAttribPointer: (index, size, type, normalized, stride, ptr, exact_size) => {
#if FULL_ES2
var cb = GL.currentContext.clientBuffers[index];
#if GL_ASSERTIONS
assert(cb, index);
#endif
if (!GLctx.currentArrayBufferBinding) {
cb.exact_size = exact_size; // NEW
// ...
preDrawHandleClientVertexAttribBindings: (count) => {
GL.resetBufferBinding = false;
// TODO: initial pass to detect ranges we need to upload, might not need
// an upload per attrib
for (var i = 0; i < GL.currentContext.maxVertexAttribs; ++i) {
var cb = GL.currentContext.clientBuffers[i];
if (!cb.clientside || !cb.enabled) continue;
GL.resetBufferBinding = true;
var size = cb.exact_size ?? GL.calcBufLength(cb.size, cb.type, cb.stride, count); // NEW I think this change is quite simple and effective. This also solve my case, in current proejct I always know the size of buffer at binding time. |
Perhaps it can be fixed like this, however it is not OpenGL ES 2.0 API. I still think that to calculate buffer size how it was proposed above is a correct direction with a warning message that this function might need an optimization or to switch to OpenGL ES 3.0 which has this support. Leaving this bug unsolved does not seem right way. |
This is exactly what I want, so here what I can do in my project: #ifdef EMSCRIPTEN
#include <webgl/webgl2.h>
#endif
//...
#ifdef EMSCRIPTEN
emscripten_glDrawRangeElements(GL_TRIANGLES, 0, m_NbVertex, count, GL_UNSIGNED_SHORT, indices);
#else
glDrawElements(GL_TRIANGLES, count, GL_UNSIGNED_SHORT, indices); But, unfortunately glDrawRangeElements does nothing, it just call glDrawElements: glDrawRangeElements: (mode, start, end, count, type, indices) => {
// TODO: This should be a trivial pass-though function registered at the bottom of this page as
// glFuncs[6][1] += ' drawRangeElements';
// but due to https://bugzilla.mozilla.org/show_bug.cgi?id=1202427,
// we work around by ignoring the range.
_glDrawElements(mode, count, type, indices);
}, There is some implemenation in LEGACY_GL_EMULATION mode, but I think it something different. Also I don't want to enable FULL_ES3, cause produces some rendering artifacts if compare to FULL_ES2 |
Could you please try this solution a4fcb6b and provide performance feedback? |
I tested it. Visually it works just fine: no missed geometries, no artifacts. But I can't tell about perfomance, my game is pretty simple and always run on 60FPS. I will try to test it on more devices. |
I suggest to proceed with this solution which supports default API and add a warning message about a potential performance issue. If it is an acceptable resolution for your project I will reopen this PR next week. |
For the current project, yes, this is the best option. However I'm also a bit scared about performance, maybe we can add another definition that is not enabled by default, like -sGL_PRECIES_DRAWELEMENTS or something like that? |
My point of view is that full, proper emulation should be supported out of the box and not by hiding bugs under feature flags. |
@eukarpov I am not a GL expert but we have similar code to yours in other parts of FULL_ES2, so I think your proposed fix is reasonable. In general FULL_ES2 does add some overhead in return for proper emulation out of the box, as you said. So I support moving forward with your change. @caiiiycuk I also agree we should be careful about performance, but do you think this is more risky than other parts of FULL_ES2 code that also loop on |
@kripken I tested average Unity Game with lot of draw calls that works on 40FPS on iPhone8 / Android, applying this changes does not affect performance at all. So I also agreed to complete this PR. |
Sounds good, thanks @eukarpov @caiiiycuk ! |
I don't think Unity renders from client side memory at all, so the slow path in the proposed PR would not be incurred by testing in Unity. |
Idk, but when I comment body of that function the game renders nothing |
Unity definitely uses |
Anyway, I use the following script to patch the resulting js. Maybe it will be useful for someone.
Source: #!/bin/python
import sys
file = sys.argv[1]
pattern = "_glDrawElements=(mode,count,type,indices)=>{"
with open(file, "r") as f:
text = f.read()
start = text.find(pattern) + len(pattern) - 1
end = start + 1
balance = 0
while 0 <= balance:
if text[end] == "{":
balance = balance + 1
if text[end] == "}":
balance = balance - 1
end = end + 1
text = text[0:start + 1] + """
var buf;
var vertexes = 0;
if (!GLctx.currentElementArrayBufferBinding) {
var size = GL.calcBufLength(1, type, 0, count);
buf = GL.getTempIndexBuffer(size);
GLctx.bindBuffer(34963, buf);
GLctx.bufferSubData(34963, 0, HEAPU8.subarray(indices, indices + size));
// Calculating vertex count if shader's attribute data is on client side
if (count > 0) {
for (var i = 0; i < GL.currentContext.maxVertexAttribs; ++i) {
var cb = GL.currentContext.clientBuffers[i];
if (cb.clientside && cb.enabled) {
var array_classes = {
5121 /* GL_UNSIGNED_BYTE */: Uint8Array,
5123 /* GL_UNSIGNED_SHORT */: Uint16Array,
5125 /* GL_UNSIGNED_INT */: Uint32Array};
if (array_classes[type])
vertexes = Math.max.apply(null, new array_classes[type](HEAPU8.buffer, indices, count)) + 1;
break;
}
}
}
indices = 0
}
GL.preDrawHandleClientVertexAttribBindings(vertexes);
GLctx.drawElements(mode, count, type, indices);
GL.postDrawHandleClientVertexAttribBindings();
if (!GLctx.currentElementArrayBufferBinding) {
GLctx.bindBuffer(34963, null)
}
""" + text[end - 1:]
with open(file, "w") as f:
f.write(text) |
I'm running into an issue with how emscripten converts the OpenGL glDrawElements function to the WebGL equivalent (Glctx.drawElements). I had been getting a number of weird triangles popping around in my graphics, and I believe I’ve tracked it down to this:
In emscripten’s library_gl.js file resides the glDrawElements function that replaces the OpenGL call with the same name. Part of what this function does is set up a buffer for the indices (called GLctx.ELEMENT_ARRAY_BUFFER) and (in the GL.preDrawHandleClientVertexAttribBindings function call) it sets up a buffer for the vertices (called GLctx.ARRAY_BUFFER).
The problem is that in the GL.preDrawHandleClientVertexAttribBindings function, it is setting how large the GLctx.ARRAY_BUFFER data array will be based on the variable ‘count’, which is the size of the GLctx.ELEMENT_ARRAY_BUFFER (i.e. the array of indices). This is a problem because what the Glctx.drawElements function will actually do is use the values in the index array to determine where it needs to get its data from in the vertex array (i.e. GLctx.ARRAY_BUFFER). So if the values in the index array point to an index larger than the size of the index array, the Glctx.drawElements function will try to get data from outside the bounds of the GLctx.ARRAY_BUFFER (which just gives junk data).
An example of where the current code would have this happen (i.e. failing to produce the correct graphics) would be if you defined a triangle with an index array (or GLctx.ELEMENT_ARRAY_BUFFER) of [0, 5, 6] and a vertex array (or GLctx.ARRAY_BUFFER) containing only the position in (x,y,z) with value of
[-1,-1,-5,
1, 2, 3,
2, 3, 4,
3, 4, 5,
4, 5, 6,
1, -1, -5,
1, 1, -5].
After running through emscripten, the program would calculate the size of the index array to be 3 (since only 3 indices were given) and each position has 3 values (x, y, and z), so the GLctx.ARRAY_BUFFER would be made to hold only the first 9 values:
[-1,-1,-5,
1, 2, 3,
2, 3, 4]
However, when the WebGL function GLctx.drawElements is called, it would be looking for values in the GLctx.ARRAY_BUFFER array at 0, 5 and 6 (which should give it -1, -1, -5 and 1, -1, -5 and 1, 1, -5 respectively). But since the size of the GLctx.ARRAY_BUFFER is based on the fact that there are 3 indices (and therefore doesn’t include the values after the first 9), it will actually just be getting junk values for indices 5 and 6.
Solution:
I believe the best way to fix this (and at least the only way I can see to fix it) would be to base the size of the GLctx.ARRAY_BUFFER data array on the highest/maximum index value in the index array instead of basing it on the size of the index array. This would ensure that you would not accidentally make the GLctx.ARRAY_BUFFER data array shorter than it should be. To this end, I came up with a fix for this problem (I put a copy of this diff below). I know that this works for my code, but I believe it should also work in general. But if there’s a better or more concise way to fix this, great.
Basically what I’ve done is find the maximum index value in the index array and use that as the parameter for the GL.preDrawHandleClientVertexAttribBindings function instead of using the size of the index array (which makes the size of the GLctx.ARRAY_BUFFER data array be based on the maximum index value instead of the size of the index array).
The text was updated successfully, but these errors were encountered: