Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uniforms named with brackets fail to get location, causing invisible players/models on OSX (modelMatrix[] / colorMul[]) #25

Closed
iceiix opened this issue Nov 18, 2018 · 7 comments

Comments

@iceiix
Copy link
Owner

iceiix commented Nov 18, 2018

Note: The original issue https://github.com/iceiix/steven/issues/5 was somehow deleted, recreating what I remember from memory... UPDATE: partly restored in #5

Players and other models (villagers, sun, moon) do not render on certain systems. Summary of serious graphical bugs:

GPU System OS Invisible models? NUM_SAMPLES=1 graphics corruption? GUI scale/offset bugs?
AMD Radeon Pro 580 iMac 2017 macOS 10.14 Yes Yes No
VMware iMac 2017, VMware Fusion Ubuntu Linux 18.04.1 No No No
AMD Radeon R9 M295X iMac late 2014 macOS 10.14.1 Yes Yes Yes
AMD Radeon R9 M295X iMac late 2014 Windows 11 ? Yes No
Intel HD Graphics 4000 MacBook Air mid-2012 macOS 10.13.6 Yes No No
Intel G41 desktop PC, Core2 Duo FreeBSD 11 No No No
iceiix referenced this issue in iceiix/steven Nov 19, 2018
…#26)

Part of fixing graphical rendering bugs: https://github.com/iceiix/steven/issues/25
Corrupted graphics on AMD video cards: Thinkofname#74

The previous workaround set NUM_SAMPLES to 2, but this added an extra texel fetch. Setting to 1 reproduces the 16px blue striped unexpected behavior. Setting to 0 samples avoids the extra fetch and the striping.

This is allowed per http://docs.gl/gl3/glTexImage2DMultisample:
> samples specifies the number of samples in the image and must be in the range zero to GL_MAX_SAMPLES - 1.
@iceiix iceiix changed the title Invisible players (2) and other graphical issues Invisible players/entities (2) and other graphical issues Nov 20, 2018
@iceiix iceiix changed the title Invisible players/entities (2) and other graphical issues Invisible players/models (2) and other graphical issues Nov 20, 2018
@iceiix
Copy link
Owner Author

iceiix commented Nov 20, 2018

Fixed some other rendering issues but this one (invisible players, or actually "models" including sun and moon, but not clouds) remains. An easy way to demonstrate the problem is to stare at the sun, or where the sun should be, or another player. Tracked down to this code in src/render/mod.rs:

        // Model rendering
        self.model.draw(&self.frustum, &self.perspective_matrix, &self.camera_matrix, self.light_level, self.sky_offset);
        if world.copy_cloud_heightmap(&mut self.clouds.heightmap_data) {
            self.clouds.dirty = true;
        }
        self.clouds.draw(&self.camera.pos, &self.perspective_matrix, &self.camera_matrix, self.light_level, self.sky_offset, delta);

self.clouds.draw() renders fine, but not self.model.draw(), which is defined in src/render/model.rs. Ultimately it calls glDrawElements in a loop:

                gl::draw_elements(gl::TRIANGLES, model.count, self.index_type, 0);

triangles, count is 6 for the sun, index type is UNSIGNED_SHORT. Verified this function is called, on both systems, but just doesn't show up. check_gl_error() confirms no glGetError() code either.

Does not appear to be a texturing problem, adding to src/render/shaders/sun_frag.glsl fragColor=vec4(1.0, 1.0, 1.0, 1.0); forcing white sun, works on Intel G41 BSD system, still nothing shows up on M295X.

Has anyone had similar problems? gamedev.net:
glDrawElements isn't drawing anything
, was receiving glErrors, but I'm not. Also already using GL_UNSIGNED_SHORT. /r/opengl Stumped by glDrawElements not actually rendering anything. had a different problem related to back face culling:

Edit: So the problem turned out to be that my surface had its faces winding the wrong direction, causing back face culling to remove all the faces. Additionally, RenderDoc doesn't seem to want to deal with vertices at all right now.

Let this be a lesson to not enable rear face culling while you're just starting to work in a 3D scene. Additionally, it is difficult to debug your program when you really need to be debugging the debugger...

src/render/mod.rs does enable depth tests and back face culling, but if I comment it out:

        //gl::enable(gl::DEPTH_TEST);
        //gl::enable(gl::CULL_FACE_FLAG);
        //gl::cull_face(gl::BACK);
        //gl::front_face(gl::CLOCK_WISE);
        //gl::front_face(gl::COUNTER_CLOCK_WISE);

no difference. Besides, my problem seems to be GPU-specific or OS-specific, these two solutions appear to be universal, programming errors. Not to rule out this bug isn't either, but...

Another lead: StackOverflow: Call to glDrawElements breaks display on some GPUs?, posted 2018/06/20 but no answers, although some possibly relevant hints:

The peculiar part is that this only happens on my colleagues' computer that uses an Nvidia GeForce GTX 1060 as its graphic cards (2 of my friends have the same exact computer).

EDIT : putting a glClear(GL_COLOR_BUFFER_BIT) call right before glfwSwapBuffers(window) still displays a white screen with the culprit line uncommented even though the clear color has been set to light blue. Commenting the culprit line indeed displays a light blue screen, so this makes me think it's a framebuffer issue, but I can't say for sure.

Steven isn't using glfw so the glfwSwapBuffers equivalent for SDL may be either sdl_video.gl_set_swap_interval and/or window.gl_swap_window();. glClear(GL_COLOR_BUFFER_BIT), Why do we have to clear depth buffer in OpenGL during rendering?, Steven clears the color and depth buffers in src/render/mod.rs tick():

        gl::clear(gl::ClearFlags::Color | gl::ClearFlags::Depth);

@iceiix
Copy link
Owner Author

iceiix commented Nov 27, 2018

@iceiix
Copy link
Owner Author

iceiix commented Nov 27, 2018

While reviewing https://github.com/iceiix/steven/issues/34 other rendering libraries, found this useful tool reference: https://bkaradzic.github.io/bgfx/overview.html - there isn't much available to debug GL on OSX, but there are at least these two tools: https://apitrace.github.io and
https://software.intel.com/en-us/vcsource/tools/intel-gpa. Intel GPA installs several utilities in /Applications/Intel, first launch GpaMonitor.app, then select the reduced test case ("invisible", posted at https://github.com/iceiix/invisible), start. The app starts, with some HUD indicators:

screen shot 2018-11-27 at 1 57 57 pm

There are a bunch of keyboard shortcuts available:

screen shot 2018-11-27 at 1 59 13 pm

including disabling various features to help debug. Toggled each of these, but nothing obviously was the problem. Next up is FrameAnalyzer.app, it can also launch the app (click Add), when running click Capture then it captures a frame, then in the browser click Open, this lets you see various bits of useful information. I can see the glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, 0) call corresponding to src/render/model.rs draw:

            for model in collection.models.values() {
                model.array.bind();
                collection.shader.lighting.map(|v| v.set_float2(0.0, 0.0));
                collection.shader.model_matrix.map(|v| v.set_matrix4_multi(&model.matrix));
//println!("about to draw model {:?} {:?}", model.count, self.index_type);
                gl::draw_elements(gl::TRIANGLES, model.count, self.index_type, 0); // here

The input geometry can be viewed, and to make it easier to distinguish from the trans buffer, made the sun slightly trapezoidal, and sure enough the geometry comes through, it is sent to the GPU successfully:

screen shot 2018-11-27 at 2 07 37 pm

this matches what is seen on Ubuntu, where the sun renders successfully:

screen shot 2018-11-27 at 2 08 04 pm

so the problem lies later. The output pane RT: 2, T: 3 shows the solid blue sky, no sun, so it isn't making it to the output. The RT: 2, T: 7 output is the depth buffer (DEPTH_COMPONENT32F), which is a solid 1.000 (the sky is solid R: 122, G: 165, B: 247, A: 255). The API log only shows three calls: glClear, glDrawElements, glClearBufferfv, then glDrawArrays, not too much; the output from glDrawElements doesn't match what is expected, maybe dig deeper there.

@iceiix
Copy link
Owner Author

iceiix commented Nov 27, 2018

Simplified the shader, this doesn't look good, the modelMatrix is all zeroes!

screen shot 2018-11-27 at 2 25 21 pm

even though if I print it before its sent to OpenGL:

            for model in collection.models.values() {
                model.array.bind();
                println!("model.matrix = {:?}", &model.matrix);
                collection.shader.model_matrix.map(|v| v.set_matrix4_multi(&model.matrix));

it is the translated identity matrix, we expect:

about to draw model 6 5123
model.matrix = [Matrix4 [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [-300.5, -13.2, 0.5, 1.0]]]

How is it sent to GL? v.set_matrix4_multi in the GL wrapper src/gl/mod.rs, this call to glMatrixUniform4fv looks suspicious:

    pub fn set_matrix4_multi(&self, m: &[::cgmath::Matrix4<f32>]) {
        unsafe {
            gl::UniformMatrix4fv(self.0, m.len() as i32, false as u8, m.as_ptr() as *const _); // TODO: Most likely isn't safe
        }
    }

"Most likely isn't safe" indeed...

@iceiix
Copy link
Owner Author

iceiix commented Nov 27, 2018

Sure enough, that was it! set_matrix4_multi is busted, changing to set_matrix4 and the other code to only set and expect a singular matrix iceiix/invisible@2bdfbdd, properly renders the sun (or moon):

screen shot 2018-11-27 at 2 32 03 pm

...at last. This fixes the test case, but as Steven requires multiple model matrices, a real fix would be different, but at least now the problem is identified.

@iceiix iceiix changed the title Invisible players/models (2) and other graphical issues Unsafe set_matrix4_multi glUniformMatrix4fv zeroing model matrix, causing invisible players/models on OSX Nov 27, 2018
@iceiix
Copy link
Owner Author

iceiix commented Nov 28, 2018

set_matrix4 takes a cgmath::Matrix4, which it calls .as_ptr() on to pass to glUniformMatrix4fv. This works because cgmath::Matrix4 is marked as #[repr(C)], and it is a struct of Vector4's:

pub struct Matrix4<S> {
    pub x: Vector4<S>,
    pub y: Vector4<S>,
    pub z: Vector4<S>,
    pub w: Vector4<S>,
}

Rustonomicon: Alternative representations documents repr(C) and others:

This is the most important repr. It has fairly simple intent: do what C does. The order, size, and alignment of fields is exactly what you would expect from C or C++. Any type you expect to pass through an FFI boundary should have repr(C), as C is the lingua-franca of the programming world. This is also necessary to soundly do more elaborate tricks with data layout such as reinterpreting values as a different type.

set_matrix4_multi takes a reference to an array slice: m: &[::cgmath::Matrix4<f32>], which doesn't have repr(C) so m.as_ptr() as *const _ isn't guaranteed to have a C-compatible structure. StackOverflow: Is it possible to pass arrays from Rust to C? suggests:

The C ABI does not have a concept of "returning an array". You should allocate a Vec<const c_char> and return it, defining the function on the C side as extern char* string_from_rust().

Alternatively, since you are returning two pointers, use a #[repr(C)] struct with two members instead of array.

m.to_vec().to_ptr(), no difference (and still haven't added repr(C)). But how is a vector better than a slice? Rust by example: Arrays and Slices says arrays are contiguous, which is what we want:

An array is a collection of objects of the same type T, stored in contiguous memory. Arrays are created using brackets [], and their size, which is known at compile time, is part of their type signature [T; size].

Slices are similar to arrays, but their size is not known at compile time. Instead, a slice is a two-word object, the first word is a pointer to the data, and the second word is the length of the slice. The word size is the same as usize, determined by the processor architecture eg 64 bits on an x86-64. Slices can be used to borrow a section of an array, and have the type signature &[T].

There is discussion in rust-lang about how array slices are not FFI-safe because they are fat pointers: rust-lang/rust#30382 Document how Rust array/slice types are translated for extern "C" functions.

but as_ptr() is used to convert the array slice to a raw pointer, *const T: https://doc.rust-lang.org/std/primitive.slice.html#method.as_ptr

Returns a raw pointer to the slice's buffer.

The caller must ensure that the slice outlives the pointer this function returns, or else it will end up pointing to garbage.

Modifying the container referenced by this slice may cause its buffer to be reallocated, which would also make any pointers to it invalid.

what is as *const _? as_ptr() on a slice returns *const T, so for cgmath::Matrix4<f32> it will be a Matrix4, but gl::UniformMatrix4fv expects an f32:

   Compiling invisible v0.1.0 (/Users/admin/games/rust/invisible)
error[E0308]: mismatched types
   --> src/gl/mod.rs:621:71
    |
621 |             gl::UniformMatrix4fv(self.0, m.len() as i32, false as u8, m.as_ptr()); // TODO: Most likely isn't safe
    |                                                                       ^^^^^^^^^^ expected f32, found struct `cgmath::matrix::Matrix4`
    |
    = note: expected type `*const f32`
               found type `*const cgmath::matrix::Matrix4<f32>`

set_matrix4 also uses as_ptr(), but on a Matrix4 itself, which is why it imports this trait: use cgmath::Matrix;, for https://docs.rs/cgmath/0.16.1/cgmath/trait.Matrix.html#method.as_ptr

    pub fn set_matrix4(&self, m: &::cgmath::Matrix4<f32>) {
        use cgmath::Matrix;
        unsafe {
            gl::UniformMatrix4fv(self.0, 1, false as u8, m.as_ptr());
        }
    }

fn as_ptr(&self) -> *const Self::Scalar

Get the pointer to the first element of the array.

How about doing the same on the first element of the slice?

    pub fn set_matrix4_multi(&self, m: &[::cgmath::Matrix4<f32>]) {
        use cgmath::Matrix;
        unsafe {
            gl::UniformMatrix4fv(self.0, m.len() as i32, false as u8, m[0].as_ptr()); // TODO: Most likely isn't safe
        }
    }

but modelMatrix is still matrix zero. The shader hardcodes modelMatrix[10], fixed length, why not pass a fixed-length vector?

                let a = [model.matrix[0]; 10];
                collection.shader.model_matrix.map(|v| v.set_matrix4_multi(&a));

or?

    pub fn set_matrix4_multi(&self, m: &[::cgmath::Matrix4<f32>]) {
        use cgmath::Matrix;
        let a = [m[0]; 10];

        unsafe {
            gl::UniformMatrix4fv(self.0, 10, false as u8, a[0].as_ptr()); // TODO: Most likely isn't safe
        }
    }

no difference. Interestingly, making only this change, the uniform from an array to a scalar, causes the sun to render, without any other code change:

--- a/src/render/model.rs
+++ b/src/render/model.rs
@@ -203,7 +203,7 @@ init_shader! {
         uniform = {
             optional perspective_matrix => "perspectiveMatrix",
             optional camera_matrix => "cameraMatrix",
-            optional model_matrix => "modelMatrix[]",
+            optional model_matrix => "modelMatrix",
         },
     }
 }

confirmed in IntelGPA Frame Analyzer:

screen shot 2018-11-27 at 4 30 09 pm

even though it is modelMatrix[10] in the shader declaration. Digging deeper, src/gl/mod.rs uniform_location and attribute_location return None if glGetUniformLocation or glGetAttributeLocation fail. Added logging:

invisible $ cargo run|grep fail
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/invisible`
glGetAttributeLocation failed for aPosition
glGetAttributeLocation failed for id
glGetUniformLocation failed for perspectiveMatrix
glGetUniformLocation failed for cameraMatrix
glGetUniformLocation failed for modelMatrix[]
glGetAttributeLocation failed for id
glGetUniformLocation failed for modelMatrix[]

change optional model_matrix => "modelMatrix[]", to optional model_matrix => "modelMatrix",, doesn't fail in glGetAttributeLocation:

invisible $ cargo run|grep fail
   Compiling invisible v0.1.0 (/Users/admin/games/rust/invisible)
    Finished dev [unoptimized + debuginfo] target(s) in 0.92s
     Running `target/debug/invisible`
glGetAttributeLocation failed for aPosition
glGetAttributeLocation failed for id
glGetUniformLocation failed for perspectiveMatrix
glGetUniformLocation failed for cameraMatrix
glGetUniformLocation failed for modelMatrix
glGetAttributeLocation failed for id

and the sun renders ok. Is it correct to have brackets on arrays in attribute/uniform names?

@iceiix iceiix changed the title Unsafe set_matrix4_multi glUniformMatrix4fv zeroing model matrix, causing invisible players/models on OSX Uniforms named with brackets fail to get location, causing invisible players/models on OSX (modelMatrix[] / colorMul[]) Nov 28, 2018
@iceiix
Copy link
Owner Author

iceiix commented Nov 28, 2018

If I log all glGetUniform/AttribLocation failures:

glGetAttribLocation failed for aColor
glGetUniformLocation failed for lightLevel
glGetUniformLocation failed for skyOffset
glGetUniformLocation failed for lighting

but some are optional, so leaving this logging off for now. Just removing the brackets fixes the rendering, now finally see the sun (and the player):

screen shot 2018-11-27 at 4 58 56 pm

@iceiix iceiix transferred this issue from iceiix/steven Jan 12, 2019
@iceiix iceiix mentioned this issue Jan 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant