nvenc v7 support #1260

Closed
totaam opened this issue Jul 21, 2016 · 32 comments

Comments

@totaam
Collaborator

totaam commented Jul 21, 2016

Issue migrated from trac ticket #1260

component: encodings | priority: major | resolution: fixed

2016-07-21 09:53:28: antoine created the issue


Download link: [https://developer.nvidia.com/nvidia-video-codec-sdk].
Anand: Maxwell Display Matters: New Display Controller, HDR, & HEVC

Key features that are relevant to us:

  • HEVC 8K (8192 pixels x 8192 pixels) encoding
  • HEVC 4:4:4 encoding
  • HEVC 10-bit encoding
  • HEVC lossless encoding
  • Rate control and quality improvements

The claim is that performance is doubled compared to Maxwell.

Related tickets:

1.1 should drop support for all older nvenc codecs

@totaam
Collaborator Author

totaam commented Jul 21, 2016

2016-07-21 10:13:52: antoine uploaded file nvenc7.pc (0.3 KiB)

pkg-config file for building against nvenc7

@totaam
Collaborator Author

totaam commented Jul 21, 2016

2016-07-21 10:41:51: antoine changed status from new to assigned

@totaam
Collaborator Author

totaam commented Jul 21, 2016

2016-07-21 10:41:51: antoine commented


Stub added in r13060, minor api updates in r13061, build switch in r13087. Missing symbol fix in r13088.

Still TODO:

@totaam
Collaborator Author

totaam commented Aug 16, 2016

2016-08-16 07:04:37: antoine commented


  • r13335 dropped support for all the older nvenc versions: AFAICT, the newer SDK is backwards compatible.
  • r13363: lots of improvements (see commit message), includes support for HEVC (aka h265)
  • r13364: probe max-encoder-size at runtime (support 8K with HEVC)

@totaam
Collaborator Author

totaam commented Sep 16, 2016

2016-09-16 07:53:41: antoine commented


10-bit HEVC support moved to #1308.

@totaam
Collaborator Author

totaam commented Sep 24, 2016

2016-09-24 16:25:08: antoine changed status from assigned to new

@totaam
Collaborator Author

totaam commented Sep 24, 2016

2016-09-24 16:25:08: antoine changed owner from antoine to smo

@totaam
Collaborator Author

totaam commented Sep 24, 2016

2016-09-24 16:25:08: antoine commented


Well, well. I've spent a small fortune on a GTX1070 to test this ticket, in particular the performance of HEVC. The problem is that I get hard system lockups when running the tests.
I thought there was something wrong with the code, but then I went back to the 0.17.x branch and all nvenc codec versions from that branch also lock up the system...

So it could be one of two things:

  • hardware problem, this card could be a dud: it does have problems with 4k (not working on one monitor) and some glittering pixels at boot
  • the code just isn't compatible with Pascal cards (10xx) - in which case we'll probably need to feed the YUV data directly rather than using the CUDA buffer (sigh)

@smo: you have a card, can you run the tests:

mkdir tmp && cd tmp && cp -apr ../tests ./
PYTHONPATH=. ./tests/xpra/codecs/test_nvenc7.py 

@totaam
Collaborator Author

totaam commented Oct 27, 2016

2016-10-27 16:44:14: antoine changed status from new to assigned

@totaam
Collaborator Author

totaam commented Oct 27, 2016

2016-10-27 16:44:14: antoine changed owner from smo to antoine

@totaam
Collaborator Author

totaam commented Oct 27, 2016

2016-10-27 16:44:14: antoine commented


Found the bug: using "sliceMode" and "sliceModeData" causes nvenc to lock up completely (I have rebooted my desktop system ~20 times today as a result of that bug).
Fix in r14303 (backported to v0.17.x in r14304).

It is quite a bit faster than the previous card I had been testing with, though some of the gains could also be due to the faster CPU / memory. I'm getting:

nvenc(BGRX/NV12/H264 - low-latency - 3840x2160) finished encoding 100    BGRX frames at  3840x2160: \
     521 MPixels/s,   15ms/frame,        9KB/frame (NV12)

That would allow for up to 60 fps at 4K.
Or almost 1500 fps at 480p! (or about 50 clients at 30 fps)
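
The frame rate estimates above follow from dividing the measured throughput by the number of pixels per frame. A minimal sketch of that arithmetic, assuming "MPixels" means 10^6 pixels and that "480p" means 720x480 (the thread does not state which 480p resolution was meant); the function name is made up for illustration:

```python
def max_fps(mpixels_per_second, width, height):
    """Frames per second sustainable at the given frame size and throughput."""
    return mpixels_per_second * 1_000_000 / (width * height)


THROUGHPUT = 521  # MPixels/s, taken from the test output above

# 4K frame size used in the test
print(f"3840x2160: {max_fps(THROUGHPUT, 3840, 2160):.1f} fps")
# assumed 480p frame size
print(f"720x480:   {max_fps(THROUGHPUT, 720, 480):.1f} fps")
```

Under these assumptions the 4K figure lands a little above 60 fps and the 480p figure close to 1500 fps; dividing the latter by 30 gives roughly 50 streams at 30 fps. This ignores any per-session overhead in the encoder.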

Still TODO:

  • fix some test failures
  • remove driver version warning
  • finish or re-schedule ARGB mode

@totaam
Collaborator Author

totaam commented Oct 28, 2016

2016-10-28 10:13:23: antoine changed status from assigned to new

@totaam
Collaborator Author

totaam commented Oct 28, 2016

2016-10-28 10:13:23: antoine changed owner from antoine to smo

@totaam
Collaborator Author

totaam commented Oct 28, 2016

2016-10-28 10:13:23: antoine commented


@smo: please test if you have any cards that support NVENC (not just the Pascal generation), or just close the ticket.

@totaam
Collaborator Author

totaam commented Oct 30, 2016

2016-10-30 15:46:28: antoine commented


Recently, I've started seeing a lot of these (improved error handling in r14337):

Error: cannot initialize CUDA
 cuInit failed: unknown error

Not much we can do about it! ("unknown error" - sigh)

@totaam
Collaborator Author

totaam commented Nov 10, 2016

2016-11-10 21:07:58: smo commented


I'm trying to run the tests, but maybe I'm missing something? This is weird: it should work as you said.

[cosmo@explosivo src] $ pwd
/home/cosmo/work/Xpra/trunk/src
[cosmo@explosivo src] $ mkdir tmp && cd tmp && cp -apr ../tests ./
[cosmo@explosivo tmp] $ PYTHONPATH=. ./tests/xpra/codecs/test_nvenc7.py 
Traceback (most recent call last):
  File "./tests/xpra/codecs/test_nvenc7.py", line 7, in <module>
    from tests.xpra.codecs import test_nvenc
ImportError: No module named tests.xpra.codecs
[cosmo@explosivo tmp] $ pwd
/home/cosmo/work/Xpra/trunk/src/tmp
[cosmo@explosivo tmp] $ ls
tests

I'm not entirely sure why it is complaining, or whether it is ignoring PYTHONPATH.

@totaam
Collaborator Author

totaam commented Nov 11, 2016

2016-11-11 05:04:20: antoine commented


You need r14403, which is just this:

touch tests/__init__.py
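
The ImportError above is consistent with Python 2 treating a directory as a package only when it contains an `__init__.py`. A minimal sketch for spotting the gap, assuming the `tests/xpra/codecs` layout shown in the traceback; `missing_init_files` is a hypothetical helper, not part of xpra:

```python
import os


def missing_init_files(root, dotted_name):
    """Return the directories along dotted_name (under root) lacking __init__.py."""
    missing = []
    path = root
    for part in dotted_name.split("."):
        path = os.path.join(path, part)
        # only directories can be packages; a final module file is skipped
        if os.path.isdir(path) and not os.path.isfile(os.path.join(path, "__init__.py")):
            missing.append(path)
    return missing


if __name__ == "__main__":
    # run from the directory that PYTHONPATH=. points at
    for directory in missing_init_files(".", "tests.xpra.codecs"):
        print("missing __init__.py in", directory)
```

Note that Python 3 accepts such directories as namespace packages, so this check is mainly relevant to the Python 2 setup used in this thread.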

@totaam
Collaborator Author

totaam commented Nov 17, 2016

2016-11-17 03:13:52: smo commented


After installing python2-pycuda from your repo, this is what happens when I try to run that test:

PYTHONPATH=. ./tests/xpra/codecs/test_nvenc7.py 

2016-11-16 19:07:05,771 CUDA initialization (this may take a few seconds)
2016-11-16 19:07:06,027 CUDA 7.5.0 / PyCUDA 2016.1.2, found 1 device:
2016-11-16 19:07:06,027   + Graphics Device @ 0000:05:00.0 (memory: 89% free, compute: 6.1)
2016-11-16 19:07:06,156 NVidia driver version 370.28
2016-11-16 19:07:08,143 NVENC successfully initialized
creating sample data for size 4096
Traceback (most recent call last):
  File "./tests/xpra/codecs/test_nvenc7.py", line 23, in <module>
    main()
  File "./tests/xpra/codecs/test_nvenc7.py", line 12, in main
    test_nvenc.test_encode_one()
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_nvenc.py", line 30, in test_encode_one
    test_encoder(encoder_module)
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_encoder.py", line 94, in test_encoder
    do_test_encoder(e, src_format, actual_w, actual_h, images, log=log, after_encode_cb=after_encode_cb)
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_encoder.py", line 120, in do_test_encoder
    c = encoder.compress_image(image)
  File "xpra/codecs/nvenc7/encoder.pyx", line 2035, in xpra.codecs.nvenc7.encoder.Encoder.compress_image (xpra/codecs/nvenc7/encoder.c:24541)
  File "xpra/codecs/nvenc7/encoder.pyx", line 2213, in xpra.codecs.nvenc7.encoder.Encoder.do_compress_image (xpra/codecs/nvenc7/encoder.c:27908)
  File "xpra/codecs/nvenc7/encoder.pyx", line 1317, in xpra.codecs.nvenc7.encoder.raiseNVENC (xpra/codecs/nvenc7/encoder.c:9269)
xpra.codecs.nvenc7.encoder.NVENCException: locking output buffer - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.

After pressing Ctrl+C the whole system locks up, and I have no choice but to power cycle it.

@totaam
Collaborator Author

totaam commented Nov 17, 2016

2016-11-17 03:15:04: smo changed owner from smo to antoine

@totaam
Collaborator Author

totaam commented Nov 17, 2016

2016-11-17 07:15:28: antoine changed owner from antoine to smo

@totaam
Collaborator Author

totaam commented Nov 17, 2016

2016-11-17 07:15:28: antoine commented


That's the error I was seeing before, which was meant to be fixed in r14303.
Please include more details about the setup: full revision, GPU details, i.e.:

python ./xpra/codecs/nv_util.py
python ./xpra/codecs/cuda_common/cuda_context.py

etc.

@totaam
Collaborator Author

totaam commented Nov 17, 2016

2016-11-17 20:50:56: smo commented


python ./xpra/codecs/nv_util.py
2016-11-17 12:48:46,868 NVidia driver version 370.28
2016-11-17 12:48:46,868 NVENC license keys:
2016-11-17 12:48:46,887 * version common: 0 key(s)
2016-11-17 12:48:46,887 * version 7: 0 key(s)
2016-11-17 12:48:46,892 
2016-11-17 12:48:46,892 1 card:
2016-11-17 12:48:46,901 * 0
2016-11-17 12:48:46,902   - clock-info-graphics           : 1354
2016-11-17 12:48:46,902   - clock-info-graphics-max       : 1974
2016-11-17 12:48:46,902   - clock-info-mem                : 3504
2016-11-17 12:48:46,902   - clock-info-mem-max            : 3504
2016-11-17 12:48:46,902   - clock-info-sm                 : 1354
2016-11-17 12:48:46,902   - clock-info-sm-max             : 1974
2016-11-17 12:48:46,902   - fan-speed                     : 30
2016-11-17 12:48:46,902   - memory
2016-11-17 12:48:46,903     - free                        : 3715563520
2016-11-17 12:48:46,903     - total                       : 4234018816
2016-11-17 12:48:46,903     - used                        : 518455296
2016-11-17 12:48:46,903   - name                          : Graphics Device
2016-11-17 12:48:46,903   - pci
2016-11-17 12:48:46,903     - bus                         : 5
2016-11-17 12:48:46,903     - busId                       : 0000:05:00.0
2016-11-17 12:48:46,903     - device                      : 0
2016-11-17 12:48:46,904     - domain                      : 0
2016-11-17 12:48:46,904     - pciDeviceId                 : 478286046
2016-11-17 12:48:46,904     - pciSubSystemId              : 1649621058
2016-11-17 12:48:46,904   - pcie-link-generation          : 2
2016-11-17 12:48:46,904   - pcie-link-generation-max      : 2
2016-11-17 12:48:46,904   - pcie-link-width               : 16
2016-11-17 12:48:46,904   - pcie-link-width-max           : 16
2016-11-17 12:48:46,904   - power-state                   : 0
2016-11-17 12:48:46,904   - temperature                   : 33
2016-11-17 12:48:46,905   - uuid                          : GPU-f6898fc2-4cc7-0e8c-5b3e-02000396306e
2016-11-17 12:48:46,905   - vbios-version                 : 86.07.22.00.50
python ./xpra/codecs/cuda_common/cuda_context.py
2016-11-17 12:49:28,438 pycuda_info
2016-11-17 12:49:28,439 CUDA initialization (this may take a few seconds)
2016-11-17 12:49:28,674 CUDA 7.5.0 / PyCUDA 2016.1.2, found 1 device:
2016-11-17 12:49:28,674   + Graphics Device @ 0000:05:00.0 (memory: 86% free, compute: 6.1)
2016-11-17 12:49:28,789 * version                         : 2016.1.2
2016-11-17 12:49:28,789   - text                          : 2016.1.2
2016-11-17 12:49:28,790 cuda_info
2016-11-17 12:49:28,790 * driver
2016-11-17 12:49:28,790   - driver_version                : 8000
2016-11-17 12:49:28,790   - version                       : 7.5.0
2016-11-17 12:49:28,790 preferences:
rpm -qa xpra
xpra-1.0-0.20161115r14430.fc24.x86_64

I'm using the rpm packages from the beta repo. Do you think building this myself would make a difference?
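
The card name above is reported only as "Graphics Device", but the pciDeviceId value may still identify the hardware. A minimal sketch, assuming the field follows the NVML convention of packing the 16-bit device id in the upper half and the 16-bit vendor id in the lower half (whether nv_util reports the raw NVML value is an assumption); the function name is illustrative:

```python
def split_pci_device_id(combined):
    """Split a combined 32-bit PCI id into (vendor_id, device_id)."""
    vendor_id = combined & 0xFFFF
    device_id = (combined >> 16) & 0xFFFF
    return vendor_id, device_id


# value taken from the nv_util output above
vendor_id, device_id = split_pci_device_id(478286046)
print(f"vendor=0x{vendor_id:04X} device=0x{device_id:04X}")
```

The vendor half decodes to 0x10DE, which is NVIDIA's PCI vendor id; the device half can then be looked up in a PCI id database.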

@totaam
Collaborator Author

totaam commented Nov 19, 2016

2016-11-19 04:24:43: antoine commented


"Graphics Device"... sigh.

Can you please try with the latest drivers to see if that improves things: 375.20 is out.
Please also post the gl_check output, we may be able to get more GPU information that way. (though loading opengl could also be a problem in itself..)

I'll try to downgrade mine.

For the record, here's what I get with my overpriced GTX 1070 (trick of the day: XPRA_LOG_FORMAT):

XPRA_LOG_FORMAT="" python ./xpra/codecs/nv_util.py 
NVidia driver version 375.10
NVENC license keys:
* version common: 0 key(s)
* version 7: 0 key(s)

1 card:
* 0
  - clock-info-graphics           : 961
  - clock-info-graphics-max       : 1987
  - clock-info-mem                : 4006
  - clock-info-mem-max            : 4004
  - clock-info-sm                 : 961
  - clock-info-sm-max             : 1987
  - fan-speed                     : 0
  - memory
    - free                        : 7338196992
    - total                       : 8507162624
    - used                        : 1168965632
  - name                          : GeForce GTX 1070
  - pci
    - bus                         : 1
    - busId                       : 0000:01:00.0
    - device                      : 0
    - domain                      : 0
    - pciDeviceId                 : 461443294
    - pciSubSystemId              : 0
  - pcie-link-generation          : 2
  - pcie-link-generation-max      : 2
  - pcie-link-width               : 16
  - pcie-link-width-max           : 16
  - power-state                   : 0
  - temperature                   : 57
  - uuid                          : GPU-5ae4275b-349c-124a-b4ac-072e50f886f2
  - vbios-version                 : 86.04.26.00.3E
XPRA_LOG_FORMAT="" ./xpra/codecs/cuda_common/cuda_context.py 
pycuda_info
CUDA initialization (this may take a few seconds)
CUDA 7.5.0 / PyCUDA 2016.1.2, found 1 device:
  + GeForce GTX 1070 @ 0000:01:00.0 (memory: 85% free, compute: 6.1)
* version                         : 2016.1.2
  - text                          : 2016.1.2
cuda_info
* driver
  - driver_version                : 8000
  - version                       : 7.5.0
preferences:
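
In the cuda_context output above, driver_version is 8000 while version is 7.5.0. Assuming driver_version is the raw integer reported by the CUDA driver API, which encodes versions as 1000 * major + 10 * minor, and that version is the CUDA release PyCUDA was built against, the two can be compared with a small helper (the function name is made up for illustration):

```python
def decode_cuda_version(value):
    """Split a CUDA integer version (1000 * major + 10 * minor) into (major, minor)."""
    major, rest = divmod(value, 1000)
    return major, rest // 10


print(decode_cuda_version(8000))
```

Under that reading, the driver supports CUDA 8.0 while the toolkit in use is 7.5, which is a combination the driver is normally expected to accept.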

@totaam
Collaborator Author

totaam commented Nov 22, 2016

2016-11-22 19:57:16: smo commented


Thanks for the trick of the day :) Video card name is showing up properly with the new driver.
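
The XPRA_LOG_FORMAT trick can be pictured as follows. This is a minimal sketch, assuming the variable is passed as the format string to Python's logging module and that the default resembles the timestamped lines seen earlier in this thread; the actual xpra code may differ, and `make_handler` is a hypothetical name:

```python
import logging
import os
import sys

# Assumption: default format mimicking the "date time message" lines above
DEFAULT_FORMAT = "%(asctime)s %(message)s"


def make_handler(stream=sys.stdout):
    """Build a stream handler whose format comes from XPRA_LOG_FORMAT."""
    fmt = os.environ.get("XPRA_LOG_FORMAT", DEFAULT_FORMAT)
    handler = logging.StreamHandler(stream)
    # an empty format string makes logging.Formatter fall back to "%(message)s"
    handler.setFormatter(logging.Formatter(fmt))
    return handler


logger = logging.getLogger("nv_util_demo")
logger.setLevel(logging.INFO)
logger.addHandler(make_handler())
logger.info("NVidia driver version 375.20")
```

With the variable set to an empty string, only the message is printed, which makes the output easier to paste into a ticket.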

XPRA_LOG_FORMAT="" python ./xpra/codecs/nv_util.py 
NVidia driver version 375.20
NVENC license keys:
* version common: 0 key(s)
* version 7: 0 key(s)

1 card:
* 0
  - clock-info-graphics           : 759
  - clock-info-graphics-max       : 1974
  - clock-info-mem                : 810
  - clock-info-mem-max            : 3504
  - clock-info-sm                 : 759
  - clock-info-sm-max             : 1974
  - fan-speed                     : 30
  - memory
    - free                        : 3770810368
    - total                       : 4267573248
    - used                        : 496762880
  - name                          : GeForce GTX 1050 Ti
  - pci
    - bus                         : 5
    - busId                       : 0000:05:00.0
    - device                      : 0
    - domain                      : 0
    - pciDeviceId                 : 478286046
    - pciSubSystemId              : 1649621058
  - pcie-link-generation          : 2
  - pcie-link-generation-max      : 2
  - pcie-link-width               : 16
  - pcie-link-width-max           : 16
  - power-state                   : 5
  - temperature                   : 32
  - uuid                          : GPU-f6898fc2-4cc7-0e8c-5b3e-02000396306e
  - vbios-version                 : 86.07.22.00.50
XPRA_LOG_FORMAT="" ./xpra/codecs/cuda_common/cuda_context.py
pycuda_info
CUDA initialization (this may take a few seconds)
CUDA 7.5.0 / PyCUDA 2016.1.2, found 1 device:
  + GeForce GTX 1050 Ti @ 0000:05:00.0 (memory: 87% free, compute: 6.1)
* version                         : 2016.1.2
  - text                          : 2016.1.2
cuda_info
* driver
  - driver_version                : 8000
  - version                       : 7.5.0
XPRA_LOG_FORMAT="" ./xpra/client/gl/gl_check.py 
OpenGL_accelerate module loaded


OpenGL properties:
* GLU.extensions                  : GLU_EXT_nurbs_tessellator GLU_EXT_object_space_tess 
* GLU.version                     : 1.3
* accelerate                      : 3.1.1a1
* display_mode                    : ALPHA, SINGLE
* extensions                      : GL_AMD_multi_draw_indirect, GL_AMD_seamless_cubemap_per_texture, GL_AMD_vertex_shader_viewport_index, GL_AMD_vertex_shader_layer, GL_ARB_arrays_of_arrays, GL_ARB_base_instance, GL_ARB_bindless_texture, GL_ARB_blend_func_extended, GL_ARB_buffer_storage, GL_ARB_clear_buffer_object, GL_ARB_clear_texture, GL_ARB_clip_control, GL_ARB_color_buffer_float, GL_ARB_compatibility, GL_ARB_compressed_texture_pixel_storage, GL_ARB_conservative_depth, GL_ARB_compute_shader, GL_ARB_compute_variable_group_size, GL_ARB_conditional_render_inverted, GL_ARB_copy_buffer, GL_ARB_copy_image, GL_ARB_cull_distance, GL_ARB_debug_output, GL_ARB_depth_buffer_float, GL_ARB_depth_clamp, GL_ARB_depth_texture, GL_ARB_derivative_control, GL_ARB_direct_state_access, GL_ARB_draw_buffers, GL_ARB_draw_buffers_blend, GL_ARB_draw_indirect, GL_ARB_draw_elements_base_vertex, GL_ARB_draw_instanced, GL_ARB_enhanced_layouts, GL_ARB_ES2_compatibility, GL_ARB_ES3_compatibility, GL_ARB_ES3_1_compatibility, GL_ARB_ES3_2_compatibility, GL_ARB_explicit_attrib_location, GL_ARB_explicit_uniform_location, GL_ARB_fragment_coord_conventions, GL_ARB_fragment_layer_viewport, GL_ARB_fragment_program, GL_ARB_fragment_program_shadow, GL_ARB_fragment_shader, GL_ARB_fragment_shader_interlock, GL_ARB_framebuffer_no_attachments, GL_ARB_framebuffer_object, GL_ARB_framebuffer_sRGB, GL_ARB_geometry_shader4, GL_ARB_get_program_binary, GL_ARB_get_texture_sub_image, GL_ARB_gl_spirv, GL_ARB_gpu_shader5, GL_ARB_gpu_shader_fp64, GL_ARB_gpu_shader_int64, GL_ARB_half_float_pixel, GL_ARB_half_float_vertex, GL_ARB_imaging, GL_ARB_indirect_parameters, GL_ARB_instanced_arrays, GL_ARB_internalformat_query, GL_ARB_internalformat_query2, GL_ARB_invalidate_subdata, GL_ARB_map_buffer_alignment, GL_ARB_map_buffer_range, GL_ARB_multi_bind, GL_ARB_multi_draw_indirect, GL_ARB_multisample, GL_ARB_multitexture, GL_ARB_occlusion_query, GL_ARB_occlusion_query2, GL_ARB_parallel_shader_compile, 
GL_ARB_pipeline_statistics_query, GL_ARB_pixel_buffer_object, GL_ARB_point_parameters, GL_ARB_point_sprite, GL_ARB_post_depth_coverage, GL_ARB_program_interface_query, GL_ARB_provoking_vertex, GL_ARB_query_buffer_object, GL_ARB_robust_buffer_access_behavior, GL_ARB_robustness, GL_ARB_sample_locations, GL_ARB_sample_shading, GL_ARB_sampler_objects, GL_ARB_seamless_cube_map, GL_ARB_seamless_cubemap_per_texture, GL_ARB_separate_shader_objects, GL_ARB_shader_atomic_counter_ops, GL_ARB_shader_atomic_counters, GL_ARB_shader_ballot, GL_ARB_shader_bit_encoding, GL_ARB_shader_clock, GL_ARB_shader_draw_parameters, GL_ARB_shader_group_vote, GL_ARB_shader_image_load_store, GL_ARB_shader_image_size, GL_ARB_shader_objects, GL_ARB_shader_precision, GL_ARB_shader_storage_buffer_object, GL_ARB_shader_subroutine, GL_ARB_shader_texture_image_samples, GL_ARB_shader_texture_lod, GL_ARB_shading_language_100, GL_ARB_shader_viewport_layer_array, GL_ARB_shading_language_420pack, GL_ARB_shading_language_include, GL_ARB_shading_language_packing, GL_ARB_shadow, GL_ARB_sparse_buffer, GL_ARB_sparse_texture, GL_ARB_sparse_texture2, GL_ARB_sparse_texture_clamp, GL_ARB_stencil_texturing, GL_ARB_sync, GL_ARB_tessellation_shader, GL_ARB_texture_barrier, GL_ARB_texture_border_clamp, GL_ARB_texture_buffer_object, GL_ARB_texture_buffer_object_rgb32, GL_ARB_texture_buffer_range, GL_ARB_texture_compression, GL_ARB_texture_compression_bptc, GL_ARB_texture_compression_rgtc, GL_ARB_texture_cube_map, GL_ARB_texture_cube_map_array, GL_ARB_texture_env_add, GL_ARB_texture_env_combine, GL_ARB_texture_env_crossbar, GL_ARB_texture_env_dot3, GL_ARB_texture_filter_minmax, GL_ARB_texture_float, GL_ARB_texture_gather, GL_ARB_texture_mirror_clamp_to_edge, GL_ARB_texture_mirrored_repeat, GL_ARB_texture_multisample, GL_ARB_texture_non_power_of_two, GL_ARB_texture_query_levels, GL_ARB_texture_query_lod, GL_ARB_texture_rectangle, GL_ARB_texture_rg, GL_ARB_texture_rgb10_a2ui, GL_ARB_texture_stencil8, GL_ARB_texture_storage, 
GL_ARB_texture_storage_multisample, GL_ARB_texture_swizzle, GL_ARB_texture_view, GL_ARB_timer_query, GL_ARB_transform_feedback2, GL_ARB_transform_feedback3, GL_ARB_transform_feedback_instanced, GL_ARB_transform_feedback_overflow_query, GL_ARB_transpose_matrix, GL_ARB_uniform_buffer_object, GL_ARB_vertex_array_bgra, GL_ARB_vertex_array_object, GL_ARB_vertex_attrib_64bit, GL_ARB_vertex_attrib_binding, GL_ARB_vertex_buffer_object, GL_ARB_vertex_program, GL_ARB_vertex_shader, GL_ARB_vertex_type_10f_11f_11f_rev, GL_ARB_vertex_type_2_10_10_10_rev, GL_ARB_viewport_array, GL_ARB_window_pos, GL_ATI_draw_buffers, GL_ATI_texture_float, GL_ATI_texture_mirror_once, GL_S3_s3tc, GL_EXT_texture_env_add, GL_EXT_abgr, GL_EXT_bgra, GL_EXT_bindable_uniform, GL_EXT_blend_color, GL_EXT_blend_equation_separate, GL_EXT_blend_func_separate, GL_EXT_blend_minmax, GL_EXT_blend_subtract, GL_EXT_compiled_vertex_array, GL_EXT_Cg_shader, GL_EXT_depth_bounds_test, GL_EXT_direct_state_access, GL_EXT_draw_buffers2, GL_EXT_draw_instanced, GL_EXT_draw_range_elements, GL_EXT_fog_coord, GL_EXT_framebuffer_blit, GL_EXT_framebuffer_multisample, GL_EXTX_framebuffer_mixed_formats, GL_EXT_framebuffer_multisample_blit_scaled, GL_EXT_framebuffer_object, GL_EXT_framebuffer_sRGB, GL_EXT_geometry_shader4, GL_EXT_gpu_program_parameters, GL_EXT_gpu_shader4, GL_EXT_multi_draw_arrays, GL_EXT_packed_depth_stencil, GL_EXT_packed_float, GL_EXT_packed_pixels, GL_EXT_pixel_buffer_object, GL_EXT_point_parameters, GL_EXT_polygon_offset_clamp, GL_EXT_post_depth_coverage, GL_EXT_provoking_vertex, GL_EXT_raster_multisample, GL_EXT_rescale_normal, GL_EXT_secondary_color, GL_EXT_separate_shader_objects, GL_EXT_separate_specular_color, GL_EXT_shader_image_load_formatted, GL_EXT_shader_image_load_store, GL_EXT_shader_integer_mix, GL_EXT_shadow_funcs, GL_EXT_sparse_texture2, GL_EXT_stencil_two_side, GL_EXT_stencil_wrap, GL_EXT_texture3D, GL_EXT_texture_array, GL_EXT_texture_buffer_object, GL_EXT_texture_compression_dxt1, 
GL_EXT_texture_compression_latc, GL_EXT_texture_compression_rgtc, GL_EXT_texture_compression_s3tc, GL_EXT_texture_cube_map, GL_EXT_texture_edge_clamp, GL_EXT_texture_env_combine, GL_EXT_texture_env_dot3, GL_EXT_texture_filter_anisotropic, GL_EXT_texture_filter_minmax, GL_EXT_texture_integer, GL_EXT_texture_lod, GL_EXT_texture_lod_bias, GL_EXT_texture_mirror_clamp, GL_EXT_texture_object, GL_EXT_texture_shared_exponent, GL_EXT_texture_sRGB, GL_EXT_texture_sRGB_decode, GL_EXT_texture_storage, GL_EXT_texture_swizzle, GL_EXT_timer_query, GL_EXT_transform_feedback2, GL_EXT_vertex_array, GL_EXT_vertex_array_bgra, GL_EXT_vertex_attrib_64bit, GL_EXT_x11_sync_object, GL_EXT_import_sync_object, GL_NV_robustness_video_memory_purge, GL_IBM_rasterpos_clip, GL_IBM_texture_mirrored_repeat, GL_KHR_context_flush_control, GL_KHR_debug, GL_KHR_no_error, GL_KHR_robust_buffer_access_behavior, GL_KHR_robustness, GL_KTX_buffer_region, GL_NV_alpha_to_coverage_dither_control, GL_NV_bindless_multi_draw_indirect, GL_NV_bindless_multi_draw_indirect_count, GL_NV_bindless_texture, GL_NV_blend_equation_advanced, GL_NV_blend_equation_advanced_coherent, GL_NVX_blend_equation_advanced_multi_draw_buffers, GL_NV_blend_square, GL_NV_clip_space_w_scaling, GL_NV_command_list, GL_NV_compute_program5, GL_NV_conditional_render, GL_NV_conservative_raster, GL_NV_conservative_raster_dilate, GL_NV_conservative_raster_pre_snap_triangles, GL_NV_copy_depth_to_color, GL_NV_copy_image, GL_NV_depth_buffer_float, GL_NV_depth_clamp, GL_NV_draw_texture, GL_NV_draw_vulkan_image, GL_NV_ES1_1_compatibility, GL_NV_ES3_1_compatibility, GL_NV_explicit_multisample, GL_NV_fence, GL_NV_fill_rectangle, GL_NV_float_buffer, GL_NV_fog_distance, GL_NV_fragment_coverage_to_color, GL_NV_fragment_program, GL_NV_fragment_program_option, GL_NV_fragment_program2, GL_NV_fragment_shader_interlock, GL_NV_framebuffer_mixed_samples, GL_NV_framebuffer_multisample_coverage, GL_NV_geometry_shader4, GL_NV_geometry_shader_passthrough, 
GL_NV_gpu_program4, GL_NV_internalformat_sample_query, GL_NV_gpu_program4_1, GL_NV_gpu_program5, GL_NV_gpu_program5_mem_extended, GL_NV_gpu_program_fp64, GL_NV_gpu_shader5, GL_NV_half_float, GL_NV_light_max_exponent, GL_NV_multisample_coverage, GL_NV_multisample_filter_hint, GL_NV_occlusion_query, GL_NV_packed_depth_stencil, GL_NV_parameter_buffer_object, GL_NV_parameter_buffer_object2, GL_NV_path_rendering, GL_NV_path_rendering_shared_edge, GL_NV_pixel_data_range, GL_NV_point_sprite, GL_NV_primitive_restart, GL_NV_register_combiners, GL_NV_register_combiners2, GL_NV_sample_locations, GL_NV_sample_mask_override_coverage, GL_NV_shader_atomic_counters, GL_NV_shader_atomic_float, GL_NV_shader_atomic_float64, GL_NV_shader_atomic_fp16_vector, GL_NV_shader_atomic_int64, GL_NV_shader_buffer_load, GL_NV_shader_storage_buffer_object, GL_NV_stereo_view_rendering, GL_NV_texgen_reflection, GL_NV_texture_barrier, GL_NV_texture_compression_vtc, GL_NV_texture_env_combine4, GL_NV_texture_multisample, GL_NV_texture_rectangle, GL_NV_texture_shader, GL_NV_texture_shader2, GL_NV_texture_shader3, GL_NV_transform_feedback, GL_NV_transform_feedback2, GL_NV_uniform_buffer_unified_memory, GL_NV_vdpau_interop, GL_NV_vertex_array_range, GL_NV_vertex_array_range2, GL_NV_vertex_attrib_integer_64bit, GL_NV_vertex_buffer_unified_memory, GL_NV_vertex_program, GL_NV_vertex_program1_1, GL_NV_vertex_program2, GL_NV_vertex_program2_option, GL_NV_vertex_program3, GL_NV_viewport_array2, GL_NV_viewport_swizzle, GL_NVX_conditional_render, GL_NVX_gpu_memory_info, GL_NVX_nvenc_interop, GL_NV_shader_thread_group, GL_NV_shader_thread_shuffle, GL_KHR_blend_equation_advanced, GL_KHR_blend_equation_advanced_coherent, GL_SGIS_generate_mipmap, GL_SGIS_texture_lod, GL_SGIX_depth_texture, GL_SGIX_shadow, GL_SUN_slice_accum, 
* gdkgl
  - version                       : 1.4
* gdkglext
  - version                       : 1.2.0
* glconfig                        : <gtk.gdkgl.Config object at 0x7f9095059c30 (GdkGLConfigImplX11 at 0x55ab518c1920)>
* gtkglext
  - version                       : 1.2.0
* has_alpha                       : True
* max-viewport-dims               : (32768, 32768)
* opengl                          : 4, 5
* pygdkglext
  - version                       : 1.1.0
* pyopengl                        : 3.1.1a1
* renderer                        : GeForce GTX 1050 Ti/PCIe/SSE2
* rgba                            : True
* safe                            : True
* shading-language-version        : 4.50 NVIDIA
* texture-size-limit              : 32768
* transparency                    : True
* vendor                          : NVIDIA Corporation
* zerocopy                        : True

@totaam
Collaborator Author

totaam commented Nov 22, 2016

2016-11-22 20:02:38: smo commented


Pasting this before I ctrl+c my client because I think it will lock up my machine.


PYTHONPATH=. ./tests/xpra/codecs/test_nvenc7.py 
2016-11-22 12:00:28,143 CUDA initialization (this may take a few seconds)
2016-11-22 12:00:28,388 CUDA 7.5.0 / PyCUDA 2016.1.2, found 1 device:
2016-11-22 12:00:28,388   + GeForce GTX 1050 Ti @ 0000:05:00.0 (memory: 86% free, compute: 6.1)
2016-11-22 12:00:28,498 NVidia driver version 375.20
2016-11-22 12:00:30,496 NVENC successfully initialized
creating sample data for size 4096

Traceback (most recent call last):
  File "./tests/xpra/codecs/test_nvenc7.py", line 23, in <module>
    main()
  File "./tests/xpra/codecs/test_nvenc7.py", line 12, in main
    test_nvenc.test_encode_one()
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_nvenc.py", line 30, in test_encode_one
    test_encoder(encoder_module)
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_encoder.py", line 94, in test_encoder
    do_test_encoder(e, src_format, actual_w, actual_h, images, log=log, after_encode_cb=after_encode_cb)
  File "/home/cosmo/work/Xpra/trunk/src/tmp/tests/xpra/codecs/test_encoder.py", line 120, in do_test_encoder
    c = encoder.compress_image(image)
  File "xpra/codecs/nvenc7/encoder.pyx", line 2035, in xpra.codecs.nvenc7.encoder.Encoder.compress_image (xpra/codecs/nvenc7/encoder.c:24541)
  File "xpra/codecs/nvenc7/encoder.pyx", line 2213, in xpra.codecs.nvenc7.encoder.Encoder.do_compress_image (xpra/codecs/nvenc7/encoder.c:27908)
  File "xpra/codecs/nvenc7/encoder.pyx", line 1317, in xpra.codecs.nvenc7.encoder.raiseNVENC (xpra/codecs/nvenc7/encoder.c:9269)
xpra.codecs.nvenc7.encoder.NVENCException: locking output buffer - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.

@totaam
Collaborator Author

totaam commented Nov 22, 2016

2016-11-22 20:04:43: smo commented


Yes, this did lock up my workstation.

@totaam
Collaborator Author

totaam commented Nov 23, 2016

2016-11-23 09:48:44: antoine commented


r14473 + r14472 require newer drivers (375.x or later), so that we can be sure to detect the newer cards, and then blacklist the 10xx ones.

At some later point, we can relax this check when we figure out what works and what doesn't...
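
The two checks described above can be sketched as follows. This is an illustration only, not the code from r14472 / r14473: the minimum version tuple, the name pattern and the function names are all assumptions made for the example.

```python
import re

# "375.x or later", per the comment above
MIN_DRIVER_VERSION = (375, 0)
# Hypothetical pattern for the "10xx" cards; not the list used by xpra
BLACKLISTED_NAMES = re.compile(r"GTX 10\d\d")


def parse_driver_version(text):
    """Turn a version string such as '375.20' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def device_allowed(driver_version, device_name):
    """Return (allowed, reason) for the given driver version and device name."""
    if parse_driver_version(driver_version) < MIN_DRIVER_VERSION:
        # older drivers may report a generic name, so the blacklist cannot be trusted
        return False, "driver too old to identify the card reliably"
    if BLACKLISTED_NAMES.search(device_name):
        return False, "device is blacklisted"
    return True, "ok"


print(device_allowed("370.28", "Graphics Device"))
print(device_allowed("375.20", "GeForce GTX 1050 Ti"))
```

Requiring the newer driver first matters because a name-based blacklist only works once the driver reports the real card name rather than a generic one.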

@totaam
Collaborator Author

totaam commented Feb 7, 2017

2017-02-07 00:21:59: smo commented


Tested today with 1.0.2

Warning: device 'GeForce GTX 1050 Ti @ 0000:05:00.0' is blacklisted and will not be used
NVidia driver version 375.26

@totaam
Collaborator Author

totaam commented Feb 7, 2017

2017-02-07 00:22:05: smo changed status from new to closed

@totaam
Collaborator Author

totaam commented Feb 7, 2017

2017-02-07 00:22:05: smo set resolution to fixed

@totaam totaam closed this as completed Feb 7, 2017
@totaam
Collaborator Author

totaam commented Jun 18, 2017

2017-06-18 14:51:17: antoine commented


Follow up in #1550: other cards get the API error now, fortunately without the lockups.
NVENC v8 support in #1552.

@totaam
Collaborator Author

totaam commented Apr 28, 2018

2018-04-28 13:29:08: antoine commented


For nvenc v8 support see #1823
