NVENC7 active but not used? #1550

Closed
totaam opened this issue Jun 15, 2017 · 17 comments

@totaam
Collaborator

totaam commented Jun 15, 2017

Issue migrated from trac ticket #1550

component: encodings | priority: major | resolution: fixed | keywords: nvenc

2017-06-15 17:01:51: DocMAX created the issue


I did everything as described on the website to make NVENC7 work. (NVENC8 does not seem to be supported yet.)
NVENC appears to be up and running, but hardware acceleration does not seem to be used: the CPU is at 100% on the server and the framerate on the client is about 10fps.

Maybe I need license keys? But wouldn't I get a notice about this when xpra starts?

Versions Client + Server:

xpra v2.1 (svn)
Linux game 4.11.4-1-ARCH #1 SMP PREEMPT Fri Jun 9 07:46:48 CEST 2017 x86_64 GNU/Linux
GeForce GTX 970

Cmdline Server:

xpra start :100 --auth=none --daemon=no --video-encoders=nvenc

Cmdline Client:

xpra attach ssh:game:100

Diags Server:

video.encoding.video-encoder.ffmpeg=disabled
video.encoding.video-encoder.nvenc=active
video.encoding.video-encoder.vpx=disabled
video.encoding.video-encoder.x264=disabled
video.encoding.video-encoder.x265=disabled
2017-06-15 17:50:23,872 NVidia driver version 381.22
2017-06-15 17:50:23,872 NVENC license keys:
2017-06-15 17:50:23,880 * version common: 0 key(s)
2017-06-15 17:50:23,880 * version 7: 0 key(s)
Jun 15 17:54:51 game xpra[14208]: X.Org X Server 1.19.3
Jun 15 17:54:51 game xpra[14208]: Release Date: 2017-03-15
Jun 15 17:54:51 game xpra[14208]: X Protocol Version 11, Revision 0
Jun 15 17:54:51 game xpra[14208]: Build Operating System: Linux 4.9.11-1-ARCH x86_64
Jun 15 17:54:51 game xpra[14208]: Current Operating System: Linux game 4.11.4-1-ARCH #1 SMP PREEMPT Fri Jun 9 07:46:48 CEST 2017 x86_64
Jun 15 17:54:51 game xpra[14208]: Kernel command line: initrd=\kernel\arch\initramfs-linux.img root=LABEL=arch rw intel_iommu=on loglevel=3 modprobe.blacklist=nouveau
Jun 15 17:54:51 game xpra[14208]: Build Date: 07 April 2017  05:42:48PM
Jun 15 17:54:51 game xpra[14208]:  
Jun 15 17:54:51 game xpra[14208]: Current version of pixman: 0.34.0
Jun 15 17:54:51 game xpra[14208]:         Before reporting problems, check http://wiki.x.org
Jun 15 17:54:51 game xpra[14208]:         to make sure that you have the latest version.
Jun 15 17:54:51 game xpra[14208]: Markers: (--) probed, (**) from config file, (==) default setting,
Jun 15 17:54:51 game xpra[14208]:         (++) from command line, (!!) notice, (II) informational,
Jun 15 17:54:51 game xpra[14208]:         (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
Jun 15 17:54:51 game xpra[14208]: (++) Log file: "/var/run/user/1000/xpra/Xorg.:100.log", Time: Thu Jun 15 17:54:51 2017
Jun 15 17:54:51 game xpra[14208]: (++) Using config file: "/etc/xpra/xorg.conf"
Jun 15 17:54:51 game xpra[14208]: (==) Using system config directory "/usr/share/X11/xorg.conf.d"
Jun 15 17:54:51 game xpra[14208]: Warning: some of the sockets are in an unknown state:
Jun 15 17:54:51 game xpra[14208]:  /run/user/1000/xpra/game-100
Jun 15 17:54:51 game xpra[14208]:  /var/run/xpra/game-100
Jun 15 17:54:51 game xpra[14208]:  please wait as we allow the socket probing to timeout
Jun 15 17:54:57 game xpra[14208]: created unix domain socket: /run/user/1000/xpra/game-100
Jun 15 17:54:57 game xpra[14208]: created unix domain socket: /var/run/xpra/game-100
Jun 15 17:54:58 game xpra[14208]: Warning: webcam forwarding is disabled
Jun 15 17:54:58 game xpra[14208]:  the virtual video directory '/sys/devices/virtual/video4linux' was not found
Jun 15 17:54:58 game xpra[14208]:  make sure that the 'v4l2loopback' kernel module is installed and loaded
Jun 15 17:54:58 game xpra[14208]: found 0 virtual video devices for webcam forwarding
Jun 15 17:54:58 game xpra[14208]: pulseaudio server started with pid 14284
Jun 15 17:54:58 game xpra[14208]: GStreamer version 1.12.0 for Python 2.7.13 64-bit
Jun 15 17:54:58 game xpra[14208]: D-Bus notification forwarding is available
Jun 15 17:54:58 game xpra[14208]: xpra X11 version 2.1 64-bit
Jun 15 17:54:58 game xpra[14208]:  uid=1000 (docmax), gid=100 (users)
Jun 15 17:54:58 game xpra[14208]:  running with pid 14208 on Linux
Jun 15 17:54:58 game xpra[14208]:  connected to X11 display :100 with 24 bit colors
Jun 15 17:54:58 game xpra[14208]: xpra is ready.
Jun 15 17:54:59 game xpra[14208]: printer forwarding enabled using postscript and pdf
Jun 15 17:54:59 game xpra[14208]: 11.8GB of system memory
@totaam
Collaborator Author

totaam commented Jun 15, 2017

2017-06-15 18:04:46: DocMAX commented


Looking at the -d loader output, it looks like the pycuda module is missing, simply because on Arch it doesn't work with cuda-7.5. Will investigate further.

@totaam
Collaborator Author

totaam commented Jun 15, 2017

2017-06-15 22:06:30: DocMAX commented


NVENC is now initialized successfully.
But my client is still way too slow!
What's the problem here?

@totaam
Collaborator Author

totaam commented Jun 16, 2017

2017-06-16 00:21:59: DocMAX commented


Checked the log again... Do I really need a license key, or is it a bug? How do I know?


2017-06-16 01:21:32,987 pycuda_info
2017-06-16 01:21:32,987 CUDA initialization (this may take a few seconds)
2017-06-16 01:21:33,093 CUDA 8.0.0 / PyCUDA 2017.1, found 1 device:
2017-06-16 01:21:33,093   + GeForce GTX 970 @ 0000:01:00.0 (memory: 93% free, compute: 5.2)
2017-06-16 01:21:33,122 * version                         : 2017.1
2017-06-16 01:21:33,122   - text                          : 2017.1
2017-06-16 01:21:33,122 cuda_info
2017-06-16 01:21:33,122 * driver
2017-06-16 01:21:33,122   - driver_version                : 8000
2017-06-16 01:21:33,122   - version                       : 8.0.0
2017-06-16 01:21:33,122 preferences:
2017-06-16 01:21:33,122 * blacklist                       : GTX 10

2017-06-16 01:17:36,174 init_cuda failed
Traceback (most recent call last):
  File "xpra/codecs/nvenc7/encoder.pyx", line 1492, in xpra.codecs.nvenc7.encoder.Encoder.init_context (xpra/codecs/nvenc7/encoder.c:12593)
  File "xpra/codecs/nvenc7/encoder.pyx", line 1657, in xpra.codecs.nvenc7.encoder.Encoder.init_nvenc (xpra/codecs/nvenc7/encoder.c:17379)
  File "xpra/codecs/nvenc7/encoder.pyx", line 1672, in xpra.codecs.nvenc7.encoder.Encoder.init_encoder (xpra/codecs/nvenc7/encoder.c:17676)
  File "xpra/codecs/nvenc7/encoder.pyx", line 1335, in xpra.codecs.nvenc7.encoder.raiseNVENC (xpra/codecs/nvenc7/encoder.c:9505)
NVENCException: initializing encoder - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.
2017-06-16 01:17:36,174 encoder nvenc(BGRA/BGRX/H264 - low-latency-hq - 1920x1080) failed: initializing encoder - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.
2017-06-16 01:17:36,174 error during NVENC encoder test: initializing encoder - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.
2017-06-16 01:17:36,174  a license key may be required
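
When scanning long logs for this failure, it can help to pull out the failing step and the numeric return code. The following is only a minimal sketch: `parse_nvenc_error` and the regular expression are hypothetical helpers written for this thread, not part of xpra. As far as I know, code 8 corresponds to NV_ENC_ERR_INVALID_PARAM in the NVENC header, but treat that mapping as an assumption and check your SDK version.

```python
import re

# Matches "<step> - returned <code>: <message>" as seen in the NVENC error lines above.
# The pattern is an assumption based on those log lines only.
NVENC_ERROR_RE = re.compile(r"(?P<step>[\w ]+?) - returned (?P<code>\d+): (?P<message>.+)")


def parse_nvenc_error(line):
    """Return (step, code, message) for an NVENC error log line, or None if no match."""
    match = NVENC_ERROR_RE.search(line)
    if match is None:
        return None
    return (
        match.group("step").strip(),
        int(match.group("code")),
        match.group("message").strip(),
    )


line = (
    "2017-06-16 01:17:36,174 error during NVENC encoder test: initializing encoder"
    " - returned 8: This indicates that one or more of the parameter passed to the"
    " API call is invalid."
)
print(parse_nvenc_error(line))
```

Lines that do not contain a "returned N" marker simply yield None, so the helper can be applied to every line of a log.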

@totaam
Collaborator Author

totaam commented Jun 16, 2017

Looks to me like this bug: #1260 comment 16, we blacklisted GTX 10x0 cards - looks like this now affects other cards. Using an older / newer driver version may help.

(or maybe you do need a license key - unlikely, I think you should always get 2 contexts on consumer cards)
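
To illustrate why the GTX 970 is not caught by the existing blacklist, here is a minimal sketch of substring matching against the "GTX 10" entry shown in the preferences log above. The matching rule (case-insensitive substring) and the helper name `is_blacklisted` are assumptions made for illustration; this is not xpra's actual code.

```python
def is_blacklisted(device_name, blacklist):
    """Return True if any non-empty blacklist entry occurs in the device name.

    Case-insensitive substring matching is an assumption, not xpra's documented rule.
    """
    name = device_name.lower()
    return any(entry.lower() in name for entry in blacklist if entry)


# Value taken from the "blacklist" line in the log output earlier in this ticket.
BLACKLIST = ["GTX 10"]

for name in ("GeForce GTX 970", "GeForce GTX 1070", "Quadro M2000"):
    status = "blacklisted" if is_blacklisted(name, BLACKLIST) else "allowed"
    print(name, "->", status)
```

Under this rule only the GTX 10x0 name matches, so the GTX 970 and the Quadro M2000 would still go through the encoder test and hit the failure.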

@totaam
Collaborator Author

totaam commented Jun 16, 2017

Oh, and btw, "xpra v2.1 (svn)" is not a full version number: always include the full version with the exact svn revision.

@totaam
Collaborator Author

totaam commented Jun 16, 2017

2017-06-16 09:00:16: DocMAX commented


Well, it's the result of xpra --version. I don't know where else I can see the build number.

@totaam
Collaborator Author

totaam commented Jun 16, 2017

Well, it's the result of xpra --version. I don't know where else I can see the build number.

When building from an svn checkout, the svn revision should be included automatically (it is captured from the output of svnversion where the code is built).
When using packages, the svn revision should already be included, and in any case it is part of the package filename.

@totaam
Collaborator Author

totaam commented Jun 18, 2017

NVENC SDK8 support added in #1552 - this makes no difference, as the new version adds almost nothing.

Will test using my GTX 970 when I get back.

@totaam
Collaborator Author

totaam commented Jun 22, 2017

2017-06-22 04:09:51: DocMAX commented


Any updates?
I'm stuck with:

error during NVENC encoder test: initializing encoder - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.

@totaam
Collaborator Author

totaam commented Jun 22, 2017

2017-06-22 04:14:41: DocMAX commented


Oh, and I'm at r16116.

@totaam
Collaborator Author

totaam commented Jul 16, 2017

2017-07-16 12:56:20: kumofly commented


xpra 16362: the problem still exists, tested with an Nvidia Quadro M2000 (driver version 381.22) and nvenc v8. The H265 codec initializes successfully, but H264 initialization fails.


2017-07-16 14:44:45,201 get_preset(H264) speed=100, quality=50, lossless=False, pixel_format=BGRX, options={160: [('hq', '34DBA71D-A77B-4B8F-9C3E-B6
2017-07-16 14:44:45,201 using preset 'low-latency-hq' for speed=100, quality=50, lossless=0, pixel_format=BGRX
2017-07-16 14:44:45,201 init_params(H264) using preset=low-latency-hq
2017-07-16 14:44:45,201 9 input format types:
2017-07-16 14:44:45,201 * 0x1
2017-07-16 14:44:45,201  + 0x1 : NV12_PL
2017-07-16 14:44:45,201 * 0x10
2017-07-16 14:44:45,201  + 0x10 : YV12_PL
2017-07-16 14:44:45,201 * 0x100
2017-07-16 14:44:45,201  + 0x100 : IYUV_PL
2017-07-16 14:44:45,201 * 0x1000
2017-07-16 14:44:45,202  + 0x1000 : YUV444_PL
2017-07-16 14:44:45,202 * 0x1000000
2017-07-16 14:44:45,202  + 0x1000000 : ARGB
2017-07-16 14:44:45,202 * 0x10000000
2017-07-16 14:44:45,202  + 0x10000000 : ABGR
2017-07-16 14:44:45,202 * 0x4000000
2017-07-16 14:44:45,202  + 0x4000000 : AYUV
2017-07-16 14:44:45,202 * 0x2000000
2017-07-16 14:44:45,202  + 0x2000000 : ARGB10
2017-07-16 14:44:45,202 * 0x20000000
2017-07-16 14:44:45,202  + 0x20000000 : ABGR10
2017-07-16 14:44:45,208 init_cuda failed
Traceback (most recent call last):
  File "xpra/codecs/nvenc/encoder.pyx", line 1510, in xpra.codecs.nvenc.encoder.Encoder.init_context (xpra/codecs/nvenc/encoder.c:12595)
  File "xpra/codecs/nvenc/encoder.pyx", line 1675, in xpra.codecs.nvenc.encoder.Encoder.init_nvenc (xpra/codecs/nvenc/encoder.c:17376)
  File "xpra/codecs/nvenc/encoder.pyx", line 1690, in xpra.codecs.nvenc.encoder.Encoder.init_encoder (xpra/codecs/nvenc/encoder.c:17673)
  File "xpra/codecs/nvenc/encoder.pyx", line 1353, in xpra.codecs.nvenc.encoder.raiseNVENC (xpra/codecs/nvenc/encoder.c:9507)
NVENCException: initializing encoder - returned 8: This indicates that one or more of the parameter passed to the API call is invalid.
2017-07-16 14:44:45,209 encoder nvenc(BGRA/BGRX/H264 - low-latency-hq - 1920x1080) failed: initializing encoder - returned 8: This indicates that on
2017-07-16 14:44:45,209 error during NVENC encoder test: initializing encoder - returned 8: This indicates that one or more of the parameter passed
2017-07-16 14:44:45,209  a license key may be required

@totaam
Collaborator Author

totaam commented Jul 17, 2017

2017-07-17 14:24:44: antoine commented


I still get complete system lockups with my GTX 1070.
It's been broken for months, time to get on it.
Other NVENC tickets we should close for 2.1: #1519, #1347, #1317.


Looking at the cuda example that comes with nvenc 8, this is how they do it now in pseudo-code:

  • InitCuda
      • cuDeviceGet
      • cuCtxCreate
      • cuModuleLoadDataEx
      • cuModuleGetFunction "InterleaveUV"
      • cuCtxPopCurrent
  • AllocateIOBuffers:
      • cuMemAlloc * 2
      • cuMemAllocHost * 3 (input buffers)
      • then for each encode buffer:
          • cuMemAllocPitch
          • NvEncRegisterResource
          • NvEncCreateBitstreamBuffer
  • ReleaseIOBuffers:
      • cuMemFree
      • cuMemFreeHost
      • for each encode buffer:
          • NvEncUnregisterResource
          • NvEncDestroyBitstreamBuffer
  • FlushEncoder:
      • NvEncFlushEncoderQueue
      • NvEncUnmapInputResource
      • wait for any pending buffers using ProcessOutput
  • Deinitialize
      • NvEncDestroyEncoder
  • ConvertYUVToNV12:
      • cuMemcpyHtoD
      • cuLaunchKernel
  • EncodeMain:
      • InitCuda
      • GetPresetGUID
      • AllocateIOBuffers
      • for each frame:
          • load the frame's YUV pixel data from file
          • ConvertYUVToNV12
          • NvEncMapInputResource
          • NvEncEncodeFrame
      • FlushEncoder
      • Deinitialize
  • ProcessOutput:
      • nvEncLockBitstream
      • nvEncUnlockBitstream

The only major differences that I can see:

  • they use multiple input buffers and if there isn't one available, they wait for one to become free via ProcessOutput + NvEncUnmapInputResource
  • maybe somehow we're not using a low-latency preset and so we block waiting for a frame that never comes since we feed them one at a time

The low latency example sets these options:

    encodeConfig.endFrameIdx = INT_MAX;
    encodeConfig.bitrate = 5000000;
    encodeConfig.rcMode = NV_ENC_PARAMS_RC_2_PASS_QUALITY;
    encodeConfig.gopLength = NVENC_INFINITE_GOPLENGTH;
    encodeConfig.deviceType = 0;
    encodeConfig.codec = NV_ENC_H264;
    encodeConfig.fps = 30;
    encodeConfig.qp = 28;
    encodeConfig.i_quant_factor = DEFAULT_I_QFACTOR;
    encodeConfig.b_quant_factor = DEFAULT_B_QFACTOR;  
    encodeConfig.i_quant_offset = DEFAULT_I_QOFFSET;
    encodeConfig.b_quant_offset = DEFAULT_B_QOFFSET; 
    encodeConfig.presetGUID = NV_ENC_PRESET_LOW_LATENCY_HQ_GUID;
    encodeConfig.pictureStruct = NV_ENC_PIC_STRUCT_FRAME;
    encodeConfig.numB = 0;
    m_stCreateEncodeParams.encodeGUID = inputCodecGUID;
    m_stCreateEncodeParams.presetGUID = pEncCfg->presetGUID;
    m_stCreateEncodeParams.encodeWidth = pEncCfg->width;
    m_stCreateEncodeParams.encodeHeight = pEncCfg->height;

    m_stCreateEncodeParams.darWidth = pEncCfg->width;
    m_stCreateEncodeParams.darHeight = pEncCfg->height;
    m_stCreateEncodeParams.frameRateNum = pEncCfg->fps;
    m_stCreateEncodeParams.frameRateDen = 1;
    m_stCreateEncodeParams.enableEncodeAsync = 0;

    m_stCreateEncodeParams.enablePTD = 1;
    m_stCreateEncodeParams.reportSliceOffsets = 0;
    m_stCreateEncodeParams.enableSubFrameWrite = 0;
    m_stCreateEncodeParams.encodeConfig = &m_stEncodeConfig;
    m_stCreateEncodeParams.maxEncodeWidth = m_uMaxWidth;
    m_stCreateEncodeParams.maxEncodeHeight = m_uMaxHeight;
    m_stEncodeConfig.gopLength = pEncCfg->gopLength;
    m_stEncodeConfig.frameIntervalP = pEncCfg->numB + 1;
        m_stEncodeConfig.frameFieldMode = NV_ENC_PARAMS_FRAME_FIELD_MODE_FRAME;

For YUV444:

            m_stEncodeConfig.encodeCodecConfig.hevcConfig.chromaFormatIDC = 3;
            // or:
            m_stEncodeConfig.encodeCodecConfig.h264Config.chromaFormatIDC = 3;

For 10 bit input:

            m_stEncodeConfig.encodeCodecConfig.h264Config.chromaFormatIDC = 3;

etc..

Modifying the cuda example to print all method calls, I see:

$ ./NvEncoderCudaInterop -i /opt/Shared/Xpra-Build-Libs/nvenc_4.0.0_sdk/Samples/YUV/1080p/PixelBlur-1920x1080.yuv  -o test -size 1920 1080 -numB 0
Encoding input           : "/opt/Shared/Xpra-Build-Libs/nvenc_4.0.0_sdk/Samples/YUV/1080p/PixelBlur-1920x1080.yuv"
         output          : "test"
         codec           : "H264"
         size            : 1920x1080
         bitrate         : 5000000 bits/sec
         vbvMaxBitrate   : 0 bits/sec
         vbvSize         : 0 bits
         fps             : 30 frames/sec
         rcMode          : CONSTQP
         goplength       : INFINITE GOP 
         B frames        : 0 
         QP              : 28 
         preset          : DEFAULT

BufferCount : 1 
AsyncMode   : 0 
AllocateIOBuffers
loadframe
ConvertYUVToNV12
NvEncMapInputResource
NvEncEncodeFrame
loadframe
no encode buffer, calling ProcessOutput
ProcessOutput: nvEncLockBitstream
ProcessOutput: nvEncUnlockBitstream
NvEncUnmapInputResource
ConvertYUVToNV12
NvEncMapInputResource
NvEncEncodeFrame
loadframe
no encode buffer, calling ProcessOutput
ProcessOutput: nvEncLockBitstream
ProcessOutput: nvEncUnlockBitstream
NvEncUnmapInputResource
ConvertYUVToNV12
NvEncMapInputResource
NvEncEncodeFrame
loadframe

etc...

loadframe
ProcessOutput: nvEncLockBitstream
ProcessOutput: nvEncUnlockBitstream
Encoded 116 frames in 293.04ms
Avergage Encode Time :   2.53ms
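
The call order in the trace above (BufferCount 1, synchronous mode) can be modeled with a small simulation. This is only a sketch under assumptions: it manipulates strings and a buffer pool, calls no CUDA or NVENC API, and `encode_trace` is a hypothetical name. The flush step at the end is simplified compared with the sample.

```python
from collections import deque


def encode_trace(num_frames, buffer_count=1):
    """Return the simulated call sequence of a synchronous encode loop.

    A frame can only be submitted when a buffer is free; otherwise the oldest
    pending output is processed first and its input buffer is released.
    """
    trace = ["AllocateIOBuffers"]
    free = deque(range(buffer_count))  # buffers ready to accept a frame
    busy = deque()                     # buffers with an encode still pending

    def process_output():
        # Retrieve the oldest pending bitstream, then unmap its input buffer.
        buf = busy.popleft()
        trace.append("ProcessOutput: nvEncLockBitstream")
        trace.append("ProcessOutput: nvEncUnlockBitstream")
        trace.append("NvEncUnmapInputResource")
        free.append(buf)

    for _ in range(num_frames):
        trace.append("loadframe")
        if not free:
            trace.append("no encode buffer, calling ProcessOutput")
            process_output()
        buf = free.popleft()
        trace.append("ConvertYUVToNV12")
        trace.append("NvEncMapInputResource")
        trace.append("NvEncEncodeFrame")
        busy.append(buf)

    # Simplified flush: drain whatever is still pending.
    trace.append("NvEncFlushEncoderQueue")
    while busy:
        process_output()
    return trace


for call in encode_trace(2):
    print(call)
```

With a single buffer, every frame after the first has to wait for ProcessOutput, which is the pattern seen in the trace. With more buffers than frames in flight, the "no encode buffer" branch is never taken.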

@totaam
Collaborator Author

totaam commented Jul 17, 2017

2017-07-17 18:39:52: antoine commented


Well, this has taken many hours and roughly 20 to 30 full system lockups, each followed by a reboot and lots of swearing.
In the end, the bug is clearly an underflow in the nvidia API, just like we saw when HEVC support was added (which also cost me hours of wasted time back then): #1046#comment:6.

Fixes in:

  • r16394: minor, load license key file matching API version
  • r16395: default to 30 fps
  • r16396: raise minimum codec size to 128x128, other minor updates
  • r16397: remove pascal cards from the blacklist

Most of this should be backported.
We still get this error for some codec / settings combinations:

xpra.codecs.nvenc.encoder.NVENCException: initializing encoder - returned 8:
This indicates that one or more of the parameter passed to the API call is invalid.

But at least now I stand a chance of being able to fix it.

@totaam
Collaborator Author

totaam commented Jul 18, 2017

Now for the updates and proper fixes:

@DocMAX: please close if this works for you.

@totaam
Collaborator Author

totaam commented Jul 19, 2017

More fixes (backporting this mess is not going to be easy - might just go for the easy option: just disable most of nvenc and recommend the newer version):

  • r16405: selftest would fail! (doh)
  • r16406: disabling YUV420P would cause errors
  • r16409: changes in quality could cause visual corruption (wrongly used the old CUDA kernel with the new pixel format..)

@totaam
Collaborator Author

totaam commented Jul 19, 2017

Fixed and tested, see #1519

@totaam totaam closed this as completed Jul 19, 2017
@totaam
Collaborator Author

totaam commented Jul 19, 2017

Tedious backporting to older branches done and tested in 16416.
We don't support HEVC in older branches (easier), use 2.1 or later if you need this.

This was referenced Jan 22, 2021