[BUG] Can't use NVENC in hybrid mode, can use just in resetted mode #147

Open
sam-cavalheiro opened this issue Dec 6, 2023 · 5 comments
Labels
bug Something isn't working

Comments

sam-cavalheiro commented Dec 6, 2023

Describe the bug
I can only use NVENC (NVIDIA's hardware video encoder, I think) after resetting envycontrol, not in hybrid mode, which is the mode I usually use.

To Reproduce
Steps to reproduce the behavior:

  1. 'sudo envycontrol -s hybrid' then reboot
  2. 'ffmpeg -i input.mp4 -vcodec h264_nvenc output.mp4' to test if NVENC works
  3. See error: 'dl_fn->cuda_dl->cuInit(0) failed -> CUDA_ERROR_UNKNOWN: unknown error'
  4. 'sudo envycontrol --reset' then reboot
  5. Test whether NVENC works (step 2 again)
  6. The video is converted as expected
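
The reproduction steps above can be wrapped in a small helper, shown here as a minimal sketch. The function name `classify_nvenc_log` is hypothetical; it only looks for the cuInit failure text quoted in step 3 and does not detect other NVENC errors.

```shell
# Classify an ffmpeg log read from stdin.
# Prints "cuda-init-failed" if the cuInit error from step 3 is present,
# otherwise "no-cuda-init-error". (Hypothetical helper, not part of envycontrol.)
classify_nvenc_log() {
    if grep -qF 'cuInit(0) failed'; then
        echo "cuda-init-failed"
    else
        echo "no-cuda-init-error"
    fi
}

# Example: encode with NVENC, discard the output, and classify the log.
#   ffmpeg -hide_banner -i input.mp4 -vcodec h264_nvenc -f null - 2>&1 | classify_nvenc_log
```

Running the example once in hybrid mode and once after `--reset` gives a quick before/after comparison without writing an output file.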

Expected behavior
Testing NVENC (with 'ffmpeg -i input.mp4 -vcodec h264_nvenc output.mp4') should convert the example file without returning an error in hybrid mode, not only in reset mode.

Screenshots
Hybrid mode:
image

Reset mode:
image

System Information:

  • Model: Lenovo IdeaPad Gaming 3 15IHU6
  • Distro: Fedora Linux 39 (Workstation Edition)
  • Kernel: Linux 6.6.3-200.fc39.x86_64
  • DE/WM and Display Manager (if applicable): Gnome 45.2 with GDM
  • EnvyControl version: 3.3.1
  • Nvidia driver version: 545.29.06
  • lspci output:
00:00.0 Host bridge: Intel Corporation 11th Gen Core Processor Host Bridge/DRAM Registers (rev 01)
00:02.0 VGA compatible controller: Intel Corporation TigerLake-LP GT2 [Iris Xe Graphics] (rev 01)
00:04.0 Signal processing controller: Intel Corporation TigerLake-LP Dynamic Tuning Processor Participant (rev 01)
00:08.0 System peripheral: Intel Corporation GNA Scoring Accelerator module (rev 01)
00:0a.0 Signal processing controller: Intel Corporation Tigerlake Telemetry Aggregator Driver (rev 01)
00:14.0 USB controller: Intel Corporation Tiger Lake-LP USB 3.2 Gen 2x1 xHCI Host Controller (rev 20)
00:14.2 RAM memory: Intel Corporation Tiger Lake-LP Shared SRAM (rev 20)
00:14.3 Network controller: Intel Corporation Wi-Fi 6 AX201 (rev 20)
00:15.0 Serial bus controller: Intel Corporation Tiger Lake-LP Serial IO I2C Controller #0 (rev 20)
00:16.0 Communication controller: Intel Corporation Tiger Lake-LP Management Engine Interface (rev 20)
00:17.0 SATA controller: Intel Corporation Tiger Lake-LP SATA Controller (rev 20)
00:1c.0 PCI bridge: Intel Corporation Tiger Lake-LP PCI Express Root Port #5 (rev 20)
00:1d.0 PCI bridge: Intel Corporation Tiger Lake-LP PCI Express Root Port #9 (rev 20)
00:1d.3 PCI bridge: Intel Corporation Tiger Lake-LP PCI Express Root Port #12 (rev 20)
00:1f.0 ISA bridge: Intel Corporation Tiger Lake-LP LPC Controller (rev 20)
00:1f.3 Multimedia audio controller: Intel Corporation Tiger Lake-LP Smart Sound Technology Audio Controller (rev 20)
00:1f.4 SMBus: Intel Corporation Tiger Lake-LP SMBus Controller (rev 20)
00:1f.5 Serial bus controller: Intel Corporation Tiger Lake-LP SPI Controller (rev 20)
01:00.0 3D controller: NVIDIA Corporation TU117M [GeForce GTX 1650 Mobile / Max-Q] (rev a1)
02:00.0 Non-Volatile memory controller: Intel Corporation SSD 670p Series [Keystone Harbor] (rev 03)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)

sam-cavalheiro added the bug (Something isn't working) label Dec 6, 2023
sam-cavalheiro changed the title from "[BUG]" to "[BUG] Can't use NVENC in hybrid mode, can use just in resetted mode" Dec 6, 2023
@klmcwhirter
Contributor

@sam-cavalheiro, I personally do not see this as a bug. Let me explain.

Optimus hybrid mode has a number of features, one of which is support for the HDMI port (and audio through it).

This comes at the cost of greatly reduced battery life.

So envycontrol offers the additional --rtd3 option for hybrid mode, which helps with that. It comes with the tradeoff of reduced dGPU functionality, though. Please see the link to the official documentation in the Hybrid section of the README.

If you need the full Optimus hybrid mode feature set, please use the --reset option to restore it.

@sam-cavalheiro
Author

In reset mode I use a custom-nvidia.conf in /etc/modprobe.d/ containing
options nvidia "NVreg_DynamicPowerManagement=0x02"
Does this mean that RTD3 is not working here? (I don't know a good way to test this.)
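
One possible way to test this, as a minimal sketch: the kernel exposes a device's runtime power-management state in sysfs under `power/runtime_status`. The helper name `gpu_runtime_status` is hypothetical, and the default PCI address `0000:01:00.0` is an assumption taken from the lspci output earlier in this issue (with PCI domain 0000 assumed).

```shell
# Print the kernel's runtime power-management state for a PCI device.
# $1: PCI address (assumed default 0000:01:00.0, the GTX 1650 in the lspci output)
# $2: sysfs root (defaults to /sys; overridable only to make the helper testable)
gpu_runtime_status() {
    local addr="${1:-0000:01:00.0}"
    local root="${2:-/sys}"
    local file="$root/bus/pci/devices/$addr/power/runtime_status"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unknown"
    fi
}

# Example: gpu_runtime_status
```

If the dGPU is idle and the result is `suspended`, runtime power management is taking effect; a result that stays `active` while nothing uses the GPU suggests it is not. The NVIDIA driver README also describes a per-GPU `power` file under `/proc/driver/nvidia/gpus/`, which should give the driver's own view.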

@klmcwhirter
Contributor

klmcwhirter commented May 26, 2024

@sam-cavalheiro,

I use a custom-nvidia.conf in /etc/modprobe.d/ with options nvidia "NVreg_DynamicPowerManagement=0x02" in reset mode. This means that RTD3 is not working here? (I don't know a good way to test this).

Without digging back into the details of RTD3 to refresh my memory, I cannot quickly tell whether that is all that is needed.

I do know that @bayasdev has spent quite a bit of time working with several people over the years making sure his code works with different hardware (Intel and AMD iGPUs), distros, WMs, etc.

I just know there are some tradeoffs in there. Hence the CLI options to switch to nvidia mode, or simply --reset, are also available. I, too, found it necessary to --reset when working with PyTorch and the NVIDIA CUDA drivers, for example.

When I am done with such a project I use envycontrol again to put my laptop in the mode I need.

My usage of envycontrol is not one and done - but rather situational.

Make sense?

To gain a better understanding of what happens upon switch to hybrid mode, please take a look at the few lines of code used to switch into hybrid mode to see exactly which files it is creating with what contents. The CONSTANTS used are defined near the top of the file.
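
As a complement to reading the source, here is a minimal sketch for seeing which configuration files on a system mention nvidia. The helper name `list_nvidia_configs` is hypothetical, and the directories in the example are an assumption about where such files typically live; the envycontrol source remains the authority on what it actually writes.

```shell
# List files under the given directories whose contents mention "nvidia"
# (case-insensitive). Directories that do not exist are skipped.
list_nvidia_configs() {
    local dir
    for dir in "$@"; do
        [ -d "$dir" ] && grep -rli 'nvidia' "$dir" 2>/dev/null
    done | sort
}

# Example (assumed locations):
#   list_nvidia_configs /etc/modprobe.d /etc/udev/rules.d /etc/X11/xorg.conf.d
```

Running it before and after a mode switch and comparing the two listings shows which files changed.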

BTW, nvidia-smi should show both GPU utilization and power consumption metrics. That is how I monitor GPU overhead / usage.
nvidia-smi-sample
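
For scripted monitoring, the same two metrics can be requested in CSV form, as in this minimal sketch. The `utilization.gpu` and `power.draw` fields are listed by `nvidia-smi --help-query-gpu`; the helper names `query_gpu` and `format_gpu_line` are hypothetical, and some mobile GPUs may report `[N/A]` for power draw.

```shell
# Query GPU utilization (%) and power draw (W) as one CSV line, e.g. "3, 7.42".
query_gpu() {
    nvidia-smi --query-gpu=utilization.gpu,power.draw --format=csv,noheader,nounits
}

# Turn one CSV line from query_gpu into a labelled summary.
format_gpu_line() {
    local util="${1%%,*}"   # text before the first comma
    local power="${1#*,}"   # text after the first comma
    power="${power# }"      # drop the single leading space
    echo "utilization_pct=${util} power_w=${power}"
}

# Example: format_gpu_line "$(query_gpu)"
```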

@klmcwhirter
Contributor

klmcwhirter commented May 27, 2024

For future readers ...

I followed these steps to install the NVIDIA CUDA drivers and the nvidia-smi utility.

Note: these instructions are for Fedora 39. As I write this, drivers are not yet available for Fedora 40.

Installing NVIDIA CUDA Drivers on Fedora 39

IMPORTANT!!! DO NOT install the proprietary NVIDIA drivers from the NVIDIA website using the run script method. This will surely break your system when kernel updates appear.

See this video segment ... Proprietary NVIDIA Drivers

Before installing, update your system to prevent potential conflicts between graphics card drivers and kernels. To update your Fedora system, use the following command:
sudo dnf upgrade --refresh
Add the NVIDIA CUDA repository for Fedora 39:
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora39/x86_64/cuda-fedora39.repo
Install the dependencies needed by the NVIDIA drivers:
sudo dnf install kernel-headers kernel-devel tar bzip2 make automake gcc gcc-c++ pciutils elfutils-libelf-devel libglvnd-opengl libglvnd-glx libglvnd-devel acpid pkgconfig dkms
To list the available NVIDIA driver module streams, execute:
sudo dnf module list nvidia-driver
$ sudo dnf module list nvidia-driver
Last metadata expiration check: 2:42:46 ago on Wed 17 Apr 2024 07:11:24 AM PDT.
cuda-fedora39-x86_64
Name                   Stream                      Profiles                           Summary                                       
nvidia-driver          latest                      default [d], fm, ks, src           Nvidia driver for latest branch               
nvidia-driver          latest-dkms [d][e]          default [d] [i], fm, ks            Nvidia driver for latest-dkms branch          
nvidia-driver          open-dkms                   default [d], fm, ks, src           Nvidia driver for open-dkms branch            
nvidia-driver          550                         default [d], fm, ks, src           Nvidia driver for 550 branch                  
nvidia-driver          550-dkms                    default [d], fm, ks                Nvidia driver for 550-dkms branch             
nvidia-driver          550-open                    default [d], fm, ks, src           Nvidia driver for 550-open branch             

Hint: [d]efault, [e]nabled, [x]disabled, [i]nstalled

Select the stream that is appropriate for your system from the list above.

To install the latest NVIDIA drivers using the DKMS method, execute:
sudo dnf module install nvidia-driver:latest-dkms

@sam-cavalheiro
Author

@klmcwhirter
Sorry, but I don't have time to test right now :(
I haven't read the NVIDIA documentation about RTD3 because English is not my first language, so long texts are hard for me to read. I need to set aside some time for that.
So, for now, the only thing I can do is trust you.
