ACS may cause P2P bandwidth problem #10

Closed
lukeyeager opened this issue Jul 7, 2015 · 2 comments

Comments

@lukeyeager (Member)

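If multi-GPU training in DIGITS performs worse than expected, verify that PCI Express Access Control Services (ACS) are disabled. With ACS enabled, peer-to-peer (P2P) traffic between GPUs may be redirected through the root complex instead of being routed directly by the PCIe switch, which reduces P2P bandwidth.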
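One way to check the current ACS state is to look for enabled bits in the `ACSCtl` line that `lspci -vvv` prints for each device. The following is a minimal sketch, not part of the original report: the function name `acs_enabled_devices` is hypothetical, and it assumes the usual `lspci` layout in which device header lines start in column 0 and each ACS control bit is suffixed with `+` (enabled) or `-` (disabled).

```shell
# Hypothetical helper: reads `lspci -vvv` output on stdin and prints the
# address of every device whose ACSCtl line has at least one bit enabled.
acs_enabled_devices() {
    awk '
        /^[0-9a-f]/          { dev = $1 }    # device header, e.g. "03:08.0 PCI bridge: ..."
        /ACSCtl:/ && /[+]/   { print dev }   # any "+" means an ACS control bit is on
    '
}

# Usage (run as root so that lspci can read the extended capabilities):
#   lspci -vvv | acs_enabled_devices
```

An empty result suggests that no device currently has ACS enabled; any address printed is a candidate for the BIOS setting or the workaround described in this issue.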

NVIDIA recommends that the system BIOS (SBIOS) disable ACS to ensure maximum P2P bandwidth between GPUs. The SBIOS should leave the ACS capability exposed but disabled on switch downstream ports and root ports, so that ACS-aware operating systems and hypervisors can choose to enable ACS when required.

Please verify with your motherboard manufacturer that the SBIOS correctly disables ACS and, if it does not, whether an updated SBIOS is available.

If an SBIOS that correctly disables ACS is not yet available from your motherboard manufacturer, you can attempt to disable ACS programmatically by running the following script, which uses the Linux lspci and setpci utilities and targets PLX PCIe switches (vendor ID 10b5). The script must be run as root, and it must be run again after every system boot or reset because the setting does not persist.

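#!/usr/bin/env bash
# Disable ACS on every PLX PCIe switch port (vendor ID 10b5) that has it enabled.
# Run as root: lspci needs it to read ACSCtl, and setpci writes config space.
for dev in $(lspci -d "10b5:" | awk '{print $1}'); do
    # Skip devices that do not report an ACS control register.
    acsctl=$(lspci -vvv -s "$dev" | grep ACSCtl) || continue
    # Each ACS control bit is shown with "+" (enabled) or "-" (disabled).
    if echo "$acsctl" | grep -q "+"; then
        echo "Disabling ACS on $dev"
        # f2a is the ACS control register offset this script assumes for these switches.
        setpci -s "$dev" f2a.w=0000
    fi
done
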
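
Because the script has to run after every boot, one option is to let the init system run it. The sketch below is an assumption-laden example rather than something from the original report: it assumes a systemd-based distribution, the function name `acs_unit_file` is hypothetical, and `/usr/local/sbin/disable-acs.sh` is a placeholder for wherever the script above is saved.

```shell
# Hypothetical helper: prints a oneshot systemd unit that runs the given
# script once per boot. $1 is the absolute path to the ACS-disabling script.
acs_unit_file() {
    cat <<EOF
[Unit]
Description=Disable PCIe ACS on PLX switches

[Service]
Type=oneshot
ExecStart=$1

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root; the unit name and script path are placeholders):
#   acs_unit_file /usr/local/sbin/disable-acs.sh > /etc/systemd/system/disable-acs.service
#   systemctl daemon-reload
#   systemctl enable disable-acs.service
```

A oneshot unit fits here because the script makes its register writes and exits; nothing needs to keep running afterwards.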
@juliebernauer

Alternatively, ACS can be disabled directly in the server's BIOS.

pooyadavoodi pushed a commit to pooyadavoodi/caffe that referenced this issue Jul 6, 2016
Exclude pause instruction from __aarch64__
@xzheng4

xzheng4 commented Dec 15, 2017

How much performance degradation did you observe? We have a similar issue. @lukeyeager
