NVIDIA NCP-AIN Practice Test

Exam Title: AI Networking

Last update: Nov 27, 2025
Question 1

[InfiniBand Configuration]
Why is the InfiniBand LRH called a local header?

  • A. It is used for routing traffic between nodes in the local subnet.
  • B. It provides the LIDs from the local subnet manager.
  • C. It allows traffic on a local link only.
  • D. It provides the parameters for each local HCA.
Answer:

A


Explanation:
The Local Route Header (LRH) in InfiniBand is termed "local" because it is used exclusively for routing
packets within a single subnet. The LRH contains the destination and source Local Identifiers (LIDs),
which are unique within a subnet, facilitating efficient routing without the need for global
addressing. This design optimizes performance and simplifies routing within localized network
segments.
The LRH is a core component of the InfiniBand packet structure on NVIDIA Quantum switches and adapters. It carries the Source Local Identifier (SLID) and Destination Local Identifier (DLID), assigned by the subnet manager, which switches use to forward packets within the subnet without any global routing information. This distinguishes the LRH from the Global Routing Header (GRH), which is used only for inter-subnet routing.
Exact Extract from NVIDIA Documentation:
“The Local Routing Header (LRH) is used for routing InfiniBand packets within a single subnet. It
contains the Source LID (SLID) and Destination LID (DLID), which are assigned by the subnet manager
to identify the source and destination nodes in the local subnet. The LRH is called a ‘local header’
because it facilitates intra-subnet routing, enabling switches to forward packets based on LID-based
forwarding tables.”
— NVIDIA InfiniBand Architecture Guide
This extract confirms that option A is the correct answer, as the LRH’s primary function is to route
traffic between nodes within the local subnet, leveraging LID-based addressing. The term “local”
reflects its scope, which is limited to a single InfiniBand subnet managed by a subnet manager.
Reference: LRH and GRH InfiniBand Headers - NVIDIA Enterprise Support Portal
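The relationship between the LRH fields and LID-based forwarding can be sketched in Python. This is an illustrative model only: the field list is abbreviated (a real LRH also carries VL, LVer, LNH, and packet length), the forwarding table values are hypothetical, and real forwarding happens in switch hardware.

```python
from dataclasses import dataclass

@dataclass
class LRH:
    """Abbreviated Local Route Header: only the fields relevant to forwarding."""
    slid: int  # Source LID, assigned by the subnet manager
    dlid: int  # Destination LID, unique within the local subnet
    sl: int    # Service Level, used for virtual-lane mapping

def forward(lrh: LRH, lid_table: dict[int, int]) -> int:
    """Mimic a switch's LID-based forwarding-table lookup: the DLID alone
    selects the egress port, with no global addressing involved."""
    return lid_table[lrh.dlid]

# Hypothetical forwarding table: DLID -> egress port
table = {0x0001: 1, 0x0002: 3, 0x0003: 7}
pkt = LRH(slid=0x0001, dlid=0x0003, sl=0)
print(forward(pkt, table))  # the switch forwards this packet on port 7
```

Because the lookup key is the subnet-local DLID, the scheme only works inside one subnet, which is exactly why the header is "local."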

Question 2

[AI Network Architecture]
You are designing a new AI data center for a research institution that requires high-performance
computing for large-scale deep learning models. The institution wants to leverage NVIDIA's reference
architectures for optimal performance.
Which NVIDIA reference architecture would be most suitable for this high-performance AI research
environment?

  • A. NVIDIA Base Command Platform
  • B. NVIDIA DGX Cloud
  • C. NVIDIA LaunchPad
  • D. NVIDIA DGX SuperPOD
Answer:

D


Explanation:
The NVIDIA DGX SuperPOD is a turnkey AI supercomputing infrastructure designed for large-scale
deep learning and high-performance computing workloads. It integrates multiple DGX systems with
high-speed networking and storage solutions, providing a scalable and efficient platform for AI
research institutions. The architecture supports rapid deployment and is optimized for training
complex models, making it the ideal choice for environments demanding top-tier AI performance.
Reference: DGX SuperPOD Architecture - NVIDIA Docs

Question 3

[InfiniBand Security]
In a multi-tenant InfiniBand environment managed by UFM, you need to configure access controls to
prevent unauthorized users from altering the fabric configuration. Which method is used within UFM
to manage user access and ensure authorized modifications only?

  • A. Digital Certification Management (DCM)
  • B. Network Access Control (NAC)
  • C. Virtual Network Segmentation (VNS)
  • D. Role-Based Access Control (RBAC)
Answer:

D


Explanation:
Role-Based Access Control (RBAC) is implemented within NVIDIA's Unified Fabric Manager (UFM) to
manage user permissions effectively. RBAC allows administrators to assign roles to users, each with
specific permissions, ensuring that only authorized individuals can make changes to the fabric
configuration. This structured approach to access control enhances security by limiting the potential
for unauthorized modifications and streamlines the management of user privileges across the
network.
Reference: Role-Based Access Control (RBAC) - One Identity
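The RBAC model behind this answer can be illustrated with a minimal sketch. The role and permission names below are hypothetical, chosen for illustration; they are not UFM's actual role set.

```python
# Minimal RBAC sketch: users are assigned roles, and roles carry permissions.
ROLE_PERMISSIONS = {
    "admin":    {"view_fabric", "modify_fabric", "manage_users"},
    "operator": {"view_fabric", "modify_fabric"},
    "monitor":  {"view_fabric"},
}

USER_ROLES = {"alice": "admin", "bob": "monitor"}

def is_authorized(user: str, permission: str) -> bool:
    """A user may act only if their assigned role grants the permission."""
    role = USER_ROLES.get(user)
    return role is not None and permission in ROLE_PERMISSIONS.get(role, set())

print(is_authorized("bob", "modify_fabric"))    # False: monitors cannot alter the fabric
print(is_authorized("alice", "modify_fabric"))  # True: admins can
```

The key property is that permissions are never granted to users directly, only to roles, so tightening or auditing access means reviewing a handful of roles instead of every user.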

Question 4

[InfiniBand Troubleshooting]
As the network administrator for a large-scale AI research cluster, you are responsible for ensuring
seamless data flow across an InfiniBand east-west fabric that interconnects hundreds of compute
nodes.
Which tool would you use to trace and discover the network paths between nodes on this InfiniBand
east-west fabric?

  • A. NetQ
  • B. ibpathverify
  • C. ibnetdiscover
  • D. tracert
Answer:

C


Explanation:
The ibnetdiscover utility is used to perform InfiniBand subnet discovery and outputs a human-
readable topology file. GUIDs, node types, and port numbers are displayed, as well as port LIDs and
node descriptions. All nodes and links are displayed, providing a full topology. This utility can also be
used to list the current connected nodes. The output is printed to the standard output unless a
topology file is specified.
In large east-west InfiniBand fabrics, ibnetdiscover scans the fabric, queries the subnet manager, and generates a topology map detailing the connections between switches, Host Channel Adapters (HCAs), and other devices. This makes it the right starting point for verifying connectivity, identifying routing paths, and troubleshooting issues such as misconfigured routes or link failures.
Exact Extract from NVIDIA Documentation:
“The ibnetdiscover tool is used to discover the InfiniBand fabric topology and generate a map of the
network. It queries the subnet manager to retrieve information about all nodes, switches, and links
in the fabric, providing a detailed view of the paths between nodes. This tool is critical for
troubleshooting connectivity issues and ensuring proper routing in InfiniBand networks.”
— NVIDIA InfiniBand Networking Guide
This extract confirms that ibnetdiscover is the correct tool for discovering network paths in an
InfiniBand east-west fabric. It provides a comprehensive view of the fabric’s topology, enabling
administrators to trace paths between compute nodes and ensure seamless data flow.
Reference: InfiniBand Fabric Utilities - NVIDIA Docs
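Conceptually, what a discovered topology enables is a path search between any two fabric nodes. The sketch below illustrates that idea only: the node names and links are made up, and real ibnetdiscover output is far richer (GUIDs, port numbers, LIDs, node descriptions).

```python
from collections import deque

# Hypothetical fabric links, as topology discovery might reveal them.
links = [
    ("hca-node01", "leaf-sw1"), ("hca-node02", "leaf-sw1"),
    ("hca-node03", "leaf-sw2"), ("leaf-sw1", "spine-sw1"),
    ("leaf-sw2", "spine-sw1"),
]

# Build an adjacency map from the undirected link list.
adj: dict[str, list[str]] = {}
for a, b in links:
    adj.setdefault(a, []).append(b)
    adj.setdefault(b, []).append(a)

def find_path(src: str, dst: str) -> list[str]:
    """Breadth-first search for one path between two fabric nodes."""
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return []

print(find_path("hca-node01", "hca-node03"))
# ['hca-node01', 'leaf-sw1', 'spine-sw1', 'leaf-sw2', 'hca-node03']
```

In practice the administrator would feed ibnetdiscover's topology file into such an analysis rather than hand-writing the link list.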

Question 5

[Spectrum-X Configuration]
When upgrading Cumulus Linux to a new version, which configuration files should be migrated from
the old installation?
Pick the 2 correct responses below.

  • A. All files in /etc/cumulus/acl
  • B. All files in /etc/network
  • C. All files in /etc
  • D. All files in /etc/mix
Answer:

A, B


Explanation:
Before upgrading Cumulus Linux, it's essential to back up configuration files to a different server. The
/etc directory is the primary location for all configuration data in Cumulus Linux. Specifically, the
following files and directories should be backed up:
/etc/frr/ - Routing application (responsible for BGP and OSPF)
/etc/hostname - Configuration file for the hostname of the switch
/etc/network/ - Network configuration files, most notably /etc/network/interfaces and
/etc/network/interfaces.d/
/etc/cumulus/acl - Access control list configurations
Of the options given, /etc/cumulus/acl (access control list configurations) and /etc/network (network interface configurations, most notably interfaces and the interfaces.d/ directory) are the two locations that must be migrated. They define the switch's ACL rules and interface behavior, which must be preserved for the network to function identically after the upgrade.
Exact Extract from NVIDIA Documentation:
“When upgrading Cumulus Linux, you must back up and migrate specific configuration files to ensure
continuity of network settings. The following directories should be included in the backup:
/etc/cumulus/acl: Contains access control list (ACL) configuration files that define packet filtering and
security policies.
/etc/network: Contains network interface configuration files, such as interfaces and ifupdown2
settings, which define the network interfaces and their properties.
Back up these directories before upgrading and restore them after the new version is installed to
maintain consistent network behavior.”
— NVIDIA Cumulus Linux Upgrade Guide
This extract confirms that options A and B are the correct answers, as /etc/cumulus/acl and
/etc/network contain essential configuration files that must be migrated during a Cumulus Linux
upgrade. These files ensure that ACL policies and network interface settings are preserved, which are
critical for Spectrum-X configurations in AI networking environments.
Reference: Upgrading Cumulus Linux - NVIDIA Docs
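A pre-upgrade backup step could gather these directories into an archive before installing the new image, then restore them afterward. The sketch below is illustrative, not an official procedure: the directory list comes from the documentation cited above, while the function and archive names are arbitrary.

```python
import tarfile
from pathlib import Path

# Paths whose contents must survive the upgrade (per the backup list above).
BACKUP_PATHS = ["/etc/cumulus/acl", "/etc/network", "/etc/frr", "/etc/hostname"]

def backup_configs(paths: list[str], archive: str) -> list[str]:
    """Tar up every existing config path so it can be restored post-upgrade.
    Returns the paths actually archived; missing ones are skipped."""
    archived = []
    with tarfile.open(archive, "w:gz") as tar:
        for raw in paths:
            p = Path(raw)
            if p.exists():
                tar.add(p, arcname=p.name)
                archived.append(raw)
    return archived
```

The resulting archive should be copied to a different server, as the explanation notes, since the upgrade may wipe local storage.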

Question 6

[Spectrum-X Configuration]
What is the total throughput of the SN5600 Spectrum-X switch?

  • A. 12.8 petabits per second
  • B. 25.6 terabits per second
  • C. 102.4 gigabits per second
  • D. 51.2 terabits per second
Answer:

D


Explanation:
The SN5600 smart-leaf/spine/super-spine switch offers 64 ports of 800GbE in a dense 2U form factor. It supports diverse connectivity at speeds from 1GbE to 800GbE and delivers an industry-leading total throughput of 51.2Tb/s.
Reference: NVIDIA Spectrum SN5600 Ethernet Switch - Bluum
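The headline figure follows directly from the port count and per-port speed, which is a useful sanity check when comparing the distractor options:

```python
ports = 64
speed_gbps = 800                  # GbE per port on the SN5600
total_gbps = ports * speed_gbps   # aggregate switch throughput in Gb/s
total_tbps = total_gbps / 1000
print(f"{total_tbps} Tb/s")       # 51.2 Tb/s
```

Note the units in the wrong answers: 102.4 gigabits per second is off by three orders of magnitude, and 12.8 petabits per second by roughly that much in the other direction.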

Question 7

[InfiniBand Troubleshooting]
You are tasked with troubleshooting a link flapping issue in an InfiniBand AI fabric. You would like to
start troubleshooting from the physical layer.
What is the right NVIDIA tool to be used for this task?

  • A. nvidia-smi utility
  • B. mlxlink utility
  • C. tcpdump tool
Answer:

B


Explanation:
The mlxlink tool is used to check and debug link status and issues related to them. The tool can be
used on different links and cables (passive, active, transceiver, and backplane). It is intended for
advanced users with appropriate technical background.
Reference: mlxlink Utility - NVIDIA Docs

Question 8

[AI Network Architecture]
Which of the following statements are true about AI workloads and adaptive routing?
Pick the 2 correct responses below.

  • A. AI workloads are made of a small number of volumetric flows called elephant flows.
  • B. AI workloads have very high entropy that helps spread traffic evenly without congestion.
  • C. Flow-based load balancing mechanisms increase congestion risk.
  • D. ECMP-based load balancing works best for AI workloads.
Answer:

A, C


Explanation:
AI workloads, particularly in large-scale training scenarios, are characterized by a small number of
high-bandwidth, long-lived flows known as "elephant flows." These flows can dominate network
traffic and are prone to causing congestion if not managed effectively.
Traditional flow-based load balancing mechanisms, such as Equal-Cost Multipath (ECMP), distribute
traffic based on flow hashes. However, in AI workloads with low entropy (i.e., limited variability in
flow characteristics), ECMP can lead to uneven traffic distribution and congestion on certain paths.
Adaptive routing techniques, which dynamically adjust paths based on real-time network conditions,
are more effective in managing AI traffic patterns and mitigating congestion risks.
Reference: Powering Next-Generation AI Networking with NVIDIA SuperNICs
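The low-entropy problem can be demonstrated with a toy ECMP simulation: static flow hashing pins each flow to one path, so with only a handful of elephant flows some uplinks can carry several flows while others sit idle. The flow tuples below are made up for illustration, and the hash function stands in for a switch's ECMP hash.

```python
import zlib

NUM_PATHS = 4

def ecmp_path(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> int:
    """Static flow hashing: every packet of a flow maps to the same path."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % NUM_PATHS

# A few long-lived elephant flows, as in AI collective traffic.
flows = [("10.0.0.1", "10.0.1.1", 4791, 4791),
         ("10.0.0.2", "10.0.1.2", 4791, 4791),
         ("10.0.0.3", "10.0.1.3", 4791, 4791),
         ("10.0.0.4", "10.0.1.4", 4791, 4791)]

load = [0] * NUM_PATHS
for f in flows:
    load[ecmp_path(*f)] += 1
# With so few flows, an even [1, 1, 1, 1] split is unlikely: a path
# carrying 2+ elephant flows congests while another carries nothing.
print(load)
```

Adaptive routing sidesteps this by spraying or re-steering traffic per packet or per flowlet based on live congestion, rather than committing each flow to one hash-chosen path.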

Question 9

[InfiniBand Troubleshooting]
A user has requested confirmation that the InfiniBand network is performing optimally and is not
limiting the speed of a training run. To verify this, you would like to measure the RDMA throughput
rate between two endpoints.
Which tool should be used?

  • A. ibdiagnet
  • B. ib_write_bw
  • C. ping
  • D. iperf
Answer:

B


Explanation:
The ib_write_bw tool is part of the Perftest package and is specifically designed to measure the
bandwidth of RDMA write operations between two InfiniBand endpoints. It provides accurate
assessments of RDMA throughput, which is crucial for verifying the performance of InfiniBand
networks in high-performance computing and AI training environments.
Reference: ib_write_bw - NVIDIA Enterprise Support Portal
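The bandwidth figure such a tool reports boils down to bytes moved over elapsed time. The sketch below is a simplified reconstruction of that calculation, not the tool's actual source; the message size and count are example values.

```python
def throughput_gbps(bytes_transferred: int, seconds: float) -> float:
    """Average throughput in gigabits per second."""
    return bytes_transferred * 8 / seconds / 1e9

# e.g. 5000 RDMA writes of 65536 bytes each completing in 0.01 s
print(round(throughput_gbps(5000 * 65536, 0.01), 2))  # 262.14 (Gb/s)
```

Comparing the measured figure against the link's line rate (e.g. 400Gb/s for NDR) shows whether the fabric, rather than the application, is the bottleneck in the training run.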

Question 10

[Spectrum-X Optimization]
You have recently implemented NVIDIA Spectrum-X in your data center to optimize AI workloads.
You need to verify the performance improvements and create a baseline for future comparisons.
Which tool would be most appropriate for creating performance baseline results in this Spectrum-X
environment?

  • A. NetQ
  • B. CloudAI Benchmark
  • C. MLNX-OS
  • D. Ansible
Answer:

B


Explanation:
The CloudAI Benchmark is designed to evaluate and establish performance baselines in AI-optimized
networking environments like NVIDIA Spectrum-X. It assesses various performance metrics,
including throughput and latency, ensuring that the network meets the demands of AI workloads.
This benchmarking is essential for validating the benefits of Spectrum-X and for ongoing
performance monitoring.
Reference: NVIDIA Spectrum-X Validated Solution Stack

Page 1 out of 6
Viewing questions 1-10 out of 70