The foundation of the NVIDIA software stack is the DGX OS. Which of the following Linux
distributions is DGX OS built upon?
A
Explanation:
DGX OS, the operating system powering NVIDIA DGX systems, is built on Ubuntu Linux, specifically
the Long-Term Support (LTS) version. It integrates Ubuntu’s robust base with NVIDIA-specific
enhancements, including GPU drivers, tools, and optimizations tailored for AI and high-performance
computing workloads. Neither Red Hat nor CentOS serves as the foundation for DGX OS, making
Ubuntu the correct choice.
(Reference: NVIDIA DGX OS Documentation, System Requirements Section)
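As a quick illustration (not taken from the documentation), the Ubuntu base can be confirmed on a running DGX system by reading /etc/os-release. The sketch below is plain Python using only the standard library; the exact NAME/VERSION strings depend on the DGX OS release installed.

```python
# Minimal sketch: confirm the Ubuntu LTS base of a DGX OS install by reading
# /etc/os-release (present on Ubuntu and other systemd-based distributions).
from pathlib import Path

def os_release() -> dict:
    """Parse /etc/os-release into a key/value dictionary."""
    info = {}
    for line in Path("/etc/os-release").read_text().splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            info[key] = value.strip('"')
    return info

info = os_release()
print(info.get("NAME"), info.get("VERSION"))  # e.g. "Ubuntu 22.04.x LTS" on a DGX OS release
```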
What is the name of NVIDIA’s SDK that accelerates machine learning?
C
Explanation:
The CUDA Deep Neural Network library (cuDNN) is NVIDIA’s SDK specifically designed to accelerate
machine learning, particularly deep learning tasks. It provides highly optimized implementations of
neural network primitives—such as convolutions, pooling, normalization, and activation functions—
leveraging GPU parallelism. Clara focuses on healthcare applications, and RAPIDS accelerates data
science workflows, but cuDNN is the core SDK for machine learning acceleration.
(Reference: NVIDIA cuDNN Documentation, Introduction)
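For context, most practitioners reach cuDNN indirectly through a framework. The hedged PyTorch sketch below uses the documented torch.backends.cudnn flags to confirm cuDNN is present and to let it auto-tune convolution algorithms; the GPU portion assumes a CUDA-capable device is available.

```python
# Minimal sketch: frameworks such as PyTorch dispatch convolution, pooling,
# and normalization layers to cuDNN kernels under the hood.
import torch

print(torch.backends.cudnn.is_available())  # True when a cuDNN build is present
print(torch.backends.cudnn.version())       # e.g. 8xxx / 9xxx depending on the cuDNN release

if torch.cuda.is_available():
    # Let cuDNN benchmark and cache the fastest convolution algorithm for the
    # current input shapes (useful when shapes stay constant across iterations).
    torch.backends.cudnn.benchmark = True
    conv = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
    x = torch.randn(8, 3, 224, 224, device="cuda")
    y = conv(x)  # the convolution runs as a cuDNN kernel on the GPU
```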
Which aspect of computing uses large amounts of data to train complex neural networks?
B
Explanation:
Deep learning, a subset of machine learning, relies on large datasets to train multi-layered neural
networks, enabling them to learn hierarchical feature representations and complex patterns
autonomously. While machine learning encompasses broader techniques (some requiring less data),
deep learning’s dependence on vast data volumes distinguishes it. Inferencing, the application of
trained models, typically uses smaller, real-time inputs rather than extensive training data.
(Reference: NVIDIA AI Infrastructure and Operations Study Guide, Section on Deep Learning
Fundamentals)
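As a minimal sketch of that training pattern (not from the study guide), the PyTorch snippet below iterates over batches of synthetic data to fit a small multi-layer network; real deep learning scales the same loop to far larger datasets and models.

```python
# Minimal sketch of deep learning's core loop: a multi-layer network trained
# over many batches of data. The dataset here is tiny and synthetic purely
# for illustration; practical training relies on far larger corpora.
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(32, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 10),
)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):                # stands in for many epochs over a large dataset
    x = torch.randn(64, 32)            # batch of input features
    y = torch.randint(0, 10, (64,))    # batch of labels
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```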
Which of the following statements correctly differentiates between AI, Machine Learning, and Deep
Learning?
D
Explanation:
Artificial Intelligence (AI) is the overarching field encompassing techniques to mimic human
intelligence. Machine Learning (ML), a subset of AI, involves algorithms that learn from data. Deep
Learning (DL), a specialized subset of ML, uses neural networks with many layers to tackle complex
tasks. This hierarchical relationship—DL within ML, ML within AI—is the correct differentiation,
unlike the reversed or conflated options.
(Reference: NVIDIA AI Infrastructure and Operations Study Guide, Section on AI, ML, and DL
Definitions)
How is the architecture different in a GPU versus a CPU?
B
Explanation:
A GPU’s architecture is designed for massive parallelism, featuring thousands of lightweight cores
that execute simple instructions across vast data elements simultaneously—ideal for tasks like AI
training. In contrast, a CPU has fewer, complex cores optimized for sequential execution and
branching logic. GPUs don’t function as PCIe controllers (a hardware role), nor are they single-core
designs, making the parallel execution focus the key differentiator.
(Reference: NVIDIA GPU Architecture Whitepaper, Section on GPU Design Principles)
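A rough way to see the difference in practice (illustrative only, not from the whitepaper) is to time the same large matrix multiply on both devices: the operation is embarrassingly parallel, so it maps naturally onto the GPU's many lightweight cores. The sketch assumes PyTorch and an available CUDA GPU; absolute timings vary by hardware.

```python
# Minimal sketch contrasting the two architectures: a large matmul parallelizes
# across thousands of GPU cores, while the CPU path runs on a few complex cores.
import time
import torch

n = 4096
a_cpu, b_cpu = torch.randn(n, n), torch.randn(n, n)

t0 = time.perf_counter()
_ = a_cpu @ b_cpu
print(f"CPU matmul: {time.perf_counter() - t0:.3f} s")

if torch.cuda.is_available():
    a_gpu, b_gpu = a_cpu.cuda(), b_cpu.cuda()
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    _ = a_gpu @ b_gpu
    torch.cuda.synchronize()  # wait for the asynchronous GPU kernel to finish
    print(f"GPU matmul: {time.perf_counter() - t0:.3f} s")
```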
What factors have led to significant breakthroughs in Deep Learning?
C
Explanation:
Deep learning breakthroughs stem from three pillars: advances in hardware (e.g., GPUs and TPUs)
providing the compute power for large-scale neural networks; the availability of large datasets
offering the data volume needed for training; and improvements in training algorithms (e.g.,
optimizers like Adam, novel architectures like Transformers) enhancing model efficiency and
accuracy. While internet speed, sensors, or smartphones play roles in broader tech, they’re less
directly tied to deep learning’s core advancements.
(Reference: NVIDIA AI Infrastructure and Operations Study Guide, Section on Deep Learning
Advancements)
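As a small illustration of the algorithmic pillar (layer sizes are arbitrary and chosen only for the example), the sketch below combines two of the named advances, the Adam optimizer and a Transformer-style encoder layer, both available off the shelf in PyTorch.

```python
# Minimal sketch: an Adam-optimized update step through a Transformer encoder layer.
import torch
from torch import nn

layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
optimizer = torch.optim.Adam(layer.parameters(), lr=1e-3)

x = torch.randn(8, 16, 64)      # (batch, sequence, embedding)
out = layer(x)
loss = out.pow(2).mean()        # placeholder loss purely for illustration
optimizer.zero_grad()
loss.backward()
optimizer.step()
```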
Which type of GPU core was specifically designed to realistically simulate the lighting of a scene?
C
Explanation:
Ray Tracing Cores, introduced in NVIDIA’s RTX architecture, are specialized hardware units built to
accelerate ray-tracing computations—simulating light interactions (e.g., reflections, shadows) for
photorealistic rendering in real time. CUDA Cores handle general-purpose parallel tasks, and Tensor
Cores optimize matrix operations for AI, but only Ray Tracing Cores target lighting simulation.
(Reference: NVIDIA GPU Architecture Whitepaper, Section on Ray Tracing Cores)
Which GPUs should be used when training a neural network for self-driving cars?
A
Explanation:
Training neural networks for self-driving cars requires immense computational power and high-
bandwidth memory to process vast datasets (e.g., sensor data, video). NVIDIA H100 GPUs, with their
cutting-edge architecture and massive throughput, are ideal for these demanding workloads. L4
GPUs are optimized for inference and efficiency, while DRIVE Orin targets in-vehicle inference, not
training, making H100 the best choice.
(Reference: NVIDIA AI Infrastructure and Operations Study Guide, Section on GPU Selection for
Training)
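Before committing to a training run, it can help to confirm which class of GPU a node exposes. The sketch below uses PyTorch's documented device-property query; the printed name and memory figure depend on the actual hardware (an H100 would report on the order of 80 GiB of HBM).

```python
# Minimal sketch: query the visible GPU to check it is a training-class part
# with sufficient on-package memory before launching a large training job.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(props.name)                                   # e.g. "NVIDIA H100 80GB HBM3"
    print(f"{props.total_memory / 1024**3:.0f} GiB")    # on-package memory capacity
    print(f"compute capability {props.major}.{props.minor}")
```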
A customer is evaluating an AI cluster for training and is questioning why they should use a large
number of nodes. Why would multi-node training be advantageous?
A
Explanation:
Multi-node training is advantageous when a model's size (its parameters, activations, and gradients) exceeds the memory capacity of a single GPU. Sharding the model across multiple GPUs and nodes with model, pipeline, or tensor parallelism, typically combined with data parallelism, makes training feasible and efficient. User count and inference scale are unrelated to training architecture needs, which center on distributing compute and memory. A minimal data-parallel sketch follows the reference below.
(Reference: NVIDIA AI Infrastructure and Operations Study Guide, Section on Multi-Node Training
Benefits)
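The sketch below assumes PyTorch with the NCCL backend and a launch via torchrun (which sets RANK, WORLD_SIZE, and LOCAL_RANK in the environment). Each process holds a model replica and gradients are all-reduced across nodes; fitting a model larger than one GPU's memory additionally requires model, pipeline, or tensor parallelism, which this sketch does not show.

```python
# Minimal sketch of multi-node data-parallel training with PyTorch DDP,
# assuming the job is launched with torchrun on every node.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group(backend="nccl")       # NCCL carries the GPU-to-GPU traffic
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(1024, 1024).cuda()
model = DDP(model, device_ids=[local_rank])   # gradients are all-reduced across ranks

x = torch.randn(32, 1024, device="cuda")
loss = model(x).sum()
loss.backward()                               # backward pass triggers the all-reduce
dist.destroy_process_group()
```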
When should RoCE be considered to enhance network performance in a multi-node AI computing
environment?
C
Explanation:
RoCE (RDMA over Converged Ethernet) enhances network performance by offloading data transport
to the NIC via RDMA, bypassing CPU involvement. It’s particularly valuable when high CPU utilization
limits bandwidth usage, as it reduces overhead and unlocks full link capacity. While RoCE can also carry storage traffic, it performs poorly on lossy networks because RDMA expects a lossless or congestion-managed fabric, making CPU-bound scenarios its prime use case.
(Reference: NVIDIA Networking Documentation, Section on RoCE Benefits)
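As a hedged configuration sketch (the interface name and GID index are illustrative, site-specific values), the documented NCCL environment variables below steer a PyTorch/NCCL training job's collectives onto the RoCE path so that bulk data transport bypasses the CPU.

```python
# Minimal sketch: NCCL settings commonly used to select the RDMA (RoCE) transport.
# Values shown are placeholders; confirm them against the local fabric configuration.
import os

os.environ["NCCL_IB_DISABLE"] = "0"        # keep the IB/RoCE RDMA transport enabled
os.environ["NCCL_IB_GID_INDEX"] = "3"      # GID index commonly associated with RoCE v2
os.environ["NCCL_SOCKET_IFNAME"] = "eth0"  # interface used for NCCL bootstrap traffic

# ...then launch the distributed job (e.g., torch.distributed with backend="nccl").
```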