CompTIA DY0-001 Practice Test

Exam Title: CompTIA DataX

Last update: Nov 27, 2025
Question 1

The term "greedy algorithms" refers to machine-learning algorithms that:

  • A. update priors as more data is seen.
  • B. examine every node of a tree before making a decision.
  • C. apply a theoretical model to the distribution of the data.
  • D. make the locally optimal decision.
Answer:

D


Explanation:
Greedy algorithms build the solution iteratively by choosing at each step the option that appears
best at that moment, without reconsidering earlier choices.
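
As a minimal sketch of that behavior, the coin-change routine below always takes the largest coin that still fits and never revisits a choice. The function name and the denominations are illustrative assumptions, not part of the exam material. Decision-tree learners such as CART pick splits in the same step-by-step fashion.

```python
def greedy_change(amount, denominations):
    """Pick the largest coin that fits at each step; never revisit a choice."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:
            coins.append(coin)
            amount -= coin
    return coins


# Hypothetical denominations chosen to show that a locally optimal
# choice is not always globally optimal (3 + 3 would need only two coins).
print(greedy_change(6, [4, 3, 1]))  # [4, 1, 1]
```

The greedy result uses three coins where two would do, which is the trade-off the term implies: fast, simple decisions with no guarantee of a global optimum.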

Question 2

A data scientist is deploying a model that needs to be accessed by multiple departments with
minimal development effort by the departments. Which of the following APIs would be best for the
data scientist to use?

  • A. SOAP
  • B. RPC
  • C. JSON
  • D. REST
Answer:

D


Explanation:
RESTful APIs use standard HTTP methods and lightweight data formats (typically JSON), making them
easy for diverse teams to integrate with minimal effort and without heavy tooling.

Question 3

Which of the following compute delivery models allows packaging of only critical dependencies
while developing a reusable asset?

  • A. Thin clients
  • B. Containers
  • C. Virtual machines
  • D. Edge devices
Answer:

B


Explanation:
Containers encapsulate just the application and its critical dependencies on a lightweight runtime,
making the resulting asset portable and reusable without bundling an entire operating system.

Question 4

A data analyst is analyzing data and would like to build conceptual associations. Which of the
following is the best way to accomplish this task?

  • A. n-grams
  • B. NER
  • C. TF-IDF
  • D. POS
Answer:

A


Explanation:
n-grams capture contiguous sequences of words, revealing which terms co-occur and form
meaningful multi-word concepts. By analyzing these frequent word combinations, you directly
uncover conceptual associations in the text.
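
A rough illustration of the idea, using only the standard library: the helper `ngrams` and the sample sentence are assumptions made up for this sketch. It counts bigrams and reports the most frequent pair.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample text, tokenized naively on whitespace.
text = "machine learning models need data and machine learning needs compute"
tokens = text.lower().split()

bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(1))  # [(('machine', 'learning'), 2)]
```

The pair that recurs ("machine learning") surfaces as a multi-word concept, while pairs seen only once carry little evidence of association.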

Question 5

Which of the following belong in a presentation to the senior management team and/or C-suite
executives? (Choose two.)

  • A. Full literature reviews
  • B. Code snippets
  • C. Final recommendations
  • D. High-level results
  • E. Detailed explanations of statistical tests
  • F. Security keys and login information
Answer:

C, D


Explanation:
Senior leaders need actionable insights and the overarching outcomes, not the implementation
details, so you present your key recommendations alongside a summary of results at a high level.

Question 6

During EDA, a data scientist wants to look for patterns, such as linearity, in the data. Which of the
following plots should the data scientist use?

  • A. Violin
  • B. Box-and-whisker
  • C. Scatter
  • D. Q-Q
Answer:

C


Explanation:
Scatter plots display pairs of numeric values on two axes, letting you visually assess relationships and
patterns, such as linear trends, between variables.

Question 7

Which of the following distribution methods or models can most effectively represent the actual
arrival times of a bus that runs on an hourly schedule?

  • A. Binomial
  • B. Exponential
  • C. Normal
  • D. Poisson
Answer:

C


Explanation:
Scheduled buses tend to arrive around a fixed time with random delays that cluster symmetrically
around the hour. A normal distribution effectively models those continuous, bell-shaped deviations
from the exact schedule.
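
To make this concrete, the sketch below models the delay in minutes with `statistics.NormalDist` from the Python standard library. The mean of 0 and standard deviation of 3 minutes are assumed values for illustration only; nothing in the question specifies them.

```python
from statistics import NormalDist

# Assumed parameters: delays centered on the scheduled time (0 minutes)
# with a standard deviation of 3 minutes.
delay = NormalDist(mu=0.0, sigma=3.0)

# Probability that the bus arrives within 5 minutes of the schedule,
# early or late.
p_within_5 = delay.cdf(5) - delay.cdf(-5)
print(round(p_within_5, 3))
```

Under these assumptions roughly nine in ten arrivals fall inside the five-minute window, and the symmetry of the distribution means early and late arrivals are equally likely.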

Question 8

A data scientist has constructed a model that meets the minimum performance requirements
specified in the proposal for a prediction project. The data scientist thinks the model's accuracy
should be improved, but the proposed deadline is approaching. Which of the following actions
should the data scientist take first?

  • A. Continue collecting data.
  • B. Request additional funding.
  • C. Consult the key project stakeholder.
  • D. Test additional model specifications.
Answer:

C


Explanation:
Since the model already meets the agreed-upon requirements and the deadline is near, the first step
is to confirm with the stakeholder whether pursuing further accuracy gains is worth the additional
time and resources. This ensures you align with business priorities before collecting more data,
requesting funding, or tweaking the model further.

Question 9

Which of the following best describes the minimization of the residual term in a ridge linear
regression?

  • A. |e|
  • B. e
  • C. e²
  • D. 0
Answer:

C


Explanation:
Ridge regression extends ordinary least squares by adding an L2 penalty on the coefficients, but it
still minimizes the sum of squared residuals (e²) as its loss term.
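
The objective can be sketched in plain Python as the sum of squared residuals plus the penalty. The function name `ridge_loss`, the penalty weight `lam`, and the toy data are assumptions for this example; leaving the intercept unpenalized is a common convention rather than something the question states.

```python
def ridge_loss(X, y, coefs, intercept, lam):
    """Sum of squared residuals plus lam times the sum of squared coefficients."""
    squared_residuals = 0.0
    for row, target in zip(X, y):
        prediction = intercept + sum(b * x for b, x in zip(coefs, row))
        squared_residuals += (target - prediction) ** 2
    penalty = lam * sum(b ** 2 for b in coefs)  # intercept is not penalized
    return squared_residuals + penalty


# Hypothetical data that lies exactly on y = 2x, so every residual is zero
# and only the L2 penalty (0.5 * 2.0**2) contributes to the loss.
X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]
print(ridge_loss(X, y, coefs=[2.0], intercept=0.0, lam=0.5))  # 2.0
```

Setting `lam` to 0 recovers the ordinary least squares loss, which shows that the residual term itself is unchanged by the ridge penalty.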

Question 10

A statistician notices gaps in data associated with age-related illnesses and wants to further
aggregate these observations. Which of the following is the best technique to achieve this goal?

  • A. Label encoding
  • B. Linearization
  • C. Binning
  • D. Imputing
Answer:

C


Explanation:
Binning groups continuous age values into discrete intervals (e.g., age ranges), filling gaps by
aggregating observations into broader categories. This directly addresses uneven or sparse age data
by creating consistent age groups.
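
A small sketch of binning with the standard library follows. The helper `age_bin`, the 10-year bin width, and the sample ages are all hypothetical choices made for illustration.

```python
from collections import Counter


def age_bin(age, width=10):
    """Map an age to a fixed-width range label such as '60-69'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Hypothetical, sparse age observations.
ages = [61, 63, 68, 72, 79, 85]
bin_counts = Counter(age_bin(a) for a in ages)
print(bin_counts)  # Counter({'60-69': 3, '70-79': 2, '80-89': 1})
```

Individual ages that appear only once are aggregated into ranges with several observations each, which is easier to summarize and compare.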
