Cloudera CCA-500 Practice Test

Exam Title: Cloudera Certified Administrator for Apache Hadoop (CCAH)

Last update: Nov 27, 2025

Question 1

What does CDH packaging do on install to facilitate Kerberos security setup?

  • A. Automatically configures permissions for log files at $MAPRED_LOG_DIR/userlogs
  • B. Creates users for hdfs and mapreduce to facilitate role assignment
  • C. Creates directories for temp, hdfs, and mapreduce with the correct permissions
  • D. Creates a set of pre-configured Kerberos keytab files and their permissions
  • E. Creates and configures your kdc with default cluster values
Answer:

B

Question 2

You want to understand more about how users browse your public website. For example, you want to
know which pages they visit prior to placing an order. You have a server farm of 200 web servers
hosting your website. Which is the most efficient process to gather these web server logs into
your Hadoop cluster for analysis?

  • A. Sample the web server logs from the web servers and copy them into HDFS using curl
  • B. Ingest the server web logs into HDFS using Flume
  • C. Channel these clickstreams into Hadoop using Hadoop Streaming
  • D. Import all user clicks from your OLTP databases into Hadoop using Sqoop
  • E. Write a MapReduce job with the web servers for mappers and the Hadoop cluster nodes for reducers
Answer:

B

Question 3

Which three basic configuration parameters must you set to migrate your cluster from MapReduce 1
(MRv1) to MapReduce V2 (MRv2)?

  • A. Configure the NodeManager to enable MapReduce services on YARN by setting the following property in yarn-site.xml: <name>yarn.nodemanager.hostname</name> <value>your_nodeManager_shuffle</value>
  • B. Configure the NodeManager hostname and enable node services on YARN by setting the following property in yarn-site.xml: <name>yarn.nodemanager.hostname</name> <value>your_nodeManager_hostname</value>
  • C. Configure a default scheduler to run on YARN by setting the following property in mapred-site.xml: <name>mapreduce.jobtracker.taskScheduler</name> <value>org.apache.hadoop.mapred.JobQueueTaskScheduler</value>
  • D. Configure the number of map tasks per job on YARN by setting the following property in mapred: <name>mapreduce.job.maps</name> <value>2</value>
  • E. Configure the ResourceManager hostname and enable node services on YARN by setting the following property in yarn-site.xml: <name>yarn.resourcemanager.hostname</name> <value>your_resourceManager_hostname</value>
  • F. Configure MapReduce as a Framework running on YARN by setting the following property in mapred-site.xml: <name>mapreduce.framework.name</name> <value>yarn</value>
Answer:

A,E,F
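
Explanation:
As a rough illustration of the file layout these options refer to, the sketch below builds <configuration> documents in the shape Hadoop's *-site.xml files use. The helper name site_xml is hypothetical, the hostname is the placeholder from the question, and the yarn.nodemanager.aux-services entry does not appear in the options as printed; it is included on the assumption that it is the Hadoop 2.x shuffle property that option A is describing.

```python
import xml.etree.ElementTree as ET


def site_xml(properties):
    """Render a dict of name/value pairs as a Hadoop-style <configuration> document.

    Hypothetical helper for illustration; real clusters usually edit these
    files by hand or through a management tool.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# mapred-site.xml: run MapReduce as a framework on YARN (option F).
mapred_site = site_xml({"mapreduce.framework.name": "yarn"})

# yarn-site.xml: ResourceManager hostname (option E, placeholder value) and
# the shuffle auxiliary service (assumption: not printed in the options).
yarn_site = site_xml({
    "yarn.resourcemanager.hostname": "your_resourceManager_hostname",
    "yarn.nodemanager.aux-services": "mapreduce_shuffle",
})

print(mapred_site)
print(yarn_site)
```

The output is unindented XML; it is only meant to show which property goes in which file.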

Question 4

You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB.
Because your Hadoop cluster isn’t optimized for storing and processing many small files, you decide
to do the following actions:
1. Group the individual images into a set of larger files
2. Use the set of larger files as input for a MapReduce job that processes them directly with Python
using Hadoop Streaming.
Which data serialization system gives the flexibility to do this?

  • A. CSV
  • B. XML
  • C. HTML
  • D. Avro
  • E. SequenceFiles
  • F. JSON
Answer:

A,B
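
Explanation:
The small-files concern in this question can be made concrete with some rough arithmetic. The sketch below assumes 25 KB means 25 × 1024 bytes and assumes a 128 MiB HDFS block size; neither figure is stated in the question. It compares how many blocks HDFS would track if every image were its own file with how many it would track if the images were packed into block-sized container files.

```python
NUM_IMAGES = 60_000_000
IMAGE_BYTES = 25 * 1024            # assumption: "25 KB" read as 25 KiB
BLOCK_BYTES = 128 * 1024 * 1024    # assumption: 128 MiB HDFS block size

total_bytes = NUM_IMAGES * IMAGE_BYTES

# One file per image: every file occupies at least one block entry.
blocks_if_unpacked = NUM_IMAGES

# Images grouped into block-sized files: ceiling division on the total size.
blocks_if_packed = -(-total_bytes // BLOCK_BYTES)

print(f"total data: {total_bytes:,} bytes")
print(f"blocks, one file per image: {blocks_if_unpacked:,}")
print(f"blocks, packed into large files: {blocks_if_packed:,}")  # 11,445
```

Under these assumptions, grouping cuts the number of block entries from 60,000,000 to roughly eleven thousand, which is the motivation for step 1.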

Question 5

Identify two features/issues that YARN is designed to address:

  • A. Standardize on a single MapReduce API
  • B. Single point of failure in the NameNode
  • C. Reduce complexity of the MapReduce APIs
  • D. Resource pressure on the JobTracker
  • E. Ability to run frameworks other than MapReduce, such as MPI
  • F. HDFS latency
Answer:

D,E


Explanation:
Reference:
http://www.revelytix.com/?q=content/hadoop-ecosystem (YARN, first para)

Question 6

Which YARN daemon or service monitors a container’s per-application resource usage (e.g., memory,
CPU)?

  • A. ApplicationMaster
  • B. NodeManager
  • C. ApplicationManagerService
  • D. ResourceManager
Answer:

B

Question 7

Which is the default scheduler in YARN?

  • A. YARN doesn’t configure a default scheduler; you must first assign an appropriate scheduler class in yarn-site.xml
  • B. Capacity Scheduler
  • C. Fair Scheduler
  • D. FIFO Scheduler
Answer:

B


Explanation:
Reference:
http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html

Question 8

Which YARN process runs as “container 0” of a submitted job and is responsible for resource
requests?

  • A. ApplicationManager
  • B. JobTracker
  • C. ApplicationMaster
  • D. JobHistoryServer
  • E. ResourceManager
  • F. NodeManager
Answer:

C

Question 9

Which scheduler would you deploy to ensure that your cluster allows short jobs to finish within a
reasonable time without starving long-running jobs?

  • A. Complexity Fair Scheduler (CFS)
  • B. Capacity Scheduler
  • C. Fair Scheduler
  • D. FIFO Scheduler
Answer:

C


Explanation:
Reference:
http://hadoop.apache.org/docs/r1.2.1/fair_scheduler.html
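
The trade-off the question describes can be sketched with a toy model. The code below is not the Fair Scheduler's implementation; it is an idealized comparison in which all jobs are submitted at time 0, the cluster completes one unit of work per tick, FIFO runs jobs one after another, and fair sharing splits capacity equally among unfinished jobs. The function names and job sizes are hypothetical.

```python
def fifo_finish_times(jobs):
    """Finish time of each job when jobs run to completion in submission order."""
    clock = 0
    finished = {}
    for name, work in jobs:
        clock += work
        finished[name] = clock
    return finished


def fair_finish_times(jobs):
    """Finish time of each job when capacity is split equally among active jobs."""
    remaining = dict(jobs)
    clock = 0.0
    finished = {}
    while remaining:
        active = len(remaining)
        # The job with the least remaining work is the next to finish.
        name = min(remaining, key=remaining.get)
        delta = remaining[name]
        # Each active job gets 1/active of the capacity, so completing
        # `delta` units per job takes delta * active ticks.
        clock += delta * active
        for key in remaining:
            remaining[key] -= delta
        finished[name] = clock
        del remaining[name]
    return finished


jobs = [("long", 100), ("short", 10)]  # hypothetical work units
print(fifo_finish_times(jobs))  # short job waits behind the long one
print(fair_finish_times(jobs))  # short job finishes early; long job still completes
```

In this model the short job finishes at tick 20 under fair sharing instead of tick 110 under FIFO, while the long job is delayed only from tick 100 to tick 110, so neither job is starved.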

Question 10

Your cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN. What is the result
when you execute: hadoop jar SampleJar MyClass on a client machine?

  • A. SampleJar.jar is sent to the ApplicationMaster, which allocates a container for SampleJar.jar
  • B. SampleJar.jar is placed in a temporary directory in HDFS
  • C. SampleJar.jar is sent directly to the ResourceManager
  • D. SampleJar.jar is serialized into an XML file which is submitted to the ApplicationMaster
Answer:

B

Page 1 out of 5
Viewing questions 1-10 out of 60