Google Professional Machine Learning Engineer Practice Test

Professional Machine Learning Engineer

Last exam update: Dec 02, 2023

Question 1

Your team is building an application for a global bank that will be used by millions of customers. You built a forecasting
model that predicts customers' account balances 3 days in the future. Your team will use the results in a new feature that will
notify users when their account balance is likely to drop below $25. How should you serve your predictions?

  • A. 1. Create a Pub/Sub topic for each user. 2. Deploy a Cloud Function that sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.
  • B. 1. Create a Pub/Sub topic for each user. 2. Deploy an application on the App Engine standard environment that sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.
  • C. 1. Build a notification system on Firebase. 2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when the average of all account balance predictions drops below the $25 threshold.
  • D. 1. Build a notification system on Firebase. 2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.
Answer:

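D

Explanation:
A Pub/Sub topic per user does not scale to millions of customers, and the alert must depend on each user's own predicted balance rather than on the average across users. Firebase Cloud Messaging is built for per-user push notifications.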
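
The difference between a per-user threshold check and the averaged check described in option C can be seen in a minimal sketch. The user IDs and balances below are made-up illustration data, and `users_to_notify` is a hypothetical helper, not part of any Google Cloud or Firebase API.

```python
THRESHOLD = 25.0  # dollars, taken from the question


def users_to_notify(predicted_balances, threshold=THRESHOLD):
    """Return the IDs of users whose own predicted balance is below the threshold."""
    return sorted(u for u, b in predicted_balances.items() if b < threshold)


# Hypothetical 3-day-ahead predictions keyed by user ID.
predictions = {"u1": 310.0, "u2": 12.5, "u3": 24.99}

per_user = users_to_notify(predictions)

# The averaged check hides the two users who are actually at risk.
average = sum(predictions.values()) / len(predictions)
average_triggers = average < THRESHOLD
```

Here two users fall below the threshold individually while the average across all users stays far above it.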

Question 2

You started working on a classification problem with time series data and achieved an area under the receiver operating
characteristic curve (AUC ROC) value of 99% for training data after just a few experiments. You haven't explored using any
sophisticated algorithms or spent any time on hyperparameter tuning. What should your next step be to identify and fix the
problem?

  • A. Address the model overfitting by using a less complex algorithm.
  • B. Address data leakage by applying nested cross-validation during model training.
  • C. Address data leakage by removing features highly correlated with the target value.
  • D. Address the model overfitting by tuning the hyperparameters to reduce the AUC ROC value.
Answer:

B
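
Nested cross-validation keeps model selection (inner folds) separate from performance estimation (outer folds). The sketch below only generates index splits; it assumes forward-chaining folds so that validation data always comes after training data in time, which is a choice made here for time series and is not stated in the question. The function names are hypothetical.

```python
def forward_splits(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs where every test index follows all train indices."""
    fold = n_samples // (n_folds + 1)  # any remainder at the end is left unused
    for k in range(1, n_folds + 1):
        train = list(range(0, k * fold))
        test = list(range(k * fold, (k + 1) * fold))
        yield train, test


def nested_forward_splits(n_samples, outer_folds, inner_folds):
    """Yield (outer_train, outer_test, inner) where inner splits use outer_train only."""
    for outer_train, outer_test in forward_splits(n_samples, outer_folds):
        inner = [
            ([outer_train[i] for i in tr], [outer_train[i] for i in va])
            for tr, va in forward_splits(len(outer_train), inner_folds)
        ]
        yield outer_train, outer_test, inner


splits = list(nested_forward_splits(n_samples=24, outer_folds=3, inner_folds=2))
```

Hyperparameters would be chosen on the inner splits and the chosen configuration scored once on the matching outer test fold, so the outer test data never influences model selection.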

Question 3

You are training an LSTM-based model on AI Platform to summarize text using the following job submission script:
gcloud ai-platform jobs submit training $JOB_NAME \
  --package-path $TRAINER_PACKAGE_PATH \
  --module-name $MAIN_TRAINER_MODULE \
  --job-dir $JOB_DIR \
  --region $REGION \
  --scale-tier basic \
  -- \
  --epochs 20 \
  --batch_size=32 \
  --learning_rate=0.001
You want to ensure that training time is minimized without significantly compromising the accuracy of your model. What
should you do?

  • A. Modify the ‘epochs’ parameter.
  • B. Modify the ‘scale-tier’ parameter.
  • C. Modify the ‘batch size’ parameter.
  • D. Modify the ‘learning rate’ parameter.
Answer:

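B

Explanation:
The basic scale tier runs on a single worker. A larger tier adds compute and shortens training without touching the hyperparameters that determine accuracy.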

Question 4

You are developing models to classify customer support emails. You created models with TensorFlow Estimators using
small datasets on your on-premises system, but you now need to train the models using large datasets to ensure high
performance. You will port your models to Google Cloud and want to minimize code refactoring and infrastructure overhead
for easier migration from on-prem to cloud. What should you do?

  • A. Use AI Platform for distributed training.
  • B. Create a cluster on Dataproc for training.
  • C. Create a Managed Instance Group with autoscaling.
  • D. Use Kubeflow Pipelines to train on a Google Kubernetes Engine cluster.
Answer:

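A

Explanation:
AI Platform runs distributed training for TensorFlow Estimators as a managed service, so there is no cluster to operate and little code to change.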

Question 5

Your data science team needs to rapidly experiment with various features, model architectures, and hyperparameters. They
need to track the accuracy metrics for various experiments and use an API to query the metrics over time. What should they
use to track and report their experiments while minimizing manual effort?

  • A. Use Kubeflow Pipelines to execute the experiments. Export the metrics file, and query the results using the Kubeflow Pipelines API.
  • B. Use AI Platform Training to execute the experiments. Write the accuracy metrics to BigQuery, and query the results using the BigQuery API.
  • C. Use AI Platform Training to execute the experiments. Write the accuracy metrics to Cloud Monitoring, and query the results using the Monitoring API.
  • D. Use AI Platform Notebooks to execute the experiments. Collect the results in a shared Google Sheets file, and query the results using the Google Sheets API.
Answer:

B

Question 6

You manage a team of data scientists who use a cloud-based backend system to submit training jobs. This system has
become very difficult to administer, and you want to use a managed service instead. The data scientists you work with use
many different frameworks, including Keras, PyTorch, Theano, scikit-learn, and custom libraries. What should you do?

  • A. Use the AI Platform custom containers feature to receive training jobs using any framework.
  • B. Configure Kubeflow to run on Google Kubernetes Engine and receive training jobs through TF Job.
  • C. Create a library of VM images on Compute Engine, and publish these images on a centralized repository.
  • D. Set up Slurm workload manager to receive jobs that can be scheduled to run on your cloud infrastructure.
Answer:

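A

Explanation:
Custom containers let a managed training service run jobs built with any framework or library.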

Question 7

As the lead ML Engineer for your company, you are responsible for building ML models to digitize scanned customer forms.
You have developed a TensorFlow model that converts the scanned images into text and stores them in Cloud Storage. You
need to use your ML model on the aggregated data collected at the end of each day with minimal manual intervention. What
should you do?

  • A. Use the batch prediction functionality of AI Platform.
  • B. Create a serving pipeline in Compute Engine for prediction.
  • C. Use Cloud Functions for prediction each time a new data point is ingested.
  • D. Deploy the model on AI Platform and create a version of it for online inference.
Answer:

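A

Explanation:
The data is scored in bulk once a day, which is a batch workload. Online inference is meant for per-request, low-latency predictions.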

Question 8

You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to
execute the same preprocessing at prediction time. You deployed the model on AI Platform for high-throughput online
prediction. Which architecture should you use?

  • A. Validate the accuracy of the model that you trained on preprocessed data. Create a new model that uses the raw data and is available in real time. Deploy the new model onto AI Platform for online prediction.
  • B. Send incoming prediction requests to a Pub/Sub topic. Transform the incoming data using a Dataflow job. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
  • C. Stream incoming prediction request data into Cloud Spanner. Create a view to abstract your preprocessing logic. Query the view every second for new records. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
  • D. Send incoming prediction requests to a Pub/Sub topic. Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic. Implement your preprocessing logic in the Cloud Function. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
Answer:

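B

Explanation:
Dataflow scales to computationally expensive transformations on high-throughput streams, whereas Cloud Functions are poorly suited to heavy processing.
Reference: https://cloud.google.com/pubsub/docs/publisher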

Question 9

You are an ML engineer at a global shoe store. You manage the ML models for the company's website. You are asked to
build a model that will recommend new products to the user based on their purchase behavior and similarity with other users.
What should you do?

  • A. Build a classification model
  • B. Build a knowledge-based filtering model
  • C. Build a collaborative-based filtering model
  • D. Build a regression model using the features as predictors
Answer:

C

Explanation:
Reference: https://cloud.google.com/solutions/recommendations-using-machine-learning-on-compute-engine
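
Collaborative filtering recommends items that similar users bought. The sketch below is a minimal user-based variant using cosine similarity over purchase sets; the users, product IDs, and function names are made up for illustration, and a production recommender would more likely use matrix factorization or a managed service.

```python
from math import sqrt

# Hypothetical purchase history: user -> set of purchased product IDs.
purchases = {
    "ana": {"runner", "trail", "sandal"},
    "ben": {"runner", "trail", "boot"},
    "cho": {"loafer", "oxford"},
}


def similarity(a, b):
    """Cosine similarity between two sets of purchased items."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def recommend(user, purchases, top_n=3):
    """Rank items the user has not bought by the similarity of the users who bought them."""
    scores = {}
    for other, items in purchases.items():
        if other == user:
            continue
        weight = similarity(purchases[user], items)
        for item in items - purchases[user]:
            scores[item] = scores.get(item, 0.0) + weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


suggestions = recommend("ana", purchases)
```

In this toy data "ana" shares two purchases with "ben", so the one item "ben" bought that "ana" has not is the only suggestion.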

Question 10

Your organization wants to make its internal shuttle service route more efficient. The shuttles currently stop at all pick-up
points across the city every 30 minutes between 7 am and 10 am. The development team has already built an application on
Google Kubernetes Engine that requires users to confirm their presence and shuttle station one day in advance. What
approach should you take?

  • A. 1. Build a tree-based regression model that predicts how many passengers will be picked up at each shuttle station. 2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the prediction.
  • B. 1. Build a tree-based classification model that predicts whether the shuttle should pick up passengers at each shuttle station. 2. Dispatch an available shuttle and provide the map with the required stops based on the prediction.
  • C. 1. Define the optimal route as the shortest route that passes by all shuttle stations with confirmed attendance at the given time under capacity constraints. 2. Dispatch an appropriately sized shuttle and indicate the required stops on the map.
  • D. 1. Build a reinforcement learning model with tree-based classification models that predict the presence of passengers at shuttle stops as agents and a reward function around a distance-based metric. 2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the simulated outcome.
Answer:

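C

Explanation:
Riders confirm their station a day in advance, so demand is known and nothing needs to be predicted. The task is a route optimization problem.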
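
Option C describes a deterministic routing problem rather than a prediction problem. A minimal sketch, assuming stations are points on a plane, distances are Euclidean, and a single shuttle serves all confirmed riders; the station names, coordinates, rider counts, and shuttle capacities are made up, and brute force is only practical for a handful of stops.

```python
from itertools import permutations
from math import dist


def shortest_route(depot, stops):
    """Brute-force the shortest depot -> stops -> depot tour."""
    best_order, best_length = (), float("inf")
    for order in permutations(stops):
        path = [depot, *(stops[name] for name in order), depot]
        length = sum(dist(a, b) for a, b in zip(path, path[1:]))
        if length < best_length:
            best_order, best_length = order, length
    return list(best_order), best_length


def pick_shuttle(passengers, capacities):
    """Return the smallest capacity that fits everyone, or None if none does."""
    fitting = [c for c in sorted(capacities) if c >= passengers]
    return fitting[0] if fitting else None


depot = (0.0, 0.0)
# Hypothetical confirmations: station -> ((x, y), confirmed riders).
confirmed = {"north": ((0.0, 3.0), 5), "east": ((4.0, 0.0), 0), "corner": ((4.0, 3.0), 7)}

# Skip stations with no confirmed riders.
active = {name: xy for name, (xy, riders) in confirmed.items() if riders > 0}
route, length = shortest_route(depot, active)
shuttle = pick_shuttle(sum(riders for _, riders in confirmed.values()), [8, 16, 32])
```

The station with no confirmations is dropped from the route, and the 12 confirmed riders select the 16-seat shuttle.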

Question 11

You have been asked to develop an input pipeline for an ML training model that processes images from disparate sources at
low latency. You discover that your input data does not fit in memory. How should you create a dataset following
Google-recommended best practices?

  • A. Create a tf.data.Dataset.prefetch transformation.
  • B. Convert the images to tf.Tensor objects, and then run Dataset.from_tensor_slices().
  • C. Convert the images to tf.Tensor objects, and then run tf.data.Dataset.from_tensors().
  • D. Convert the images into TFRecords, store the images in Cloud Storage, and then use the tf.data API to read the images for training.
Answer:

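D

Explanation:
Dataset.from_tensor_slices() and Dataset.from_tensors() require the data to fit in memory. TFRecord files read through the tf.data API are streamed from storage.
Reference: https://www.tensorflow.org/api_docs/python/tf/data/Dataset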

Question 12

You work for a large technology company that wants to modernize their contact center. You have been asked to develop a
solution to classify incoming calls by product so that requests can be more quickly routed to the correct support team. You
have already transcribed the calls using the Speech-to-Text API. You want to minimize data preprocessing and development
time. How should you build the model?

  • A. Use the AI Platform Training built-in algorithms to create a custom model.
  • B. Use AutoML Natural Language to extract custom entities for classification.
  • C. Use the Cloud Natural Language API to extract custom entities for classification.
  • D. Build a custom model to identify the product keywords from the transcribed calls, and then run the keywords through a classification algorithm.
Answer:

A

Question 13

You are an ML engineer at a global car manufacturer. You need to build an ML model to predict car sales in different cities
around the world. Which features or feature crosses should you use to train city-specific relationships between car type and
number of sales?

  • A. Three individual features: binned latitude, binned longitude, and one-hot encoded car type.
  • B. One feature obtained as an element-wise product between latitude, longitude, and car type.
  • C. One feature obtained as an element-wise product between binned latitude, binned longitude, and one-hot encoded car type.
  • D. Two feature crosses as an element-wise product: the first between binned latitude and one-hot encoded car type, and the second between binned longitude and one-hot encoded car type.
Answer:

C
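
The single three-way cross in option C can be sketched as one one-hot vector with a slot for every (latitude bin, longitude bin, car type) combination, so a model can learn a separate weight per city-sized cell and car type. The bin counts and car-type vocabulary below are arbitrary assumptions, and the helper names are hypothetical.

```python
CAR_TYPES = ["sedan", "suv", "truck"]  # hypothetical vocabulary
N_LAT_BINS, N_LON_BINS = 4, 8          # hypothetical grid resolution


def bin_index(value, low, high, n_bins):
    """Map a value in [low, high] to an integer bin in [0, n_bins - 1]."""
    width = (high - low) / n_bins
    return min(int((value - low) / width), n_bins - 1)


def crossed_feature(lat, lon, car_type):
    """One-hot encode the cross of binned latitude, binned longitude, and car type."""
    lat_bin = bin_index(lat, -90.0, 90.0, N_LAT_BINS)
    lon_bin = bin_index(lon, -180.0, 180.0, N_LON_BINS)
    car_idx = CAR_TYPES.index(car_type)
    # Flatten the (lat_bin, lon_bin, car_idx) triple into a single position.
    hot = (lat_bin * N_LON_BINS + lon_bin) * len(CAR_TYPES) + car_idx
    vector = [0] * (N_LAT_BINS * N_LON_BINS * len(CAR_TYPES))
    vector[hot] = 1
    return vector


paris_suv = crossed_feature(48.86, 2.35, "suv")
```

Crossing latitude and longitude separately with car type, as in option D, would tie together every city in the same latitude band or longitude band instead of isolating each cell.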

Question 14

You work for a bank and are building a random forest model for fraud detection. You have a dataset that includes
transactions, of which 1% are identified as fraudulent. Which data transformation strategy would likely improve the
performance of your classifier?

  • A. Write your data in TFRecords.
  • B. Z-normalize all the numeric features.
  • C. Oversample the fraudulent transactions 10 times.
  • D. Use one-hot encoding on all categorical features.
Answer:

C

Explanation:
Reference: https://towardsdatascience.com/how-to-build-a-machine-learning-model-to-identify-credit-card-fraud-in-5-stepsa-hands-on-modeling-5140b3bd19f1
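
Oversampling the minority class by a fixed factor can be sketched in a few lines. The transaction records and the `oversample_minority` helper below are made up for illustration; in practice the resampling would normally be applied to the training split only, so that evaluation data keeps the original class ratio.

```python
import random


def oversample_minority(rows, label_key, minority_label, factor, seed=0):
    """Repeat each minority-class row `factor` times and shuffle the result."""
    minority = [r for r in rows if r[label_key] == minority_label]
    majority = [r for r in rows if r[label_key] != minority_label]
    resampled = majority + minority * factor
    random.Random(seed).shuffle(resampled)
    return resampled


# Hypothetical dataset: 1,000 transactions, 1% labeled as fraud.
transactions = [{"amount": i, "fraud": 1 if i % 100 == 0 else 0} for i in range(1000)]

balanced = oversample_minority(transactions, "fraud", 1, factor=10)
fraud_share = sum(r["fraud"] for r in balanced) / len(balanced)
```

With these numbers the fraud share rises from 1% to roughly 9% of the resampled rows.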

Question 15

You need to build classification workflows over several structured datasets currently stored in BigQuery. Because you will be
performing the classification several times, you want to complete the following steps without writing code: exploratory data
analysis, feature selection, model building, training, hyperparameter tuning, and serving. What should you do?

  • A. Configure AutoML Tables to perform the classification task.
  • B. Run a BigQuery ML task to perform logistic regression for the classification.
  • C. Use AI Platform Notebooks to run the classification model with pandas library.
  • D. Use AI Platform to run the classification model job configured for hyperparameter tuning.
Answer:

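A

Explanation:
AutoML Tables covers feature selection, model building, training, hyperparameter tuning, and serving without code. BigQuery ML supports supervised learning with the logistic regression model type, but models are created by writing SQL.
Reference: https://cloud.google.com/bigquery-ml/docs/logistic-regression-prediction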
