
AWS Certified Machine Learning Engineer Associate (MLA-C01) - 340 Questions

By Webmaster Certland

Practice exam for the AWS Certified Machine Learning Engineer Associate (MLA-C01). Covers data preparation, ML model development, deployment and orchestration of ML workflows, and ML solution monitoring, maintenance, and security.


Free Preview (5 of 340 questions)

1. A machine learning engineer needs to organize raw training datasets in Amazon S3 so that SageMaker training jobs can efficiently load only the data belonging to a specific experiment without scanning the entire bucket. Which S3 design practice best achieves this?

A Design a structured prefix hierarchy such as s3://bucket/experiments//train/ and configure the SageMaker training job data channel to point to that prefix
B Store all dataset files in the bucket root with no prefix and use S3 Select to filter objects by experiment during training
C Create a separate S3 bucket for each experiment and grant the SageMaker execution role access to only the relevant bucket
D Tag every S3 object with an experiment ID key-value pair and configure the SageMaker data channel to filter by that tag
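For context on the prefix-based layout described in option A, here is a minimal sketch of a SageMaker `CreateTrainingJob` data channel in the boto3 request shape; the bucket name and the `experiments/<id>/<channel>/` naming convention are placeholder assumptions, not part of the SageMaker API:

```python
def training_channel(bucket: str, experiment_id: str, channel: str = "train") -> dict:
    """Build one InputDataConfig entry for a SageMaker CreateTrainingJob
    request (boto3 shape). The experiments/<id>/<channel>/ layout is an
    assumed naming convention for isolating each experiment's data."""
    return {
        "ChannelName": channel,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",  # load every object under this prefix only
                "S3Uri": f"s3://{bucket}/experiments/{experiment_id}/{channel}/",
                "S3DataDistributionType": "FullyReplicated",
            }
        },
    }
```

With `S3DataType: "S3Prefix"`, the training job enumerates and downloads only the objects under that prefix, so other experiments' data is never scanned.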

2. A data engineering team stores ML training files in Amazon S3 using the Parquet format. They want to ensure SageMaker training jobs can read the data as fast as possible and reduce Amazon S3 API costs. Which S3 access pattern should they implement?

A Store all Parquet files under a single prefix and enable S3 Intelligent-Tiering to optimize read latency
B Distribute Parquet files across multiple prefixes using a partition scheme (e.g., date or hash) to exploit S3's per-prefix request rate limits
C Enable S3 Transfer Acceleration on the bucket to increase GET request throughput
D Enable S3 Requester Pays so that training jobs bear the cost and the bucket owner experiences no throttling
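The partition scheme mentioned in option B can be sketched as a key-naming helper that hashes filenames into a fixed number of prefixes; the `shard=NN` convention and shard count are assumptions, since S3 scales request throughput per prefix:

```python
import hashlib

def partitioned_key(dataset: str, filename: str, shards: int = 16) -> str:
    """Spread objects across hash-bucketed prefixes so parallel reads fan
    out over several prefixes instead of concentrating on one.
    The shard=NN prefix scheme is an assumed convention, not an AWS API."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shards  # deterministic: same file, same shard
    return f"{dataset}/shard={shard:02d}/{filename}"
```

Because the mapping is deterministic, writers and readers agree on object locations without any lookup table.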

3. A machine learning team has raw CSV files landing in Amazon S3 every hour. They need to make those files queryable via standard SQL for ad-hoc analysis without writing custom ETL code. Which AWS service combination is the MOST appropriate?

A Use an AWS Glue Crawler to catalog the S3 CSV files in the Glue Data Catalog, then query them using Amazon Athena
B Load the CSV files into Amazon Redshift using a COPY command, then run SQL queries against the Redshift cluster
C Write an AWS Glue ETL job to parse the CSVs and insert rows into Amazon RDS MySQL, then query RDS directly
D Launch an Amazon EMR cluster with Apache Spark and use SparkSQL to query the CSV files in S3
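The Glue Crawler half of option A's combination can be sketched as the keyword arguments for boto3's `glue.create_crawler`; the crawler name, role ARN, database, path, and hourly schedule are all placeholder assumptions:

```python
def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Keyword arguments for boto3 glue_client.create_crawler (sketch).
    All names and ARNs are placeholders."""
    return {
        "Name": name,
        "Role": role_arn,            # IAM role the crawler assumes to read S3
        "DatabaseName": database,    # Glue Data Catalog database for the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        "Schedule": "cron(0 * * * ? *)",  # re-crawl hourly as new CSVs land
    }
```

Once the crawler populates the Data Catalog, Athena can run standard SQL (e.g. `SELECT * FROM db.table WHERE ...`) directly against the S3 files with no ETL.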

4. A company wants to ingest real-time clickstream events into a data lake on Amazon S3 for downstream ML model training. The events arrive at up to 50,000 records per second and must be stored in Parquet format with no custom servers to manage. Which AWS service should they use?

A Use Amazon Kinesis Data Streams with a custom AWS Lambda consumer that writes batched records to Amazon S3
B Use AWS Glue Streaming ETL to read from the source system and write Parquet files to Amazon S3
C Use Amazon Kinesis Data Firehose with dynamic partitioning and Parquet conversion enabled, delivering directly to Amazon S3
D Use Amazon Managed Streaming for Apache Kafka (MSK) with a Kafka Connect S3 sink connector
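Option C's setup can be sketched as the `ExtendedS3DestinationConfiguration` block of a Firehose `create_delivery_stream` call; the ARNs, Glue table (which supplies the Parquet schema), and the JQ partition key are placeholder assumptions:

```python
def firehose_s3_destination(bucket_arn: str, role_arn: str,
                            glue_db: str, glue_table: str) -> dict:
    """ExtendedS3DestinationConfiguration for firehose
    create_delivery_stream (sketch); all names/ARNs are placeholders."""
    return {
        "RoleARN": role_arn,
        "BucketARN": bucket_arn,
        "BufferingHints": {"SizeInMBs": 128, "IntervalInSeconds": 300},
        "DynamicPartitioningConfiguration": {"Enabled": True},
        # partition key extracted from each record by the processor below
        "Prefix": "events/site=!{partitionKeyFromQuery:site}/",
        "ErrorOutputPrefix": "errors/",
        "ProcessingConfiguration": {
            "Enabled": True,
            "Processors": [{
                "Type": "MetadataExtraction",
                "Parameters": [
                    {"ParameterName": "MetadataExtractionQuery",
                     "ParameterValue": "{site: .site}"},
                    {"ParameterName": "JsonParsingEngine",
                     "ParameterValue": "JQ-1.6"},
                ],
            }],
        },
        "DataFormatConversionConfiguration": {
            "Enabled": True,
            "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
            "OutputFormatConfiguration": {"Serializer": {"ParquetSerDe": {}}},
            "SchemaConfiguration": {  # Glue table defines the Parquet schema
                "RoleARN": role_arn,
                "DatabaseName": glue_db,
                "TableName": glue_table,
            },
        },
    }
```

Firehose then buffers, converts to Parquet, and delivers to partitioned S3 prefixes with no servers to manage.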

5. A machine learning engineer needs to reuse precomputed features across multiple SageMaker training jobs to avoid redundant feature computation. The features must be retrieved with low latency (< 10 ms) during online inference as well. Which AWS service should the engineer use?

A Store computed features in Amazon DynamoDB and write custom SageMaker data loading code to read from DynamoDB during training
B Use Amazon SageMaker Feature Store with both the online store and offline store enabled for each feature group
C Cache computed features in Amazon ElastiCache for Redis and load them into S3 before each training job
D Register computed features as tables in the AWS Glue Data Catalog and reference them from SageMaker training scripts
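Option B's online store is written and read through the `sagemaker-featurestore-runtime` API; here is a sketch of a `PutRecord` payload in the boto3 shape, with the feature group and feature names invented for illustration:

```python
def feature_record(feature_group: str, features: dict) -> dict:
    """Keyword arguments for the sagemaker-featurestore-runtime PutRecord
    call (boto3 shape). Feature Store transports all values as strings."""
    return {
        "FeatureGroupName": feature_group,
        "Record": [
            {"FeatureName": name, "ValueAsString": str(value)}
            for name, value in features.items()
        ],
    }
```

At inference time, `get_record(FeatureGroupName=..., RecordIdentifierValueAsString=...)` fetches the same features from the online store with single-digit-millisecond latency, while the offline store feeds training jobs.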


Information

  • Questions: 340
  • Time limit: 2h 10min
  • Difficulty: Medium
  • Minimum score: 72%

