AIF-C01 Deep Dive: Amazon Bedrock and Foundation Model Applications (Domains 2 & 3)
Domains 2 and 3 make up 52% of the AWS AI Practitioner exam. This deep dive covers transformer architecture, tokenization, inference parameters, Amazon Bedrock model families, RAG vs fine-tuning, Bedrock Knowledge Bases, Agents, Guardrails, and evaluation metrics — everything you need to master the highest-weighted sections of the AIF-C01.
If you are preparing for the AWS Certified AI Practitioner (AIF-C01), here is the most important number to remember: 52%. That is the combined weight of Domain 2 (Fundamentals of Generative AI, 24%) and Domain 3 (Applications of Foundation Models, 28%). These two domains alone determine whether you pass or fail. They cover the concepts behind generative AI — transformers, tokenization, embeddings, inference parameters — and the AWS services that bring those concepts to life, particularly Amazon Bedrock. In this deep dive, we break down every major topic in both domains with the detail you need to answer exam questions confidently.
- Transformer Architecture Essentials
- Tokenization, Embeddings, and Inference Parameters
- Amazon Bedrock Model Families
- RAG vs Fine-Tuning vs Prompt Engineering
- Amazon Bedrock Knowledge Bases
- Amazon Bedrock Agents
- Amazon Bedrock Guardrails
- Evaluation Metrics for Foundation Models
- SageMaker JumpStart for Fine-Tuning
- Practical Examples
Transformer Architecture Essentials
The transformer architecture, introduced in the 2017 paper "Attention Is All You Need," is the foundation of every modern large language model (LLM). For the AIF-C01 exam, you do not need the mathematics behind transformers, but you must know the key ideas at a conceptual level.
The core innovation of the transformer is the self-attention mechanism, which allows the model to weigh the importance of every word in a sentence relative to every other word. Unlike older recurrent neural networks (RNNs) that process text sequentially, transformers process all tokens in parallel, making them dramatically faster to train on large datasets.
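To make self-attention concrete, here is a minimal sketch of scaled dot-product attention in pure Python, with tiny hand-picked vectors standing in for token representations. This is an illustration of the mechanism only, not how any production model is implemented; the vectors and dimensions are made up for the demo.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for a single head, no batching.

    queries/keys/values are lists of vectors (lists of floats).
    Each query is scored against every key, the scores become weights
    via softmax, and the output is a weighted sum of the value vectors.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Score each key against this query, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)  # attention weights sum to 1
        # Weighted sum of the value vectors
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy example: the query points in the same direction as the first key,
# so the output is dominated by the first value vector.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
V = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
print(attention(Q, K, V))
```

The key exam takeaway is visible in the code: every query attends to every key simultaneously, which is why transformers parallelize where RNNs could not.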
Transformers come in three flavors, and the exam expects you to know the difference:
| Type | Architecture | Best For | Example Models |
|---|---|---|---|
| Encoder-only | Processes input to create representations | Classification, sentiment analysis, NER | BERT, RoBERTa |
| Decoder-only | Generates text token by token | Text generation, chat, code completion | GPT, Claude, Llama |
| Encoder-Decoder | Encodes input, then decodes output | Translation, summarization | T5, BART |
Tokenization, Embeddings, and Inference Parameters
Tokenization is the process of breaking input text into smaller units (tokens) that the model can process. A token is not always a word — it can be a word, a subword, or even a single character depending on the tokenizer. For exam purposes, remember that billing for foundation models is typically based on the number of input and output tokens, and that different models may tokenize the same text differently.
Embeddings are numerical vector representations of tokens. Words with similar meanings have embeddings that are close together in vector space. The exam tests whether you understand that embeddings enable semantic similarity search, which is the foundation of RAG (retrieval-augmented generation).
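The "close together in vector space" idea is usually measured with cosine similarity. Below is a short sketch using hypothetical 3-dimensional embeddings (real embedding models such as Titan Embeddings produce hundreds or thousands of dimensions; the vectors here are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: "king" and "queen" point in similar directions,
# "banana" does not.
king = [0.9, 0.8, 0.1]
queen = [0.85, 0.82, 0.15]
banana = [0.1, 0.05, 0.95]

print(cosine_similarity(king, queen))   # high: similar meaning
print(cosine_similarity(king, banana))  # low: unrelated meaning
```

This comparison is exactly what a vector database does at scale during the retrieval step of RAG.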
When invoking a foundation model, you control its behavior through inference parameters:
| Parameter | What It Controls | Low Value | High Value |
|---|---|---|---|
| Temperature | Randomness of output | More deterministic, factual | More creative, diverse |
| Top-p (nucleus sampling) | Cumulative probability threshold for token selection | Fewer token choices, more focused | More token choices, more varied |
| Top-k | Number of top tokens considered at each step | Only top few tokens considered | Many tokens considered |
| Max tokens | Maximum length of the generated response | Short responses | Long, detailed responses |
| Stop sequences | Strings that cause the model to stop generating (prevents continuing past a logical endpoint) | N/A | N/A |
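The table above can be made concrete with a small simulation. The sketch below shows how temperature reshapes a probability distribution over candidate tokens and how top-k and top-p shrink the candidate pool; the logit values are invented for the demo, and real samplers combine these steps inside the model runtime:

```python
import math

def apply_temperature(logits, temperature):
    """Divide logits by temperature, then softmax.
    Low temperature sharpens the distribution (more deterministic);
    high temperature flattens it (more diverse)."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep only the k most probable tokens, as (index, prob) pairs."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for idx, prob in ranked:
        kept.append((idx, prob))
        cumulative += prob
        if cumulative >= p:
            break
    return kept

logits = [2.0, 1.0, 0.5, 0.1]            # raw scores for 4 candidate tokens
sharp = apply_temperature(logits, 0.2)    # near-deterministic
flat = apply_temperature(logits, 2.0)     # closer to uniform
print(top_k_filter(sharp, 2))             # only the 2 best tokens survive
print(top_p_filter(sharp, 0.9))           # at low T, one token already covers 90%
```

Notice that with a sharp (low-temperature) distribution, top-p keeps only one token, which is why low temperature plus low top-p yields the most deterministic output.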
Hallucination occurs when a model generates plausible-sounding but factually incorrect information. The exam tests your knowledge of hallucination mitigation strategies: lowering temperature, using RAG to ground responses in real data, implementing Bedrock Guardrails with grounding checks, and including system prompts that instruct the model to say "I don't know" when uncertain.
Amazon Bedrock Model Families
Amazon Bedrock provides access to multiple foundation model families through a single API. The exam expects you to know which models are available and when to choose each one.
| Provider | Model Family | Best For | Key Differentiator |
|---|---|---|---|
| Anthropic | Claude | Complex reasoning, analysis, coding, long context | 200K+ token context window, strong instruction following, Constitutional AI safety |
| Meta | Llama | General-purpose text, code generation, multilingual | Open-weight model, can be fine-tuned and customized |
| Amazon | Titan | Text, embeddings, image generation | AWS-native, built-in watermarking for Titan Image, Titan Embeddings for RAG |
| Mistral AI | Mistral / Mixtral | Cost-efficient text generation, coding | High performance-to-cost ratio, Mixture of Experts (MoE) architecture in Mixtral |
| Cohere | Command / Embed | Enterprise search, RAG, multilingual embeddings | Optimized for retrieval and enterprise document search use cases |
| Stability AI | Stable Diffusion | Image generation from text prompts | High-quality text-to-image generation, style control |
RAG vs Fine-Tuning vs Prompt Engineering
This is one of the most heavily tested concepts on the AIF-C01. You must understand when to use each approach for customizing foundation model behavior.
| Approach | When to Use | Cost | Complexity | AWS Service |
|---|---|---|---|---|
| Prompt Engineering | Quick customization with no training; when the model already has the required knowledge | Lowest (pay per inference) | Low | Bedrock (any model) |
| RAG | When the model needs access to private, current, or domain-specific data without retraining | Medium (embeddings + vector store + inference) | Medium | Bedrock Knowledge Bases |
| Fine-Tuning | When you need to change the model's behavior, style, or teach it domain-specific patterns | Highest (training compute + inference) | High | Bedrock Custom Models, SageMaker JumpStart |
The decision framework the exam expects: Start with prompt engineering. If the model lacks the required knowledge (e.g., private company data), add RAG. If you need to fundamentally change the model's behavior or output style (e.g., generate responses in a specific medical terminology format), use fine-tuning. The exam loves to test scenarios where candidates choose fine-tuning when RAG is the correct answer — remember that RAG is preferred when you need up-to-date, factual data because it retrieves from a live data source rather than baking knowledge into model weights.
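The decision framework above can be written out as a few lines of code, which is a handy way to drill it before the exam. This is just the article's reasoning expressed as a toy function, not an AWS tool or API:

```python
def choose_customization(lacks_required_knowledge,
                         needs_new_behavior_or_style,
                         data_updates_frequently):
    """Toy encoding of the exam decision framework: prompt engineering
    by default, RAG for missing/current knowledge, fine-tuning for
    behavior or style changes."""
    if needs_new_behavior_or_style:
        return "fine-tuning"      # change style/behavior: retrain weights
    if lacks_required_knowledge or data_updates_frequently:
        return "RAG"              # ground in private, current data
    return "prompt engineering"   # the model already knows enough

# Legal-firm scenario: private policy docs, updated monthly
print(choose_customization(True, False, True))   # "RAG"
```

The ordering of the checks mirrors the exam trap: frequently changing factual data routes to RAG, never to fine-tuning.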
Amazon Bedrock Knowledge Bases
Bedrock Knowledge Bases is AWS's managed RAG implementation. It handles the entire RAG pipeline: ingesting documents, chunking them, creating embeddings, storing them in a vector database, and retrieving relevant chunks at query time.
The RAG pipeline in Bedrock Knowledge Bases follows these steps:
- Ingest: Upload documents (PDF, TXT, HTML, CSV, DOCX) to an S3 bucket
- Chunk: Documents are split into smaller segments using configurable chunking strategies (fixed-size, semantic, or hierarchical)
- Embed: Each chunk is converted into a vector embedding using an embedding model (Titan Embeddings or Cohere Embed)
- Store: Embeddings are stored in a vector database (Amazon OpenSearch Serverless, Pinecone, or Amazon Aurora with pgvector)
- Retrieve: At query time, the user's question is embedded and used to find the most semantically similar chunks
- Generate: Retrieved chunks are added to the prompt context and sent to a foundation model for response generation
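The six steps above can be sketched end-to-end in a few dozen lines. This toy version uses a bag-of-words "embedding" and an in-memory store purely to make the flow visible; a real Knowledge Base uses a proper embedding model (Titan Embeddings or Cohere Embed) and a managed vector database, and the documents here are invented:

```python
import math
from collections import Counter

def embed(text, vocab):
    """Toy bag-of-words 'embedding' over a shared vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

# Steps 1-4: ingest, chunk, embed, store (here: an in-memory list)
chunks = [
    "Refunds are processed within 5 business days",
    "Our headquarters is located in Seattle",
    "Support is available 24 hours a day",
]
vocab = sorted({w for c in chunks for w in c.lower().split()})
store = [(c, embed(c, vocab)) for c in chunks]

# Steps 5-6: embed the query, retrieve the closest chunk,
# then build the grounded prompt for the foundation model
query = "how long do refunds take"
q_vec = embed(query, vocab)
best_chunk, _ = max(store, key=lambda pair: cosine(q_vec, pair[1]))
prompt = f"Answer using only this context:\n{best_chunk}\n\nQuestion: {query}"
print(best_chunk)
```

The point to carry into the exam: the model itself is never retrained; fresh knowledge arrives through the prompt at query time.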
```bash
# AWS CLI: Create a Bedrock Knowledge Base data source
aws bedrock-agent create-data-source \
  --knowledge-base-id "KB12345678" \
  --name "company-docs" \
  --data-source-configuration '{
    "type": "S3",
    "s3Configuration": {
      "bucketArn": "arn:aws:s3:::my-company-docs"
    }
  }' \
  --vector-ingestion-configuration '{
    "chunkingConfiguration": {
      "chunkingStrategy": "FIXED_SIZE",
      "fixedSizeChunkingConfiguration": {
        "maxTokens": 300,
        "overlapPercentage": 20
      }
    }
  }'
```
Amazon Bedrock Agents
Bedrock Agents enable agentic workflows — multi-step tasks where the foundation model reasons about what actions to take, invokes APIs or Lambda functions, and iterates until the task is complete. Think of an agent as an LLM with the ability to use tools.
Key concepts for the exam:
- Action Groups: Define the actions an agent can take, backed by Lambda functions or API schemas (OpenAPI)
- Knowledge Base integration: Agents can query Bedrock Knowledge Bases to retrieve information as part of their reasoning
- Orchestration: The agent uses chain-of-thought reasoning to decide which actions to take and in what order
- Session management: Agents maintain conversation context across multiple turns
A typical exam scenario: "A company wants to build a customer service chatbot that can look up order status in a database and answer product questions from documentation." The answer involves a Bedrock Agent with an action group (Lambda function to query the order database) and a Knowledge Base (product documentation in S3).
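That exam scenario can be simulated with a toy orchestration loop. A real Bedrock Agent uses LLM chain-of-thought reasoning to pick actions; this sketch substitutes naive keyword routing, and the order database, document store, and ID extraction are all invented for the demo:

```python
def order_status_action(order_id):
    """Stands in for a Lambda-backed action group querying an order database."""
    fake_db = {"A100": "shipped", "A200": "processing"}
    return fake_db.get(order_id, "not found")

def knowledge_base_lookup(question):
    """Stands in for a Bedrock Knowledge Base retrieval call."""
    docs = {"warranty": "All products carry a 2-year warranty."}
    for topic, answer in docs.items():
        if topic in question.lower():
            return answer
    return "No matching documentation found."

def agent(user_message):
    """Toy orchestration: route to the action group or the knowledge base.
    A real agent reasons about tool choice instead of keyword matching."""
    if "order" in user_message.lower():
        order_id = user_message.split()[-1]  # naive ID extraction for the demo
        return f"Order {order_id} is {order_status_action(order_id)}"
    return knowledge_base_lookup(user_message)

print(agent("What is the status of order A100"))
print(agent("What is your warranty policy?"))
```

The structure is what matters for the exam: one agent, two tools (an action group and a knowledge base), with the agent deciding which to invoke per turn.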
Amazon Bedrock Guardrails
Bedrock Guardrails provide configurable safeguards that apply to both input prompts and model responses. The exam tests all four guardrail types:
| Guardrail Type | What It Does | Example Use Case |
|---|---|---|
| Content Filtering | Blocks harmful content across categories (hate, violence, sexual, misconduct) with configurable thresholds | Prevent a chatbot from generating violent or inappropriate content |
| Topic Denial | Blocks conversations about specific topics you define | Prevent a financial chatbot from giving investment advice |
| PII Redaction | Detects and masks or blocks personally identifiable information (names, SSNs, emails, addresses) | Prevent a healthcare chatbot from storing or returning patient data |
| Grounding Checks | Validates that model responses are grounded in provided source material (reduces hallucination) | Ensure RAG responses only use information from the retrieved documents |
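To see what PII redaction does mechanically, here is a hypothetical regex-based redactor. Bedrock Guardrails performs this detection as a managed service with many more entity types and ML-based recognition; the two patterns below are a hand-rolled illustration only:

```python
import re

# Hypothetical patterns for two common PII types (illustrative, not exhaustive)
PII_PATTERNS = {
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact_pii(text):
    """Replace detected PII with a type tag, mirroring the 'mask' behavior
    a guardrail applies to both input prompts and model responses."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(redact_pii("Contact john@example.com, SSN 123-45-6789"))
```

A guardrail configured to "block" rather than "mask" would reject the request outright instead of rewriting it, which is a distinction the exam occasionally probes.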
Evaluation Metrics for Foundation Models
The AIF-C01 expects you to know how to evaluate foundation model outputs. Amazon Bedrock provides both automated and human evaluation capabilities. Understanding the key metrics is essential for exam success.
| Metric | What It Measures | Best For | How It Works |
|---|---|---|---|
| ROUGE | Overlap between generated and reference text | Summarization | Recall-oriented; counts matching n-grams between output and reference |
| BLEU | Precision of generated text against reference | Translation | Precision-oriented; measures how many generated n-grams appear in the reference |
| BERTScore | Semantic similarity between generated and reference text | Any text generation | Uses BERT embeddings to compare meaning, not just word overlap |
| Human Evaluation | Quality, helpfulness, and safety judged by humans | Subjective quality, safety, tone | Human reviewers rate model outputs on predefined criteria |
Key exam distinction: ROUGE is recall-oriented (did the summary capture the key ideas from the reference?), while BLEU is precision-oriented (are the generated words accurate compared to the reference?). BERTScore goes beyond word matching to compare semantic meaning. When the exam mentions "evaluating the quality of summarization," the answer is ROUGE. For "evaluating translation quality," it is BLEU.
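The recall-vs-precision distinction is easiest to internalize by computing both on a toy pair of strings. The sketch below implements ROUGE-1 recall and a BLEU-style unigram precision; real BLEU additionally uses higher-order n-grams, clipped counts, and a brevity penalty, so this is a simplification for intuition:

```python
def rouge1_recall(reference, candidate):
    """ROUGE-1 recall: fraction of reference unigrams found in the candidate."""
    ref = reference.lower().split()
    cand = set(candidate.lower().split())
    matches = sum(1 for w in ref if w in cand)
    return matches / len(ref)

def unigram_precision(reference, candidate):
    """BLEU-style unigram precision: fraction of candidate words found in
    the reference (real BLEU adds n-grams, clipping, brevity penalty)."""
    ref = set(reference.lower().split())
    cand = candidate.lower().split()
    matches = sum(1 for w in cand if w in ref)
    return matches / len(cand)

reference = "the cat sat on the mat"
candidate = "the cat sat"

# The short candidate is perfectly precise (every word appears in the
# reference) but has low recall (it misses "on" and "mat").
print(rouge1_recall(reference, candidate))
print(unigram_precision(reference, candidate))
```

The example shows why a one-sentence summary can score perfectly on precision yet poorly on recall, which is exactly the trap the exam sets with ROUGE-vs-BLEU questions.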
SageMaker JumpStart for Fine-Tuning
While Bedrock is the primary service for using foundation models, SageMaker JumpStart is the go-to for fine-tuning them. JumpStart provides pre-trained foundation models that you can fine-tune on your own data using SageMaker's managed training infrastructure.
The exam tests when to use Bedrock Custom Models (fine-tuning through Bedrock) vs SageMaker JumpStart:
- Bedrock Custom Models: Simpler, managed fine-tuning through the Bedrock console for supported models. No infrastructure management needed. Best for straightforward customization.
- SageMaker JumpStart: More control over the training process, hyperparameters, and infrastructure. Supports a wider range of models and fine-tuning techniques (full fine-tuning, LoRA, QLoRA). Best for ML teams that need granular control.
```bash
# AWS CLI: Invoke a Bedrock model (Claude on Bedrock example)
# Note: --cli-binary-format raw-in-base64-out is required with AWS CLI v2,
# which otherwise expects the --body blob to be base64-encoded.
aws bedrock-runtime invoke-model \
  --model-id "anthropic.claude-3-sonnet-20240229-v1:0" \
  --cli-binary-format raw-in-base64-out \
  --body '{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "temperature": 0.3,
    "messages": [
      {
        "role": "user",
        "content": "Explain the difference between RAG and fine-tuning in 3 sentences."
      }
    ]
  }' \
  --content-type "application/json" \
  --accept "application/json" \
  output.json
```
Practical Examples
To tie these concepts together, here are three realistic exam scenarios and how to think through them:
Scenario 1: A legal firm wants their chatbot to answer questions about their internal policy documents. The documents are updated monthly. Which approach should they use?
Answer: RAG with Bedrock Knowledge Bases. The documents change frequently (ruling out fine-tuning, which would require retraining), and the model needs access to private data (ruling out prompt engineering alone). Set up an S3 bucket with the policy documents, create a Knowledge Base, and sync it monthly.
Scenario 2: A company wants to build a customer service bot that checks order status AND answers product questions. What Bedrock features should they combine?
Answer: Bedrock Agent with an action group (Lambda function to query the order database) and a Knowledge Base (product documentation). The agent orchestrates between the two based on the user's question.
Scenario 3: A healthcare company wants to use a foundation model but must ensure no patient PII is included in prompts or responses. What should they implement?
Answer: Bedrock Guardrails with PII redaction enabled. Configure the guardrail to detect and mask healthcare-specific PII types (names, medical record numbers) in both input and output.
Domains 2 and 3 are where the AIF-C01 exam is won or lost. Master the transformer fundamentals, know the Bedrock model families and their differentiators, understand the RAG-vs-fine-tuning decision framework, and be comfortable with Knowledge Bases, Agents, Guardrails, and evaluation metrics. With this foundation, you will be well-prepared to tackle more than half the exam with confidence.