AI Guild

📊

Datadog Dashboard

One prompt to generate comprehensive metrics

ONE-SHOT

Can you please add as many metrics as possible to get a good picture
of what is happening with credits:

- How many times modals are opened
- Which modals are being opened
- When users click on various elements
- User interaction patterns
- etc.

Make it VERY VERY DETAILED.

And then give me JSON to create a dashboard in Datadog.

What Claude does:

Analyzes the codebase for all credit-related flows
Identifies all modal components and user interactions
Adds detailed tracking events throughout
Generates ready-to-import Datadog JSON

Result:

50+ new tracking events added
Complete visibility into credit usage
Dashboard ready in minutes, not hours
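
To make the last step of the prompt concrete, here is a minimal sketch of turning a list of tracked events into dashboard JSON. The metric names (`credits.modal.opened`, `credits.element.clicked`) and the `buildDashboard` helper are hypothetical and not taken from the actual codebase. The widget shape follows the general layout of Datadog dashboard JSON (ordered layout, timeseries widgets with a query string), and should be checked against the current Datadog schema before importing.

```typescript
// Hypothetical event metrics; the real names depend on your codebase.
interface TrackedMetric {
  name: string;     // e.g. "credits.modal.opened"
  title: string;    // widget title shown on the dashboard
  groupBy?: string; // optional tag to split the series by
}

interface TimeseriesWidget {
  definition: {
    type: "timeseries";
    title: string;
    requests: Array<{ q: string; display_type: "bars" | "line" }>;
  };
}

interface Dashboard {
  title: string;
  layout_type: "ordered";
  widgets: TimeseriesWidget[];
}

// Build one count-over-time widget per tracked metric.
function buildDashboard(title: string, metrics: TrackedMetric[]): Dashboard {
  return {
    title,
    layout_type: "ordered",
    widgets: metrics.map((m) => {
      const by = m.groupBy ? ` by {${m.groupBy}}` : "";
      return {
        definition: {
          type: "timeseries" as const,
          title: m.title,
          requests: [
            { q: `sum:${m.name}{*}${by}.as_count()`, display_type: "bars" as const },
          ],
        },
      };
    }),
  };
}

const creditsDashboard = buildDashboard("Credits: user interactions", [
  { name: "credits.modal.opened", title: "Modal opens by modal", groupBy: "modal" },
  { name: "credits.element.clicked", title: "Element clicks", groupBy: "element" },
]);

// Serialized form, suitable for pasting into a dashboard JSON import.
const creditsDashboardJson = JSON.stringify(creditsDashboard, null, 2);
```

Grouping by a tag such as `modal` answers "which modals are being opened" with a single metric instead of one metric per modal.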

🐛

E2E Bug Fix: Post Likes

Browser verification + Temporal workflow testing

🌐

Browser: Verify the Bug

1. Run with browser access

claude --chrome

2. Navigate to post page

Claude sees the like button isn't working

3. Verify the fix visually

Takes screenshot to confirm post is now liked

⚡

Temporal CLI: Test the Fix

Claude runs the workflow directly to test the backend fix:

# Claude executes the workflow to test
temporal workflow execute \
  --type PostLikeWorkflow \
  --task-queue main \
  --input '{"postId": "123"}'

Sees the workflow succeed, confirming the backend fix works.

Key insight: Claude verifies fixes end-to-end: visually in the browser AND by running the actual Temporal workflow. No manual testing needed.

💰

AWS Cost Analysis

Full audit using AWS CLI

$300-500/mo SAVINGS

My prompts to Claude Code:

Use aws cli to see where we are spending money and how we could save money.

Provide a pdf report with how we could save money.

Also provide an .md file with all the steps that you did.

AWS CLI Commands Claude Ran:

aws ce get-cost-and-usage --group-by SERVICE
aws ce get-savings-plans-purchase-recommendation
aws ce get-reservation-purchase-recommendation
aws ce get-cost-and-usage --filter SERVICE --group-by USAGE_TYPE

Key Findings:

ECS/Fargate: $587 (26.6%)
CloudWatch: $243 (11%)
VPC (IPv4 + VPN): $219 (10%)
Bedrock/Claude: $191 (8.7%)
Oct 2025 Anomaly: $38,956!
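
The percentages above are each service's share of the monthly bill. The total itself is not stated on the slide, so the sketch below back-calculates it from the ECS/Fargate line ($587 at 26.6%), which is an assumption, and recomputes the other shares against that implied total.

```typescript
// Figures copied from the findings above (USD per month).
const reference = { service: "ECS/Fargate", cost: 587, statedPct: 26.6 };
const findings = [
  reference,
  { service: "CloudWatch", cost: 243, statedPct: 11 },
  { service: "VPC (IPv4 + VPN)", cost: 219, statedPct: 10 },
  { service: "Bedrock/Claude", cost: 191, statedPct: 8.7 },
];

// Assumption: the monthly total is implied by the reference line (cost / share).
const impliedTotal = reference.cost / (reference.statedPct / 100);

// Recompute each share against the implied total, rounded to one decimal place.
const shares = findings.map((f) => ({
  service: f.service,
  computedPct: Math.round((f.cost / impliedTotal) * 1000) / 10,
  statedPct: f.statedPct,
}));
```

Under that assumption the implied total is roughly $2,200 per month, and the recomputed shares agree with the stated ones to within rounding.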

Savings Plans: $116/mo
ElastiCache RI: $15/mo
IPv4 Cleanup: $30-50/mo
CloudWatch Opt: $50-100/mo
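
Adding up the four line items above shows how much of the headline figure they cover on their own. This is a small sketch of that arithmetic, treating single figures as ranges with equal bounds.

```typescript
// Monthly savings estimates from the slide, as [low, high] in USD.
const savings: Record<string, [number, number]> = {
  "Savings Plans": [116, 116],
  "ElastiCache RI": [15, 15],
  "IPv4 Cleanup": [30, 50],
  "CloudWatch Opt": [50, 100],
};

// Sum the lower and upper bounds separately.
const totalSavings = Object.values(savings).reduce<[number, number]>(
  ([lo, hi], [l, h]) => [lo + l, hi + h],
  [0, 0],
);
// totalSavings is [211, 281]
```

These four items sum to $211-281/mo. The $300-500/mo headline presumably also counts recommendations from the full report that are not listed on this slide.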

Output: PDF report + markdown docs with 8 analysis steps. Claude used pandoc + weasyprint for PDF generation.

📝

Requirements-First with Claude Code

A detailed design doc became a full project

HUMAN-IN-THE-LOOP

Key insight: The more time you spend describing requirements, the better Claude Code performs. A detailed spec is not overhead - it's your best investment.

The Design Document: 12 Comprehensive Sections

1. Overview

Core concepts, key decisions

2. Database Schema

6 Prisma models, enums, indexes

3. API Specification

20+ tRPC endpoints with I/O

4. Business Rules

State machines, constraints

5. Error Codes

Standardized error handling

6. Security

Auth, authorization, audit

7. Future Scope

Explicitly out-of-scope items

8. Generation System

Temporal workflows, retries

9. Frontend Design

Pages, components, state

10. File Structure

Directory layout, naming

11. Type Definitions

Full TypeScript types

12. Testing Strategy

Unit, integration, E2E

2,200+ lines of specification

6 database models

20+ API endpoints

5 frontend pages

🔎

What "Detailed Requirements" Look Like

Examples from the design doc

Key Design Decisions Table

Decision                             Rationale
One experiment = One operator        Simplifies assignment, avoids locking
Questions are immutable once used    Preserves data integrity for analytics
Prompt snapshotted at creation       Ensures reproducibility
Answers are immutable                Audit trail integrity
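
Two of these decisions translate directly into small guards in code. The sketch below is illustrative only: the `Question` and `Experiment` shapes, the `usedInExperiment` flag, and both function names are hypothetical and not taken from the design doc.

```typescript
// Hypothetical shapes for illustration; the real models live in the Prisma schema.
interface Question {
  id: string;
  text: string;
  usedInExperiment: boolean;
}

interface Experiment {
  id: string;
  promptSnapshot: string;
}

// "Questions are immutable once used": reject edits after first use.
function editQuestion(q: Question, newText: string): Question {
  if (q.usedInExperiment) {
    throw new Error(`Question ${q.id} is already used and cannot be edited`);
  }
  return { ...q, text: newText };
}

// "Prompt snapshotted at creation": copy the prompt text into the experiment
// so later edits to the template do not change what the experiment ran with.
function createExperiment(id: string, template: { prompt: string }): Experiment {
  return { id, promptSnapshot: template.prompt };
}
```

Writing the rationale next to each decision, as the table does, tells the implementer which behavior to enforce and why it must not be relaxed later.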

API Endpoint Specification

// Input:
{
  aiVariableId: string;
  models: string[];  // ["gpt-4o", "claude-sonnet"]
  questionIds: Array<{
    questionId: string;
    required?: boolean;
    order?: number;
  }>;
  hypothesis?: string;
}

// Validations:
// - AI variable must exist and belong to user's client
// - At least 1 model must be specified
// - All questions must exist and be active
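
With validations spelled out this explicitly, the implementation is nearly mechanical. The sketch below shows one way the three rules could be checked. The `ValidationContext` type and its in-memory maps are assumptions standing in for real database lookups, and `validateCreateExperiment` is a hypothetical name.

```typescript
interface CreateExperimentInput {
  aiVariableId: string;
  models: string[];
  questionIds: Array<{ questionId: string; required?: boolean; order?: number }>;
  hypothesis?: string;
}

// Hypothetical lookup context; in a real project these would be database queries.
interface ValidationContext {
  userClientId: string;
  aiVariables: Map<string, { clientId: string }>;
  questions: Map<string, { active: boolean }>;
}

// Returns a list of validation errors; an empty list means the input is valid.
function validateCreateExperiment(
  input: CreateExperimentInput,
  ctx: ValidationContext,
): string[] {
  const errors: string[] = [];

  // Rule 1: AI variable must exist and belong to the user's client.
  const variable = ctx.aiVariables.get(input.aiVariableId);
  if (!variable || variable.clientId !== ctx.userClientId) {
    errors.push("AI variable must exist and belong to user's client");
  }

  // Rule 2: at least one model.
  if (input.models.length < 1) {
    errors.push("At least 1 model must be specified");
  }

  // Rule 3: every referenced question must exist and be active.
  for (const { questionId } of input.questionIds) {
    const question = ctx.questions.get(questionId);
    if (!question || !question.active) {
      errors.push(`Question ${questionId} must exist and be active`);
    }
  }

  return errors;
}
```

Collecting all errors instead of failing on the first one is a design choice made here for illustration; the spec excerpt does not say which behavior the real endpoint uses.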

Prisma Schema with Indexes

model HumanLoopItem {
  id               String   @id @default(cuid())
  experimentId     String
  model            String
  generatedContent String
  filledPrompt     String
  entitySnapshot   Json
  status           ItemStatus @default(pending)
  displayOrder     Int      // Randomized for blind eval

  @@index([experimentId, status])
  @@index([experimentId, displayOrder])
}
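
The `displayOrder` comment says the order is randomized for blind evaluation. One common way to do that is a Fisher-Yates shuffle; the sketch below is an assumption about how it could be done, not the project's actual code, and `assignDisplayOrder` is a hypothetical helper.

```typescript
// Fisher-Yates shuffle; `random` is injectable so tests can be deterministic.
function shuffle<T>(items: readonly T[], random: () => number = Math.random): T[] {
  const out = [...items];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = out[i] as T;
    out[i] = out[j] as T;
    out[j] = tmp;
  }
  return out;
}

// Assign displayOrder 0..n-1 in shuffled order so evaluators cannot infer
// which model produced an item from its position.
function assignDisplayOrder(
  itemIds: string[],
  random?: () => number,
): Map<string, number> {
  const order = new Map<string, number>();
  shuffle(itemIds, random).forEach((id, index) => order.set(id, index));
  return order;
}
```

Storing the shuffled position in a column, with the `[experimentId, displayOrder]` index, lets every evaluator page through items in the same stable random order.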

Business Rules: State Machine

Experiment Status Transitions:

draft → running → completed → archived
Each transition has explicit validations defined
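
A state machine like this maps naturally onto a transition table. The sketch below is a minimal version, assuming the flow is strictly linear as drawn; the function names are hypothetical, and the per-transition validations from the design doc are not reproduced here.

```typescript
type ExperimentStatus = "draft" | "running" | "completed" | "archived";

// Allowed next states, following the linear flow shown above.
const transitions: Record<ExperimentStatus, ExperimentStatus[]> = {
  draft: ["running"],
  running: ["completed"],
  completed: ["archived"],
  archived: [],
};

function canTransition(from: ExperimentStatus, to: ExperimentStatus): boolean {
  return transitions[from].includes(to);
}

// Throws on a disallowed transition; per-transition validations would hook in here.
function transition(from: ExperimentStatus, to: ExperimentStatus): ExperimentStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid transition: ${from} -> ${to}`);
  }
  return to;
}
```

Keeping the allowed transitions in one table means adding a state later is a one-line change, and the type checker flags any status missing from the table.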

Why this matters: Claude Code can implement exactly what you need when you specify exactly what you want. Ambiguity leads to rework.

⚙️

The Workflow: From Spec to Working Code

My role vs Claude Code's role

1. Write Design Doc

Me: Drafted the full spec with Claude's help in regular chat. Iterated on edge cases, naming, and structure.

2. Implementation

Claude Code: Read the doc, created Prisma migrations, tRPC routers, Temporal workflows, React pages.

3. Review & Iterate

Together: I reviewed PRs, pointed out issues, Claude Code fixed them. Minimal back-and-forth.

What Claude Code Produced From the Spec:

🗂

Prisma Schema

6 models + migrations

🔌

tRPC Router

20+ procedures

⚡

Temporal Workflows

Generation + retries

🖥

React Pages

5 pages + components

📋

Types

Full TypeScript coverage

Takeaway: I spent ~2 hours on the design doc. Claude Code implemented the full feature in ~30 minutes of prompting. The ratio of thinking vs coding has completely flipped.

🤖

This Presentation

Done end-to-end by Claude

Explored templates

Read existing presentations

📁

Created directory

Set up folder and assets

✍️

Wrote the HTML

Generated all slides

☁️

AWS + CI/CD

S3, CloudFront, GitHub Actions

My role: Prompting + reviewing each step. Claude did all the coding, file operations, AWS infra setup, and CI/CD pipeline.

Table of Contents

Datadog Dashboard Generation

E2E Bug Fix with Browser

AWS Cost Analysis

Requirements-First with Claude Code

This Presentation

Jaume Puig

Questions?