dlt REST API Source Development Prompts

This directory contains prompts for developing dlt REST API pipelines using a streamlined, test-driven approach.

Why Step-by-Step?

The prompts follow a sequential structure with the following goals:

Learning: Each step explains how dlt works, progressing from basic concepts to advanced patterns.
Clarity: The process is broken down into distinct phases: research & planning, implementation, and testing.
Quality: Test-driven development ensures each component works correctly before moving forward.

Development Philosophy

Test-Driven Development (TDD)

All prompts follow TDD methodology:

Tests are written first
Implementation follows test requirements
Tests are kept minimal but focused on critical functionality
Each component is validated before moving forward

Spec-Driven Development

Requirements are documented before implementation:

API research informs the specification
Specifications include test structure and acceptance criteria
Implementation follows the specification exactly

Progressive Disclosure

Information is presented when needed:

Phase 0 provides dlt context upfront
Phase 1 combines research and planning
Phase 2 focuses on implementation details
Phase 3 handles end-to-end testing

dlt Best Practices

All prompts follow dlt best practices:

Use built-in patterns: rest_api_source() and RESTClient
Leverage dlt's retry, rate limiting, and error handling
Follow dlt's resource and source patterns
Use dlt's incremental loading capabilities
Never build custom API clients
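
As a rough illustration of the built-in patterns listed above, the sketch below declares one resource for rest_api_source() instead of writing a client by hand. The GitHub repository path, the resource name, the pipeline and dataset names, and the duckdb destination are illustrative assumptions. The import path assumes a recent dlt release in which the REST API source ships with the core package.

```python
from typing import Any, Dict


def build_github_config(owner: str, repo: str) -> Dict[str, Any]:
    """Build a declarative config dict for dlt's rest_api_source().

    The endpoint and query parameters are illustrative assumptions,
    not values taken from a generated spec.
    """
    return {
        "client": {
            "base_url": f"https://api.github.com/repos/{owner}/{repo}/",
            # GitHub advertises the next page in the Link response header.
            "paginator": "header_link",
        },
        "resources": [
            {
                "name": "issues",
                "endpoint": {
                    "path": "issues",
                    "params": {"state": "open", "per_page": 100},
                },
            },
        ],
    }


def run_sample_load(owner: str, repo: str, limit: int = 10):
    """Load a small sample. Requires dlt with the duckdb extra installed."""
    # Imported lazily so the config builder can be unit-tested without dlt.
    import dlt
    from dlt.sources.rest_api import rest_api_source

    source = rest_api_source(build_github_config(owner, repo))
    pipeline = dlt.pipeline(
        pipeline_name="github_pipeline",  # assumed name
        destination="duckdb",             # assumed destination
        dataset_name="github_data",       # assumed dataset
    )
    # add_limit() caps the number of yields (typically pages) per resource.
    return pipeline.run(source.add_limit(limit))
```

Keeping the config in a plain function means tests can assert on it without network access; only run_sample_load() touches dlt and the API.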

Directory Structure

prompts/
├── 00_prime/
│   └── 000_dlt_context.md          # dlt fundamentals and patterns
├── 01_plan/
│   └── 01_research_and_plan.md     # Research API + create spec
├── 02_implement/
│   └── 02_implement.md             # TDD implementation
└── 03_test/
    └── 03_test.md                  # Pipeline testing guide

Execution Flow

Standard Workflow

┌─────────────────────────────────────────────────────────────────┐
│ Phase 0: Prime (Optional)                                       │
│ Read dlt context before starting                                │
└─────────────────────────────────────────────────────────────────┘
    │
    └─> 00_prime/000_dlt_context.md
        Provides: dlt patterns, auth classes, paginators, decorators
    │
    v
┌─────────────────────────────────────────────────────────────────┐
│ Phase 1: Research & Plan                                        │
│ Research API + Generate specification                           │
└─────────────────────────────────────────────────────────────────┘
    │
    └─> 01_plan/01_research_and_plan.md
        Arguments: api_name (e.g., "arxiv", "github")
        Output: specs/YYYY-MM-DD_011_spec_dlt_rest_client_{api_name}.md

        This spec includes:
        - API research findings
        - Authentication configuration
        - Endpoint documentation
        - Pagination strategies
        - Incremental loading design
        - Complete test structure
        - TDD task breakdown
    │
    v
┌─────────────────────────────────────────────────────────────────┐
│ Phase 2: Implement                                              │
│ Follow spec using strict TDD                                    │
└─────────────────────────────────────────────────────────────────┘
    │
    └─> 02_implement/02_implement.md
        Arguments: spec_file_path
        Process: Write test → Run (fail) → Implement → Run (pass)
        Follows task list from spec sequentially
    │
    v
┌─────────────────────────────────────────────────────────────────┐
│ Phase 3: Test                                                   │
│ Run pipeline end-to-end                                         │
└─────────────────────────────────────────────────────────────────┘
    │
    └─> 03_test/03_test.md
        - Test with limited data first
        - Verify schema and data quality
        - Test incremental loading
        - Validate full pipeline execution

File Output Structure

project/
├── specs/
│   └── YYYY-MM-DD_011_spec_dlt_rest_client_{api_name}.md
├── sources/
│   └── {api_name}_source.py
├── tests/
│   ├── conftest.py
│   ├── test_auth.py (if auth required)
│   ├── test_resources.py
│   ├── test_incremental.py (if applicable)
│   ├── test_source.py
│   └── test_pipeline.py
├── .dlt/
│   ├── config.toml
│   └── secrets.toml.example
└── {api_name}_pipeline.py

Usage Example

Building a Pipeline for the GitHub API

Step 0 (Optional): Review dlt context

# Read: prompts/00_prime/000_dlt_context.md
# Learn about: RESTClient, auth classes, paginators, decorators

Step 1: Research & Plan

# Use prompt: prompts/01_plan/01_research_and_plan.md
# Provide: api_name="github"
# The prompt will:
#   1. Check for existing dltHub verified source
#   2. Research GitHub API documentation
#   3. Ask clarifying questions
#   4. Generate comprehensive specification
# Output: specs/2025-01-15_011_spec_dlt_rest_client_github.md

Step 2: Implement

# Use prompt: prompts/02_implement/02_implement.md
# Provide: spec_file_path="specs/2025-01-15_011_spec_dlt_rest_client_github.md"
# The prompt will:
#   1. Extract task list from spec
#   2. Follow TDD: test → fail → implement → pass
#   3. Report progress after each task
#   4. Ask user when spec is unclear
# Output: sources/, tests/, .dlt/, pipeline script

Step 3: Test

# Use guide: prompts/03_test/03_test.md
# Process:
#   1. Configure secrets in .dlt/secrets.toml
#   2. Test with limited data (add_limit(10))
#   3. Verify schema and data
#   4. Test incremental loading
#   5. Run full pipeline

Complete Workflow Commands

# Phase 1: Research & Plan
# → Provide "github" as api_name to prompt 01_research_and_plan.md
# → Review and approve generated specification

# Phase 2: Implement with TDD
# → Provide spec path to prompt 02_implement.md
# → Watch as tests are written and implementation follows

# Phase 3: Test Pipeline
# → Follow guide in 03_test.md
pytest tests/ -v                    # Run all tests
python github_pipeline.py           # Execute pipeline
dlt pipeline github_pipeline show   # Inspect results

Phase Details

Phase 0: Prime (00_prime/000_dlt_context.md)

Purpose: Provide foundational dlt knowledge before starting development.

Contains:

What is dlt and when to use it
Decision matrix: rest_api_source() vs RESTClient
Authentication patterns (API key, Bearer, Basic, OAuth2)
Pagination classes and examples
Resource and source decorators
Secrets management
Incremental loading patterns
Pipeline creation and testing

When to use: Read before Phase 1 if unfamiliar with dlt, or reference during implementation.

Phase 1: Research & Plan (01_plan/01_research_and_plan.md)

Purpose: Research the API and generate a complete specification with a TDD structure.

Input:

api_name: Name of the API (e.g., "arxiv", "github")
Optional: --spec-output, --destination

Process:

Check for existing dltHub verified source
Research API documentation for:
- Base URL, authentication, endpoints
- Pagination, rate limits, incremental fields
- Data structure and special considerations
Ask clarifying questions
Design test-first approach
Map API patterns to dlt components
Generate specification document
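
The "map API patterns to dlt components" step can be pictured as a lookup from what the API documentation says about pagination to a paginator class. The sketch below is one possible way to record that mapping. The style labels on the left are this example's own vocabulary (an assumption), while the class names on the right are paginators from dlt.sources.helpers.rest_client.paginators.

```python
# Style labels (keys) are hypothetical; class names (values) are dlt paginators.
# JSONLinkPaginator was called JSONResponsePaginator in older dlt releases.
PAGINATOR_BY_STYLE = {
    "none": "SinglePagePaginator",
    "page_number": "PageNumberPaginator",
    "offset": "OffsetPaginator",
    "next_url_in_header": "HeaderLinkPaginator",
    "next_url_in_body": "JSONLinkPaginator",
    "cursor_in_body": "JSONResponseCursorPaginator",
}


def choose_paginator(style: str) -> str:
    """Return the dlt paginator class name for a researched pagination style."""
    try:
        return PAGINATOR_BY_STYLE[style]
    except KeyError:
        known = ", ".join(sorted(PAGINATOR_BY_STYLE))
        raise ValueError(f"Unknown pagination style {style!r}; expected one of: {known}") from None
```

An unknown style raises an error on purpose: when research does not match a built-in pattern, that is a point to ask a clarifying question rather than guess.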

Output: specs/YYYY-MM-DD_011_spec_dlt_rest_client_{api_name}.md
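
The output path follows a fixed naming pattern, which a small helper can reproduce. This is only a sketch of the pattern shown above; the function name spec_path is hypothetical and not part of the prompts.

```python
from datetime import date
from typing import Optional


def spec_path(api_name: str, on: Optional[date] = None) -> str:
    """Build the spec file path for an API, defaulting to today's date."""
    on = on or date.today()
    # isoformat() renders the date as YYYY-MM-DD.
    return f"specs/{on.isoformat()}_011_spec_dlt_rest_client_{api_name}.md"
```

For example, spec_path("github", date(2025, 1, 15)) yields the path used in the usage example for the GitHub API.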

Specification includes:

API research findings
Authentication configuration
Endpoint documentation
Resource implementations
Test structure (conftest.py, test_auth.py, test_resources.py, etc.)
TDD task breakdown (8 phases)
Acceptance criteria

Phase 2: Implement (02_implement/02_implement.md)

Purpose: Implement the specification following strict TDD methodology.

Input:

spec_file_path: Path to specification document

Critical rules:

Load spec and extract task list
TDD only: test first → verify fail → implement → verify pass
Follow sequential order, one task at a time
Ask user when spec is unclear or information is missing
No assumptions about credentials or requirements

Process for each task:

Write test (red)
Run test - should fail
Implement minimal code (green)
Run test - should pass
Report progress
Move to next task
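
A single pass through this loop might look like the sketch below. The test is written first; run on its own, before the function underneath exists, it fails with a NameError (red). Adding the minimal implementation makes it pass (green). The function select_issue_fields and the field names are hypothetical and stand in for whatever a real spec requires.

```python
# Step 1 (red): the test comes first and describes the required behavior.
def test_select_issue_fields_keeps_spec_columns():
    raw = {
        "id": 1,
        "title": "Bug",
        "updated_at": "2025-01-15T00:00:00Z",
        "body": "long text the spec does not ask for",
    }
    assert select_issue_fields(raw) == {
        "id": 1,
        "title": "Bug",
        "updated_at": "2025-01-15T00:00:00Z",
    }


# Step 2 (green): the smallest implementation that satisfies the test.
def select_issue_fields(raw: dict) -> dict:
    """Keep only the columns named in the (hypothetical) spec."""
    wanted = ("id", "title", "updated_at")
    return {key: raw[key] for key in wanted}
```

With pytest, the test function is collected by its test_ prefix; no dlt installation or API access is needed for this kind of unit test.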

Output:

sources/{api_name}_source.py
tests/ directory with all test files
.dlt/config.toml and .dlt/secrets.toml.example
{api_name}_pipeline.py
README.md

Validation:

All spec tests implemented and passing
All tasks checked off
Used dlt built-ins (no custom API clients)
Asked user when needed

Phase 3: Test (03_test/03_test.md)

Purpose: Test the pipeline end-to-end against the real API.

Process:

Setup: Create venv, install dependencies
Configure: Add secrets to .dlt/secrets.toml
Limited test: Run with add_limit(10) or max_pages=1
Verify: Check successful load
Full test: Remove limit and run full pipeline
Inspect:
- View schema: dlt pipeline <name> schema
- Query data: dlt pipeline <name> show
- Check counts and data quality
Incremental test: Run twice, verify state and no duplicates
Validate: Complete checklist
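
For the incremental test, "no duplicates" can be checked by counting primary keys in the rows read back from the destination after the second run. The helper below is a minimal sketch that works on plain dictionaries; how the rows are fetched, and the key name "id", are assumptions. Whether a second run can produce duplicates at all depends on the write disposition chosen for the resource.

```python
from collections import Counter
from typing import Any, Hashable, Iterable, List, Mapping


def duplicate_keys(rows: Iterable[Mapping[str, Any]], key: str) -> List[Hashable]:
    """Return key values that occur more than once, in first-seen order."""
    counts = Counter(row[key] for row in rows)
    return [value for value, seen in counts.items() if seen > 1]
```

An empty result after two consecutive runs suggests the incremental cursor and key configuration behave as intended; a non-empty result lists the keys to investigate.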

Common issues covered:

Authentication failures
Schema changes
Data quality issues
Rate limiting

Output: Fully tested, production-ready pipeline

Key Principles

1. Use dlt Built-ins

Always use rest_api_source() or RESTClient
Leverage dlt's auth classes: BearerTokenAuth, APIKeyAuth, HttpBasicAuth, OAuth2ClientCredentials
Use built-in paginators: SinglePagePaginator, PageNumberPaginator, OffsetPaginator, JSONLinkPaginator (named JSONResponsePaginator in older dlt releases), etc.
Never build custom API clients
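
When the declarative source is not flexible enough, RESTClient combines the same auth and paginator classes imperatively. The following is a minimal sketch under assumptions: the GitHub issues endpoint, the function name iter_issue_pages, and the optional token argument are illustrative, and in a real pipeline the token would come from dlt's secrets rather than a function argument.

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_issue_pages(
    owner: str, repo: str, token: Optional[str] = None
) -> Iterator[List[Dict[str, Any]]]:
    """Yield pages of issues using dlt's RESTClient (requires dlt installed)."""
    # Imports run only when the generator is first iterated.
    from dlt.sources.helpers.rest_client import RESTClient
    from dlt.sources.helpers.rest_client.auth import BearerTokenAuth
    from dlt.sources.helpers.rest_client.paginators import HeaderLinkPaginator

    client = RESTClient(
        base_url="https://api.github.com",
        # Unauthenticated access works for public repos at lower rate limits.
        auth=BearerTokenAuth(token=token) if token else None,
        # Follow the "next" URL from the Link response header.
        paginator=HeaderLinkPaginator(),
    )
    for page in client.paginate(
        f"/repos/{owner}/{repo}/issues", params={"per_page": 100}
    ):
        yield page
```

A generator like this can be wrapped in a dlt resource so that pagination, retries, and authentication stay inside dlt's own classes.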

2. Test-Driven Development

Write tests first, implementation second
Tests should fail initially (red)
Implement minimal code to pass (green)
Validate each component before moving forward
Keep tests minimal but focused on critical functionality

3. Progressive Complexity

Start with the simplest approach that works
Only add complexity when needed
Follow the spec exactly; don't over-engineer
Ask the user when requirements are unclear

4. Security First

Never hardcode credentials in source code or load them from ad hoc files
Always let dlt resolve them from .dlt/secrets.toml or environment variables
Keep .dlt/secrets.toml in .gitignore
Provide .dlt/secrets.toml.example template
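
For the environment-variable route, dlt looks up a config key by upper-casing its sections and joining them with double underscores. dlt performs this resolution itself; the helper below only illustrates the naming convention, and the section names sources, github, and access_token are hypothetical.

```python
import os
from typing import Optional


def dlt_env_key(*sections: str) -> str:
    """Turn config sections into the environment variable name dlt looks up."""
    return "__".join(part.upper() for part in sections)


def read_secret_from_env(*sections: str) -> Optional[str]:
    """Return the secret from the environment, or None if it is not set."""
    return os.environ.get(dlt_env_key(*sections))
```

Under this convention, a key written as access_token in a [sources.github] section of .dlt/secrets.toml corresponds to the variable SOURCES__GITHUB__ACCESS_TOKEN.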

5. Incremental Development

Test with limited data first (add_limit(10))
Verify small sample before full load
Test incremental loading by running twice
Validate data quality at each step

Benefits of This Approach

Clarity: Each phase has a clear purpose and output
Quality: TDD ensures correctness at each step
Learning: Prompts teach dlt patterns progressively
Efficiency: Streamlined workflow reduces back-and-forth
Reliability: Specifications prevent misunderstandings
Maintainability: Well-tested code is easier to update

Additional Resources

dlt Documentation
REST API Source
RESTClient Guide
dlt Verified Sources - Check here first before building a custom source

Getting Help

If you encounter issues:

Review the relevant phase documentation
Check the specification for gaps or unclear requirements
Consult dlt documentation
Ask clarifying questions when the spec is ambiguous

Contributing

Improvements welcome:

Clearer prompt instructions
Better examples
Additional patterns
Bug fixes
