This directory contains prompts for developing dlt REST API pipelines using a streamlined, test-driven approach.
The prompts follow a sequential structure with the following goals:
- Learning: Each step explains how dlt works, progressing from basic concepts to advanced patterns.
- Clarity: The process is broken down into distinct phases: research & planning, implementation, and testing.
- Quality: Test-driven development ensures each component works correctly before moving forward.
All prompts follow TDD methodology:
- Tests are written first
- Implementation follows test requirements
- Tests are kept minimal but focused on critical functionality
- Each component is validated before moving forward
Requirements are documented before implementation:
- API research informs the specification
- Specifications include test structure and acceptance criteria
- Implementation follows the specification exactly
Information is presented when needed:
- Phase 0 provides dlt context upfront
- Phase 1 combines research and planning
- Phase 2 focuses on implementation details
- Phase 3 handles end-to-end testing
All prompts follow dlt best practices:
- Use built-in patterns: `rest_api_source()` and `RESTClient`
- Leverage dlt's retry, rate limiting, and error handling
- Follow dlt's resource and source patterns
- Use dlt's incremental loading capabilities
- Never build custom API clients
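As a rough illustration of these practices, the sketch below declares a source with `rest_api_source()` instead of a hand-written client. The GitHub base URL, the `issues` resource, the `since`/`updated_at` cursor, and the helper names `github_config` and `build_source` are assumptions chosen for illustration, not something these prompts prescribe; the import path assumes a dlt 1.x release.

```python
from typing import Any, Dict


def github_config(token: str) -> Dict[str, Any]:
    """Declarative config for rest_api_source (illustrative values only)."""
    return {
        "client": {
            "base_url": "https://api.github.com/repos/dlt-hub/dlt/",
            # Built-in bearer auth: no custom session or header handling.
            "auth": {"type": "bearer", "token": token},
            # GitHub paginates via Link headers; dlt ships a paginator for that.
            "paginator": "header_link",
        },
        "resource_defaults": {
            "primary_key": "id",
            "write_disposition": "merge",
        },
        "resources": [
            {
                "name": "issues",
                "endpoint": {
                    "path": "issues",
                    "params": {
                        "state": "all",
                        # Incremental loading: request only rows updated
                        # since the last stored cursor value.
                        "since": {
                            "type": "incremental",
                            "cursor_path": "updated_at",
                            "initial_value": "2024-01-01T00:00:00Z",
                        },
                    },
                },
            },
        ],
    }


def build_source(token: str):
    """Create the dlt source. dlt is imported lazily so the config
    can be unit-tested without dlt installed."""
    from dlt.sources.rest_api import rest_api_source  # assumes dlt 1.x

    return rest_api_source(github_config(token))
```

Keeping the config in a plain function means a unit test can assert on it directly, before any network call is made.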
prompts/
├── 00_prime/
│   └── 000_dlt_context.md          # dlt fundamentals and patterns
├── 01_plan/
│   └── 01_research_and_plan.md     # Research API + create spec
├── 02_implement/
│   └── 02_implement.md             # TDD implementation
└── 03_test/
    └── 03_test.md                  # Pipeline testing guide
┌─────────────────────────────────────────────────────────────────┐
│  Phase 0: Prime (Optional)                                      │
│  Read dlt context before starting                               │
└─────────────────────────────────────────────────────────────────┘
    │
    └─> 00_prime/000_dlt_context.md
        Provides: dlt patterns, auth classes, paginators, decorators
    │
    v
┌─────────────────────────────────────────────────────────────────┐
│  Phase 1: Research & Plan                                       │
│  Research API + Generate specification                          │
└─────────────────────────────────────────────────────────────────┘
    │
    └─> 01_plan/01_research_and_plan.md
        Arguments: api_name (e.g., "arxiv", "github")
        Output: specs/YYYY-MM-DD_011_spec_dlt_rest_client_{api_name}.md
        This spec includes:
        - API research findings
        - Authentication configuration
        - Endpoint documentation
        - Pagination strategies
        - Incremental loading design
        - Complete test structure
        - TDD task breakdown
    │
    v
┌─────────────────────────────────────────────────────────────────┐
│  Phase 2: Implement                                             │
│  Follow spec using strict TDD                                   │
└─────────────────────────────────────────────────────────────────┘
    │
    └─> 02_implement/02_implement.md
        Arguments: spec_file_path
        Process: Write test → Run (fail) → Implement → Run (pass)
        Follows task list from spec sequentially
    │
    v
┌─────────────────────────────────────────────────────────────────┐
│  Phase 3: Test                                                  │
│  Run pipeline end-to-end                                        │
└─────────────────────────────────────────────────────────────────┘
    │
    └─> 03_test/03_test.md
        - Test with limited data first
        - Verify schema and data quality
        - Test incremental loading
        - Validate full pipeline execution
project/
├── specs/
│   └── YYYY-MM-DD_011_spec_dlt_rest_client_{api_name}.md
├── sources/
│   └── {api_name}_source.py
├── tests/
│   ├── conftest.py
│   ├── test_auth.py          (if auth required)
│   ├── test_resources.py
│   ├── test_incremental.py   (if applicable)
│   ├── test_source.py
│   └── test_pipeline.py
├── .dlt/
│   ├── config.toml
│   └── secrets.toml.example
└── {api_name}_pipeline.py
Step 0 (Optional): Review dlt context
# Read: prompts/00_prime/000_dlt_context.md
# Learn about: RESTClient, auth classes, paginators, decorators
Step 1: Research & Plan
# Use prompt: prompts/01_plan/01_research_and_plan.md
# Provide: api_name="github"
# The prompt will:
# 1. Check for existing dltHub verified source
# 2. Research GitHub API documentation
# 3. Ask clarifying questions
# 4. Generate comprehensive specification
# Output: specs/2025-01-15_011_spec_dlt_rest_client_github.md
Step 2: Implement
# Use prompt: prompts/02_implement/02_implement.md
# Provide: spec_file_path="specs/2025-01-15_011_spec_dlt_rest_client_github.md"
# The prompt will:
# 1. Extract task list from spec
# 2. Follow TDD: test → fail → implement → pass
# 3. Report progress after each task
# 4. Ask user when spec is unclear
# Output: sources/, tests/, .dlt/, pipeline script
Step 3: Test
# Use guide: prompts/03_test/03_test.md
# Process:
# 1. Configure secrets in .dlt/secrets.toml
# 2. Test with limited data (add_limit(10))
# 3. Verify schema and data
# 4. Test incremental loading
# 5. Run full pipeline
# Phase 1: Research & Plan
# → Provide "github" as api_name to prompt 01_research_and_plan.md
# → Review and approve generated specification
# Phase 2: Implement with TDD
# → Provide spec path to prompt 02_implement.md
# → Watch as tests are written and implementation follows
# Phase 3: Test Pipeline
# → Follow guide in 03_test.md
pytest tests/ -v # Run all tests
python github_pipeline.py # Execute pipeline
dlt pipeline github_pipeline show # Inspect results
Purpose: Provide foundational dlt knowledge before starting development.
Contains:
- What is dlt and when to use it
- Decision matrix: `rest_api_source()` vs `RESTClient`
- Authentication patterns (API key, Bearer, Basic, OAuth2)
- Pagination classes and examples
- Resource and source decorators
- Secrets management
- Incremental loading patterns
- Pipeline creation and testing
When to use: Read before Phase 1 if unfamiliar with dlt, or reference during implementation.
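For the `RESTClient` side of that decision matrix, a minimal sketch might look like the following. It assumes a GitHub-style API that paginates through `Link` headers; the function name `issues_pages`, the default repository, and the endpoint path are illustrative assumptions rather than anything these prompts specify.

```python
from typing import Any, Dict, Iterator, List


def issues_pages(token: str, repo: str = "dlt-hub/dlt") -> Iterator[List[Dict[str, Any]]]:
    """Yield pages of issues using dlt's RESTClient.

    dlt is imported inside the function so this module can be imported
    (and the function inspected) without dlt installed.
    """
    from dlt.sources.helpers.rest_client import RESTClient
    from dlt.sources.helpers.rest_client.auth import BearerTokenAuth
    from dlt.sources.helpers.rest_client.paginators import HeaderLinkPaginator

    client = RESTClient(
        base_url="https://api.github.com",
        auth=BearerTokenAuth(token=token),   # built-in auth class
        paginator=HeaderLinkPaginator(),     # built-in paginator
    )
    # paginate() keeps requesting pages until the paginator finds no next link.
    for page in client.paginate(f"/repos/{repo}/issues", params={"state": "all"}):
        yield page
```

A generator like this is typically wrapped with the `@dlt.resource` decorator so that a pipeline can load it. The imperative style suits APIs that need custom per-page logic, while the declarative `rest_api_source()` config is usually shorter when the API is regular.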
Purpose: Research API and generate complete specification with TDD structure.
Input:
- `api_name`: Name of the API (e.g., "arxiv", "github")
- Optional: `--spec-output`, `--destination`
Process:
- Check for existing dltHub verified source
- Research API documentation for:
  - Base URL, authentication, endpoints
  - Pagination, rate limits, incremental fields
  - Data structure and special considerations
- Ask clarifying questions
- Design test-first approach
- Map API patterns to dlt components
- Generate specification document
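The "map API patterns to dlt components" step can be pictured as a lookup from what the research found to a dlt class name. The sketch below is only an illustration: the keys on the left and the function `map_api_to_dlt` are hypothetical labels, while the class names on the right are the ones dlt's REST client helpers provide.

```python
# Hypothetical research labels -> dlt auth class names.
AUTH_CLASSES = {
    "bearer": "BearerTokenAuth",
    "api_key": "APIKeyAuth",
    "basic": "HttpBasicAuth",
    "oauth2_client_credentials": "OAuth2ClientCredentials",
}

# Hypothetical research labels -> dlt paginator class names.
PAGINATOR_CLASSES = {
    "none": "SinglePagePaginator",
    "page_number": "PageNumberPaginator",
    "offset": "OffsetPaginator",
    "link_header": "HeaderLinkPaginator",
    "cursor_in_body": "JSONResponseCursorPaginator",
}


def map_api_to_dlt(auth_style: str, pagination_style: str) -> dict:
    """Return the dlt class names for the researched API patterns.

    Unknown patterns raise ValueError so the planner stops and asks
    the user instead of guessing.
    """
    if auth_style not in AUTH_CLASSES:
        raise ValueError(f"Unknown auth style: {auth_style!r}")
    if pagination_style not in PAGINATOR_CLASSES:
        raise ValueError(f"Unknown pagination style: {pagination_style!r}")
    return {
        "auth": AUTH_CLASSES[auth_style],
        "paginator": PAGINATOR_CLASSES[pagination_style],
    }
```

For example, `map_api_to_dlt("bearer", "link_header")` would pair `BearerTokenAuth` with `HeaderLinkPaginator`, which is a plausible choice for a GitHub-style API.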
Output: specs/YYYY-MM-DD_011_spec_dlt_rest_client_{api_name}.md
Specification includes:
- API research findings
- Authentication configuration
- Endpoint documentation
- Resource implementations
- Test structure (conftest.py, test_auth.py, test_resources.py, etc.)
- TDD task breakdown (8 phases)
- Acceptance criteria
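The output filename pattern can be sketched as a small helper. The function name `spec_path` and the `spec_dir` parameter are hypothetical; only the pattern itself comes from this document.

```python
from datetime import date
from pathlib import PurePosixPath


def spec_path(api_name: str, created: date, spec_dir: str = "specs") -> str:
    """Build the spec path: <spec_dir>/YYYY-MM-DD_011_spec_dlt_rest_client_<api_name>.md"""
    filename = f"{created.isoformat()}_011_spec_dlt_rest_client_{api_name}.md"
    return str(PurePosixPath(spec_dir) / filename)


print(spec_path("github", date(2025, 1, 15)))
# specs/2025-01-15_011_spec_dlt_rest_client_github.md
```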
Purpose: Implement specification following strict TDD methodology.
Input:
- `spec_file_path`: Path to specification document
Critical rules:
- Load spec and extract task list
- TDD only: test first → verify fail → implement → verify pass
- Follow sequential order, one task at a time
- Ask user when spec is unclear or information is missing
- No assumptions about credentials or requirements
Process for each task:
- Write test (red)
- Run test - should fail
- Implement minimal code (green)
- Run test - should pass
- Report progress
- Move to next task
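One red/green cycle from the steps above might look like this in miniature. The function `build_config`, the resource names, and the base URL are hypothetical; in a real project the test would live in `tests/test_source.py` and the implementation in `sources/{api_name}_source.py`, so the first run would fail on the missing import.

```python
# Step 1 (red): the test is written first. Run on its own, before
# build_config exists, it fails with a NameError.
def test_source_declares_expected_resources():
    config = build_config()
    names = [resource["name"] for resource in config["resources"]]
    assert names == ["issues", "pulls"]


# Step 3 (green): the smallest implementation that satisfies the test.
# The dict mirrors the shape of a rest_api_source() config.
def build_config():
    return {
        "client": {"base_url": "https://api.github.com/repos/dlt-hub/dlt/"},
        "resources": [{"name": "issues"}, {"name": "pulls"}],
    }
```

Authentication, pagination, and incremental settings would each be added by a later task with its own failing test, which keeps every step small enough to verify.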
Output:
- `sources/{api_name}_source.py`
- `tests/` directory with all test files
- `.dlt/config.toml` and `.dlt/secrets.toml.example`
- `{api_name}_pipeline.py`
- `README.md`
Validation:
- All spec tests implemented and passing
- All tasks checked off
- Used dlt built-ins (no custom API clients)
- Asked user when needed
Purpose: Test the pipeline end-to-end with real API.
Process:
- Setup: Create venv, install dependencies
- Configure: Add secrets to `.dlt/secrets.toml`
- Limited test: Run with `add_limit(10)` or `max_pages=1`
- Verify: Check successful load
- Full test: Remove limit and run full pipeline
- Inspect:
  - View schema: `dlt pipeline <name> schema`
  - Query data: `dlt pipeline <name> show`
  - Check counts and data quality
- Incremental test: Run twice, verify state and no duplicates
- Validate: Complete checklist
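The limited-test and verify steps could be scripted roughly as follows. This is a sketch under assumptions: the `duckdb` destination, the `github_data` dataset name, and the helper names `run_limited` and `user_table_counts` are all hypothetical, and `source` is any dlt source built in Phase 2.

```python
from typing import Dict


def user_table_counts(row_counts: Dict[str, int]) -> Dict[str, int]:
    """Drop dlt's internal bookkeeping tables (names starting with _dlt)."""
    return {table: n for table, n in row_counts.items() if not table.startswith("_dlt")}


def run_limited(source, pipeline_name: str = "github_pipeline") -> Dict[str, int]:
    """Load a small sample and return per-table row counts.

    dlt is imported lazily so the counting helper above stays testable
    without dlt installed.
    """
    import dlt

    pipeline = dlt.pipeline(
        pipeline_name=pipeline_name,
        destination="duckdb",        # assumption: local DuckDB for testing
        dataset_name="github_data",  # hypothetical dataset name
    )
    # add_limit caps how many items (pages, for REST sources) each
    # resource yields; it is not a row limit.
    pipeline.run(source.add_limit(10))
    return user_table_counts(pipeline.last_trace.last_normalize_info.row_counts)
```

Comparing the returned counts against what the API is known to contain gives a quick sanity check before the limit is removed for the full run.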
Common issues covered:
- Authentication failures
- Schema changes
- Data quality issues
- Rate limiting
Output: Fully tested, production-ready pipeline
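Authentication failures often come down to a secret that dlt cannot find. When credentials are supplied through environment variables instead of `.dlt/secrets.toml`, dlt expects the key path in upper case with double underscores between sections. The helper below sketches that naming rule; `env_var_for` and the key `sources.github.access_token` are hypothetical examples.

```python
def env_var_for(secret_path: str) -> str:
    """Translate a dotted secrets.toml key path into the matching
    environment variable name (upper case, '__' between sections)."""
    return secret_path.upper().replace(".", "__")


print(env_var_for("sources.github.access_token"))
# SOURCES__GITHUB__ACCESS_TOKEN
```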
- Always use `rest_api_source()` or `RESTClient`
- Leverage dlt's auth classes: `BearerTokenAuth`, `APIKeyAuth`, `HttpBasicAuth`, `OAuth2ClientCredentials`
- Use built-in paginators: `SinglePagePaginator`, `PageNumberPaginator`, `OffsetPaginator`, `JSONResponsePaginator`, etc.
- Never build custom API clients
- Write tests first, implementation second
- Tests should fail initially (red)
- Implement minimal code to pass (green)
- Validate each component before moving forward
- Keep tests minimal but focused on critical functionality
- Start with simplest approach that works
- Only add complexity when needed
- Follow spec exactly, don't over-engineer
- Ask user when requirements are unclear
- Never read credentials from files
- Always use `.dlt/secrets.toml` or environment variables
- Keep `.dlt/secrets.toml` in `.gitignore`
- Provide `.dlt/secrets.toml.example` template
- Test with limited data first (`add_limit(10)`)
- Verify small sample before full load
- Test incremental loading by running twice
- Validate data quality at each step
- Clarity: Each phase has a clear purpose and output
- Quality: TDD ensures correctness at each step
- Learning: Prompts teach dlt patterns progressively
- Efficiency: Streamlined workflow reduces back-and-forth
- Reliability: Specifications prevent misunderstandings
- Maintainability: Well-tested code is easier to update
- dlt Documentation
- REST API Source
- RESTClient Guide
- dlt Verified Sources - Check here first before building a custom source
If you encounter issues:
- Review the relevant phase documentation
- Check specification for clarity
- Consult dlt documentation
- Ask clarifying questions when spec is ambiguous
Improvements welcome:
- Clearer prompt instructions
- Better examples
- Additional patterns
- Bug fixes