Support {{media}} for including files #25

@chr15m

Support for adding files to the context is tracked in #17. Supporting {{media}} would inline a file at a specific place in the prompt, whereas --read places files at the start.

My feeling is this has marginal benefit, since you can refer to inlined files by name and the LLM will know what you mean, but if somebody needs it we can implement this.


LLM analysis of the dotprompt {{media}} var.

So this could be used like --urlVariable=@somefile.png or similar. Or maybe we can infer from the media tags that it's looking for a file and then we don't need the @.

Need to think about how arrays of files or multiple files could be included. E.g. a prompt like "extract a Markdown document from these images" where more than one image can be passed in. Maybe we can glob multiple files into a single var somehow.


Dotprompt Media Helper Implementation Guide

This document describes how the {{media}} helper works in Dotprompt and how to implement it.

Overview

The media helper enables multimodal prompts by inserting media content (images, audio, etc.) into prompts. It uses a two-phase rendering approach with an intermediate marker format.

Helper Syntax

{{media url=urlVariable}}
{{media url=urlVariable contentType="image/jpeg"}}

Parameters

Parameter    Required  Description
url          Yes       A variable containing a data: URI or https: URL
contentType  No        MIME type of the media (e.g., image/jpeg). Can be inferred from data URIs.

Implementation

Phase 1: Template Rendering

The media helper outputs an intermediate marker string:

<<<dotprompt:media:url URL>>>
<<<dotprompt:media:url URL CONTENT_TYPE>>>

Example helper implementation (pseudocode):

function media(options):
    url = options.hash["url"]
    contentType = options.hash["contentType"]
    
    if contentType is not empty:
        return "<<<dotprompt:media:url " + url + " " + contentType + ">>>"
    else:
        return "<<<dotprompt:media:url " + url + ">>>"
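
The pseudocode above could look roughly like this in Python. This is a minimal sketch, not Dotprompt's own code: the function name media_marker and its keyword-style signature are assumptions, and a real template engine would pass the hash arguments in its own way.

```python
from typing import Optional


def media_marker(url: str, content_type: Optional[str] = None) -> str:
    """Build the intermediate marker string for one media item.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if content_type:
        # Content type is appended after the URL, separated by a space.
        return f"<<<dotprompt:media:url {url} {content_type}>>>"
    return f"<<<dotprompt:media:url {url}>>>"
```

Because the marker is space-separated, this sketch assumes the URL itself contains no whitespace.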

Phase 2: Post-Processing

After template rendering, the implementation must parse the rendered text and convert markers into proper MediaPart objects within the message structure.

Marker Format

<<<dotprompt:media:url {URL} [{CONTENT_TYPE}]>>>
  • {URL}: The media URL (data URI or https URL)
  • {CONTENT_TYPE}: Optional MIME type, space-separated from URL

Parsing Algorithm

  1. Scan the rendered template text for <<<dotprompt:media:url ... >>> markers
  2. For each marker found:
    • Extract the URL and optional contentType
    • Split the surrounding text into separate TextPart objects
    • Create a MediaPart object for the media content
  3. Assemble the final message with interleaved text and media parts
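
The parsing steps above can be sketched with a regular expression. This is one possible approach under stated assumptions: the names MARKER and to_parts are hypothetical, URLs and content types are assumed to contain no whitespace, and contentType is left out when the marker does not carry one (no inference from data URIs is attempted here).

```python
import re
from typing import Any, Dict, List

# URL, then an optional space-separated content type, then the closing ">>>".
MARKER = re.compile(r"<<<dotprompt:media:url (\S+?)(?: (\S+?))?>>>")


def to_parts(rendered: str) -> List[Dict[str, Any]]:
    """Split rendered template text into interleaved text and media parts."""
    parts: List[Dict[str, Any]] = []
    pos = 0
    for match in MARKER.finditer(rendered):
        # Text between the previous marker and this one becomes a text part.
        if match.start() > pos:
            parts.append({"text": rendered[pos:match.start()]})
        media = {"url": match.group(1)}
        if match.group(2):
            media["contentType"] = match.group(2)
        parts.append({"media": media})
        pos = match.end()
    # Trailing text after the last marker.
    if pos < len(rendered):
        parts.append({"text": rendered[pos:]})
    return parts
```

Empty text segments are skipped, so two adjacent markers produce two consecutive media parts with nothing in between.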

Final Output Structure

The final message structure contains separate TextPart and MediaPart objects.

MediaPart Schema

{
  "media": {
    "url": "string (required) - data: or https: URI",
    "contentType": "string (optional) - MIME type"
  }
}

Example

Input Template

Describe this image:
{{media url=imageUrl}}
What objects do you see?

After Phase 1 (Template Rendering)

Describe this image:
<<<dotprompt:media:url data:image/jpeg;base64,/9j/4AAQ...>>>
What objects do you see?

After Phase 2 (Final Message Structure)

{
  "role": "user",
  "content": [
    {"text": "Describe this image:\n"},
    {
      "media": {
        "url": "data:image/jpeg;base64,/9j/4AAQ...",
        "contentType": "image/jpeg"
      }
    },
    {"text": "\nWhat objects do you see?"}
  ]
}

Key Implementation Notes

  1. No Duplication: Media content appears only in MediaPart objects, not in text parts. The marker is replaced entirely.

  2. Content Type Inference: If contentType is not provided but the URL is a data URI, implementations should extract the MIME type from the data URI format: data:{contentType};base64,...

  3. URL Types: Implementations must support both:
    • data: URIs containing inline base64-encoded content
    • https: URLs pointing to remote media

  4. File Loading: Dotprompt does not handle file loading. The calling code is responsible for loading files and converting them to data URIs or URLs before passing them as template variables.

  5. Marker Escaping: The marker format uses <<< and >>> delimiters which are unlikely to appear in normal prompt text. Implementations should handle edge cases where these sequences might appear literally.
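
Notes 2 and 3 could be combined into a small normalisation step when building each MediaPart. The following is a sketch only: make_media_part is a hypothetical name, and rejecting other URL schemes with ValueError is an assumption about how an implementation might choose to behave, not something the guide specifies.

```python
from typing import Any, Dict, Optional


def make_media_part(url: str, content_type: Optional[str] = None) -> Dict[str, Any]:
    """Build a MediaPart dict, inferring contentType from a data URI if needed."""
    if not url.startswith(("data:", "https:")):
        # Assumption: anything other than data: or https: is rejected.
        raise ValueError(f"unsupported media URL: {url!r}")

    if not content_type and url.startswith("data:"):
        # data:{contentType};base64,... -> take the text before the first ';' or ','.
        header = url[len("data:"):].split(",", 1)[0]
        content_type = header.split(";", 1)[0] or None

    media: Dict[str, Any] = {"url": url}
    if content_type:
        media["contentType"] = content_type
    return {"media": media}
```

An explicit contentType always wins over the inferred one, and https: URLs without a contentType are passed through with only the url field.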

Multiple Media Files

To include multiple media files, use the {{#each}} helper to iterate over an array of URLs:

Simple Array of URLs

Describe these images:
{{#each imageUrls}}
{{media url=this}}
{{/each}}

With input:

{
  "imageUrls": [
    "data:image/jpeg;base64,/9j/4AAQ...",
    "https://example.com/image2.jpg",
    "data:image/png;base64,iVBORw0..."
  ]
}

Array of Objects with Metadata

For more control, pass an array of objects containing URL and content type:

{{#each images}}
{{media url=this.url contentType=this.type}}
{{/each}}

With input:

{
  "images": [
    {"url": "https://example.com/photo.jpg", "type": "image/jpeg"},
    {"url": "https://example.com/diagram.png", "type": "image/png"}
  ]
}

This renders multiple media markers which are then post-processed into a message containing multiple MediaPart objects.
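
Since Dotprompt leaves file loading to the caller, one way to get several local files into a single array variable is to expand a glob pattern into a list of data URIs before rendering. This is a rough sketch of that idea, not an existing feature: the function name glob_to_variable and the choice of application/octet-stream as the fallback MIME type are assumptions.

```python
import base64
import glob
import mimetypes
from pathlib import Path
from typing import Dict, List


def glob_to_variable(name: str, pattern: str) -> Dict[str, List[str]]:
    """Expand a glob pattern into {name: [data URI, ...]} for use with {{#each}}."""
    uris: List[str] = []
    # Sort so the order of images in the prompt is deterministic.
    for path in sorted(glob.glob(pattern)):
        mime = mimetypes.guess_type(path)[0] or "application/octet-stream"
        payload = base64.b64encode(Path(path).read_bytes()).decode("ascii")
        uris.append(f"data:{mime};base64,{payload}")
    return {name: uris}
```

For example, glob_to_variable("imageUrls", "scans/*.png") would produce input in the same shape as the "Simple Array of URLs" example above, with the directory name here being purely illustrative.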

Labels: enhancement (New feature or request)
