Support for adding files to context is in #17. Supporting {{media}} would simply inline the files at a particular place in the prompt, whereas --read places them at the start.
My feeling is that this has marginal benefit, since you can refer to inlined files by name and the LLM will know what you mean, but if somebody needs it we can implement it.
LLM analysis of the dotprompt {{media}} var.
So this could be used like --urlVariable=@somefile.png or similar. Or maybe we can infer from the media tags that it's looking for a file and then we don't need the @.
Need to think about how arrays of files or multiple files could be included. E.g. a prompt like "extract a Markdown document from these images" where more than one image can be passed in. Maybe we can glob multiple files into a single var somehow.
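To make the @file and glob ideas concrete, here is a rough sketch of how a variable value could be resolved before it reaches the template. Everything in it is hypothetical: the names file_to_data_uri and resolve_variable, the "@" prefix convention, and the rule that a wildcard pattern produces a list are assumptions for discussion, not existing behaviour.

```python
import base64
import glob
import mimetypes
from pathlib import Path

WILDCARDS = "*?["


def file_to_data_uri(path: str) -> str:
    """Read a local file and encode it as a base64 data: URI."""
    mime, _ = mimetypes.guess_type(path)
    mime = mime or "application/octet-stream"  # fallback when the type is unknown
    payload = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{payload}"


def resolve_variable(value: str):
    """Hypothetical handling of a CLI variable value.

    "@path" becomes a single data URI, "@pattern" with wildcards becomes a
    list of data URIs, and any other value passes through unchanged.
    """
    if not value.startswith("@"):
        return value
    pattern = value[1:]
    if any(ch in pattern for ch in WILDCARDS):
        # Sorted so the order of the images in the prompt is stable.
        return [file_to_data_uri(p) for p in sorted(glob.glob(pattern))]
    return file_to_data_uri(pattern)
```

Under this assumption, something like --imageUrls=@scans/*.png would hand the template a list it can iterate over, while --imageUrl=@somefile.png would hand it a single string.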
Dotprompt Media Helper Implementation Guide
This document describes how the {{media}} helper works in Dotprompt and how to implement it.
Overview
The media helper enables multimodal prompts by inserting media content (images, audio, etc.) into prompts. It uses a two-phase rendering approach with an intermediate marker format.
Helper Syntax
{{media url=urlVariable}}
{{media url=urlVariable contentType="image/jpeg"}}
Parameters
| Parameter | Required | Description |
| --- | --- | --- |
| url | Yes | A variable containing a data: URI or https: URL |
| contentType | No | MIME type of the media (e.g., image/jpeg). Can be inferred from data URIs. |
Implementation
Phase 1: Template Rendering
The media helper outputs an intermediate marker string:
<<<dotprompt:media:url URL>>>
<<<dotprompt:media:url URL CONTENT_TYPE>>>
Example helper implementation (pseudocode):
function media(options):
    url = options.hash["url"]
    contentType = options.hash["contentType"]
    if contentType is not empty:
        return "<<<dotprompt:media:url " + url + " " + contentType + ">>>"
    else:
        return "<<<dotprompt:media:url " + url + ">>>"
Phase 2: Post-Processing
After template rendering, the implementation must parse the rendered text and convert markers into proper MediaPart objects within the message structure.
Marker Format
<<<dotprompt:media:url {URL} [{CONTENT_TYPE}]>>>
{URL}: The media URL (data URI or https URL)
{CONTENT_TYPE}: Optional MIME type, space-separated from URL
Parsing Algorithm
- Scan the rendered template text for <<<dotprompt:media:url ... >>> markers
- For each marker found:
  - Extract the URL and optional contentType
  - Split the surrounding text into separate TextPart objects
  - Create a MediaPart object for the media content
- Assemble the final message with interleaved text and media parts
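The steps above can be sketched with a single regular expression. This is a minimal illustration, not the Dotprompt implementation: the function name parse_media_markers and the use of plain dicts in place of TextPart and MediaPart objects are assumptions, and it does not infer a missing contentType or handle URLs that themselves contain spaces or ">>>".

```python
import re
from typing import Any, Dict, List

# URL and optional content type are whitespace-free tokens separated by one space.
MEDIA_MARKER = re.compile(r"<<<dotprompt:media:url (\S+?)(?: (\S+?))?>>>")


def parse_media_markers(rendered: str) -> List[Dict[str, Any]]:
    """Split rendered template text into interleaved text and media parts."""
    parts: List[Dict[str, Any]] = []
    pos = 0
    for match in MEDIA_MARKER.finditer(rendered):
        # Text between the previous marker (or the start) and this marker.
        if match.start() > pos:
            parts.append({"text": rendered[pos:match.start()]})
        media = {"url": match.group(1)}
        if match.group(2):
            media["contentType"] = match.group(2)
        parts.append({"media": media})
        pos = match.end()
    # Trailing text after the last marker.
    if pos < len(rendered):
        parts.append({"text": rendered[pos:]})
    return parts
```

The returned list can be used directly as the content array of a message. Empty text segments are skipped, so two adjacent markers produce two consecutive media parts.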
Final Output Structure
The final message structure contains separate TextPart and MediaPart objects.
MediaPart Schema
{
  "media": {
    "url": "string (required) - data: or https: URI",
    "contentType": "string (optional) - MIME type"
  }
}
Example
Input Template
Describe this image:
{{media url=imageUrl}}
What objects do you see?
After Phase 1 (Template Rendering)
Describe this image:
<<<dotprompt:media:url data:image/jpeg;base64,/9j/4AAQ...>>>
What objects do you see?
After Phase 2 (Final Message Structure)
{
  "role": "user",
  "content": [
    {"text": "Describe this image:\n"},
    {
      "media": {
        "url": "data:image/jpeg;base64,/9j/4AAQ...",
        "contentType": "image/jpeg"
      }
    },
    {"text": "\nWhat objects do you see?"}
  ]
}
Key Implementation Notes
- No Duplication: Media content appears only in MediaPart objects, not in text parts. The marker is replaced entirely.
- Content Type Inference: If contentType is not provided but the URL is a data URI, implementations should extract the MIME type from the data URI format: data:{contentType};base64,...
- URL Types: Implementations must support both:
  - data: URIs containing inline base64-encoded content
  - https: URLs pointing to remote media
- File Loading: Dotprompt does not handle file loading. The calling code is responsible for loading files and converting them to data URIs or URLs before passing them as template variables.
- Marker Escaping: The marker format uses <<< and >>> delimiters which are unlikely to appear in normal prompt text. Implementations should handle edge cases where these sequences might appear literally.
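The content type inference note could look roughly like this. It is a sketch under assumptions: the function name infer_content_type is made up for illustration, an explicit contentType always wins, and https: URLs are left without a type instead of being guessed from the file extension.

```python
from typing import Optional


def infer_content_type(url: str, content_type: Optional[str] = None) -> Optional[str]:
    """Return the explicit content type, or the MIME type embedded in a data URI."""
    if content_type:
        return content_type
    if not url.startswith("data:"):
        return None  # nothing to infer from a remote URL in this sketch
    # A data URI looks like data:{contentType};base64,{payload}
    header = url[len("data:"):].split(",", 1)[0]
    mime = header.split(";", 1)[0]
    return mime or None
```

Running this during post-processing is what would turn the marker in the example above, which carries no content type, into a MediaPart with "contentType": "image/jpeg".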
Multiple Media Files
To include multiple media files, use the {{#each}} helper to iterate over an array of URLs:
Simple Array of URLs
Describe these images:
{{#each imageUrls}}
{{media url=this}}
{{/each}}
With input:
{
  "imageUrls": [
    "data:image/jpeg;base64,/9j/4AAQ...",
    "https://example.com/image2.jpg",
    "data:image/png;base64,iVBORw0..."
  ]
}
Array of Objects with Metadata
For more control, pass an array of objects containing URL and content type:
{{#each images}}
{{media url=this.url contentType=this.type}}
{{/each}}
With input:
{
  "images": [
    {"url": "https://example.com/photo.jpg", "type": "image/jpeg"},
    {"url": "https://example.com/diagram.png", "type": "image/png"}
  ]
}
This renders multiple media markers which are then post-processed into a message containing multiple MediaPart objects.
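To show what Phase 1 would produce for the array-of-objects case, the loop below imitates the {{#each images}} block in plain Python. It is only an approximation: media_marker and render_each are hypothetical names, and the one-marker-per-line output assumes the template engine keeps the newline after each {{media}} call.

```python
from typing import Dict, List, Optional


def media_marker(url: str, content_type: Optional[str] = None) -> str:
    """Build the intermediate marker, with the content type only when given."""
    if content_type:
        return f"<<<dotprompt:media:url {url} {content_type}>>>"
    return f"<<<dotprompt:media:url {url}>>>"


def render_each(images: List[Dict[str, str]]) -> str:
    """Imitate {{#each images}}{{media url=this.url contentType=this.type}}{{/each}}."""
    return "".join(
        media_marker(image["url"], image.get("type")) + "\n" for image in images
    )
```

Each entry yields its own marker, so post-processing sees them as separate media parts with only line breaks between them.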