This repository was archived by the owner on Apr 14, 2026. It is now read-only.

Update the architecture of JunieOrchestrator.py based on this:

High-level idea

You give it:

  • a natural-language task (for Junie),
  • a Git repo URL,

and it will:

  1. Create a new folder/branch for that task.
  2. Generate confirmation.md (acceptance criteria) using GPT.
  3. Repeatedly run Junie + GPT review:
  • Junie works on the repo.
  • GPT checks whether the task is “done” according to the checklist.
  • If not done, GPT writes steps_to_complete_the_task.md for the next iteration.
  4. Stop when either:
  • GPT says “done ✅”, or
  • it hits the max number of iterations.

Detailed flow

1. Parse CLI arguments

main() uses argparse to parse:

  • task (positional): natural-language description, e.g. "Add login form with password reset".
  • --repo / -r: Git repo URL (optional; if omitted you’re prompted).
  • --base-dir / -b: where to create the task folder (default: .).
  • --max-iterations / -n: how many Junie loops to run (default: 5).
  • --model / -m: GPT model for confirmation/review (default: gpt-4.1-mini).

Then it:

  • Checks OPENAI_API_KEY is set; if not, exits.
  • Ensures it has a repo URL (arg or prompt); otherwise exits.
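The CLI surface above can be sketched with argparse — a minimal sketch, with flag names and defaults taken straight from the list above:

```python
import argparse

def parse_args(argv=None):
    """Parse the orchestrator's CLI arguments as described above."""
    parser = argparse.ArgumentParser(
        description="Run Junie + GPT review loops against a Git repo."
    )
    parser.add_argument("task",
                        help='Natural-language task, e.g. "Add login form with password reset"')
    parser.add_argument("--repo", "-r",
                        help="Git repo URL (prompted for if omitted)")
    parser.add_argument("--base-dir", "-b", default=".",
                        help="Where to create the task folder")
    parser.add_argument("--max-iterations", "-n", type=int, default=5,
                        help="How many Junie loops to run")
    parser.add_argument("--model", "-m", default="gpt-4.1-mini",
                        help="GPT model for confirmation/review")
    return parser.parse_args(argv)
```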

2. Set up the repo for this task

setup_repo(task_text, repo_url, base_dir):

  1. Slugify the task:
  • Lowercases, replaces spaces with -, strips weird chars.
  • Example: "Add login form" → add-login-form.
  2. Create a folder under base_dir with that slug name.
  • If the folder already exists → exit to avoid overwriting stuff.
  3. Clone the repo into that folder:
git clone <repo_url> .
  4. Create and checkout a new branch named after the slug:
git checkout -b add-login-form
  5. Return work_dir = the path to this new working directory.

From here on, every operation happens inside work_dir.
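The steps above could look roughly like this. A sketch: the exact slugify rules beyond “lowercase, dashes, strip weird chars” are an assumption.

```python
import re
import subprocess
import sys
from pathlib import Path

def slugify(task_text):
    # Lowercase, spaces -> "-", then strip anything that is not
    # alphanumeric or "-" (assumed interpretation of "weird chars").
    slug = task_text.lower().strip().replace(" ", "-")
    return re.sub(r"[^a-z0-9-]", "", slug)

def setup_repo(task_text, repo_url, base_dir="."):
    work_dir = Path(base_dir) / slugify(task_text)
    if work_dir.exists():
        sys.exit(f"{work_dir} already exists; refusing to overwrite it.")
    work_dir.mkdir(parents=True)
    subprocess.run(["git", "clone", repo_url, "."], cwd=work_dir, check=True)
    subprocess.run(["git", "checkout", "-b", work_dir.name], cwd=work_dir, check=True)
    return work_dir
```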

3. Generate confirmation.md with GPT

generate_confirmation_md(task_text, work_dir, model):

  1. Builds a system message telling GPT:
  • “You define acceptance criteria; output a concise Markdown checklist; each item must be specific and verifiable; only output Markdown.”
  2. Builds a user message that includes the task text and explicitly asks:
  • “Write a checklist of goals that, if all completed, mean the task is fully done.”
  3. Calls run_gpt(...):
completion = client.chat.completions.create(...)

and gets the content back.
  4. Writes that Markdown into work_dir/confirmation.md.
  5. Returns the Markdown string.

This file is your source of truth for when the task is done.
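A sketch of this step. Here run_gpt is passed in as a parameter to keep the sketch testable; in the actual script it is the wrapper around client.chat.completions.create(...), and the function signature is just (task_text, work_dir, model).

```python
from pathlib import Path

SYSTEM_MSG = (
    "You define acceptance criteria. Output a concise Markdown checklist; "
    "each item must be specific and verifiable. Only output Markdown."
)

def generate_confirmation_md(task_text, work_dir, model, run_gpt):
    # run_gpt(system, user, model) -> str; injected here, wraps the
    # OpenAI chat.completions call in the real script.
    user_msg = (
        f"Task: {task_text}\n\n"
        "Write a checklist of goals that, if all completed, mean the task is fully done."
    )
    markdown = run_gpt(SYSTEM_MSG, user_msg, model)
    (Path(work_dir) / "confirmation.md").write_text(markdown)
    return markdown
```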

4. Iterative loop over Junie + GPT review

The for iteration in range(1, max_iterations + 1) loop is the core.

4.1 Decide what to ask Junie to do

Inside the loop:

  • Iteration 1: Junie gets the original task:
junie_task = task_text
  • Later iterations: Junie gets a meta-task:
junie_task = (
    "Implement all remaining work described in steps_to_complete_the_task.md "
    "so that all goals in confirmation.md are fully satisfied."
)

So first pass: “do the task”. Subsequent passes: “follow the remaining steps we wrote down”.

4.2 Run Junie and capture logs

run_junie(junie_task, work_dir, iteration):

  1. Builds a command:
junie --output-format=text "<junie_task>"
  2. Runs it in work_dir (via the run_cmd wrapper).
  3. Produces a combined log string containing:
  • the command line,
  • the return code,
  • STDOUT,
  • STDERR.
  4. Writes that log to:
work_dir/junie_output_iter_<iteration>.txt
  5. Returns the combined log string.
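A sketch of run_junie. The exact log layout is an assumption, and build_junie_command is a hypothetical helper split out here for clarity (the script uses its run_cmd wrapper instead of calling subprocess directly):

```python
import subprocess
from pathlib import Path

def build_junie_command(junie_task):
    return ["junie", "--output-format=text", junie_task]

def run_junie(junie_task, work_dir, iteration):
    cmd = build_junie_command(junie_task)
    result = subprocess.run(cmd, cwd=work_dir, capture_output=True, text=True)
    # Combine command, return code, and both output streams into one log.
    log = (
        f"COMMAND: {' '.join(cmd)}\n"
        f"RETURN CODE: {result.returncode}\n"
        f"STDOUT:\n{result.stdout}\n"
        f"STDERR:\n{result.stderr}\n"
    )
    (Path(work_dir) / f"junie_output_iter_{iteration}.txt").write_text(log)
    return log
```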

4.3 Capture current git diff

get_git_diff(work_dir):

  • Runs:
git diff

in that repo.

  • Returns the full diff as a string.

This is how GPT “sees” what code changed.
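In code this is little more than a subprocess call (sketch):

```python
import subprocess

def get_git_diff(work_dir):
    # The unstaged working-tree diff; this is the only view of the
    # code changes that the GPT reviewer gets.
    result = subprocess.run(
        ["git", "diff"], cwd=work_dir, capture_output=True, text=True, check=True
    )
    return result.stdout
```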

4.4 Ask GPT: is the task complete?

evaluate_completion(task_text, confirmation_md, junie_output, git_diff, model):

  1. Builds a strict system prompt saying:
  • You are a strict reviewer.

  • You get: task, confirmation.md, Junie output, git diff.

  • Your job:

  • Decide if all checklist items are satisfied.

  • If not, write Markdown for steps_to_complete_the_task.md.

  • Respond with only JSON of this shape:

    {
        "done": true or false,
        "reason": "...",
        "steps_md": "..."
    }
  2. Creates a payload JSON with:

    • task
    • confirmation_md
    • junie_output
    • git_diff
  3. Sends that payload as a user message and parses the response with json.loads.
  4. Returns:

  • done (bool),
  • reason (string),
  • steps_md (Markdown string; empty if done).
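A sketch of the evaluation step. As before, run_gpt is injected here for testability; the real function signature omits it and calls the OpenAI wrapper directly. The system-prompt wording is a paraphrase of the description above.

```python
import json

REVIEW_SYSTEM_MSG = (
    "You are a strict reviewer. You get the task, confirmation.md, Junie's "
    "output, and the git diff. Decide if all checklist items are satisfied; "
    "if not, write Markdown for steps_to_complete_the_task.md. Respond with "
    'only JSON: {"done": true or false, "reason": "...", "steps_md": "..."}'
)

def evaluate_completion(task_text, confirmation_md, junie_output, git_diff, model, run_gpt):
    # run_gpt(system, user, model) -> str; injected here.
    payload = json.dumps({
        "task": task_text,
        "confirmation_md": confirmation_md,
        "junie_output": junie_output,
        "git_diff": git_diff,
    })
    reply = json.loads(run_gpt(REVIEW_SYSTEM_MSG, payload, model))
    return reply["done"], reply["reason"], reply.get("steps_md", "")
```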

Back in the loop, it prints:

print(f"Review: done={done}, reason={reason}")

4.5 Act on GPT’s decision

In the loop:

  • If done is True:

  • Print a success message.

  • Return from main() (exit 0).

  • If done is False and steps_md is empty:

  • Print a failure message (to avoid infinite loops).

  • Return from main() (exit non-zero).

  • If done is False and steps_md has content:

  • Call write_steps_md(steps_md, work_dir):

  • Writes work_dir/steps_to_complete_the_task.md.

  • Next iteration will use this file to instruct Junie.

This is your feedback loop:

  • Junie changes repo → GPT inspects diff vs acceptance criteria → GPT writes remaining steps → Junie tries again.
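The branching above can be condensed into one helper — act_on_review is a hypothetical name, not the script's, and write_steps_md is passed in so the sketch stays self-contained. It returns an exit code to stop with, or None to run another iteration:

```python
def act_on_review(done, steps_md, work_dir, write_steps_md):
    """Return an exit code to stop with, or None to run another iteration."""
    if done:
        print("Task complete.")
        return 0
    if not steps_md:
        # No remaining steps means the next iteration would be identical;
        # bail out to avoid an infinite loop.
        print("Reviewer gave no remaining steps; stopping.")
        return 1
    # Persist the remaining work so the next Junie pass can follow it.
    write_steps_md(steps_md, work_dir)
    return None
```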

5. Stop after max iterations

If the for loop finishes without returning:

  • It prints:
Reached max iterations without completion. Check the repo and markdown files manually.
  • Then exits (implicitly via fall-through).

So you always end in exactly one of these states:

  1. ✅ GPT thinks the task is complete.
  2. ❌ Not complete, but GPT couldn’t give steps.
  3. ⏱️ Hit the max-iteration limit.