Update the architecture of `JunieOrchestrator.py` based on this:
You give it:
- a natural-language task (for Junie),
- a Git repo URL,
and it will:
- Create a new folder/branch for that task.
- Generate `confirmation.md` (acceptance criteria) using GPT.
- Repeatedly run Junie + GPT review:
  - Junie works on the repo.
  - GPT checks whether the task is “done” according to the checklist.
  - If not done, GPT writes `steps_to_complete_the_task.md` for the next iteration.
- Stop when either:
  - GPT says “done ✅”, or
  - it hits the max number of iterations.
`main()` uses argparse to parse:
- `task` (positional): natural-language description, e.g. "Add login form with password reset".
- `--repo` / `-r`: Git repo URL (optional; if omitted you’re prompted).
- `--base-dir` / `-b`: where to create the task folder (default: `.`).
- `--max-iterations` / `-n`: how many Junie loops to run (default: 5).
- `--model` / `-m`: GPT model for confirmation/review (default: `gpt-4.1-mini`).

Then it:
- Checks `OPENAI_API_KEY` is set; if not, exits.
- Ensures it has a repo URL (arg or prompt); otherwise exits.
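The argument parsing described above can be sketched as follows. Flag names and defaults come from the description; the parser wiring itself is an assumption about the implementation:

```python
import argparse


def parse_args(argv=None):
    """Build the CLI described above (a sketch, not the script's exact code)."""
    parser = argparse.ArgumentParser(
        description="Orchestrate Junie + GPT review loops on a Git repo."
    )
    parser.add_argument(
        "task",
        help='Natural-language task, e.g. "Add login form with password reset"',
    )
    parser.add_argument("--repo", "-r", default=None,
                        help="Git repo URL (prompted if omitted)")
    parser.add_argument("--base-dir", "-b", default=".",
                        help="Where to create the task folder")
    parser.add_argument("--max-iterations", "-n", type=int, default=5,
                        help="Max Junie loops to run")
    parser.add_argument("--model", "-m", default="gpt-4.1-mini",
                        help="GPT model for confirmation/review")
    return parser.parse_args(argv)
```

Accepting an explicit `argv` list (instead of always reading `sys.argv`) keeps the parser easy to unit-test.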
`setup_repo(task_text, repo_url, base_dir)`:
- Slugify the task: lowercase, replace spaces with `-`, strip weird chars. Example: "Add login form" → `add-login-form`.
- Create a folder under `base_dir` with that slug name.
- If the folder already exists → exit to avoid overwriting anything.
- Clone the repo into that folder: `git clone <repo_url> .`
- Create and check out a new branch named after the slug: `git checkout -b add-login-form`
- Return `work_dir` = the path to this new working directory.
From here on, every operation happens inside work_dir.
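A minimal sketch of `setup_repo`, assuming the slug rules above; the exact regex and error handling are illustrative:

```python
import re
import subprocess
import sys
from pathlib import Path


def slugify(text: str) -> str:
    """Lowercase, spaces -> '-', strip anything that isn't alphanumeric or '-'."""
    slug = text.lower().replace(" ", "-")
    return re.sub(r"[^a-z0-9-]", "", slug)


def setup_repo(task_text: str, repo_url: str, base_dir: str) -> Path:
    """Clone the repo into a fresh slug-named folder on a new branch (sketch)."""
    slug = slugify(task_text)
    work_dir = Path(base_dir) / slug
    if work_dir.exists():
        # Refuse to touch an existing folder rather than overwrite work.
        sys.exit(f"Folder already exists, refusing to overwrite: {work_dir}")
    work_dir.mkdir(parents=True)
    subprocess.run(["git", "clone", repo_url, "."], cwd=work_dir, check=True)
    subprocess.run(["git", "checkout", "-b", slug], cwd=work_dir, check=True)
    return work_dir
```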
`generate_confirmation_md(task_text, work_dir, model)`:
- Builds a system message telling GPT: “You define acceptance criteria; output a concise Markdown checklist; each item must be specific and verifiable; only output Markdown.”
- Builds a user message that includes the task text and explicitly asks: “Write a checklist of goals that, if all completed, mean the task is fully done.”
- Calls `run_gpt(...)`: `completion = client.chat.completions.create(...)` and gets the content back.
- Writes that Markdown into `work_dir/confirmation.md`.
- Returns the Markdown string.
This file is your source of truth for when the task is done.
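The confirmation step might look like this. The prompt wording is paraphrased from the description, and `client` is assumed to be an `openai.OpenAI` instance passed in (the real script may construct it differently):

```python
from pathlib import Path

SYSTEM_PROMPT = (
    "You define acceptance criteria. Output a concise Markdown checklist; "
    "each item must be specific and verifiable. Only output Markdown."
)


def build_confirmation_messages(task_text: str) -> list:
    """Chat messages for the confirmation step (wording is illustrative)."""
    user = (
        f"Task: {task_text}\n\n"
        "Write a checklist of goals that, if all completed, "
        "mean the task is fully done."
    )
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user},
    ]


def generate_confirmation_md(task_text, work_dir, model, client):
    """Ask GPT for the checklist and persist it as confirmation.md (sketch)."""
    completion = client.chat.completions.create(
        model=model, messages=build_confirmation_messages(task_text)
    )
    md = completion.choices[0].message.content
    (Path(work_dir) / "confirmation.md").write_text(md, encoding="utf-8")
    return md
```

Splitting the message-building from the API call lets you test the prompts without network access.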
The `for iteration in range(1, max_iterations + 1)` loop is the core.
Inside the loop:
- Iteration 1: Junie gets the original task: `junie_task = task_text`
- Later iterations: Junie gets a meta-task:

  ```python
  junie_task = (
      "Implement all remaining work described in steps_to_complete_the_task.md "
      "so that all goals in confirmation.md are fully satisfied."
  )
  ```

So the first pass says “do the task”; subsequent passes say “follow the remaining steps we wrote down”.
`run_junie(junie_task, work_dir, iteration)`:
- Builds a command: `junie --output-format=text "<junie_task>"`
- Runs it in `work_dir` (via the `run_cmd` wrapper).
- Produces a combined log string containing:
  - the command line
  - the return code
  - STDOUT
  - STDERR
- Writes that log to `work_dir/junie_output_iter_<iteration>.txt`.
- Returns the combined log string.
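One way `run_junie` could be implemented, assuming a `junie` binary on PATH and the log layout described above (the field labels are illustrative):

```python
import subprocess
from pathlib import Path


def format_junie_log(cmd, returncode, stdout, stderr):
    """Combine command, return code, and streams into one log string."""
    return (
        f"COMMAND: {' '.join(cmd)}\n"
        f"RETURN CODE: {returncode}\n"
        f"STDOUT:\n{stdout}\n"
        f"STDERR:\n{stderr}\n"
    )


def run_junie(junie_task, work_dir, iteration):
    """Run the junie CLI in work_dir and persist the combined log (sketch)."""
    cmd = ["junie", "--output-format=text", junie_task]
    proc = subprocess.run(cmd, cwd=work_dir, capture_output=True, text=True)
    log = format_junie_log(cmd, proc.returncode, proc.stdout, proc.stderr)
    out_file = Path(work_dir) / f"junie_output_iter_{iteration}.txt"
    out_file.write_text(log, encoding="utf-8")
    return log
```

Keeping the log formatting in its own function makes it testable without actually invoking `junie`.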
`get_git_diff(work_dir)`:
- Runs `git diff` in that repo.
- Returns the full diff as a string.
This is how GPT “sees” what code changed.
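`get_git_diff` is a thin subprocess wrapper; a sketch:

```python
import subprocess


def get_git_diff(work_dir) -> str:
    """Return the working-tree diff so the reviewer can see what changed."""
    proc = subprocess.run(
        ["git", "diff"], cwd=work_dir, capture_output=True, text=True
    )
    return proc.stdout
```

Note that plain `git diff` only shows unstaged changes against the index; if Junie stages or commits its work, the real script may need `git diff HEAD` or similar to capture everything.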
`evaluate_completion(task_text, confirmation_md, junie_output, git_diff, model)`:
- Builds a strict system prompt saying:
  - You are a strict reviewer.
  - You get: the task, `confirmation.md`, the Junie output, and the git diff.
  - Your job: decide if all checklist items are satisfied.
  - If not, write Markdown for `steps_to_complete_the_task.md`.
  - Respond with only JSON of this shape: `{ "done": true or false, "reason": "...", "steps_md": "..." }`
- Creates a payload JSON with: `task`, `confirmation_md`, `junie_output`, `git_diff`.
- Sends that payload as a user message and parses the response with `json.loads`.
- Returns: `done` (bool), `reason` (string), `steps_md` (Markdown string; empty if done).
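A sketch of the review call, with the JSON parsing split out so it can be tested without hitting the API. The prompt wording and the injected `client` are assumptions:

```python
import json


def parse_review(raw: str):
    """Parse the reviewer's JSON reply into (done, reason, steps_md)."""
    data = json.loads(raw)
    return bool(data["done"]), data.get("reason", ""), data.get("steps_md", "")


def evaluate_completion(task_text, confirmation_md, junie_output,
                        git_diff, model, client):
    """Ask GPT to judge completion against the checklist (sketch)."""
    system = (
        "You are a strict reviewer. You get the task, confirmation.md, "
        "the Junie output, and the git diff. Decide if every checklist item "
        "is satisfied. If not, write Markdown for steps_to_complete_the_task.md. "
        'Respond with only JSON: {"done": true or false, '
        '"reason": "...", "steps_md": "..."}'
    )
    payload = json.dumps({
        "task": task_text,
        "confirmation_md": confirmation_md,
        "junie_output": junie_output,
        "git_diff": git_diff,
    })
    completion = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": payload},
        ],
    )
    return parse_review(completion.choices[0].message.content)
```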
Back in the loop, it prints:
`print(f"Review: done={done}, reason={reason}")`
Then:
- If `done` is `True`:
  - Print a success message.
  - Return from `main()` (exit 0).
- If `done` is `False` and `steps_md` is empty:
  - Print a failure message (to avoid infinite loops).
  - Return from `main()` (exit non-zero).
- If `done` is `False` and `steps_md` has content:
  - Call `write_steps_md(steps_md, work_dir)`, which writes `work_dir/steps_to_complete_the_task.md`.
  - The next iteration will use this file to instruct Junie.
This is your feedback loop:
- Junie changes repo → GPT inspects diff vs acceptance criteria → GPT writes remaining steps → Junie tries again.
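The whole feedback loop can be sketched with the side-effecting steps injected as callables, which keeps the control flow testable without git, the `junie` CLI, or the API. The signatures here are illustrative, not the script's actual ones:

```python
def orchestration_loop(task_text, max_iterations,
                       run_junie, get_git_diff, evaluate, write_steps):
    """Core Junie/GPT feedback loop (sketch with dependencies injected)."""
    for iteration in range(1, max_iterations + 1):
        if iteration == 1:
            junie_task = task_text  # first pass: the original task
        else:
            junie_task = (  # later passes: follow the written-down steps
                "Implement all remaining work described in "
                "steps_to_complete_the_task.md so that all goals in "
                "confirmation.md are fully satisfied."
            )
        junie_output = run_junie(junie_task, iteration)
        done, reason, steps_md = evaluate(junie_output, get_git_diff())
        print(f"Review: done={done}, reason={reason}")
        if done:
            return "done"          # success: exit the loop
        if not steps_md:
            return "no_steps"      # reviewer gave no steps: bail out
        write_steps(steps_md)      # feed the next iteration
    print("Reached max iterations without completion. "
          "Check the repo and markdown files manually.")
    return "max_iterations"
```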
If the `for` loop finishes without returning:
- It prints: "Reached max iterations without completion. Check the repo and markdown files manually."
- Then exits (implicitly, by falling through).
So you always end in exactly one of these states:
- ✅ GPT thinks the task is complete.
- ❌ Not complete, but GPT couldn’t give steps.
- ⏱️ Hit the max-iteration limit.