Skip to content

feat(stdlib): Add CLI parsing helpers#1086

Open
aaronflorey wants to merge 2 commits intoamber-lang:stagingfrom
php-depkit:feat/cli-arg-parser
Open

feat(stdlib): Add CLI parsing helpers#1086
aaronflorey wants to merge 2 commits intoamber-lang:stagingfrom
php-depkit:feat/cli-arg-parser

Conversation

@aaronflorey
Copy link
Copy Markdown

@aaronflorey aaronflorey commented Apr 17, 2026

Summary

Add a new std/cli module for common argument parsing patterns in Amber scripts.

This includes helpers for:

  • normalizing raw_args with cli_args
  • matching commands from an allowed set
  • consuming long and short flags
  • consuming long and short options
  • consuming positional arguments
  • collecting all remaining positional arguments
  • asserting that parsing finished without leftovers

This also fixes terminator handling so a lone -- is treated as syntax rather than an unconsumed argument once parsing is complete.

Tests

  • add stdlib fixture coverage for commands, flags, options, positionals, and --
  • add tests for cli_args(raw_args)
  • add an end-to-end CLI test for forwarded script arguments

Validation

  • cargo test cli_
  • cargo test cli_take_option_
  • cargo test cli_take_flag_
  • cargo test cli_assert_empty
  • cargo test test_cli_file_starting_with_dash
  • cargo test test_cli_raw_args_forwarded_to_script

Summary by CodeRabbit

  • New Features

    • Added a CLI parsing library: drops the script name, handles long/short flags and named options (inline or separate values), subcommands, repeated options, and positional arguments; respects the -- terminator and validates missing values.
  • Tests

    • Extensive tests covering argument forwarding, terminator behavior, flag/option/positional success and failure cases, repeated options, and end-of-input assertion.

Add stdlib helpers for parsing commands, flags, options, and positionals from Amber scripts. Cover the new parsing behavior with stdlib fixtures and CLI tests so forwarded arguments and terminator handling stay stable.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c07ee850-14fe-4298-9c9f-3f12a1de69db

📥 Commits

Reviewing files that changed from the base of the PR and between 76e366b and bb440aa.

📒 Files selected for processing (6)
  • src/std/cli.ab
  • src/tests/cli.rs
  • src/tests/stdlib/cli_assert_empty.ab
  • src/tests/stdlib/cli_take_command_mismatch.ab
  • src/tests/stdlib/cli_take_option_failures.ab
  • src/tests/stdlib/cli_take_option_repeated.ab
✅ Files skipped from review due to trivial changes (2)
  • src/tests/stdlib/cli_take_command_mismatch.ab
  • src/tests/stdlib/cli_assert_empty.ab
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/tests/stdlib/cli_take_option_repeated.ab
  • src/tests/stdlib/cli_take_option_failures.ab
  • src/tests/cli.rs
  • src/std/cli.ab

📝 Walkthrough

Walkthrough

Added a new standard-library CLI parser (src/std/cli.ab) with functions to drop the executable token, detect/consume flags/options/commands/positionals, handle -- terminator and inline --name=value, plus comprehensive tests and test helpers exercising parsing and forwarding behavior.

Changes

Cohort / File(s) Summary
CLI Standard Library Module
src/std/cli.ab
New module implementing token classification (long/short/terminator), splitting --name=value, cli_args, mutating parsers take_flag, take_option, take_command, take_positional, take_all_positionals, and cli_assert_empty.
Test Harness / Runner
src/tests/cli.rs
Added helpers compile_without_messages, eval_bash_with_args, write_temp_amber_script; refactored existing test to use them; added integration tests that invoke Amber CLI and verify argument forwarding.
Stdlib CLI Tests
src/tests/stdlib/...
src/tests/stdlib/cli_args.ab, .../cli_assert_empty.ab, .../cli_flag_absent.ab, .../cli_take_flag_long.ab, .../cli_take_flag_short.ab, .../cli_take_flag_stops_at_terminator.ab, .../cli_option_terminator.ab, .../cli_take_all_positionals.ab, .../cli_take_all_positionals_after_terminator.ab, .../cli_take_command.ab, .../cli_take_command_mismatch.ab, .../cli_take_command_option_first.ab, .../cli_take_option_failures.ab, .../cli_take_option_long_inline.ab, .../cli_take_option_long_separate.ab, .../cli_take_option_repeated.ab, .../cli_take_option_short.ab, .../cli_take_option_short_dash_value.ab, .../cli_take_option_stops_at_terminator.ab, .../cli_take_positional.ab, .../cli_take_positional_missing.ab, .../cli_take_positional_mixed_args.ab, .../cli_take_positional_terminator_missing.ab
New comprehensive unit/integration tests covering: cli_args behavior, flag parsing (long/short/absent/after-terminator), option parsing (inline/separate/short/repeated/failure cases/after-terminator), command extraction (allowed/mismatch/missing), positional extraction (mixed/after-terminator/missing), and take_all_positionals behavior.

Sequence Diagram(s)

(Skipped — changes are library + tests without a multi-component runtime flow that benefits from a sequence diagram.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~28 minutes

Suggested reviewers

  • Ph0enixKM
  • Mte90
  • callframe

Poem

🐰 Hopping through args both near and far,
Flags and values caught like a star,
Terminators kept at bay,
Positionals find their way,
Amber's CLI hops home—hip hooray! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 72.73% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(stdlib): Add CLI parsing helpers' accurately and concisely describes the main change: adding a new CLI parsing module to the standard library.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (8)
src/std/cli.ab (3)

82-88: LGTM, with a tiny simplification option.

Works correctly for 0/1-element inputs. Optional micro-nit: the const args = [Text] local is unnecessary — return [Text] reads the same.

♻️ Optional
-    if len(raw_args) <= 1 {
-        const args = [Text]
-        return args
-    }
+    if len(raw_args) <= 1 {
+        return [Text]
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/std/cli.ab` around lines 82 - 88, In the cli_args function remove the
unnecessary local const args and return the empty array literal directly;
specifically, inside pub fun cli_args(raw_args: [Text]) replace the block that
declares const args = [Text] and returns args with a direct return [Text] so
cli_args returns an empty [Text] when len(raw_args) <= 1 while keeping the rest
of the function (the slice return raw_args[1..len(raw_args)]) unchanged.

238-246: Unreachable return positionals after while true.

The while true loop can only exit via the failed { return positionals } block, making line 245 dead code. Minor cleanup:

♻️ Cleanup
     let positionals = [Text]
     while true {
         const positional = take_positional(args) failed {
             return positionals
         }
         positionals += [positional]
     }
-    return positionals
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/std/cli.ab` around lines 238 - 246, The final `return positionals` is
unreachable because the `while true` loop only exits via the `failed { return
positionals }` branch; remove the dead `return positionals` statement after the
loop (keep the loop and the `failed { return positionals }` behavior) and ensure
the loop body uses `take_positional(args)` as before; references: the
`positionals` variable, the `while true` loop, and the `take_positional` call.

124-152: Minor: redundant long_eq and caller-dependent arg removal on inline form.

Two small notes:

  1. long_eq is computed here (line 125) and recomputed inside find_option_index (line 49) when allow_inline_long=true. Not a correctness issue, just duplication.
  2. On inline-failure branches (parts[1] == ""), args is left untouched, but on separate-value failure (missing value, value is --) args is also untouched — good and consistent. Worth adding a short doc sentence that on failure args is unmodified so callers can provide fallback logic without worrying about partial consumption.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/std/cli.ab` around lines 124 - 152, In take_option, remove the redundant
computation of long_eq (it's already computed by find_option_index when
allow_inline_long=true) and stop using the local long_eq variable; rely on
find_option_index(long, short, true) and use split_long_option(arg) only when
needed. Also add a short doc comment on the take_option function stating that on
any failure path (inline form with empty value, missing value, or value being an
option terminator) the args array is left unmodified so callers can safely
perform fallback logic without worrying about partial consumption.
src/tests/stdlib/cli_take_command_mismatch.ab (1)

1-11: LGTM.

Exercises the not array_contains(commands, command) failure branch. Note: this test does not verify that args is left unchanged on failure (args should still contain "deploy"); consider adding an echo(args) line for completeness if you want to guard that contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tests/stdlib/cli_take_command_mismatch.ab` around lines 1 - 11, The test
exercises the failure branch of take_command but doesn't assert that the input
args remains unchanged; update the test in main to echo the args variable after
the failed take_command call so it verifies args still contains "deploy"
(reference the main function, the args binding, and the take_command invocation)
— add a line like an echo(args) after the failed block to confirm the contract.
src/tests/stdlib/cli_take_option_repeated.ab (1)

7-12: Consider extending to verify second consumption.

The test validates only the first take_option call. A follow-up call (take_option(args, "--output")? returning "two") would strengthen coverage of repeated-option semantics and ensure the second occurrence is consumable after the first is removed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tests/stdlib/cli_take_option_repeated.ab` around lines 7 - 12, The test
only checks the first consumption of the "--output" option; add a second call to
take_option(args, "--output")? (e.g., assign to second) and assert it equals
"two" and that args no longer contains those options (update echo lines to show
second and the final args length/content). Locate main, the args array, and the
existing const first = take_option(args, "--output")? and add the follow-up
const second = take_option(args, "--output")? plus appropriate echo checks for
second and remaining args.
src/tests/stdlib/cli_assert_empty.ab (1)

23-26: Final args state is not asserted after the failed branch.

The leftover case verifies the failure branch is taken, but does not assert that cli_assert_empty left args unchanged (still ["build"]). Consider echoing rest=({len(args)}) [{args}] after the block — consistent with other tests in this suite — to lock in non-mutation on failure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tests/stdlib/cli_assert_empty.ab` around lines 23 - 26, The test calls
cli_assert_empty(args) and checks the failed branch but doesn't assert that args
remains unchanged; after the failed block add an echo/assert that prints the
final args state (e.g., echo "rest=({len(args)}) [{args}]") to ensure
cli_assert_empty did not mutate args; reference the args variable and the
cli_assert_empty(...) failed { ... } block when inserting this post-block
assertion.
src/tests/stdlib/cli_take_option_failures.ab (1)

14-22: Consider also testing --output= with a short alias and value-is-terminator case.

The three failures covered are good. Two adjacent negative cases worth adding for completeness: (1) short-form missing value (e.g. ["build", "-o"] with take_option(args, "--output", "-o")), and (2) value equal to the terminator ["build", "--output", "--"], which per src/std/cli.ab:143-145 should also fail. Optional — skip if covered elsewhere.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tests/stdlib/cli_take_option_failures.ab` around lines 14 - 22, Add two
negative test cases to the file covering short-form alias and
terminator-as-value: add args = ["build", "-o"] and call take_option(args,
"--output", "-o") expected to fail (e.g., failed { echo("short option missing
value") }) and add args = ["build", "--output", "--"] with take_option(args,
"--output") expected to fail (e.g., failed { echo("option value is terminator")
}); place them alongside the existing cases so take_option is exercised for both
the short alias "-o" and the terminator-value scenario.
src/tests/cli.rs (1)

18-38: Shell-escaping in eval_bash_with_args is fragile.

The current escaping only handles " but wraps each arg in double quotes, so values containing \, $, `, or ! would be interpreted by the shell rather than passed literally. It works for the present callers (one/two/three and empty) but is a footgun for future tests that exercise CLI parsing with special characters (which is exactly what this PR enables). Consider single-quote escaping (''\'') or using shell_words::quote for robustness.

♻️ Suggested safer escaping
-    let quoted_args = args
-        .iter()
-        .map(|arg| format!("\"{}\"", arg.replace('"', "\\\"")))
-        .collect::<Vec<String>>();
+    let quoted_args = args
+        .iter()
+        .map(|arg| format!("'{}'", arg.replace('\'', "'\\''")))
+        .collect::<Vec<String>>();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tests/cli.rs` around lines 18 - 38, The function eval_bash_with_args
currently builds a shell command by wrapping each arg in double quotes and only
escaping double quotes, which allows characters like \, $, `, ! to be
interpreted by the shell; replace the fragile quoting with a robust
shell-escaping strategy inside eval_bash_with_args (e.g., use shell_words::quote
on each arg or implement single-quote escaping by replacing ' with '\'' and
wrapping each arg in single quotes) before composing the set -- ... string, so
the call built for AmberCompiler::find_shell(...).arg("-c").arg(bash_code)
passes literal argument values to the invoked shell.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/std/cli.ab`:
- Around line 82-88: In the cli_args function remove the unnecessary local const
args and return the empty array literal directly; specifically, inside pub fun
cli_args(raw_args: [Text]) replace the block that declares const args = [Text]
and returns args with a direct return [Text] so cli_args returns an empty [Text]
when len(raw_args) <= 1 while keeping the rest of the function (the slice return
raw_args[1..len(raw_args)]) unchanged.
- Around line 238-246: The final `return positionals` is unreachable because the
`while true` loop only exits via the `failed { return positionals }` branch;
remove the dead `return positionals` statement after the loop (keep the loop and
the `failed { return positionals }` behavior) and ensure the loop body uses
`take_positional(args)` as before; references: the `positionals` variable, the
`while true` loop, and the `take_positional` call.
- Around line 124-152: In take_option, remove the redundant computation of
long_eq (it's already computed by find_option_index when allow_inline_long=true)
and stop using the local long_eq variable; rely on find_option_index(long,
short, true) and use split_long_option(arg) only when needed. Also add a short
doc comment on the take_option function stating that on any failure path (inline
form with empty value, missing value, or value being an option terminator) the
args array is left unmodified so callers can safely perform fallback logic
without worrying about partial consumption.

In `@src/tests/cli.rs`:
- Around line 18-38: The function eval_bash_with_args currently builds a shell
command by wrapping each arg in double quotes and only escaping double quotes,
which allows characters like \, $, `, ! to be interpreted by the shell; replace
the fragile quoting with a robust shell-escaping strategy inside
eval_bash_with_args (e.g., use shell_words::quote on each arg or implement
single-quote escaping by replacing ' with '\'' and wrapping each arg in single
quotes) before composing the set -- ... string, so the call built for
AmberCompiler::find_shell(...).arg("-c").arg(bash_code) passes literal argument
values to the invoked shell.

In `@src/tests/stdlib/cli_assert_empty.ab`:
- Around line 23-26: The test calls cli_assert_empty(args) and checks the failed
branch but doesn't assert that args remains unchanged; after the failed block
add an echo/assert that prints the final args state (e.g., echo
"rest=({len(args)}) [{args}]") to ensure cli_assert_empty did not mutate args;
reference the args variable and the cli_assert_empty(...) failed { ... } block
when inserting this post-block assertion.

In `@src/tests/stdlib/cli_take_command_mismatch.ab`:
- Around line 1-11: The test exercises the failure branch of take_command but
doesn't assert that the input args remains unchanged; update the test in main to
echo the args variable after the failed take_command call so it verifies args
still contains "deploy" (reference the main function, the args binding, and the
take_command invocation) — add a line like an echo(args) after the failed block
to confirm the contract.

In `@src/tests/stdlib/cli_take_option_failures.ab`:
- Around line 14-22: Add two negative test cases to the file covering short-form
alias and terminator-as-value: add args = ["build", "-o"] and call
take_option(args, "--output", "-o") expected to fail (e.g., failed { echo("short
option missing value") }) and add args = ["build", "--output", "--"] with
take_option(args, "--output") expected to fail (e.g., failed { echo("option
value is terminator") }); place them alongside the existing cases so take_option
is exercised for both the short alias "-o" and the terminator-value scenario.

In `@src/tests/stdlib/cli_take_option_repeated.ab`:
- Around line 7-12: The test only checks the first consumption of the "--output"
option; add a second call to take_option(args, "--output")? (e.g., assign to
second) and assert it equals "two" and that args no longer contains those
options (update echo lines to show second and the final args length/content).
Locate main, the args array, and the existing const first = take_option(args,
"--output")? and add the follow-up const second = take_option(args, "--output")?
plus appropriate echo checks for second and remaining args.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7b127789-02bb-424c-8b13-8ca3d7ef35e1

📥 Commits

Reviewing files that changed from the base of the PR and between 960c5c5 and 76e366b.

📒 Files selected for processing (25)
  • src/std/cli.ab
  • src/tests/cli.rs
  • src/tests/stdlib/cli_args.ab
  • src/tests/stdlib/cli_assert_empty.ab
  • src/tests/stdlib/cli_flag_absent.ab
  • src/tests/stdlib/cli_option_terminator.ab
  • src/tests/stdlib/cli_take_all_positionals.ab
  • src/tests/stdlib/cli_take_all_positionals_after_terminator.ab
  • src/tests/stdlib/cli_take_command.ab
  • src/tests/stdlib/cli_take_command_mismatch.ab
  • src/tests/stdlib/cli_take_command_option_first.ab
  • src/tests/stdlib/cli_take_flag_long.ab
  • src/tests/stdlib/cli_take_flag_short.ab
  • src/tests/stdlib/cli_take_flag_stops_at_terminator.ab
  • src/tests/stdlib/cli_take_option_failures.ab
  • src/tests/stdlib/cli_take_option_long_inline.ab
  • src/tests/stdlib/cli_take_option_long_separate.ab
  • src/tests/stdlib/cli_take_option_repeated.ab
  • src/tests/stdlib/cli_take_option_short.ab
  • src/tests/stdlib/cli_take_option_short_dash_value.ab
  • src/tests/stdlib/cli_take_option_stops_at_terminator.ab
  • src/tests/stdlib/cli_take_positional.ab
  • src/tests/stdlib/cli_take_positional_missing.ab
  • src/tests/stdlib/cli_take_positional_mixed_args.ab
  • src/tests/stdlib/cli_take_positional_terminator_missing.ab

Comment thread src/std/cli.ab
Simplify the CLI helpers and remove dead parsing code so failure paths stay predictable. Add regression coverage for literal shell arguments and argument preservation across CLI parser failures.
@Mte90
Copy link
Copy Markdown
Member

Mte90 commented Apr 17, 2026

I am wondering if this kind of stuff it should be Amber native and not a stdlib.
We already have args so we can improve internally.

@aaronflorey
Copy link
Copy Markdown
Author

I am wondering if this kind of stuff it should be Amber native and not a stdlib.

We already have args so we can improve internally.

I thought so too. But I saw an earlier PR or issue where the consensus was that it would be easier to approve it it was written in amber. I'm happy to work on a native version though, might be cleaner.

@Mte90
Copy link
Copy Markdown
Member

Mte90 commented Apr 17, 2026

I think that we should discuss about your PR before what stuff make sense that is inside the compiler and what should be a builtin or a stdlib.
If you can do a first analysis we can see and move further, we are trying to do the new major release so we can focus again on new stuff.

@aaronflorey
Copy link
Copy Markdown
Author

That makes sense.

My first pass would be:

  • compiler/runtime: how args / raw_args are exposed, script-name handling, and any platform-specific normalization
  • builtin: only lower-level primitives that are awkward or impossible to express cleanly in Amber and would be broadly useful elsewhere
  • stdlib: the higher-level parsing helpers (take_flag, take_option, take_command, take_positional, take_all_positionals, cli_assert_empty), since they are just operations on [Text] and are easier to iterate on at the language level

So if args gets improved internally, cli_args is probably the first piece that could disappear, while the parsing helpers still feel more like stdlib material to me.

I started with the stdlib version mainly to validate the API and semantics first without baking that behavior into the compiler too early. If the consensus is that some part of this should move lower, I’m happy to split or rework the PR in that direction.

Given the focus on the new major release, I can write up this analysis in the PR first so we can agree on the boundary before I keep pushing implementation changes.

@Mte90
Copy link
Copy Markdown
Member

Mte90 commented Apr 20, 2026

@Ph0enixKM @Tirito6626 @KrosFire what do you think of this proposal for the next major?

I think that we need a ticket to address those and 3 different PRs following your order

@Tirito6626
Copy link
Copy Markdown
Member

Tirito6626 commented Apr 20, 2026

i think that we need something like getopt like bash and C have, but im not sure whether it should be as builtin or stdlib and/or we need that many helpers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants