Notes on Agentic AI from the Enablement week #972
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
- Configuration used: defaults
- Review profile: CHILL
- Plan: Pro

📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

This PR updates the Agentic AI Suite documentation: it adds a “Where your data lives” section clarifying storage locations, a “Schema-aware, data-private” subsection for AQLizer explaining schema-only metadata use, and expands the Reasoner workflow into an iterative, read-only query-optimization agent using EXPLAIN/PROFILE and validation.
Sequence diagram:

```mermaid
sequenceDiagram
    participant User
    participant AQLizer as AQLizer (LLM translator)
    participant Reasoner as Reasoner (optimizer)
    participant ArangoDB as ArangoDB (DB)
    participant FileMgr as File Manager (Object Storage)
    User->>AQLizer: Submit natural language query
    AQLizer->>ArangoDB: Request schema/index metadata (no raw data)
    ArangoDB-->>AQLizer: Return schema & index info
    AQLizer->>ArangoDB: Send generated AQL query
    ArangoDB-->>Reasoner: EXPLAIN/PROFILE for query (invoked by Reasoner)
    Reasoner->>ArangoDB: Inspect indexes & collection stats (tools)
    Reasoner->>Reasoner: Rewrite query / create alternative plans
    Reasoner->>ArangoDB: Execute alternatives (read-only) and PROFILE results
    ArangoDB-->>Reasoner: Execution profiles & results
    Reasoner->>Reasoner: Compare alternatives, validate identical outputs
    Reasoner->>AQLizer: Return optimized query/result selected
    Note right of FileMgr: Raw uploaded binaries (PDF/images/docs) -> stored here
    Note right of ArangoDB: Structured artifacts and extracted data -> stored here
```
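The loop in the diagram (explain, rewrite, profile, validate) can be sketched in plain Python. This is an illustrative sketch only, not the Reasoner's implementation: the candidate dictionaries merely mimic the `estimatedCost` field of ArangoDB's EXPLAIN output, and the helper names (`pick_cheapest`, `results_equivalent`) are invented for the example.

```python
import json

def pick_cheapest(candidates):
    # Select the rewrite with the lowest estimated cost, as an
    # EXPLAIN-style plan summary would report it.
    return min(candidates, key=lambda c: c["estimatedCost"])

def results_equivalent(rows_a, rows_b):
    # Canonicalize each document (sorted keys) and ignore row order.
    # This check is only meaningful for deterministic, read-only queries.
    canon = lambda rows: sorted(json.dumps(r, sort_keys=True) for r in rows)
    return canon(rows_a) == canon(rows_b)

# Hypothetical EXPLAIN summaries: the original query and one rewrite.
candidates = [
    {"query": "FOR d IN docs FILTER d.x == 1 RETURN d", "estimatedCost": 120.0},
    {"query": "FOR d IN docs FILTER d.x == 1 RETURN d", "estimatedCost": 4.0},
]
best = pick_cheapest(candidates)

# Validation step: same documents, regardless of row or key order.
original = [{"x": 1, "name": "a"}, {"x": 1, "name": "b"}]
optimized = [{"name": "b", "x": 1}, {"name": "a", "x": 1}]
assert results_equivalent(original, optimized)
```

In a real deployment the cost figures would come from the database's explain endpoint and the row sets from read-only executions of each candidate query.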
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~5 minutes
🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed (inconclusive)
🧹 Nitpick comments (3)
site/content/agentic-ai-suite/reasoner/_index.md (1)
19-21: Soften the absolute “identical results” guarantee. At line 20, the wording is absolute. Consider qualifying this to deterministic/read-only queries (or “validated where applicable”) to keep behavior claims precise.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site/content/agentic-ai-suite/reasoner/_index.md` around lines 19-21, update the absolute claim in the Reasoner description: change the sentence that currently reads "The final optimized query is validated to ensure it returns identical results to the original" to a qualified form such as "The final optimized query is validated, where applicable, to ensure it returns equivalent results for deterministic/read-only queries" (or similar wording that limits the guarantee to deterministic/read-only cases) so the Reasoner description is precise and does not promise absolute identical results for non-deterministic queries.

site/content/agentic-ai-suite/natural-language-to-aql/_index.md (1)
41-45: Add a deployment caveat for schema metadata transmission. Line 43 states only schema metadata is sent to the LLM, which is good. Consider adding one sentence that metadata handling depends on the configured provider (public vs. private endpoint), to avoid an over-broad privacy interpretation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site/content/agentic-ai-suite/natural-language-to-aql/_index.md` around lines 41-45, add a short caveat sentence after the AQLizer description clarifying that which schema metadata is transmitted to an LLM depends on the configured provider/endpoint (e.g., public vs. private/self-hosted) and that using a private endpoint or on-prem provider can keep metadata within your controlled network; update the paragraph mentioning "AQLizer" so it explicitly states metadata handling varies by provider configuration and privacy guarantees.

site/content/agentic-ai-suite/_index.md (1)
58-67: Clarify the absolute data-boundary claim to avoid contradiction. Line 60 says data “never leaves the database,” but lines 62-65 immediately define an exception for uploaded binaries in object storage. Please scope the claim to structured/extracted data only.
Proposed wording tweak:

```diff
-Everything the Agentic AI Suite produces (knowledge graphs, embeddings,
-analytics results, query history) is persisted in ArangoDB. Your data
-never leaves the database.
+Everything the Agentic AI Suite produces (knowledge graphs, embeddings,
+analytics results, query history) is persisted in ArangoDB.
+Structured data produced by the suite remains in ArangoDB.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site/content/agentic-ai-suite/_index.md` around lines 58 - 67, The page currently states “Your data never leaves the database” but immediately documents an exception for uploaded raw files; update the wording to scope the “never leaves” claim to structured/extracted data only (e.g., change the sentence that begins “Everything the Agentic AI Suite produces (knowledge graphs, embeddings, analytics results, query history) is persisted in ArangoDB. Your data never leaves the database.” to explicitly say that structured/extracted data is persisted in ArangoDB and remains in the database, while raw uploaded binaries (PDFs, images, office documents) are stored in object storage and managed by the File Manager service).
ℹ️ Review info
⚙️ Run configuration
- Configuration used: defaults
- Review profile: CHILL
- Plan: Pro
- Run ID: 8cefca80-dada-4a17-87dc-7326ac1dc087

📒 Files selected for processing (3)
- site/content/agentic-ai-suite/_index.md
- site/content/agentic-ai-suite/natural-language-to-aql/_index.md
- site/content/agentic-ai-suite/reasoner/_index.md
> inefficient queries, generates optimized versions, and validates them by
> comparing results with the original query to ensure correctness.
> The Reasoner is an AI-powered query optimization agent for ArangoDB that
> automatically improves the performance of AQL queries. When you submit a query,

Using "automatically" might be a bit misleading; users need to set up the service first and then start it via the web interface or the API, so it is not really an automatic process.
> The one exception is raw files (PDFs, images, office documents, and other
> binaries) that you upload for processing. These are stored in object storage
> (S3, MinIO, or another blob store) and managed through the
> [File Manager](../platform-suite/file-manager/_index.md) service.

I think we have to explain better the difference between data living in the ArangoDB core database and the Data Platform. Besides that, the File Manager also stores the files uploaded via the Container Manager (Bring Your Own Code), so we have to be more specific.
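The split the reviewer wants spelled out can be illustrated with a small routing sketch. Everything here is hypothetical (the function name, the bucket prefix, the collection name are invented for the example, not the platform's API); it only shows the boundary being described: raw binaries go to object storage, structured extractions go to the database.

```python
# Hypothetical routing: raw uploaded binaries -> object storage
# (managed by the File Manager); structured data extracted from
# them -> an ArangoDB collection.
def route_artifact(filename: str, extracted: bool = False):
    """Return a (store, location) pair for an uploaded artifact."""
    if extracted:
        # Entities, embeddings, tables pulled out of the file.
        return ("arangodb", "extracted_data")
    # Raw uploads -- including code packages from Bring Your Own Code --
    # land in object storage regardless of file type.
    return ("object-storage", f"uploads/{filename}")

raw = route_artifact("report.pdf")
structured = route_artifact("report.pdf", extracted=True)
```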
> comparing results with the original query to ensure correctness.
> The Reasoner is an AI-powered query optimization agent for ArangoDB that helps
> improve the performance of AQL queries. Once the service is set up and started,
> you can submit a query and the Reasoner agent inspects it by running `EXPLAIN`

Suggested change:

```diff
-you can submit a query and the Reasoner agent inspects it by running `EXPLAIN`
+you can submit a query and the Reasoner agent inspects it by running **Explain**
```
> The Reasoner is an AI-powered query optimization agent for ArangoDB that helps
> improve the performance of AQL queries. Once the service is set up and started,
> you can submit a query and the Reasoner agent inspects it by running `EXPLAIN`
> and `PROFILE`, calls tools to examine available indexes and collection

Suggested change:

```diff
-and `PROFILE`, calls tools to examine available indexes and collection
+and **Profile**, calls tools to examine available indexes and collection
```
> ### Schema-aware, data-private
>
> AQLizer is fully schema-aware: it inspects your collections, indexes, and
> document structure so that generated queries are accurate and efficient. The LLM

As discussed, it would be nice to add a blurb about how it figures out the document structure, because there is no schema (custom sampling?).
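One plausible answer to the question is structure inference by sampling: read a bounded number of documents per collection and record only attribute names and observed types, never the values. A minimal sketch under that assumption (the function name and sample size are invented here, not AQLizer's actual mechanism):

```python
from collections import defaultdict

def infer_structure(sample_docs, max_samples=100):
    """Infer a pseudo-schema for a schemaless collection from sampled
    documents: attribute name -> sorted list of observed JSON types."""
    type_names = {
        str: "string", bool: "bool", int: "number", float: "number",
        list: "array", dict: "object", type(None): "null",
    }
    schema = defaultdict(set)
    for doc in sample_docs[:max_samples]:
        for key, value in doc.items():
            schema[key].add(type_names.get(type(value), "unknown"))
    # Only names and types leave this function -- no document values.
    return {key: sorted(types) for key, types in schema.items()}

schema = infer_structure([
    {"name": "Ada", "age": 36},
    {"name": "Bob", "tags": ["vip"]},
])
```

The resulting map contains attribute names and types only, which is the kind of metadata a schema-aware translator could safely pass to an LLM.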
> File Manager also holds the code packages uploaded through the Container
> Manager's
> [Bring Your Own Code](../platform-suite/container-manager/_index.md#bring-your-own-code)
> flow, so its contents are not exclusive to the Agentic AI Suite. Any structured data extracted from uploaded files

Suggested change:

```diff
-flow, so its contents are not exclusive to the Agentic AI Suite. Any structured data extracted from uploaded files
+flow, so its contents are not exclusive to the Agentic AI Suite.
+Any structured data extracted from uploaded files
```