Notes on Agentic AI from the Enablement week #972
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
- Configuration used: defaults
- Review profile: CHILL
- Plan: Pro

📒 Files selected for processing (2)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

This PR updates the Agentic AI Suite documentation: it adds a “Where your data lives” section clarifying storage locations, a “Schema-aware, data-private” subsection for AQLizer explaining schema-only metadata use, and expands the Reasoner workflow into an iterative, read-only query-optimization agent using EXPLAIN/PROFILE and validation.
Sequence diagram:

```mermaid
sequenceDiagram
    participant User
    participant AQLizer as AQLizer (LLM translator)
    participant Reasoner as Reasoner (optimizer)
    participant ArangoDB as ArangoDB (DB)
    participant FileMgr as File Manager (Object Storage)
    User->>AQLizer: Submit natural language query
    AQLizer->>ArangoDB: Request schema/index metadata (no raw data)
    ArangoDB-->>AQLizer: Return schema & index info
    AQLizer->>ArangoDB: Send generated AQL query
    ArangoDB-->>Reasoner: EXPLAIN/PROFILE for query (invoked by Reasoner)
    Reasoner->>ArangoDB: Inspect indexes & collection stats (tools)
    Reasoner->>Reasoner: Rewrite query / create alternative plans
    Reasoner->>ArangoDB: Execute alternatives (read-only) and PROFILE results
    ArangoDB-->>Reasoner: Execution profiles & results
    Reasoner->>Reasoner: Compare alternatives, validate identical outputs
    Reasoner->>AQLizer: Return optimized query/result selected
    Note right of FileMgr: Raw uploaded binaries (PDF/images/docs) -> stored here
    Note right of ArangoDB: Structured artifacts and extracted data -> stored here
```
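The loop in the diagram (explain, rewrite, profile, validate) can be sketched in plain Python. This is an illustrative sketch only, not the Reasoner's implementation: the candidate dictionaries merely mimic the `estimatedCost` field of ArangoDB's EXPLAIN output, and the helper names (`pick_cheapest`, `results_equivalent`) are invented for the example.

```python
import json

def pick_cheapest(candidates):
    # Select the rewrite with the lowest estimated cost, as an
    # EXPLAIN-style plan summary would report it.
    return min(candidates, key=lambda c: c["estimatedCost"])

def results_equivalent(rows_a, rows_b):
    # Canonicalize each document (sorted keys) and ignore row order.
    # This check is only meaningful for deterministic, read-only queries.
    canon = lambda rows: sorted(json.dumps(r, sort_keys=True) for r in rows)
    return canon(rows_a) == canon(rows_b)

# Hypothetical EXPLAIN summaries: the original query and one rewrite.
candidates = [
    {"query": "FOR d IN docs FILTER d.x == 1 RETURN d", "estimatedCost": 120.0},
    {"query": "FOR d IN docs FILTER d.x == 1 RETURN d", "estimatedCost": 4.0},
]
best = pick_cheapest(candidates)

# Validation step: same documents, regardless of row or key order.
original = [{"x": 1, "name": "a"}, {"x": 1, "name": "b"}]
optimized = [{"name": "b", "x": 1}, {"name": "a", "x": 1}]
assert results_equivalent(original, optimized)
```

In a real deployment the cost figures would come from the database's explain endpoint and the row sets from read-only executions of each candidate query.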
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~5 minutes
🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed (inconclusive)
🧹 Nitpick comments (3)
site/content/agentic-ai-suite/reasoner/_index.md (1)
19-21: Soften the absolute “identical results” guarantee. At line 20, the wording is absolute. Consider qualifying this to deterministic/read-only queries (or “validated where applicable”) to keep behavior claims precise.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site/content/agentic-ai-suite/reasoner/_index.md` around lines 19-21, update the absolute claim in the Reasoner description: change the sentence that currently reads "The final optimized query is validated to ensure it returns identical results to the original" to a qualified form such as "The final optimized query is validated, where applicable, to ensure it returns equivalent results for deterministic/read-only queries" (or similar wording that limits the guarantee to deterministic/read-only cases) so the Reasoner description is precise and does not promise absolute identical results for non-deterministic queries.

site/content/agentic-ai-suite/natural-language-to-aql/_index.md (1)
41-45: Add a deployment caveat for schema metadata transmission. Line 43 states only schema metadata is sent to the LLM, which is good. Consider adding one sentence that metadata handling depends on the configured provider (public vs. private endpoint), to avoid an over-broad privacy interpretation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site/content/agentic-ai-suite/natural-language-to-aql/_index.md` around lines 41-45, add a short caveat sentence after the AQLizer description clarifying that which schema metadata is transmitted to an LLM depends on the configured provider/endpoint (e.g., public vs. private/self-hosted) and that using a private endpoint or on-prem provider can keep metadata within your controlled network; update the paragraph mentioning "AQLizer" so it explicitly states metadata handling varies by provider configuration and privacy guarantees.

site/content/agentic-ai-suite/_index.md (1)
58-67: Clarify the absolute data-boundary claim to avoid contradiction. Line 60 says data “never leaves the database,” but lines 62-65 immediately define an exception for uploaded binaries in object storage. Please scope the claim to structured/extracted data only.
Proposed wording tweak:

```diff
-Everything the Agentic AI Suite produces (knowledge graphs, embeddings,
-analytics results, query history) is persisted in ArangoDB. Your data
-never leaves the database.
+Everything the Agentic AI Suite produces (knowledge graphs, embeddings,
+analytics results, query history) is persisted in ArangoDB.
+Structured data produced by the suite remains in ArangoDB.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@site/content/agentic-ai-suite/_index.md` around lines 58 - 67, The page currently states “Your data never leaves the database” but immediately documents an exception for uploaded raw files; update the wording to scope the “never leaves” claim to structured/extracted data only (e.g., change the sentence that begins “Everything the Agentic AI Suite produces (knowledge graphs, embeddings, analytics results, query history) is persisted in ArangoDB. Your data never leaves the database.” to explicitly say that structured/extracted data is persisted in ArangoDB and remains in the database, while raw uploaded binaries (PDFs, images, office documents) are stored in object storage and managed by the File Manager service).
ℹ️ Review info
⚙️ Run configuration
- Configuration used: defaults
- Review profile: CHILL
- Plan: Pro
- Run ID: 8cefca80-dada-4a17-87dc-7326ac1dc087

📒 Files selected for processing (3)
- site/content/agentic-ai-suite/_index.md
- site/content/agentic-ai-suite/natural-language-to-aql/_index.md
- site/content/agentic-ai-suite/reasoner/_index.md
> inefficient queries, generates optimized versions, and validates them by
> comparing results with the original query to ensure correctness.
> The Reasoner is an AI-powered query optimization agent for ArangoDB that
> automatically improves the performance of AQL queries. When you submit a query,

Using "automatically" might be a bit misleading; users need to set up the service first and then start it via the web interface or the API, so it is not really an automatic process.
> The one exception is raw files (PDFs, images, office documents, and other
> binaries) that you upload for processing. These are stored in object storage
> (S3, MinIO, or another blob store) and managed through the
> [File Manager](../platform-suite/file-manager/_index.md) service.

I think we have to explain better the difference between data living in the ArangoDB core database and the Data Platform. Besides that, the File Manager also stores the files uploaded via the Container Manager (Bring Your Own Code), so we have to be more specific.
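The split the reviewer wants spelled out can be illustrated with a small routing sketch. Everything here is hypothetical (the function name, the bucket prefix, the collection name are invented for the example, not the platform's API); it only shows the boundary being described: raw binaries go to object storage, structured extractions go to the database.

```python
# Hypothetical routing: raw uploaded binaries -> object storage
# (managed by the File Manager); structured data extracted from
# them -> an ArangoDB collection.
def route_artifact(filename: str, extracted: bool = False):
    """Return a (store, location) pair for an uploaded artifact."""
    if extracted:
        # Entities, embeddings, tables pulled out of the file.
        return ("arangodb", "extracted_data")
    # Raw uploads -- including code packages from Bring Your Own Code --
    # land in object storage regardless of file type.
    return ("object-storage", f"uploads/{filename}")

raw = route_artifact("report.pdf")
structured = route_artifact("report.pdf", extracted=True)
```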
> comparing results with the original query to ensure correctness.
> The Reasoner is an AI-powered query optimization agent for ArangoDB that helps
> improve the performance of AQL queries. Once the service is set up and started,
> you can submit a query and the Reasoner agent inspects it by running `EXPLAIN`

Suggested change:

```diff
-you can submit a query and the Reasoner agent inspects it by running `EXPLAIN`
+you can submit a query and the Reasoner agent inspects it by running **Explain**
```
> The Reasoner is an AI-powered query optimization agent for ArangoDB that helps
> improve the performance of AQL queries. Once the service is set up and started,
> you can submit a query and the Reasoner agent inspects it by running `EXPLAIN`
> and `PROFILE`, calls tools to examine available indexes and collection

Suggested change:

```diff
-and `PROFILE`, calls tools to examine available indexes and collection
+and **Profile**, calls tools to examine available indexes and collection
```
> ### Schema-aware, data-private
>
> AQLizer is fully schema-aware: it inspects your collections, indexes, and
> document structure so that generated queries are accurate and efficient. The LLM

As discussed, it would be nice to add a blurb about how it figures out the document structure, because there is no schema (custom sampling?).
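One plausible answer to the question is structure inference by sampling: read a bounded number of documents per collection and record only attribute names and observed types, never the values. A minimal sketch under that assumption (the function name and sample size are invented here, not AQLizer's actual mechanism):

```python
from collections import defaultdict

def infer_structure(sample_docs, max_samples=100):
    """Infer a pseudo-schema for a schemaless collection from sampled
    documents: attribute name -> sorted list of observed JSON types."""
    type_names = {
        str: "string", bool: "bool", int: "number", float: "number",
        list: "array", dict: "object", type(None): "null",
    }
    schema = defaultdict(set)
    for doc in sample_docs[:max_samples]:
        for key, value in doc.items():
            schema[key].add(type_names.get(type(value), "unknown"))
    # Only names and types leave this function -- no document values.
    return {key: sorted(types) for key, types in schema.items()}

schema = infer_structure([
    {"name": "Ada", "age": 36},
    {"name": "Bob", "tags": ["vip"]},
])
```

The resulting map contains attribute names and types only, which is the kind of metadata a schema-aware translator could safely pass to an LLM.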
> File Manager also holds the code packages uploaded through the Container
> Manager's
> [Bring Your Own Code](../platform-suite/container-manager/_index.md#bring-your-own-code)
> flow, so its contents are not exclusive to the Agentic AI Suite. Any structured data extracted from uploaded files

Suggested change:

```diff
-flow, so its contents are not exclusive to the Agentic AI Suite. Any structured data extracted from uploaded files
+flow, so its contents are not exclusive to the Agentic AI Suite.
+Any structured data extracted from uploaded files
```