Summary
Convert the repo's latent product contract into a repeatable benchmark suite with explicit pass/fail evidence.
This issue was generated from an org-wide EvalOps mining pass on 2026-05-10 07:57 UTC. It combines live GitHub repo signals with a per-repo arXiv search. Treat the research links as grounding for a concrete implementation, not as a request for a literature review.
Repo Evidence
- Repository description: Org-level defaults: issue/PR templates, reusable workflows, community health files
- Tree signals: 0 docs files, 6 workflows, 0 proto files, 5 test-like files.
- AGENTS.md:9 includes latent-spec language: - Before editing templates, check live issues and PRs in evalops/.github so updates do not duplicate an existing convention effort. - Template changes should improve evidence quality without making every PR or issue feel heavy. Prefer optional fields for specialized work and required fields only when missing them wou
- README.md:7 includes latent-spec language: Treat this repository as a small control plane: conventions should be explicit, validated, and easy for downstream repos to adopt without copying private
- README.md:55 includes latent-spec language: Each template expects an OPENAI_API_KEY repository secret. Repositories that need stronger, repo-specific behavior should copy the matching prompt from .github/codex/prompts/ into their own .github/codex/prompts/ directory and
- README.md:97 includes latent-spec language: Use .github/workflow-templates/review-thread-guard.yml on repos where review threads should be merge blockers:
- README.md:146 includes latent-spec language: services.yaml is intentionally lightweight. It should answer:
- .github/codex/prompts/local-traffic-canary.md:14 includes latent-spec language: - Verify that generated trace IDs, traceparent, NATS subjects, and manifest paths match the repo contract. - Keep fixes local-tooling focused unless the failure exposes a production
Research Grounding
Repo axes: memory, governance, evaluation, tooling
Search keywords: github, workflow, local, workflows, run, yaml, codex, repos, review, yml, issue, changes
- arXiv:2504.08893v1 Knowledge Graph-extended Retrieval Augmented Generation for Question Answering (Jasper Linders, Jakub M. Tomczak), 2025.
- arXiv:2405.15436v1 Hybrid Context Retrieval Augmented Generation Pipeline: LLM-Augmented Knowledge Graphs and Vector Database for Accreditation Reporting Assistance (Candace Edwards), 2024.
- arXiv:2504.05163v2 Evaluating Knowledge Graph Based Retrieval Augmented Generation Methods under Knowledge Incompleteness (Dongzhuoran Zhou, Yuqicheng Zhu, Xiaxia Wang, Yuan He, Jiaoyan Chen, Steffen Staab), 2025.
- arXiv:2502.06864v1 Knowledge Graph-Guided Retrieval Augmented Generation (Xiangrong Zhu, Yuexiang Xie, Yi Liu, Yaliang Li, Wei Hu), 2025.
- arXiv:2507.16826v1 A Query-Aware Multi-Path Knowledge Graph Fusion Approach for Enhancing Retrieval-Augmented Generation in Large Language Models (Qikai Wei, Huansheng Ning, Chunlong Han, Jianguo Ding), 2025.
- arXiv:2512.20626v2 MegaRAG: Multimodal Knowledge Graph-Based Retrieval Augmented Generation (Chi-Hsiang Hsiao, Yi-Cheng Wang, Tzung-Sheng Lin, Yi-Ren Yeh, Chu-Song Chen), 2025.
- arXiv:2502.01113v3 GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation (Linhao Luo, Zicheng Zhao, Gholamreza Haffari, Dinh Phung, Chen Gong, Shirui Pan), 2025.
- arXiv:2506.21556v3 VAT-KG: Knowledge-Intensive Multimodal Knowledge Graph Dataset for Retrieval-Augmented Generation (Hyeongcheol Park, Jiyoung Seo, MinHyuk Jang, Hogun Park, Ha Dam Baek, Gyusam Chang), 2025.
- arXiv:2508.09460v1 Towards Self-cognitive Exploration: Metacognitive Knowledge Graph Retrieval Augmented Generation (Xujie Yuan, Shimin Di, Jielong Tang, Libin Zheng, Jian Yin), 2025.
- arXiv:2511.11017v1 AI Agent-Driven Framework for Automated Product Knowledge Graph Construction in E-Commerce (Dimitar Peshevski, Riste Stojanov, Dimitar Trajanov), 2025.
What To Build
- Define the smallest representative .github golden workflow and capture expected inputs, outputs, and evidence artifacts.
- Add fixtures for a successful path, an ambiguous/degraded path, and a failure path.
- Publish a command that local agents and CI can run before shipping related changes.
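The three fixture paths and the single runnable command could be sketched roughly as follows. This is a minimal illustration, not the repo's actual contract: `REQUIRED_KEYS`, the `timeout-minutes` soft rule, the `Fixture` type, and the fixture contents are all hypothetical stand-ins, and workflows are represented as already-parsed dicts so the sketch needs only the standard library.

```python
import json
from dataclasses import dataclass

# Assumption: an illustrative contract, not the repo's real one.
REQUIRED_KEYS = ("name", "on", "jobs")


@dataclass(frozen=True)
class Fixture:
    name: str        # hypothetical fixture label
    workflow: dict   # workflow content, already parsed into a dict
    expected: str    # "pass", "degraded", or "fail"


def evaluate(workflow: dict) -> tuple[str, list[str]]:
    """Return a verdict plus the evidence lines that justify it."""
    missing = [key for key in REQUIRED_KEYS if key not in workflow]
    if missing:
        return "fail", [f"missing top-level key: {key}" for key in missing]
    if not workflow["jobs"]:
        return "fail", ["no jobs defined"]
    # Hypothetical soft rule: a job without a timeout is degraded, not failing.
    warnings = [
        f"job {job_name!r} has no timeout-minutes"
        for job_name, job in workflow["jobs"].items()
        if "timeout-minutes" not in job
    ]
    return ("degraded" if warnings else "pass"), warnings


FIXTURES = [
    Fixture("success", {"name": "ci", "on": "push",
                        "jobs": {"test": {"timeout-minutes": 10}}}, "pass"),
    Fixture("degraded", {"name": "ci", "on": "push",
                         "jobs": {"test": {}}}, "degraded"),
    Fixture("failure", {"name": "ci", "on": "push"}, "fail"),
]


def run_suite(fixtures: list[Fixture]) -> list[dict]:
    """Evaluate every fixture and record expected vs. actual verdicts."""
    results = []
    for fixture in fixtures:
        verdict, evidence = evaluate(fixture.workflow)
        results.append({
            "fixture": fixture.name,
            "expected": fixture.expected,
            "actual": verdict,
            "ok": verdict == fixture.expected,
            "evidence": evidence,
        })
    return results


def main() -> int:
    """Print the evidence as JSON; return 0 only if every fixture matched."""
    results = run_suite(FIXTURES)
    print(json.dumps(results, indent=2))
    return 0 if all(result["ok"] for result in results) else 1
```

Wiring `main()` to a script entry point (for example `raise SystemExit(main())`) would give local agents and CI the same command, with the JSON output doubling as the evidence artifact.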
Acceptance Criteria
Notes
- Generated issue 1/5 for evalops/.github by evalops_org_miner.py.
- Before implementation, confirm the sampled latent-spec snippets still match main; this issue intentionally cites exact file paths/lines where the mining pass saw them.
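The confirmation step above can be approximated with a small check against a local checkout of main. This is a sketch under stated assumptions: `snippet_still_matches` and its `prefix_words` parameter are hypothetical, it compares only the first few words of a snippet against the cited line, and it strips backticks because the snippets quoted in this issue appear without Markdown formatting.

```python
from pathlib import Path


def _normalize(text: str) -> str:
    """Drop backticks and collapse whitespace so formatting differences do not matter."""
    return " ".join(text.replace("`", "").split())


def snippet_still_matches(repo_root: Path, rel_path: str, line_no: int,
                          snippet: str, prefix_words: int = 6) -> bool:
    """Check whether the start of `snippet` still appears on the cited line.

    Hypothetical helper: line numbers are 1-indexed, and only the first
    `prefix_words` words are compared because the quoted snippets may be
    truncated or span several source lines.
    """
    path = repo_root / rel_path
    if not path.is_file():
        return False
    lines = path.read_text(encoding="utf-8").splitlines()
    if not 1 <= line_no <= len(lines):
        return False
    prefix = " ".join(_normalize(snippet).split()[:prefix_words])
    return bool(prefix) and prefix in _normalize(lines[line_no - 1])
```

A call such as `snippet_still_matches(Path("."), "README.md", 146, "services.yaml is intentionally lightweight.")` returning `False` would suggest the cited line has moved or changed since the mining pass.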