This document defines how to interpret parser outputs:
postgresparser.ParseSQL/postgresparser.ParseSQLStrictoutput (ParsedQueryinir.go)postgresparser.ParseSQLAlloutput (ParseBatchResultinir.go)- Options-enabled variants:
postgresparser.ParseSQLWithOptionspostgresparser.ParseSQLAllWithOptionspostgresparser.ParseSQLStrictWithOptions
ParsedQuery is an analysis-oriented intermediate representation (IR), not a full PostgreSQL AST.
It is designed for:
- query linting
- dependency extraction
- lineage and metadata tooling
- migration and DDL inspection
It is not designed for:
- lossless round-trip SQL generation
- preserving every grammar-level node detail
ParseSQLparses only the first statement in the input string.ParseSQLAllparses all statements in the input string and returnsParseBatchResult.ParseSQLStrictreturns an error unless exactly one statement is present.ParseSQLWithOptions/ParseSQLAllWithOptions/ParseSQLStrictWithOptionsbehave identically while enabling optional metadata extraction flags.- Unrelated sections are expected to be empty for a given command.
Commandis the primary discriminator for which sections to read.
ParseBatchResult fields:
Statements: OneStatementParseResultper input statement in source order.Index: 1-based input statement index.RawSQL: statement-scoped SQL text.Query: parsed IR when statement conversion succeeds (nilon failure).Warnings: statement-scoped warnings (SYNTAX_ERROR).
HasFailures:truewhen any statement has a nilQueryor anyWarnings.
Command: High-level statement type (SELECT,INSERT,UPDATE,DELETE,MERGE,DDL,UNKNOWN).RawSQL: Preprocessed SQL string used for parsing.Parameters: Positional/anonymous parameter placeholders ($1,?, etc.).Placeholders: Placeholder occurrences in source order with syntactic roles such aswhere_value,function_arg,group_by_ordinal,order_by_ordinal,limit,offset,interval_operand,insert_value, andupdate_set_value.
Tables: Structured relation refs (Schema,Name,Alias,Type,Raw).CTEs:WITHdefinitions.Name: CTE binding name.Query: raw SQL text of the CTE body.ParsedQuery: nested IR for the CTE body when the body is a supported preparable statement.Materialized: materialization hint (MATERIALIZED,NOT MATERIALIZED, or empty).
Subqueries: Nested query refs discovered in the statement.JoinConditions: Raw join condition expressions.Correlations: Outer/inner alias correlation metadata for lateral/correlated subqueries.
Columns: Projection expressions and aliases.ColumnUsage: Expression-level column usage classification. See Column Usage for the per-field contract.Where: WHERE/CURRENT clauses as raw expressions.Having: HAVING clauses.GroupBy: GROUP BY expressions.OrderBy: ORDER BY expressions + direction/nulls modifiers.Limit: LIMIT/OFFSET metadata.SetOperations: UNION/INTERSECT/EXCEPT branches.DerivedColumns: Alias-to-expression map for derived projection columns.
ColumnUsage describes a single reference to a column in the parsed statement. Beyond the basic fields (TableAlias, Column, Expression, UsageType, Operator, Side, Context), two fields surface function-call context around the column:
Functions []string— function names that wrap the column reference outside WHERE clauses (SELECT projection, ORDER BY, GROUP BY, HAVING, etc.). Listed in source order from outermost to innermost. Empty on WHERE-clause entries.Function *FunctionWrapper— populated only on WHERE-clause filter usages whose subject column is wrapped by an allowlisted function. Nil otherwise.
The asymmetry is deliberate: Functions []string is sufficient for projection-level introspection (a SQL formatter or column-rename tool), while Function *FunctionWrapper carries the typed metadata needed for predicate-level reasoning (a query rewriter or simulation tool that needs to know exactly which function wraps a filter column and what its other arguments are).
type FunctionWrapper struct {
Name string // canonical lowercase, unqualified
Schema string // "" for unqualified or pg_catalog (canonicalised)
Args []FunctionArg // literal arguments other than the column itself
IsNested bool // true when reached through additional wrappers
Cast string // outermost cast target type, "" if no cast
}The allowlist that triggers attribution is exactly eight functions: length, lower, upper, coalesce, extract, date_trunc, char_length, octet_length. These are the function calls whose presence in a predicate position changes the comparison's semantics in a way downstream consumers (ORMs, linters, query rewriters, AI-assisted SQL generators) typically key off.
Attribution rules:
- Scope is WHERE-equivalent only. WHERE clauses, the WHERE clause of UPDATE/DELETE, the WHERE clause inside CTE bodies, the WHERE clause inside subqueries, and the WHERE clause in each branch of UNION/INTERSECT/EXCEPT all attach wrappers. JOIN ON, ORDER BY, GROUP BY, HAVING, window PARTITION/ORDER, RETURNING, and SELECT projection do not — those use
Functions []stringonly. - Schema canonicalization. Bare names and
pg_catalog.<name>calls produce wrappers withSchema = "". Any other schema (public.foo(...),myschema.length(...)) rejects the wrapper outright. - Outermost-only attribution. When wrappers nest (
lower(lower(col)),length(lower(col))),Namereflects the outermost call andIsNested = true. The inner chain is observable only via the flag, not enumerated. - Casts. A typecast around the wrapper as a whole (
length(col)::int,CAST(length(col) AS bigint)) is captured inCastas the textual target type. Chained casts record the outermost. A cast on the bare column (col::int < 5) does not produce a wrapper. - Expression-wrapped columns are skipped. When the function argument is itself an expression (
length(col || 'x'),lower(coalesce(name, ''))), no wrapper is attached. Consumers needing this case should fall through to the standard column+operator interpretation.
type FunctionArg struct {
Literal *string // SQL textual form; nil = non-literal expression
IsNull bool // explicit SQL NULL keyword
}FunctionArg is a three-state encoding:
Literalnon-nil,IsNullfalse → literal value present. Numerics, booleans, and intervals are stringified ("0","true","1 day"); string literals retain their surrounding quotes.Literalnil,IsNulltrue → explicitNULLkeyword.Literalnil,IsNullfalse → non-literal expression at this position (column reference, sub-call, placeholder).
Consumers requiring a specific literal value (date_trunc unit, extract field) MUST nil-check Literal before dereferencing, and fall back to the standard column+operator interpretation when nil.
InsertColumns: Target columns for INSERT.SetClauses: SET clauses for UPDATE (and related clause extraction).Returning: RETURNING clauses.Upsert:ON CONFLICTmetadata for INSERT.Merge: MERGE metadata (target/source/condition/actions).
DDLActions: Normalized DDL actions extracted from DDL statements.
Common DDL action fields:
Type:CREATE_TABLE,DROP_TABLE,DROP_COLUMN,ALTER_TABLE,CREATE_INDEX,DROP_INDEX,TRUNCATE,COMMENT.ObjectName: Unqualified target object identifier.ObjectType: Object category for action-specific handling (for exampleTABLE,COLUMN,INDEXonCOMMENTactions).Schema: Parsed schema when available.Columns: Column names or indexed expressions relevant to the action.Flags: Modifiers likeIF_EXISTS,IF_NOT_EXISTS,CASCADE,CONCURRENTLY, etc.IndexType: Index method forCREATE_INDEX(for examplebtree,gin).ColumnDetails: Column metadata forCREATE_TABLEactions.Constraints: Optional*DDLConstraintsgrouping PK/FK/UNIQUE/CHECK metadata (CREATE_TABLE,ALTER_TABLE ADD CONSTRAINT).Target: Generic fully-qualified target path for comment-like actions (for examplepublic.users.email).Comment: Comment text forCOMMENTactions.
ColumnDetails ([]DDLColumn) fields:
NameTypeNullableDefaultComment([]string, optional): inline--comment lines preceding a column definition whenIncludeCreateTableFieldComments=true.
Constraints (*DDLConstraints) fields:
PrimaryKey(*DDLPrimaryKey):ConstraintName(optional),Columns.ForeignKeys([]DDLForeignKey):ConstraintName(optional),Columns,ReferencesSchema(optional),ReferencesTable,ReferencesColumns(optional),OnDelete(optional),OnUpdate(optional). Referential actions:CASCADE,SET NULL,SET DEFAULT,RESTRICT,NO ACTION.UniqueKeys([]DDLUniqueConstraint):ConstraintName(optional),Columns.CheckConstraints([]DDLCheckConstraint):ConstraintName(optional),Expression.
Current DDL convention:
CREATE_TABLEpopulatesColumnDetailsandConstraintsfor inline and table-level constraints.COMMENT ON ...populatesDDLActionswithType=COMMENT.- Other DDL actions currently do not populate
ColumnDetails. ALTER_TABLEusesColumnsandFlagsfor operation-level details.ALTER_TABLE ... ADD CONSTRAINTpopulatesConstraints.
ParseOptions currently supports:
IncludeCreateTableFieldComments:- default
false - when
true, captures inline--field comments inCREATE TABLEintoDDLActions[].ColumnDetails[].Comment.
- default
Notes:
- This option only affects inline
--field comments inCREATE TABLE. COMMENT ON ...statement extraction is always enabled.
SELECT: read-query shape + relation metadata.INSERT: relation metadata + DML (InsertColumns,Upsert,Returning).UPDATE: relation metadata + DML (SetClauses,Where,Returning).DELETE: relation metadata + DML (Where,Returning).MERGE: relation metadata +Merge.DDL:DDLActions(+Tableswhere applicable).UNKNOWN: minimal envelope only.
| Section / Field Group | SELECT | INSERT | UPDATE | DELETE | MERGE | DDL | UNKNOWN |
|---|---|---|---|---|---|---|---|
Core envelope (Command, RawSQL, Parameters) |
Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Relations (Tables, CTEs, Subqueries) |
Yes | Yes | Yes | Yes | Yes | Sometimes | No |
Read-query shape (Columns, Where, GroupBy, OrderBy, Limit, SetOperations, ColumnUsage) |
Yes | No | Partial | Partial | Partial | No | No |
DML shape (InsertColumns, SetClauses, Returning, Upsert) |
No | Yes | Yes | Partial | No | No | No |
MERGE payload (Merge) |
No | No | No | No | Yes | No | No |
DDL payload (DDLActions) |
No | No | No | No | No | Yes | No |
Notes:
- "Partial" means only relevant subsets are filled for that command.
- "Sometimes" under DDL relations means
Tablesis populated for actions where a base relation is explicitly parsed (for exampleCREATE TABLE,ALTER TABLE,TRUNCATE). - Empty/nil in unrelated sections is expected behavior.
-
Use
Commandfirst, then inspect relevant sections. -
Treat empty slices/nil in unrelated sections as expected behavior.
-
For DDL identity, prefer structured values where available:
- use
Schema+ObjectNamefor action identity (ObjectNameis unqualified) - use
Tableswhen you need normalized relation references
- use
-
For predicate analysis, key off
ColumnUsage.Functionrather than scanning the raw SQL orContextstring. Walk filter usages, branch onFunction.Name, and consultFunction.Argsfor the literal arguments where needed:for _, u := range result.ColumnUsage { if u.UsageType != analysis.SQLUsageTypeFilter || u.Function == nil { continue } switch u.Function.Name { case "length", "char_length", "octet_length": // length-class predicate on u.Column case "lower", "upper": // case-folding predicate on u.Column case "date_trunc": // u.Function.Args[0].Literal carries the unit when literal } }
-
ALTER TABLEdelta metadata model- Goal: add operation-level
AlterOpspayload instead of overloadingColumnDetails. - Scope:
ADD COLUMN,DROP COLUMN,TYPE,SET/DROP DEFAULT,SET/DROP NOT NULL,RENAME COLUMN. - Non-goal: full pre/post reconstruction without schema state.
- Goal: add operation-level
-
CREATE TABLEtype coverage expansion- Goal: maintain a broad regression matrix covering common PostgreSQL type families.
- Scope: numerics, text/binary, time/date, JSON/XML, network, geometric, ranges, arrays.