From oracle-ai-data-platform-workbench-engineer-agent
Discovers and caches AIDP catalog metadata (tables, columns, joins) into a grounding file for NL-to-SQL. Run on first setup or after schema changes.
Slash command
/oracle-ai-data-platform-workbench-engineer-agent:aidp-catalog-init
aidp-catalog-init: build the catalog grounding file
Walk the AIDP catalog tree and generate .aidp/catalog.md, the cached, user-editable grounding file that
makes subsequent NL-to-SQL fast and accurate.
Discovery is pure control-plane — no SQL, no compute (except optional --with-counts, which uses
the bundled SQL helper). Self-contained: no aidp MCP required.
Re-run with --refresh after schema changes.
aidp CLI (control-plane, no compute)
Preferred engine is the official Oracle aidp CLI; oci raw-request is the fallback when the CLI isn't
installed. Both hit the same data-plane REST API with the same auth — see
references/aidp-cli-map.md for the full skill→command map and
references/oci-raw-request.md for base URL + auth ladder + conventions.
CLI (preferred):
# 1. catalogs
aidp catalog list --instance-id <DATALAKE_OCID> --auth api_key --profile DEFAULT --region <r>
# 2. schemas in a catalog
aidp schema list --catalog-key <cat> --instance-id <DATALAKE_OCID> --auth api_key --profile DEFAULT --region <r>
# 3. tables in a schema (schema-key is the dotted <cat.schema>)
aidp schema list-tables --catalog-key <cat> --schema-key <cat.schema> --instance-id <DATALAKE_OCID> --auth api_key --profile DEFAULT --region <r>
# single catalog/schema/table: aidp catalog get · aidp schema get · aidp schema get-table
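As a rough sketch of how the three list commands above nest into a full walk, the Python below shells out to the CLI once per level. It rests on assumptions the source does not confirm: that the CLI prints JSON shaped like {"items": [...]}, and that each catalog and schema item carries its key in a "key" field. The `run` parameter is a hypothetical injection point so the walk can be exercised without a live DataLake.

```python
import json
import subprocess
from typing import Callable, Dict, List

# Flags copied from the commands above; fill in the placeholders before running.
COMMON = ["--instance-id", "<DATALAKE_OCID>", "--auth", "api_key",
          "--profile", "DEFAULT", "--region", "<r>"]


def run_aidp(args: List[str]) -> List[dict]:
    """Run one aidp command and return its items.

    ASSUMPTION: the CLI prints JSON shaped like {"items": [...]}.
    """
    out = subprocess.run(["aidp", *args, *COMMON], check=True,
                         capture_output=True, text=True).stdout
    return json.loads(out).get("items", [])


def walk(run: Callable[[List[str]], List[dict]] = run_aidp) -> Dict[str, Dict[str, List[dict]]]:
    """Return {catalog_key: {schema_key: [table, ...]}}.

    ASSUMPTION: catalog and schema items expose their key under "key",
    and the schema key is already the dotted <cat.schema> form.
    """
    tree: Dict[str, Dict[str, List[dict]]] = {}
    for cat in run(["catalog", "list"]):
        cat_key = cat["key"]
        tree[cat_key] = {}
        for schema in run(["schema", "list", "--catalog-key", cat_key]):
            schema_key = schema["key"]
            tree[cat_key][schema_key] = run(
                ["schema", "list-tables",
                 "--catalog-key", cat_key, "--schema-key", schema_key])
    return tree
```

Passing a stub for `run` keeps the nesting logic testable offline; the default runner is only as good as the assumed output format.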
Fallback (no CLI installed) — oci raw-request (LIVE-VERIFIED 20240831 / dataLakes /
--profile DEFAULT — see references/no-mcp-rest-map.md):
B="https://aidp.<region>.oci.oraclecloud.com/20240831/dataLakes/<DATALAKE_OCID>"
oci raw-request --http-method GET --target-uri "$B/catalogs" --profile DEFAULT
oci raw-request --http-method GET --target-uri "$B/schemas?catalogKey=<cat>" --profile DEFAULT
oci raw-request --http-method GET --target-uri "$B/tables?catalogKey=<cat>&schemaKey=<cat.schema>" --profile DEFAULT
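The fallback targets above can also be assembled programmatically. The sketch below builds the same three URLs with the standard library and adds a client-side lookup for one table. The helper names are illustrative, and the "key" field used by `pick_table` is an assumption about the response items, not something the source confirms.

```python
from typing import List, Optional
from urllib.parse import urlencode


def base_url(region: str, datalake_ocid: str) -> str:
    """Base path in the same shape as the B variable above."""
    return f"https://aidp.{region}.oci.oraclecloud.com/20240831/dataLakes/{datalake_ocid}"


def list_url(base: str, resource: str, **params: str) -> str:
    """Build a GET target for catalogs, schemas, or tables."""
    query = urlencode(params)
    return f"{base}/{resource}?{query}" if query else f"{base}/{resource}"


def pick_table(items: List[dict], table_key: str) -> Optional[dict]:
    """Filter a table listing down to one entry.

    ASSUMPTION: each listed table exposes its key under "key".
    """
    return next((t for t in items if t.get("key") == table_key), None)
```

For example, `list_url(b, "tables", catalogKey="<cat>", schemaKey="<cat.schema>")` yields the value to pass as --target-uri; `urlencode` percent-encodes any characters that are unsafe in a query string.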
For a single table, use aidp schema get-table, or the REST tables?… list (which returns columns,
types, and properties), filtered to the one table client-side by its key (no dedicated single-table
param confirmed; see no-mcp-rest-map.md).
A missing query parameter returns 400 InvalidParameter: query param X must not be null, which names the missing param.
On 401/403/"Security Token", follow the auth ladder (refresh AIDP_SESSION, retry with
--auth security_token) in oci-raw-request.md.
1. Walk the tree: aidp catalog list → for each, aidp schema list --catalog-key → for each,
aidp schema list-tables --catalog-key --schema-key (columns, types, properties), or the REST
fallback above. For large catalogs, fan out one subagent per catalog to parallelize discovery.
2. Identify likely join keys from column naming (*_sk, *_id, shared column names) and any
declared keys in the table properties. Record them so the agent doesn't guess joins later.
3. Fill in row counts and value dictionaries (the --with-counts path), or mark TODO.
4. Write .aidp/catalog.md with sections: Quick Reference (concept→table), Catalogs → schemas →
tables (columns, types, join keys, flags), Value dictionaries, Gotchas. Preserve user edits and HTML
comments on --refresh; flag removed tables with <!-- REMOVED -->.
5. Suggest follow-up skills (aidp-semantic-model for metrics, aidp-analyzing-data to ask questions).
--refresh: regenerate, preserving user edits and Quick-Reference rows.
--catalog <name>: limit to one catalog.
--with-counts: also fetch row counts / distinct values via the bundled SQL helper (uses the cluster,
off by default; it costs compute and needs a running cluster):
python "$PLUGIN_DIR/scripts/aidp_sql.py" --region <r> --datalake <DATALAKE_OCID> --workspace <ws> --cluster <key> \
--code "spark.sql('SELECT COUNT(*) AS n FROM <cat>.<schema>.<table>').show()"
Returns JSON with status / outputs / spark_job_ids; mints a UPST from the api_key DEFAULT profile and
auto-creates a scratch notebook (no AIDP_SESSION required). See
references/oci-raw-request.md for the control-plane side.
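The join-key heuristic from the discovery steps (key-like suffixes and shared column names) can be sketched as below. This is one possible reading of that heuristic, not the skill's actual implementation: it reports any column name that appears in two or more tables and ranks names ending in _sk or _id first. The input shape, a dict of table name to column names, is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

KEY_SUFFIXES = ("_sk", "_id")


def join_candidates(tables: Dict[str, List[str]]) -> List[Tuple[str, List[str], bool]]:
    """Return (column, tables, is_key_like) for columns shared by 2+ tables."""
    owners = defaultdict(set)
    for table, columns in tables.items():
        for col in columns:
            owners[col.lower()].add(table)
    shared = [(col, sorted(ts), col.endswith(KEY_SUFFIXES))
              for col, ts in owners.items() if len(ts) > 1]
    # Key-like names first, then alphabetical.
    return sorted(shared, key=lambda c: (not c[2], c[0]))
```

Columns that differ only by a table prefix are not matched by this sketch, so declared keys in the table properties remain the more reliable signal where they exist.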
Output format (.aidp/catalog.md)
# AIDP catalog — generated <date> (edit freely)
## Quick Reference
| Concept | Table | Key |
|---|---|---|
| customers | default.default.customer | c_customer_sk |
## <catalog> → <schema>
#### <table> (rows: <n if --with-counts>; LARGE if big)
| Column | Type | Notes (PK/FK/join) |
|---|---|---|
## Value dictionaries
## Gotchas
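To make the per-table section of the template concrete, here is a minimal rendering sketch. The function names and the (name, type, notes) tuple are illustrative choices, the separator row is added so the table renders as Markdown, and the "LARGE if big" flag is left out because the source gives no threshold for it.

```python
from typing import List, Optional, Tuple

Column = Tuple[str, str, str]  # (name, type, notes)


def render_table(name: str, columns: List[Column], rows: Optional[int] = None) -> str:
    """Render one table section; the row count appears only when it is known."""
    heading = f"#### {name}" if rows is None else f"#### {name} (rows: {rows})"
    lines = [heading, "| Column | Type | Notes (PK/FK/join) |", "|---|---|---|"]
    lines += [f"| {c} | {t} | {n} |" for c, t, n in columns]
    return "\n".join(lines)


def render_schema(catalog: str, schema: str, table_sections: List[str]) -> str:
    """Prefix already-rendered table sections with the schema heading."""
    return "\n".join([f"## {catalog} → {schema}", *table_sections])
```

Calling `render_table` without `rows` covers the default run, where --with-counts is off and no row counts are available.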
Pass <region> / <DATALAKE_OCID> / <workspace> explicitly: catalog calls are scoped to the
DataLake; the SQL helper is scoped to a workspace + cluster.
.aidp/ is git-ignored: it's a per-project cache, not shipped with the plugin.
The extractor endpoints live under …/dataLakes/<OCID>/extractors (not /metadataExtractors, which 404s; an earlier note probed the wrong
path). LIVE-VERIFIED 2026-06-12: GET …/20240831/dataLakes/<OCID>/extractors → 200 {"items":[]}.
Surface: GET/POST/DELETE /extractors, GET /extractors/<key>/extractedEntities,
GET /extractors/<key>/extractedTables/<name>, POST /extractors/<key>/actions/manageExtractedEntities
(accept/reject/import), lifecycle ACCEPTED→IN_PROGRESS→SUCCEEDED/FAILED/IN_REVIEW. This complements (does
not replace) the discovery walk above and aidp-ingest-file-to-table. Probe the create/manage write paths
live (they need an Object Storage source) before relying on them.
If the aidp MCP is available, you can use list_catalogs / list_schemas / list_tables / get_table
instead of the raw calls, but it is not required.
aidp CLI command map (primary engine): references/aidp-cli-map.md