Operate the entire Oracle AI Data Platform (AIDP) workbench in natural language: discover and query lakehouse catalogs, author and run Spark SQL (DDL/DML/Delta maintenance), build and deploy agent flows with LangGraph and guardrails, manage RAG knowledge bases, schedule jobs, provision clusters, handle RBAC and credentials, track ML experiments, and debug Spark performance — all through a single MCP agent with 37 skills.
Discover, author, deploy, and run AIDP agent flows. Use when the user wants to list/inspect agent flows, or create/update/deploy/run a flow, manage sessions/guardrails, attach compute, or attach a remote MCP server to a flow (an MCP_TOOL node — AIDP's "Native MCP Client Support" LA feature, where the flow connects OUT to OAC/ADW/OIC/any MCP-compatible service). Everything runs over the Limited-Availability AgentFlows REST API via `oci raw-request`; verify live first. (An `aidp` MCP `list_agent_flows`, if configured, is an optional read accelerator only.)
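
As a rough illustration of the `oci raw-request` path described above, the sketch below assembles the CLI invocation in Python. The endpoint URL is a placeholder and not a documented AgentFlows route, and `build_raw_request` is a hypothetical helper name. Only the `--http-method`, `--target-uri`, and `--request-body` flags come from the OCI CLI itself.

```python
import json
import shlex
from typing import Optional


def build_raw_request(method: str, target_uri: str, body: Optional[dict] = None) -> list:
    """Assemble an `oci raw-request` argument list (hypothetical helper)."""
    argv = [
        "oci", "raw-request",
        "--http-method", method.upper(),
        "--target-uri", target_uri,
    ]
    if body is not None:
        # The CLI takes the request body as a JSON string.
        argv += ["--request-body", json.dumps(body)]
    return argv


# Placeholder endpoint: the real AgentFlows route must be confirmed for your tenancy.
PLACEHOLDER_URI = "https://example.invalid/agentFlows"

cmd = build_raw_request("get", PLACEHOLDER_URI)
print(shlex.join(cmd))
```

Once the live endpoint has been verified, the returned list could be passed to `subprocess.run(cmd, check=True)`. Building an argument list rather than a shell string avoids quoting problems with JSON bodies.
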
Build AIDP agents in high-code Python with aidputils + LangGraph (the code-first alternative to the low-code agent-flow canvas). Use when the user wants to write an agent in Python, use LangGraph / create_react_agent / StateGraph, call aidputils (OCIAIConf, AIDPToolConf, init_oci_llm, create_langgraph_tool), build a custom or multi-agent supervisor flow in code, or asks "how do I code an AIDP agent". For the low-code/REST node-graph path use aidp-agent-flows.
Run LLM functions inside Spark SQL on AIDP via ai_generate(). Use when the user wants to summarize/classify/extract/enrich rows with an LLM directly in SQL, generate narratives over aggregated results, or do grounded RAG-style analysis in the lakehouse. Signature is model-first; available models must be confirmed live before relying on it.
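
To make the model-first call shape concrete, here is a minimal sketch that builds an `ai_generate()` query string in Python. It assumes a two-argument form, `ai_generate(model, prompt)`, which is a guess based only on the "model-first" note above and should be confirmed live. The model name, table, and column are hypothetical.

```python
def sql_literal(value: str) -> str:
    """Quote a Python string as a Spark SQL string literal (backslash escaping)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_ai_generate_query(model: str, instruction: str, text_column: str,
                            table: str, limit: int = 10) -> str:
    """Build a query that applies ai_generate() to each row.

    Assumption: the model name is the first argument and the prompt the second.
    Identifiers (column, table) are inserted verbatim, so pass trusted values only.
    """
    prompt = f"concat({sql_literal(instruction + ' ')}, {text_column})"
    return (
        f"SELECT {text_column}, "
        f"ai_generate({sql_literal(model)}, {prompt}) AS llm_output "
        f"FROM {table} LIMIT {int(limit)}"
    )


query = build_ai_generate_query(
    model="example-model",        # hypothetical: list available models first
    instruction="Summarize this review in one sentence:",
    text_column="review_text",    # hypothetical column
    table="demo.reviews",         # hypothetical table
)
print(query)
```

In a notebook the string would typically be run with `spark.sql(query)`. Keeping a small `LIMIT` while experimenting limits how many rows are sent to the model.
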
Answer business questions over the AIDP lakehouse with Spark SQL. Use when the user asks a data question ("how many…", "top N…", "show me…", "trend of…", "revenue by…") or wants to run ad-hoc Spark SQL. Grounds itself in .aidp/catalog.md and .aidp/semantic.md, reuses previously verified queries where one matches before generating new SQL, then executes via the bundled aidp_sql.py helper.
Manage and search AIDP audit logs — enable/disable auditing, set retention, and query audit-log entries for a DataLake. Use when the user asks about audit logging, who did what, compliance/retention of AIDP activity, or wants to search audit events. Self-contained — official `aidp audit` CLI preferred, `oci raw-request` fallback.
This repository contains a curated collection of sample notebooks demonstrating how to build data pipelines, run machine learning workloads, and integrate AI capabilities using Oracle AI Data Platform (AIDP) Workbench — a unified, governed workspace for data engineering, ML, and AI development powered by Apache Spark.
Oracle AI Data Platform Workbench is a unified, governed workspace for building, managing, and deploying AI and data-driven solutions. It brings together notebooks, agent development, orchestration, and catalog management in a single collaborative platform — empowering teams to explore data, fine-tune models, and operationalize AI with trust and speed.
oracle-aidp-samples/
├── getting-started/            # Foundational notebooks for new users
│   ├── Delta_Lake/             # Delta Lake feature walkthroughs
│   └── migration/              # Migrating workloads to AIDP
├── data-engineering/
│   ├── ingestion/              # Connectors and data loading patterns
│   └── transformation/         # Pipeline architectures and table formats
│       ├── liquid-clustering/
│       ├── medallion-lake/
│       ├── scd/
│       └── streaming/
├── ai/
│   ├── agent-flows/            # Agent orchestration and scheduling
│   └── ml-datascience/         # ML, LLM, and AI service integrations
└── shared-utils/               # Reusable utilities and data generators
Foundational examples to help you get up and running on AIDP Workbench.
| Notebook | Description |
|---|---|
| Access ALH Data | Write and query data in Oracle Autonomous AI Lakehouse (ALH) using PySpark insertInto and SQL INSERT statements with external catalogs. |
| Access Object Storage Data | Read and write data from OCI Object Storage using direct access, external volumes, and external tables. |
| Analyse Data Using PySpark | PySpark fundamentals: catalog and schema setup, table creation, data insertion, schema exploration, and matplotlib visualizations. |
| Analyse Data Using SQL | Core SQL operations on AIDP including DataFrame creation, transformations, aggregations, and simple visualizations. |
| ALH External Catalog MERGE | End-to-end MERGE workflow into an ALH table via an AIDP external catalog: insert/update/delete with merge keys and OOS-staging skip optimization. |
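
The MERGE workflow in the last row of the table can be pictured with a small statement builder. This is only a sketch: the table names, columns, and the `is_deleted` flag are hypothetical, and the statement in the notebook itself, as well as the clauses supported through an external catalog, may differ.

```python
from typing import Sequence


def build_merge_sql(target: str, source: str, keys: Sequence[str],
                    columns: Sequence[str], delete_flag: str = "is_deleted") -> str:
    """Build a MERGE statement that deletes, updates, or inserts rows by merge key."""
    if not keys:
        raise ValueError("at least one merge key is required")
    non_keys = [c for c in columns if c not in keys]
    if not non_keys:
        raise ValueError("at least one non-key column is required for UPDATE")

    on_clause = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in non_keys)
    insert_cols = ", ".join(columns)
    insert_vals = ", ".join(f"s.{c}" for c in columns)

    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON {on_clause}\n"
        # Rows flagged as deleted in the source are removed from the target.
        f"WHEN MATCHED AND s.{delete_flag} = true THEN DELETE\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        # New rows are inserted unless they are already flagged as deleted.
        f"WHEN NOT MATCHED AND s.{delete_flag} = false "
        f"THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


merge_sql = build_merge_sql(
    target="lake.customers",              # hypothetical target table
    source="staging.customer_updates",    # hypothetical staged changes
    keys=["customer_id"],
    columns=["customer_id", "name", "email"],
)
print(merge_sql)
```

The order of the `WHEN MATCHED` clauses matters: the conditional delete comes first so that flagged rows are not updated instead.
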
| Notebook | Description |
|---|---|
| Use Delta Lake Table | Comprehensive guide covering Delta table operations: updates, merges, time travel, liquid clustering, and vacuuming. |
| Delta Change Data Feed | Capture row-level changes (inserts, updates, deletes) from Delta tables for CDC, incremental processing, and streaming pipelines. |
| Handle Schema Evolution | Add and evolve columns in Delta tables without rewriting existing data, leveraging automatic schema evolution. |
| Delta UniForm Tables | Create Delta UniForm tables that automatically synchronize Iceberg metadata for cross-format interoperability. |
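
For orientation, the sketch below collects a few of the Delta operations named in the table (time travel, Change Data Feed, vacuuming) as SQL strings. The syntax follows open-source Delta Lake; whether every statement is available on a given AIDP cluster is an assumption to verify, and the table name is hypothetical.

```python
def delta_statements(table: str, version: int, retain_hours: int = 168) -> dict:
    """Return example Delta Lake SQL statements for one table, keyed by purpose."""
    if version < 0:
        raise ValueError("version must be non-negative")
    if retain_hours < 0:
        raise ValueError("retain_hours must be non-negative")
    return {
        # Query the table as it was at an earlier version.
        "time_travel": f"SELECT * FROM {table} VERSION AS OF {version}",
        # Start recording row-level changes for later CDC reads.
        "enable_cdf": (
            f"ALTER TABLE {table} "
            "SET TBLPROPERTIES (delta.enableChangeDataFeed = true)"
        ),
        # Read changes recorded from the given starting version onward.
        "read_changes": f"SELECT * FROM table_changes('{table}', {version})",
        # Remove data files no longer referenced and older than the retention window.
        "vacuum": f"VACUUM {table} RETAIN {retain_hours} HOURS",
    }


statements = delta_statements("demo.orders", version=3)  # hypothetical table
for purpose, sql in statements.items():
    print(f"{purpose}: {sql}")
```

Note that Change Data Feed only captures changes made after it is enabled, so `read_changes` needs a starting version at or after the one where the property was set.
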
| Notebook | Description |
|---|---|
| Migrate Files from Databricks to AIDP | Recursively export notebooks and files from a Databricks workspace to AIDP using the databricks-sdk library. |
| Download from Git to AIDP | Download notebooks and files from a Git repository as a ZIP archive and extract them directly into an AIDP workspace volume. |
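
The Git download pattern in the last row can be sketched with the Python standard library alone. This is not the notebook's code: the function names are hypothetical, and the destination path stands in for whatever workspace volume path applies in your environment.

```python
import io
import urllib.request
import zipfile
from pathlib import Path


def extract_zip_bytes(data: bytes, dest_dir: str) -> list:
    """Extract a ZIP archive held in memory and return the extracted file paths."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for member in archive.infolist():
            target = (dest / member.filename).resolve()
            # Refuse entries that would land outside the destination directory.
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.filename}")
            archive.extract(member, dest)
            if not member.is_dir():
                extracted.append(str(target.relative_to(dest)))
    return extracted


def download_and_extract(zip_url: str, dest_dir: str) -> list:
    """Download a repository ZIP archive and unpack it into dest_dir."""
    with urllib.request.urlopen(zip_url) as response:
        return extract_zip_bytes(response.read(), dest_dir)
```

A call might look like `download_and_extract("https://github.com/<owner>/<repo>/archive/refs/heads/<branch>.zip", "<volume path>")`, with both placeholders filled in for your repository and workspace. Reading the whole archive into memory keeps the sketch short; very large repositories would be better streamed to a temporary file first.
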
Patterns for connecting to and loading data from a wide range of sources.
`npx claudepluginhub anthropics/claude-plugins-official --plugin oracle-ai-data-platform-workbench-engineer-agent`
Oracle AI Data Platform Workbench Spark connectors for Claude Code. 23 connector skills covering every data source workbench customers commonly need: Oracle Autonomous DB family (ALH/ADW/ATP) via wallet/IAM-DB-Token/API-key, generic Oracle Database, ExaCS, Oracle PeopleSoft, Oracle Siebel, Fusion ERP REST, Fusion BICC, EPM Cloud Planning, Essbase 21c, OCI Streaming (Kafka), OCI Object Storage, Apache Iceberg, plus PostgreSQL, MySQL/HeatWave, SQL Server, Apache Hive, Salesforce, Snowflake, Azure ADLS Gen2, AWS S3, generic REST, custom JDBC, and Excel. v0.5.0 adds 5 new connectors plus pushdown.sql / catalog.id / manifest.path patterns from oracle-samples PR #46.
Databricks development toolkit with skills for data engineering, ML, and AI agents plus MCP tools for direct Databricks operations
Claude Code skill pack for Databricks (24 skills)
This plugin provides a specialized suite of skills for data engineers and database practitioners working on Google Cloud. It acts as an expert assistant, allowing you to use natural language prompts in your preferred coding agent to architect complex data pipelines, transform data with dbt, write Spark and BigQuery SQL notebooks, and orchestrate end-to-end workflows across GCP's data ecosystem.
Data engineering plugin - warehouse exploration, pipeline authoring, Airflow integration
Spec-Driven Development framework for Data Engineering — 58 agents, 24 KB domains, 5-phase SDD workflow, 31 commands