Build and orchestrate GCP data pipelines using natural language — discover assets, profile and transform data with dbt, Dataform, Spark, and BigQuery, then deploy and monitor workflows on Cloud Composer and Dataproc.
**STOP AND VERIFY**: Before running any command or tool that results in irreversible data loss, you MUST obtain explicit user consent. When in doubt, ask. It is better to wait for confirmation than to accidentally delete production data or critical project assets. Use this for:
- SQL: DROP TABLE/VIEW/SCHEMA/DATABASE, TRUNCATE, or broad DELETE (missing WHERE or using 1=1).
- Cloud Storage: gsutil rm or gcloud storage rm targeting production data or critical buckets.
- Infrastructure: gcloud projects delete, deleting Spanner/BigQuery/Dataproc resources, deleting secrets, or KMS key destruction.
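As an illustration of the SQL category above, here is a minimal sketch of a pre-flight check that flags statements which should wait for user consent. The function name `needs_confirmation` and the regular expressions are assumptions made for this example; they are not the plugin's actual rule set, and the naive split on `;` does not handle semicolons inside string literals.

```python
import re

# Hypothetical patterns derived from the SQL category listed above.
_ALWAYS_DESTRUCTIVE = re.compile(
    r"^\s*(DROP\s+(TABLE|VIEW|SCHEMA|DATABASE)|TRUNCATE)\b", re.IGNORECASE
)
_DELETE = re.compile(r"^\s*DELETE\b", re.IGNORECASE)
_WHERE = re.compile(r"\bWHERE\b(.*)", re.IGNORECASE | re.DOTALL)
# A WHERE clause that matches every row, e.g. "WHERE 1=1" or "WHERE TRUE".
_TAUTOLOGY = re.compile(r"^\s*(1\s*=\s*1|TRUE)\s*$", re.IGNORECASE)


def needs_confirmation(sql: str) -> bool:
    """Return True if any statement in `sql` looks irreversible."""
    for statement in sql.split(";"):
        if not statement.strip():
            continue
        if _ALWAYS_DESTRUCTIVE.match(statement):
            return True
        if _DELETE.match(statement):
            where = _WHERE.search(statement)
            # Broad DELETE: no WHERE clause, or one that is always true.
            if where is None or _TAUTOLOGY.match(where.group(1)):
                return True
    return False
```

A caller would run this before executing generated SQL and pause for explicit approval whenever it returns `True`. A check like this only narrows the cases to ask about; when in doubt, asking remains the safer default.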
Discovers and inspects BigQuery Data Transfer Service (DTS) configurations. Use this to identify existing ingestion pipelines and extract datasource or transfer config metadata for data pipelines. Use when a user asks for ingestion scenarios while building or managing data pipelines or when a user asks to "ingest" or "add" data that may already be managed by a DTS transfer.
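The discovery step described above can be approximated outside the plugin with the `google-cloud-bigquery-datatransfer` Python client. This is a sketch under assumptions, not the skill's implementation: the helper names `summarize_transfer_config`, `list_transfer_summaries`, and `find_existing_transfers` are hypothetical, and the API call needs Application Default Credentials plus the client library installed.

```python
def summarize_transfer_config(config) -> dict:
    """Extract the metadata fields that matter for pipeline discovery."""
    return {
        "name": config.name,
        "display_name": config.display_name,
        "data_source_id": config.data_source_id,
        "destination_dataset_id": config.destination_dataset_id,
        "schedule": config.schedule,
    }


def list_transfer_summaries(project_id: str, location: str) -> list:
    """List DTS transfer configs for one project and location."""
    # Imported lazily so the pure helpers work without the client library.
    from google.cloud import bigquery_datatransfer

    client = bigquery_datatransfer.DataTransferServiceClient()
    parent = f"projects/{project_id}/locations/{location}"
    return [
        summarize_transfer_config(config)
        for config in client.list_transfer_configs(parent=parent)
    ]


def find_existing_transfers(summaries: list, destination_dataset_id: str) -> list:
    """Return transfers that already load into the given dataset."""
    return [
        s for s in summaries
        if s["destination_dataset_id"] == destination_dataset_id
    ]
```

Checking `find_existing_transfers` before adding a new ingestion step is one way to notice that the requested data is already managed by a transfer.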
Build modern data apps, dashboards, and interactive reports using either React + Vite or Streamlit. Includes optional Gemini Data Analytics chat integration for an AI-powered "chat with your data" experience.
Relevant when any of the following conditions are true:
1. User explicitly requests to build a data dashboard, data application, or visualization UI, and the UI pulls data from a GCP database (defaulting to BigQuery unless otherwise specified).
2. You need to generate a frontend web application to interact with, query, and visualize data from GCP data sources.
3. User wants to build a "chat with your data" experience or integrate the Gemini Data Analytics chat API into a web interface.
Do NOT use when any of the following conditions are true:
1. The request is for building backend-only services.
2. The request is for simple CLI scripts or command-line applications.
3. The web application is not data-centric or does not involve visualizing/querying data from GCP sources.
Automated data quality and transformation capabilities for Dataform/dbt/BigQuery pipelines. Processes data sourced from BigQuery or Cloud Storage (GCS), applying best practices for data ingestion, movement, schema mapping, and comprehensive data cleaning.
Expertise in generating clean, correct, and efficient Dataform pipeline code for BigQuery ELT. Use this when creating or modifying Dataform pipelines, actions, or source declarations, when Dataform, SQLX, or BigQuery are mentioned in a transformation, when data needs to be ingested from GCS into BigQuery via Dataform, or when setting up a new Dataform project or configuring workflow_settings.yaml.
Admin access level
Server config contains admin-level keywords
This plugin requires configuration values that are prompted when the plugin is enabled. Sensitive values are stored in your system keychain.
GCP_REGION: Region for GCP services (e.g. us-west1). Placeholder: ${user_config.GCP_REGION}
PROJECT_ID: Project ID when using the MCP toolbox for databases. Placeholder: ${user_config.PROJECT_ID}
BIGQUERY_LOCATION: Location for BigQuery datasets (e.g. US). Placeholder: ${user_config.BIGQUERY_LOCATION}
External network access
Connects to servers outside your machine
Requires secrets
Needs API keys or credentials to function
[!NOTE] This extension is currently in beta (pre-v1.0), and may see breaking changes until the first stable release (v1.0).
This plugin provides a specialized suite of skills and MCP tools for data engineers and database practitioners working on Google Cloud. It acts as an expert assistant, allowing you to use natural language prompts in your preferred coding agent to architect complex data pipelines, transform data with dbt, write Spark and BigQuery SQL notebooks, and orchestrate end-to-end workflows across the Google Cloud data ecosystem (BigQuery, Spanner, BigLake, Dataproc, etc.).
[!IMPORTANT] We Want Your Feedback! Please share your thoughts with us by opening an issue on GitHub. Your input is invaluable and helps us improve the project for everyone.
Ensure you have the following installed:
Choose the installation method for your preferred coding agent. Run the commands in your terminal.
Install the plugin directly from GitHub:
agy plugin install https://github.com/gemini-cli-extensions/data-agent-kit-starter-pack
Install the extension directly from GitHub:
gemini extensions install https://github.com/gemini-cli-extensions/data-agent-kit-starter-pack --ref 0.4.0
Run the claude command to start the agent, then follow these steps:
/plugin install data-agent-kit-starter-pack@claude-plugins-official
macOS / Linux:
CODEX_TAG="0.4.0"; curl -sSL https://raw.githubusercontent.com/gemini-cli-extensions/data-agent-kit-starter-pack/$CODEX_TAG/codex-install.sh | bash -s -- $CODEX_TAG
Windows:
$env:CODEX_TAG="0.4.0"; irm "https://raw.githubusercontent.com/gemini-cli-extensions/data-agent-kit-starter-pack/$env:CODEX_TAG/codex-install.ps1" | iex
Start the Codex agent (codex), then run:
/plugins
Use the interactive options to install the plugin with the name Data Agent Kit Starter Pack.
This extension brings a suite of specialized Skills and MCP toolboxes. While skills are ready to use upon installation, you must configure the MCP toolboxes and authenticate with Google Cloud for them to start successfully.
[!NOTE] If you use Gemini CLI, Claude Code, or Codex in your IDE (e.g., via VS Code extensions), they share the same underlying configuration and MCP servers as the CLI agents.
🐉 Specialised SRE skills for outage investigations, monitoring graphs, and post-mortems on Google Cloud Platform.
Connect to Looker and interact with your data using LookML.
Connect, query, and generate data insights for BigQuery datasets and data.
The CI/CD extension provides Gemini powered AI assisted CI/CD. It supports deployment to Cloud Run and Cloud Storage as well as creation of a robust CI/CD pipeline.
Create, connect, and interact with a Cloud SQL for PostgreSQL database and data.
npx claudepluginhub gemini-cli-extensions/data-agent-kit-starter-pack --plugin dak
Editorial "Data Engineering" bundle for Claude Code from Antigravity Awesome Skills.
Data engineering plugin - warehouse exploration, pipeline authoring, Airflow integration
Data analysis expert for SQL queries, BigQuery operations, and data insights. Use proactively for data analysis tasks and queries.
Spec-Driven Development framework for Data Engineering — 58 agents, 24 KB domains, 5-phase SDD workflow, 31 commands
Claude Code skill pack for Snowflake data platform — snowflake-sdk, SQL, Snowpark (30 skills)
BigQuery cost analysis and optimization utilities