This skill should be used when users need to create or debug Bloblang transformation scripts. Trigger when users ask about transforming data, mapping fields, parsing JSON/CSV/XML, converting timestamps, filtering arrays, or mention "bloblang", "blobl", "mapping processor", or describe any data transformation need like "convert this to that" or "transform my JSON".
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Create working, tested Bloblang transformation scripts from natural language descriptions.
Generate a Bloblang (blobl) script that correctly transforms the user's input data according to their requirements. The script MUST be tested before presenting it.
This skill requires rpk, rpk connect, python3, and jq.
See SETUP for installation instructions.
Generates category-organized Bloblang reference files in XML format. Run once at the start of each session before searching for functions/methods.
# Usage:
./resources/scripts/format-bloblang.sh
- Writes one XML file per category (e.g., functions-General.xml, methods-String_Manipulation.xml)
- Prints the output directory (capture it in the BLOBLREF_DIR variable for later use)
Generated function files have functions-<Category>.xml names and contain functions relevant to that category.
- functions-Encoding.xml - Schema registry headers
- functions-Environment.xml - Environment vars, files, timestamps, hostname
- functions-Fake_Data_Generation.xml - Fake data generation
- functions-General.xml - Bytes, counter, deleted, ksuid, nanoid, uuid, random, range, snowflake
- functions-Message_Info.xml - Batch index, content, error, metadata, span links, tracing IDs
The function XML tag format:
- name attribute - function name
- params attribute - comma-separated list of parameters with types, format <name>:<type>, or empty string if no parameters
- example XML subtag
  - summary attribute (optional) - brief description of the example
Example function definition:
<function name="random_int" params="seed:query expression, min:integer, max:integer">
Generates a pseudo-random non-negative 64-bit integer.
Use this for creating random IDs, sampling data, or generating test values.
Provide a seed for reproducible randomness, or use a dynamic seed like `timestamp_unix_nano()` for unique values per mapping instance.
Optional `min` and `max` parameters constrain the output range (both inclusive).
For dynamic ranges based on message data, use the modulo operator instead: `random_int() % dynamic_max + dynamic_min`.
<example>
root.first = random_int()
root.second = random_int(1)
root.third = random_int(max:20)
root.fourth = random_int(min:10, max:20)
root.fifth = random_int(timestamp_unix_nano(), 5, 20)
root.sixth = random_int(seed:timestamp_unix_nano(), max:20)
</example>
<example summary="Use a dynamic seed for unique random values per mapping instance.">
root.random_id = random_int(timestamp_unix_nano())
root.sample_percent = random_int(seed: timestamp_unix_nano(), min: 0, max: 100)
</example>
</function>
Generated method files have methods-<Category>.xml names and contain methods relevant to that category.
- methods-Encoding_and_Encryption.xml - Base64, compression, hashing, encryption
- methods-General.xml - Basic operations, type checking
- methods-GeoIP.xml - GeoIP lookups
- methods-JSON_Web_Tokens.xml - JWT operations
- methods-Number_Manipulation.xml - Arithmetic, rounding, formatting
- methods-Object___Array_Manipulation.xml - Filtering, mapping, sorting, merging
- methods-Parsing.xml - JSON, CSV, XML, protocol buffer parsing
- methods-Regular_Expressions.xml - Regex matching and replacement
- methods-SQL.xml - SQL operations
- methods-String_Manipulation.xml - Case, trimming, splitting, formatting
- methods-Timestamp_Manipulation.xml - Parsing, formatting, timezone conversion
- methods-Type_Coercion.xml - Type conversions
The method XML tag format:
- name attribute - method name
- params attribute - comma-separated list of parameters with types, format <name>:<type>, or empty string if no parameters
- example XML subtag
  - summary attribute (optional) - brief description of the example
Example method definition:
<method name="ts_format" params="format:string, tz:string">
Formats a timestamp into a string using the specified format layout.
<example>
root.formatted = this.timestamp.ts_format("2006-01-02T15:04:05Z07:00")
</example>
</method>
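Because every definition opens with a tag carrying name and params attributes, signatures can be pulled out with standard tools. The following is a minimal sketch, not part of the skill's scripts. It builds a small stand-in directory from the two example definitions above, and it assumes each opening tag sits at the start of its own line, as those examples suggest.

```shell
# Stand-in reference directory; in real use BLOBLREF_DIR comes from format-bloblang.sh.
BLOBLREF_DIR="$(mktemp -d)"

cat > "$BLOBLREF_DIR/functions-General.xml" <<'EOF'
<function name="random_int" params="seed:query expression, min:integer, max:integer">
Generates a pseudo-random non-negative 64-bit integer.
</function>
EOF

cat > "$BLOBLREF_DIR/methods-Timestamp_Manipulation.xml" <<'EOF'
<method name="ts_format" params="format:string, tz:string">
Formats a timestamp into a string using the specified format layout.
</method>
EOF

# Print "<kind> <name>(<params>)" for each opening tag (assumes one tag per line).
signatures="$(sed -nE \
  's/^<(function|method) name="([^"]*)" params="([^"]*)">.*/\1 \2(\3)/p' \
  "$BLOBLREF_DIR"/*.xml)"
printf '%s\n' "$signatures"
```

A compact listing like this helps confirm whether a name is a function or a method before writing it into a script.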
Lists available functions and methods without loading full files.
# List all available functions and methods by name
grep -rhE '<(function|method) name=' "$BLOBLREF_DIR"
# Search by keyword (searches names, descriptions, params, examples)
grep -ri "timestamp" "$BLOBLREF_DIR"
# Search by parameter name (e.g., find all with "format" parameter)
grep -r 'params="[^"]*format' "$BLOBLREF_DIR"
- Requires BLOBLREF_DIR set to the directory output by format-bloblang.sh
Tests a Bloblang script against input data. Executes the transformation and returns results or errors. Can be run repeatedly during iteration.
# Usage:
./resources/scripts/test-blobl.sh <target-directory>
- Requires data.json (input) and script.blobl (transformation) in the target directory
Bloblang (blobl) is Redpanda Connect's native mapping language for transforming message data. It's designed for readability and safely reshaping documents of any structure.
Assignment: Create new documents by assigning values to paths.
- root = the new document being created
- this = the input document being read
# Copy entire input
root = this
# Create specific fields
root.id = this.thing.id
root.type = "processed"
# In: {"thing":{"id":"abc123"}}
# Out: {"id":"abc123","type":"processed"}
Field Paths: Use dot notation for nested fields. Use quotes for special characters:
root.user.name = this.customer.full_name
root."foo.bar".baz = this."field with spaces"
Literals: Numbers, booleans, strings, null, arrays, and objects:
root = {
"count": 42,
"active": true,
"items": ["a", "b", "c"],
"nested": {"key": "value"}
}
Functions generate values (no target needed):
root.id = uuid_v4()
root.timestamp = now()
root.hostname = hostname()
Methods transform values (called on a target with .):
root.upper = this.name.uppercase()
root.formatted = this.date.ts_parse("2006-01-02").ts_format("Mon Jan 2")
root.sorted = this.items.sort()
Methods can be chained:
root.clean = this.text.trim().lowercase().replace_all("_", "-")
Methods require a target (called with .), while functions do not.
Check the XML reference files to determine correct usage:
# Bad: floor() is a method, not a function
root.rounded = floor(this.value) # Error: floor is not a function
# Good: Call floor() as a method on a value
root.rounded = this.value.floor()
# Bad: uuid_v4() is a function, not a method
root.id = this.uuid_v4() # Error: uuid_v4 is not a method
# Good: Call uuid_v4() as a function
root.id = uuid_v4()
Discovering Available Functions & Methods
Bloblang provides hundreds of functions and methods organized into categories. Start with these foundational categories that cover common use cases:
- functions-General.xml - Core utility functions (uuid_v4, random_int, counter, deleted, etc.)
- functions-Message_Info.xml - Message metadata access (batch index, content, error, metadata, etc.)
- methods-General.xml - Universal transformations (basic operations, type checking, etc.)
For specialized needs, consult domain-specific categories: strings (uppercase, trim, regexp), timestamps (ts_parse, ts_format), arrays (map_each, filter), objects (keys, values), encoding (base64, json), and more.
Discovery tools:
- Run format-bloblang.sh to generate category-organized XML reference files in a versioned directory
Conditionals (if/else):
root.category = if this.score >= 80 {
"high"
} else if this.score >= 50 {
"medium"
} else {
"low"
}
Pattern Matching (match):
root.sound = match this.animal {
"cat" => "meow"
"dog" => "woof"
"cow" => "moo"
_ => "unknown" # Catch-all
}
Coalescing (try multiple paths with |):
# Use first non-null value from alternative fields
root.content = this.article.body | this.comment.text | "no content"
# Try different nested paths
root.id = this.data.(primary_id | secondary_id | backup_id)
Note: Use | for alternative field paths (missing fields) and .catch() for operation failures (parse errors, type mismatches).
Deletion:
root = this
root.password = deleted() # Remove field
# Or filter entire message
root = if this.spam { deleted() }
Variables (reuse values without adding to output):
let user_id = this.user.id
let enriched = this.user.name + " (" + $user_id + ")"
root.display_name = $enriched
root.user_id = $user_id
IMPORTANT: Variables must be declared at the top level, not inside if, match, or other blocks.
# Bad: Will cause "expected }" parse error
root.age = if this.birthdate != null {
let parsed = this.birthdate.ts_parse("2006-01-02") # let not allowed here!
$parsed.ts_unix()
}
# Good: Declare variables at top level
let parsed = this.birthdate.ts_parse("2006-01-02").catch(null)
root.age = if $parsed != null {
$parsed.ts_unix()
} else {
null
}
Named Mappings (reusable scripts):
map extract_user {
root.id = this.user_id
root.name = this.full_name
root.email = this.contact.email
}
root.customer = this.customer_data.apply("extract_user")
root.vendor = this.vendor_data.apply("extract_user")
Error Handling (provide fallback values):
# Catch errors from any point in the chain
root.count = this.items.length().catch(0)
root.parsed = this.data.parse_json().catch({})
# Catch missing/null values
root.name = this.user.name.or("anonymous")
# Multi-format parsing with catch chains
# Store value in variable for reliable access in catch fallbacks
let date_str = this.date
root.parsed = $date_str.ts_parse("2006-01-02").catch(
$date_str.ts_parse("2006/01/02")
).catch(null)
IMPORTANT: When using .catch() with fallback expressions that reference this.field, store the field in a variable first.
Context references in catch chains can be unreliable:
# Risky: Context may not be preserved in catch
root.parsed = this.date.ts_parse("2006-01-02").catch(
this.date.ts_parse("2006/01/02") # this.date might not work here
)
# Safe: Store in variable first
let date_str = this.date
root.parsed = $date_str.ts_parse("2006-01-02").catch(
$date_str.ts_parse("2006/01/02") # variable reference is reliable
)
Metadata:
# Read metadata with @ or metadata()
root.topic = @kafka_topic
root.partition = @kafka_partition
# Set metadata
meta output_key = this.id
meta content_type = "application/json"
Safe field access with fallbacks
# Bad: Will fail if user or name is missing
root.name = this.user.name
# Good: Provides fallback chain
root.name = this.user.name.or("anonymous")
root.name = this.(user.name | profile.display_name | "unknown")
Safe collection operations
# Bad: Will fail on empty array
root.first = this.items.index(0)
# Good: Handles empty arrays
root.first = if this.items.length() > 0 { this.items.index(0) } else { null }
root.first = this.items.index(0).catch(null)
Safe parsing with error recovery
# Bad: Will fail on invalid JSON
root.data = this.payload.parse_json()
# Good: Provides fallback on parse failure
root.data = this.payload.parse_json().catch({})
root.data = this.payload.parse_json().catch(this.payload) # Keep original on failure
Safe type coercion
# Bad: Assumes field is already a string
root.id = this.user_id.uppercase()
# Good: Converts to string first
root.id = this.user_id.string().uppercase()
root.count = this.total.number().catch(0)
IMPORTANT: Arithmetic operations on null values fail silently.
Always check for null or use .catch() to provide fallbacks:
# Bad: Fails silently if price is null
root.total = this.price * this.quantity
# Good: Check for null before operations
root.total = if this.price != null && this.quantity != null {
this.price * this.quantity
} else {
null
}
# Also good: Use catch to handle null gracefully
root.total = (this.price * this.quantity).catch(null)
Understand - Analyze input structure, desired output, and required transformations
Discover - Generate category files to versioned directory (capture BLOBLREF_DIR from script output), identify relevant categories, read specific category XML files to find actual Bloblang functions/methods (NEVER guess)
Develop - Write valid Bloblang syntax using discovered functions (root for output, this for input, chain methods, handle nulls)
Validate - Test script with sample input data, verify output matches expectations, iterate on errors until working
Deliver - Write the working script and example input to files (script.blobl, data.json), present the tested output, document any assumptions
Critical: Never present untested code. All scripts must be validated before showing to user.
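The Validate and Deliver steps above can be sketched as a short shell session. This is an illustration under stated assumptions, not the skill's own tooling: the temporary directory is a hypothetical choice, the input and mapping are taken from the assignment example earlier in this document, and the helper is only invoked when it exists at the path shown in its usage line.

```shell
# Hypothetical working directory; any writable path works.
target_dir="$(mktemp -d)"

# Sample input, copied from the assignment example in this document.
cat > "$target_dir/data.json" <<'EOF'
{"thing":{"id":"abc123"}}
EOF

# Candidate mapping to validate before presenting it.
cat > "$target_dir/script.blobl" <<'EOF'
root.id = this.thing.id
root.type = "processed"
EOF

# Run the test helper only when it is present, so the sketch is safe to run anywhere.
if [ -x ./resources/scripts/test-blobl.sh ]; then
  ./resources/scripts/test-blobl.sh "$target_dir"
else
  echo "test-blobl.sh not found; files prepared in $target_dir"
fi
```

When the helper reports an error, edit script.blobl in place and rerun the same command until the output matches the expected document.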