From dominodatalab
Access external LLM providers through Domino AI Gateway — a secure proxy with centralized API key management, usage monitoring, and compliance. Supports OpenAI, AWS Bedrock, Azure OpenAI, Anthropic.
Slash command: /dominodatalab:ai-gateway
This skill helps users work with Domino AI Gateway - a secure proxy for accessing external Large Language Model (LLM) providers with centralized management, monitoring, and compliance.
Activate this skill when users want to:
Domino AI Gateway provides:
| Provider | Models |
|---|---|
| OpenAI | GPT-4, GPT-4 Turbo, GPT-3.5 |
| AWS Bedrock | Claude, Titan, Llama 2 |
| Azure OpenAI | GPT-4, GPT-3.5 |
| Anthropic | Claude 3, Claude 2 |
| Google Vertex AI | PaLM, Gemini |
| Cohere | Command, Embed |
# Create endpoint via Domino API
import os
import requests

# Fetch an access token from the local token endpoint
TOKEN = requests.get("http://localhost:8899/access-token").text.strip()
BASE = os.environ["DOMINO_API_HOST"]

response = requests.post(
    f"{BASE}/api/aigateway/v1/endpoints",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "name": "openai-gpt4",
        "provider": "openai",
        "model": "gpt-4",
        "providerApiKey": "sk-..."
    }
)
response.raise_for_status()  # Fail loudly if the endpoint was not created
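The request above embeds the provider key directly in the source. A minimal sketch of building the same body with the key read from the environment instead; `build_endpoint_payload` and the variable name `PROVIDER_API_KEY` are hypothetical, and the field names are simply those used in the request above, so confirm them against the cluster swagger before relying on them.

```python
import os


def build_endpoint_payload(name, provider, model, key_env="PROVIDER_API_KEY"):
    """Assemble the endpoint-creation body without hardcoding the provider key.

    key_env is a hypothetical environment variable name, not a Domino convention.
    """
    api_key = os.environ.get(key_env)
    if not api_key:
        raise RuntimeError(f"Set {key_env} before creating the endpoint")
    return {
        "name": name,
        "provider": provider,
        "model": model,
        "providerApiKey": api_key,
    }
```

The returned dictionary can be passed as the `json=` argument of the `requests.post` call shown above.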
AI Gateway provides an OpenAI-compatible interface:
from openai import OpenAI

# Configure client to use AI Gateway
client = OpenAI(
    api_key="not-needed",  # Handled by AI Gateway
    base_url="https://your-domino.com/api/aigateway/v1/openai"
)

# Use like standard OpenAI
response = client.chat.completions.create(
    model="openai-gpt4",  # Your endpoint name
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ]
)
print(response.choices[0].message.content)
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="openai-gpt4",  # Endpoint name
    openai_api_key="not-needed",
    openai_api_base="https://your-domino.com/api/aigateway/v1/openai"
)
response = llm.invoke("What is machine learning?")
print(response.content)
import os
import requests

# Fetch an access token from the local token endpoint
TOKEN = requests.get("http://localhost:8899/access-token").text.strip()
BASE = os.environ["DOMINO_API_HOST"]

response = requests.post(
    f"{BASE}/api/aigateway/v1/chat/completions",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {TOKEN}",
    },
    json={
        "model": "openai-gpt4",
        "messages": [{"role": "user", "content": "Hello!"}]
    }
)
response.raise_for_status()  # Surface 401/429 and similar errors before parsing
result = response.json()
print(result["choices"][0]["message"]["content"])
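Indexing straight into `result["choices"]` raises an unhelpful `KeyError` when the gateway returns an error body. A small sketch of a more defensive extraction follows; `extract_reply` is a hypothetical helper, and the top-level `"error"` key is an assumption based on the OpenAI response convention rather than anything confirmed for AI Gateway.

```python
def extract_reply(result):
    """Return the assistant text from an OpenAI-style chat completion dict."""
    # Assumption: errors arrive under a top-level "error" key, as with OpenAI.
    if "error" in result:
        raise RuntimeError(f"Gateway returned an error: {result['error']}")
    choices = result.get("choices") or []
    if not choices:
        raise ValueError("Response contained no choices")
    return choices[0]["message"]["content"]
```

With this in place, the final line of the request above would become `print(extract_reply(result))`.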
Configure who can use each endpoint:
# Via UI: Endpoints > Gateway LLMs > Download logs
# Logs include:
# - Timestamp
# - User
# - Model
# - Input/Output tokens
# - Response time
# - Status
{
  "timestamp": "2024-01-15T10:30:00Z",
  "user": "user@example.com",
  "endpoint": "openai-gpt4",
  "model": "gpt-4",
  "inputTokens": 150,
  "outputTokens": 200,
  "durationMs": 1500,
  "status": "success"
}
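Log entries shaped like the one above can be rolled up into per-user, per-endpoint token totals. The sketch below assumes the downloaded logs contain one JSON object per line, which the source does not confirm; `summarize_usage` is a hypothetical helper.

```python
import json
from collections import defaultdict


def summarize_usage(log_lines):
    """Sum tokens and request counts per (user, endpoint) from JSON log lines."""
    totals = defaultdict(
        lambda: {"inputTokens": 0, "outputTokens": 0, "requests": 0}
    )
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # Skip blank lines between entries
        entry = json.loads(line)
        key = (entry["user"], entry["endpoint"])
        totals[key]["inputTokens"] += entry.get("inputTokens", 0)
        totals[key]["outputTokens"] += entry.get("outputTokens", 0)
        totals[key]["requests"] += 1
    return dict(totals)
```

Passing an open file handle works as well as a list of strings, since both iterate line by line.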
AI Gateway tracks token usage per:
Admins can configure:
# Define endpoint once
LLM_ENDPOINT = "production-gpt4"

# Use throughout code
response = client.chat.completions.create(
    model=LLM_ENDPOINT,
    messages=[...]
)
import time
from openai import RateLimitError

def call_llm_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="openai-gpt4",
                messages=messages
            )
        except RateLimitError:
            # Exponential backoff: wait 1s, 2s, 4s, ... then give up
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
            else:
                raise
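When several clients hit a rate limit at once, fixed backoff intervals make them retry in lockstep. A possible generalization of the retry loop above adds a delay cap and random jitter; `retry_with_backoff`, its parameters, and the injectable `sleep` argument are all illustrative choices, not part of AI Gateway or the OpenAI client.

```python
import random
import time


def retry_with_backoff(func, is_retryable, max_retries=3, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call func(), retrying retryable errors with capped, jittered backoff."""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as exc:
            # Re-raise immediately on non-retryable errors or the last attempt
            if not is_retryable(exc) or attempt == max_retries - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 50% jitter so concurrent clients spread out
            sleep(delay + random.uniform(0, delay / 2))
```

With the client from earlier, a call might look like `retry_with_backoff(lambda: client.chat.completions.create(model="openai-gpt4", messages=messages), lambda e: isinstance(e, RateLimitError))`.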
import logging

logger = logging.getLogger(__name__)

def query_llm(prompt):
    # Log sizes only, never the prompt text itself
    logger.info("Querying LLM with prompt length: %d", len(prompt))
    response = client.chat.completions.create(
        model="openai-gpt4",
        messages=[{"role": "user", "content": prompt}]
    )
    logger.info("Response tokens: %d", response.usage.total_tokens)
    return response.choices[0].message.content
# Streaming response
stream = client.chat.completions.create(
    model="openai-gpt4",
    messages=[{"role": "user", "content": "Write a long story"}],
    stream=True
)
for chunk in stream:
    # Some providers send chunks with an empty choices list; skip those
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
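If the full text is needed after streaming finishes, the deltas can be accumulated instead of only printed. A short sketch, assuming chunks follow the OpenAI client's shape (`chunk.choices[0].delta.content`); `collect_stream` is a hypothetical helper.

```python
def collect_stream(stream):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # Skip chunks that carry no choices
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)
```

It accepts any iterable of chunk-like objects, so it can be exercised with stand-ins without calling the gateway.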
Error: 401 Unauthorized
Error: 429 Too Many Requests
Error: Model 'model-name' not found
Before writing or verifying any API call, use the cluster swagger to confirm current endpoint paths and field names. Use public docs for workflow context and field explanations.
Get the cluster base URL: $DOMINO_API_HOST (injected by Domino into every workspace, job, and app).
Fetch the swagger spec:
# No authentication required for the public API spec
curl "$DOMINO_API_HOST/assets/public-api.json"
# Browser UI: $DOMINO_API_HOST/assets/lib/swagger-ui/index.html?url=/assets/public-api.json#/
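Once the spec is downloaded, the relevant operations can be listed programmatically rather than read by eye. The sketch below assumes the file is a standard Swagger/OpenAPI document with a top-level `paths` object; `fetch_spec` and `find_operations` are hypothetical helpers, and the search keyword `"gateway"` is a guess at how the paths are named.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def fetch_spec(base_url):
    """Download the public API spec from a Domino cluster."""
    with urllib.request.urlopen(f"{base_url}/assets/public-api.json") as resp:
        return json.load(resp)


def find_operations(spec, keyword):
    """List (METHOD, path) pairs whose path contains keyword, case-insensitively."""
    found = []
    for path, item in spec.get("paths", {}).items():
        if keyword.lower() not in path.lower():
            continue
        for method in item:
            # Path items may also hold non-method keys such as "parameters"
            if method.lower() in HTTP_METHODS:
                found.append((method.upper(), path))
    return sorted(found)


# Example usage inside a Domino workspace:
#   import os
#   spec = fetch_spec(os.environ["DOMINO_API_HOST"])
#   for method, path in find_operations(spec, "gateway"):
#       print(method, path)
```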
Public docs (workflow context and field explanations):
npx claudepluginhub anthropics/claude-plugins-official --plugin dominodatalab