Set up GPU sandboxes for interpretability research. Use when writing setup.py scripts with Sandbox, SandboxConfig, ModelConfig, or create_notebook_session. Provides the exact API for Modal GPU environments - MUST read before writing any sandbox setup code.
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Seer provides GPU-accelerated sandboxed environments for running interpretability experiments. You can set up remote environments with models pre-loaded, connect to Jupyter notebooks, and run experiments interactively.
DO NOT use start_new_session() directly. That tool is only for local Jupyter sessions without GPU.
For GPU sandboxes, you MUST:
1. Write a setup.py script that uses the src library to create a Modal sandbox
2. Run it with: uv run --with "seer @ git+https://github.com/ajobi-uhc/seer.git" python setup.py
3. Parse the JSON output for session_id and jupyter_url
4. Call attach_to_session(session_id, jupyter_url) to connect
# setup.py
from src.environment import Sandbox, SandboxConfig, ExecutionMode, ModelConfig
from src.workspace import Workspace
from src.execution import create_notebook_session
import json
config = SandboxConfig(
execution_mode=ExecutionMode.NOTEBOOK,
gpu="A100",
models=[ModelConfig(name="google/gemma-2-9b-it")],
python_packages=["torch", "transformers", "accelerate"],
)
sandbox = Sandbox(config).start()
session = create_notebook_session(sandbox, Workspace())
print(json.dumps({
"session_id": session.session_id,
"jupyter_url": session.jupyter_url,
"sandbox_id": sandbox.sandbox_id, # IMPORTANT: Save this for sandbox management
}))
uv run --with "seer @ git+https://github.com/ajobi-uhc/seer.git" python setup.py
# Output: {"session_id": "abc123", "jupyter_url": "https://...", "sandbox_id": "sb-xyz..."}
# Now use MCP tool to connect
attach_to_session(session_id="abc123", jupyter_url="https://...")
Only after attach_to_session() succeeds can you use execute_code().
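For example, a quick sanity check right after attaching (a minimal sketch; adapt the code to your experiment):
execute_code(session_id, """
import torch
print(torch.cuda.is_available())           # GPU visible in the sandbox
print(type(model).__name__, model.device)  # pre-loaded model is ready to use
""")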
Save the sandbox_id - you'll need it to manage the sandbox (terminate, snapshot, exec commands, etc.) using the modal-sandbox MCP tools.
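For example, using the modal-sandbox MCP tools documented later in this reference (a sketch; the sandbox_id placeholder comes from your setup script's JSON output):
get_gpu_status(sandbox_id="sb-xyz...")                                   # check GPU utilization
snapshot_sandbox(sandbox_id="sb-xyz...", description="before teardown")  # optional snapshot
terminate_sandbox(sandbox_id="sb-xyz...")                                # clean up when done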
Configuration for a Modal sandbox environment.
@dataclass
class SandboxConfig:
gpu: Optional[str] = None # "A100", "H100", "A10G", "L4", "T4", None for CPU
gpu_count: int = 1 # Number of GPUs (for multi-GPU setups)
execution_mode: ExecutionMode = ExecutionMode.NOTEBOOK # NOTEBOOK or CLI
models: List[ModelConfig] = [] # Models to pre-load
python_packages: List[str] = [] # pip packages
system_packages: List[str] = [] # apt packages
secrets: List[str] = [] # Env var names to pass from local env
repos: List[RepoConfig] = [] # Git repos to clone
env: Dict[str, str] = {} # Environment variables
timeout: int = 3600 # Timeout in seconds
local_files: List[Tuple[str, str]] = [] # (local_path, sandbox_path)
local_dirs: List[Tuple[str, str]] = [] # (local_dir, sandbox_dir)
debug: bool = False # Start code-server for debugging
GPU Options:
"H100" - NVIDIA H100 (80GB, fastest, best for 70B+ models)"A100-80GB" - NVIDIA A100 80GB (use for large models, 30B+)"A100-40GB" - NVIDIA A100 40GB (good default for most models)"A10G" - NVIDIA A10G (24GB, good for 7B-13B models)"L4" - NVIDIA L4 (24GB, cost-effective)"T4" - NVIDIA T4 (16GB, cheapest, good for small models)None - CPU onlyWhich GPU to use:
Multi-GPU Example:
config = SandboxConfig(
gpu="A100",
gpu_count=2, # 2x A100s for large models
models=[ModelConfig(name="meta-llama/Llama-3-70b-hf")],
)
Example:
config = SandboxConfig(
gpu="A100",
execution_mode=ExecutionMode.NOTEBOOK,
models=[ModelConfig(name="google/gemma-2-9b-it")],
python_packages=["torch", "transformers", "accelerate", "matplotlib"],
system_packages=["git"],
secrets=["HF_TOKEN", "OPENAI_API_KEY"],
env={"CUSTOM_VAR": "value"},
timeout=7200, # 2 hours
)
Configuration for a model to load in the sandbox.
@dataclass
class ModelConfig:
name: str # HuggingFace model ID (REQUIRED)
var_name: str = "model" # Variable name in notebook
hidden: bool = False # Hide details from agent
is_peft: bool = False # Is PEFT adapter
base_model: Optional[str] = None # Base model for PEFT
IMPORTANT: These are the ONLY valid parameters. Do not add load_kwargs, dtype, device_map, quantization, or any other parameters - they don't exist. Model loading configuration is handled automatically by the sandbox.
Examples:
Basic model:
ModelConfig(name="google/gemma-2-9b-it")
Multiple models with custom names:
models=[
ModelConfig(name="google/gemma-2-9b-it", var_name="model_a"),
ModelConfig(name="meta-llama/Llama-2-7b-hf", var_name="model_b"),
]
PEFT adapter (for investigating fine-tuned models):
ModelConfig(
name="user/gemma-adapter",
base_model="google/gemma-2-9b-it",
is_peft=True,
hidden=True # Hide the adapter name from the agent
)
Configuration for cloning Git repositories.
@dataclass
class RepoConfig:
url: str # Git URL (e.g., "owner/repo" or full URL)
dockerfile: Optional[str] = None # Path to Dockerfile in repo
install: bool = False # pip install -e the repo
Example:
repos=[RepoConfig(url="anthropics/circuits", install=True)]
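A fuller sketch (the repository URL and Dockerfile path here are hypothetical) showing a full URL plus the dockerfile option:
repos=[
    RepoConfig(
        url="https://github.com/example-org/interp-tools",  # hypothetical repo; full URLs also work
        dockerfile="docker/Dockerfile",                      # optional: build from a Dockerfile in the repo
        install=True,                                        # pip install -e the cloned repo
    )
]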
The Sandbox class manages GPU environments on Modal.
sandbox = Sandbox(config)
# Start the sandbox (required before any other operations)
sandbox.start(name="my-sandbox") # Returns self for chaining
# Execute shell commands
output = sandbox.exec("pip list")
output = sandbox.exec("nvidia-smi", timeout=30)
# Execute Python code directly
result = sandbox.exec_python("import torch; print(torch.cuda.is_available())")
# Write files to sandbox
sandbox.write_file("/workspace/script.py", "print('hello')")
# Create directories
sandbox.ensure_dir("/workspace/data/outputs")
# Snapshot current state (for resuming later)
snapshot = sandbox.snapshot("after training")
# Terminate (optionally save snapshot first)
sandbox.terminate()
# or
snapshot = sandbox.terminate(save_snapshot=True, snapshot_description="final state")
# Restore from snapshot
sandbox2 = Sandbox.from_snapshot(snapshot, config)
sandbox.jupyter_url # Jupyter server URL (if notebook mode)
sandbox.code_server_url # VS Code server URL (if debug=True)
sandbox.sandbox_id # Modal sandbox ID
sandbox.model_handles # List of ModelHandle objects
sandbox.repo_handles # List of RepoHandle objects
sandbox.modal_sandbox # Raw modal.Sandbox object (advanced)
Returned by create_notebook_session(). Represents a live Jupyter kernel.
session.session_id # Unique session ID (use with MCP tools)
session.jupyter_url # Jupyter server URL
session.sandbox # Parent Sandbox object
session.model_info_text # Formatted string describing loaded models
session.mcp_config # MCP server config dict for connecting agents
# Execute code in notebook kernel
result = session.exec("print('hello')")
result = session.exec("x = 42", hidden=True) # Hidden from notebook
# Execute a Python file
result = session.exec_file("script.py")
# Apply workspace (usually done automatically)
session.setup(workspace)
Libraries are Python files that get injected into the execution environment.
from src.workspace import Library
# From a single Python file
lib = Library.from_file("utils.py")
lib = Library.from_file("helpers.py", name="my_helpers") # Custom import name
# From a directory (Python package)
lib = Library.from_directory("my_package/") # Must have __init__.py
# From a skill directory (SKILL.md format)
lib = Library.from_skill_dir("skills/steering-hook/") # Loads code.py
# Manual construction
lib = Library(
name="tools",
files={"tools.py": "def helper(): pass"},
docs="Helper utilities for experiments",
)
lib.name # Import name
lib.files # Dict of filename -> source code
lib.docs # Documentation string
lib.is_single_file # True if single .py file (not package)
Workspace bundles libraries and configuration for a session.
from src.workspace import Workspace, Library
workspace = Workspace(
libraries=[
Library.from_file("steering_hook.py"),
Library.from_file("extract_activations.py"),
],
skills=[], # Skill objects to install
skill_dirs=[], # Paths to skill directories
local_files=[], # (local_path, workspace_path) for files
local_dirs=[], # (local_path, workspace_path) for directories
custom_init_code="", # Code to run during setup
preload_models=True, # Whether to load models into kernel
hidden_model_loading=True, # Hide model loading cells from notebook
)
# Get combined documentation from all libraries
docs = workspace.get_library_docs()
Standard GPU sandbox with Jupyter notebook. Use this for most experiments.
from src.environment import Sandbox, SandboxConfig, ExecutionMode, ModelConfig
from src.workspace import Workspace, Library
from src.execution import create_notebook_session
import json
config = SandboxConfig(
execution_mode=ExecutionMode.NOTEBOOK,
gpu="A100",
models=[ModelConfig(name="google/gemma-2-9b-it")],
python_packages=["torch", "transformers", "accelerate"],
)
sandbox = Sandbox(config).start()
workspace = Workspace(
libraries=[
Library.from_file("path/to/helper.py"),
]
)
session = create_notebook_session(sandbox, workspace)
# Output connection info as JSON
print(json.dumps({
"session_id": session.session_id,
"jupyter_url": session.jupyter_url,
"model_info": session.model_info_text,
}))
Isolated GPU sandbox with RPC interface. Use when you need to expose GPU functions to local code.
from src.environment import ScopedSandbox, SandboxConfig, ModelConfig
from src.workspace import Workspace
from src.execution import create_local_session
config = SandboxConfig(
gpu="A100",
models=[ModelConfig(name="google/gemma-2-9b")],
python_packages=["torch", "transformers"],
)
scoped = ScopedSandbox(config)
scoped.start()
# Serve an interface file via RPC
interface_lib = scoped.serve(
"path/to/interface.py",
expose_as="library", # or "mcp" for MCP server
name="model_tools"
)
workspace = Workspace(libraries=[interface_lib])
session = create_local_session(
workspace=workspace,
workspace_dir="./workspace",
name="experiment"
)
# Now the interface functions are available via RPC
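Presumably the served functions can then be called through the library name, the same way other workspace libraries are imported (an assumption based on the Library mechanism described below; the exact calling convention may differ):
# In code running inside the local session (assumed usage)
import model_tools                       # the name passed to scoped.serve(...)
result = model_tools.get_model_info()    # hypothetical @expose'd function; runs on the GPU via RPC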
ScopedSandbox Methods:
# Start with optional workspace (libraries the RPC code needs)
scoped.start(workspace=Workspace(libraries=[...]), name="my-sandbox")
# Serve code via RPC with different expose modes
lib = scoped.serve("interface.py", expose_as="library", name="tools") # Returns Library
mcp = scoped.serve("interface.py", expose_as="mcp", name="tools") # Returns MCP config dict
prompt = scoped.serve("interface.py", expose_as="prompt", name="tools") # Returns prompt string
skill = scoped.serve("interface.py", expose_as="skill", name="tools") # Returns Skill object
# Debug RPC server issues
scoped.show_rpc_logs(lines=100) # Print recent RPC server logs
Models are PRE-LOADED into the notebook kernel namespace.
When you run create_notebook_session(), the library:
- Loads each model into the kernel as model, tokenizer (or the custom var_name you specified)
- Generates session.model_info_text describing what's available
The session.model_info_text contains critical information:
### Pre-loaded Models
The following models are already loaded in the kernel:
**model** (google/gemma-2-9b-it)
- Type: Gemma2ForCausalLM
- Device: cuda:0
- Parameters: 9.24B
**tokenizer** (google/gemma-2-9b-it)
- Type: GemmaTokenizerFast
- Vocab size: 256,000
**IMPORTANT:** Do NOT reload these models. They are already loaded and ready to use.
execute_code(session_id, """
import torch
# Models are ALREADY loaded - just use them!
print(f"Model device: {model.device}")
print(f"Model type: {type(model).__name__}")
# Use directly
inputs = tokenizer("Hello world", return_tensors="pt").to(model.device)
outputs = model(**inputs)
print(f"Logits shape: {outputs.logits.shape}")
""")
# ❌ Don't load models manually!
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(...) # WRONG!
# The models are already loaded. Just use `model` and `tokenizer` directly.
When you load multiple models:
config = SandboxConfig(
models=[
ModelConfig(name="google/gemma-2-9b-it", var_name="gemma"),
ModelConfig(name="meta-llama/Llama-2-7b-hf", var_name="llama"),
]
)
Both will be pre-loaded and available:
execute_code(session_id, """
# Use both models
gemma_output = gemma.generate(**gemma_inputs)
llama_output = llama.generate(**llama_inputs)
""")
When using ScopedSandbox, you create an interface file that exposes functions via RPC. This lets you run functions on the GPU while your main code runs locally.
"""interface.py - Functions that run on GPU via RPC"""
from transformers import AutoModel, AutoTokenizer
import torch
# get_model_path() is injected by the RPC server
model_path = get_model_path("google/gemma-2-9b")
model = AutoModel.from_pretrained(model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)
@expose # @expose decorator is also injected by RPC server
def get_model_info() -> dict:
"""Get basic model information."""
config = model.config
return {
"num_layers": config.num_hidden_layers,
"hidden_size": config.hidden_size,
"vocab_size": config.vocab_size,
"device": str(model.device),
}
@expose
def analyze_text(text: str) -> dict:
"""Analyze text using the model."""
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
outputs = model(**inputs, output_hidden_states=True)
# Return simple types (dict, list, str, int, float, bool)
return {
"text": text,
"num_tokens": len(inputs.input_ids[0]),
"logits_shape": list(outputs.logits.shape),
}
- get_model_path(model_name): Injected helper to get the local model path
- @expose: Decorator injected by the RPC server to expose functions
"""interface.py - Stateful interface with conversation history"""
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_path = get_model_path("google/gemma-2-9b-it")
model = AutoModelForCausalLM.from_pretrained(
model_path,
torch_dtype=torch.float16,
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)
# State (persists across calls)
conversation_history = []
@expose
def send_message(message: str, max_tokens: int = 512) -> dict:
"""Send message and get response."""
global conversation_history
# Add to history
conversation_history.append({"role": "user", "content": message})
# Format prompt
prompt = tokenizer.apply_chat_template(
conversation_history,
tokenize=False,
add_generation_prompt=True
)
# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=max_tokens,
temperature=0.7,
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
# Extract just the new response
response = response[len(prompt):].strip()
# Add to history
conversation_history.append({"role": "assistant", "content": response})
return {
"response": response,
"history_length": len(conversation_history),
}
@expose
def reset_conversation() -> str:
"""Reset conversation history."""
global conversation_history
conversation_history = []
return "Conversation reset"
@expose
def get_history() -> list:
"""Get full conversation history."""
return conversation_history.copy()
For advanced use cases, you can create sophisticated interfaces with multiple functions and state:
"""conversation_interface.py - Full conversation management via RPC"""
from typing import Optional
import asyncio
# Import other local files (they're all in /root/ in the sandbox)
from target_agent import TargetAgent
_target: Optional[TargetAgent] = None
def get_target(model: str = "openai/gpt-4o-mini") -> TargetAgent:
"""Get or create singleton target."""
global _target
if _target is None:
_target = TargetAgent(model=model)
return _target
@expose
def initialize_target(system_message: str) -> str:
"""Initialize target with system prompt."""
target = get_target()
asyncio.run(target.initialize(system_message))
return "Target initialized"
@expose
def send_to_target(message: str) -> dict:
"""Send message to target and get response."""
target = get_target()
response = asyncio.run(target.send_message(message))
return {
"type": response["type"],
"content": response.get("content", ""),
"tool_calls": response.get("tool_calls", []),
}
See experiments/petri-style-harness/conversation_interface.py for a complete example.
Libraries are Python files that get injected into the sandbox environment.
From file:
from src.workspace import Library
lib = Library.from_file("path/to/helper.py")
From code string:
code = '''
def my_helper(x):
return x * 2
'''
lib = Library.from_code(code, name="helpers")
From RPC interface (ScopedSandbox only):
interface_lib = scoped.serve(
"path/to/interface.py",
expose_as="library",
name="model_tools"
)
from src.workspace import Workspace, Library
workspace = Workspace(
libraries=[
Library.from_file("my_helpers.py"), # Your custom helper files
]
)
session = create_notebook_session(sandbox, workspace)
Now these libraries are importable in the notebook:
execute_code(session_id, """
import my_helpers
# Use your library
result = my_helpers.analyze(model, tokenizer, "test input")
""")
"""Basic steering vectors experiment on Gemma."""
import asyncio
from pathlib import Path
from src.environment import Sandbox, SandboxConfig, ExecutionMode, ModelConfig
from src.workspace import Workspace, Library
from src.execution import create_notebook_session
import json
async def main():
# Configure sandbox
config = SandboxConfig(
execution_mode=ExecutionMode.NOTEBOOK,
gpu="A100",
models=[ModelConfig(name="google/gemma-2-9b-it")],
python_packages=["torch", "transformers", "accelerate", "matplotlib", "numpy"],
)
# Start sandbox (takes ~5min first time, <1min after)
sandbox = Sandbox(config).start()
# Create notebook session
session = create_notebook_session(sandbox, Workspace())
# Output connection info
print(json.dumps({
"session_id": session.session_id,
"jupyter_url": session.jupyter_url,
"model_info": session.model_info_text,
}))
if __name__ == "__main__":
asyncio.run(main())
After running this script, parse the JSON output and connect:
attach_to_session(
session_id="<from output>",
jupyter_url="<from output>"
)
Then execute experiments:
execute_code(session_id, """
import torch
from steering_hook import create_steering_hook
# Model is already loaded
print(f"Model: {type(model).__name__} on {model.device}")
# Extract steering vector from contrast pair
from extract_activations import get_layer_activations
positive_text = "I strongly agree with your perspective."
negative_text = "I disagree with your perspective."
pos_acts = get_layer_activations(model, tokenizer, positive_text, layer=20)
neg_acts = get_layer_activations(model, tokenizer, negative_text, layer=20)
steering_vector = pos_acts - neg_acts
# Test steering
test_prompt = "What do you think about this idea?"
inputs = tokenizer(test_prompt, return_tensors="pt").to(model.device)
# Generate with steering
with create_steering_hook(model, layer_idx=20, vector=steering_vector, strength=2.0):
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"Steered response: {response}")
""")
"""Investigate hidden preferences in a fine-tuned model."""
import asyncio
from pathlib import Path
from src.environment import Sandbox, SandboxConfig, ExecutionMode, ModelConfig
from src.workspace import Workspace, Library
from src.execution import create_notebook_session
import json
async def main():
example_dir = Path(__file__).parent
toolkit = example_dir.parent / "toolkit"
# Configure with PEFT adapter (hidden from agent)
config = SandboxConfig(
gpu="A100",
execution_mode=ExecutionMode.NOTEBOOK,
models=[ModelConfig(
name="user/gemma-adapter-secret-preference",
base_model="google/gemma-2-9b-it",
is_peft=True,
hidden=True # Agent won't know which adapter
)],
python_packages=["torch", "transformers", "accelerate", "datasets", "peft"],
secrets=["HF_TOKEN"],
)
sandbox = Sandbox(config).start()
workspace = Workspace(
libraries=[
Library.from_file(toolkit / "steering_hook.py"),
Library.from_file(toolkit / "extract_activations.py"),
Library.from_file(toolkit / "generate_response.py"),
]
)
session = create_notebook_session(sandbox, workspace)
# Include research methodology
task = (example_dir / "task.md").read_text()
methodology = (toolkit / "research_methodology.md").read_text()
print(json.dumps({
"session_id": session.session_id,
"jupyter_url": session.jupyter_url,
"model_info": session.model_info_text,
"task": task,
"methodology": methodology,
}))
if __name__ == "__main__":
asyncio.run(main())
"""Expose GPU functions via RPC interface."""
import asyncio
from pathlib import Path
from src.environment import ScopedSandbox, SandboxConfig, ModelConfig
from src.workspace import Workspace
from src.execution import create_local_session
import json
async def main():
example_dir = Path(__file__).parent
# Create scoped sandbox with GPU
config = SandboxConfig(
gpu="A100",
models=[ModelConfig(name="google/gemma-2-9b")],
python_packages=["torch", "transformers", "anthropic", "openai"],
secrets=["ANTHROPIC_API_KEY", "OPENAI_API_KEY"],
)
scoped = ScopedSandbox(config)
scoped.start()
# Serve interface as MCP server
interface_lib = scoped.serve(
str(example_dir / "conversation_interface.py"),
expose_as="mcp",
name="conversation_tools"
)
# Also upload supporting files
# (they'll be in /root/ alongside interface.py)
scoped.upload_file(str(example_dir / "target_agent.py"))
scoped.upload_file(str(example_dir / "prompts.py"))
# Create local session with MCP tools
workspace = Workspace(libraries=[interface_lib])
session = create_local_session(
workspace=workspace,
workspace_dir=str(example_dir / "workspace"),
name="multi-agent-experiment"
)
print(json.dumps({
"status": "ready",
"interface": "conversation_tools (MCP)",
"workspace_dir": str(example_dir / "workspace"),
}))
# Now you can use the MCP tools from your local code
if __name__ == "__main__":
asyncio.run(main())
"""Experiment using code from an external repository."""
import asyncio
from src.environment import Sandbox, SandboxConfig, ExecutionMode, ModelConfig, RepoConfig
from src.workspace import Workspace
from src.execution import create_notebook_session
import json
async def main():
config = SandboxConfig(
execution_mode=ExecutionMode.NOTEBOOK,
gpu="A100",
models=[ModelConfig(name="google/gemma-2-9b-it")],
python_packages=["torch", "transformers", "accelerate"],
system_packages=["git"],
repos=[
RepoConfig(
url="anthropics/transformer-circuits",
install=True # Run pip install -e on the repo
)
],
)
sandbox = Sandbox(config).start()
session = create_notebook_session(sandbox, Workspace())
print(json.dumps({
"session_id": session.session_id,
"jupyter_url": session.jupyter_url,
}))
if __name__ == "__main__":
asyncio.run(main())
Common utility code patterns you can write directly in notebooks:
# Activation steering via forward hook
import torch
from contextlib import contextmanager
@contextmanager
def steering_hook(model, layer_idx, vector, strength=1.0):
"""Add steering vector to residual stream at specified layer."""
def hook(module, input, output):
# output is (hidden_states, ...) tuple
hidden = output[0]
hidden[:, :, :] = hidden + strength * vector.to(hidden.device)
return (hidden,) + output[1:]
# Get the layer
layer = model.model.layers[layer_idx]
handle = layer.register_forward_hook(hook)
try:
yield
finally:
handle.remove()
# Usage
with steering_hook(model, layer_idx=20, vector=steering_vec, strength=2.0):
outputs = model.generate(**inputs)
# Get activations from a specific layer
def get_layer_activations(model, tokenizer, text, layer):
"""Extract residual stream activations from a layer."""
activations = []
def hook(module, input, output):
activations.append(output[0].detach())
handle = model.model.layers[layer].register_forward_hook(hook)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
model(**inputs)
handle.remove()
# Return last token's activation
return activations[0][0, -1, :]
# Simple generation helper
def generate_response(model, tokenizer, prompt, max_tokens=256, temperature=0.7):
"""Generate text from prompt."""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=max_tokens,
temperature=temperature,
do_sample=temperature > 0,
pad_token_id=tokenizer.eos_token_id,
)
return tokenizer.decode(outputs[0], skip_special_tokens=True)
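For example, a quick check of the last helper (assumes the pre-loaded model and tokenizer from the session):
print(generate_response(model, tokenizer, "Explain steering vectors in one sentence.", max_tokens=64))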
When: You want to quickly test something on a model.
config = SandboxConfig(
execution_mode=ExecutionMode.NOTEBOOK,
gpu="A100",
models=[ModelConfig(name="google/gemma-2-9b-it")],
python_packages=["torch", "transformers", "accelerate"],
)
sandbox = Sandbox(config).start()
session = create_notebook_session(sandbox, Workspace())
# -> attach and experiment
When: You want to investigate steering vectors.
config = SandboxConfig(
gpu="A100",
execution_mode=ExecutionMode.NOTEBOOK,
models=[ModelConfig(name="google/gemma-2-9b-it")],
python_packages=["torch", "transformers", "accelerate", "matplotlib"],
)
sandbox = Sandbox(config).start()
session = create_notebook_session(sandbox, Workspace())
# -> attach, write steering/activation code inline, test steering
When: You want to investigate a fine-tuned model without revealing which one.
config = SandboxConfig(
gpu="A100",
models=[ModelConfig(
name="user/secret-adapter",
base_model="google/gemma-2-9b-it",
is_peft=True,
hidden=True,
)],
python_packages=["torch", "transformers", "peft"],
secrets=["HF_TOKEN"],
)
When: You need local code to call GPU functions remotely.
scoped = ScopedSandbox(SandboxConfig(gpu="A100", models=[...]))
scoped.start()
interface_lib = scoped.serve("interface.py", expose_as="library", name="tools")
workspace = Workspace(libraries=[interface_lib])
session = create_local_session(workspace=workspace, workspace_dir="./work", name="exp")
# -> local code can import and call GPU functions via RPC
When: You need API keys or credentials.
config = SandboxConfig(
gpu="A100",
secrets=["HF_TOKEN", "OPENAI_API_KEY", "ANTHROPIC_API_KEY"],
)
Secrets are available as environment variables in the sandbox:
execute_code(session_id, """
import os
hf_token = os.environ.get("HF_TOKEN")
openai_key = os.environ.get("OPENAI_API_KEY")
""")
Users need Modal configured:
modal token new
And secrets for HuggingFace:
modal secret create huggingface-secret HF_TOKEN=hf_...
First sandbox creation takes ~5 minutes, mainly because the container image must be built (installing Python and system packages) and model weights downloaded.
Subsequent runs are much faster (<1 min) because models are cached.
Common issues:
secrets=["HF_TOKEN"]If a package fails to install:
system_packages)"transformers==4.36.0"Begin with minimal config, add complexity incrementally:
# Start here
config = SandboxConfig(
gpu="A100",
models=[ModelConfig(name="google/gemma-2-9b-it")],
python_packages=["torch", "transformers", "accelerate"],
)
# Add as needed
# python_packages += ["matplotlib", "pandas", "datasets"]
# libraries = [Library.from_file("steering_hook.py")]
# secrets = ["HF_TOKEN"]
Don't create new sandboxes unnecessarily. Attach to existing sessions when possible.
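For example, if you saved the connection info from an earlier setup run (a sketch; the values are placeholders):
list_sandboxes()  # check what's already running before creating a new sandbox
attach_to_session(session_id="<saved session_id>", jupyter_url="<saved jupyter_url>")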
Define helper functions once at the start of experiments. See the "Utility Code Examples" section for common patterns like steering hooks, activation extraction, and response generation.
Use add_markdown() to document your experiments in the notebook:
add_markdown(session_id, """
## Hypothesis 1: Model has strong helpfulness steering vector
Testing by:
1. Extracting contrast pair activations
2. Computing difference vector
3. Applying with varying strengths
""")
For investigative work, follow these principles:
Make interfaces clear with type hints:
@expose
def analyze_text(text: str, layer: int = 20) -> dict:
"""Analyze text at specific layer."""
...
from src.environment import Sandbox, SandboxConfig, ExecutionMode, ModelConfig, RepoConfig
from src.environment import ScopedSandbox
from src.workspace import Workspace, Library
from src.execution import create_notebook_session, create_local_session
import json
config = SandboxConfig(
execution_mode=ExecutionMode.NOTEBOOK,
gpu="A100",
models=[ModelConfig(name="google/gemma-2-9b-it")],
python_packages=["torch", "transformers", "accelerate"],
)
sandbox = Sandbox(config).start()
session = create_notebook_session(sandbox, Workspace())
print(json.dumps({
"session_id": session.session_id,
"jupyter_url": session.jupyter_url,
}))
scoped = ScopedSandbox(SandboxConfig(
gpu="A100",
models=[ModelConfig(name="google/gemma-2-9b")],
))
scoped.start()
interface = scoped.serve("interface.py", expose_as="library", name="tools")
workspace = Workspace(libraries=[interface])
The modal-sandbox MCP server provides tools for managing sandboxes directly. Use these for debugging, monitoring, and cleanup.
IMPORTANT: These tools require a sandbox_id which is output by your setup script. Make sure your setup script outputs sandbox.sandbox_id in the JSON.
List all running sandboxes.
list_sandboxes() # Lists sandboxes from default app
list_sandboxes(app_name="my-app") # Filter by app name
Get sandbox status and tunnel URLs.
get_sandbox_info(sandbox_id="sb-xxx...")
# Returns: {"sandbox_id": "...", "tunnels": {8888: "https://..."}, "status": "running"}
Execute a shell command in the sandbox.
exec_in_sandbox(sandbox_id="sb-xxx", command="nvidia-smi")
exec_in_sandbox(sandbox_id="sb-xxx", command="ls -la /root", timeout=30)
# Returns: {"stdout": "...", "stderr": "...", "return_code": 0, "success": true}
Execute Python code in the sandbox.
exec_python_in_sandbox(sandbox_id="sb-xxx", code="import torch; print(torch.cuda.is_available())")
Quick way to check GPU status (runs nvidia-smi).
get_gpu_status(sandbox_id="sb-xxx")
List running processes in the sandbox.
get_running_processes(sandbox_id="sb-xxx")
Terminate a running sandbox.
terminate_sandbox(sandbox_id="sb-xxx")
# Returns: {"success": true, "message": "Sandbox sb-xxx terminated successfully"}
Create a filesystem snapshot for later restoration.
snapshot_sandbox(sandbox_id="sb-xxx", description="After training")
# Returns: {"success": true, "image_id": "im-yyy..."}
Read a file from the sandbox.
read_sandbox_file(sandbox_id="sb-xxx", path="/root/output.txt")
Write a file to the sandbox.
write_sandbox_file(sandbox_id="sb-xxx", path="/root/script.py", content="print('hello')")
List files in a directory.
list_sandbox_files(sandbox_id="sb-xxx", path="/root")
When something goes wrong:
Check sandbox is running:
get_sandbox_info(sandbox_id="sb-xxx")
Check GPU status:
get_gpu_status(sandbox_id="sb-xxx")
Check processes:
get_running_processes(sandbox_id="sb-xxx")
Run diagnostic commands:
exec_in_sandbox(sandbox_id="sb-xxx", command="df -h") # Disk space
exec_in_sandbox(sandbox_id="sb-xxx", command="free -h") # Memory
exec_in_sandbox(sandbox_id="sb-xxx", command="cat /var/log/jupyter.log") # Logs
Clean up when done:
terminate_sandbox(sandbox_id="sb-xxx")
- Output sandbox.sandbox_id from setup scripts so the sandbox can be managed later
This skill enables powerful interpretability research workflows. The key is understanding the two modes: notebook sessions with pre-loaded models (Sandbox + create_notebook_session) and RPC interfaces for local code (ScopedSandbox + create_local_session).