Convert documents (PDF, EPUB, PPTX, DOCX, XLSX, HTML) to Markdown using marker-pdf with Claude Haiku LLM enhancement for accurate image and table extraction. Use when user needs to extract content from documents, convert PDFs to markdown, or process document files.
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Convert PDF, EPUB, PPTX, DOCX, XLSX, and HTML files to clean Markdown format using the marker-pdf tool with multimodal LLM enhancement.
# Install marker-pdf with full document support
uv tool install "marker-pdf[full]"
Requires Python 3.10+ and PyTorch.
When the user provides a document file path, run:
marker_single "<file_path>" \
--output_format=markdown \
--output_dir="<output_directory>" \
--use_llm \
--llm_service=marker.services.claude.ClaudeService \
--claude_model_name=claude-haiku-4-5 \
--claude_api_key=$ANTHROPIC_API_KEY \
--disable_image_extraction
Note: --disable_image_extraction skips image extraction, so the converted output contains text only. Remove this flag if images need to be preserved.
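The command above can be wrapped in a small shell function so that the file path, output directory, and image handling become parameters. This is a minimal sketch, not part of marker-pdf: the function name convert_to_markdown and its optional keep_images argument are hypothetical, and only the flags already shown above are passed through.

```shell
# Hypothetical wrapper around the marker_single command shown above.
# Usage: convert_to_markdown <file_path> <output_directory> [yes|no]
# The third argument (assumed name: keep_images) defaults to "no".
# The body runs in a subshell so its variables do not leak to the caller.
convert_to_markdown() (
  file_path=$1
  output_dir=$2
  keep_images=${3:-no}

  # Arguments that stay the same for every conversion.
  set -- "$file_path" \
    --output_format=markdown \
    --output_dir="$output_dir" \
    --use_llm \
    --llm_service=marker.services.claude.ClaudeService \
    --claude_model_name=claude-haiku-4-5 \
    --claude_api_key="$ANTHROPIC_API_KEY"

  # Skip image extraction unless the caller asked to keep images.
  if [ "$keep_images" != "yes" ]; then
    set -- "$@" --disable_image_extraction
  fi

  marker_single "$@"
)

# Example calls (not executed here):
#   convert_to_markdown "./docs/report.pdf" "./docs/"
#   convert_to_markdown "./docs/report.pdf" "./docs/" yes
```

Building the argument list with set -- keeps each argument quoted, so file paths containing spaces are passed to marker_single intact.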
<file_path>: Path to the input document (PDF, EPUB, PPTX, DOCX, XLSX, HTML)
--output_format: Output format, options: markdown (default), html, json, chunks
--output_dir: Directory to save output files (defaults to same directory as input)
--use_llm: Enable LLM-based accuracy improvements for tables, formatting, and form extraction
--page_range: Process specific pages (e.g., "0,5-10,20")

marker_single "./docs/report.pdf" \
--output_format=markdown \
--output_dir="./docs/" \
--use_llm \
--llm_service=marker.services.claude.ClaudeService \
--claude_model_name=claude-haiku-4-5 \
--claude_api_key=$ANTHROPIC_API_KEY \
--disable_image_extraction
marker_single "./slides/presentation.pptx" \
--output_format=html \
--output_dir="./slides/" \
--use_llm \
--llm_service=marker.services.claude.ClaudeService \
--claude_model_name=claude-haiku-4-5 \
--claude_api_key=$ANTHROPIC_API_KEY \
--disable_image_extraction
marker_single "./docs/book.pdf" \
--output_format=markdown \
--output_dir="./docs/" \
--page_range="1-10" \
--use_llm \
--llm_service=marker.services.claude.ClaudeService \
--claude_model_name=claude-haiku-4-5 \
--claude_api_key=$ANTHROPIC_API_KEY \
--disable_image_extraction
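To make the --page_range syntax concrete, the sketch below expands a specification such as "0,5-10,20" into the individual page numbers it selects. It rests on two assumptions that should be checked against the marker-pdf documentation: that ranges are inclusive at both ends, and that page numbers are zero-based. The helper name expand_page_range is hypothetical and is not part of marker-pdf.

```shell
# Hypothetical helper: expand a --page_range style spec into page numbers.
# Assumes inclusive ranges; the body runs in a subshell so that the
# IFS change and the loop variables stay local to the function.
expand_page_range() (
  IFS=,
  out=""
  for part in $1; do
    case $part in
      *-*)
        # "5-10" contributes 5, 6, ..., 10.
        i=${part%-*}
        end=${part#*-}
        while [ "$i" -le "$end" ]; do
          out="$out $i"
          i=$((i + 1))
        done
        ;;
      *)
        # A single page number is passed through unchanged.
        out="$out $part"
        ;;
    esac
  done
  # Drop the leading space before printing.
  echo "${out# }"
)

expand_page_range "0,5-10,20"   # prints: 0 5 6 7 8 9 10 20
expand_page_range "1-10"        # prints: 1 2 3 4 5 6 7 8 9 10
```

If page numbers are indeed zero-based, "1-10" starts at the second page of the document, and "0-9" would select the first ten pages.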
--force_ocr: Force OCR on entire document, converts inline math to LaTeX
--redo_inline_math: Highest quality inline math conversion (use with --use_llm)
--disable_image_extraction: Skip image extraction
--paginate_output: Add page separators to output
--debug: Enable diagnostic logging

The tool generates:
$ANTHROPIC_API_KEY environment variable is set
marker_single command with appropriate options
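Before running marker_single, it can help to confirm that $ANTHROPIC_API_KEY is set and that the command is installed. The following is a minimal sketch of such a check; the function name preflight is hypothetical and the error messages are illustrative only.

```shell
# Hypothetical pre-flight check: prints a reason to stderr and
# returns non-zero if a prerequisite for marker_single is missing.
preflight() {
  # The API key must be set and non-empty.
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set" >&2
    return 1
  fi
  # marker_single must be resolvable by the shell.
  if ! command -v marker_single >/dev/null 2>&1; then
    echo "marker_single not found on PATH" >&2
    return 1
  fi
  return 0
}

# Example call (not executed here):
#   preflight && marker_single "<file_path>" --output_format=markdown
```

Failing early with a clear message avoids starting a conversion that would otherwise stop at the first LLM call.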