From huntkit
Conducts deep OSINT research on individuals: builds a scored digital footprint, psychoprofile, career history, and social graph with recursive self-evaluation until completeness.
Slash command: /huntkit:osint
Skill files:
- assets/dossier-template.md
- references/content-extraction.md
- references/platforms.md
- references/psychoprofile.md
- references/tools.md
- scripts/apify.sh
- scripts/brightdata.sh
- scripts/capture-evidence.sh
- scripts/diagnose.sh
- scripts/exa.sh
- scripts/first-volley.sh
- scripts/ingest-client-document.sh
- scripts/jina.sh
- scripts/mcp-client.py
- scripts/merge-volley.sh
- scripts/next-ev-id.sh
- scripts/package.json
- scripts/parallel.sh
- scripts/perplexity-playbook.sh
- scripts/perplexity.sh

Systematic intelligence gathering on individuals. From a name or handle to a scored dossier with psychoprofile, career map, and entry points.
Determine entry point from context:
Default (full research request): Phase 0 → 1 → 1.5 → 2 → 3 → 4 → 5 → 6.
All API keys via environment variables. Never hardcode tokens.
- PERPLEXITY_API_KEY: Perplexity Sonar (fast answers + deep research)
- EXA_API_KEY: Exa AI (semantic search, company/people research, deep research)
- TAVILY_API_KEY: Tavily (agent-optimized search + extract, $0.005/req basic)
- APIFY_API_TOKEN: Apify scraping (LinkedIn, Instagram, Facebook)
- JINA_API_KEY: Jina reader/search/deepsearch
- PARALLEL_API_KEY: Parallel AI search
- BRIGHTDATA_MCP_URL: Bright Data MCP endpoint (full URL with token)
- MCPORTER_CONFIG: mcporter config path

Run from skill dir: bash scripts/<name>.sh.
Each validates env vars, exits with descriptive error + URL to get the key.
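The check each script performs can be sketched in a few lines. This is only an illustration of the idea, not the scripts' actual code: the `missing_keys` helper is hypothetical, and the help URLs are placeholders rather than real signup pages.

```python
import os

# Variable names come from the list above. The URLs are placeholders
# (assumptions for illustration), not the real pages for obtaining keys.
REQUIRED = {
    "PERPLEXITY_API_KEY": "https://example.com/get-perplexity-key",
    "EXA_API_KEY": "https://example.com/get-exa-key",
}


def missing_keys(required, env=None):
    """Return one descriptive error message per unset or empty variable."""
    env = os.environ if env is None else env
    return [
        f"{name} is not set. Get a key at {url}"
        for name, url in required.items()
        if not env.get(name)
    ]
```

A caller would print the returned messages and exit non-zero when the list is not empty. Reading keys from the environment keeps tokens out of the source, in line with the rule above.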
Search & Research:
- diagnose.sh: run FIRST. Capability map of all tools.
- Perplexity (default): call mcp__perplexity-ask__perplexity_ask directly. It returns an AI answer with citations (equivalent to sonar mode). Use the shell script for search (ranked web results), reason (reconcile contradictions), and deep (long-form research) modes, or from bash pipelines.
- perplexity.sh: search <query> | sonar <query> (AI answer) | reason <query> (sonar-reasoning-pro, compare leads / reconcile contradictions) | deep <query> (deep research). Required by first-volley.sh and any bash-only pipeline.
- tavily.sh: search <query> (basic $0.005) | deep <query> (advanced) | extract <url>
- exa.sh: search <query> | company <name> | people <name> | crawl <url> | deep <prompt>
- first-volley.sh "Name" "context": parallel search, all engines at once.
- merge-volley.sh <outdir>: deduplicate and merge first-volley results.

Scraping:
- apify.sh: linkedin <url> | instagram <handle> | run | results | store-search
- run-actor.sh: universal Apify runner (55+ actors). Embedded from apify/agent-skills.
  - Quick answer: bash scripts/run-actor.sh "actor/id" '{"input":"json"}'
  - Export: bash scripts/run-actor.sh "actor/id" '{"input":"json"}' --output /tmp/out.csv
- jina.sh: read <url> | search <query> | deepsearch <query>
- parallel.sh: search <query> | extract <url>
- brightdata.sh: scrape <url> | scrape-batch | search | search-geo <cc> | search-yandex

Principle: cheap before expensive, fast before deep.
Always start here. Get quick context before digging. Run ALL in parallel:
# Perplexity (default: MCP tool, not the shell script)
# Claude call: mcp__perplexity-ask__perplexity_ask with a structured prompt
# (see "Prompt Templates" section — use the Entity Profile template, not ad-hoc "Who is X")
# Shell fallback (bash pipelines or when you need sonar explicitly):
# bash skills/osint/scripts/perplexity.sh sonar "<structured prompt from Entity Profile template>"
# Brave Search — classic web search
web_search "<Name> <company> <role>"
# Tavily — agent-optimized search with AI answer
bash skills/osint/scripts/tavily.sh search "<Name> <context>"
# Exa — semantic search + company/people research
bash skills/osint/scripts/exa.sh search "<Name> <context>"
bash skills/osint/scripts/exa.sh people "<Name>"
→ Returns: quick facts, links, context. → Decision: enough? → Phase 6. Need more? → Level 2.
Verify sources from Level 1 via fetch:
# Read discovered URLs
web_fetch "<url_from_perplexity>"
bash skills/osint/scripts/jina.sh read "<url>"
bash skills/osint/scripts/parallel.sh extract "<url>"
→ Returns: verified facts, cross-references. → Match? → enrich the dossier. Need deeper? → Level 3.
Bring in scrapers for social platforms:
# LinkedIn
bash skills/osint/scripts/apify.sh linkedin "<url>"
# Instagram
bash skills/osint/scripts/apify.sh instagram "<handle>"
# Facebook, geo-blocked sites
bash skills/osint/scripts/brightdata.sh scrape "<url>"
→ Returns: structured profiles, photos, connections.
If you need to go deeper — compose an extended prompt and send to deep research. Run ALL in parallel (30-60 sec each):
# Perplexity Deep Research — use a template from "Prompt Templates" section
bash skills/osint/scripts/perplexity.sh deep "<filled-in Entity Profile or Network Mapping template>"
# Perplexity Reasoning — reconcile contradictions surfaced in Level 1-3
bash skills/osint/scripts/perplexity.sh reason "<filled-in Contradiction Reconciliation template>"
# Exa Deep Research
bash skills/osint/scripts/exa.sh deep "<detailed prompt>"
# Parallel AI Deep Search
bash skills/osint/scripts/parallel.sh search "<detailed query>"
# Jina DeepSearch
bash skills/osint/scripts/jina.sh deepsearch "<query>"
Rule: the Level 4 prompt must be EXTENDED — include everything you already know from Level 1-3 so deep research does not repeat basic facts and digs further instead.
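One way to enforce the rule above is to build the Level 4 prompt from a template plus the facts already collected. The sketch below is an assumption about how that could look; `build_extended_prompt` and the wording of the appended section are hypothetical, not part of the skill.

```python
def build_extended_prompt(template: str, known_facts: list[str]) -> str:
    """Append facts from Levels 1-3 so deep research can skip them."""
    if not known_facts:
        # An empty list means the prompt would not be "extended" at all.
        raise ValueError("Level 4 prompts need the facts gathered in Levels 1-3")
    facts = "\n".join(f"- {fact}" for fact in known_facts)
    return (
        f"{template.strip()}\n\n"
        "Already established (do not repeat, dig further):\n"
        f"{facts}"
    )
```

Refusing to build a prompt without known facts makes it hard to send a bare "Who is X" query to an expensive deep-research mode by accident.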
Every OSINT query to Perplexity (mcp__perplexity-ask__perplexity_ask, perplexity.sh sonar|reason|deep) must follow the 5-part pattern. Ad-hoc queries produce summary blobs; structured queries produce source-anchored tables we can route to investigation/findings/.
Entity profile (mode: deep or sonar)
Use public sources only to profile <TARGET>. Identify associated social profiles,
domains, companies, locations, and public activity since <DATE>. Prioritize primary
sources (social, filings, forums); exclude news aggregators and obvious duplicates.
Output: table with source URL, evidence snippet, date, confidence (high/medium/low),
followed by a timeline of key events. End with "what is missing or weakly supported?"
Network mapping (mode: deep)
Map connections between <ENTITY_A> and <ENTITY_B>. Find shared infrastructure,
co-mentions, emails, social links, funding ties from public sources (GitHub, LinkedIn,
WHOIS, SEC filings) in the last 2 years. Return a table: nodes, edges, source URLs,
link strength (strong/moderate/weak/speculative); highlight contradictions.
Event timeline (mode: deep)
Reconstruct the timeline of <EVENT> from public mentions on forums, social, blogs,
and official statements since <DATE>. Focus on IOCs, affected parties, responses.
Output: chronological table (date, source URL, key fact, credibility grade). End
with a "gaps in evidence" list and suggested follow-up queries.
Infrastructure recon (mode: deep)
Enumerate public-facing infrastructure for <DOMAIN_OR_IP>: subdomains, hosting
provider, tech stack, linked domains, certificates, and changes since <DATE>.
Pull from Shodan/Censys/VirusTotal/CT-log-style public data. Output: table with
asset, details, source URL, last observed; end with an exposure risk assessment.
Geolocation (mode: sonar or deep)
Find public geolocation signals for <TARGET>. Correlate images, posts, metadata,
check-ins from Instagram/X/Telegram since <DATE>. Privacy-compliant sources only.
Output: table with estimated lat/long, source URL, confidence; note any conflicting
locations.
Contradiction reconciliation (mode: reason)
I have these conflicting claims about <TARGET>:
1. <CLAIM_A> (source: <URL_A>)
2. <CLAIM_B> (source: <URL_B>)
Reason step-by-step about which is more likely correct, what additional evidence
would resolve the conflict, and what the most plausible combined narrative is.
Output: assessment, confidence, missing evidence, recommended next queries.
- Capture every source URL with bash skills/osint/scripts/capture-evidence.sh before it is cited in any brief or report.
- Route structured results to investigation/findings/.
- reason mode output is the preferred input to /analyze ach when reconciling contradictions.

perplexity-playbook.sh

For the first research pass on any target, run the playbook rather than crafting ad-hoc sonar calls. The playbook runs a fixed, target-type-specific query set in parallel, then merges citations with URL normalization and dedupe into a reproducible run directory.
# Target types: person | company | domain | incident
bash skills/osint/scripts/perplexity-playbook.sh person "jane-doe" "Jane Doe, CEO Acme"
bash skills/osint/scripts/perplexity-playbook.sh domain "acmecorp-com" "acmecorp.com"
# Fully automated pass: playbook -> capture -> persist to active case
bash skills/osint/scripts/perplexity-playbook.sh company "acme-inc" "Acme Inc" \
--capture --case case-015-linkedin-algorithm
Output lives at /tmp/osint-<slug>-<ISO8601>/: evidence.json (merged citations), urls.txt, urls.tsv (batch input for capture-evidence.sh), report.md, run_manifest.json. With --case, artifacts are copied to investigations/<case>/investigation/evidence/raw-collections/.
Use perplexity.sh modes directly only for follow-up queries outside the target-type doctrine (e.g., reason for contradiction reconciliation, deep for a specific Level 4 deep dive). The playbook is the default so that first-pass research is uniform, auditable, and parallel.
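The playbook's merge step is described as URL normalization plus dedupe. Its actual normalization rules are not stated here, so the following is a minimal sketch under assumed rules: lowercase the scheme and host, drop the fragment, and strip a trailing slash. Both function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def dedupe_citations(urls):
    """Keep the first occurrence of each normalized URL, preserving order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

Preserving first-seen order keeps the merged list stable across runs, which helps when the output feeds a batch capture step.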
OSINT research runs as a swarm of parallel sub-agents on Sonnet. The main agent is the coordinator — it does NOT scrape itself.
- sessions_spawn with model: sonnet, mode: run
- streamers/youtube-channel-scraper for channel data
- apify/facebook-pages-scraper + apify/facebook-page-contact-information
- vdrmota/contact-info-scraper on found websites
- clockworks/tiktok-profile-scraper, local registries, press, university records, Yandex search, Google Maps (compass/crawler-google-places if business owner)
- /tmp/osint-<subject>-<task>.md
- bash skills/osint/scripts/diagnose.sh

Start with Level 1 (quick answers) ALWAYS before heavy scraping.
- mcp__perplexity-ask__perplexity_ask with the question. Returns AI answer + citations in-session.
- bash skills/osint/scripts/perplexity.sh search "Who is <Name>, <context>" (ranked web results) or sonar (AI answer).
- web_search "<Name> <company>"
bash skills/osint/scripts/first-volley.sh "Full Name" "context"
web_fetch "<citation_url_1>"
web_fetch "<citation_url_2>"
bash skills/osint/scripts/merge-volley.sh /tmp/osint-<timestamp>

Rate limiting: wait 1s between Brave queries, 2s between Jina calls. Do NOT hammer APIs in tight loops; stagger parallel launches.
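The rate-limiting rule can be expressed as a small per-service throttle. This is a sketch, not the skill's implementation: the `Throttle` class and the service keys `brave` and `jina` are hypothetical names, and only the 1s and 2s gaps come from the text above.

```python
import time

# Minimum gap in seconds between calls, per the rate-limiting rule above.
MIN_GAP = {"brave": 1.0, "jina": 2.0}


class Throttle:
    """Sleep just long enough to keep a minimum gap between calls per service."""

    def __init__(self, min_gap, clock=time.monotonic, sleep=time.sleep):
        self.min_gap = min_gap
        self.clock = clock      # injectable so the logic can be tested without waiting
        self.sleep = sleep
        self.last_call = {}

    def wait(self, service):
        now = self.clock()
        last = self.last_call.get(service)
        if last is not None:
            remaining = self.min_gap.get(service, 0.0) - (now - last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call[service] = now
```

Calling `throttle.wait("jina")` before each Jina request keeps calls at least two seconds apart while leaving other services unaffected.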
Before going external, check what you already know. This phase is optional and applies only when you have local/internal sources that may contain relevant history on the target: prior conversations, email archives, CRM cards, or notes. Skip entirely if none apply.
The tgspyder CLI (third-party, see README) or equivalent Telegram OSINT tool can pull
public group membership, chat messages, and user lookups. Use only on data you are
authorized to access.
What to extract from Telegram history:
⚠️ Telegram history is Grade A intelligence: unfiltered, real-time, authentic. Weight it higher than curated LinkedIn/Instagram profiles.
⚠️ Privacy: internal intelligence stays in the dossier. Never quote DMs in public outputs.
Any local email client (himalaya, mutt, notmuch, etc.) or mail archive can be searched for prior correspondence with the target or their domain.
What to extract from email:
If you maintain a CRM, vault, or notes system (Obsidian, Notion export, plain-text notes), check for existing cards on the target before starting external research. Enrich the existing card after research completes instead of duplicating.
After Phase 1.5, you should know:
This context shapes Phase 2 priorities — if we already know their career from emails, focus external research on psychoprofile and social media instead.
Read references/platforms.md ONLY when needing URL patterns or extraction signals.
Tool priority (primary → fallback). If primary fails, switch immediately. Never retry same tool.
- apify.sh linkedin → brightdata.sh scrape → jina.sh read
- apify.sh instagram → brightdata.sh scrape
- run-actor.sh "apify/instagram-tagged-scraper" (who tags them), apify/instagram-comment-scraper (sentiment)
- brightdata.sh scrape → none (only Bright Data works)
- run-actor.sh "apify/facebook-pages-scraper" → brightdata.sh scrape
- run-actor.sh "clockworks/tiktok-profile-scraper" → clockworks/tiktok-scraper (comprehensive)
- run-actor.sh "clockworks/tiktok-user-search-scraper" (find by keywords)
- run-actor.sh "streamers/youtube-channel-scraper" → jina.sh read → brightdata.sh scrape
- web_fetch t.me/s/{channel} → jina.sh read
- python3 scripts/twitter.py tweet <url> → jina.sh read
- run-actor.sh "compass/crawler-google-places"
- run-actor.sh "vdrmota/contact-info-scraper" (extract emails/phones from any URL)
- jina.sh read → brightdata.sh scrape

run-actor.sh = universal Apify runner (embedded, 55+ actors). See references/tools.md for full actor catalog.
Read references/tools.md ONLY when troubleshooting a failed tool.
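The "primary → fallback, never retry" rule amounts to walking a chain once. The sketch below assumes a caller-supplied `run` function that executes a single tool and raises on failure; both `run_with_fallback` and that callback are hypothetical.

```python
def run_with_fallback(chain, run):
    """Try each tool once, in priority order.

    Returns (tool, result) for the first tool that succeeds.
    `run` executes one tool and raises an exception on failure.
    """
    errors = {}
    for tool in chain:
        if tool in errors:            # never retry a tool that already failed
            continue
        try:
            return tool, run(tool)
        except Exception as exc:      # any failure: switch to the next tool
            errors[tool] = exc
    raise RuntimeError(f"all tools failed: {list(errors)}")
```

Recording which tool produced the result is useful later, when the source type is counted toward coverage.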
When you find YouTube, podcast, blog, or conference talks — read references/content-extraction.md immediately and extract 3-5 pieces of content on the spot.
Do NOT just note the URL. Extract transcripts/text NOW. A 20-minute YouTube video reveals more about a person than their entire LinkedIn. Content platforms are the #1 source for psychoprofile — skipping them = shallow dossier.
If initial searches return unusually little for someone who should have a footprint:
- web_fetch "https://web.archive.org/web/2024*/target-url": deleted profiles, old bios
- web_search "cache:domain.com/path": recently removed pages
- brightdata.sh search-yandex "Name": Yandex indexes CIS deeper and caches longer

List every claim as a row: fact | source 1 | source 2 | grade.
For each critical fact (employer, role, location, education):
If LinkedIn says "CEO" but company site says "Co-founder" — flag explicitly. Include both with sources. Do NOT silently pick one.
If common name — verify at least 2 facts (company + city, or photo + company) link to same person. If unsure, split into separate entities.
Internal intelligence (Phase 1.5) counts as an independent source.
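The "flag, do not silently pick one" rule can be checked mechanically once claims are stored as rows. The sketch below assumes each claim is a `(field, value, source)` tuple; that layout and the `find_contradictions` name are illustrative assumptions, and exact string comparison is a simplification.

```python
from collections import defaultdict


def find_contradictions(claims):
    """Group claims by field and report fields where sources disagree.

    `claims` is a list of (field, value, source) tuples. Every conflicting
    value is kept together with its sources; nothing is dropped.
    """
    by_field = defaultdict(lambda: defaultdict(list))
    for field, value, source in claims:
        by_field[field][value].append(source)
    return {
        field: dict(values)
        for field, values in by_field.items()
        if len(values) > 1
    }
```

For the example above, `role` would come back with both "CEO" and "Co-founder" and their sources, leaving the judgment to the analyst.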
Read references/psychoprofile.md ONLY at this phase.
9 mandatory checks. If any fail, flag as critical gap:
| Dimension | Weight | What to score (1-10) |
|---|---|---|
| Identity | 0.15 | Full name, DOB, location, education, photo |
| Career | 0.20 | Completeness of work history, current role clarity |
| Digital footprint | 0.15 | Number of platforms found, account activity level |
| Psychoprofile | 0.15 | MBTI confidence, writing style quantified, values deduced |
| Internal intel | 0.10 | Telegram/email history depth, vault data |
| Personal life | 0.05 | Family, hobbies, lifestyle, pets |
| Cross-reference | 0.10 | How many facts are A-grade, contradiction count |
| Actionability | 0.10 | Entry points identified, approach strategy clear |
Weighted sum (1-10) = Depth Score.
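As a worked illustration of the weighted sum, the table's weights can be applied like this. The dictionary keys are shorthand chosen for the sketch and `depth_score` is a hypothetical helper; only the weights and the 1-10 scale come from the table.

```python
# Weights copied from the table above; they sum to 1.0.
WEIGHTS = {
    "identity": 0.15,
    "career": 0.20,
    "digital_footprint": 0.15,
    "psychoprofile": 0.15,
    "internal_intel": 0.10,
    "personal_life": 0.05,
    "cross_reference": 0.10,
    "actionability": 0.10,
}


def depth_score(scores):
    """Weighted sum of per-dimension scores, each on a 1-10 scale."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("score every dimension exactly once")
    for name, value in scores.items():
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10")
    return round(sum(WEIGHTS[name] * scores[name] for name in WEIGHTS), 2)
```

For example, scores of 8, 9, 7, 6, 5, 4, 7, 8 in table order give 1.2 + 1.8 + 1.05 + 0.9 + 0.5 + 0.2 + 0.7 + 0.8 = 7.15.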
Count unique source types used (max 12): LinkedIn, Instagram, Facebook, Telegram DM, Telegram channel, VK, Twitter/X, company website, press/media articles, conference profiles, government/business registries, email correspondence.
| Depth Score | Coverage | Diagnosis | Action |
|---|---|---|---|
| 8+ | All pass | Strong dossier | Proceed to Phase 6 |
| 8+ | Some fail | Deep but blind spots | Target failed checks, 1 more cycle |
| <7 | All pass | Wide but shallow | Deepen via interviews/articles/deepsearch |
| <7 | Some fail | Restart needed | Different search angle, new tool combination |
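The table reads as a two-input lookup. A sketch, with the hypothetical `diagnose` helper returning the Diagnosis and Action columns; scores from 7 up to 8 are not covered by the table, so the sketch returns `None` for them rather than guessing.

```python
def diagnose(depth, all_checks_pass):
    """Map a Depth Score and the coverage result onto the table rows above.

    Returns (diagnosis, action), or None for scores in the 7 to 8 band,
    which the table does not cover.
    """
    if depth >= 8:
        if all_checks_pass:
            return ("Strong dossier", "Proceed to Phase 6")
        return ("Deep but blind spots", "Target failed checks, 1 more cycle")
    if depth < 7:
        if all_checks_pass:
            return ("Wide but shallow", "Deepen via interviews/articles/deepsearch")
        return ("Restart needed", "Different search angle, new tool combination")
    return None
```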
- (a) Depth Score ≥ 8.0 AND all coverage checks pass → exit to Phase 6
- (b) 3 cycles completed → deliver best available with honest assessment
- (c) Two cycles with delta < 0.5 → plateau reached, deliver with note
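The three exit conditions can be checked after each cycle. The sketch below reads condition (c) as "the last two cycles differ by less than 0.5", which is one interpretation of the wording; `should_exit` and its return labels are hypothetical.

```python
def should_exit(history, all_checks_pass, max_cycles=3, target=8.0, min_delta=0.5):
    """Return the exit reason, or None to run another cycle.

    `history` holds the Depth Score after each completed cycle, oldest first.
    """
    if not history:
        return None
    if history[-1] >= target and all_checks_pass:
        return "complete"        # condition (a)
    if len(history) >= max_cycles:
        return "cycle limit"     # condition (b)
    if len(history) >= 2 and abs(history[-1] - history[-2]) < min_delta:
        return "plateau"         # condition (c), as interpreted above
    return None
```

Checking (a) first means a dossier that reaches the target on the third cycle is reported as complete rather than as having hit the cycle limit.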
Read assets/dossier-template.md before rendering. Follow the template structure exactly.
No markdown tables in output (Telegram cannot render). Bullet lists only.
Report Depth Score, source count, source types, and total API spend.
If internal intelligence was used, add a separate "Internal intelligence" section (marked as internal/confidential, not for sharing outside).
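Because the output must avoid markdown tables, any tabular working data needs flattening before delivery. A minimal sketch, assuming a well-formed pipe table with a header row and a separator row; `table_to_bullets` is a hypothetical helper and the sample handles are made up.

```python
def table_to_bullets(markdown_table):
    """Convert a pipe-delimited markdown table into 'header: value' bullet lines."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in markdown_table.strip().splitlines()
    ]
    header, body = rows[0], rows[2:]   # rows[1] is the |---|---| separator
    return "\n".join(
        "- " + "; ".join(f"{name}: {value}" for name, value in zip(header, row))
        for row in body
    )


sample = """
| Platform | Handle |
|---|---|
| LinkedIn | jane-doe |
| Telegram | @jane |
"""
print(table_to_bullets(sample))
# - Platform: LinkedIn; Handle: jane-doe
# - Platform: Telegram; Handle: @jane
```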
If API spend is expected to pass $0.50, ask the user before proceeding.
- brightdata.sh scrape as primary instead of Apify.
- apify.sh store-search "linkedin scraper" for alternatives. Actors on Apify are volatile, so always have a Bright Data fallback.
- jina.sh deepsearch. Check Telegram history.
- bash scripts/apify.sh store-search "people search". If mcpc installed: APIFY_TOKEN=$APIFY_API_TOKEN mcpc --json mcp.apify.com --header "Authorization: Bearer $APIFY_TOKEN" tools-call search-actors keywords:="people search" limit:=10. Check Telegram contacts by phone.
- clockworks/free-tiktok-scraper (free tier) as fallback. TikTok usernames often differ from other platforms; search by real name via clockworks/tiktok-user-search-scraper.
- vdrmota/contact-info-scraper: it crawls the site and extracts all contact info.

npx claudepluginhub assafkip/huntkit