Scrape Creator Profile

Extract structured profile data from a content creator's public page.

Returns a normalized JSON object regardless of platform.

When to Use This Skill

Use this skill for any request that involves:

Looking up a creator, influencer, streamer, or public figure's profile
Fetching follower counts, bio, links, recent content, or engagement stats
Comparing multiple creators
Building lead lists or research reports about creators
Monitoring a creator's stats over time

If the user provides a URL, jump straight to Step 2. If they only give a

name or handle, start at Step 1.

Step 1 — Resolve the Profile URL

If the user gave a username/handle without a URL:

Construct the likely URL using the platform mapping below.
If the platform is ambiguous, try the most likely one first (see detection

hints), then ask the user to confirm.

Platform URL Patterns

Platform	URL Pattern	Example
--------------	------------------------------------------	--------------------------------------------
YouTube	`https://www.youtube.com/@{handle}`	`https://www.youtube.com/@mkbhd`
Instagram	`https://www.instagram.com/{handle}/`	`https://www.instagram.com/natgeo/`
TikTok	`https://www.tiktok.com/@{handle}`	`https://www.tiktok.com/@charlidamelio`
Twitter / X	`https://x.com/{handle}`	`https://x.com/sama`
LinkedIn	`https://www.linkedin.com/in/{handle}/`	`https://www.linkedin.com/in/satyanadella/`
Twitch	`https://www.twitch.tv/{handle}`	`https://www.twitch.tv/shroud`
Substack	`https://{handle}.substack.com`	`https://astralcodexten.substack.com`
GitHub	`https://github.com/{handle}`	`https://github.com/torvalds`
Patreon	`https://www.patreon.com/{handle}`	`https://www.patreon.com/kurzgesagt`

Detection Hints

@ prefix with short handle → likely Twitter/X or Instagram
"yt:" or "youtube" keyword → YouTube
"channel", "subscribe", "views" → YouTube
"stream", "live" → Twitch
"reel", "story", "ig:" → Instagram
"tiktok", "tt:" or handle with numbers → TikTok
Professional title + name → LinkedIn

Step 2 — Fetch the Profile Page

2a. Try plain web_fetch first

web_fetch(url)

Check if the response contains useful profile data (bio, follower count, etc.).

If the content is mostly JS placeholders, empty

s, or a login wall,

move to 2b.

2b. Use browser (managed mode) for JS-heavy sites

Platforms that almost always require browser mode:

Instagram — always requires browser
TikTok — always requires browser
LinkedIn — login wall without session; use browser if session cookies available
YouTube — web_fetch usually works; use browser if blocked

Browser steps:

Open the profile URL in managed browser mode.
Wait for the main content container to render (look for follower/subscriber

counts, bio text).

Take a DOM snapshot.
Extract data from the snapshot using the field map in Step 3.

2c. Apify fallback (if APIFY_API_TOKEN is set)

For persistent blocks, use the Apify actor for that platform. See

references/apify-actors.md for actor IDs and call patterns.

Step 3 — Extract Structured Fields

Parse the fetched content and populate the following fields. Mark any missing

field as null — do not guess.

{
  "platform": "string",          // youtube | instagram | tiktok | twitter | linkedin | twitch | substack | github | patreon | other
  "handle": "string",            // @-prefixed username
  "display_name": "string",      // Full display name
  "verified": true | false,
  "bio": "string",               // Profile description / about text
  "profile_url": "string",       // Canonical URL used to scrape
  "avatar_url": "string | null",
  "external_links": ["string"],  // Any links in bio or link-in-bio
  "stats": {
    "followers": "number | null",
    "following": "number | null",
    "subscribers": "number | null",
    "total_views": "number | null",
    "total_posts": "number | null",
    "monthly_listeners": "number | null",  // Spotify-style, if applicable
    "engagement_rate": "number | null"     // Percentage, if computable
  },
  "recent_content": [
    {
      "title": "string | null",
      "url": "string",
      "published_at": "ISO 8601 string | null",
      "views": "number | null",
      "likes": "number | null",
      "comments": "number | null"
    }
    // Up to 5 most recent items
  ],
  "contact_info": {
    "email": "string | null",    // Only if publicly listed in bio/links
    "website": "string | null"
  },
  "scraped_at": "ISO 8601 UTC timestamp"
}

> Privacy rule: Only capture fields that are explicitly public on the

> profile page. Do not infer, deduce, or cross-reference private information.

> Do not store or relay phone numbers even if visible.

Platform-specific extraction hints

See references/platform-selectors.md for CSS selectors and JSON-LD paths

per platform. Quick reference:

YouTube: subscriber count in #subscriber-count or meta itemprop=interactionCount; description in #description-inner
Twitter/X: follower count in [data-testid="UserProfileHeader_Items"]; bio in [data-testid="UserDescription"]

scrape-creator-profile

概述