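Web Scraper URL Batch Replacement Tool

Batch-replace the startUrl in a Web Scraper configuration - with automatic punctuation cleanup

Usage: paste multiple URLs into the left input box, one per line (commas, periods, enumeration commas, semicolons, quotation marks, colons, exclamation marks, and similar punctuation are cleaned automatically). Paste the Web Scraper configuration JSON, which must contain a startUrl array, into the middle input box. Click the "Replace & Generate" button, and the configuration with the replaced URLs appears on the right.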

URL list input

0 URLs

Web Scraper configuration

{"_id":"trustmrr","startUrl":["https://trustmrr.com/startup/local-seo-guy?metric=mrr"],"selectors":[{"id":"name","parentSelectors":["_root"],"type":"SelectorText","selector":"h1.text-3xl","multiple":false,"regex":"","multipleType":"singleColumn","version":2},{"id":"pfp","parentSelectors":["_root"],"type":"SelectorImage","selector":"img.w-20","multiple":false,"version":2,"multipleType":"singleColumn"},{"id":"category","parentSelectors":["_root"],"type":"SelectorText","selector":".gap-3 a span.inline-flex","multiple":false,"regex":"","multipleType":"singleColumn","version":2},{"id":"number","parentSelectors":["_root"],"type":"SelectorText","selector":"span.cursor-help","multiple":false,"regex":"","multipleType":"singleColumn","version":2},{"id":"website","parentSelectors":["_root"],"type":"SelectorLink","selector":".gap-3 a.justify-center","multiple":false,"version":2,"linkType":"linkFromHref"},{"id":"Total revenue","parentSelectors":["_root"],"type":"SelectorText","selector":"div.bg-card:nth-of-type(1) div.text-2xl","multiple":false,"regex":"","multipleType":"singleColumn","version":2},{"id":"last 30 days","parentSelectors":["_root"],"type":"SelectorText","selector":"div.bg-card:nth-of-type(2) div.text-2xl","multiple":false,"regex":"","multipleType":"singleColumn","version":2},{"id":"MRR (estimated)","parentSelectors":["_root"],"type":"SelectorText","selector":"div:nth-of-type(3) div.text-2xl","multiple":false,"regex":"","multipleType":"singleColumn","version":2},{"id":"active subscriptions","parentSelectors":["_root"],"type":"SelectorText","selector":"div:nth-of-type(3) p.mt-1","multiple":false,"regex":"","multipleType":"singleColumn","version":2},{"id":"Founded","parentSelectors":["_root"],"type":"SelectorText","selector":".gap-2.flex-col div","multiple":false,"regex":"","multipleType":"singleColumn","version":2},{"id":"country","parentSelectors":["_root"],"type":"SelectorText","selector":"span.text-muted-foreground","multiple":false,"regex":"","multipleType":"singleColumn","version":2},{"id":"expired","parentSelectors":["_root"],"type":"SelectorText","selector":"p.text-sm","multiple":false,"regex":"","multipleType":"singleColumn","version":2}]}

JSON configuration containing a startUrl: [...] array

Generated result

Waiting for generation...

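Example notes

Default example: TrustMRR data collection configuration

Reference site: trustmrr.com

Purpose: use the Web Scraper Chrome extension to batch-collect data on SaaS companies listed on TrustMRR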

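📖 How the configuration works:

1. _id: the unique identifier of the scraper configuration, used by Web Scraper to identify this scraping job

2. startUrl: the array of start URLs that defines the entry pages for the scrape. Multiple page URLs can be added in bulk

3. selectors: the array of selectors that defines which data fields to extract and the CSS selector for each

4. Data fields: company name (name), avatar (pfp), category (category), revenue figures (Total revenue/MRR), subscription count (active subscriptions), founding date (Founded), country (country), and others

5. Selector types: SelectorText (text), SelectorImage (image), SelectorLink (link), and other data types are supported

6. Workflow: Web Scraper visits each page in startUrl, extracts data according to the selectors definitions, and finally exports the result in CSV or JSON format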
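The structure described in the points above can be inspected with a few lines of Python. This is only an illustrative sketch: `summarize_sitemap` is a hypothetical helper, not part of this tool or of Web Scraper, and the sample is a shortened two-selector version of the default configuration.

```python
import json


def summarize_sitemap(config_text: str) -> dict:
    """Return the sitemap id, the start URL count, and a field -> selector type map."""
    config = json.loads(config_text)
    # The three top-level keys discussed above; a missing key raises KeyError.
    start_urls = config["startUrl"]
    if not isinstance(start_urls, list):
        raise ValueError("startUrl must be an array")
    return {
        "id": config["_id"],
        "start_url_count": len(start_urls),
        "fields": {s["id"]: s["type"] for s in config["selectors"]},
    }


# Shortened sample (assumption: only two of the default selectors are kept).
example = json.dumps({
    "_id": "trustmrr",
    "startUrl": ["https://trustmrr.com/startup/local-seo-guy?metric=mrr"],
    "selectors": [
        {"id": "name", "parentSelectors": ["_root"], "type": "SelectorText",
         "selector": "h1.text-3xl", "multiple": False},
        {"id": "pfp", "parentSelectors": ["_root"], "type": "SelectorImage",
         "selector": "img.w-20", "multiple": False},
    ],
})

summary = summarize_sitemap(example)
print(summary)
```

Running such a check before importing makes it easy to spot a configuration that lacks a startUrl array or a selector with the wrong type.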

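💡 This is a complete Web Scraper configuration that can be imported directly into the Chrome extension. By batch-replacing startUrl, you can scrape hundreds of pages in a single run

Left input (URL list):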


https://trustmrr.com/startup/gumroad?metric=mrr

https://trustmrr.com/startup/easytools-sp-z-o-o?metric=mrr

https://trustmrr.com/startup/maidsnblack?metric=mrr

...


URLs with surrounding punctuation are accepted:

"https://trustmrr.com/startup/example",

'https://trustmrr.com/startup/example';

https://trustmrr.com/startup/example。
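One way such punctuation cleanup could work is to strip a fixed set of characters from both ends of every line. The sketch below is an assumption about the behavior, not the tool's actual code: `clean_url_lines` and the exact character set in `PUNCTUATION` are hypothetical.

```python
import string

# Assumed cleanup set: ASCII and full-width commas, periods, semicolons,
# colons, exclamation marks, quotation marks, plus the enumeration comma.
PUNCTUATION = ",.;:!\"'" + "，。、；：！“”‘’"
STRIP_CHARS = PUNCTUATION + string.whitespace


def clean_url_lines(text: str) -> list[str]:
    """Split pasted text into lines and trim punctuation/whitespace from both ends."""
    urls = []
    for line in text.splitlines():
        cleaned = line.strip(STRIP_CHARS)
        if cleaned:  # skip blank lines and lines that were only punctuation
            urls.append(cleaned)
    return urls


sample = "\n".join([
    '"https://trustmrr.com/startup/example",',
    "'https://trustmrr.com/startup/example';",
    "https://trustmrr.com/startup/example。",
    "",
])
cleaned_urls = clean_url_lines(sample)
print(cleaned_urls)
```

Only the ends of each line are trimmed, so punctuation inside a URL (such as the `:` in `https://`) is left alone. A URL that legitimately ends in one of these characters would lose it under this approach.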

Middle input (configuration template):


{

  "_id": "trustmrr",

  "startUrl": ["OLD_URL"],

  "selectors": [...]

}
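The replacement step itself amounts to parsing the template, overwriting the startUrl array, and serializing the result. A minimal sketch follows, assuming the template is valid JSON; `replace_start_urls` is a hypothetical name, and the empty `selectors` list stands in for the selectors elided above.

```python
import json


def replace_start_urls(config_text: str, urls: list[str]) -> str:
    """Return the configuration as JSON text with startUrl replaced by `urls`."""
    config = json.loads(config_text)
    if "startUrl" not in config:
        raise ValueError("configuration has no startUrl key")
    config["startUrl"] = list(urls)
    # Compact separators keep the output on one line, convenient for pasting.
    return json.dumps(config, ensure_ascii=False, separators=(",", ":"))


# Placeholder template: the real selectors array is omitted here.
template = '{"_id": "trustmrr", "startUrl": ["OLD_URL"], "selectors": []}'
new_urls = [
    "https://trustmrr.com/startup/gumroad?metric=mrr",
    "https://trustmrr.com/startup/maidsnblack?metric=mrr",
]
generated = replace_start_urls(template, new_urls)
print(generated)
```

Every key other than startUrl passes through unchanged, so the selectors defined in the template apply to each new URL.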

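Usage tips

Copy 100 URLs from the second document and paste them into the left input box
Make sure every URL includes the ?metric=mrr parameter
Keep the default Web Scraper configuration template in the middle box
Click "Replace & Generate", then copy the result on the right
In the Web Scraper extension, choose "Import Sitemap" and paste the configuration
Click "Scrape" to start the batch scrape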
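The ?metric=mrr check in the tips above can be automated before pasting. This is a sketch using Python's standard `urllib.parse`. `ensure_metric_param` is a hypothetical helper, and leaving an existing `metric` value untouched is an assumption about the desired behavior.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def ensure_metric_param(url: str, value: str = "mrr") -> str:
    """Append metric=<value> to the query string unless a metric parameter exists."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if not any(key == "metric" for key, _ in query):
        query.append(("metric", value))
    return urlunsplit(parts._replace(query=urlencode(query)))


fixed = [
    ensure_metric_param(u)
    for u in [
        "https://trustmrr.com/startup/gumroad",
        "https://trustmrr.com/startup/maidsnblack?metric=mrr",
    ]
]
print(fixed)
```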