Data · Medium · Auto-Sync

Web Scraper

by THIAGONOMA · v1.6.2 · updated 2026-04-12T22:48:47.837Z
Score: 79

Extracts structured data from websites with Playwright/Cheerio while respecting robots.txt, rate limits, and anti-bot detection. Produces clean JSON/CSV with automatic retry and deduplication.

Tags: web-scraping, playwright, cheerio, crawlee, data-extraction, robots-txt
Languages: TypeScript, JavaScript, Python
1.6K Stars · 234 Forks · 21.3K Uses

Skill Document

SKILL.md · web-scraper/workflow
Detailed step-by-step walkthrough of the skill, organized by cognitive phase:
1. SENSE: Target analysis and planning
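```python
from urllib.robotparser import RobotFileParser

url = "https://example.com/products"  # placeholder: the page you intend to scrape

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # fetch and parse robots.txt
if not rp.can_fetch("*", url):
    raise PermissionError("Scraping not allowed by robots.txt")
# Fall back to 1 second when robots.txt declares no Crawl-delay
crawl_delay = rp.crawl_delay("*") or 1.0
```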
Identify the content type (static vs. SPA) with a HEAD request
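A HEAD request returns only headers, so it is a cheap first probe. The sketch below uses only the Python standard library to build one and inspect `Content-Type`. The function names and the `MyBot/1.0` default are illustrative assumptions. Headers alone cannot prove that a page is a SPA, so a follow-up GET that checks whether the target data appears in the raw HTML is usually still needed.

```python
from typing import Optional
from urllib.request import Request, urlopen


def build_head_request(url: str, user_agent: str = "MyBot/1.0") -> Request:
    """Build a HEAD request; no response body is downloaded when it is sent."""
    return Request(url, method="HEAD", headers={"User-Agent": user_agent})


def is_html(content_type: Optional[str]) -> bool:
    """Return True when a Content-Type header value denotes an HTML document."""
    if not content_type:
        return False
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ("text/html", "application/xhtml+xml")


def probe(url: str, timeout: float = 10.0) -> bool:
    """Send the HEAD request and report whether the server declares HTML."""
    with urlopen(build_head_request(url), timeout=timeout) as response:
        return is_html(response.headers.get("Content-Type"))
```

Calling `probe("https://example.com/")` performs the network request; the two helpers can be exercised offline.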
2. CONTEXTUALIZE: Inspect the HTML structure
Manually analyze 3-5 pages to identify stable selectors
Prefer data attributes (`data-product-id`) over CSS classes, which may change
Check sitemap.xml: `curl https://example.com/sitemap.xml`
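Once the sitemap has been downloaded, its `<loc>` entries give a ready-made list of candidate URLs. The following is a minimal sketch using the standard library; the `sitemap_urls` helper and the sample document are hypothetical, and it assumes the sitemap uses the standard sitemaps.org namespace.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text: str) -> list[str]:
    """Return every <loc> value found in a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Illustrative sample; a real sitemap would come from the curl command above
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/products/1</loc></url>
  <url><loc>https://example.com/products/2</loc></url>
</urlset>"""

print(sitemap_urls(sample))
```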
3. HYPOTHESIZE: Choose the stack
Static HTML → Cheerio + node-fetch (about 10x faster, since no browser is launched)
SPA/dynamic → Playwright (more robust, supports multiple browsers)
High volume → Crawlee (queue, proxy rotation, automatic sessions)
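The three rules above can be written as a small decision function. This is only a sketch: the `choose_stack` name, the 10,000-page threshold for "high volume", and the choice to check volume before rendering mode are all assumptions, not part of the skill.

```python
def choose_stack(
    needs_js: bool,
    page_count: int,
    high_volume_threshold: int = 10_000,  # assumed cutoff for "high volume"
) -> str:
    """Pick a scraping stack from the rendering mode and the crawl size."""
    if page_count >= high_volume_threshold:
        return "Crawlee"  # queueing, proxy rotation, session handling
    if needs_js:
        return "Playwright"  # content is rendered client-side
    return "Cheerio + node-fetch"  # plain HTML, no browser needed


print(choose_stack(needs_js=False, page_count=200))
```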
4. RECOMMEND: Implement the scraper
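```typescript
// Playwright with rate limiting (retry logic is not shown in this snippet)
import { chromium } from 'playwright';

const delay = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));
const urls: string[] = []; // placeholder: target URLs
const crawlDelay = 1.0; // seconds; use the Crawl-delay read in the SENSE phase
const results: unknown[] = [];

const browser = await chromium.launch({ headless: true });
const context = await browser.newContext({
  userAgent: 'Mozilla/5.0 (compatible; MyBot/1.0)',
});
try {
  for (const url of urls) {
    await delay(crawlDelay * 1000); // respect the robots.txt Crawl-delay
    const page = await context.newPage();
    try {
      await page.goto(url, { waitUntil: 'networkidle', timeout: 30_000 });
      const data = await page.evaluate(() => ({ /* ... fields to extract */ }));
      results.push(data);
    } finally {
      await page.close(); // close the page even if navigation fails
    }
  }
} finally {
  await browser.close();
}
```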
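The skill description promises automatic retry, which the Playwright loop does not show. One common way to add it is exponential backoff with jitter, sketched here in Python. The `with_retry` helper, its defaults, and the 10% jitter are assumptions for illustration; the skill's actual retry policy is not documented on this page.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry(
    fetch: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait 1x, 2x, 4x ... the base delay, plus up to 10% random jitter
            wait = base_delay * (2 ** attempt)
            sleep(wait + random.uniform(0, wait * 0.1))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.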
5. EVALUATE: Validate the extracted data
Check coverage: every expected field was extracted
Detect outliers: empty fields, values outside the expected range
Check encoding: special characters are preserved correctly
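The three checks above can be sketched as one per-record validator. The `validate_record` name, its arguments, and the mojibake heuristic (a U+FFFD replacement character, or `Ã` followed by a Latin-1 control or symbol character, which is typical of UTF-8 decoded as Latin-1) are assumptions, not the skill's own rules.

```python
from __future__ import annotations

import re

# Heuristic for UTF-8 text that was wrongly decoded as Latin-1 (e.g. "CafÃ©")
MOJIBAKE = re.compile("\ufffd|Ã[\u0080-\u00bf]")


def validate_record(
    record: dict,
    required: list[str],
    ranges: dict[str, tuple[float, float]],
) -> list[str]:
    """Return a list of human-readable problems found in one scraped record."""
    problems = []
    # Coverage: every expected field is present and non-empty
    for field in required:
        value = record.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing or empty field: {field}")
    # Outliers: numeric values outside the expected range
    for field, (low, high) in ranges.items():
        value = record.get(field)
        if isinstance(value, (int, float)) and not low <= value <= high:
            problems.append(f"out of range: {field}={value}")
    # Encoding: strings that look damaged
    for field, value in record.items():
        if isinstance(value, str) and MOJIBAKE.search(value):
            problems.append(f"possible encoding damage: {field}")
    return problems
```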
6. REFLECT: Quality, ethics, and telemetry
Validate deduplication: no repeated URLs or records
Confirm from timestamp logs that rate limiting was respected
Report telemetry via mcp-skillschain
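The deduplication and rate-limit checks can both be run over the scraper's output after the fact. The sketch below assumes each record carries a `url` key and that request timestamps were logged as seconds; both helper names are hypothetical. Telemetry reporting is left out because the mcp-skillschain interface is not described here.

```python
from __future__ import annotations


def deduplicate(records: list[dict], key: str = "url") -> list[dict]:
    """Keep the first record seen for each key value, preserving order."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


def rate_limit_violations(timestamps: list[float], crawl_delay: float) -> list[int]:
    """Return indexes of requests sent less than crawl_delay seconds after the previous one."""
    return [
        i
        for i in range(1, len(timestamps))
        if timestamps[i] - timestamps[i - 1] < crawl_delay
    ]
```

An empty list from `rate_limit_violations` means every logged request honored the delay.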

Agent Telemetry

Executions: 0 (total)
Success Rate: 0% (last 30d)
Average Latency: 0.0s (p50)
Hallucination: 0.0% (detection)
Input Tokens: 0 (avg 0/exec)
Output Tokens: 0 (avg 0/exec)

Related Skills

Hebbian Synapse weight for every link: w = 0.3·α + 0.5·β + 0.2·γ

Composes with Data Visualization · 21% · Composite 0.210 · Score 80
Composes with ← ETL Pipeline Builder · 21% · Composite 0.210 · Score 86
Composes with ← Playwright CLI Skill · 70% · Composite 0.700 · Score 80
Composes with ← Chrome DevTools Agent · 70% · Composite 0.700 · Score 80
Similar to ← Requirements for Outputs · 60% · Composite 0.600 · Score 83
Similar to ← BigQuery Pipeline Audit: Cost, Safety and Production Readiness · 60% · Composite 0.600 · Score 80
Similar to ← Azure Cosmos DB NoSQL Data Modeling Expert System Prompt · 60% · Composite 0.600 · Score 83
Similar to ← When to Use This Skill · 60% · Composite 0.600 · Score 80
Similar to ← Geofeed Tuner – Create Better IP Geolocation Feeds · 60% · Composite 0.600 · Score 80
Similar to ← Migrating Stored Procedures from Oracle to PostgreSQL · 60% · Composite 0.600 · Score 82
Similar to ← Oracle-to-PostgreSQL Database Migration · 60% · Composite 0.600 · Score 82
Similar to ← Shuffle JSON Data · 60% · Composite 0.600 · Score 79
Similar to ← Snowflake Semantic Views · 60% · Composite 0.600 · Score 82
Similar to ← SQL Performance Optimization Assistant · 60% · Composite 0.600 · Score 87
Similar to ← Schema Markup · 60% · Composite 0.600 · Score 83
Co-executed ← GEO-SEO Analyzer · 40% · Composite 0.400 · Score 90
Co-executed ← Data Visualization · 12% · Composite 0.115 · Score 80
Co-executed ← React Component Generator · 40% · Composite 0.400 · Score 89
Co-executed ← REST API Builder · 40% · Composite 0.400 · Score 90
Co-executed ← ETL Pipeline Builder · 48% · Composite 0.478 · Score 86
Co-executed ← Mobile Responsive Checker · 40% · Composite 0.400 · Score 79
Co-executed ← BigQuery Pipeline Audit: Cost, Safety and Production Readiness · 48% · Composite 0.478 · Score 80
Co-executed ← Azure Cosmos DB NoSQL Data Modeling Expert System Prompt · 47% · Composite 0.475 · Score 83
Co-executed ← When to Use This Skill · 48% · Composite 0.478 · Score 80
Co-executed ← Migrating Stored Procedures from Oracle to PostgreSQL · 49% · Composite 0.494 · Score 82
Co-executed ← Oracle-to-PostgreSQL Database Migration · 49% · Composite 0.491 · Score 82
Co-executed ← Snowflake Semantic Views · 51% · Composite 0.506 · Score 82
Co-executed ← SQL Performance Optimization Assistant · 51% · Composite 0.508 · Score 87

Skill Tree

Web Scraper
web-scraper
Cognitive Phases (5)
1. SENSE: Perception
2. CONTEXTUALIZE: Contextualization
3. HYPOTHESIZE: Hypothesis
4. RECOMMEND: Recommendation
5. REFLECT: Reflection
Triggers (15)
scrape website, extrair dados de site, crawl page, web crawling, coletar dados web, web scraping, html parsing, data extraction, playwright scraper, beautifulsoup, cheerio scraper, extrair tabela do site, monitorar preço, scraping ético, crawler configuration

Score Breakdown

⭐ Human Rating: 0%
🤖 Agent Success: 0%
🕐 Freshness: 100%
🔗 Dependency Health: 100%
🕸️ Graph Centrality: 0%
🛡️ Security: 50%
CompositeScore = α·Human + β·Agent + γ·Recency + δ·Deps + ε·Centrality + ζ·Security

Installation

$ synaptic mcp download web-scraper
$ synaptic skills detail web-scraper
$ synaptic skills live web-scraper
