Agents · Medium

Site Crawler Skill

by mindmorass · v1.0.0 · updated 2026-04-11
Score: 78

Crawl and extract content from websites

web-crawling, content-extraction, rag-pipeline, data-ingestion, site-scraping, document-processing, information-retrieval
Languages: Python
Stars: 0 · Forks: 0 · Uses: 0

Skill Document

SKILL.md · site-crawler/workflow
1. Identify Target Website: Determine the base URL and scope of the website to be crawled.
2. Check robots.txt: Parse the site's robots.txt file and skip any disallowed paths.
3. Discover URLs: Use sitemaps and initial URLs to build a queue of pages to crawl.
4. Crawl Pages: Fetch each page, respecting rate limits, and extract content.
5. Extract Content: Use trafilatura and BeautifulSoup to extract the main content, headings, and metadata.
6. Convert to Markdown: Convert the extracted content to markdown format for RAG ingestion.
7. Store Results: Save the extracted content and metadata for use in a RAG pipeline.
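
Steps 2 through 4 of the workflow above can be sketched as follows. This is a minimal illustration rather than the skill's actual implementation: the names `build_robots`, `crawl`, `LinkParser`, and the `site-crawler-sketch` user agent are hypothetical, the page fetcher is passed in as a callable, and link discovery uses the standard library's `html.parser` in place of BeautifulSoup so the example runs without third-party packages.

```python
import time
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.robotparser import RobotFileParser


class LinkParser(HTMLParser):
    """Collects href values from <a> tags (stand-in for BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def build_robots(base_url, robots_txt):
    """Step 2: parse robots.txt text that was already fetched for base_url."""
    parser = RobotFileParser(urljoin(base_url, "/robots.txt"))
    parser.parse(robots_txt.splitlines())
    return parser


def crawl(base_url, seeds, robots, fetch,
          user_agent="site-crawler-sketch", delay=1.0, max_pages=50):
    """Steps 3-4: breadth-first crawl restricted to base_url's host.

    `fetch` is any callable mapping a URL to its HTML text.
    Returns a dict of {url: html} in the order pages were fetched.
    """
    host = urlparse(base_url).netloc
    queue = deque(seeds)
    seen = set(seeds)
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if not robots.can_fetch(user_agent, url):
            continue  # disallowed by robots.txt, never fetched
        if pages:
            time.sleep(delay)  # fixed pause between requests (simple rate limit)
        html = fetch(url)
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before deduplicating.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

With the listed dependencies, `fetch` could be something like `lambda url: httpx.get(url).text`, and each returned page could then be passed to `trafilatura.extract(html)` for step 5. Sitemap discovery would simply add more entries to `seeds`.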

Agent Telemetry

Executions: 0 (total)
Success Rate: 0% (last 30 days)
Average Latency: 0.0s (p50)
Hallucination: 0.0% (detection)
Input Tokens: 0 (avg 0/exec)
Output Tokens: 0 (avg 0/exec)

Related Skills

Similar to Byted Web Search (60%)
Hebbian Synapse · Composite: 0.600
w = 0.3·α + 0.5·β + 0.2·γ
85

Skill Tree

Site Crawler Skill
site-crawler
Cognitive Phases (4)
1. SENSE
2. CONTEXTUALIZE
3. ACT
4. REFLECT
Triggers (8)
crawl a website for content
extract content from a URL
scrape a website for RAG
ingest data from a website
crawl documentation sites
extract structured content from a website
harvest web content for RAG
crawl a site and extract markdown

Score Breakdown

⭐ Human Rating: 0%
🤖 Agent Success: 0%
🕐 Recency: 100%
🔗 Dependency Health: 100%
🕸️ Graph Centrality: 0%
🛡️ Security: 50%
CompositeScore = α·Human + β·Agent + γ·Recency + δ·Deps + ε·Centrality + ζ·Security
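
The composite score formula above is a weighted sum of the six breakdown values. The page does not state the coefficients α through ζ, so the sketch below uses clearly hypothetical weights just to show the arithmetic. The result is therefore not expected to match the score displayed on this page.

```python
import math

# HYPOTHETICAL weights (alpha..zeta); the platform's real values are not given here.
HYPOTHETICAL_WEIGHTS = {
    "human": 0.20,       # alpha
    "agent": 0.20,       # beta
    "recency": 0.15,     # gamma
    "deps": 0.15,        # delta
    "centrality": 0.10,  # epsilon
    "security": 0.20,    # zeta
}

# Component percentages as listed in the score breakdown.
components = {
    "human": 0,
    "agent": 0,
    "recency": 100,
    "deps": 100,
    "centrality": 0,
    "security": 50,
}


def composite_score(components, weights):
    """Weighted sum of component scores; weights must cover the same keys and sum to 1."""
    if components.keys() != weights.keys():
        raise ValueError("components and weights must cover the same factors")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * components[name] for name in components)


score = composite_score(components, HYPOTHETICAL_WEIGHTS)
```

Under these assumed weights only recency, dependency health, and security contribute, since the other three components are 0%.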

Installation

$ synaptic mcp download site-crawler
$ synaptic skills detail site-crawler
$ synaptic skills live site-crawler

Dependencies

httpx, beautifulsoup4, lxml, trafilatura, markdownify
