arcade-mcp/toolkits/web/arcade_web/tools/models.py at cc2a08ec3499b47bcf4affa18fbee79d5ea2db45 - admin-valentin/arcade-mcp


Eric Gustin cc2a08ec34

Add Firecrawl Tools For The New `arcade_web` Toolkit (#110)

# PR Description
This PR adds six new tools to the new `arcade_web` toolkit. None of
these tools require auth. They do, however, require the
`FIRECRAWL_API_KEY` API key to be set.

The new tools implement the [Firecrawl](https://www.firecrawl.dev/) APIs
`/scrape (POST)`, `/crawl (POST)`, `/crawl/{id} (GET)`, `/crawl/{id}
(DELETE)`, and `/map (POST)`.

The six tools are:
* `Web.ScrapeUrl`
    - In the future I would like this tool to support the `actions`
      (clicking, scrolling, screenshotting, etc.) and `extract` (specify
      what you want to scrape) parameters. Firecrawl supports both.
* `Web.CrawlWebsite`
    - If `async_crawl` is true, the tool returns only the ID of the
      crawl job, whose results you can retrieve later with the
      `Web.GetCrawlData` tool. If `async_crawl` is false, the entire
      contents of the crawl are returned.
* `Web.GetCrawlStatus`
    - Works for in-progress or recently finished crawl jobs (Firecrawl's
      limitation)
* `Web.GetCrawlData`
    - Works for in-progress or recently finished crawl jobs (Firecrawl's
      limitation)
* `Web.CancelCrawl`
    - You can cancel an in-progress async crawl job
* `Web.MapWebsite`
    - This endpoint is in alpha. It can return all of the links of an
      entire website, or you can use the optional `search` parameter to
      describe in natural language which links to map, for example
      "only map webpages that are about AI".
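The two `Web.CrawlWebsite` modes described above can be sketched as follows. This is only an illustration of the control flow, under stated assumptions: `FakeCrawlService`, `crawl_website`, and the shape of the returned dictionaries are hypothetical stand-ins, not the toolkit's code or the Firecrawl client.

```python
class FakeCrawlService:
    """Hypothetical in-memory stand-in for a crawl backend (not Firecrawl)."""

    def __init__(self):
        self._jobs = {}

    def start(self, url):
        # Register a job and pretend it finished immediately.
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[job_id] = {"status": "completed", "data": [f"contents of {url}"]}
        return job_id

    def get(self, job_id):
        # Look up a previously started job by its ID.
        return self._jobs[job_id]


def crawl_website(service, url, async_crawl):
    """Hypothetical helper mirroring the async/sync behavior described above."""
    job_id = service.start(url)
    if async_crawl:
        # Async mode: hand back only the job ID for a later data lookup.
        return {"id": job_id}
    # Sync mode: return the full crawl contents right away.
    return service.get(job_id)


service = FakeCrawlService()
async_result = crawl_website(service, "https://example.com", async_crawl=True)
sync_result = crawl_website(service, "https://example.com", async_crawl=False)
```

In the async case a caller would keep the returned ID and pass it to a later lookup, which is the role the description assigns to `Web.GetCrawlData`.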

2024-10-17 16:10:53 -07:00

11 lines

264 B

Python


from enum import Enum


# Models and enums for firecrawl web tools
class Formats(str, Enum):
    MARKDOWN = "markdown"
    HTML = "html"
    RAW_HTML = "rawHtml"
    LINKS = "links"
    SCREENSHOT = "screenshot"
    SCREENSHOT_AT_FULL_PAGE = "screenshot@fullPage"
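
Because `Formats` mixes in `str`, its members compare equal to their raw string values and can be looked up from those strings. A minimal sketch of how a tool might turn a list of members into a JSON-ready request body follows; the `build_scrape_payload` helper and the payload shape are assumptions for illustration, not code from this PR.

```python
from enum import Enum


class Formats(str, Enum):
    MARKDOWN = "markdown"
    HTML = "html"
    RAW_HTML = "rawHtml"
    LINKS = "links"
    SCREENSHOT = "screenshot"
    SCREENSHOT_AT_FULL_PAGE = "screenshot@fullPage"


def build_scrape_payload(url: str, formats: list[Formats]) -> dict:
    """Hypothetical helper: assemble a request body with plain-string formats."""
    return {"url": url, "formats": [f.value for f in formats]}


payload = build_scrape_payload(
    "https://example.com", [Formats.MARKDOWN, Formats.RAW_HTML]
)

# Members can be recovered from the raw strings an API or caller supplies.
full_page = Formats("screenshot@fullPage")
```

Using `.value` keeps the payload free of enum objects, so it serializes the same way regardless of the JSON library in use.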