# Add Bright Data Web Scraping and Data Extraction Toolkit
## Overview
This PR adds a Bright Data toolkit that provides web scraping, search,
and structured data extraction through the Bright Data API.
## Features Added
### Core Tools
1. **`scrape_as_markdown`** - Scrapes any webpage and returns clean
Markdown content
2. **`get_screenshot`** - Captures screenshots of webpages and saves
them locally
3. **`search_engine`** - Runs searches on Google, Bing, or Yandex with
configurable parameters
4. **`web_data_feed`** - Extracts structured data from major platforms
(LinkedIn, Amazon, Instagram, Facebook, X, YouTube, Zillow, Booking.com,
etc.)
### Supporting Infrastructure
- **`BrightDataClient`**
- Error handling
- URL encoding utilities and request optimization
## Technical Details
### Search Engine Capabilities
- Multi-engine support (Google, Bing, Yandex)
- Advanced parameters: language, country, search type (images, shopping,
news)
- Device targeting (mobile, iOS, Android, iPad)
- Pagination and result count control
- Location-based searches
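
As a rough illustration of how the search parameters above could translate into a request URL, here is a minimal sketch. The function name `build_search_url`, the base URLs, and the mapping to Google's `hl`, `gl`, `tbm`, `start`, and `num` query parameters are assumptions made for this example; they are not taken from the toolkit's code, which may build URLs differently.

```python
from urllib.parse import urlencode

# Hypothetical base URLs; the toolkit's own mapping may differ.
SEARCH_BASE_URLS = {
    "google": "https://www.google.com/search",
    "bing": "https://www.bing.com/search",
    "yandex": "https://yandex.com/search/",
}

# Google's `tbm` values for the search types named in this PR.
GOOGLE_SEARCH_TYPES = {"images": "isch", "shopping": "shop", "news": "nws"}


def build_search_url(query, engine="google", language=None, country=None,
                     search_type=None, start=None, num_results=None):
    """Build a search URL (illustrative helper, not the toolkit's API)."""
    if engine not in SEARCH_BASE_URLS:
        raise ValueError(f"Unsupported engine: {engine!r}")

    # Yandex takes the query in `text`; Google and Bing use `q`.
    query_key = "text" if engine == "yandex" else "q"
    params = {query_key: query}

    # Only Google's extra parameters are sketched here.
    if engine == "google":
        if language:
            params["hl"] = language
        if country:
            params["gl"] = country
        if search_type:
            params["tbm"] = GOOGLE_SEARCH_TYPES[search_type]
        if start is not None:
            params["start"] = start
        if num_results is not None:
            params["num"] = num_results

    # urlencode percent-encodes values and joins them with "&".
    return f"{SEARCH_BASE_URLS[engine]}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("best pizza", language="en", country="us",
                           search_type="news", start=10))
```

Keeping the engine-specific parameter names in one place means callers only deal with neutral names such as `language` and `country`.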
### Structured Data Sources
Supports 13+ data sources including:
- **E-commerce**: Amazon products and reviews
- **Professional**: LinkedIn profiles and companies, ZoomInfo
- **Social Media**: Instagram, Facebook, X (Twitter) content
- **Real Estate**: Zillow property listings
- **Travel**: Booking.com hotel listings
- **Video**: YouTube videos and metadata
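
The grouping above can be pictured as a small lookup table. The sketch below uses only the categories and platforms listed in this description; the names `DATA_SOURCES` and `category_of` are hypothetical and do not reflect the identifiers the toolkit actually uses.

```python
# Categories and platforms as listed in this PR description.
# The structure and names are illustrative only.
DATA_SOURCES = {
    "E-commerce": ["Amazon"],
    "Professional": ["LinkedIn", "ZoomInfo"],
    "Social Media": ["Instagram", "Facebook", "X"],
    "Real Estate": ["Zillow"],
    "Travel": ["Booking.com"],
    "Video": ["YouTube"],
}


def category_of(platform: str) -> str:
    """Return the category for a platform name (case-insensitive)."""
    wanted = platform.strip().lower()
    for category, platforms in DATA_SOURCES.items():
        if wanted in (name.lower() for name in platforms):
            return category
    raise ValueError(f"Unknown platform: {platform!r}")


if __name__ == "__main__":
    print(category_of("zillow"))
```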
## Testing & Validation
- [x] Deployed and tested on personal account
- [x] Tested via ngrok as well
- [x] Verified all tool functions work as expected
- [x] Validated against multiple data sources and search engines
- [x] Confirmed error handling and edge cases
## Security & Best Practices
- Requires the API key and zone to be configured via secrets
(`BRIGHTDATA_API_KEY`, `BRIGHTDATA_ZONE`)
## Dependencies
- `requests` - HTTP client
- `arcade_tdk` - Arcade toolkit framework
- Standard library modules: `json`, `time`, `typing`, `urllib.parse`
## Notes
- All tools require the `BRIGHTDATA_API_KEY` secret
- Search and scraping tools also require the `BRIGHTDATA_ZONE` secret
- Follows Arcade AI toolkit patterns and conventions
- Docstrings include usage examples
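
The secret requirements above could be checked up front along these lines. This is a sketch under stated assumptions: `BrightDataConfig` and `load_config` are hypothetical names, and the secrets are passed in as a plain mapping rather than through Arcade's own secret-retrieval mechanism, which is not shown in this description.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BrightDataConfig:
    """Validated credentials (illustrative container, not the toolkit's)."""
    api_key: str
    zone: Optional[str] = None


def load_config(secrets: Mapping[str, str], require_zone: bool) -> BrightDataConfig:
    """Validate the secrets a tool needs before any request is made."""
    api_key = secrets.get("BRIGHTDATA_API_KEY", "").strip()
    if not api_key:
        raise ValueError("BRIGHTDATA_API_KEY secret is missing or empty")

    # Treat an empty or whitespace-only zone as not set.
    zone = secrets.get("BRIGHTDATA_ZONE", "").strip() or None
    if require_zone and zone is None:
        raise ValueError("BRIGHTDATA_ZONE secret is missing or empty")

    return BrightDataConfig(api_key=api_key, zone=zone)


if __name__ == "__main__":
    # Placeholder values for demonstration only.
    demo = {"BRIGHTDATA_API_KEY": "example-key", "BRIGHTDATA_ZONE": "example-zone"}
    print(load_config(demo, require_zone=True))
```

Failing early with a clear message avoids sending a request that the API would reject anyway.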
This toolkit expands Arcade AI's web data capabilities, letting users
scrape, search, and extract structured data from the web through a
single interface.
---------
Authored-by: meirk-brd