Overview
The Web Scraping API lets you extract content and data from websites.
Features
- Supports CSS selectors: The API lets you specify a CSS selector to extract specific elements from a web page, so you retrieve only the data you need.
- Blocks ads and trackers: The API blocks advertisements and tracking scripts, so they do not interfere with scraping.
- Based on Google Chrome Headless: The API uses Google Chrome Headless, a headless version of the Chrome browser, to render and interact with web pages. This gives reliable and consistent scraping results.
- Supports TLS/HTTPS: The API can scrape websites served over HTTPS, so you can extract data from secure web pages.
- Highly available and easy to use: The API is designed to be highly available and easy to integrate into various applications and workflows.
Getting Started
To access the Web Scraping API, you need an API key from AnyAPI.io. The API key authenticates your requests and must be included in the request URL as the apiKey query parameter.
Parse and extract data from a given URL endpoint
Send a GET request to the following URL to parse and extract data from a given URL:
curl --request GET \
--url 'https://anyapi.io/api/v1/scrape?url=https%3A%2F%2Fexample.com&selector=SOME_STRING_VALUE&apiKey=YOUR_API_KEY'
Example response:
{
  "content": "<!DOCTYPE html><html>...</body></html>"
}
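The same request can be issued from a script. The following is a minimal Python sketch that assumes only the endpoint and query parameters shown in the curl example. The helper names `build_scrape_url` and `scrape` and the 30-second timeout are illustrative choices, not part of any official client.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Endpoint taken from the curl example above.
API_ENDPOINT = "https://anyapi.io/api/v1/scrape"


def build_scrape_url(target_url: str, api_key: str, selector: Optional[str] = None) -> str:
    """Build the request URL, percent-encoding every query parameter."""
    params = {"url": target_url}
    if selector is not None:
        params["selector"] = selector
    params["apiKey"] = api_key
    # urlencode percent-encodes reserved characters such as ":" and "/",
    # so "https://example.com" becomes "https%3A%2F%2Fexample.com".
    return f"{API_ENDPOINT}?{urllib.parse.urlencode(params)}"


def scrape(target_url: str, api_key: str, selector: Optional[str] = None,
           timeout: float = 30.0) -> str:
    """Send the GET request and return the `content` field of the JSON response."""
    request_url = build_scrape_url(target_url, api_key, selector)
    with urllib.request.urlopen(request_url, timeout=timeout) as response:
        payload = json.load(response)
    return payload["content"]
```

Calling `scrape("https://example.com", "YOUR_API_KEY", selector="h1")` would send the same kind of request as the curl command, with `h1` standing in for the selector value.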
Request parameters
Use the parameters listed below to customize your request. Each parameter is marked as Required or Optional.
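Parameter | Type | Required | Description |
---|---|---|---|
url | string | Required | Target URL (URL-encoded). |
selector | string | Optional | CSS selector. |
apiKey | string | Required | Your unique API key, which is required to authenticate your requests. |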
Response parameters
The API returns its response in a simple, lightweight JSON format.
Field | Type | Description |
---|---|---|
content | string | The content of the web page you requested, typically HTML code as in the example response. |
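Because `content` is a plain string, it can be handed to any HTML parser. The sketch below uses Python's standard-library `html.parser` and assumes the field holds an HTML document, as in the example response. The class name `TitleExtractor`, the function `extract_title`, and the sample body are illustrative and do not come from the API itself.

```python
import json
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def extract_title(response_body: str) -> str:
    """Decode the JSON response and return the page title found in `content`."""
    content = json.loads(response_body)["content"]
    parser = TitleExtractor()
    parser.feed(content)
    parser.close()
    return parser.title.strip()


# Hypothetical response body, built locally for illustration only.
sample_body = json.dumps({
    "content": "<!DOCTYPE html><html><head><title>Example Domain</title>"
               "</head><body></body></html>"
})
print(extract_title(sample_body))  # prints "Example Domain"
```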
Response and error codes
When a request fails, the API also returns the error in JSON format. Each error includes an error code and a description; the codes are listed below.
Status Code | Type | Details |
---|---|---|
200 | OK | The request was successful. |
400 | Bad Request | The request was invalid or cannot be otherwise served. |
401 | Unauthorized | Authentication credentials were missing or incorrect. |
404 | Not Found | The requested resource could not be found. |
422 | Quota reached | The request cannot be served because the quota has been reached. |
429 | Too Many Requests | The request cannot be served due to the application's rate limit having been exhausted for the resource. |
500 | Internal Server Error | Something went wrong on the server. |
503 | Service Unavailable | The service is temporarily unavailable. |
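A client can use these status codes to decide how to react to a failed request. The sketch below shows one possible policy and is not documented API behavior: it assumes that 429, 500, and 503 are worth retrying and that every other error is final. The function names and the backoff values are likewise assumptions.

```python
from typing import Iterator

# Status codes and types copied from the table above.
STATUS_TYPES = {
    200: "OK",
    400: "Bad Request",
    401: "Unauthorized",
    404: "Not Found",
    422: "Quota reached",
    429: "Too Many Requests",
    500: "Internal Server Error",
    503: "Service Unavailable",
}

# Assumption: rate-limit and server-side errors may be transient, so retry them.
RETRYABLE = frozenset({429, 500, 503})


def describe_status(status: int) -> str:
    """Return the type listed for a status code, or a fallback label."""
    return STATUS_TYPES.get(status, "Unknown status")


def should_retry(status: int) -> bool:
    """Return True if the status is one this sketch treats as retryable."""
    return status in RETRYABLE


def backoff_delays(max_attempts: int, base_seconds: float = 1.0) -> Iterator[float]:
    """Yield exponentially growing delays: base, 2 * base, 4 * base, ..."""
    for attempt in range(max_attempts):
        yield base_seconds * (2 ** attempt)
```

With `urllib.request`, a failed request raises `urllib.error.HTTPError`, whose `code` attribute can be passed to `should_retry` before sleeping for the next value from `backoff_delays`.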