Set up your Batch API
Welcome to Similarweb’s Batch API, which gives you scalable access to the world’s largest digital measurement database. Retrieve Similarweb data for more than 1,000,000 domains and 5 years of history, with dozens of metrics, in one API call.
This guide walks you through the two main steps required to get millions of data points from the API.
Get started checklist
- 🔑 Get an API key by following the Get an API Key tutorial, or generate a new key directly from your account
- 📈 Choose the data and metrics you need based on your subscription on datahub.similarweb.com or discover new datasets here
- 📝 Create a request report with a valid JSON body
- 🔗 Connect and integrate with your data lake (S3, Snowflake, Databricks)
Step-by-step guide
1. Submit a POST request
   Make a POST request with a JSON body, either inline or attached as a file using multipart/form-data.
2. Track your report
   After you receive your report ID, use the Request Report Status endpoint to receive the status and download URL once the report is ready.
https://api.similarweb.com/batch/v5/request-report
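import requests

url = "https://api.similarweb.com/batch/v5/request-report"
headers = {"api-key": "{{your_api_key}}"}

# Attach the JSON body as a file using multipart/form-data.
# The context manager closes the file once the request has been sent.
with open("/Users/Batchexample.json", "rb") as body_file:
    files = [("request", ("Batchexample.json", body_file, "application/json"))]
    response = requests.post(url, headers=headers, files=files)

# Save the report ID from the response; you need it to track the report.
print(response.text)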
Example JSON body
{
  "delivery_information": {
    "response_format": "csv"
  },
  "report_query": {
    "tables": [
      {
        "vtable": "traffic_and_engagement",
        "granularity": "monthly",
        "filters": {
          "domains": ["similarweb.com", "api.similarweb.com"],
          "countries": ["WW", "US"],
          "include_subdomains": true
        },
        "metrics": [
          "all_traffic_visits",
          "desktop_new_visitors",
          "desktop_pages_per_visit",
          "desktop_returning_visitors"
        ],
        "start_date": "2023-02",
        "end_date": "2024-02"
      }
    ]
  }
}
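The upload example earlier in this guide attaches the body as a file. As a rough sketch of the inline option, the same body can be built as a Python dict and sent with the `json=` argument of `requests.post`. The helper names `build_report_body` and `submit_report` are illustrative and not part of the Similarweb API, and the sketch assumes the endpoint accepts the inline body as a plain JSON payload.

```python
import json


def build_report_body(domains, countries, metrics, start_date, end_date,
                      vtable="traffic_and_engagement", granularity="monthly",
                      response_format="csv", include_subdomains=True):
    """Assemble a request-report body shaped like the example JSON body."""
    return {
        "delivery_information": {"response_format": response_format},
        "report_query": {
            "tables": [
                {
                    "vtable": vtable,
                    "granularity": granularity,
                    "filters": {
                        "domains": list(domains),
                        "countries": list(countries),
                        "include_subdomains": include_subdomains,
                    },
                    "metrics": list(metrics),
                    "start_date": start_date,
                    "end_date": end_date,
                }
            ]
        },
    }


def submit_report(body, api_key):
    """POST the body inline as JSON (hypothetical helper, not called here)."""
    import requests  # imported here so building a body needs no third-party package

    url = "https://api.similarweb.com/batch/v5/request-report"
    return requests.post(url, headers={"api-key": api_key}, json=body)


body = build_report_body(
    domains=["similarweb.com", "api.similarweb.com"],
    countries=["WW", "US"],
    metrics=["all_traffic_visits", "desktop_new_visitors"],
    start_date="2023-02",
    end_date="2024-02",
)
print(json.dumps(body, indent=2))
```

Building the body in code keeps the domain and metric lists easy to generate from other data, while the file-based upload may be more convenient for very large domain lists.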
Mandatory parameters
When requesting a report, include the following parameters in your JSON body.
Parameter | Description | Example values |
--- | --- | --- |
vtable | The dataset you want to pull metrics from. See the full list on Datahub or the discovery endpoints. | traffic_and_engagement |
domains | Domain names may include letters, numbers, dashes, and hyphens. One request supports up to 1M domains. | amazon.com |
countries | 2-letter ISO country codes (case-sensitive, capital letters). Use "WW" for worldwide. When calling desktop_top_geo, remove all countries from your JSON. | WW, US, GB |
metrics | List of metrics per dataset. | all_traffic_visits |
start_date, end_date | Daily granularity uses YYYY-MM-DD; monthly granularity uses YYYY-MM. | Daily: 2023-06-30 / Monthly: 2023-06 |
granularity | Time series granularity. | monthly, weekly, daily |
response_format | Output format of the API call. | JSON, csv, parquet, orc |
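Some of these rules can be checked locally before a request is sent. The sketch below covers only the formats stated in the table: the daily and monthly date formats and the two-capital-letter country codes. The function names are hypothetical, the check is purely syntactic (it does not confirm that a code is a real ISO country), and the date format for weekly granularity is not stated in this guide, so it is deliberately left out.

```python
import re
from datetime import datetime

# Date formats stated in the parameter table; weekly is not documented here.
DATE_FORMATS = {"daily": "%Y-%m-%d", "monthly": "%Y-%m"}


def validate_date(value, granularity):
    """Return True if value matches the date format for the granularity."""
    fmt = DATE_FORMATS.get(granularity)
    if fmt is None:
        raise ValueError(f"no documented date format for granularity {granularity!r}")
    try:
        parsed = datetime.strptime(value, fmt)
    except ValueError:
        return False
    # strptime accepts unpadded values such as "2023-6", so require an exact round trip
    return parsed.strftime(fmt) == value


def invalid_countries(codes):
    """Return the codes that are not exactly two capital letters."""
    return [code for code in codes if not re.fullmatch(r"[A-Z]{2}", code)]
```

For example, `validate_date("2023-06", "monthly")` passes, while `validate_date("2023-06-30", "monthly")` fails because the value uses the daily format.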
Save the report ID you receive after your API request — you'll need it to track the report status.
The request limit is 20 pending requests per user. If you receive a 429 error, you've exceeded the limit. Reduce the frequency of your requests to stay within your account limits.
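One way to reduce request frequency after a 429 is to retry with an increasing delay. The following is a minimal sketch, not an official client: `send_with_backoff` is a hypothetical helper, and the attempt count and delays are arbitrary assumptions. It takes any zero-argument callable that returns an object with a `status_code` attribute, such as a `requests.Response`.

```python
import time


def send_with_backoff(send, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    The last response is returned either way, so the caller can inspect it.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            sleep(base_delay * 2 ** attempt)
    return response
```

A call could look like `send_with_backoff(lambda: requests.post(url, headers=headers, json=body))`. Because the limit applies to pending requests, capacity may only free up once earlier reports finish, so longer delays than these defaults may be needed.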
Optional parameters
Parameter | Description | Accepted values |
--- | --- | --- |
delivery_method | Default is "download_link". When set to "snowflake", the response_format field is not required. | download_link, bucket_access, snowflake |
delivery_method_params | Use when delivering reports to aggregated Snowflake tables. Input "table_name": "your_table_name". | table_name, integration_name, retention_days, overwrite_partitions |
all_history | When true, automatically overrides dates to the minimum start and maximum end dates. | true / false |
latest | When true, overrides the end date with the latest available date. | true / false |
window_size | Overrides the start date with a time relative to the end date. | {number}{y/m/d}, e.g. 12d, 3m, 2y |
limit | Limits the number of results per entity selected. | Above 0; default is 100 for most metrics |
include_subdomains | Default is true. | true / false |
webhook_url | Delivery URL we'll ping when the status of your report changes. | URL |
sort | Sort by a specific metric. | "sort": "all_traffic_visits" |
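The window_size format is easy to get wrong, so a small local check can help. This sketch parses the documented `{number}{y/m/d}` pattern. The function name is hypothetical, reading y/m/d as years, months, and days is an assumption based on the examples, and rejecting zero or leading zeros is a choice made here rather than a documented rule.

```python
import re

# Assumed meaning of the unit suffixes
WINDOW_UNITS = {"y": "years", "m": "months", "d": "days"}


def parse_window_size(value):
    """Split a window_size string such as "3m" into (3, "months")."""
    match = re.fullmatch(r"([1-9]\d*)([ymd])", value)
    if match is None:
        raise ValueError(f"window_size must look like 12d, 3m or 2y, got {value!r}")
    return int(match.group(1)), WINDOW_UNITS[match.group(2)]
```

For instance, `parse_window_size("12d")` yields `(12, "days")`, while a value such as `"3w"` raises a `ValueError`.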
Get the report status
After submitting a request and receiving your report ID, use the Request Report Status endpoint to check progress.
https://api.similarweb.com/batch/request-status/{{generated_report_id}}
import requests

# Replace {{generated_report_id}} with the report ID returned by the request-report call.
url = "https://api.similarweb.com/batch/request-status/{{generated_report_id}}"
headers = {"api-key": "{{your_api_key}}"}

# A GET request needs no body, so no payload is sent.
response = requests.get(url, headers=headers)
print(response.text)
Example responses
{
  "data_points_count": 1779429,
  "download_url": "example_url.com",
  "status": "completed",
  "used_quota": 35589
}
{
  "status": "pending"
}
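Since a report starts out as pending, the status endpoint usually has to be called more than once. The sketch below polls until the status is no longer "pending". The helper names, polling interval, and poll limit are assumptions. Only the "pending" and "completed" statuses appear in this guide, so any other status is returned to the caller rather than interpreted.

```python
import time


def wait_for_report(fetch_status, poll_interval=30.0, max_polls=120, sleep=time.sleep):
    """Poll until the report is no longer pending and return the last response.

    fetch_status is a zero-argument callable returning the parsed JSON of
    the status endpoint, e.g. {"status": "pending"}.
    """
    for attempt in range(max_polls):
        status = fetch_status()
        if status.get("status") != "pending":
            return status
        if attempt < max_polls - 1:
            sleep(poll_interval)
    raise TimeoutError(f"report still pending after {max_polls} polls")


def fetch_status_http(report_id, api_key):
    """Fetch and decode the status of one report (hypothetical helper)."""
    import requests  # imported here so the polling logic can be tested without it

    url = f"https://api.similarweb.com/batch/request-status/{report_id}"
    response = requests.get(url, headers={"api-key": api_key})
    response.raise_for_status()
    return response.json()
```

A typical call would be `wait_for_report(lambda: fetch_status_http(report_id, api_key))`, after which the `download_url` field can be read from the returned dict if the status is "completed". If you set webhook_url in the request, polling may not be necessary.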
The download link remains valid for 30 days. We recommend saving it for some time in case you need our help troubleshooting any issue.
Data credits per request
Data credits are calculated for each report based on the number of results you actually receive:
Formula: Number of domains × Number of metrics × history × cadence (daily/monthly) × Number of countries × Number of results
To estimate how many credits a report will cost before you submit it, use the "request-validate" endpoint.
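For a rough local figure, the formula can be multiplied out directly. This is a literal reading of the formula, not the API's own calculation: treating "history × cadence" as the number of time periods in the date range, counting both the start and end month, and assuming one result per entity are all assumptions. The request-validate endpoint and the used_quota field in the status response remain the authoritative numbers.

```python
def estimate_credits(domains, metrics, periods, countries, results_per_entity=1):
    """Multiply the factors of the credit formula as written."""
    return domains * metrics * periods * countries * results_per_entity


# Example JSON body: 2 domains, 4 metrics, 2 countries, monthly from 2023-02 to 2024-02
months = (2024 - 2023) * 12 + (2 - 2) + 1  # inclusive month count (assumption)
estimate = estimate_credits(domains=2, metrics=4, periods=months, countries=2)
print(estimate)  # 2 * 4 * 13 * 2 = 208
```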