Google Cloud Storage Integration
Integrate Similarweb’s Batch API with your Google Cloud Storage (GCS) environment. This integration allows clients to store and access exported data directly in their own GCS buckets.
1
Create a new Google Cloud connection
Start by creating a new GCS integration to store and access data. This setup generates the service-account credentials that allow you to access your data securely in the specified bucket.
2
Save your credentials
After creating the connection, you receive private credentials including a GCS service-account JSON. These credentials are essential to access the shared bucket securely — store them in a safe place.
3
Generate a report using the create-table URL
Creating a table is similar to requesting a one-time report, but you modify the delivery information and call the create-table URL instead. This sets up a table that will store data continuously.
4
Modify the delivery information
Adjust the delivery_information block to define where and how the data is delivered. Set the delivery method to google_bucket_access to store the data in your GCS bucket.
5
Specify the integration name
Set the integration_name parameter to gcs_default to direct data to the appropriate integration.
6
Name your table
Provide a unique name for your table. This name is used to identify and manage the table in your GCS bucket — choose something that corresponds to the data you'll store.
7
Run the request
Once you've filled out the request body, submit it to create the table. After submission, the table is set up in your GCS bucket and the requested data is added to it.
8
Retrieve the table path
After the table is created, you receive a location representing the GCS path (gs://…) where the table data is stored. Use this path to access the stored data.
9
Add data to the table
To append more data to the table after it's been created, use the request-report URL. This adds new data to the existing table without replacing it.
10
Send data to the table
By specifying the same integration_name and table_name as before, new data is added to your existing table. This keeps your data up to date with additional information as needed.
11
Keep field names consistent
When appending data, response_format, delivery_method, integration_name, and table_name must match the values you used when the table was first created.
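The create and append steps above can be sketched as a pair of delivery blocks plus a consistency check. This is a minimal sketch, not the definitive Batch API schema. Only the names delivery_information, delivery_method, google_bucket_access, integration_name, gcs_default, table_name, and response_format come from the text. Placing all four fields directly inside delivery_information, the values csv and parquet, the table name my_table, and the helper functions are assumptions made for illustration. The report-specific parts of the request body are left out.

```python
# Fields that, per the text, must be identical between the create-table
# request and every later append (request-report) request.
MATCHING_FIELDS = ("response_format", "delivery_method", "integration_name", "table_name")


def build_delivery_information(table_name, integration_name="gcs_default", response_format="csv"):
    """Return a delivery_information block for GCS delivery.

    Assumption: all four fields sit directly inside delivery_information,
    and "csv" is an accepted response_format value.
    """
    return {
        "response_format": response_format,
        "delivery_method": "google_bucket_access",
        "integration_name": integration_name,
        "table_name": table_name,
    }


def mismatched_fields(created, appended):
    """List the fields whose values differ between two delivery blocks."""
    return [f for f in MATCHING_FIELDS if created.get(f) != appended.get(f)]


# "my_table" is a hypothetical table name.
create_block = build_delivery_information("my_table")
append_block = build_delivery_information("my_table")
# A block that changes the format after the table was created.
bad_block = build_delivery_information("my_table", response_format="parquet")

print(mismatched_fields(create_block, append_block))  # []
print(mismatched_fields(create_block, bad_block))     # ['response_format']
```

Running a check like this before each append request catches a changed field locally, before the request is sent.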
Response Properties
The response describes the created integration.
Property | Type | Description | Example
expiration_date | str | Date when the credentials expire | 2025-03-21
gcs_bucket | str | The bucket where Similarweb will deliver the data | sw-daas-production
gcs_prefix | str | The path prefix inside the GCS bucket for this integration | account_id/my_integration/
integration_name | str | The name of the created integration. Cannot be changed. | my_integration
secret | str | A complete GCP service account JSON for use with the integration | *
The secret field in the response contains a Google service account JSON.
To access your GCS integration programmatically (e.g., using PySpark), save this JSON to a file and set the GOOGLE_APPLICATION_CREDENTIALS environment variable to that file's path to authenticate.
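Saving the secret and setting the environment variable might look like the sketch below. The helper name save_service_account, the file name similarweb-gcs.json, and the placeholder secret are hypothetical. A real secret is the full service account JSON returned in the response. The sketch uses only the Python standard library.

```python
import json
import os
import tempfile


def save_service_account(secret, path):
    """Write the service account JSON to `path` and point
    GOOGLE_APPLICATION_CREDENTIALS at it for the current process."""
    # The secret may arrive as a JSON string or an already-parsed dict.
    info = json.loads(secret) if isinstance(secret, str) else secret
    # On POSIX systems, create the file readable and writable by the owner only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        json.dump(info, fh)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = path
    return path


# Placeholder standing in for the `secret` field of the response.
example_secret = '{"type": "service_account", "project_id": "example-project"}'
key_path = os.path.join(tempfile.mkdtemp(), "similarweb-gcs.json")
save_service_account(example_secret, key_path)
```

Setting os.environ affects only the current process and its children. For jobs launched elsewhere, export the variable in that environment instead.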