Scrapy feed export storage backend for GoogleSheets.
- Python 3.8+
Install the Google Sheets Exporter for Scrapy via pip:
pip install git+https://github.com/scrapy-plugins/scrapy-feedexporter-google-sheetsFollow these steps to use the Google Sheets Exporter with Scrapy:
-
Add this storage backend to the Scrapy settings FEED_STORAGES dict, as follows:
# settings.py FEED_STORAGES = {'gsheets': 'scrapy_google_sheets_exporter.gsheets_exporter.GoogleSheetsFeedStorage'}
-
Configure authentication by providing Google service account credentials as a dictionary in the Scrapy settings
GOOGLE_CREDENTIALS.For example:
# settings.py GOOGLE_CREDENTIALS = { "type": "service_account", "project_id": "project_id here", "private_key_id": "private_key_id here", "private_key": "private_key here", "client_email": "[email protected]", "client_id": "client_id here", "auth_uri": "auth_uri here", "token_uri": "token_uri here", "auth_provider_x509_cert_url": "auth_provider_x509_cert_url here", "client_x509_cert_url": "client_x509_cert_url here" }
-
Share the Google Sheet file with the service account's email (available in GOOGLE_CREDENTIALS as
client_email) and give itEditoraccess. -
Configure the Scrapy settings FEEDS by passing the Google Sheet file URI where the feed should be exported.
For example:
# settings.py FEEDS = { "gsheets://docs.google.com/spreadsheets/d/{spreadsheet_key}/edit#gid={worksheet_id}": { "format": "csv", # mandatory "overwrite": True } }
- You can get the
spreadsheet_keyandworksheet_idfrom the URL of the shared Google Sheet file- e.g:
https://docs.google.com/spreadsheets/d/1fWJgq5yuOdeN3YnkBZiTD0VhB1MLzBNomz0s9YwBREo/edit#gid=1261678709
- e.g:
- IMPORTANT: If the worksheet id is not provided (i.e.: there is no
/edit#gid={worksheet_id}in the end of the URL), this exporter will export data to the first worksheet as default.
- You can get the
- This feed exporter only supports
csvformat. This setting is mandatory, there is no fallback value. - The
overwritefeed option (default = False) determines whether data should be appended to the existing rows in the worksheet or overwrite the data in the worksheet completely, when set to True. - If you are using this exporter in append mode (i.e., overwrite = False), please make sure that the fields to be exported match the data already present in the worksheet. This can be achieved by passing the feed option
fieldsor configuring the FEED_EXPORT_FIELDS setting. - If you prefer not to export the CSV headers to the worksheet (for example, when using the exporter in append mode), please include the following feed option:
-
"item_export_kwargs": {"include_headers_line": False} - This exporter does not support the
batch_item_countfeed option. Instead, please usebatch_size(number of items in each batch) if you want to insert rows in batches into the spreadsheet. Please note that the batch will be capped at a size of 2 MB, in line with Google's recommendation.