How to get the scraper to not store original HTML files? #233
Replies: 1 comment
In case anybody else was curious, I resolved this by setting up a cron job, in a separate command line session, that deletes all .html files in the news-please subdirectory for the chosen date every 2 minutes or so.
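For anyone who wants to try the same workaround, here is a minimal sketch of a cleanup script that a cron job could call. It is not part of news-please; the function name `delete_html_files` and the script layout are my own assumptions, and you have to pass in the output subdirectory yourself.

```python
import sys
from pathlib import Path


def delete_html_files(root: Path) -> int:
    """Delete every .html file under root (recursively) and return the count."""
    removed = 0
    for path in root.rglob("*.html"):
        try:
            path.unlink()
            removed += 1
        except (FileNotFoundError, IsADirectoryError, PermissionError):
            # Skip entries that vanished, are directories, or cannot be removed.
            continue
    return removed


if __name__ == "__main__" and len(sys.argv) == 2:
    # Hypothetical usage: the single argument is the news-please
    # output subdirectory for the date you are scraping.
    target = Path(sys.argv[1])
    print(f"removed {delete_html_files(target)} .html files from {target}")
```

Saved as, say, `delete_html.py` (a hypothetical name), a crontab line such as `*/2 * * * * python3 /path/to/delete_html.py /path/to/output/subdirectory` would run it every 2 minutes. Both paths are placeholders. Only files ending in .html are touched, so the extracted JSON stays in place.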
Hi, the documentation says that in the default config, the tool also stores the HTML files from websites, when running in CLI mode. I wanted to ask if there was a way to switch this off and only store the extracted JSON. I couldn't find out how to do so in the config file. Any help would be appreciated!