Posts tagged "scraping"
Found 7 posts tagged scraping
First of all: this is based on the idea of https://simonwillison.net/2020/Oct/9/git-scraping/ and https://github.com/simonw/ca-fires-history
The gist:
- every 20 minutes
- scrape the Hacker News front page items
- save them in hn.json
- commit and push

Find the git repository at christian-fei/hn-history
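
The gist above can be sketched roughly as follows. This is not the code from christian-fei/hn-history, just a minimal illustration under a few assumptions: it reads the front page through the public Hacker News Firebase API instead of scraping the HTML, and the helper names (`fetch_front_page`, `save_snapshot`, `commit_and_push`, `run_once`) are made up for this example. It also assumes it runs inside a git checkout with a configured remote.

```python
import json
import subprocess
import urllib.request
from pathlib import Path

# Public Hacker News API endpoints (swapped in here instead of HTML scraping).
TOP_STORIES_URL = "https://hacker-news.firebaseio.com/v0/topstories.json"
ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{id}.json"


def fetch_json(url):
    """Download a URL and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def fetch_front_page(limit=30, fetch=fetch_json):
    """Return the first `limit` top stories as a list of item dicts.

    `fetch` is injectable so the function can be exercised without network access.
    """
    story_ids = fetch(TOP_STORIES_URL)[:limit]
    return [fetch(ITEM_URL.format(id=story_id)) for story_id in story_ids]


def save_snapshot(items, path):
    """Write the items to `path`; return True only if the file content changed."""
    new_text = json.dumps(items, indent=2, sort_keys=True) + "\n"
    path = Path(path)
    if path.exists() and path.read_text(encoding="utf-8") == new_text:
        return False
    path.write_text(new_text, encoding="utf-8")
    return True


def commit_and_push(repo_dir, filename="hn.json"):
    """Stage the snapshot, commit it and push to the default remote."""
    subprocess.run(["git", "add", filename], cwd=repo_dir, check=True)
    subprocess.run(["git", "commit", "-m", "Update " + filename], cwd=repo_dir, check=True)
    subprocess.run(["git", "push"], cwd=repo_dir, check=True)


def run_once(repo_dir="."):
    """One scrape cycle: fetch, save, and commit only when something changed."""
    items = fetch_front_page()
    if save_snapshot(items, Path(repo_dir) / "hn.json"):
        commit_and_push(repo_dir)
```

Calling `run_once()` from a scheduler covers the "every 20 minutes" part; with cron, for example, the schedule would be `*/20 * * * *`. Skipping the commit when nothing changed keeps the git history limited to real changes, which is the point of keeping the file under version control.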
After quite a bit of research, I found the full list of Chromium Command Line Switches.
I was encountering this error when trying to set up a puppeteer instance with a proxy.
In a previous post I tried to explain how to troubleshoot an issue when connecting to a proxy with Puppeteer by investigating API documentation, Chromium flags and all that funny jazz...
Example repository and explanation of practical crawling with browserless and puppeteer.
> Browser automation built for enterprises, loved by developers.

browserless.io is a neat service for hosted puppeteer scraping, but there is also the official Docker image for running it locally. I was amazed when I found out about it 🤯! Find the whole source code on GitHub at christian-fei/browserless-example!