I ran into this error when trying to set up a puppeteer instance with a proxy.
I tried different approaches, but they were either outdated or led me down the wrong path.
The main issue was that I tried to authenticate to the proxy with setExtraHTTPHeaders, like this:
await page.setExtraHTTPHeaders({
  'Proxy-Authorization': 'Basic ...'
});
Then I stumbled upon the following piece of Chromium source code:
// Removing headers can't make the set of pre-existing headers unsafe, but
// adding headers can.
if (!AreRequestHeadersSafe(modified_headers)) {
  NotifyCompleted(net::ERR_INVALID_ARGUMENT);
  // |this| may have been deleted.
  return;
}
This indicates that some headers were modified in an “unsafe” way, which led to the net::ERR_INVALID_ARGUMENT error in Chrome and in the terminal.
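To catch this mistake before Chromium does, a small guard in front of setExtraHTTPHeaders could help. This is only a sketch: assertNoProxyHeaders is a hypothetical helper of mine, not part of puppeteer, and it only knows about the one header I tripped over (Proxy-Authorization), not Chromium's full list of unsafe headers.

```javascript
// Hypothetical helper, not part of puppeteer: refuses header sets that carry
// proxy credentials, since that is what triggered net::ERR_INVALID_ARGUMENT here.
// Assumption: only Proxy-Authorization is checked, not every header Chromium rejects.
function assertNoProxyHeaders(headers) {
  const offending = Object.keys(headers).filter(
    name => name.toLowerCase() === 'proxy-authorization'
  )
  if (offending.length > 0) {
    throw new Error(
      `Do not set ${offending.join(', ')} via setExtraHTTPHeaders; use page.authenticate instead`
    )
  }
  return headers
}

// Usage (page is an existing puppeteer Page):
// await page.setExtraHTTPHeaders(assertNoProxyHeaders({ 'Accept-Language': 'en' }))
```

Header names are compared case-insensitively because HTTP header names are case-insensitive.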
documentation, documentation, documentation
Unfortunately, nothing on pptr.de/.
So I decided to look through the official puppeteer API.md and found the documentation for puppeteer.launch([options]).
Alright, that led me to the documentation of the supported Chromium flags.
Searching for proxy revealed this:
Chromium flags and puppeteer
So we can use --proxy-server to specify the proxy that the browser instance connects to.
In puppeteer this would look something like this:
...
const args = [
  '--no-sandbox',
  `--proxy-server=${PROXY_IP}`,
  ...
]
const puppeteerOptions = {
  args,
  ignoreHTTPSErrors: true,
  ...
}
const browser = await puppeteer.launch(puppeteerOptions)
const page = await browser.newPage()
That should be it if you just want to connect to a Proxy!
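If you build the flag in more than one place, it may be worth wrapping it in a tiny function. The following is a minimal sketch; buildProxyArgs is a hypothetical name, and the address is a placeholder from the documentation-only range 192.0.2.0/24.

```javascript
// Hypothetical helper: builds the args array passed to puppeteer.launch.
// host and port describe the proxy; extraArgs are any other Chromium flags.
function buildProxyArgs(host, port, extraArgs = []) {
  if (!host) throw new Error('proxy host is required')
  // The port is optional here so that a pre-joined "host:port" string also works.
  const server = port ? `${host}:${port}` : host
  return ['--no-sandbox', `--proxy-server=${server}`, ...extraArgs]
}

// Placeholder address, not a real proxy.
console.log(buildProxyArgs('192.0.2.10', 8080))
```

The result can be passed as the args option of puppeteer.launch, exactly like the hand-written array above.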
What about a Proxy with auth?
No problem.
You still specify the proxy IP address as above:
...
const args = [
  '--no-sandbox',
  `--proxy-server=${AUTHENTICATED_PROXY_IP}`,
  ...
]
Now you simply authenticate on the page instance using page.authenticate(options).
Note: page.authenticate is documented as being used for HTTP authentication.
The headers WWW-Authenticate and Authorization used in HTTP Basic authentication work essentially the same way for proxies; they're just named differently, namely Proxy-Authenticate and Proxy-Authorization.
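To make the parallel concrete, here is a small sketch that puts the two header pairs side by side and builds a Basic credentials value. AUTH_HEADERS and basicCredentials are illustrative names of mine, and user / pass are placeholder credentials. It only shows what the header value looks like; it is not a claim about how puppeteer implements page.authenticate internally.

```javascript
// The two challenge/response header pairs side by side.
// 401 comes from the origin server, 407 from the proxy.
const AUTH_HEADERS = {
  origin: { status: 401, challenge: 'WWW-Authenticate', response: 'Authorization' },
  proxy: { status: 407, challenge: 'Proxy-Authenticate', response: 'Proxy-Authorization' }
}

// Builds the value of a Basic credentials header: "Basic " + base64("username:password").
function basicCredentials(username, password) {
  const token = Buffer.from(`${username}:${password}`, 'utf8').toString('base64')
  return `Basic ${token}`
}

console.log(AUTH_HEADERS.proxy.response, basicCredentials('user', 'pass'))
// prints: Proxy-Authorization Basic dXNlcjpwYXNz
```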
To use page.authenticate and supply proxy authorization credentials, you would do something along these lines:
await page.authenticate({ username, password })
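If some of your proxies need credentials and others don't, the call can be made conditional. A minimal sketch, assuming browser is a puppeteer Browser that was launched with --proxy-server; newProxyPage and the shape of the credentials object are my own naming, not part of puppeteer.

```javascript
// Hypothetical helper: opens a page and supplies credentials only when given.
// Assumption: `browser` was launched with the --proxy-server flag.
async function newProxyPage(browser, credentials) {
  const page = await browser.newPage()
  if (credentials && credentials.username) {
    // Call this before the first navigation so the credentials are already
    // registered when the proxy challenges the request.
    await page.authenticate({
      username: credentials.username,
      password: credentials.password
    })
  }
  return page
}

// Usage:
// const page = await newProxyPage(browser, { username, password })
// await page.goto('https://ipinfo.io/json')
```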
a full example
Here you go:
- a browser instance connected to a proxy
- a browser page (that authenticated to the proxy with username and password if needed)
- using get-free-https-proxy
- using
- show the current IP to verify the proxy is effectively used
- using ipinfo.io
const puppeteer = require('puppeteer')
const getFreeProxies = require('get-free-https-proxy')
;(async () => {
  const [proxy1] = await getFreeProxies()
  console.log('using proxy', proxy1)
  const browser = await puppeteer.launch({
    args: [
      '--no-sandbox',
      `--proxy-server=${proxy1.host}:${proxy1.port}`
    ],
    headless: false,
    ignoreHTTPSErrors: true
  })
  const page = await browser.newPage()
  // if you're using an authenticated proxy
  // await page.authenticate({ username, password })
  await page.goto('https://ipinfo.io/json')
  const content = await page.content()
  // the JSON is wrapped in HTML, so cut out everything between the outer braces
  const serialized = content.substring(content.indexOf('{'), content.lastIndexOf('}') + 1)
  console.log(JSON.parse(serialized))
  // keep the window open for 5 seconds; page.waitFor was deprecated and later
  // removed from puppeteer, so a plain timer is used instead
  await new Promise(resolve => setTimeout(resolve, 5000))
  await page.close()
  await browser.close()
  process.exit(0)
})()
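Instead of eyeballing the printed IP, the check can be done in code. The following is a rough sketch: extractJson and isProxyInUse are hypothetical helpers, the sample HTML is made up, and I'm assuming the response has an ip field like the ipinfo.io JSON does. Note that some proxies exit through a different address than the one you connect to, in which case comparing against your own IP is the better test.

```javascript
// Hypothetical helpers for checking the result of the example above.

// Pulls the outermost {...} out of the page HTML and parses it.
function extractJson(html) {
  const start = html.indexOf('{')
  const end = html.lastIndexOf('}')
  if (start === -1 || end <= start) {
    throw new Error('no JSON object found in page content')
  }
  return JSON.parse(html.substring(start, end + 1))
}

// True when the IP reported by the service is the proxy's address.
// Assumption: the response carries the caller's address in an `ip` field.
function isProxyInUse(ipInfo, proxyHost) {
  return ipInfo.ip === proxyHost
}

// Made-up sample of what page.content() might return; the address is a placeholder.
const sample = '<html><body><pre>{"ip":"192.0.2.10","city":"Example"}</pre></body></html>'
console.log(isProxyInUse(extractJson(sample), '192.0.2.10')) // prints: true
```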