Proxy authentication with HTTPConnection and HTTPSConnection

Having recently had to modify Selenium Wire to properly support proxy server authentication with Python’s http.client.HTTPConnection and http.client.HTTPSConnection, I thought I’d summarise how I did that here, in case it’s useful to anybody in future.

These examples assume you’re working with a proxy server that’s using Basic Authentication. This authentication method works by sending a base64 encoded username/password string as a request header in the format:

Proxy-Authorization: Basic <base64-encoded username:password>

Creating the Proxy-Authorization header

The username and password credentials should be separated by a colon and then the whole string base64 encoded. Python’s base64 module makes that fairly trivial:

import base64

username = 'myusername'
password = 'mypassword'

cred = '{}:{}'.format(username, password)
cred = base64.b64encode(cred.encode('utf-8')).decode('utf-8')

Note that base64.b64encode() takes a byte string, which means the string containing the credentials must first be turned into bytes by calling .encode('utf-8'). Similarly, .decode('utf-8') is called on the base64-encoded result to turn it back into a string.

The Proxy-Authorization header can then be created and held within a dictionary:

headers = {
    'Proxy-Authorization': 'Basic {}'.format(cred)
}
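The two steps above can be wrapped in a small helper. This is only a sketch: the function name basic_proxy_auth_header is my own invention for illustration (it isn't part of http.client), and the sample credentials are the ones used in RFC 7617.

```python
import base64


def basic_proxy_auth_header(username, password):
    """Return a headers dict carrying Basic proxy credentials.

    Hypothetical helper for illustration; not part of the standard library.
    """
    cred = '{}:{}'.format(username, password)
    # b64encode() works on bytes, so encode first and decode the result back to str.
    token = base64.b64encode(cred.encode('utf-8')).decode('ascii')
    return {'Proxy-Authorization': 'Basic {}'.format(token)}


example_headers = basic_proxy_auth_header('Aladdin', 'open sesame')
print(example_headers)
# {'Proxy-Authorization': 'Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=='}
```

Because the colon is the separator, a username that itself contains a colon can't be represented in this scheme.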

Connecting to the proxy server

Whether you’re connecting to a site over HTTP or HTTPS, you point the connection at the proxy server in the same way: by passing the proxy’s hostname and port to the HTTPConnection or HTTPSConnection class when you instantiate it. No network connection is opened at this point; that happens when the first request is sent. For example:

from http.client import HTTPConnection, HTTPSConnection

http = HTTPConnection('proxy1:8080')
https = HTTPSConnection('proxy1:8080')

Sending the credentials

This is where things start to differ slightly.

HTTPConnection

For HTTPConnection, you pass the Proxy-Authorization header each time you call the request() method. This method accepts headers via an optional headers argument where you can supply the headers dictionary:

http.request('GET', 'http://www.example.com/', headers=headers)

One thing to note here is that the URL that you pass as the second argument must be an absolute URL containing the hostname of the remote site. If you pass a relative URL then the proxy server won’t know where to send your request on to. This is true whether your proxy server is using authentication or not.

The Proxy-Authorization header is a hop-by-hop header, which means the proxy should consume it rather than pass it on to the remote site.
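Putting the HTTP case together, a complete request through an authenticating proxy might look something like the sketch below. The function name get_via_proxy and its parameters are hypothetical, chosen for illustration, and the proxy address and credentials in the usage line are placeholders you would swap for your own.

```python
import base64
from http.client import HTTPConnection


def get_via_proxy(proxy_host, proxy_port, url, username, password):
    """GET an absolute http:// URL through a proxy that uses Basic authentication.

    Hypothetical helper for illustration; returns (status, body).
    """
    cred = '{}:{}'.format(username, password)
    cred = base64.b64encode(cred.encode('utf-8')).decode('ascii')
    headers = {'Proxy-Authorization': 'Basic {}'.format(cred)}

    # Connect to the proxy, not to the remote site.
    conn = HTTPConnection(proxy_host, proxy_port, timeout=10)
    try:
        # The absolute URL tells the proxy where to forward the request.
        # The header has to accompany every request made this way.
        conn.request('GET', url, headers=headers)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

Usage would be along the lines of get_via_proxy('proxy1', 8080, 'http://www.example.com/', 'myusername', 'mypassword'). A proxy that rejects the credentials normally answers with status 407 (Proxy Authentication Required).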

HTTPSConnection

HTTPSConnection works slightly differently. After creating the connection object for the proxy server, and before sending any request, you need to tell it to establish a tunnel to the remote site, via the set_tunnel() method.

Setting up a tunnel is necessary with HTTPS even if you’re not using proxy authentication. But when you are, this is the point at which the Proxy-Authorization header should be used.

set_tunnel() takes an optional headers argument, in a similar way to request(), where you can supply the headers dictionary:

https.set_tunnel('www.example.com', headers=headers)

Once the tunnel has been established, subsequent calls to https.request() can be made without sending the Proxy-Authorization header. In fact, it’s important that you don’t supply the header to the request() method, otherwise you may inadvertently expose the credentials to the remote site.
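For the HTTPS case, the whole sequence might be sketched as follows. Again, the function name get_via_https_proxy and its parameters are hypothetical, and port 443 for the remote site is an assumption made for the example.

```python
import base64
from http.client import HTTPSConnection


def get_via_https_proxy(proxy_host, proxy_port, host, path, username, password):
    """GET https://<host><path> through a proxy that uses Basic authentication.

    Hypothetical helper for illustration; returns (status, body).
    """
    cred = '{}:{}'.format(username, password)
    cred = base64.b64encode(cred.encode('utf-8')).decode('ascii')
    tunnel_headers = {'Proxy-Authorization': 'Basic {}'.format(cred)}

    # Connect to the proxy, not to the remote site.
    conn = HTTPSConnection(proxy_host, proxy_port, timeout=10)
    try:
        # The credentials travel with the CONNECT request that opens the tunnel.
        # set_tunnel() has to be called before the first request is sent.
        conn.set_tunnel(host, 443, headers=tunnel_headers)
        # Inside the tunnel we are talking to the remote site, so the path is
        # relative and the Proxy-Authorization header is deliberately left out.
        conn.request('GET', path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

If the proxy refuses the CONNECT request, for example because the credentials are wrong, http.client raises an OSError reporting that the tunnel connection failed, so callers may want to catch that.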
