Making Pyodide more powerful using a CORS proxy

Browsers are strict about Cross-Origin Resource Sharing (CORS), to protect users from leaking credentials to untrusted domains. This can be a hurdle when you use Pyodide (Python in the browser), because its HTTP requests go through the browser and are subject to the same rules.

Previously, I shimmed Python requests to be usable from Pyodide, but I will not use this shim here. It has since been superseded by a better way to patch this into the requests and aiohttp libraries. To illustrate the CORS problem in isolation, I’m simply going to use JavaScript’s fetch in this post.

First, I made a few modifications to the httpbin project, so we can simulate responses with various CORS headers. If we try to get a file from a host that has a CORS policy with Access-Control-Allow-Origin set to http://example.com

fetch('https://httpbin.example.com/response-headers?Access-Control-Allow-Origin=http://example.com')

we might get the following error:

Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://httpbin.example.com/response-headers?Access-Control-Allow-Origin=http://example.com. (Reason: CORS header ‘Access-Control-Allow-Origin’ does not match ‘http://example.com’).

And/or this one, if a Content Security Policy is in place:

Content Security Policy: The page’s settings observed the loading of a resource at https://httpbin.example.com/response-headers (“default-src”). A CSP report is being sent.

Or the following error if the header is missing completely:

Access to fetch at 'https://httpbin.example.com/no-cors-headers' from origin 'https://notebook.example.com' has been blocked by CORS policy: No ‘Access-Control-Allow-Origin’ header is present on the requested resource. If an opaque response serves your needs, set the request’s mode to ‘no-cors’ to fetch the resource with CORS disabled.
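The check behind the first and last of these errors can be approximated in a few lines. This is only a rough sketch of what the browser does, assuming a request without credentials; the function name corsAllowsRead is made up for illustration, and real browsers also deal with preflight requests and credentials.

```javascript
// Simplified model of the browser's Access-Control-Allow-Origin check.
// corsAllowsRead is a hypothetical name, not a browser API.
function corsAllowsRead(allowOriginHeader, pageOrigin) {
  // Header missing entirely: the "No 'Access-Control-Allow-Origin' header" case.
  if (allowOriginHeader === null || allowOriginHeader === undefined) {
    return false;
  }
  // A wildcard permits any origin (for requests without credentials).
  if (allowOriginHeader === '*') {
    return true;
  }
  // Otherwise the header must match the requesting page's origin exactly.
  return allowOriginHeader === pageOrigin;
}

// Header set to a different origin than the page: blocked.
console.log(corsAllowsRead('http://example.com', 'https://notebook.example.com'));
// Header missing: blocked.
console.log(corsAllowsRead(null, 'https://notebook.example.com'));
// Wildcard: allowed.
console.log(corsAllowsRead('*', 'https://notebook.example.com'));
```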

So a lot of stars have to align for our request to work. The same request would be trivial with an HTTP client that is not a browser. CORS policies are enforced only by browsers, as mentioned in MDN’s documentation on CORS:

The Cross-Origin Resource Sharing standard works by adding new HTTP headers that let servers describe which origins are permitted to read that information from a web browser.

So it’s the server that determines which origins can read which content. Having the server change its headers is not feasible in most cases, so let’s introduce CORS proxies. A CORS proxy is a server in the middle that simply strips or replaces the headers that are causing trouble.
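To make "strips or replaces" concrete, here is a minimal sketch of the header rewriting such a proxy could do on the response. The helper rewriteCorsHeaders is hypothetical and works on a plain object; it is not how any particular proxy is implemented.

```javascript
// Hypothetical helper: takes the upstream response headers as a plain object
// and returns the headers a CORS proxy might send to the browser instead.
function rewriteCorsHeaders(upstreamHeaders) {
  const result = {};
  for (const [name, value] of Object.entries(upstreamHeaders)) {
    // Drop whatever CORS policy the upstream server declared.
    if (!name.toLowerCase().startsWith('access-control-')) {
      result[name] = value;
    }
  }
  // Replace it with a policy that lets any origin read the response.
  result['Access-Control-Allow-Origin'] = '*';
  return result;
}

const rewritten = rewriteCorsHeaders({
  'Content-Type': 'application/json',
  'Access-Control-Allow-Origin': 'http://example.com',
});
console.log(rewritten);
```

All other headers pass through untouched, so the browser still sees the original content type and body.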

A minimal example to set this up, using the cors-anywhere project:

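/* corsserver.js */
const corsProxy = require('cors-anywhere');

corsProxy.createServer({
    originWhitelist: [],  // empty list: allow requests from any origin
    requireHeader: [],    // do not require any particular request header
    removeHeaders: []     // do not strip any headers from the proxied request
}).listen(8080, '0.0.0.0', function() {
    console.log('Running CORS Anywhere');
});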

Start it with:

npm install cors-anywhere  # install the dependency to your project
node corsserver.js  # run the server

Typical HTTP proxies work with the CONNECT method. This one doesn’t: you append the URL you want to fetch to the URL of the CORS proxy. There’s a good reason for this scheme: browsers don’t expose the HTTP CONNECT method typically used for proxies.

Using our proxy, the modified request looks something like this:

fetch('https://corsproxy.example.com/https://httpbin.example.com/response-headers?Access-Control-Allow-Origin=http://example.com')

Note that our response is now available, and no errors appear in the browser’s console. The restrictive Access-Control-* headers have been stripped or replaced by the proxy. With some work this can be added to the requests shim as well.
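If you make many requests this way, the URL scheme can be wrapped in a small helper. This is just a sketch: viaCorsProxy and fetchViaProxy are names I made up, and the proxy address is the placeholder host used in this post.

```javascript
// Placeholder proxy address; replace with wherever your CORS proxy runs.
const CORS_PROXY = 'https://corsproxy.example.com/';

// Build the proxied URL by appending the target URL to the proxy URL.
function viaCorsProxy(targetUrl, proxyBase = CORS_PROXY) {
  const base = proxyBase.endsWith('/') ? proxyBase : proxyBase + '/';
  return base + targetUrl;
}

// Thin wrapper around fetch that routes the request through the proxy.
function fetchViaProxy(targetUrl, options) {
  return fetch(viaCorsProxy(targetUrl), options);
}

console.log(viaCorsProxy('https://httpbin.example.com/response-headers'));
```

For the locally running server from corsserver.js, the second argument would be something like 'http://localhost:8080'.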

If you actually put this into production, there are some things you might want to do:

TLDR: If you are deploying an application that uses Pyodide (a project like Starboard, Quadratic or Jupyter Lite), it might make sense to also deploy a CORS proxy.