Pyodide requests shim: Binary requests are working!

Update 2023-04-21: Everything here describes a hacky solution. The same has now been done properly by the pyodide_http project.

In the previous post I showed how I shimmed the Python module requests. Since then I have made it possible to process binary responses, using a slightly weird browser feature that probably only still exists for backward compatibility reasons.

Since the requests API is a simple blocking Python call, we can’t use asynchronous fetch calls. This means XMLHttpRequest is the only (built-in) option to perform our HTTP requests in JavaScript (from Python code). So the two challenges are that the requests need to be done with XMLHttpRequest, and they should be synchronous calls.
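
As a rough illustration of what such a blocking call looks like from the Python side, here is a minimal sketch. The names blocking_get and FakeXHR are made up for this example and are not part of the shim. FakeXHR only stands in for the browser object so the snippet also runs outside a browser; inside Pyodide you would pass XMLHttpRequest.new (after from js import XMLHttpRequest) instead.

```python
def blocking_get(url, xhr_factory):
    """Perform a synchronous GET and return (status, response_text).

    xhr_factory is any zero-argument callable that returns an object with
    the XMLHttpRequest interface (hypothetical helper, for illustration).
    """
    request = xhr_factory()
    # The third argument (async) set to False makes send() block until done.
    request.open("GET", url, False)
    request.send()
    return request.status, request.responseText


class FakeXHR:
    """Stand-in with the small part of the XMLHttpRequest interface used here."""

    def __init__(self):
        self.status = 0
        self.responseText = ""
        self.opened_with = None

    def open(self, method, url, is_async):
        self.opened_with = (method, url, is_async)

    def send(self):
        # A real synchronous XMLHttpRequest would block here until the
        # response has arrived; the stand-in just fills in canned values.
        self.status = 200
        self.responseText = "hello"


status, text = blocking_get("https://httpbin.org/get", FakeXHR)
print(status, text)
```

Passing the constructor in as an argument is only a choice made for this sketch, so that the blocking logic can be tried without a browser.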

Normally, if you want to do something with the raw bytes of an XMLHttpRequest, you would simply do:

request = new XMLHttpRequest();
request.responseType = "arraybuffer";
// or  .responseType = "blob";

However, if this responseType is combined with the async parameter set to false in the open call, you get the following error (and deprecations):

request = new XMLHttpRequest(); 
request.responseType = 'arraybuffer'; 
request.open("GET", "https://httpbin.org/get", false); 
request.send();
// Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end 
//     user’s experience. For more help http://xhr.spec.whatwg.org/
// Use of XMLHttpRequest’s responseType attribute is no longer supported in the synchronous mode in window context.
// Uncaught DOMException: XMLHttpRequest.open: synchronous XMLHttpRequests do not support timeout and responseType

The Mozilla docs provide helpful tricks for handling binary responses, back from when the responseTypes arraybuffer and blob simply didn’t exist yet. The trick is to override the MIME type, say that it is text, but that the character set is something user-defined: text/plain; charset=x-user-defined.

request.overrideMimeType("text/plain; charset=x-user-defined");
request.responseIsBinary = true;  // as a custom flag for the code that needs to process this

The request.response we get is a string in which every “character” stands for one byte of the original response. Bytes 0x00 to 0x7F come through unchanged, while bytes 0x80 to 0xFF are mapped into Unicode’s Private Use Area (U+F780 to U+F7FF). To get the original bytes back, we only need to keep the low byte of each character’s code point. Note that the following code block contains Python code made for Pyodide. The request object is still an XMLHttpRequest, but it’s accessed from the Python code:

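from io import BytesIO

def __init__(self, request):
    if request.responseIsBinary:
        # each character is a code point: bytes 0x00-0x7F map to themselves and
        # bytes 0x80-0xFF map to U+F780-U+F7FF, so masking with 0xff recovers the byte
        self.raw = BytesIO(bytes(ord(char) & 0xff for char in request.response))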
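
To check that the masking really gives back every byte, the decoding can be simulated in plain Python. This is only a sketch: decode_x_user_defined and recover_bytes are names invented for this example, and the first one imitates the mapping the browser applies (as far as I understand the x-user-defined decoder) rather than calling anything from the shim.

```python
def decode_x_user_defined(data: bytes) -> str:
    """Imitate how a browser decodes bytes labelled charset=x-user-defined."""
    return "".join(
        # ASCII bytes keep their value, the rest move to U+F780..U+F7FF
        chr(byte) if byte < 0x80 else chr(0xF780 + byte - 0x80)
        for byte in data
    )


def recover_bytes(text: str) -> bytes:
    """Keep only the low byte of every code point."""
    return bytes(ord(char) & 0xFF for char in text)


original = bytes(range(256))  # every possible byte value, once
as_text = decode_x_user_defined(original)
print(hex(ord(as_text[0x80])))  # code point that byte 0x80 was mapped to
print(recover_bytes(as_text) == original)
```

Because the Private Use Area range starts at U+F780, the low byte of each mapped code point equals the original byte, which is why a single & 0xff is enough.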

Even though this works right now, some concessions have been made to achieve the goal of performing HTTP requests from Pyodide. The worst concession is running on the main thread, with the potential of freezing browser windows. The future of this project is to write asynchronous Python code using aiohttp, and shim aiohttp to use the JavaScript fetch API.

To see all these things in action, check the current state of shimming requests on GitHub: bartbroere/requests#1

Update 2023-04-20: I’m no longer maintaining and hosting a custom Pyodide build to demonstrate it. The link to it has been removed.