Pyodide requests shim: Binary requests are working!
25 Apr 2022
Update 2023-04-21: Everything here describes a hacky solution. The same has now been done properly by the pyodide_http project.
In the previous post I showed how I shimmed the Python module requests.
In the meantime I have made processing binary responses possible, using a slightly weird browser feature that probably still exists for backward compatibility reasons.
Since the requests API is a simple blocking Python call, we can’t use asynchronous fetch calls.
This means XMLHttpRequest is the only (built-in) option to perform our HTTP requests in JavaScript (from Python code).
So the two challenges are that the requests need to be done with XMLHttpRequest, and that they should be synchronous calls.
Normally, if you want to do something with the raw bytes of an XMLHttpRequest response, you would simply do:
request = new XMLHttpRequest();
request.responseType = "arraybuffer";
// or .responseType = "blob";
However, if this responseType is combined with the async parameter set to false in the open call, you get the following error (and deprecations):
request = new XMLHttpRequest();
request.responseType = 'arraybuffer';
request.open("GET", "https://httpbin.org/get", false);
request.send();
// Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end
// user’s experience. For more help http://xhr.spec.whatwg.org/
// Use of XMLHttpRequest’s responseType attribute is no longer supported in the synchronous mode in window context.
// Uncaught DOMException: XMLHttpRequest.open: synchronous XMLHttpRequests do not support timeout and responseType
The Mozilla docs provide helpful tricks for handling binary responses, dating back to when the responseType values arraybuffer and blob simply didn’t exist yet.
The trick is to override the MIME type, say that it is text, but that the character set is something user-defined: text/plain; charset=x-user-defined.
request.overrideMimeType("text/plain; charset=x-user-defined");
request.responseIsBinary = true; // as a custom flag for the code that needs to process this
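Driven from Python in Pyodide, the same steps might look roughly like the sketch below. The function name send_sync_binary_request and the xhr_factory parameter are hypothetical, added here only so the flow can be tried outside a browser. By default the sketch assumes it runs inside Pyodide, where XMLHttpRequest can be imported from the js module.

```python
def send_sync_binary_request(url, xhr_factory=None):
    """Send a blocking GET and return the finished request object.

    `xhr_factory` is a hypothetical hook for testing outside the browser;
    when omitted, the browser's XMLHttpRequest is used through Pyodide.
    """
    if xhr_factory is None:
        from js import XMLHttpRequest  # only importable inside Pyodide
        xhr_factory = XMLHttpRequest.new

    request = xhr_factory()
    # The third argument (async) is False, so send() blocks until done.
    request.open("GET", url, False)
    # Label the body as text in a user-defined charset so that the browser
    # passes the bytes through instead of decoding them as UTF-8.
    request.overrideMimeType("text/plain; charset=x-user-defined")
    # Custom flag for the code that later has to process the response.
    request.responseIsBinary = True
    request.send()
    return request
```

No responseType is set here, which is what keeps the synchronous open call from raising the error shown earlier.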
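The request.response we get is a string in which every original byte has become one “character”. Bytes below 0x80 are unchanged, while bytes 0x80 to 0xFF end up in Unicode’s Private Use Area (U+F780 to U+F7FF). Keeping only the low byte of each character gives the original bytes back.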
Note that the following code block contains Python code made for Pyodide. The request object is still an XMLHttpRequest, but it’s accessed from the Python code:
from io import BytesIO

def __init__(self, request):
    if request.responseIsBinary:
        # Mask each character down to its low byte to recover the original byte values
        self.raw = BytesIO(bytes(ord(byte) & 0xff for byte in request.response))
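To see why the masking works, the round trip can be simulated in plain Python without a browser. This is only a sketch: x_user_defined_decode and recover_bytes are hypothetical helper names, and the first one imitates the browser's x-user-defined decoding as I understand it (ASCII bytes unchanged, higher bytes shifted to U+F780 and up).

```python
def x_user_defined_decode(data: bytes) -> str:
    """Imitate how a browser decodes a body labelled charset=x-user-defined."""
    # ASCII bytes map to themselves; 0x80-0xFF land at U+F780-U+F7FF.
    return "".join(
        chr(b) if b < 0x80 else chr(0xF780 + b - 0x80) for b in data
    )


def recover_bytes(text: str) -> bytes:
    """Keep only the low byte of each character, as the __init__ above does."""
    return bytes(ord(char) & 0xFF for char in text)


# The PNG file signature contains a byte above 0x7F (0x89).
png_signature = b"\x89PNG\r\n\x1a\n"
as_text = x_user_defined_decode(png_signature)
round_tripped = recover_bytes(as_text)
```

After running this, round_tripped is equal to png_signature, and the same holds for any of the 256 possible byte values.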
Even though this works right now, some concessions have been made to achieve the goal of performing HTTP requests from Pyodide.
The worst concession is running on the main thread, with the potential of freezing browser windows.
The future of this project is to write asynchronous Python code using aiohttp, and shim aiohttp to use the JavaScript fetch API.
To see all these things in action, check the current state of shimming requests on GitHub: bartbroere/requests#1
Update 2023-04-20: I’m no longer maintaining and hosting a custom Pyodide build to demonstrate it. The link to it has been removed.