Exploring how to create binary wheels for Pythonista

This is a follow-up to the previous post on how to get pip working with Pythonista. We ended with a working pip but didn’t have a way of installing binary packages yet (like scipy and scikit-learn).

Using the Oracle Cloud, which offers (free!) aarch64 instances, I tried to build some Python wheels for my iPhone.

sudo apt install zlib1g-dev make libssl-dev curl
git clone https://github.com/deadsnakes/python3.6
cd python3.6
./configure
make
curl -L https://bootstrap.pypa.io/pip/3.6/get-pip.py > ./get-pip.py
./python get-pip.py
./python -m pip wheel scikit-learn

After uploading it to my PyPI repository we can try to install it using pip.

Pip installing a custom wheel built on the Oracle Cloud

Unfortunately, this wheel is still not in the expected format.

Not a supported wheel on this platform

This is related to how the wheel file format is specified in PEP 427. The short summary is that the platform tag can be seen in the filename: {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl.
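To make the filename convention concrete, here is a minimal sketch that splits a wheel filename into those components. The names WheelName and parse_wheel_filename are hypothetical helpers for illustration, not part of pip, and the sketch assumes a well-formed filename in which the components themselves contain no hyphens.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into the components named in PEP 427."""
    if not filename.endswith('.whl'):
        raise ValueError(f'not a wheel filename: {filename!r}')
    parts = filename[:-len('.whl')].split('-')
    if len(parts) == 5:
        # No optional build tag present.
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f'unexpected number of components in {filename!r}')
    return WheelName(distribution, version, build,
                     python_tag, abi_tag, platform_tag)


wheel = parse_wheel_filename('scipy-1.5.4-cp36-cp36m-linux_aarch64.whl')
print(wheel.python_tag, wheel.abi_tag, wheel.platform_tag)
```

The last three fields are the ones pip compares against the list of tags the running interpreter says it supports.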

To find out what platform tags Pythonista asks for on iPhones, I added a debug line in pip/_internal/index/package_finder.py, and ran pip again.

Debug line in package_finder.py

This resulted in the following list of platform tags:

Cleaner output of logging

The Python wheels will need to comply with this expected platform tag. Instead of cp36-cp36m-manylinux2014_aarch64 it looks like it needs to say cp36-cp36-darwin_21_5_0_iphone12,8.

Update: Right after finishing this post I noticed that the platform tag changed to cp36-cp36-macosx_15_0_iphone12,8 (instead of the darwin tag above). This might have been caused by an iOS update.

After running auditwheel repair on the created wheel, all relevant system libraries are copied into the wheel.

Now I run the following script to modify the wheel to use the expected platform tag:

import zipfile


with zipfile.ZipFile('scipy-1.5.4-cp36-cp36m-linux_aarch64.whl', 'r') as input_wheel:
    with zipfile.ZipFile('scipy-1.5.4-cp36-cp36-macosx_15_0_iphone12,8.whl', 'w',
                         compression=zipfile.ZIP_DEFLATED) as output_wheel:
        for input_zipinfo in input_wheel.infolist():
            if input_zipinfo.filename.endswith('.dist-info/WHEEL'):
                output_wheel.writestr(
                    input_zipinfo.filename,
                    input_wheel.read(input_zipinfo.filename).replace(
                        b'cp36-cp36m-linux_aarch64',
                        b'cp36-cp36-macosx_15_0_iphone12,8')
                )
            elif input_zipinfo.filename.endswith('.dist-info/RECORD'):
                output_wheel.writestr(
                    input_zipinfo.filename,
                    input_wheel.read(input_zipinfo.filename).replace(
                        b'.cpython-36m-aarch64-linux-gnu',
                        b'')
                )
            else:
                output_wheel.writestr(
                    input_zipinfo.filename.replace('.cpython-36m-aarch64-linux-gnu', ''),
                    input_wheel.read(input_zipinfo.filename)
                )

Now it is recognized by pip as suitable for the platform and installs without issue.

Pip installing scipy wheel
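One detail the script above does not handle: the RECORD file lists a hash and size for every file in the wheel, and the entry for the rewritten WHEEL file is left stale. That did not stop the install here, but a stricter tool might object. As a hedged sketch, this is how a RECORD line can be recomputed; record_entry is a hypothetical helper and is not part of the script above.

```python
import base64
import hashlib


def record_entry(path: str, content: bytes) -> str:
    """Build one RECORD line: path, urlsafe base64 sha256 digest, size."""
    digest = hashlib.sha256(content).digest()
    # RECORD uses urlsafe base64 with the trailing '=' padding removed.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b'=').decode('ascii')
    return f'{path},sha256={encoded},{len(content)}'


print(record_entry('scipy-1.5.4.dist-info/WHEEL', b'Wheel-Version: 1.0\n'))
```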

When trying to use the newly installed scipy however, it still can’t find the correct shared objects.

import scipy No binary module

If we try to directly import this shared object using ctypes, we can see better why it will not work:

Shared libraries on iOS need to be in the Mach-O format, but a wheel built on Linux contains ELF shared objects.
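The mismatch can also be checked without loading the library, by looking at the first four bytes of the file. The sketch below is a rough check, not a full parser: binary_format is a hypothetical helper, and the magic number 0xcafebabe is shared by universal Mach-O files and Java class files, so a match is only a hint.

```python
import struct

ELF_MAGIC = b'\x7fELF'
# Mach-O magic numbers: thin 32-bit, thin 64-bit, and fat (universal) files.
MACHO_MAGICS = {0xfeedface, 0xfeedfacf, 0xcafebabe, 0xcafebabf}


def binary_format(header: bytes) -> str:
    """Guess the container format from the first four bytes of a file."""
    magic = header[:4]
    if magic == ELF_MAGIC:
        return 'ELF'
    if len(magic) == 4:
        # The Mach-O magic can be stored in either byte order.
        little, = struct.unpack('<I', magic)
        big, = struct.unpack('>I', magic)
        if little in MACHO_MAGICS or big in MACHO_MAGICS:
            return 'Mach-O'
    return 'unknown'


# Usage with a real file (the path is a placeholder):
# with open('some_extension.so', 'rb') as f:
#     print(binary_format(f.read(4)))
```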

But how does Pythonista include the non-standard libraries it ships with? To find out, I made a copy of the app itself. This was quite easy to do, since Pythonista ships with Python:

Dumping the app

Using this dump, I could determine that the extra packages like numpy and matplotlib all live in Frameworks/Py3Kit.framework/pylib/site-packages. However, all the shared objects that would normally live in this directory are missing.

If we decompile the app’s Py3Kit.framework executable, we can see that it actually contains these binary Python modules that were missing in site-packages. They are all added to the built-in Python modules, using the PyImport_AppendInittab function available in Python’s C API (it shows up with a leading underscore in the decompiled output).

void PYK3Interpreter::registerBuiltinModules(ID param_1,SEL param_2)

{
  _PyImport_AppendInittab("speech",_PyInit_speech);
  _PyImport_AppendInittab("reminders",_PyInit_reminders);
  _PyImport_AppendInittab("contacts",_PyInit_contacts);
  _PyImport_AppendInittab("sound",_PyInit_sound);
  _PyImport_AppendInittab("linguistictagger",_PyInit_linguistictagger);
  _PyImport_AppendInittab("_ui",_PyInit__ui);
  _PyImport_AppendInittab("_notification",_PyInit__notification);
  _PyImport_AppendInittab("_pythonista",_PyInit__pythonista);
  _PyImport_AppendInittab("_keyboard",_PyInit__keyboard);
  _PyImport_AppendInittab("_dialogs",_PyInit__dialogs);
  _PyImport_AppendInittab("_appex",_PyInit__appex);
  _PyImport_AppendInittab("_font_cache",_PyInit__font_cache);
  _PyImport_AppendInittab("_scene2",_PyInit__scene2);
  _PyImport_AppendInittab("console",_PyInit_console);
  _PyImport_AppendInittab("_clipboard",_PyInit__clipboard);
  _PyImport_AppendInittab("_photos",_PyInit__photos);
  _PyImport_AppendInittab("_photos2",_PyInit__photos2);
  _PyImport_AppendInittab("_webbrowser",_PyInit__webbrowser);
  _PyImport_AppendInittab("_twitter",_PyInit__twitter);
  _PyImport_AppendInittab("location",_PyInit_location);
  _PyImport_AppendInittab("_motion",_PyInit__motion);
  _PyImport_AppendInittab("keychain",_PyInit_keychain);
  _PyImport_AppendInittab("_cb",_PyInit__cb);
  _PyImport_AppendInittab("_canvas",_PyInit__canvas);
  _PyImport_AppendInittab("_imaging",_PyInit__imaging);
  _PyImport_AppendInittab("_imagingft",_PyInit__imagingft);
  _PyImport_AppendInittab("_imagingmath",_PyInit__imagingmath);
  _PyImport_AppendInittab("_imagingmorph",_PyInit__imagingmorph);
  _PyImport_AppendInittab("_np_multiarray",_PyInit_multiarray);
  _PyImport_AppendInittab("_np_scalarmath",_PyInit_scalarmath);
  _PyImport_AppendInittab("_np_umath",_PyInit_umath);
  _PyImport_AppendInittab("_np_fftpack_lite",_PyInit_fftpack_lite);
  _PyImport_AppendInittab("_np__compiled_base",_PyInit__compiled_base);
  _PyImport_AppendInittab("_np__umath_linalg",_PyInit__umath_linalg);
  _PyImport_AppendInittab("_np_lapack_lite",_PyInit_lapack_lite);
  _PyImport_AppendInittab("_np_mtrand",&_PyInit_mtrand);
  _PyImport_AppendInittab("_np__capi",_PyInit__capi);
  _PyImport_AppendInittab("_mpl__backend_agg",_PyInit__backend_agg);
  _PyImport_AppendInittab("_mpl__image",_PyInit__image);
  _PyImport_AppendInittab("_mpl__path",_PyInit__path);
  _PyImport_AppendInittab("_mpl_ttconv",_PyInit_ttconv);
  _PyImport_AppendInittab("_mpl__cntr",_PyInit__cntr);
  _PyImport_AppendInittab("_mpl_ft2font",_PyInit_ft2font);
  _PyImport_AppendInittab("_mpl__png",_PyInit__png);
  _PyImport_AppendInittab("_mpl__delaunay",_PyInit__delaunay);
  _PyImport_AppendInittab("_mpl__qhull",_PyInit__qhull);
  _PyImport_AppendInittab("_mpl__tri",_PyInit__tri);
  _PyImport_AppendInittab("_counter",_PyInit__counter);
  _PyImport_AppendInittab("_AES",_PyInit__AES);
  _PyImport_AppendInittab("_ARC2",_PyInit__ARC2);
  _PyImport_AppendInittab("_ARC4",_PyInit__ARC4);
  _PyImport_AppendInittab("_Blowfish",_PyInit__Blowfish);
  _PyImport_AppendInittab("_CAST",_PyInit__CAST);
  _PyImport_AppendInittab("_DES3",_PyInit__DES3);
  _PyImport_AppendInittab("_DES",_PyInit__DES);
  _PyImport_AppendInittab("_MD2",_PyInit__MD2);
  _PyImport_AppendInittab("_MD4",_PyInit__MD4);
  _PyImport_AppendInittab("_RIPEMD160",_PyInit__RIPEMD160);
  _PyImport_AppendInittab("_SHA224",_PyInit__SHA224);
  _PyImport_AppendInittab("_SHA256",_PyInit__SHA256);
  _PyImport_AppendInittab("_SHA512",_PyInit__SHA512);
  _PyImport_AppendInittab("_XOR",_PyInit__XOR);
  _PyImport_AppendInittab("strxor",_PyInit_strxor);
  _PyImport_AppendInittab("pykit_io",_PyInit_pykit_io);
  return;
}
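Modules registered this way should be visible from Python as built-ins, without any file on disk. A small sketch to check this from inside an interpreter follows. The helper names are hypothetical, and the names to look for in Pythonista (for example _np_multiarray) are taken from the decompiled list above, so they are an assumption about what you would see there.

```python
import importlib.util
import sys


def is_builtin(module_name: str) -> bool:
    """Return True if the module is compiled into the interpreter itself."""
    return module_name in sys.builtin_module_names


def describe(module_name: str) -> str:
    """Report where the import system would load a top-level module from."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return 'not found'
    return spec.origin or 'unknown'


print(is_builtin('sys'), describe('sys'))
print(is_builtin('json'), describe('json'))
```

In a regular CPython, sys is reported as built-in while json resolves to a file path.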

In the next post I’ll be looking into compiling the wheels with Mach-O shared libraries (or bundles as Apple calls them).

Adding pip to Pythonista for iOS

Pythonista is probably the most popular Python app for iOS. This post is a summary of the work I did to get pip working. Here’s how to do it:

Installing pip

import requests
import sys
from io import BytesIO
from zipfile import ZipFile

# Get the location of the Python 3 site-packages
site_packages = next(filter(
  lambda x: 'site-packages-3' in x,
  sys.path
))

# extract directly into site-packages
ZipFile(BytesIO(requests.get(
    'https://files.pythonhosted.org/packages/90/a9/1ea3a69a51dcc679724e3512fc2aa1668999eed59976f749134eb02229c8/pip-21.3-py3-none-any.whl'
).content)).extractall(site_packages)

print("Downloaded pip")

This downloads pip to the site-packages folder for Python 3. Pythonista calls this folder site-packages-3.

Now that we have pip set up, we can start downloading our first package:

Using pip from Pythonista

import pip
import sys

site_packages = next(filter(
  lambda x: 'site-packages-3' in x,
  sys.path
))

# pip.main takes a single list of arguments and returns pip's exit code
print(
  pip.main(
    (f'install '
     f'--target {site_packages} '
     f'tqdm').split(' ')
  )
)

This works a bit differently from how you would typically use pip. Since we use it as a library, we call the pip.main function with a list of arguments (created by .split(' ')).

The default directory pip tries is not writable. It’s part of the Pythonista app. We therefore manually indicate it should write to our site-packages-3 folder using --target. Note that this probably will not yet work for dependencies with binary extensions (libraries like scipy etc.).
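Splitting a command string on spaces breaks as soon as the target path itself contains a space. A slightly more robust sketch builds the argument list directly; pip_install_args is a hypothetical helper name, not something pip provides.

```python
from typing import List


def pip_install_args(target: str, *packages: str) -> List[str]:
    """Build the argument list for `pip install --target <target> <packages>`."""
    if not packages:
        raise ValueError('at least one package name is required')
    return ['install', '--target', target, *packages]


print(pip_install_args('/path/with a space/site-packages-3', 'tqdm'))

# In Pythonista this list would be passed on to pip:
# import pip
# pip.main(pip_install_args(site_packages, 'tqdm'))
```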

Of course, I also tried to use StaSh. This seemed quite suitable at first, but upon closer inspection, the pip it contains is not the standard one. Instead, it ships its own pip.py, which approximates the canonical pip’s behaviour.

In the next post I’ll explore how to use pip to get binary wheels to install on your iDevice. This will involve building wheels specific for iOS and maybe even setting up a PyPI mirror.

Pyodide requests shim: Binary requests are working!

Update 2023-04-21: Everything here describes a hacky solution. The same has now been done properly by the pyodide_http project.

In the previous post I showed how shimming the Python module requests was done. In the meantime I have made processing binary responses possible, using a slightly weird browser feature that probably still exists for backward compatibility reasons.

Since the requests API is a simple blocking Python call, we can’t use asynchronous fetch calls. This means XMLHttpRequest is the only (built-in) option to perform our HTTP requests in JavaScript (from Python code). So the two challenges are that the requests need to be done with XMLHttpRequest, and they should be synchronous calls.

Normally, if you want to do something with the raw bytes of an XMLHttpRequest, you would simply do:

request = new XMLHttpRequest();
request.responseType = "arraybuffer";
// or  .responseType = "blob";

However, if this responseType is combined with the async parameter set to false in the open call, you get the following error (and deprecations):

request = new XMLHttpRequest(); 
request.responseType = 'arraybuffer'; 
request.open("GET", "https://httpbin.org/get", false); 
request.send();
// Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end 
//     user’s experience. For more help http://xhr.spec.whatwg.org/
// Use of XMLHttpRequest’s responseType attribute is no longer supported in the synchronous mode in window context.
// Uncaught DOMException: XMLHttpRequest.open: synchronous XMLHttpRequests do not support timeout and responseType

The Mozilla docs provide helpful tricks for handling binary responses, back from when the responseTypes arraybuffer and blob simply didn’t exist yet. The trick is to override the MIME type, say that it is text, but that the character set is something user-defined: text/plain; charset=x-user-defined.

request.overrideMimeType("text/plain; charset=x-user-defined");
request.responseIsBinary = true;  // as a custom flag for the code that needs to process this

The request.response we get is a string in which every original byte has become one “character”, some of which are within Unicode’s Private Use Area. To get the original bytes back, we keep only the low byte of each character’s code point. Note that the following code block contains Python code made for Pyodide. The request object is still an XMLHttpRequest, but it’s accessed from the Python code:

from io import BytesIO

# method of the response class in the shim
def __init__(self, request):
    if request.responseIsBinary:
        # bring everything outside the range of a single byte within this range
        self.raw = BytesIO(bytes(ord(byte) & 0xff for byte in request.response))
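To see why the masking recovers the data, the round trip can be simulated outside the browser. The sketch below assumes the x-user-defined mapping from the WHATWG Encoding standard: bytes below 0x80 map to the same code point, and bytes 0x80 to 0xFF map to U+F780 to U+F7FF. Both function names are hypothetical.

```python
def decode_x_user_defined(data: bytes) -> str:
    """Mimic how a browser decodes bytes labelled charset=x-user-defined."""
    return ''.join(
        chr(byte) if byte < 0x80 else chr(0xF780 + byte - 0x80)
        for byte in data
    )


def recover_bytes(text: str) -> bytes:
    """Keep only the low byte of every code point, as the shim does."""
    return bytes(ord(char) & 0xff for char in text)


original = bytes(range(256))
as_text = decode_x_user_defined(original)
assert recover_bytes(as_text) == original
print(hex(ord(as_text[0xff])))
```

Since 0xF780 has 0x80 as its low byte, masking with 0xff maps each Private Use Area character straight back to the byte it came from.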

Even though this works right now, some concessions have been made to achieve the goal of performing HTTP requests from Pyodide. The worst concession is running on the main thread, with the potential of freezing browser windows. The future of this project is to write asynchronous Python code using aiohttp, and shim aiohttp to use the JavaScript fetch API.

To see all these things in action, check the current state of shimming requests on GitHub: bartbroere/requests#1

Update 2023-04-20: I’m no longer maintaining and hosting a custom Pyodide build to demonstrate it. The link to it has been removed.

Making the Python requests module work in Pyodide

Update 2023-04-21: Everything here describes a hacky solution. The same has now been done properly by the pyodide_http project.

The Pyodide project compiles the CPython interpreter and a collection of popular libraries for the browser. This is done using emscripten and results in JavaScript and WebAssembly packages. This means you get an almost complete Python distribution that can run completely in the browser.

Pure Python packages are also pip-installable in Pyodide, but they might not be usable if they depend, directly or indirectly, on a standard library module that Pyodide does not support. As a result, a lot of network-related functionality is unavailable: anything that depends on importing socket or the http library will not work. Because of this, you can’t use the popular requests library yet. The project’s roadmap has plans for adding this networking support, but this might not be ready soon.

Therefore I created an alternative requests module specifically for Pyodide, which bridges the requests API and makes JavaScript XMLHttpRequests. I’m currently developing it in a fork at bartbroere/requests#1. Helping hands are always welcome!

Since most browsers have strong opinions on what a request should look like in terms of included headers and cookies, this new version of requests will not always do what the normal requests does. This can be a feature instead of a bug. For example, if the browser already has an authenticated session to an API, you could automatically send authenticated requests from your Python code.

This is the end result, combined with some slightly dirty hacks (in python3.9.js) to make the script MIME type text/x-python evaluate automatically:

<script src="https://pypi.bartbroe.re/python3.9.js"></script>
<script type="text/x-python">
    from pprint import pprint
    import requests
    pprint(requests.get('https://httpbin.org/get', params={'key': 'value'}).json())
    pprint(requests.post('https://httpbin.org/post', data={'key': 'value'}).json())
</script>

Hopefully, my requests module will not have a long life, because the Pyodide project has plans to make a more sustainable solution. Until then, it might be a cool hack to support the up to 34K libraries that depend on requests in the Pyodide interpreter.

A new 2to3 fixer for unicode dundermethods

In Python 2, Django prefers the __unicode__ method of a class to obtain the human-readable strings it shows in its interfaces. In Python 3, however, it defaults to the __str__ method. Porting guides and utilities specific to Django used to solve this by suggesting a __str__ method, combined with the python_2_unicode_compatible decorator on the class. This was a nice enough solution for a long time, for code bases migrating from Python 2 to Python 3 or wanting to support both at the same time.

However, with the official end of life of Python 2 on January 1st 2020, adding this decorator started making less sense to me. Django projects should now only support Python 3 runtimes. As an additional porting utility, I created a fixer for 2to3 that renames all __unicode__ dundermethods to __str__, where possible.
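To get a feel for which classes such a rename would touch, here is a rough sketch using the ast module. It is not the fixer itself and does not use lib2to3. It reads “where possible” as “the class does not already define its own __str__”, which is my assumption, and it only works on source that already parses as Python 3. The function name is hypothetical.

```python
import ast
from typing import List


def classes_needing_rename(source: str) -> List[str]:
    """List classes that define __unicode__ but no __str__ of their own."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            methods = {
                item.name for item in node.body
                if isinstance(item, ast.FunctionDef)
            }
            if '__unicode__' in methods and '__str__' not in methods:
                names.append(node.name)
    return names


sample = '''
class Author:
    def __unicode__(self):
        return self.name

class Book:
    def __unicode__(self):
        return self.title

    def __str__(self):
        return self.title
'''
print(classes_needing_rename(sample))
```

Here only Author is reported, because Book already has a __str__ that a rename would clash with.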

The current status of the fixer util is that I have created a pull request on the 2to3 library (even though I’m not sure whether it will be accepted).

Update: lib2to3 is no longer maintained, so just get the fixer from the diff of the closed pull request if you want to use it.