Why pip and Homebrew make a dangerous cocktail
20 Jul 2023
pip install will happily replace anything in
With the source distribution, it’s simply running the
setup.py which often contains a call to
You could argue that pip installing source distributions is RCE by design.
Binary distributions (often wheels) are not intended to run code immediately during install.
They simply copy files using all kinds of logic defined in
This should make them less dangerous.
Homebrew is “The Missing Package Manager for macOS”, and probably the most popular way for macOS users to get Python:
Python 3.* was brew installed around 500,000 times in the last 30 days.
“Homebrew installs packages to their own directory and then symlinks their files into /opt/homebrew (on Apple Silicon).”
Note that this is /usr/local/ on Intel Macs.
“Homebrew won’t install files outside its prefix and you can place a Homebrew installation wherever you like.”
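Because the prefix differs per machine and can be relocated, it helps to look it up rather than hard-code it. The following is a minimal sketch, not an official Homebrew API: `default_homebrew_prefix` and `homebrew_prefix` are hypothetical helper names, and the fallback only encodes the two macOS defaults quoted above. It assumes `platform.machine()` reports `arm64` on Apple Silicon, which will not hold for a Python running under Rosetta.

```python
import platform
import shutil
import subprocess


def default_homebrew_prefix(machine: str) -> str:
    """Return the documented default macOS prefix for a CPU architecture."""
    # Assumption: "arm64" is what platform.machine() reports on Apple Silicon.
    return "/opt/homebrew" if machine == "arm64" else "/usr/local"


def homebrew_prefix() -> str:
    """Ask brew for its prefix, falling back to the documented default."""
    brew = shutil.which("brew")
    if brew is not None:
        result = subprocess.run(
            [brew, "--prefix"], capture_output=True, text=True, check=False
        )
        if result.returncode == 0 and result.stdout.strip():
            return result.stdout.strip()
    return default_homebrew_prefix(platform.machine())


if __name__ == "__main__":
    print(homebrew_prefix())
```

Asking `brew --prefix` first matters because a relocated installation would otherwise be missed by the architecture-based guess.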
In this post I’ll assume a Python 3.9 installation, performed with brew install python@3.9.
I’ll demonstrate that a malicious Python package can replace files in the Homebrew prefix directory, by default /usr/local for Intel Macs and /opt/homebrew for ARM Macs.
By defining the following setup.py we could even replace the python3.9 executable itself. The example replaces python3.9 with an executable that simply outputs Not Python to demonstrate the issue:
    from setuptools import setup

    setup(name='malware',
          version='3.2.1',
          description='malware',
          url='https://example.com',
          author='',
          author_email='[email protected]',
          # Every file in /usr/local can be poisoned by including data_files.
          # If they already existed, the executable flag is preserved
          # This is just one example of a file that can be replaced:
          data_files=[("Cellar/python@3.9/3.9.17_1/bin", ["python3.9"])],
          packages=[],
          install_requires=[])
In a video this looks like this:
If a data file has the same path as an existing file, and the existing file has executable bits set, they will remain set!
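To make the overwrite concrete, here is a minimal sketch that lists which files a wheel's data_files entries would land on. It is not pip's actual install logic. The helper names `data_file_targets` and `existing_collisions` are hypothetical, and the sketch assumes that entries under `.data/data/` are placed relative to the prefix you pass in; on a real system, `sysconfig.get_path("data")` reports the directory the interpreter uses for this.

```python
import os
import zipfile

# Wheels store data_files entries under "<name>-<version>.data/data/".
DATA_MARKER = ".data/data/"


def data_file_targets(wheel_path, prefix):
    """Map each data_files entry in a wheel to its assumed path under prefix."""
    targets = {}
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            if DATA_MARKER in name and not name.endswith("/"):
                relative = name.split(DATA_MARKER, 1)[1]
                targets[name] = os.path.join(prefix, relative)
    return targets


def existing_collisions(wheel_path, prefix):
    """Return (target, is_executable) for every target that already exists."""
    collisions = []
    for target in data_file_targets(wheel_path, prefix).values():
        if os.path.isfile(target):
            collisions.append((target, os.access(target, os.X_OK)))
    return sorted(collisions)
```

Any entry reported with `is_executable` set to `True` is a file that would keep its executable bits after being replaced, which is the dangerous case described above.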
Furthermore, real world attacks will be much more subtle than the one above.
An attacker could patch some malware into a dynamic library while preserving its original functionality.
lief will help you do that with ease.
Of course, before posting this here, I tried to find out how well known this issue is. After some discussion with the people running the security mailing lists at Python and the Python Packaging Authority, the conclusion is: There is no fix for this, at least not without breaking someone’s legitimate uses.
It is possible to think of some countermeasures to reduce the risk, however. One countermeasure could be screening packages for executables and libraries in places they don’t typically belong. This can be done in three steps:
- Download the package and its dependencies using pip download, without installing them. The --only-binary=:all: flag is important because pip download will otherwise run a source distribution to find out its dependencies.
    mkdir /tmp/scan
    python3 -m pip download --only-binary=:all: -d /tmp/scan package_name
- Apply the following script to detect any executables or libraries in places they don’t belong (using
    import glob
    import zipfile

    import magic  # brew install libmagic && python3.9 -m pip install python-magic
    from tqdm import tqdm  # python3.9 -m pip install tqdm

    for wheel_path in tqdm(glob.glob("/tmp/scan/*.whl")):
        with zipfile.ZipFile(wheel_path) as wheel:
            for file in wheel.filelist:
                # this is an indication that data was added using data_files:
                if '.data/data/' in file.filename:
                    # let libmagic find out what it is:
                    magic_guess = magic.from_buffer(wheel.read(file.filename))
                    # change this when not on macOS to something relevant to your platform:
                    if 'Mach-O' in magic_guess:
                        print(wheel.filename, file.filename, magic_guess)
Note that in this example I only check for Mach-O libraries and executables. This reduces false positives. For example: If a package author sets include_package_data=True there will be lots of Python files in the data directory as well. This scanning also does not check for shell scripts or other types of executables. To cast a wider net, check for the words 'library' in the output of
- Inspect the output of step 2, and decide if you still want to run pip install on the same target and its transitive dependencies.
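If installing libmagic is not an option, the scan in step 2 can be approximated with the standard library alone. The sketch below is a swapped-in technique rather than the script used in this post: it compares the first four bytes of each data file against the well-known Mach-O magic numbers and also flags anything starting with a shebang, which covers the shell-script gap mentioned above. The names `classify` and `scan_wheel` are hypothetical, and the result is far coarser than what libmagic reports.

```python
import glob
import zipfile

# Leading bytes of Mach-O files: 32-bit and 64-bit in both byte orders,
# plus universal ("fat") binaries. 0xCAFEBABE is also the magic number of
# Java class files, so expect some false positives there.
MACHO_MAGICS = (
    b"\xfe\xed\xfa\xce",
    b"\xce\xfa\xed\xfe",
    b"\xfe\xed\xfa\xcf",
    b"\xcf\xfa\xed\xfe",
    b"\xca\xfe\xba\xbe",
)


def classify(header):
    """Return a coarse label for a file header, or None if it looks harmless."""
    if header.startswith(MACHO_MAGICS):
        return "Mach-O"
    if header.startswith(b"#!"):
        return "script"
    return None


def scan_wheel(wheel_path):
    """List (filename, label) for suspicious data_files entries in one wheel."""
    findings = []
    with zipfile.ZipFile(wheel_path) as wheel:
        for info in wheel.infolist():
            if ".data/data/" not in info.filename or info.is_dir():
                continue
            with wheel.open(info) as member:
                kind = classify(member.read(4))
            if kind is not None:
                findings.append((info.filename, kind))
    return findings


if __name__ == "__main__":
    for path in sorted(glob.glob("/tmp/scan/*.whl")):
        for filename, kind in scan_wheel(path):
            print(path, filename, kind)
```

Reading only four bytes per entry keeps the scan fast on large wheels, at the cost of missing anything that hides behind an unrecognized header.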
Of course, it’s best to combine this scan-before-you-install with some long-standing best practices:
Firstly, avoid installing pip source distributions when possible, using the
Secondly, follow the recommendations here, and configure Homebrew to use a directory not on the
This means disregarding Homebrew’s own documentation, warning that it might be inconvenient not to have this set to
To summarize: installing things always introduces an inherent risk, but the way Homebrew’s Python is configured might pose an additional risk.