Install a Python package on Debian/Devuan when apt has no package for it

I want to install the Python package pandasql system-wide, on a Devuan (or Debian) system. It’s in the Python Package Index, but there doesn’t seem to be a Debian (dpkg) package for it; let’s assume there actually isn’t one.

Now, if I try to pip install pandasql, I get a message suggesting I use a virtual environment:

× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
    python3-xyz, where xyz is the package you are trying to
    install.
    
    If you wish to install a non-Debian-packaged Python package,
    create a virtual environment using python3 -m venv path/to/venv.
    Then use path/to/venv/bin/python and path/to/venv/bin/pip. Make
    sure you have python3-full installed.
    
    If you wish to install a non-Debian packaged Python application,
    it may be easiest to use pipx install xyz, which will manage a
    virtual environment for you. Make sure you have pipx installed.
    
    See /usr/share/doc/python3.11/README.venv for more information.

… but a virtual environment is not what I want: I want to install pandasql system-wide. How can I do that?

Notes:

  • If possible, please answer more generally than just regarding pandasql.
  • Devuan Excalibur (~= Debian Trixie), Python 3.11.6, x86_64 machine
Asked By: einpoklum

||

There is a reason Debian won’t let you do that: system-wide pip installations are, by design, incompatible with system-managed Python installations. Pip itself acknowledged this when it demoted --system installations from "the default" to "the user needs to specify it and hopefully knows what they’re doing". Debian knows that you’ll break things that way, so it doesn’t ship a pip that will let you do that.

So much for the theory (more background) of why Debian prevents you from doing that.
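On current Debian releases the refusal shown in the question is driven by a marker file defined in PEP 668: a file named EXTERNALLY-MANAGED in the interpreter’s standard-library directory. A minimal sketch for checking whether your python3 carries that marker (the variable names are mine, purely for illustration):

```shell
# Ask the default python3 where its standard library lives.
stdlib_dir=$(python3 -c 'import sysconfig; print(sysconfig.get_path("stdlib"))')

# PEP 668 places the marker file directly in that directory.
marker="$stdlib_dir/EXTERNALLY-MANAGED"

if [ -f "$marker" ]; then
    echo "externally managed, marker found at: $marker"
else
    echo "no marker at: $marker"
fi
```

If the marker is present, pip refuses installs into that environment unless told otherwise; the text of the error message comes from that file.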

Now for the practice:

What can I do?

So, you need to do what pip install --system pandasql does, but in a Debian-compatible, safe way that also doesn’t break everything when any dependency is updated. This includes inferring the Debian package names of the dependencies from the pip package, and making sure that you’re not left with a dysfunctional package if the dependencies change.

To little surprise, the Debian way of installing software is Debian packages. They can be updated, they have a list of dependencies, dpkg makes sure they don’t overwrite each other’s files, and they can be cleanly uninstalled. All in all, from a whole-system perspective, they are nicer than pip packages. You want that!

There’s a little helper program that does the hard work for you: pypi2deb (sudo apt install pypi2deb). It’s not hard to use:

mkdir package_pandasql
cd package_pandasql
# Try an initial build
py2dsp --build pandasql

If that last step fails with error: Unmet build dependencies, the message tells you what you need to install to proceed. In my case:

sudo apt install python3-all python3-numpy python3-pandas python3-setuptools python3-sqlalchemy

Let the installation run through, then try a second time:

py2dsp --build pandasql

That worked!
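If the first build reports many unmet build dependencies, the package list can be pulled out of the error line mechanically instead of being retyped. The following is only a sketch: the sample line is reconstructed from the packages listed above, in the format dpkg-checkbuilddeps normally uses, and it assumes plain package names with optional version constraints in parentheses (alternatives written as "a | b" would need extra handling).

```shell
# Illustrative error line; in practice, capture it from the failed build output.
log='dpkg-checkbuilddeps: error: Unmet build dependencies: python3-all python3-numpy python3-pandas python3-setuptools python3-sqlalchemy'

# Keep only what follows the marker text, then strip version
# constraints such as "(>= 1.2)" if any are present.
deps=$(printf '%s\n' "$log" \
    | sed -n 's/.*Unmet build dependencies: //p' \
    | sed 's/ *([^)]*)//g')

echo "$deps"

# To install them, leave $deps unquoted so it splits into separate arguments:
#   sudo apt install $deps
```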

Now you have an installable package in package_pandasql/result/, and you can install it using sudo apt install ./result/python3-pandasql*.deb.
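After installing the .deb, a quick sanity check is to ask dpkg whether it knows the package and then to import the module with the system interpreter. A small sketch, assuming the package is named python3-pandasql as in the file name above (is_installed is a helper name I made up):

```shell
# Succeeds only if dpkg reports the named package as fully installed.
is_installed() {
    dpkg-query -W -f='${Status}' "$1" 2>/dev/null | grep -q 'install ok installed'
}

if is_installed python3-pandasql; then
    # Importing the module shows where the system Python picked it up from.
    python3 -c 'import pandasql; print(pandasql.__file__)'
else
    echo "python3-pandasql is not installed" >&2
fi
```

Since it is an ordinary Debian package, sudo apt remove python3-pandasql uninstalls it cleanly again.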

Answered By: Marcus Müller

As has been noted, there are good reasons why pip no longer lets you install packages system-wide. But I also have this use case: I manage software for large numbers of users who may have access to the same machine and expect software to "just work", without needing developer knowledge to set it up.

When users do try to install things on their own, it leads to a lot of different versions being installed, some of which get updated while others don’t, and to "it works for me but not everyone else" situations. We also use a lot of disk space by having independent copies of large packages for every single user.

A solution that currently works for me in debian:bookworm is to install packages for a special user and group, called python-global, and then allow other users to inherit that user’s packages by manipulating the PYTHONPATH variable.

Installing a package looks like:

sudo -u python-global python3 -m pip install <x> --break-system-packages

And in /etc/profile.d/python-global.sh there is this code:

if [ -z "${__SOURCED_PYTHON_GLOBAL__}" ]; then
    export PYTHONPATH="/home/python-global/python-packages/:$PYTHONPATH";
    export PATH="/home/python-global/.local/bin/:$PATH";
    export __SOURCED_PYTHON_GLOBAL__=1;
fi;
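The __SOURCED_PYTHON_GLOBAL__ guard is what keeps the path from growing when the file is sourced more than once (for example by a login shell and again by an X session script). Here is a small demonstration to run in a scratch shell; it uses a temporary copy of the guard with only the PYTHONPATH line, and /opt/example is a made-up pre-existing entry:

```shell
# Temporary stand-in for /etc/profile.d/python-global.sh.
snippet=$(mktemp)
cat > "$snippet" <<'EOF'
if [ -z "${__SOURCED_PYTHON_GLOBAL__}" ]; then
    export PYTHONPATH="/home/python-global/python-packages/:$PYTHONPATH"
    export __SOURCED_PYTHON_GLOBAL__=1
fi
EOF

# Start from a known state.
unset __SOURCED_PYTHON_GLOBAL__
PYTHONPATH="/opt/example"

. "$snippet"
first=$PYTHONPATH      # shared directory prepended once
. "$snippet"
second=$PYTHONPATH     # unchanged: the guard skipped the body

rm -f "$snippet"
echo "after first source:  $first"
echo "after second source: $second"
```

Both lines print the same value, which is also why setting the guard variable in your own .profile is enough to opt out.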

Whilst this could still sometimes result in a conflict with a system package, it’s relatively easy for a user to disable it by clearing the PYTHONPATH variable if they need to. Or they can set __SOURCED_PYTHON_GLOBAL__ in their own .profile to avoid the changes.

You may have to do additional work because, depending on your login manager, /etc/profile.d is not always sourced automatically. You may need to create a copy in /etc/X11/Xsession.d/ as well.

If using the flag that is described as "break your system" makes you nervous, you could probably replace the --break-system-packages here with a proper virtual environment owned by the python-global user. The virtual environment only has to be set up once, and every user uses the same virtual environment.
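A rough sketch of that virtual-environment variant follows. I have not run it in production; the location /home/python-global/venv and the function name use_shared_venv are assumptions for illustration only. The one-time setup is shown as comments, and the function is what a profile.d file could call to put the shared environment first on every user’s PATH:

```shell
# One-time setup, run by an administrator (venv location is hypothetical):
#   sudo -u python-global python3 -m venv /home/python-global/venv
#   sudo -u python-global /home/python-global/venv/bin/pip install <x>

# Prepend a shared venv's bin directory to PATH, at most once.
use_shared_venv() {
    venv_dir=$1
    # Do nothing if the directory does not look like a venv.
    [ -x "$venv_dir/bin/python" ] || return 0
    case ":$PATH:" in
        *":$venv_dir/bin:"*) ;;                       # already on PATH
        *) PATH="$venv_dir/bin:$PATH"; export PATH ;;
    esac
}

use_shared_venv /home/python-global/venv
```

With this, a plain python resolves to the shared environment’s interpreter, whereas scripts that call /usr/bin/python3 explicitly keep using the system one.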

Answered By: szmoore