Poetry

Warning

This is a legacy document, left here for historical reasons. If you’re looking for a way to manage your environment, check out the uv guide.

The Problem

Pip, conda, virtualenv, venv, pipenv, poetry, pyenv, pyenv-virtualenv, requirements.txt, environment.yml, Pipfile, Pipfile.lock, setup.py, setup.cfg, uv, rye… A lot, right?

Python is a great language, but managing dependencies can be a nightmare. This is especially true when you’re working on multiple projects, each with its own set of dependencies. You use venv, but then realize each project might require a different version of Python itself. So you use conda, but then you have to remember to activate the right environment every time you switch projects. And you don’t know which packages to install with conda and which with pip. You switch to pipenv, and you have to wait 20 minutes to install one package in a complex environment…

If only we could have cargo for Python…

The Solution

Poetry is a tool that solves all these problems. At least it solves them better than other tools.

The Python community has more or less agreed that Poetry is the ~way to go~ best tool it has so far. It defines dependencies in pyproject.toml, which is more readable than requirements.txt, and lets you specify separate groups of dependencies such as dev, deploy and others. It also creates a virtual environment for you, so you don’t have to worry about it. Then there’s the poetry.lock file, which records the exact resolved version of every library, so the set you install is known to work together. It’s cross-platform, so you can use it on Windows, Mac, and Linux. Because environments are tied to the project directory, you can easily switch between projects without worrying about conflicts or about which environment was used where.

Poetry also handles multiple Python versions and packaging, which is a big plus: you don’t have to worry about setup.py and setup.cfg files.

One might argue that we already achieve reproducibility with Devcontainers, but practice shows that not everyone uses them, and they cannot be used in every setup. For example, when deploying apps on Posit Connect, you cannot use Devcontainers! We recommend using a Devcontainer with poetry pre-installed, but if you want to use poetry in your local environment, you should install it with pipx. pipx installs every package in a separate virtual environment, so it’s the recommended way to install CLI tools like poetry.

Packaging

You may wonder why we want to package our code. To visualize it better, let’s consider the following example:

Project structure
├── src
│   └── model.py
└── scripts
    └── 01_analysis.py

And we get the following errors in 01_analysis.py when trying to import from model.py:

    from my_project.model import get_model
ModuleNotFoundError: No module named 'my_project'

or

    from ..my_project.model import get_model
ImportError: attempted relative import with no known parent package

Sounds familiar? I guess so.

The problem is that Python doesn’t know where to look for your code: the directory that holds model.py is not on its import path. You often end up with a mess of sys.path.append calls and PYTHONPATH environment variables. This is where packaging comes in.
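For reference, this is roughly what the sys.path workaround looks like. The sketch below builds a throwaway my_project/model.py in a temporary directory (the file content is invented for the demo), shows the import failing, and then makes it work by editing sys.path by hand. The helper names are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_demo_project(root: Path) -> None:
    """Create a throwaway my_project/model.py under root (illustrative content)."""
    package = root / "my_project"
    package.mkdir()
    (package / "model.py").write_text("def get_model():\n    return 'demo model'\n")


def import_get_model(project_root: Path):
    """Import my_project.model the hacky way, by extending sys.path by hand."""
    sys.path.insert(0, str(project_root))  # the workaround that packaging makes unnecessary
    try:
        importlib.invalidate_caches()  # make sure the freshly created files are seen
        return importlib.import_module("my_project.model").get_model
    finally:
        sys.path.remove(str(project_root))  # undo the change so it does not leak


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        make_demo_project(root)
        try:
            importlib.import_module("my_project.model")
        except ModuleNotFoundError as error:
            print(f"Before touching sys.path: {error}")
        get_model = import_get_model(root)
        print(f"After touching sys.path: {get_model()}")
```

Every script that needs the import has to repeat this dance, and it breaks as soon as the directory layout changes.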

By adding a few lines to pyproject.toml, you can make your project a package.

New structure
├── my_project
│   └── model.py
├── pyproject.toml
└── scripts
    └── 01_analysis.py

pyproject.toml generated with poetry new my_project
[tool.poetry]
name = "my_project"
version = "0.1.0"
description = ""
authors = ["Appsilon.com <hello@appsilon.com>"]
packages = [{include = "my_project"}]


[tool.poetry.dependencies]
python = "^3.12"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

Then, after running poetry install && poetry shell, you can import model.py in 01_analysis.py with from my_project.model import get_model. 🎉

This is immensely useful, and this knowledge is definitely under-shared in the data science community.

Note

Tapyr of course comes with packaging out of the box, so you don’t have to worry about it.

Installing Poetry-managed packages in a non-Poetry environment

While poetry is a great tool, sometimes you may need to install a package in an environment without poetry. This is the case when deploying a Shiny app on Posit Connect. Fortunately, it’s as easy as running pip install . in the project directory.
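This works because pip reads the [build-system] table in pyproject.toml and fetches the build backend (poetry-core here) on its own. If you need to trigger the install from a script, a minimal sketch could look like the one below; pip_install_command and install_project are hypothetical helper names.

```python
import subprocess
import sys
from pathlib import Path


def pip_install_command(project_dir: Path) -> list[str]:
    """Build the pip command that installs the project found in project_dir."""
    # "python -m pip" ties the install to the interpreter running this script.
    return [sys.executable, "-m", "pip", "install", str(project_dir)]


def install_project(project_dir: Path) -> None:
    """Run pip; raises subprocess.CalledProcessError if the install fails."""
    subprocess.run(pip_install_command(project_dir), check=True)


if __name__ == "__main__":
    # Only print the command here; call install_project(Path(".")) to actually run it.
    print(" ".join(pip_install_command(Path("."))))
```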

Tip

You don’t need poetry on your system to use a project with poetry-defined dependencies!

Why not conda and when conda?

Conda has its place in the world, but it’s not the best tool for managing dependencies in enterprise-ready projects. It’s good for data science projects; in particular, you can install additional, non-Python artifacts like R or Julia! You can also install the CUDA toolkit with conda, which allows you to have projects that use different versions of CUDA on the same machine (important for computing clusters)!

However, conda is known neither for its reproducibility nor for its speed. It also doesn’t provide a built-in packaging system, so you have to use setuptools and setup.py files. It’s just built for a different purpose.

Tip

poetry doesn’t work well with PyTorch, but you rarely need PyTorch in Shiny apps. It is better to export your model to ONNX and run it with onnxruntime.