uv

The Problem

Pip, conda, virtualenv, venv, pipenv, pyenv, pyenv-virtualenv, requirements.txt, environment.yml, Pipfile, Pipfile.lock, poetry, setup.py, setup.cfg, uv, rye… A lot, right?

Python is a great language, but managing dependencies can be a nightmare. This is especially true when you’re working on multiple projects, each with its own set of dependencies. You use venv, but then realize each project might require a different version of Python itself. So you use conda, but then you have to remember to activate the right environment every time you switch projects. And you don’t know which packages to install with conda and which with pip. You switch to pipenv, and you have to wait 20 minutes to install one package in a complex environment…

If only we could have cargo for Python…

The Solution

Poetry is a tool that solves all these problems. At least, it solves them better than other tools.

Note

When creating tapyr, we decided to use poetry as a dependency manager. However, recent advancements in uv made us consider switching to it in tapyr v0.2. If you’re already using poetry, you can stick to it, but if you’re starting a new project, we recommend using uv. It’s faster, seems more lightweight, and works better with ML libraries like torch. You can find the original version of this page here.

uv is a new tool that is taking the Python community by storm. It comes from the creators of ruff, another tool that has been very warmly welcomed by the community. Check out the uv guide for managing dependencies in projects.

Important

If you can’t use uv for some reason, you can install the package with pip install -e . in the project directory. This installs the package (and its dependencies) in editable mode, so you can make changes to the code and see them immediately. You will have to add dependencies to pyproject.toml or requirements.txt manually and install them yourself.

Back to uv: it’s a tool that manages dependencies, virtual environments, and packaging. Apart from being very fast, it’s also very easy to use. It handles different platforms and can install the proper Python version for the project. Since it uses the .venv directory, the majority of IDEs and other tools are able to recognize and use it.

Having a requirements.txt is better than nothing, but it’s not enough. uv defines dependencies in pyproject.toml and has its own uv.lock file, which records the exact versions that were resolved to work together. You can also specify different groups of dependencies: for example, you need pytest for testing, but not for deploying and running your dashboard. Because environments are tied to the project directory, you can easily switch between projects without worrying about conflicts or about which environment was used where.

One might argue that we already achieve reproducibility with Devcontainers, but practice shows that not everyone uses them, and they can’t be used in every setup. For example, you cannot use Devcontainers when deploying apps on Posit Connect! We recommend using a Devcontainer with uv pre-installed, but if you want to use uv in your local environment, follow the installation guide.

Packaging

You may wonder why we want to package our code. To make this concrete, let’s consider the following example:

Project structure
├── src
│   └── model.py
└── scripts
    └── 01_analysis.py

And we get the following errors in 01_analysis.py when trying to import from model.py:

    from my_project.model import get_model
ModuleNotFoundError: No module named 'my_project'

or

    from ..my_project.model import get_model
ImportError: attempted relative import with no known parent package

Sounds familiar? I guess so.

The problem is that Python doesn’t know where to look for the package that contains model.py, and you often end up with a mess of sys.path.append calls and PYTHONPATH environment variables. This is where packaging comes in.
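
The failure is easy to reproduce. The sketch below builds a throwaway package in a temporary directory and shows that the import only succeeds once the directory holding the package is on sys.path, which is the manual hack that packaging makes unnecessary. The get_model body is a hypothetical stand-in, and the sketch assumes no package named my_project is already installed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Create my_project/model.py in a directory Python does not know about.
    pkg = Path(tmp) / "my_project"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "model.py").write_text("def get_model():\n    return 'model'\n")

    # First attempt: the directory is not on sys.path, so the import fails.
    try:
        importlib.import_module("my_project.model")
        found_before = True
    except ModuleNotFoundError:
        found_before = False

    # The manual workaround: put the parent directory on sys.path.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    model = importlib.import_module("my_project.model")
    result = model.get_model()

    # Undo the hack so the rest of the process is unaffected.
    sys.path.remove(tmp)
    sys.modules.pop("my_project.model", None)
    sys.modules.pop("my_project", None)

print(found_before, result)
```

Installing the project into its environment has the same effect as the sys.path line, without any path manipulation in your scripts.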

By adding a few lines to pyproject.toml, you can make your project a package.

New structure
├── my_project
│   └── model.py
├── pyproject.toml
└── scripts
    └── 01_analysis.py

Example pyproject.toml generated with uv init my_project
[project]
name = "my_project"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = []

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

Now all imports will work: with from my_project.model import get_model in the script, you can run it with uv run python scripts/01_analysis.py. 🎉 If you don’t want to prefix commands with uv run, you can activate the environment with source .venv/bin/activate and then run the script with python scripts/01_analysis.py.
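
A quick way to confirm that the package is visible in the active environment is to ask Python’s import system whether it can locate it. This is a small sketch using only the standard library; is_importable is a hypothetical helper name, and my_project is the example package from above.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the import system can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# "json" ships with Python; "my_project" is found only in an
# environment where the project has been installed.
print(is_importable("json"))
print(is_importable("my_project"))
```

Run the check through uv run (or with the environment activated) so that it inspects the project’s environment rather than the system Python.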

This is immensely useful, and this knowledge is definitely under-shared in the data science community.

Note

Tapyr of course comes with packaging out of the box, so you don’t have to worry about it.

Installing uv-managed packages/projects in a non-uv environment

While uv is a great tool, sometimes you need to install a package in an environment without uv. This is the case when deploying a Shiny app on Posit Connect. Fortunately, it’s as easy as running pip install . in the project directory.

Tip

You don’t need uv on your system to use a project with uv-defined dependencies!

Direct uv and poetry comparison

In this section we compare common actions in uv and poetry. Both store dependencies in pyproject.toml and use a lock file (uv.lock and poetry.lock, respectively) to ensure reproducibility.

Creating a new project

# uv
uv init my_project --lib # or uv init . --lib if you're already in the project directory
# poetry
poetry new my_project

Note that uv has application and library modes. We’re using the library mode here for packages. If you need an environment for a script, you can use uv init my_project.

Running a script

# uv
uv run scripts/01_analysis.py # or uv run python scripts/01_analysis.py
# poetry
poetry run python scripts/01_analysis.py

Activating the environment

# uv
source .venv/bin/activate
# poetry
poetry shell

Syncing dependencies

# uv
uv sync
# poetry
poetry install

Adding a dependency

# uv
uv add pandas
# poetry
poetry add pandas

Removing a dependency

# uv
uv remove pandas
# poetry
poetry remove pandas

Adding a dev dependency

# uv
uv add pytest --dev
# poetry
poetry add pytest --group dev

Exporting requirements.txt

# uv
uv export --no-hashes --format requirements-txt > requirements.txt
# poetry
poetry export --without-hashes --format=requirements.txt > requirements.txt

Note that poetry export doesn’t include the current project itself, so installing from the exported requirements.txt installs the dependencies but not your own package. This means that for the Posit Connect deployment, you may need to add a line containing a single dot

.

to the requirements.txt file.
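
If you regenerate requirements.txt often, adding that line by hand is easy to forget. Below is a minimal sketch of automating it; ensure_project_line is a hypothetical helper, and the pinned pandas version in the demo is made up for illustration.

```python
import tempfile
from pathlib import Path


def ensure_project_line(requirements: Path) -> bool:
    """Append a '.' entry if it is missing; return True if the file changed."""
    lines = requirements.read_text().splitlines()
    if any(line.strip() == "." for line in lines):
        return False
    lines.append(".")
    requirements.write_text("\n".join(lines) + "\n")
    return True


# Demo on a throwaway file rather than a real requirements.txt.
with tempfile.TemporaryDirectory() as tmp:
    req = Path(tmp) / "requirements.txt"
    req.write_text("pandas==2.2.0\n")
    changed_first = ensure_project_line(req)   # adds the "." line
    changed_second = ensure_project_line(req)  # already present, no change
    final_text = req.read_text()

print(final_text)
```

The helper is idempotent, so it is safe to call after every export.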

Why not conda and when conda?

Conda has its place in the world, but it’s not the best tool for managing dependencies in enterprise-ready projects. It’s good for data science projects; in particular, you can install additional non-Python artifacts like R or Julia! You can also install the CUDA toolkit with conda, which lets projects on the same machine use different versions of CUDA (important for computing clusters)!

However, conda is known neither for its reproducibility nor for its speed. It also doesn’t provide a built-in packaging system, so you have to use setuptools and setup.py files. It’s just built for a different purpose.

Tip

Some packages, like PyTorch, can be difficult to install and are simply heavy. Consider exporting your model to ONNX and running it with onnxruntime.