uv
The Problem
Pip, conda, virtualenv, venv, pipenv, pyenv, pyenv-virtualenv, requirements.txt, environment.yml, Pipfile, Pipfile.lock, poetry, setup.py, setup.cfg, uv, rye… A lot, right?
Python is a great language, but managing dependencies can be a nightmare. This is especially true when you’re working on multiple projects, each with its own set of dependencies. You use venv, but then realize each project might require a different version of Python itself. So you use conda, but then you have to remember to activate the right one every time you switch projects. And you don’t know which packages to install with conda, and which with pip. You switch to pipenv, and you have to wait 20 minutes to install one package in a complex environment…
If only we could have cargo for Python…
The Solution
Poetry is a tool that solves all these problems. At least, it solves them better than other tools.
When creating tapyr, we decided to use poetry as a dependency manager. However, recent advancements in uv made us consider switching to it in tapyr v0.2. If you’re already using poetry, you can stick to it, but if you’re starting a new project, we recommend using uv. It’s faster, seems more lightweight, and works better with ML libraries like torch. You can find the original version of this page here.
uv is a new tool that has taken the Python community by storm. It comes from the creators of ruff, another tool that has been warmly welcomed by the community. Check out the uv guide on managing dependencies in projects.
If you can’t use uv for some reason, you can install the package with pip install -e . in the project directory. This will install the package (and its dependencies) in editable mode, so you can make changes to the code and see them immediately. You will have to manually add and install dependencies to pyproject.toml or requirements.txt.
Back to uv: it’s a tool that manages dependencies, virtual environments, and packaging. Apart from being very fast, it’s also very easy to use. Naturally, it handles different platforms and is able to install the proper Python version for the project. Since it uses the .venv directory, the majority of IDEs and other tools are able to recognize it and use it.
Having a requirements.txt is better than nothing, but it’s not enough. uv defines dependencies in pyproject.toml and has its own uv.lock file that ensures that the versions of libraries you’ve specified will work together. You can also specify different groups of dependencies: for example, you need pytest for testing, but not for deploying and running your dashboard. Because uv ties environments to the project directory, you can easily switch between projects without worrying about conflicts or about which environment was used where.
One might argue that we already achieve reproducibility with Devcontainers, but in practice not everyone uses them, and they cannot be used in every setup. For example, you cannot use Devcontainers when deploying apps on Posit Connect! We recommend using a Devcontainer with uv pre-installed, but if you want to use uv in your local environment, follow the installation guide.
Packaging
You may wonder why we want to package our code. To visualize it better, let’s consider the following example:
Project structure
├── src
│   └── model.py
└── scripts
    └── 01_analysis.py
And we get the following errors in 01_analysis.py when trying to import from model.py:
from my_project.model import get_model
ModuleNotFoundError: No module named 'my_project'
or
from ..my_project.model import get_model
ImportError: attempted relative import with no known parent package
Sound familiar? I guess so.
The problem is that Python doesn’t know where to look for the src package, and you often end up with a mess of sys.path.append and PYTHONPATH environment variables. This is where packaging comes in.
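To make that concrete, here is the kind of workaround you often see before packaging is set up. This is a hypothetical sketch, not part of tapyr; the paths simply follow the structure above.
# scripts/01_analysis.py — the hack that packaging makes unnecessary
import sys
from pathlib import Path

# Manually point Python at the src directory so the import below resolves
sys.path.append(str(Path(__file__).resolve().parent.parent / "src"))

from model import get_model  # works only because of the sys.path hack above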
By adding a few lines to pyproject.toml, you can make your project a package.
New structure
├── my_project
│   └── model.py
├── pyproject.toml
└── scripts
    └── 01_analysis.py
Example pyproject.toml generated with uv init my_project
[project]
name = "my_project"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.10"
dependencies = []
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
Now imports like from my_project.model import get_model will work, and you can run the script with uv run python scripts/01_analysis.py. 🎉 If you don’t want to prefix commands with uv, you can activate the environment with source .venv/bin/activate and then run the script with python scripts/01_analysis.py.
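For illustration, scripts/01_analysis.py could now look like the sketch below; only the import comes from the example above, the rest of the body is hypothetical.
# scripts/01_analysis.py — runs with `uv run python scripts/01_analysis.py`
from my_project.model import get_model

model = get_model()
print(model)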
This is immensely useful, and this knowledge is definitely under-shared in the data science community.
Tapyr of course comes with packaging out of the box, so you don’t have to worry about it.
Installing uv-managed packages/projects in a non-uv environment
While uv is a great tool, sometimes you may need to install a package in an environment without uv. This is the case when deploying a Shiny app on Posit Connect. Fortunately, running pip install . in the project directory is all it takes.
Note 1: You don’t need uv on your system to use a project with uv-defined dependencies!
Direct uv and poetry comparison
In this section we compare common actions in uv and poetry. Both store dependencies in pyproject.toml, and use uv.lock and poetry.lock files to ensure reproducibility.
Creating a new project
# uv
uv init my_project --lib # or uv init . --lib if you're already in the project directory
# poetry
poetry new my_project
Note that uv has application and library modes. We’re using the library mode here for packages. If you need an environment for a script, you can use uv init my_project.
Running a script
# uv
uv run scripts/01_analysis.py # or uv run python scripts/01_analysis.py
# poetry
poetry run python scripts/01_analysis.py
Activating the environment
# uv
source .venv/bin/activate
# poetry
poetry shell
Syncing dependencies
# uv
uv sync
# poetry
poetry install
Adding a dependency
# uv
uv add pandas
# poetry
poetry add pandas
Removing a dependency
# uv
uv remove pandas
# poetry
poetry remove pandas
Adding a dev dependency
# uv
uv add pytest --dev
# poetry
poetry add pytest --group dev
Exporting requirements.txt
# uv
uv export --no-hashes --format requirements-txt > requirements.txt
# poetry
poetry export --without-hashes --format=requirements.txt > requirements.txt
Note that poetry doesn’t export the current project itself, so installing from the exported file alone won’t include your package. This means that for the Posit Connect deployment, you may need to add a line containing a single dot (.) to the requirements.txt file.
Why not conda and when conda?
Conda has its place in the world, but it’s not the best tool for managing dependencies in enterprise-ready projects. It’s good for data science projects; in particular, you can install additional, non-Python artifacts like r or julia! You can also install cuda drivers with conda, which allows you to have projects that use different versions of cuda on the same machine (important for computing clusters)!
However, conda is not known for its reproducibility nor for its speed. It also doesn’t provide a built-in packaging system, so you have to use setuptools and setup.py files. It’s just built for a different purpose.
Some packages like PyTorch might be difficult to install and are just heavy. You should consider exporting your model to onnx and using onnxruntime for inference.
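A minimal sketch of that flow, assuming you already have a trained PyTorch model; the model, tensor shapes, and file name below are placeholders, not part of tapyr.
import torch
import onnxruntime as ort

model = torch.nn.Linear(4, 2)  # stand-in for your trained model
model.eval()
dummy_input = torch.randn(1, 4)

# Export once, in the (heavy) environment that has torch installed
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["input"], output_names=["output"])

# At inference time only the lightweight onnxruntime package is needed
session = ort.InferenceSession("model.onnx")
outputs = session.run(None, {"input": dummy_input.numpy()})
print(outputs[0].shape)  # (1, 2)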