Migrating to Tapyr Case Study

Important

The migration guide was written before the Shiny 1.0 and tapyr 0.2 releases. Most of the information should still be valid; the only larger change is that uv is now used instead of poetry for managing the environment. Instead of poetry add ... you should use uv add ..., and to run the app you should use uv run shiny run app.py. See the uv guide on managing dependencies in projects.

The previous tutorial showed how to migrate a simple app to Tapyr.

Without further ado, let’s consider a more complex project, the Respiratory Diseases dashboard. Important steps are marked with the 📌 emoji.

Environment

Original Repository

First, we have to prepare an environment. When I came back to the original repository, its dependencies were managed by requirements.txt + venv. I tried to create a new environment and run pip install -r requirements.txt, but it failed.

error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Why? Nobody really knows. I tried to hunt down the package that was breaking the installation, juggled Python versions, and finally managed to get it working.

Note

This is a good example of why it is important to have a reproducible environment. You never know when you will need to recreate it.

Environment with Tapyr

To create the environment with Tapyr, I followed these steps:

  • 📌 Created a new repository using the Tapyr template on GitHub.
  • Cloned it to my local machine.
  • 📌 (In VS Code) Changed the name of the app directory from tapyr_template to respiratory_disease_tapyr.
  • Note that if you have a Python extension enabled, the above step will also refactor all the occurrences of tapyr_template to respiratory_disease_tapyr in the source code.
  • 📌 Changed all occurrences (name of the package and so on) of tapyr_template to respiratory_disease_tapyr in pyproject.toml.
  • 📌 Installed the tapyr dependencies using poetry install.

Now it gets a bit tricky: we have to find the real dependencies of the original project. It is a bit of a stretch, but I call real dependencies the ones that are actually imported in the code. For example, pandas is a real dependency, but lxml rarely is.

I went through the code and found the following dependencies [this is the requirements.txt version]:

  • htmltools [this is a shiny dependency]
  • plotly [5.9.0]
  • pandas [1.4.3]
  • geopandas [missing]
  • shinywidgets [this is a shiny dependency]
  • ipywidgets [7.7.1]
  • ipyleaflet [0.17.0]
  • branca [0.5.0]
  • numpy [1.23.1]

I don’t know what’s going on with geopandas, which is imported in the code but missing from requirements.txt, but that’s how it is with projects that use a requirements.txt file.

One important note is that the original project was built with the alpha version of Shiny, so I assume a lot has changed and I don’t want to pin its version. The same goes for htmltools and shinywidgets.

The rest of the dependencies have been added to the pyproject.toml file. This can be done either with poetry add or manually. I went for manual addition and edited the pyproject.toml file directly:

[tool.poetry.dependencies]
python = "^3.10"
shiny = "^0.9.0"
rich = "^13.7.1"
loguru = "^0.7.2"
pydantic-settings = "^2.2.1"
python-dotenv = "^1.0.1"
# <--- added dependencies vvv
htmltools = "*"
plotly = "^5.9.0"
pandas = "^1.4.3"
geopandas = "*"
shinywidgets = "*"
ipywidgets = "^7.7.1"
ipyleaflet = "^0.17.0"
branca = "^0.5.0"
numpy = "^1.23.1"
# <--- added dependencies ^^^

[tool.poetry.group.dev.dependencies]
icecream = "^2.1.3"  # For debugging, print() on steroids
ipykernel = "^6.29.4"  # For running Jupyter notebooks in VS Code
# <--- removed ipywidgets vvv
# ipywidgets = "^8.1.1"
# <--- removed ipywidgets ^^^
pre-commit = "^3.7.0"
ruff = "^0.4.1"

Above you can see the adjusted pyproject.toml file. 📌 Changes I’ve made:

  • Added any version of htmltools, geopandas, shinywidgets to the dependencies with the = "*" syntax.
  • Added the rest of the dependencies with the version that was in the requirements.txt file.
    • Note that I’ve used ^ (caret) constraints. For plotly and numpy this pins the major version: ^5.9.0 allows any version >=5.9.0 and <6.0.0. For 0.x packages such as ipyleaflet and branca the caret is stricter and only allows patch updates (^0.17.0 means >=0.17.0 and <0.18.0). This is a good practice to avoid breaking changes in the future, while still allowing minor updates, which should be safe. In case of problems, you can always pin the version to the exact one.
  • Removed ipywidgets from the dev dependencies, as the main dependencies require it in an earlier version.
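To make the caret syntax concrete, here is a small sketch that computes the version range a caret constraint allows, following the rule described in the Poetry documentation (the leftmost non-zero component must not change). The caret_range helper is hypothetical and written only for illustration; it assumes plain numeric versions with at most three components and is not how Poetry itself is implemented.

```python
def caret_range(constraint: str) -> tuple[str, str]:
    """Return (inclusive lower bound, exclusive upper bound) for a constraint like "^1.4.3"."""
    parts = [int(p) for p in constraint.removeprefix("^").split(".")]
    # Pad the lower bound to major.minor.patch.
    lower = [*parts, 0, 0][:3]
    # Bump the leftmost non-zero component; if every given component is zero,
    # bump the last component that was given (so "^0.0" allows <0.1.0).
    bump_at = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = [*lower[:bump_at], lower[bump_at] + 1, *([0] * (2 - bump_at))]
    return ".".join(map(str, lower)), ".".join(map(str, upper))


for spec in ("^5.9.0", "^1.23.1", "^0.17.0", "^0.5.0"):
    low, high = caret_range(spec)
    print(f"{spec}: >={low}, <{high}")
```

For the constraints used above, this reports that plotly and numpy may move to any later release below the next major version, while ipyleaflet and branca stay within their current 0.x minor series.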

Now, you can either reopen the project in the devcontainer or (if you have poetry installed locally) run:

poetry lock
poetry install

and all the dependencies should be installed. The environment is ready.

If you use VS Code, please select the environment in the bottom left corner now.

Environment final touches

📌 For the final touches, run poetry shell and then pre-commit install to install the pre-commit hooks. Those will prevent you from committing code that doesn’t pass the linting and other checks.

📌 Commit the changes to the repository. In the early steps of migration you may need to add the -n flag to git commit to skip the hooks as the code may not pass them yet. But we’ll get there!

Code

Original Code

Let’s take a look at the original repository files:

.
├── README.md
├── app.py
├── data
│   ├── __init__.py
│   ├── geojson_to_dataframe.py
│   └── *.csv / *.geojson  # data files
├── modules
│   ├── map.py
│   └── plot.py
├── requirements.txt
├── utils
│   ├── helper_text.py
│   ├── map_utils.py
│   └── plot_utils.py
└── www
    └── *  # png/js/css and other static files

After inspection, we see that app.py defines both the ui and the server functions. In the modules directory we have two shiny modules, which is already a sign of good code organization. Additionally, we have some utility functions and data processing/loading functions that, controversially, have been placed in the data directory.

Migration

  • 📌 As the first and easy migration step, we can move csv and geojson files to the data directory in the Tapyr project. Check the 4fff141 commit.
  • 📌 The same goes with the www directory contents.

📌 Now, the main game begins. We have to migrate the code.

  1. First, let’s extract the contents of app.py to respiratory_disease_tapyr/view/root/server.py and respiratory_disease_tapyr/view/root/ui.py.
  2. Let’s migrate modules by creating map and plot directories inside respiratory_disease_tapyr/view. In those directories, create server.py, ui.py, and __init__.py files. Copy the server and ui functions from the original modules to the new files. In the __init__.py file, import the server and ui functions for easier access.
from .ui import plot_ui  # noqa: F401
from .server import plot_server  # noqa: F401
  3. Move helpers to respiratory_disease_tapyr/helpers by putting there all 3 files from the utils directory. Also create an empty __init__.py file there (it is required for Python to recognize the directory as a package).
  4. Put the __init__.py and geojson_to_dataframe.py files from the data directory to the respiratory_disease_tapyr/logic/data_loading directory.
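The purely mechanical part of the steps above (copying whole files and creating empty module files) can be sketched as a small script. This is only an illustration under assumptions: the migrate_layout function and the two path arguments are hypothetical, files are copied rather than moved so the original repository stays intact, and splitting app.py and the modules into server.py and ui.py still has to be done by hand.

```python
import shutil
from pathlib import Path

PACKAGE = "respiratory_disease_tapyr"

# Old path (relative to the original repo) -> new path (relative to the Tapyr repo).
WHOLE_FILE_COPIES = {
    "utils/helper_text.py": f"{PACKAGE}/helpers/helper_text.py",
    "utils/map_utils.py": f"{PACKAGE}/helpers/map_utils.py",
    "utils/plot_utils.py": f"{PACKAGE}/helpers/plot_utils.py",
    "data/__init__.py": f"{PACKAGE}/logic/data_loading/__init__.py",
    "data/geojson_to_dataframe.py": f"{PACKAGE}/logic/data_loading/geojson_to_dataframe.py",
}

# Empty files to be filled in by hand with the split ui/server code.
PLACEHOLDERS = [
    f"{PACKAGE}/helpers/__init__.py",
    *(
        f"{PACKAGE}/view/{module}/{name}"
        for module in ("map", "plot")
        for name in ("__init__.py", "server.py", "ui.py")
    ),
]


def migrate_layout(old_repo: Path, new_repo: Path) -> None:
    """Copy whole files to their new homes and create empty placeholder files."""
    for src, dst in WHOLE_FILE_COPIES.items():
        target = new_repo / dst
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(old_repo / src, target)
    for rel in PLACEHOLDERS:
        target = new_repo / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch(exist_ok=True)  # does not overwrite existing content
```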

Huh, that was a lot of work. But now we also have to fix the imports. For example, instead of from modules import map, plot we should write from respiratory_disease_tapyr.view import map, plot. Now we can run the app for the first time and see if it works 🎉 Before we commit, we can also run pre-commit run --all-files to fix at least some of the issues, like formatting and part of the linting. With those steps done, we can commit the changes (we still need -n to ignore some linting issues). Check the 7d794b0 commit.

Code Fixing

Ruff Linter

The ruff linter found some issues in the code. Let’s fix them.

Those are fortunately easy to fix:

  1. Ruff tells us not to use zip without the strict argument. We can fix it by adding strict=True to the zip call. This ensures that an error is raised if the iterables have different lengths.
  2. We don’t have a quick and easy fix for json = eval(gdf.set_index("id").to_json()). Using the eval function is a security risk in general. For now we mark it with # noqa: S307, and we will come back to it later.
  3. In geojson_to_dataframe.py we create a dataframe and save it to a file. To a Python-first developer it looks strange that we also import from this file, but that’s for later. Ruff tells us that we shouldn’t use df as a dataframe name. However, we don’t have any knowledge about this code, so we mark it with # noqa: PD901.
  4. Ruff tells us not to concatenate lists with +. We can fix it by using [*l1, new_element] instead of l1 + [new_element].

Now we can commit, and we finally don’t need the -n flag! Check the 165a605 commit.

Pyright type checker

Pyright found some issues in the code. Let’s fix them. They all arise from pandas being unable to infer the types of certain operations. One general rule is to always use df.loc/df.iloc instead of df[] as those are more type-safe/predictable.

Fixes are easy:

  1. In lines like str(round(row["Death.Rate"], 2))) add loc, so it becomes str(round(row.loc["Death.Rate"], 2))).
  2. In a few other places add loc to dataframe column lookups: return data.loc[data["Year"] == year] -> return data.loc[data.loc[:, "Year"] == year].
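A small runnable sketch of both patterns, using a made-up dataframe. Note that a column is selected with data.loc[:, "Year"], because data.loc["Year"] would look up a row label instead. The filter_by_year name is hypothetical, and whether these forms satisfy pyright may depend on the installed pandas stubs.

```python
import pandas as pd

# Made-up values, only to show the indexing patterns.
data = pd.DataFrame(
    {"Year": [2019, 2019, 2020], "Death.Rate": [1.234, 5.678, 9.1011]}
)


def filter_by_year(data: pd.DataFrame, year: int) -> pd.DataFrame:
    # Boolean mask built from the "Year" column, applied to the rows via .loc.
    return data.loc[data.loc[:, "Year"] == year]


rows_2019 = filter_by_year(data, 2019)

# Label-based access on each row (a Series) instead of row["Death.Rate"].
labels = [str(round(row.loc["Death.Rate"], 2)) for _, row in rows_2019.iterrows()]
```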

Now we can commit the changes. Check d62bfeb commit.

PyTest

After running pytest we see that 2 tests are passing, one is failing. The one failing is not worrying, but what the heck is passing? We see that we forgot to remove the divide function from the tapyr repository.

We should remove the file respiratory_disease_tapyr/logic/utils.py and remove the contents of the tests/unit/test_utils.py file.

Now, let’s take a look at the test_ui.py file.

def test_footer(page: Page):
    page.goto(APP_URL)
    expect(page.get_by_test_id("docs-link")).to_contain_text("Start with the docs!")

It fails because we’re testing the template app. However, we don’t want to remove this test entirely, so let’s change it to:

def test_startup(page: Page):
    page.goto(APP_URL)
    expect(page).not_to_have_title("404")

This way we will test if the app is running correctly.

Important

It’s not just running the app. This way we ensure that:

  1. The environment has been built correctly from the devcontainer.json file.
  2. All the dependencies are installed.
  3. The app is able to start.

🎉

Awaited Green Checkmark

We get a bunch of deprecation warnings from dependencies, but we can ignore them for now.

Push. 2316722

Summary

We have a fully working app now. We have migrated the code, fixed the issues found by linters and type checkers, and even added the tests.

Now we can focus on further polishing the app, adding new features, and improving the code quality, depending on the project’s and stakeholders’ requirements. 🎉