Dive Into PyShiny by Appsilon
pandas users.
git
Just use git.
A very common problem in data science is how to organize your code and experiments.
We recommend creating a package for the main code and then calling it from scripts and notebooks.
Check out the example_datascience_project directory for an example.
├── README.md
├── data <- Folder for data
│   └── large_data.txt
├── example_datascience_project <- Main logic
│   ├── __init__.py
│   ├── model.py
│   └── serialization.py
├── notebooks <- Jupyter notebooks, numbered for order
│   └── 01_run_model.ipynb
├── poetry.lock
├── pyproject.toml <- Project configuration
├── scripts <- Scripts to run the code
│   └── 01_run_model.py
└── tests <- Tests for the code
    ├── __init__.py
    └── unit
        └── test_model.py
Important
However, you should never install packages globally on your machine.
This is the simplest way to manage dependencies. However, it’s not very convenient, and I don’t recommend it.
One of venv's problems is that you can only work with the Python version you have installed.
pyproject.toml, and storing code as a package.
Important
Poetry doesn’t work well with pytorch or tensorflow. For these, you should use conda.
ruff that takes the Python ecosystem by storm.
cargo for Python.
cargo is a beloved tool for dependency management in Rust.
ruff
Tip
You should use ruff instead of flake8, pylint-*, black, isort, bandit!
ruff format
ruff format
ruff check
ruff check example.py gives:
example.py:2:5: F841 Local variable `c` is assigned to but never used
Found 1 error.
No fixes available (1 hidden fix can be enabled with the `--unsafe-fixes` option).
Note
We found an error in the code that would be hard to spot without ruff.
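The example.py that produced this report is not shown here. As a minimal sketch, a file like the following would be expected to trigger the same F841 diagnostic at line 2, column 5. The function name and the bug itself are hypothetical, chosen only to show why an unused variable often points at a real mistake.

```python
def add_numbers(a, b):
    c = a + b  # assigned but never read: this is what F841 reports
    return a - b  # hypothetical bug: the author meant to return c


print(add_numbers(3, 2))
```

The code runs without raising, which is why this kind of error is easy to miss: only the unused `c` hints that the return value is wrong.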
mypy / pyright
mypy example.py gives:
example.py:3: error: Module has no attribute "not_existing_function" [attr-defined]
Found 1 error in 1 file (checked 1 source file)
mypy / pyright
mypy is a static type checker for Python.
mypy will check if you’re using type hints correctly.
mypy / pyright
Let’s consider a more advanced example.
mypy / pyright
mypy catches two potential errors!
example.py:4: error: List item 1 has incompatible type "tuple[str, None]"; expected "tuple[str, str]" [list-item]
example.py:7: error: Argument 2 to "Person" has incompatible type "str"; expected "int" [arg-type]
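The file that produced these two messages is not shown here. The sketch below is a hypothetical reconstruction with the same two kinds of mistake: a list item whose type does not match the declared element type, and a constructor argument of the wrong type. The names `Person`, `name_pairs`, and the values are assumptions, and the line numbers will not match the output above.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


# Declared as pairs of strings, but the second item holds None.
# A static type checker is expected to report a list-item error here.
name_pairs: list[tuple[str, str]] = [("Ada", "Lovelace"), ("Alan", None)]

# age is declared as int, but a str is passed.
# A static type checker is expected to report an arg-type error here.
person = Person("Ada", "36")

print(name_pairs)
print(person)
```

Both mistakes pass silently when the file is executed (on Python 3.9 or newer); they only surface when a type checker reads the annotations.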
pydantic
pydantic will.
pydantic
We get:
pydantic_core._pydantic_core.ValidationError: 1 validation error for Person
age
  Input should be a valid integer [type=int_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.7/v/int_type
dataclasses, in contrast, will be silent about it.
Important
Type hints are just hints. They’re not enforced by the Python interpreter!
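A minimal sketch of what "not enforced" means in practice, using only the standard library. The `double` function and the `Person` fields are hypothetical; the point is that neither call below raises, even though both violate the annotations.

```python
from dataclasses import dataclass


def double(x: int) -> int:
    # The annotation says int, but nothing checks it at runtime.
    return x * 2


@dataclass
class Person:
    name: str
    age: int


# A str is passed where an int is annotated: the call still succeeds.
result = double("ab")

# None is passed for an int field: the dataclass accepts it silently.
person = Person(name="Ada", age=None)

print(result)
print(person)
```

Catching these cases requires either a static checker run before execution or a library that validates values at runtime.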
Business needs to have dashboards validated.
Many companies have dedicated Quality Assurance teams.
As software engineers, we call the validation process testing.
The process of validating an app by mimicking real users' behavior is called end-to-end testing.
It’s not a matter of running some function with some parameters and being happy with the result.
It’s simulating real interactions with clicks and typing inputs in a programmatic way.
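For contrast, the function-level check that end-to-end testing goes beyond is a unit test. The sketch below is written in the plain-assert style that a test runner can collect from a file such as tests/unit/test_model.py; the `normalize` helper is hypothetical and stands in for whatever the main package exposes.

```python
def normalize(values):
    """Hypothetical helper: scale values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def test_normalize_scales_values():
    # One function, a few parameters, one expected result.
    result = normalize([1, 1, 2])
    assert result == [0.25, 0.25, 0.5]


# Calling the test directly works too, without any test runner.
test_normalize_scales_values()
```

A test like this says nothing about whether a user can actually click through the dashboard, which is the gap end-to-end tests are meant to cover.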
pytest.
cypress for end-to-end testing.
playwright that allows you to write end-to-end tests in Python.