May 31, 2022 (updated August 28, 2022)

Developing and publishing a full-fledged Python package

I have recently been working on some Python code for interacting with an embedded device from a PC, e.g. setting and reading configuration parameters, reading measurements and doing firmware upgrades. The device uses our own communication protocol, for which we have already accumulated quite a few Python scripts. However, this protocol is used both for our main product as well as several customer projects, so we are looking to make the code a bit more “release-ready”. This includes:

  • Reviewing the existing code and refactoring/re-designing for easier maintainability, extensibility and user friendliness
  • Writing (in-code) documentation
  • Testing and performing static analysis to ensure code quality
  • Packaging everything up nicely and publishing it to The Python Package Index so it can be installed from anywhere with pip install my_package

The purpose of this post is to give a general idea of my process of developing the package, and provide you with inspiration for tools, language features, design patterns, etc. you might want to use in your own projects. I will cover a lot of different subjects in this post, so I will not go into depth with each of them, but I will provide links for further reading. Keep reading below or jump straight to a specific section:

  • Setting up the development environment
    • Virtual environment
    • Folder structure
  • Re-design and refactoring
    • Abstract base class (or an “interface” class)
    • Factory design pattern
    • Type hints and circular imports
  • Testing, type checking and linting
    • Unit testing
    • Static type checking
    • Linting
    • Automating it all with Jenkins
  • Documentation
  • Building and publishing a package
    • Building the package
    • Publishing the package

Setting up the development environment

I usually write Python code in PyCharm, which is free to use, offers intelligent code completion, checks syntax and PEP 8 conformity as you type, makes it easy to set up and manage a virtual environment, and can be configured with Eclipse keymaps – a good thing for an embedded developer used to Eclipse-based IDEs such as STM32CubeIDE or Code Composer Studio.

I created a new project with a new virtual environment using Python 3.10.

Virtual environment

I practically always create a new virtual environment when starting a new project. You can think of a virtual environment as an isolated Python installation that includes only the packages needed for your specific project – completely separated from your global Python installation. This allows you to:

  • Use a specific Python version
  • Use specific packages (perhaps you need an older version than the one installed globally)
  • Keep track of which packages are required for your project
  • Avoid polluting the global environment with packages for every single project you create

This is very practical if you want to clone another developer’s environment or have another developer clone yours. PyCharm asks whether it should create a virtual environment for you when you create a new project, making the setup trivial. Even if you prefer to use the command line, the setup is as simple as running:

$ python -m venv .\venv

to create a virtual environment in the directory venv, and then activating it by running:

$ .\venv\Scripts\activate.bat

We’ll now see (venv) appear at the beginning of the command line. If we run pip list we will see that the environment only contains the packages pip and setuptools. If we run pip install <some_package>, the package will only be installed in the virtual environment and have no effect on the global environment. When sharing the project with other developers, we can generate a list (usually named requirements.txt) of all the packages installed in the virtual environment (and thus required to run the project) with the command pip freeze > requirements.txt. Other developers can now create their own virtual environment, install all the required packages with pip install -r requirements.txt and be up and running in no time.
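To summarize, the whole sharing workflow is just two commands:

$ pip freeze > requirements.txt      # capture the installed packages (run in your venv)
$ pip install -r requirements.txt    # recreate the environment (run in a fresh venv)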

Folder structure

When creating a Python package (the process is described in detail in the Python Packaging User Guide), there are a few mandatory files which must be added to the project:

  • pyproject.toml which tells the build tool how to build your package
  • setup.cfg which contains information about your package, such as name, author, dependencies, etc.

It is recommended to also include a README.md and a LICENSE file.
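As a minimal sketch – assuming setuptools as the build backend and with placeholder metadata – the two files could look like this:

# pyproject.toml
[build-system]
requires = ["setuptools>=42"]
build-backend = "setuptools.build_meta"

# setup.cfg
[metadata]
name = my_package
version = 0.1.0
author = Your Name
description = PC-side tools for our embedded communication protocol
long_description = file: README.md
long_description_content_type = text/markdown

[options]
package_dir =
    = src
packages = find:
python_requires = >=3.8

[options.packages.find]
where = src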

I will create the directory src/<my_package>/ where the source code will live. I will also add an __init__.py file to the directory to indicate that it is a package. The file will be left empty for now.

I will also add a docs folder for documentation and a tests folder for unit tests, and of course we have a venv folder for the virtual environment, which PyCharm created for us. What we end up with is a structure like this:

my_project
|-- docs/
|-- src/
|   |-- my_package/
|       |-- __init__.py
|-- tests/
|-- venv/
|-- pyproject.toml
|-- setup.cfg
|-- README.md
|-- LICENSE

In order to try out the package while developing it, we can install it as an editable package with pip install -e <path_to_package_root> inside the virtual environment. Now we can open up a Python console, type import my_package and test things out. When you make changes to the source code, just re-open the console and import the package again – no need to reinstall it.

Now that everything is set up, we can get started on the code.

Re-design and refactoring

As with any project that starts out with modest requirements and then grows over time, there comes a time when it is a good idea to stop, re-evaluate the design and refactor the code before it becomes too unwieldy. If this process is omitted, each new feature will take increasingly long to implement, until the code is so spaghetti’d up that implementing even a simple feature takes an inordinate amount of time – and you will probably risk breaking something in the process. I took some time to discuss the existing code with the original developers and together we came up with a list of improvements and design ideas.

The existing code consisted of a single Python file containing mostly procedural code and just a few (fairly large) classes. The first steps were to group related functionality together, determine areas where we should be able to extend the code in the future and generally take on a more object-oriented approach. Keeping the SOLID principles in mind is a good way to keep the code clean as you go along.

In the process of refactoring the code I made use of concepts such as abstract base classes, the factory design pattern and type hints, which I will describe below.

Abstract base class (or an “interface” class)

In our custom communication protocol, Device objects communicate with each other using Packet objects that consist of a Header and a Payload. The Header always contains the same type of information and is interpreted in the same way, but the interpretation of the Payload depends on the “payload type” indicated in the Header – and the protocol should be extendable with new user-defined payload types. When the user creates a Payload to send to a device, they know the specific type of Payload. The module responsible for transmitting Packets, however, does not care which specific type it is dealing with. It just needs a way to determine the length, get an array of bytes to transmit and perhaps a string representation it can print to the console. To make sure that all payload types conform to this common interface, we can create an abstract base class which all payload types must inherit from:

from abc import ABC, abstractmethod

class Payload(ABC):
  @abstractmethod
  def __len__(self):
    pass

  @abstractmethod
  def __str__(self):
    pass

  @abstractmethod
  def to_bytes(self):
    pass

To create a concrete implementation of the interface, we simply inherit from Payload:

class ConcretePayload(Payload):
  def __init__(self, data):
    self.data = data

  def __len__(self):
    return len(self.data)

  def __str__(self):
    return "I am a concrete payload"

  def to_bytes(self):
    return bytes(self.data)

If we forget to implement an @abstractmethod in the concrete implementation, Python will give us an error as soon as an instance is created – as opposed to raising a NotImplementedError in the base class, which will only throw an exception if the method is called.
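For example, if a subclass leaves out to_bytes(), instantiating it fails immediately (the exact error message varies between Python versions):

class IncompletePayload(Payload):
  def __len__(self):
    return 0

  def __str__(self):
    return "incomplete"

  # to_bytes() is missing

payload = IncompletePayload()
# TypeError: Can't instantiate abstract class IncompletePayload
# with abstract method to_bytes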

Using an abstract base class is not strictly necessary in Python since the language uses duck typing, where the philosophy is: “If it walks() like a Duck and quacks() like a Duck, it’s probably a Duck”. This means that if we expect to be able to call e.g. a to_bytes() method on the object we pass into the transmit function, any object that has a to_bytes() method will work – the type does not matter.

However, I think creating an abstract base class is a good practice as it makes it very clear which methods are required and will make sure the developer actually implements them in the subclass.

Factory design pattern

Continuing the example from above, when receiving data from a device, I needed a way to create the correct Payload subclass depending on the payload type in the Header. The solution I chose for this was to create a factory function that takes the raw byte data and the payload type as arguments and then returns the correct Payload subclass:

import array

def create_payload(data: array.array, payload_type: int) -> Payload:
  if payload_type in payload_subclass_map:
    return payload_subclass_map[payload_type](data)
  raise ValueError(f"Unknown payload type: {payload_type}")

The payload_subclass_map is simply a dictionary that maps payload_type to a specific subclass of Payload. Now, whenever we need to add a new payload type, we simply implement a new Payload subclass and add it to the payload_subclass_map.
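For illustration, the map could be as simple as this (the numeric type codes and the second subclass are hypothetical):

payload_subclass_map: dict[int, type[Payload]] = {
  0x01: ConcretePayload,
  0x02: StatusPayload,  # hypothetical additional payload type
}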

Type hints and circular imports

Since I usually do most of my programming in statically typed languages such as C and C++, Python’s dynamic typing can make me a bit uncomfortable at times. I like being able to tell a function’s parameter types directly from its definition. Luckily, Python supports type hints which allow us to indicate the type for variables, objects, parameters and return values like so:

a: int = 42
b: float = 3.14
c: str = "Hello, World!"

def is_even(value: int) -> bool:
  return (value % 2 == 0)

It is important to remember that these are only hints, meaning that Python will not enforce type annotations and will happily reassign a variable from an int to a boolean. Type hinting does, however, allow us to perform static type checking (e.g. with mypy), make it easier to follow objects passed into functions and methods, and nudge the code completion tool in the right direction.
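As a quick illustration, calling is_even() from above with a string would only fail at runtime, but mypy flags it before the code is ever run (the exact message depends on the mypy version):

result = is_even("42")
# mypy: Argument 1 to "is_even" has incompatible type "str"; expected "int"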

One problem I have run into using type hints is circular imports. Say we have a class Foo with a helper class Bar in two separate modules. Foo imports Bar and creates an instance of Bar by passing a reference to itself in the constructor:

# foo.py
from bar import Bar

class Foo:
  def __init__(self):
    pass

  def create_bar(self) -> Bar:
    return Bar(self)

  def __str__(self) -> str:
    return "Foo"

If we write bar.py without type hints and run main.py, everything will work just fine:

# bar.py
class Bar:
  def __init__(self, foo):
    print(f"I am helping {foo}")

# main.py
from foo import Foo

my_foo = Foo()
my_bar = my_foo.create_bar()

Running main.py prints:

I am helping Foo

However, if we specify that our foo parameter is of the type Foo – and thus need to import Foo – we get a circular import error:

# bar.py
from foo import Foo

class Bar:
  def __init__(self, foo: Foo):
    print(f"I am helping {foo}")

ImportError: cannot import name 'Foo' from partially initialized module 'foo' (most likely due to a circular import)

Since we are only importing Foo in order to perform static type checking, we can use the typing package along with postponed evaluation of annotations (introduced in Python 3.7) to ensure that the import only happens when we are running our static type checking tool. To avoid having to enter type hints as strings, we can use from __future__ import annotations and just enter type hints as usual:

# bar.py
from __future__ import annotations
import typing

if typing.TYPE_CHECKING:
  from foo import Foo

class Bar:
  def __init__(self, foo: Foo):
    print(f"I am helping {foo}")

And now the program runs as expected. We can now let mypy analyze the source code and check for type errors.

Testing, type checking and linting

Unit testing

For writing unit tests I will be using the unittest framework, which is part of the standard library. To create a unit test, simply import the unittest package, create a class that inherits from unittest.TestCase (which will serve as a “test group”) and then implement each test as a method in that class. If the module is invoked directly, call unittest.main() to start the test runner. When using PyCharm, this boilerplate is automatically generated when you create a new Python unit test file:

import unittest

class MyTestCase(unittest.TestCase):
  def test_something(self):
    self.assertEqual(True, False)  # add assertion here

if __name__ == '__main__':
  unittest.main()
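As a more concrete example, a test for the ConcretePayload class from earlier might look like this (the import path is hypothetical and depends on the package layout):

import unittest

from my_package.payload import ConcretePayload  # hypothetical module path

class TestConcretePayload(unittest.TestCase):
  def test_len_matches_data_length(self):
    payload = ConcretePayload([1, 2, 3])
    self.assertEqual(3, len(payload))

  def test_to_bytes_returns_raw_data(self):
    payload = ConcretePayload([1, 2, 3])
    self.assertEqual(bytes([1, 2, 3]), payload.to_bytes())

if __name__ == '__main__':
  unittest.main()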

As the project grows we will have several test files in the tests folder. In order for unittest to automatically discover all the tests in the folder, we will prefix all the test files with test_ (e.g. test_something.py) and add an empty __init__.py file. The folder will now look like this:

|--tests/
|  |-- __init__.py
|  |-- test_something.py

We can now discover and execute all tests in the tests folder by running the following command from the command line:

$ python -m unittest discover -s tests

Static type checking

Since we have added type hints to our code, we can run a static type check with a tool such as mypy. I installed the package with pip install mypy and can now run the static type check recursively on the src folder with:

$ mypy src

You can configure mypy by invoking it with command line arguments or by adding a [mypy] section with your configuration in either setup.cfg or a separate mypy.ini.
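As a starting point, a [mypy] section in setup.cfg might look like this (the options shown are just examples, not a recommendation):

[mypy]
python_version = 3.10
disallow_untyped_defs = True
warn_return_any = True
warn_unused_ignores = True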

Linting

Apart from static type checking, I’ll also do static code analysis with pylint. As described in the documentation, it “checks for errors, enforces a coding standard, looks for code smells, and can make suggestions about how the code could be refactored”.

Basically, it will keep your code a bit cleaner. By default it is very pedantic and you probably want to configure it by creating a .pylintrc configuration file specifying exactly which errors, warnings, etc. you are interested in.
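You can generate a complete configuration file with pylint --generate-rcfile and trim it down; a minimal hand-written .pylintrc might look like this (the disabled checks are just examples):

[MESSAGES CONTROL]
disable =
    missing-module-docstring,
    too-few-public-methods

[FORMAT]
max-line-length = 120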

Automating it all with Jenkins

When writing code I tend to run both the unit test suite and mypy/pylint quite regularly – and (almost) always before pushing to the Git repository. However, to ensure that the tests and analyses are run on every commit, I added a job for the project on our Jenkins build server.

I created a Jenkinsfile at the root of the repository in which I defined a declarative pipeline with the following stages:

  • Do a clean checkout from Git
  • Set up virtual environment and install requirements
  • Perform static type check (mypy)
  • Perform static code analysis/linting (pylint)
  • Run unit tests
  • Build documentation
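In declarative pipeline syntax, a minimal sketch of such a Jenkinsfile might look like this (the commands assume a Linux build agent, and the Git checkout itself is handled by the job configuration):

pipeline {
  agent any
  stages {
    stage('Setup') {
      steps {
        sh 'python -m venv venv'
        sh './venv/bin/pip install -r requirements.txt'
      }
    }
    stage('Type check') {
      steps {
        sh './venv/bin/mypy src'
      }
    }
    stage('Lint') {
      steps {
        sh './venv/bin/pylint src'
      }
    }
    stage('Unit tests') {
      steps {
        sh './venv/bin/python -m unittest discover -s tests'
      }
    }
    stage('Docs') {
      steps {
        sh './venv/bin/sphinx-build -b html docs/source docs/build/html'
      }
    }
  }
}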

The build server polls the Git repository once in a while and executes the pipeline if it detects any new commits. The build results are published to our Microsoft Teams group, so we are immediately made aware if a commit breaks the build.

Documentation

For documentation I used Sphinx, which is undeniably the most widely used documentation tool for Python. Plain text source files are written in the markup language reStructuredText (or reST), which can then be compiled into pretty output in various formats such as HTML or PDF. The documentation can be written both as docstrings directly in the Python files and as separate .rst files.

In case we want to host the documentation online, this can be done for free at ReadTheDocs.org. They also provide a Sphinx theme that I like a lot better than the default one, so I will be using it for this project.

To install Sphinx and the ReadTheDocs theme, I open up a terminal in the virtual environment and run:

$ pip install sphinx sphinx-rtd-theme

Now the basic folder structure, configuration files and build files can be created in the docs folder by running:

$ sphinx-quickstart docs

The docs folder now looks like this:

|-- docs/
|   |-- build/
|   |-- source/
|   |   |-- conf.py
|   |   |-- index.rst
|   |-- make.bat
|   |-- Makefile

The source folder contains a configuration file conf.py and a single reST file index.rst, which serves as the entry point for our documentation. The build folder will contain the compiled output, which is generated by running the batch script make.bat (or the Makefile on Linux/macOS).

To generate HTML output we can run make.bat html from the command line, after which we will see that the build directory has been populated. We can open build/html/index.html to view the documentation.

To use the ReadTheDocs theme instead of the default theme, I’ll add the following to conf.py:

import sphinx_rtd_theme

extensions = [
  ...
  'sphinx_rtd_theme',
]

html_theme = "sphinx_rtd_theme"

In order to generate documentation from docstrings in our Python source code, we must also enable the autodoc extension and tell Sphinx where our Python package is located. At the top of conf.py in the “Path setup” section, make sure that these lines are uncommented and that the path points to the directory containing the package – with the src layout used here, that is two levels up from conf.py and into src:

import os
import sys
sys.path.insert(0, os.path.abspath('../../src'))

Lastly, add autodoc to the list of extensions:

extensions = [
  ...

  'sphinx_rtd_theme',
  'sphinx.ext.autodoc'
]

Now, we can document classes, methods, functions, enums, etc. directly in the Python code using docstrings and include this documentation in the reST files where we find it appropriate using autodoc directives. I find that keeping most of the documentation in the code (where the developers are forced to look at it) increases the probability of it being maintained properly.
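For example, we might give the factory function from earlier a docstring using Sphinx’s field list syntax:

def create_payload(data: array.array, payload_type: int) -> Payload:
  """Create the correct Payload subclass for the given payload type.

  :param data: Raw payload bytes received from the device.
  :param payload_type: Payload type identifier from the packet header.
  :return: A concrete Payload instance.
  :raises ValueError: If the payload type is unknown.
  """
  ...

and then pull the whole module into the documentation from index.rst (the module path is hypothetical) with:

.. automodule:: my_package.payload
   :members: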

Building and publishing a package

Building the package

When we get to a point where the code is ready for release, we first check setup.cfg to make sure that all the package details (name, version, dependencies, etc.) are the way we want them – and remember to bump the version number for each release. We then install the build package with pip install build and run the build command from the root of our package directory:

$ python -m build

A distribution folder dist/ is created containing a source archive (.tar.gz) and a wheel (.whl). The wheel is what pip uses to install the package (i.e. pip install my_package.whl), while the source archive contains the source code, LICENSE, README.md, setup.cfg and pyproject.toml. The source archive does not contain docs, tests or any other folders – those are just for the developers.

We now have the files needed to publish the package.

Publishing the package

Whenever we install a Python package with pip, it searches for the package in the Python Package Index (PyPI) by default. You can also tell pip explicitly which index to use with the --index-url option. For example, to install a package from Test PyPI, which is a separate index created to let you play around with the distribution tools, we can run pip like this:

$ pip install --index-url https://test.pypi.org/simple/ my-package

To upload our package to the index we must first create an account. I already have an account for Test PyPI, so I will use that for now.

The tool used to publish the package is called twine and can be installed with pip install twine. Then publishing to Test PyPI is as simple as calling twine with the target repository and the path to the local distribution files:

$ python -m twine upload --repository testpypi dist/*

After entering our credentials, the package should be visible at https://test.pypi.org/project/<my_package> within a few minutes.

And that’s it! I learned a lot about Python through this process and hope this post has given you some ideas that you can use in your own projects.
