Python Packaging and Software Supply Chain

The Python language and its associated open-source package ecosystem are today central to many numerically-oriented engineering, finance and business systems. For example, Python-based deep neural network frameworks such as PyTorch and TensorFlow are the building blocks of most neural network applications, while packages such as scipy and pandas are de-facto standards in many areas of engineering, science, financial analysis, econometrics and other fields.

While Python packaging makes it relatively easy to get going, a number of clients have approached us recently for advice on more advanced packaging issues:

  1. Re-building Python and all required packages directly from their source code, to obtain full provenance information and fully secure the software supply chain
  2. Automatic build/test infrastructure for cloud deployments
  3. Building Python packages for the Microsoft Windows ecosystem, and deployment advice for it
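
As an illustration of the provenance aspect of item 1, one common building block is to pin a cryptographic digest for every source archive and to refuse to build from an archive that does not match. The following is a minimal sketch only, not a description of any particular build system; the function names are hypothetical, and it assumes the expected SHA-256 digest has been recorded separately from a trusted source.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large source archives need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_source_archive(path, expected_sha256):
    """Return True only if the archive matches the pinned digest.

    `expected_sha256` is assumed to come from a trusted, separately
    maintained record (hypothetical here), not from the download itself.
    """
    return sha256_of_file(path) == expected_sha256.lower()
```

A build script could call `verify_source_archive` on each downloaded source archive before unpacking it and stop if the check fails. pip offers a related built-in facility through hash-pinned requirements files and its `--require-hashes` option.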

Security of Python packages

Python packages are largely open-source software, supplied without any contractual responsibility for their security, performance or functionality. They are an incredibly useful resource and, in the vast majority of cases, safe and secure. However, if you use them, you alone (as a person or a business) are responsible for the outcomes.

We specialise in reducing the risk of Python packages that ship pre-compiled binary code (Python “wheels”, as used by e.g. numpy, scipy, pytorch, tensorflow and pandas) by maintaining a full software supply chain, that is, by recompiling these packages and all of their dependencies (including transitive dependencies) directly from source code.
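
A useful first step is to find out which packages in an existing environment actually ship pre-compiled code. The sketch below uses the standard-library `importlib.metadata` module to list installed distributions that contain files with typical native-code suffixes. The suffix list and the function names are assumptions made for illustration, and the check is a heuristic: it relies on the file listing recorded at install time, which some distributions do not provide.

```python
from importlib import metadata
from pathlib import PurePosixPath

# Suffixes that usually indicate pre-compiled native code (an assumed,
# non-exhaustive list for illustration).
BINARY_SUFFIXES = {".so", ".pyd", ".dll", ".dylib"}


def has_binary_suffix(filename):
    """Heuristic: True if any suffix of the file name looks like native code.

    Checking all suffixes also catches versioned libraries such as
    'libfoo.so.5' and tagged extensions such as 'mod.cp311-win_amd64.pyd'.
    """
    suffixes = {s.lower() for s in PurePosixPath(str(filename)).suffixes}
    return bool(suffixes & BINARY_SUFFIXES)


def distributions_with_binaries():
    """Map each installed distribution name to the binary files it ships."""
    found = {}
    for dist in metadata.distributions():
        # dist.files is None when no file listing was recorded at install time.
        files = dist.files or []
        binaries = [str(f) for f in files if has_binary_suffix(f)]
        if binaries:
            found[dist.metadata["Name"]] = binaries
    return found


if __name__ == "__main__":
    for name, binaries in sorted(distributions_with_binaries().items(), key=lambda kv: str(kv[0])):
        print(f"{name}: {len(binaries)} binary file(s)")
```

Packages reported by such a scan are the candidates for rebuilding from source. With pip, installation from source distributions rather than wheels can be requested with the `--no-binary :all:` option, although this alone does not rebuild the non-Python libraries that those packages depend on.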

Besides reducing security risks, this approach allows in-depth debugging and performance optimisation, and makes it much easier to make custom modifications and improvements to the packages.

If you need advice or solutions, please contact us at webs@bnikolic.co.uk.