About this Resource

This repository is meant to serve as an opinionated, pedagogical guide on software engineering best practices for those of us in machine learning.

What is this guide NOT?

It is NOT a comprehensive overview of best practices in software engineering. It is a highly opinionated sampling of tools and ideas that are important for writing good code in a machine learning project. This includes linting, formatting and testing.

Who is this guide for?

Machine learners who are not currently following software engineering best practices in their projects but would like to.

What is machine learning specific about this guide?

In truth, nothing important. We use Python because it is the most popular language in machine learning, and we write some machine learning specific tests. Otherwise, this guide could apply to (almost) any Python project.
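To give a flavour of what a "machine learning specific" test might look like, here is a minimal sketch in the style that pytest collects (plain functions whose names start with `test_`). The `softmax` helper and both test names are hypothetical and are not taken from this repository; they only illustrate checking a numerical property of a model component rather than an exact output.

```python
import math


def softmax(logits):
    """Convert a list of raw scores into probabilities (hypothetical helper)."""
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def test_softmax_outputs_are_probabilities():
    # Every output should lie in [0, 1] and the outputs should sum to 1.
    probs = softmax([2.0, 1.0, 0.1])
    assert all(0.0 <= p <= 1.0 for p in probs)
    assert math.isclose(sum(probs), 1.0)


def test_softmax_is_shift_invariant():
    # Adding the same constant to every logit should not change the result.
    base = softmax([2.0, 1.0, 0.1])
    shifted = softmax([102.0, 101.0, 100.1])
    assert all(math.isclose(a, b) for a, b in zip(base, shifted))
```

Tests like these check properties that should hold for any input, which tends to be more robust for numerical code than comparing against hard-coded floating point values.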

What tools will this guide cover?

  • Poetry, for managing virtual environments and package dependencies.
  • flake8, for linting.
  • black, for formatting.
  • pytest, for testing.
  • GitHub Actions, for continuous integration / continuous delivery (CI/CD).