Post

Where to put all your utils in Python projects?

This is a follow-up post to Stop naming your Python modules "utils". This time, let's see different options on organizing utility code.

What is utility code?

It is code that is created as a side effect when working on features but does not belong to where they are implemented. It still is necessary or convenient to use but can be easily generalized or reused in many different places in the code.

Few examples from the projects I've been working on:

  • a generator function that accepts an iterable and yields chunks of N elements
  • sanitiser function for phone numbers
  • convert XML to dict
  • format datetime objects in a way that will be accepted by some 3rd party provider*
  • get a random string of a given length

*that one is not typical util, it should be somewhere near the code responsible for communicating with that 3rd party provider.

Where can you put utils in Pythonic projects?

Next to feature code that uses it

Initially, this may be a good idea. However, eventually becomes awkward if you want to reuse that elsewhere. Let's say there is a utility function implemented next to Feature A and you want to use that in Feature B. Should the latter know about Feature A only because we would like to reuse some utility code? Nope.

Keep reading if you want to learn about better ways. That being said, if a utility is only used in a Feature A - deferring refactoring it out to another place will not do any harm.

In "utils.py"/"misc.py" etc

...πŸ‘Ώ No, do not put it there. Why? See why this is bad in a related article: Stop naming your Python modules "utils".

Under specific path in the project, grouped by themes

Throwing every random bit of utility code in a single file is a terrible idea. How about acknowledging the need of extending a standard library sometimes? And creating a dedicated space...? Let's say your tiny project is called foo and is organized as follows:

foo/
β”œβ”€β”€ api
β”‚Β Β  β”œβ”€β”€ __init__.py
β”‚Β Β  β”œβ”€β”€ authors.py
β”‚Β Β  └── books.py
β”œβ”€β”€ models
β”‚Β Β  β”œβ”€β”€ __init__.py
β”‚Β Β  β”œβ”€β”€ author.py
β”‚Β Β  └── book.py
└── settings.py

Then, you can add another subtree:

foo/
β”œβ”€β”€ api
β”‚Β Β  β”œβ”€β”€ __init__.py
β”‚Β Β  └── ...
β”œβ”€β”€ models
β”‚Β Β  β”œβ”€β”€ __init__.py
β”‚Β Β  └── ...
β”œβ”€β”€ settings.py
└── utils <---- Aw, there we go
    β”œβ”€β”€ datetimes.py
    β”œβ”€β”€ iteration.py
    └── isbn.py

Using this method, you can put your utility code in a dedicated namespace. For example, let's say we need a utility to validate ISBN. We could put it there so from foo.utils.isbn import validate_isbn.

In fact, that's how utility code is organized in a Django web framework, see these modules:

just to name a few.

This is the first approach that actually acknowledges the nature of utility code.

Avoid utils - create a shared core / foundation

Better yet, we could try to distil something like a shared core as understood by Tactical Domain-Driven Design. Structurally, it will look similar to the previous approach. First, we create a separate subpackage and call it shared_core or foundation. It will be home to a shared understanding of some concepts, like ISBN:

foo/
β”œβ”€β”€ api
β”‚Β Β  β”œβ”€β”€ __init__.py
β”‚Β Β  └── ...
β”œβ”€β”€ models
β”‚Β Β  β”œβ”€β”€ __init__.py
β”‚Β Β  └── ...
β”œβ”€β”€ settings.py
└── shared_kernel
    └── isbn.py

In fact, we could have a separate class (type!) for this concept:

class ISBN:
    _value: str

    def __init__(self, value: str) -> None:
        # check if value looks like ISBN - if not, raise ValueError
        self._value = value

    def __repr__(self) -> str:
        return self._value

and get rid of procedural thinking that tempts us to validate str instances if they are valid ISBN. Just create an ISBN instance which will make sure it's valid and pass it around!

Separate packages uploaded to PyPI or a private package repository

Finally, you can notice utils can often be separate packages uploaded to PyPI. This is probably a perspective of JavaScript devs and the reason why npm registry swarms with really tiny libraries. (Second thing contributing to it may be that JS standard lib is very small compared to Python).

If for some reason you do not want or can not share the code with the world, why not set up a private PyPI in your company, so these libraries can be easily reused across projects...?

All in all, this often this code will be:

  • general,
  • independently (and easily!) testable,
  • reusable.

Summary

Sometimes a question "where shall I put that code" can be a burning one. The answer is pretty obvious in an organized codebase but does not come easy in a messier one. Let me know in the comments section what was the last time you wondered where should you put something in a project.

I hope that after reading that article you won't ever mistreat your Python utils again :) Last but not least - don't forget to look for libraries that already cover functionality you need instead of writing your own utility.

This post is licensed under CC BY 4.0 by the author.

Comments powered by Disqus.