Documentation and Comments: Best Practices
Introduction
In this section, we outline best practices to enhance code readability, maintainability, and collaboration among developers.
Good documentation and well-placed comments are cornerstones of effective programming: they make a codebase more understandable for others and for the original authors when they return to it after some time.
Why Documentation and Comments Matter
There are several reasons why documentation and comments matter:
- Code Readability: Comments and documentation make the code easier to understand by providing context and explaining why certain decisions were made in the first place.
- Maintenance: Well-documented code is easier to debug, update, or enhance because the original intentions and functionalities are clearly described.
- Collaboration: In teams, documentation ensures that everyone shares an understanding of the codebase, facilitating smoother collaboration and knowledge transfer.
In terms of best practices, the following should be considered when writing comments and documentation:
- Clarity and Brevity: comments and documentation should be kept clear and to the point. Unnecessary information that can clutter the code should be skipped.
- Relevance: comments should be relevant and provide additional insight that is not immediately obvious from the code itself.
- Maintenance: comments and documentation should be updated as the code evolves so that they remain accurate.
- Consistency: a consistent style for comments and documentation should be used throughout the codebase to improve readability.
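The relevance guideline can be illustrated with a small sketch. The function names and the discount rule below are hypothetical, chosen only to contrast a comment that restates the code with one that explains why the code exists.

```python
def apply_discount_redundant(price):
    # Multiply price by 0.9 (restates the code and adds no insight)
    return price * 0.9


def apply_discount_explained(price):
    # Loyalty customers receive 10% off (hypothetical business rule);
    # the comment records the reason, which the code alone cannot show.
    return price * 0.9
```

Both functions behave identically; only the second comment tells a future reader something the code does not already say.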
Types of Comments and Documentation in Python
There are two types of comments in Python: single-line comments and multi-line comments. Multi-line strings that appear as the first statement in a package, module, function, class, or method definition are considered docstrings.
Single-Line Comments
Single-line comments start with a # symbol and are used for brief explanations or annotations.
x = 5  # Inline comment explaining the use of variable x
Multi-Line Comments
Python does not have a specific syntax for multi-line comments. However, developers use consecutive single-line comments or multi-line strings with triple quotes for this purpose.
'''
This is a multi-line comment in Python
and can be used to explain more complex logic or operations.
'''
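The other option mentioned here, consecutive single-line comments, can be sketched as follows. The variable name total is only illustrative. The sketch also marks a subtlety: a bare triple-quoted string is technically a string expression rather than a true comment, even though it is often used like one.

```python
# Consecutive single-line comments form a block comment.
# Every line starts with "#", so the interpreter skips all of them.
total = sum([1, 2, 3])

"""
This triple-quoted text is a string literal used as a statement.
It has no effect on the program, but it is not comment syntax.
"""
print(total)  # prints 6
```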
Docstrings
Docstrings are string literals appearing as the first statement in a package, module, function, class, or method definition. They provide a convenient way of associating documentation with Python modules, functions, classes, and methods. Unlike single-line comments, docstrings are intended to serve as documentation for the code, and their content is parsed by tools like Sphinx to generate documentation. A docstring can be retrieved by running help() on the function or class that has been documented.
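As a minimal sketch of how a docstring is retrieved at runtime, the example below defines a hypothetical function named square and reads its docstring in three ways: through the __doc__ attribute, through inspect.getdoc() from the standard library, and through help().

```python
import inspect


def square(number):
    """Return the square of a number."""
    return number * number


print(square.__doc__)          # prints: Return the square of a number.
print(inspect.getdoc(square))  # same text, with indentation cleaned up
help(square)                   # prints the signature followed by the docstring
```

The __doc__ attribute holds the raw string, while inspect.getdoc() is usually the more convenient choice for multi-line docstrings because it removes the leading indentation.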
Function Docstring
In the following example, a docstring is used to describe the function’s purpose.
def greet(name):
    """
    Greet the person passed in as a parameter.
    """
    print("Hello, " + name + "!")
Now let’s look at two examples that contrast a good and a less ideal way of documenting a function. Similar best practices apply to documenting classes and methods.
def calculate_area(base, height):
    """
    Calculate the area of a triangle.

    Parameters:
    - base: The base of the triangle.
    - height: The height of the triangle.

    Returns:
    The area of the triangle, which is calculated using the formula (base * height) / 2.
    """
    return (base * height) / 2
The above example shows good practice for documenting a function. A docstring is used to describe the function’s purpose, parameters, and return value.
The following example shows bad practice for documenting a function. The parameters are not documented, and single-line comments are used to describe the function’s purpose and return value. Such comments are not parsed by tools like Sphinx, so they are not visible in the generated documentation and cannot be retrieved by help().
def calculate_area(base, height):
    # This function calculates the area of a triangle
    return (base * height) / 2  # Area formula
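The difference between the two styles can be checked directly. In the sketch below, the names documented and commented are hypothetical stand-ins for the two versions of calculate_area; only the version with a docstring exposes any text through __doc__.

```python
def documented(base, height):
    """Calculate the area of a triangle."""
    return (base * height) / 2


def commented(base, height):
    # This function calculates the area of a triangle
    return (base * height) / 2


print(documented.__doc__)  # prints: Calculate the area of a triangle.
print(commented.__doc__)   # prints: None
```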
Class Docstring
The docstring for a class should summarize its behavior and list the public methods and instance variables. If the class is intended to be subclassed and has an additional interface for subclasses, this interface should be listed separately in the docstring. The class constructor should be documented in the docstring of its __init__ method. Individual methods should be documented by their own docstrings. Since classes are covered in a dedicated section, a full example is omitted here for the sake of brevity.
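For orientation only, the conventions just described might look like the brief sketch below. The Rectangle class, its attributes, and its area method are hypothetical and serve purely as an illustration of where each docstring goes.

```python
class Rectangle:
    """
    Represent an axis-aligned rectangle.

    Public methods:
        area(): Return the area of the rectangle.

    Instance variables:
        width: Horizontal extent of the rectangle.
        height: Vertical extent of the rectangle.
    """

    def __init__(self, width, height):
        """Create a rectangle from its width and height."""
        self.width = width
        self.height = height

    def area(self):
        """Return the area, computed as width * height."""
        return self.width * self.height
```

The class docstring gives the overview, while the constructor and each method carry their own short docstring.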
Module Docstring
The docstring for a module should generally list the classes, exceptions, and functions (and any other objects) that are exported by the module, with a one-line summary of each. For example, if we had a module named data_processing that includes the functions clean_text, extract_dates, and normalize_values and the class DataCleaner, the docstring for the module would appear at the top of the file as follows:
"""
Data Processing Utilities
=========================

This module provides functions for processing and transforming data
commonly used in data analysis workflows.

Functions:
----------
clean_text(text: str) -> str
    Removes special characters and normalizes whitespace in text data.
extract_dates(text: str) -> list[datetime]
    Identifies and extracts dates from text in various formats.
normalize_values(data: list[float], method: str = 'minmax') -> list[float]
    Normalizes numerical values using specified scaling method.

Classes:
--------
DataCleaner
    A class providing methods for data cleaning and transformation.

Exceptions:
-----------
ValidationError
    Raised when input data fails validation checks.

Notes:
------
All functions handle None values gracefully and raise appropriate exceptions
for invalid inputs. Unicode normalization is applied where relevant.
"""
Package Docstring
The docstring for a package (i.e., the docstring of the package’s __init__.py module) should also list the modules and subpackages exported by the package. For example, if we had a package named data_processing including the module data_processing.py and the subpackage data_processing.preprocessing, the docstring for the package would be:
"""
Data Processing Utilities
=========================

This package provides modules for data processing and transformation.

Modules:
--------
data_processing
    Main module for data processing utilities.

Subpackages:
------------
preprocessing
    Contains modules for data preprocessing and cleaning.
"""