Python Application Notes: packages, modules and classes

To structure a Python project better, I decided to learn about packages, modules and classes. This article also shows how to import all modules and classes in a directory.

1. Packages, modules and classes

A package is a collection of modules (a module is simply a Python source file, *.py) that can expose classes, functions and global variables.

1.1 Packages

To create a package, put an __init__.py file (it can be empty) into a subdirectory (say lib/) so that Python treats lib as a package. For instance, a collection of frequently used modules is grouped into a package lib, as shown in the tree below.

$ tree -P '*.py' .
├── lib
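│   ├── __init__.py                   # treat this directory as a package
│   ├── analyzegraph.py               # a Python module
│   ├── autovivification.py
│   ├── connectioneventsgenerator.py
│   ├── debugtraces.py
│   ├── distancepoints.py
│   ├── dominatingsets.py
│   ├── messageeventsgenerator.py
│   ├── output.py
│   ├── plotgraph.py
│   ├── processdatasets.py
│   ├── readgtfs.py
│   ├── routetable.py
│   └── typeconverter.py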

With from package import item, the item can be either a submodule (or subpackage) of the package, or some other name defined in the package, such as a function, class or variable.

With import item.subitem.subsubitem, each item except for the last must be a package; the last item can be a module or a package, but it cannot be a class, function or variable defined in the previous item.

from item.subitem import subsubitem

# is equivalent to the following, except that the plain import also binds the name item
import item.subitem.subsubitem
subsubitem = item.subitem.subsubitem
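
To make the two import forms concrete, here is a minimal sketch that uses the standard-library packages xml and collections. They are chosen only for illustration and are not part of the lib package described above; the sketch assumes Python 3.

```python
import importlib

# Form 1: binds the top-level name "xml"; the submodule is reached by its full path.
import xml.etree.ElementTree
root_a = xml.etree.ElementTree.fromstring("<a><b/></a>")

# Form 2: binds only the last name, "ElementTree".
from xml.etree import ElementTree
root_b = ElementTree.fromstring("<a><b/></a>")

# Both forms refer to the same module object.
same_module = ElementTree is xml.etree.ElementTree

# The last item of a dotted import must be a module or a package.
# OrderedDict is a class defined in collections, so importing it this way fails.
try:
    importlib.import_module("collections.OrderedDict")
    class_import_failed = False
except ImportError:
    class_import_failed = True

print(same_module, class_import_failed)
```

A class such as OrderedDict has to be imported with the from form instead: from collections import OrderedDict.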

1.2 Modules

A module is simply a Python source file containing Python definitions and statements.

The built-in function dir([object]) returns a sorted list of names:

  • dir(): the names in the current scope (at module level, the names the module has defined so far); the names of built-in functions and variables are not listed.
  • dir(module): the names of the module's attributes.
  • dir(class): the names of the class's attributes and, recursively, of the attributes of its bases.
  • dir(any other object): the names of the object's attributes, of its class's attributes and, recursively, of the attributes of its class's base classes.

>>> from lib import *

>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'analyzegraph', 'autovivification', 'connectioneventsgenerator', 'debugtraces', 'distancepoints', 'dominatingsets', 'messageeventsgenerator', 'output', 'plotgraph', 'processdatasets', 'readgtfs', 'routetable', 'typeconverter']

# a module object
>>> dir(dominatingsets)
['DominatingSets', 'LABEL_CANDICATE_SETS', 'LABEL_CDS', 'LABEL_NAME', 'LABEL_NOT_CDS', 'OrderedDict', '__builtins__', '__doc__', '__file__', '__name__', '__package__', 'copy', 'nx', 'nxaa', 'plt']

# a class object
>>> dir(dominatingsets.DominatingSets)
['__doc__', '__module__', 'get_connected_dominating_sets_greedily',  'get_dominating_sets']
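
The session above depends on the lib package. As a self-contained sketch of the same behaviour, the following uses the standard-library module collections and two hypothetical classes, Base and Derived, that are not part of the original project. It assumes Python 3.

```python
import collections


class Base(object):
    base_attr = 1

    def base_method(self):
        return "base"


class Derived(Base):
    derived_attr = 2

    def __init__(self):
        self.instance_attr = 3


module_names = dir(collections)    # attributes of a module object
class_names = dir(Derived)         # class attributes, including inherited ones
instance_names = dir(Derived())    # instance, class and base-class attributes

print("OrderedDict" in module_names)         # a name exposed by the module
print("base_method" in class_names)          # inherited from Base
print("instance_attr" in class_names)        # set in __init__, so not on the class
print("instance_attr" in instance_names)     # present on the instance
print(module_names == sorted(module_names))  # dir() returns a sorted list
```

The instance listing is a superset of the class listing, because instance attributes are added on top of everything the class and its bases provide.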

The contents of lib/dominatingsets.py:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import matplotlib.pyplot as plt
import networkx as nx
import networkx.algorithms.approximation as nxaa
from collections import OrderedDict
import copy

LABEL_NAME = 'node_label_cds'

class DominatingSets:
    def get_dominating_sets(cls, G, weight=None):
        pass  # method body omitted

    def get_connected_dominating_sets_greedily(cls, G, weight='degree'):
        pass  # method body omitted

1.3 Classes

Python’s class mechanism is a mixture of the class mechanisms found in C++ and Modula-3.

class MyClass:
    class_variable = 'class variable'       # class variable shared by all instances

    def __init__(self, name):
        self.name = name                    # instance variable unique to each instance
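
The difference between the two kinds of variables shows up when they are changed. The following is a small sketch built on the same MyClass; the instance names a and b and the string values are made up for the example.

```python
class MyClass(object):
    class_variable = 'class variable'   # shared by all instances

    def __init__(self, name):
        self.name = name                # unique to each instance


a = MyClass('first')
b = MyClass('second')

# Rebinding the class attribute is visible through every instance.
MyClass.class_variable = 'changed'
shared = (a.class_variable, b.class_variable)

# Assigning through an instance creates an instance attribute that shadows
# the class attribute for that instance only.
a.class_variable = 'only on a'
after_shadow = (a.class_variable, b.class_variable)

print(a.name, b.name)
print(shared)
print(after_shadow)
```

After the last assignment, b still sees the value stored on the class, while a sees its own copy.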

1.4 Load all modules

I want from lib import * to load all modules under lib into the importing script.

#!/usr/bin/env python 

from lib import *       # load all modules

dominating = dominatingsets.DominatingSets()

When from lib import * runs, lib/__init__.py is executed implicitly, and the modules named in the list __all__ defined in lib/__init__.py are imported into the current namespace. Therefore, __all__ must list every module to be loaded. The contents of lib/__init__.py are as follows.

from os.path import dirname, basename, isfile
import glob

modules = glob.glob(dirname(__file__) + "/*.py")
__all__ = [basename(f)[:-3] for f in modules if isfile(f) and not basename(f).startswith('__')] # exclude __init__.py
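
To try this out without touching a real project, the sketch below writes a throwaway package into a temporary directory and runs a star import against it. The package name demo_lib and the modules alpha and beta are hypothetical; the __init__.py source is the snippet above. It assumes Python 3.

```python
import os
import shutil
import sys
import tempfile
import textwrap

# Hypothetical package name, used only for this demonstration.
PACKAGE = "demo_lib"
INIT_SOURCE = textwrap.dedent('''
    from os.path import dirname, basename, isfile
    import glob

    modules = glob.glob(dirname(__file__) + "/*.py")
    __all__ = [basename(f)[:-3] for f in modules
               if isfile(f) and not basename(f).startswith('__')]
''')

root = tempfile.mkdtemp()
package_dir = os.path.join(root, PACKAGE)
os.mkdir(package_dir)

with open(os.path.join(package_dir, "__init__.py"), "w") as f:
    f.write(INIT_SOURCE)
for name in ("alpha", "beta"):
    with open(os.path.join(package_dir, name + ".py"), "w") as f:
        f.write("VALUE = %r\n" % name)

sys.path.insert(0, root)    # make the temporary package importable

# A star import is only allowed at module level, so run it in its own namespace.
namespace = {}
exec("from %s import *" % PACKAGE, namespace)

loaded = sorted(n for n in namespace if not n.startswith("__"))
print(loaded)
print(namespace["alpha"].VALUE, namespace["beta"].VALUE)

shutil.rmtree(root, ignore_errors=True)    # remove the temporary files
```

Only the names listed in __all__ reach the importing namespace; helper names such as glob and modules stay inside the package.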

1.5 Load all classes

In section 1.4, from lib import * imports the modules, but not the classes. Therefore, to access a class, the class must be prefixed with the module name and a dot, e.g., dominatingsets.DominatingSets() instead of DominatingSets(). Reference [4] suggests the following code for lib/__init__.py to import all classes as well (PS: it doesn't work for me).

import os, sys

path = os.path.dirname(os.path.abspath(__file__))
for py in [f[:-3] for f in os.listdir(path) if f.endswith('.py') and f != '__init__.py']:
    mod = __import__('.'.join([__name__, py]), fromlist=[py])
    classes = [getattr(mod, x) for x in dir(mod) if isinstance(getattr(mod, x), type)]
    for cls in classes:
        setattr(sys.modules[__name__], cls.__name__, cls)
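
One possible reason the snippet fails: under Python 2, a class declared as class DominatingSets: is an old-style class, and old-style classes are not instances of type, so the isinstance test skips them. This is a guess, not something confirmed against the original project. The sketch below is an alternative, not the method from [4]: it uses inspect.isclass and pkgutil.iter_modules and runs against a throwaway package. The names demo_classes, shapes, Circle and Square are hypothetical, and Python 3 is assumed.

```python
import inspect
import os
import shutil
import sys
import tempfile
import textwrap

PACKAGE = "demo_classes"    # hypothetical package name

# __init__.py: import every submodule and re-export the classes it defines.
INIT_SOURCE = textwrap.dedent('''
    import importlib
    import inspect
    import pkgutil

    for _finder, _name, _is_pkg in pkgutil.iter_modules(__path__):
        _module = importlib.import_module(__name__ + "." + _name)
        for _attr, _value in inspect.getmembers(_module, inspect.isclass):
            # Skip classes the module merely imported from elsewhere.
            if _value.__module__ == _module.__name__:
                globals()[_attr] = _value
''')

MODULE_SOURCE = textwrap.dedent('''
    from collections import OrderedDict

    class Circle(object):
        pass

    class Square(object):
        pass
''')

root = tempfile.mkdtemp()
package_dir = os.path.join(root, PACKAGE)
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as f:
    f.write(INIT_SOURCE)
with open(os.path.join(package_dir, "shapes.py"), "w") as f:
    f.write(MODULE_SOURCE)

sys.path.insert(0, root)    # make the temporary package importable

# A star import is only allowed at module level, so run it in its own namespace.
namespace = {}
exec("from %s import *" % PACKAGE, namespace)

exported = sorted(n for n, v in namespace.items() if inspect.isclass(v))
print(exported)

shutil.rmtree(root, ignore_errors=True)    # remove the temporary files
```

With this __init__.py, Circle() can be used directly after the star import, without the shapes. prefix. The __module__ check keeps re-imported classes such as OrderedDict out of the package namespace.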

2. Naming conventions

As described in PEP 8 (Style Guide for Python Code),

Package and Module Names

  • Packages should have short, all-lowercase names, although the use of underscores is discouraged.
  • Modules should also have short, all-lowercase names. Underscores can be used in the module name if it improves readability.
  • Classes should normally use the CapWords convention.

PS: When an extension module written in C or C++ has an accompanying Python module that provides a higher level (e.g. more object oriented) interface, the C/C++ module has a leading underscore (e.g. _socket).

[1] StackOverflow: How do I load all modules under a subdirectory in Python?
[2] PEP 8, Style Guide for Python Code: Package and Module Names
[3] StackOverflow: Loading all modules in a folder in Python
[4] StackOverflow: Import all classes in directory?
[5] Python Guide: Structuring Your Project

