python

Resources

| Resource | Organization | Type |
| --- | --- | --- |
| scientific python development | python | docs |
| Python Enhancement Proposals Index | python | docs |
| geeksforgeeks | python | docs |
| 可乐 | python | docs |
| pythonguide | pythonguide | docs |
| superfastpython | superfastpython | blog |

beginner

debug in vscode

Prerequisites

  • vscode

  • vscode plugins

    • ms-vscode-remote.vscode-remote-extensionpack
  • the running service exposes an attach endpoint (debugpy)

Debugging uses the configuration templates provided by the Python extension; the common ones are file-based (file) or module-based (module). See the extension docs and the VS Code predefined-variables reference for the meaning of each field.

Example of a file-based template:

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}", // the current opened file
            "console": "integratedTerminal",
            "env": {},
            "args": [],
            "cwd": "${fileWorkspaceFolder}", // the current opened file's workspace folder
            "justMyCode": false
        }
    ]
}

Steps

  • vscode->Remote Explorer->SSH Targets: connect to the host where the service to debug runs, via a config like:

Host myhost
HostName 192.168.1.156
User root
Port 22
  • vscode->Remote Explorer->Containers lists the containers running on the host. Right-click one and choose Attach to Container; vscode enters the container automatically:


  • install the debugging language extension ms-python.python

  • open the Run and Debug view, create a Python debug configuration from a template, and choose the Attach template "Attach using Process Id";
    this generates .vscode/launch.json



  • add the port matching the service to debug to launch.json. The port can be found in the service config configs\default_settings.yaml as attach_port: 12345 # default attach debug port of the service, or read the latest port from the environment variable CONF_ATTACH_PORT:

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Attach using Process Id",
            "type": "python",
            "request": "attach",
            "processId": "${command:pickProcess}",
            "justMyCode": true,
            "port": 12345
        }
    ]
}
  • select the created debug configuration, start the attach session, and pick the pid of the service process to debug. Note that each process requires its own debug session.


links:

debug in GDB

Prerequisites

  • use a gdb environment that auto-loads the helper script matching the Python version

Common commands

GDB debugs a Python process at the level of its C implementation, so the Python source is not directly visible; the following commands help:

[!TIP]
Debugging commands with Gdb

  • py-bt: print the Python call stack

  • py-bt-full: print the full Python call stack, including C frames

  • py-down: move one frame down the call stack

  • py-list: show the source code

  • py-locals: print the local variables

  • py-print: print the value of a variable

  • py-up: move one frame up the call stack

links:

debug in vs

Mixed-mode Python/C++ debugging with Visual Studio on Windows.

links:

Tricks

Data formatting

Keep two decimal places

# string formatting
def float_to_str(f: float) -> str:
    return f"{f:.2f}"


# round()
num = 3.1415926
result = round(num, 2)

from decimal import Decimal

# Decimal
num = Decimal("3.1415926")
result = str(num.quantize(Decimal("0.00")))

prefer dataclass

Data classes are usually created with Python's built-in dataclass.

It was introduced in Python 3.7 to simplify creating and working with data classes. A data class is a class used to store data, typically containing attributes but little behavior.

The dataclass decorator automatically generates common methods such as __init__, __repr__, and __eq__, so you can create and use data classes without writing that boilerplate by hand.

links:

An example using dataclass:

from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int
    profession: str


# create a Person object
person = Person("John Doe", 30, "Engineer")

# print the string representation
print(person)  # Person(name='John Doe', age=30, profession='Engineer')

# compare two objects for equality
person2 = Person("John Doe", 30, "Engineer")
print(person == person2)  # True

In the example above, the dataclass decorator creates a data class named Person with three attributes: name, age, and profession. The decorator automatically generates __init__, __repr__, and __eq__, making it easy to create and compare objects.

Besides the generated methods, dataclass accepts parameters to customize the class's behavior. For example, frozen=True makes instances immutable: attributes cannot be reassigned after creation.

In short, dataclass is a very useful tool that simplifies creating and working with data classes, letting you write less boilerplate.

WorkingDirContext

A context manager that restores the previous working directory

from __future__ import annotations

import os
from types import TracebackType


class WorkingDirContext:
    """Working directory context definition"""

    def __init__(self, path: str | os.PathLike[str]) -> None:
        self._prev_working_dir = os.getcwd()
        os.chdir(path)

    def __enter__(self) -> WorkingDirContext:
        return self

    def __exit__(
        self,
        exc_type: type[BaseException] | None,
        exc_value: BaseException | None,
        traceback: TracebackType | None,
    ) -> None:
        """Switch back to the previous directory"""
        os.chdir(self._prev_working_dir)
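A usage sketch of the context manager above (a self-contained copy, switching into a temporary directory and verifying the original directory is restored):

```python
import os
import tempfile


class WorkingDirContext:
    """Minimal copy of the context manager above, for a self-contained demo."""

    def __init__(self, path):
        self._prev_working_dir = os.getcwd()
        os.chdir(path)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        os.chdir(self._prev_working_dir)


before = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with WorkingDirContext(tmp):
        # inside the context the working directory is the temp dir
        assert os.path.samefile(os.getcwd(), tmp)
# the previous working directory is restored on exit
assert os.getcwd() == before
```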

Run directly from the command line

A script can be executed directly with python by passing the -c option. Note:

  • statements are separated with ;

python -c "import os;print(os.environ)"

Dynamically add attributes or methods

As a dynamic language, Python can add attributes and methods to objects at runtime, so an object can be extended during execution without modifying the source. See:
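A minimal sketch of adding an attribute and a bound method at runtime (the Greeter class and greet function are made-up names for illustration):

```python
import types


class Greeter:
    pass


obj = Greeter()

# add an attribute at runtime
setattr(obj, "name", "Alice")


# add a method at runtime by binding a plain function to the instance
def greet(self):
    return f"Hello, {self.name}"


obj.greet = types.MethodType(greet, obj)

print(obj.greet())  # Hello, Alice
```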

Get directories

Get the absolute path of the current file

from pathlib import Path

root = Path(__file__).resolve().parent

import os
root = os.path.dirname(os.path.abspath(__file__))

Detect debug mode

import sys


def is_debugging():
    return sys.gettrace() is not None
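A quick way to see the check flip: installing any trace function with sys.settrace (which is what debuggers do) makes is_debugging() return True:

```python
import sys


def is_debugging() -> bool:
    return sys.gettrace() is not None


# installing a trace function, as a debugger would, flips the check
sys.settrace(lambda frame, event, arg: None)
try:
    flagged = is_debugging()
finally:
    sys.settrace(None)  # remove the trace function again

assert flagged
```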

Timeout handling

  • based on signal

Python's signal module handles process signals; a timeout can be triggered through the SIGALRM signal.

import functools
import signal
import time


class TimeoutError(Exception):
    """Custom timeout exception"""

    def __init__(self, msg, frame):
        super().__init__()
        self.msg = msg

    def __str__(self) -> str:
        return self.msg


def time_out(interval: int, callback=None):
    """Send SIGALRM to the process after `interval` seconds; the alarm is
    cancelled if the function finishes within the allowed time.

    Args:
        interval (int): timeout in seconds
        callback (function, optional): timeout callback. Defaults to None.
    """

    def decorator(func):
        def handler(signum, frame):
            print(frame)
            raise TimeoutError(f"run func {func.__name__} timeout {interval}s", frame)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                signal.signal(signal.SIGALRM, handler)
                signal.alarm(interval)
                result = func(*args, **kwargs)
                signal.alarm(0)
                return result
            except TimeoutError as e:
                if callback:
                    callback(e)

        return wrapper

    return decorator


def timeout_callback(e):
    print(e.msg)


@time_out(2, timeout_callback)
def task1():
    print("task1 start")
    time.sleep(3)  #! times out
    print("task1 end")  # "task1 end" is never printed


@time_out(2, timeout_callback)
def task2():
    print("task2 start")
    time.sleep(1)
    print("task2 end")


if __name__ == "__main__":
    task1()
    task2()

  • based on threading or coroutines

import threading


def threading_time_out(interval, callback=None):
    def decorator(func):
        def wrapper(*args, **kwargs):
            t = threading.Thread(target=func, args=args, kwargs=kwargs)
            t.daemon = True  # daemon threads exit as soon as the main thread does
            t.start()
            t.join(interval)  # block the caller for at most `interval` seconds
            if t.is_alive() and callback:
                return threading.Timer(0, callback).start()  # run the callback immediately
            return None

        return wrapper

    return decorator


def gevent_time_out(interval, callback=None):
    def decorator(func):
        def wrapper(*args, **kwargs):
            # must be imported (and patched) before requests
            import gevent
            from gevent import monkey

            monkey.patch_all()

            try:
                gevent.with_timeout(interval, func, *args, **kwargs)
            except gevent.Timeout:
                if callback:
                    callback()

        return wrapper

    return decorator

Constants

Defining true constants in Python is difficult; here are a few common approaches.

  • override cls.__setattr__

Define a custom __setattr__ that rejects rebinding an existing attribute, applied through a class decorator.

def as_const(cls):
    """Turn a class into a constant holder.

    Example::

        from constant import as_const


        @as_const
        class _Platform:
            linux = "linux"
            windows = "windows"
            macosx = "macosx"


        @as_const
        class _Const:
            plat = _Platform()


        ct = _Const()
        pt = _Platform()

        print(ct.plat.linux)
        print(pt.linux)
    """

    class ConstError(TypeError):
        """Raised when a constant is rebound"""

    class ConstCaseError(ConstError):
        """Raised when a constant name is not upper-case"""

    def const_setter(self, key, value):
        if key in self.__dict__:
            raise ConstError(f"Can't change a const variable with name {key}")
        if not key.isupper():
            raise ConstCaseError(
                f"Const variable must be combined with upper letters: {key}"
            )

        self.__dict__[key] = value

    cls.__setattr__ = const_setter

    return cls
  • via dataclass

Decorate with dataclass and make it immutable through the frozen parameter.

from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorName:
    service: str = "ServiceError"
    drawing: str = "DrawingError"
    subblock: str = "SubBlockError"
    file: str = "FileError"


print(ErrorName.service)
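A sketch of what frozen=True buys: rebinding an attribute on an instance raises dataclasses.FrozenInstanceError (note the fields need type annotations to count as dataclass fields at all):

```python
import dataclasses
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorName:
    service: str = "ServiceError"


names = ErrorName()
try:
    names.service = "Other"  # assignment is rejected on a frozen dataclass
except dataclasses.FrozenInstanceError as exc:
    caught = exc

assert names.service == "ServiceError"
```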

Read configuration from environment variables

links:

Extend logging configuration

links:

Set a default value for input

def input_with_default(prompt, default=None):
    """If default is not None, it is shown in the prompt. input() then reads
    the user's input; if it is empty (the user just presses Enter), the
    default is returned, otherwise the input is returned."""
    if default is not None:
        prompt = f"{prompt} [{default}] "
    return input(prompt) or default


name = input_with_default("Please enter your name", "John")
print(f"Hello, {name}!")

static and class method

links:

metaclass

links:

duck typing

Duck typing is a programming style and philosophy in Python. When deciding whether an object is usable, it looks at the methods and attributes the object has rather than its concrete type.

Per the duck-typing idea, if something walks like a duck and quacks like a duck, it can be treated as a duck. In other words, as long as an object provides the required methods and attributes, code can use it without caring about its concrete type.

In Python you usually do not check an object's type explicitly; you simply call its methods or access its attributes. If the object provides them, the code runs. This flexibility is what makes Python a dynamically typed, duck-typed language.

A simple example showing the idea of duck typing:

class Duck:
    def walk(self):
        print("Duck is walking")

    def quack(self):
        print("Duck is quacking")


class Robot:
    def walk(self):
        print("Robot is walking")

    def quack(self):
        print("Robot cannot quack")


def make_object_walk(obj):
    obj.walk()


duck = Duck()
robot = Robot()

make_object_walk(duck)  # Duck is walking
make_object_walk(robot)  # Robot is walking

In the example above, both the Duck class and the Robot class define a walk method. The make_object_walk function accepts any object and calls its walk method; whether it receives a Duck or a Robot, the function works as long as the object provides walk.

That is duck typing: when writing code, focus on what an object can do rather than what it concretely is.
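Duck typing can also be made explicit with typing.Protocol (Python 3.8+): a runtime-checkable protocol checks for the presence of the required methods, not the inheritance tree. A minimal sketch (Walker, Duck, Rock are illustrative names):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Walker(Protocol):
    def walk(self) -> None: ...


class Duck:
    def walk(self) -> None:
        print("Duck is walking")


class Rock:
    pass


# isinstance() checks the *shape* (method names), not the class hierarchy
assert isinstance(Duck(), Walker)
assert not isinstance(Rock(), Walker)
```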

cpython development

CPython is the most popular distribution of Python, implemented on top of C/C++.

References

Debugging

Debugging CPython requires GDB, and GDB needs to load the helper script matching the Python version. Since Python 3.7, the scripts GDB needs live under Tools/gdb/* in the source tree.

Copy them into gdb's auto-load directory, following its path layout, so they are found when debugging a python3.7 process.

ls /usr/share/gdb/auto-load/usr/local/bin/

reference

pip

pip tool

Installing the pip tool on its own.

wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py
rm -rf get-pip.py ~/.cache/pip

Common pip commands

This section lists commonly used pip commands; see:

# install a package
pip install yolov5-utils[==version]
# uninstall a package
pip uninstall yolov5-utils
# show package info
pip show yolov5-utils
# export the environment's packages to requirements.txt
pip freeze > requirements.txt
# upgrade pip
pip install --upgrade pip

pip install -e

pip install -e installs a local Python package while keeping its source editable in place: changes take effect immediately, with no reinstall needed. This is called "development (editable) mode".

In development mode you can modify the source locally without reinstalling after every change, which is very useful for debugging and testing during development. The source also stays under version control, making maintenance and collaboration easy.

When installing with this command, the package root must contain a setup.py file describing the package's metadata and dependencies; pip reads it during installation.
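A minimal setup.py sketch for a hypothetical package mypkg (name and dependencies are placeholders); with this file in the package root, pip install -e . installs the package in editable mode:

```python
# setup.py -- hypothetical minimal package "mypkg"
from setuptools import find_packages, setup

setup(
    name="mypkg",
    version="0.0.1",
    packages=find_packages(),
    install_requires=["requests"],
)
```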

install pkg with pip

cmake skbuild install

  • cmake project integration with skbuild

pip install git+https://**.git --build-type Debug

install from git

# Hash (note: GitHub has disabled the git:// protocol; use git+https)
pip install git+https://github.com/aladagemre/django-notification.git@2927346f4c513a217ac8ad076e494dd1adbf70e1

# Branch
pip install git+https://github.com/aladagemre/django-notification.git@cool-feature-branch

# Tag
pip install git+https://github.com/aladagemre/django-notification.git@v2.1.0

reference

pip mirrors

  • global pip configuration from the command line

Source priority: sjtu > ngc.nvidia > tsinghua

pip3 config set global.index-url "https://pypi.tuna.tsinghua.edu.cn/simple/"
pip3 config set global.extra-index-url "https://pypi.ngc.nvidia.com https://mirror.sjtu.edu.cn/pypi/web/simple"
pip3 config set global.timeout 20
pip3 config set global.trusted-host "pypi.tuna.tsinghua.edu.cn pypi.ngc.nvidia.com mirror.sjtu.edu.cn"

Running these generates a config file — Windows: C:\Users\user\AppData\Roaming\pip, Linux: /home/user/.pip/

  • to create the file manually, as the user that installs packages, run:

cd ~/.pip

If the directory does not exist, create it first:

mkdir ~/.pip
cd ~/.pip

Edit the pip.conf file (e.g. open it with vi pip.conf) and write:

[global]
# Huawei mirror as an example; replace it as appropriate.
index-url = https://mirrors.huaweicloud.com/repository/pypi/simple
trusted-host = mirrors.huaweicloud.com
timeout = 120

reference

pip environment configuration

When installing Python packages, pip caches downloads in a local directory; the next time the same package is requested it is read from the cache instead of being downloaded from the index again, saving time.

  • configure via the command line

pip config set global.cache-dir <your-cache-dir>
  • configure via an environment variable

export PIP_CACHE_DIR=<your-cache-dir>

requirements.txt

Python package dependencies can be listed in requirements.txt files and installed with pip's -r option.

pip install -r ./requirements.txt -r /requirements.gpu.txt

Index URLs can be configured directly in the file, which makes it easy to split dependency sets flexibly. Below, the GPU dependencies together with their search indexes:

--find-links https://download.pytorch.org/whl/torch_stable.html
--extra-index-url https://pypi.ngc.nvidia.com
nvidia-pyindex==1.0.9
nvidia-tensorflow==1.15.5+nv22.7
nvidia-tensorboard==1.15.0+nv21.4
torch==1.8.0+cu111
torchvision==0.9.0+cu111
torchaudio==0.8.0

reference

pip offline package

Using pip to install pre-downloaded offline packages when a container starts

# requirements.gpu.txt
--extra-index-url https://pypi.ngc.nvidia.com
--find-links https://download.pytorch.org/whl/torch_stable.html
nvidia-pyindex==1.0.9
nvidia-tensorboard==1.15
nvidia-tensorflow[horovod]==1.15.5+nv22.11
torch==1.8.0+cu111
torchaudio==0.8.0
torchvision==0.9.0+cu111
# requirements.txt
numpy==1.23.5
requests==2.28.1
# ...
# syntax=docker/dockerfile:1.2

FROM python:3.8-bullseye as python_base

# config env remote source
RUN set -ex \
    # add multiple pip sources
    && pip3 config set global.index-url "https://pypi.tuna.tsinghua.edu.cn/simple/" \
    && pip3 config set global.extra-index-url "https://pypi.tuna.tsinghua.edu.cn/simple/ https://mirrors.cloud.tencent.com/pypi/simple" \
    && pip3 config set global.timeout 10 \
    && pip3 config set global.trusted-host "pypi.tuna.tsinghua.edu.cn mirrors.cloud.tencent.com" \
    # add new apt source
    && cp /etc/apt/sources.list /etc/apt/sources.list.bak \
    && sed -i '/^# deb / d;s|http://deb.debian.org/debian|http://mirrors.tencent.com/debian|g;/^deb http/s/$/ contrib non-free/;/-updates/h;s/-updates/-backports/;$G' /etc/apt/sources.list \
    && apt-get update && apt-get install -y apt-transport-https ca-certificates \
    && ln -s /lib/x86_64-linux-gnu/libtinfo.so.6 /lib/x86_64-linux-gnu/libtinfo.so.5 \
    # dumb-init is used to optimize start the service, see https://github.com/Yelp/dumb-init
    && pip3 install dumb-init \
    && rm -rf /var/lib/apt/lists/* /var/cache/* /var/log/* /tmp/* ~/.cache

ENV LC_ALL C.UTF-8

ENV LANG C.UTF-8


FROM python_base as python_pack_pkg

ENV PIP_CACHE_DIR /opt/deploy/.pip_cache

COPY ./requirements.* /opt/deploy/requirements/

# docker buildx cache
RUN --mount=type=cache,target=/opt/deploy/.pip_cache,id=pip_cache,sharing=locked \
    pip3 wheel --wheel-dir /opt/deploy/packages -r /opt/deploy/requirements/requirements.txt -r /opt/deploy/requirements/requirements.gpu.txt

RUN set -ex \
    # prevent downloading packages online again
    && sed -i '/extra-index-url/,+1d' /opt/deploy/requirements/requirements.txt \
    && sed -i '/extra-index-url/,+1d' /opt/deploy/requirements/requirements.gpu.txt


FROM python_base as python_env

COPY --from=python_pack_pkg /opt/deploy/requirements/* /opt/deploy/requirements/
COPY --from=python_pack_pkg /opt/deploy/packages/* /opt/deploy/packages/

CMD ["pip3", "install", "--no-index", "--find-links=/opt/deploy/packages", "-r", "/opt/deploy/requirements/requirements.txt", "-r", "/opt/deploy/requirements/requirements.gpu.txt"]


FROM python_env as production

WORKDIR /opt/deploy

COPY scripts/docker/pre_start /usr/local/bin/
COPY --from=python_pack_pkg /opt/deploy/requirements/* /opt/deploy/requirements/
COPY --from=python_pack_pkg /opt/deploy/packages/* /opt/deploy/packages/

ADD . /opt/deploy/

ENTRYPOINT ["/usr/local/bin/dumb-init", "--"]

CMD ["bash", "-c", "pre_start && exec python3 app.py"]
#!/usr/bin/env bash
#
# pre_start
set -ex

echo "---------------- Env echo ----------------"
echo "current path: ""$PWD"
echo "who is this: ""$(whoami)"
echo "pip3 version: ""$(pip3 --version)"

echo "----------- packages info ----------------"
ls -la /opt/deploy/packages
echo "---------- requirements info -------------"
cat /opt/deploy/requirements/requirements.txt
cat /opt/deploy/requirements/requirements.gpu.txt
echo "-------- install pre-cached pkgs ---------"
# install pkgs from cached path
pip3 install --no-index --find-links=/opt/deploy/packages -r /opt/deploy/requirements/requirements.txt -r /opt/deploy/requirements/requirements.gpu.txt

echo "---------- output active envs ------------"
dynaconf -i config.settings list

virtualenv

Virtual environments isolate environments for multi-environment development or configuration.

Install the virtual environment tool virtualenv; see the tutorial:

pip install virtualenv
virtualenv --version

pipx

A common manager for Python command-line tools

mamba install pipx

pipx list
pipx install poetry
pipx install copier
pipx install nox

python project

Python project description is covered by a complete official standard; see the official packaging standard PEP 621.

An example pyproject.toml following the official standard:

links:

project scaffold

Using a project template quickly generates a project following Python conventions.

links:

pyproject.toml

For the project configuration see declaring-project-metadata.

[project]
name = "xxx"
# version: can also be managed dynamically with tools such as setuptools-scm
version = "0.0.1"
description = "The description of the project xxx"
readme = "README.md"
classifiers = [
    # https://pypi.org/classifiers/
    # Development Status
    "Development Status :: 3 - Alpha",
    "Development Status :: 4 - Beta",
    "Development Status :: 5 - Production/Stable",
    # Framework
    "Framework :: Django",
    "Framework :: Flask",
    # License
    "License :: OSI Approved :: GNU General Public License (GPL)",
    "License :: OSI Approved :: MIT License",
    "License :: OSI Approved :: Apache Software License",
    "License :: OSI Approved :: BSD License",
    # Operating System
    "Operating System :: OS Independent",
    "Operating System :: Microsoft :: Windows",
    "Operating System :: POSIX",
    "Operating System :: Unix",
    "Operating System :: MacOS",
    # Intended Audience
    "Intended Audience :: Developers",
    "Intended Audience :: Education",
    "Intended Audience :: Science/Research",
    # Programming Language
    "Programming Language :: Python :: 3 :: Only",
    "Programming Language :: Python :: 3.7",
    "Programming Language :: Python :: 3.8",
    "Programming Language :: Python :: 3.9",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    # Topic
    "Topic :: Software Development",
    "Topic :: Software Development :: Libraries",
    "Topic :: Software Development :: Libraries :: Python Modules",
    "Topic :: Education",
    "Topic :: Scientific/Engineering",
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
    "Topic :: Scientific/Engineering :: Image Recognition",
]
authors = []
maintainers = []
keywords = [
    "machine-learning",
    "deep-learning",
]
# license: an SPDX license identifier; pick one, e.g.
# license = { text = "Apache 2.0" } or license = { text = "BSD" }
license = { text = "GPL" }
requires-python = ">=3.7"
# required dependencies of the project
dependencies = [
    "fire",
    "gitpython>=3.1.30",
    # ...
]

# scripts the project generates https://setuptools.pypa.io/en/latest/userguide/entry_point.html
[project.scripts]
# xxx.python-file:entry-function points at the entry-function in the python-file module of package xxx
cli_name = "xxx.python-file:entry-function"

# links related to the project
[project.urls]
"Documentation" = "https://github.com/msclock/xxx"
"Source" = "https://github.com/msclock/xxx"
"Tracker" = "https://github.com/msclock/xxx/issues"

# optional dependencies; install a set with pip install .[test]
[project.optional-dependencies]
# common test dependencies
test = [
    "GitPython >= 3.1.30",
    "pytest >= 5.2",
    "pytest-mock >= 3.8.2",
    "pytype!=2021.11.18,!=2022.2.17",
    "pre-commit >= 2.20.0",
    "pytest-unordered ~= 0.5",
]
# common development dependencies
dev = [
    # dependency groups can reference each other
    "xxx[test]",  # pull in the test dependencies above
    "black >= 22.8",
    "build >= 0.8",
    "ipython >= 7.16",
    "isort >= 5.10",
    "pdbpp >= 0.10",
    "pip >= 21.1",
    "pre-commit >= 2.20.0",
    "psutil ~= 5.1",
    "twine >= 4.0",
]


build-system

Using the setuptools build backend

[build-system]
requires = ["setuptools", "setuptools-scm", "wheel"]
build-backend = "setuptools.build_meta"

[!NOTE]
When using the setuptools build backend, there is generally no need to pin versions here

distutils

# name generated wheels following `pkgname-version-python_tag-xxx-xxx.whl`
[tool.distutils.bdist_wheel]
python_tag = "py37.py38.py39.py310"

setuptools

# data files to include
[tool.setuptools]
include-package-data = true
license-files = ["LICENSE"]

# package discovery via find https://setuptools.pypa.io/en/latest/userguide/package_discovery.html#
[tool.setuptools.packages.find]
namespaces = false
include = []
# exclude package pkg and its subpackages and modules
exclude = ["pkg*"]

# data files
[tool.setuptools.package-data]
# xxx is the package name
xxx = ["**/*"]

# derive the version dynamically from a variable in a file
[project]
dynamic = ["version"]

[tool.setuptools.dynamic]
# attr points at the __version__ variable in the __version__ module of package xxx
version = { attr = "xxx.__version__.__version__" }

links:

setuptools-scm

setuptools-scm extracts the Python package version from git or hg metadata instead of declaring it as a version argument or keeping it in an SCM-managed file.

# in pyproject.toml
[project]
name = "xxx"
# version = "0.0.1" # Remove any existing version parameter.
dynamic = ["version"]
[tool.setuptools_scm]
write_to = "xxx/_version.py"

[tool.check-sdist]
sdist-only = ["xxx/_version.py"]

links:

ruff

links:

# in pyproject.toml
[tool.ruff]
select = [
    "E", "F", "W", # flake8
    "B",           # flake8-bugbear
    "I",           # isort
    "N",           # pep8-naming
    "ARG",         # flake8-unused-arguments
    "C4",          # flake8-comprehensions
    "EM",          # flake8-errmsg
    "ICN",         # flake8-import-conventions
    "ISC",         # flake8-implicit-str-concat
    "PGH",         # pygrep-hooks
    "PIE",         # flake8-pie
    "PL",          # pylint
    "PT",          # flake8-pytest-style
    "RET",         # flake8-return
    "RUF100",      # Ruff-specific
    "SIM",         # flake8-simplify
    "UP",          # pyupgrade
    "YTT",         # flake8-2020
]
ignore = [
    "PLR",     # Design related pylint
    "E501",    # Line too long (Black is enough)
    "PT011",   # Too broad with raises in pytest
    "PT004",   # Fixture that doesn't return needs underscore (no, it is fine)
    "SIM118",  # iter(x) is not always the same as iter(x.keys())
    "ARG001",  # Ignore unused arguments
    "PLW0603", # We're fine with global vars
]
target-version = "py37"
src = ["src"]
unfixable = ["T20"]
exclude = []
# Allow lines to be as long as 120 characters.
line-length = 120
isort.known-first-party = ["env", "pybind11_cross_module_tests", "pybind11_tests"]

[tool.ruff.per-file-ignores] # rules on the per file basis include common ignore rules
"tests/**" = ["EM", "N"]
"tests/test_call_policies.py" = ["PLC1901"]

black

  • black: the standard project formatter

# limit lines to 120 characters
[tool.black]
line-length = 120
target-version = ['py38']
include = '\.pyi?$'
exclude = '''

(
  /(
      \.eggs        # exclude a few common directories in the
    | \.git         # root of the project
    | \.mypy_cache
    | \.venv
    | _build
    | build
    | dist
  )/
)
'''

Usage

# check only
docker run --rm -v $PWD:/src -w /src \
    pyfound/black:latest_release \
    bash -c "black --check ."
# fix in place
docker run --rm -v $PWD:/src -w /src \
    pyfound/black:latest_release \
    bash -c "black --quiet ."

isort

  • isort: standardize the project's import statements

# imports follow the black profile
[tool.isort]
profile = "black"
line_length = 120

Usage

# check only
docker run --rm -v $PWD:/src -w /src \
    xcgd/isort \
    sh -c "isort --check ."
# fix in place
docker run --rm -v $PWD:/src -w /src \
    xcgd/isort \
    sh -c "isort ."

vulture

Vulture is a Python code-analysis tool for finding unused code.

[tool.vulture]
# files or directories to exclude from analysis
exclude = []
ignore_decorators = []
ignore_names = []
# generate a whitelist of identifiers that are actually used
make_whitelist = true
# minimum confidence for Vulture to report code as unused
min_confidence = 80
# paths to analyze
paths = ["xxx"]
# sort results by file size
sort_by_size = true
# whether to print verbose output
verbose = false

pylint

  • pylint: code linting

[tool.pylint]
py-version = "3.7" # lint against python3.7
reports = true
persistent = true
output-format = [ # lint output formats
    # "json:.report.pylint.json",
    "colorized",
]
messages_control.disable = [ # checks silenced while linting
    "design",
    "fixme",
    "imports",
    "line-too-long",
    "imports",
    "invalid-name",
    "protected-access",
    "missing-module-docstring",
    "consider-using-f-string",
    "unspecified-encoding",
]
ignored-modules = ["xxx"] # ignore errors from certain modules

mypy

Integrate mypy type checking; an example configuration:

[tool.mypy]
explicit_package_bases = true
mypy_path = "."
namespace_packages = true
ignore_missing_imports = true
ignore_missing_imports_per_module = true
# disallow_untyped_defs = true
# no_implicit_optional = true
# strict_optional = true
# warn_redundant_casts = true
check_untyped_defs = true
exclude = '''(?x)(
directory-to-exclude
)'''

[[tool.mypy.overrides]]
module = ["skbuild.*", "setuptools.*", "another-module-name.*"]
ignore_missing_imports = true

pytest

Integrate pytest unit testing; an example configuration:

[tool.pytest.ini_options]
testpaths = ["test"]
python_files = ["test_*.py", "*_test.py", "testing/python/*.py"]
python_classes = ["Test", "Acceptance"]
python_functions = ["test"]
filterwarnings = ["error"]
xfail_strict = true
addopts = [
    "-ra",
    "--showlocals",
    "--strict-markers",
    "--strict-config",
    # import test modules via importlib
    "--import-mode=importlib",
]
minversion = "6.0"

codespell

Use codespell to catch spelling mistakes; an example configuration:

# pyproject.toml
[tool.codespell]
# note: pre-commit passes explicit lists of files here, which this skip file list doesn't override -
# this is only to allow you to run codespell interactively
skip = "./.git,./.github"
# ignore short words, and typename parameters like offset
ignore-regex = "\\b(.{1,4}|[A-Z]\\w*T)\\b"
# ignore allowed words
ignore-words-list = "passing"
# use the 'clear' dictionary for unambiguous spelling mistakes
builtin = "clear"
# disable warnings about binary files and wrong encoding
quiet-level = 3

The above should be used together with .pre-commit-config.yaml

- repo: https://github.com/codespell-project/codespell
  rev: v2.2.4
  hooks:
    - id: codespell
      additional_dependencies: [tomli]
      args: ["--toml", "pyproject.toml"]

py.typed

The py.typed file marks a Python package as supporting type annotations. When present, it indicates the package is annotation-friendly, i.e. it uses type annotations to improve readability and maintainability.

Its presence tells type checkers (such as mypy) and IDEs (such as PyCharm) that the package ships type annotations and supports static type checking, completion, and so on.

[!NOTE]
Note that py.typed contains no code or content; it is only a marker file.
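A sketch of where the marker lives (mypkg is a placeholder name):

```text
mypkg/
├── __init__.py
├── core.py
└── py.typed        # empty marker file (PEP 561)
```

With setuptools, add `mypkg = ["py.typed"]` under `[tool.setuptools.package-data]` so the marker is shipped in built wheels.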

links:

Packing

After configuring pyproject.toml, packages can be built with the following commands:

Using build

pip install build
python -m build

Using pip

pip wheel .

upload to pypi

After building, to upload the package to PyPI see Github-Action-Python-Example.

twine

twine is the usual upload tool; it is used as follows:

pip install twine
# Packing locally
python -m build
# Upload package in the generated folder dist
export TWINE_USERNAME=__token__
# api key generated by pypi
export TWINE_PASSWORD=[key]
twine upload --verbose --skip-existing dist/*

GitHub integration

Use the GitHub Python PyPI upload template; for usage examples see:

GitLab integration

Configure a GitLab CI script following the twine setup above.

docs generator

Popular Python documentation generators, including sphinx.

sphinx

sphinx is a very popular Python documentation tool.

pip install sphinx

links:

Quickstart

Use the sphinx-quickstart tool for quick setup

sphinx-quickstart
# then follow the prompts to set the project information

api generation

[!NOTE]
Generating the complete API docs requires each package to contain __init__.py

sphinx-apidoc -o docs path-to-packages

Then point the index at the generated package docs.

conf.py

Add jupyter support

Integrate Jupyter into sphinx docs

Configure the Jupyter integration plugin myst-nb.

# conf.py
extensions = [
"myst_nb",
]

Add the Jupyter notebook link in the index

---
maxdepth: 2
caption: Contents:
---
notebooks/Example 1

readthedocs

Register on the Read the Docs site https://readthedocs.org/ and link your GitHub account. Click "Import a Project", enter the project name and repository address, and the project docs are hosted online for free.
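A minimal .readthedocs.yaml sketch (paths and versions are assumptions; adjust them to the project layout):

```yaml
# .readthedocs.yaml
version: 2

build:
  os: ubuntu-22.04
  tools:
    python: "3.11"

sphinx:
  configuration: docs/conf.py

python:
  install:
    - requirements: docs/requirements.txt
```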

scikit-build-core

scikit-build-core adapts cmake-based builds to the Python build tooling.

Reference docs:

Example demos:

multiprocessing

A summary of common multiprocessing usage and how to solve frequent problems.

Queue

links:

Manager

A manager provides a way to create data that can be shared between processes, including processes running on different machines over a network. A manager object controls a server process that holds the shared objects; other processes access them through proxies.

  • multiprocessing.Manager()

Returns a started SyncManager object that can be used to share objects between processes. The returned manager corresponds to a spawned child process and has methods for creating shared objects and returning matching proxies.

SyncManager

A subclass of BaseManager usable for process synchronization; objects of this type are returned by multiprocessing.Manager().

links:

QA

RuntimeError: Queue objects should only be shared between processes through inheritance

Using multiprocessing.Queue with a process pool typically raises the error above, because multiprocessing.Queue can only be shared between parent and child processes through inheritance. With a process pool, create the shared queue via multiprocessing.Manager().Queue() instead. Example:

import multiprocessing
import random
import time
from multiprocessing import Pool


# code run by the writer process:
def write(q, lock):
    lock.acquire()  # take the lock
    for value in ["A", "B", "C"]:
        print(f"Put {value} to queue...")
        q.put(value)
    lock.release()  # release the lock


# code run by the reader process:
def read(q):
    while True:
        if not q.empty():
            value = q.get(False)
            print(f"Get {value} from queue.")
            time.sleep(random.random())
        else:
            break


if __name__ == "__main__":
    manager = multiprocessing.Manager()
    # the parent process creates the Queue and passes it to the children:
    q = manager.Queue()
    lock = manager.Lock()  # create a lock
    p = Pool()
    pw = p.apply_async(write, args=(q, lock))
    pr = p.apply_async(read, args=(q,))
    p.close()
    p.join()

concurrency

paths

Encryption and decryption

links:

cryptography

cryptography is a package designed to expose cryptographic primitives and recipes to Python developers.

links:

nuitka

For security reasons it is sometimes necessary to compile source code into executables to prevent tampering or leaks. Several tools do this; the most popular are pyinstaller and nuitka. nuitka is used here.

links:

plugins

Plugins that patch or hook certain packages during compilation, making it easier to adapt community-released packages and custom encryption steps.

links:

configuration

If some dependencies (DLLs/data) are missing after compilation, they can be added in the configuration file.

links:

To determine a package's dependencies, the following command shows whether any are missing:

# list numpy's DLL dependencies
python -m nuitka --list-package-dlls=numpy
Nuitka-Tools:INFO: Checking package directory 'C:\tools\mambaforge\envs\3.8\lib\site-packages\numpy' ..
C:\tools\mambaforge\envs\3.8\lib\site-packages\numpy\.libs
libopenblas.EL2C6PLE4ZYW3ECEVIV3OXXGRN2NRFM2.gfortran-win_amd64.dll
Nuitka-Tools:INFO: Found 1 DLLs.

# !!! shapely below 1.8.0 has a library-linking bug when packaged; prefer versions >= 1.8.0
python -m nuitka --list-package-dlls=shapely
Nuitka-Tools:INFO: Checking package directory 'C:\tools\mambaforge\envs\3.8\lib\site-packages\shapely' ..
C:\tools\mambaforge\envs\3.8\lib\site-packages\shapely\DLLs
geos.dll
geos_c.dll
Nuitka-Tools:INFO: Found 2 DLLs.

For more detailed custom configuration files, see the examples

pyarmor

Pyarmor is a tool for obfuscating and protecting Python scripts. It keeps script code from being leaked at runtime, can set an expiry date on protected scripts, and can bind them to hardware such as disks and network cards.

links:

plugin mechanisms

Plugin mechanisms in Python mainly rely on package distribution or module-loading machinery.

Example of entrypoints

A usage example of an entrypoints-based plugin mechanism

marshmallow

marshmallow is a serialization package suited to many scenarios:

  • validation: of input data.

  • deserialization: input data into application-level objects.

  • serialization: application-level objects into primitive Python types, which can then be rendered in standard formats such as JSON for use in HTTP APIs.

Serialization & deserialization

import datetime
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Item:
    name: str
    price: Decimal
    date: datetime.date


# for all other field types, see
# https://marshmallow.readthedocs.io/en/stable/marshmallow.fields.html#api-fields
from marshmallow import Schema, fields


class ItemSchema(Schema):
    name = fields.String()
    price = fields.Decimal()
    date = fields.Date()


item = Item("abc", Decimal("1.23456"), datetime.date(2021, 6, 26))
schema = ItemSchema()
result_obj = schema.dump(item)
print(result_obj)
# {'price': Decimal('1.23456'), 'name': 'abc', 'date': '2021-06-26'}

input_data = {"name": "abc", "price": Decimal("1.23456"), "date": "2021-06-26"}
load_result = schema.load(input_data)
print(load_result)  # dict
# {'name': 'abc', 'date': datetime.date(2021, 6, 26), 'price': Decimal('1.23456')}

post_load

Control deserialization to construct objects

import datetime
from dataclasses import dataclass
from decimal import Decimal

from marshmallow import Schema, fields, post_load


@dataclass
class Item:
    name: str
    price: Decimal
    date: datetime.date


class ItemSchema(Schema):
    name = fields.String()
    price = fields.Decimal()
    date = fields.Date()

    @post_load
    def make_item(self, data, **kwargs):
        return Item(**data)


schema = ItemSchema()
load_result = schema.load({"name": "abc", "price": Decimal("1.23456"), "date": "2021-06-26"})
print(load_result)
# Item(name='abc', price=Decimal('1.23456'), date=datetime.date(2021, 6, 26))

fields.Method

Custom per-field serialization

import datetime
import decimal
from dataclasses import dataclass
from decimal import Decimal

from marshmallow import Schema, fields, post_load


@dataclass
class Item:
    name: str
    price: Decimal
    date: datetime.date


class ItemSchema(Schema):
    """Define serialization protocol for class Item

    Args:
        name: str to be registered by the schema
        price: custom way to be registered
        date: date to be registered
    """

    name = fields.String()
    price = fields.Method("price_decimal_2_float", deserialize="float_2_decimal")
    date = fields.Date()

    @post_load  # post_load to make the schema deserialized as object-like
    def make_item(self, data, **kwargs):
        return Item(**data)

    def price_decimal_2_float(self, item: Item):
        """price attr to save"""
        return float(item.price)

    def float_2_decimal(self, float):
        """price attr to load"""
        return decimal.Decimal(str(float))


item = Item("abc", Decimal("1.23456"), datetime.date(2021, 6, 26))
schema = ItemSchema()
result_str = schema.dumps(item)
print(result_str)
# {"date": "2021-06-26", "name": "abc", "price": 1.23456}
input_str = '{"date": "2021-06-26", "name": "abc", "price": 1.23456}'
load_result = schema.loads(input_str)
print(load_result)
# Item(name='abc', price=Decimal('1.23456'), date=datetime.date(2021, 6, 26))

links:

pytest

pytest is a mature, full-featured Python testing framework. Its main strengths:

  • Simple and flexible, easy to pick up, richly documented;

  • Supports parameterization, giving fine-grained control over which test cases run;

  • Handles simple unit tests as well as complex functional tests, and can also drive selenium/appium UI automation and API test automation (pytest + requests);

  • Has a large third-party plugin ecosystem and supports custom extensions; notable plugins include pytest-selenium (selenium integration), pytest-html (polished HTML test reports), pytest-rerunfailures (re-run failed cases), and pytest-xdist (distribute tests across CPUs);

  • skip/xfail handling for test cases.

Exit codes

  • Inspect the exit codes

Python 3.7.8 (tags/v3.7.8:4b47a5b6ba, Jun 28 2020, 08:53:46) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pytest import ExitCode
>>> help(ExitCode)
Help on class ExitCode in module _pytest.config:

class ExitCode(enum.IntEnum)
| ExitCode(value, names=None, *, module=None, qualname=None, type=None, start=1)
|
| Encodes the valid exit codes by pytest.
|
| Currently users and plugins may supply other exit codes as well.
|
| .. versionadded:: 5.0
|
| Method resolution order:
| ExitCode
| enum.IntEnum
| builtins.int
| enum.Enum
| builtins.object
|
| Data and other attributes defined here:
|
 | INTERNAL_ERROR = <ExitCode.INTERNAL_ERROR: 3> # an internal error occurred
 |
 | INTERRUPTED = <ExitCode.INTERRUPTED: 2> # the test run was interrupted by the user
 |
 | NO_TESTS_COLLECTED = <ExitCode.NO_TESTS_COLLECTED: 5> # no test cases were actually collected
 |
 | OK = <ExitCode.OK: 0> # all test cases passed
 |
 | TESTS_FAILED = <ExitCode.TESTS_FAILED: 1> # some test cases failed
 |
 | USAGE_ERROR = <ExitCode.USAGE_ERROR: 4> # pytest command line usage error
|
| ----------------------------------------------------------------------
| Data descriptors inherited from enum.Enum:
|
  • When wrapping pytest in a shell script, check the exit code with echo $?.

  • Exit codes can be extended via the pytest-custom_exit_code plugin.
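The same check can be done from Python: a child process's returncode is exactly the code the shell would report via echo $?. A stdlib-only sketch, using raise SystemExit(5) to stand in for a pytest run that collected no tests (ExitCode.NO_TESTS_COLLECTED):

```python
import subprocess
import sys

# Simulate a program exiting with pytest's NO_TESTS_COLLECTED code (5)
proc = subprocess.run([sys.executable, "-c", "raise SystemExit(5)"])
print(proc.returncode)  # 5, the same value `echo $?` would report in a shell
```

Comparing proc.returncode against pytest.ExitCode values is a convenient way for a wrapper script to branch on the outcome.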

links:

Flask

Explore Flask

Flask Practice is a community-maintained collection of practical Flask references on GitHub.

Flasky

flasky contains the example source code for the second edition of the O'Reilly book Flask Web Development.

wsgi

Flask's default run mode is single-process, single-threaded, which means request handling is blocking: while one request is being processed, other requests must wait. This is usually acceptable for development and testing, but a production deployment typically needs a server that can handle multiple requests concurrently.

To make Flask non-blocking, use a WSGI server that handles concurrent requests, such as Gunicorn or uWSGI. These servers can be configured in multi-process or multi-threaded mode so that several requests are served at once.

For example, run the Flask app with Gunicorn and set the number of worker processes:

gunicorn -w 4 -b 0.0.0.0:5000 'module:app'

In this command, -w 4 starts 4 worker processes, so Gunicorn can handle 4 requests at the same time.

Note that while this improves concurrency, it is not truly non-blocking. For non-blocking or asynchronous handling, consider a library like gevent, or an ASGI-based framework such as FastAPI or Starlette.

gunicorn

Using gunicorn as the WSGI server for flask:

gunicorn -w 4 -b 0.0.0.0:5000 'module:app'

In this command:

  • -w 4 starts 4 worker processes

  • -b 0.0.0.0:5000 binds port 5000 on all IP addresses

  • 'module:app' is the import path and application variable of the flask app. For example, if the app lives in app.py, use 'app:app'
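The same settings can be kept out of the command line in a config file. A sketch assuming gunicorn's conventional gunicorn.conf.py (a Python file gunicorn picks up automatically from the working directory; the values simply mirror the command above):

```python
# gunicorn.conf.py -- equivalent to `gunicorn -w 4 -b 0.0.0.0:5000 'module:app'`
workers = 4            # number of worker processes (-w)
bind = "0.0.0.0:5000"  # listen address and port (-b)
```

With this file in place, `gunicorn 'module:app'` starts with the same worker count and binding.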

gevent

Using gevent as the WSGI server for flask.

pip install gevent

Reference:

from gevent.pywsgi import WSGIServer
from gevent import monkey
monkey.patch_all()

import flask

http_server = WSGIServer(("0.0.0.0", 8000), flask.Flask(__file__))
http_server.serve_forever()

FastAPI

Reference links:

uvicorn

fastapi is usually paired with uvicorn in production environments.

uvicorn main:app --reload

subprocess

Run a child process with subprocess.

import subprocess

args = ["python", "-c", 'print("hello")']
subprocess.run(args=args, capture_output=True).stdout.decode().strip("\n")
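Two options of subprocess.run worth spelling out: capture_output=True collects both stdout and stderr as bytes, and check=True turns a non-zero exit code into an exception. A stdlib sketch (sys.executable is used so the snippet works regardless of how python is named on PATH):

```python
import subprocess
import sys

# capture_output=True fills both .stdout and .stderr (as bytes)
ok = subprocess.run(
    [sys.executable, "-c", "print('hello')"],
    capture_output=True,
)
print(ok.stdout.decode().strip())  # hello

# check=True turns a non-zero exit code into CalledProcessError
try:
    subprocess.run([sys.executable, "-c", "raise SystemExit(2)"], check=True)
except subprocess.CalledProcessError as e:
    print(e.returncode)  # 2
```

Without check=True the failure is silent and must be detected by inspecting .returncode yourself.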

argparse

# tt.py
def parse_args():
    import argparse

    root_parser = argparse.ArgumentParser()
    subs = root_parser.add_subparsers(dest="action")
    subs.required = True

    sub1_parser = subs.add_parser("sub1")

    # positional argument
    sub1_parser.add_argument("p1")
    # optional
    sub1_parser.add_argument("--batch_size", "-b", default=8, type=int)
    return root_parser.parse_args()


if __name__ == "__main__":
    FLAGS = parse_args()
    print(vars(FLAGS))  # print as dict


# python tt.py
# usage: tt.py [-h] {sub1} ...

# positional arguments:
#   {sub1}

# optional arguments:
#   -h, --help  show this help message and exit
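The parser above can also be exercised programmatically by passing an explicit argv list to parse_args instead of letting it read sys.argv, which is handy in tests. A sketch reusing the same sub1 definition:

```python
import argparse

parser = argparse.ArgumentParser()
subs = parser.add_subparsers(dest="action", required=True)
sub1 = subs.add_parser("sub1")
sub1.add_argument("p1")
sub1.add_argument("--batch_size", "-b", default=8, type=int)

# Parse an explicit argv list instead of sys.argv
ns = parser.parse_args(["sub1", "data.txt", "-b", "16"])
print(vars(ns))  # batch_size is parsed as int because of type=int
```

parse_args(["sub1", "data.txt"]) would leave batch_size at its default of 8.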

gradio

gradio is a web app framework for quickly building AI demo apps. Install with pip install gradio.

Tutorial references:

Quickstart

Official example; see the complete documentation for reference.

import gradio as gr


def greet(name):
    return "Hello " + name + "!"


iface = gr.Interface(fn=greet, inputs="text", outputs="text")
iface.launch()

Share app

For sharing an app, see the reference.

Below is a FastAPI app with a gradio interface mounted on it:

from fastapi import FastAPI
import gradio as gr

CUSTOM_PATH = "/gradio"

app = FastAPI()


@app.get("/")
def read_main():
    return {"message": "This is your main app"}


io = gr.Interface(lambda x: "Hello, " + x + "!", "textbox", "textbox")
app = gr.mount_gradio_app(app, io, path=CUSTOM_PATH)

# Run this from the terminal as you would normally start a FastAPI app: `uvicorn run:app`
# and navigate to http://localhost:8000/gradio in your browser.

streamlit

streamlit is a more basic web app framework compared to gradio.

Fire

Fire turns any Python module, class, object, function, etc. (any Python component works!) into a CLI. It is called Fire because calling Fire() fires off the command.

Using function

# hello.py
import fire


def hello(name="World"):
    return "Hello {name}!".format(name=name)


if __name__ == "__main__":
    fire.Fire(hello)  # when nothing is specified explicitly, everything is exported
$ python hello.py --help
INFO: Showing help with the command 'hello.py -- --help'.

NAME
    hello.py

SYNOPSIS
    hello.py <flags>

FLAGS
    --name=NAME

Using class

import fire


class Calculator(object):
    """A simple calculator class."""

    def double(self, number):
        """subcommand 1"""
        return 2 * number

    def triple(self, number):
        """subcommand 2"""
        return 3 * number


if __name__ == "__main__":
    fire.Fire(Calculator)

Using object

import fire


class Calculator(object):
    def add(self, x, y):
        return x + y

    def multiply(self, x, y):
        return x * y


if __name__ == "__main__":
    calculator = Calculator()
    fire.Fire(calculator)

Define subcommand

  • Using functions

import fire


def add(x, y):
    return x + y


def multiply(x, y):
    return x * y


if __name__ == "__main__":
    fire.Fire()
    # Explicitly expose selected functions:
    # fire.Fire({
    #     'add': add,
    #     'mul': multiply,
    # })
# example.py
import fire


class BrokenCalculator(object):
    def __init__(self, offset=1):
        """Parameters defined on the constructor (offset) become option flags for the whole CLI."""
        self._offset = offset

    def add(self, x, y):
        return x + y + self._offset

    def multiply(self, x, y):
        return x * y + self._offset


if __name__ == "__main__":
    fire.Fire(BrokenCalculator)
$ python example.py --help
INFO: Showing help with the command 'example.py -- --help'.

NAME
    example.py

SYNOPSIS
    example.py <flags>

FLAGS
    --offset=OFFSET

Embedded command

import fire


class IngestionStage(object):
    def run(self):
        return "Ingesting! Nom nom nom..."


class DigestionStage(object):
    def run(self, volume=1):
        return " ".join(["Burp!"] * volume)

    def status(self):
        return "Satiated."


class Pipeline(object):
    def __init__(self):
        self.ingestion = IngestionStage()
        self.digestion = DigestionStage()

    def run(self):
        self.ingestion.run()
        self.digestion.run()


if __name__ == "__main__":
    fire.Fire(Pipeline)

Using attr access

import fire

cities = {
    "hz": (310000, "杭州"),
    "bj": (100000, "北京"),
}


class City(object):
    def __init__(self, code):
        info = cities.get(code)
        self.zipcode = info[0] if info else None
        self.city = info[1] if info else None


if __name__ == "__main__":
    fire.Fire(City)

Here, simply name the attribute to access it:

$ python example.py --code bj zipcode
100000
$ python example.py --code hz city
杭州

Using command chain

To chain commands, instance methods must return self.

import fire


class Calculator:
    def __init__(self):
        self.result = 0
        self.express = "0"

    def __str__(self):
        return f"{self.express} = {self.result}"

    def add(self, x):
        self.result += x
        self.express = f"{self.express}+{x}"
        return self

    def sub(self, x):
        self.result -= x
        self.express = f"{self.express}-{x}"
        return self

    def mul(self, x):
        self.result *= x
        self.express = f"({self.express})*{x}"
        return self

    def div(self, x):
        self.result /= x
        self.express = f"({self.express})/{x}"
        return self


if __name__ == "__main__":
    fire.Fire(Calculator)
$ python calculator.py add 1 sub 2 mul 3 div 4
((0+1-2)*3)/4 = -0.75

$ python calculator.py add 1 sub 2 mul 3 div 4 add 4 sub 3 mul 2 div 1
((((0+1-2)*3)/4+4-3)*2)/1 = 0.5

Using *args and **kwargs

# example.py
import fire


def fargs(*args):
    """Positional-argument call: python example.py fargs a b c"""
    return str(args)


def fkwargs(**kwargs):
    """Option-argument call: python example.py fkwargs --a a1 --b b1 --c c1"""
    return str(kwargs)


if __name__ == "__main__":
    fire.Fire()
# Without the separator, upper is treated as a positional argument
$ python example.py fargs a b c upper
('a', 'b', 'c', 'upper')

# With the separator, upper (a str builtin) is treated as a subcommand
$ python example.py fargs a b c - upper
('A', 'B', 'C')

Parameter type in fire

# example.py
import fire

fire.Fire(lambda obj: type(obj).__name__)
$ python example.py 10
int
$ python example.py 10.0
float
$ python example.py hello
str
$ python example.py '(1,2)'
tuple
$ python example.py [1,2]
list
$ python example.py True
bool
$ python example.py {name: David}
dict
$ python example.py '{"name": "David Bieber"}'
dict
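Fire's argument typing behaves much like ast.literal_eval with a fallback to str, so the table above can be approximated with a stdlib sketch. This is an approximation, not Fire's actual parser: for example, the bare {name: David} form that Fire accepts fails literal_eval and falls back to str here.

```python
import ast


def parse_cli_value(token: str):
    """Approximate Fire-style typing: try a Python literal, else keep the string."""
    try:
        return ast.literal_eval(token)
    except (ValueError, SyntaxError):
        return token


print(type(parse_cli_value("10")).__name__)     # int
print(type(parse_cli_value("10.0")).__name__)   # float
print(type(parse_cli_value("hello")).__name__)  # str
print(type(parse_cli_value("(1,2)")).__name__)  # tuple
print(type(parse_cli_value("[1,2]")).__name__)  # list
print(type(parse_cli_value("True")).__name__)   # bool
print(type(parse_cli_value('{"name": "David Bieber"}')).__name__)  # dict
```

Remember that the shell consumes one layer of quoting, which is why the tuple and dict forms need single quotes on the command line.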

click

Define subcommand

To create subcommands with click, first create a command group with click.group.

import click


@click.group()
def cli():
    pass


@cli.command("publish")  # or @cli.command()
@click.option(
    "--model_name",
    help="Model name",
)
@click.option(
    "--model_directory",
    type=click.Path(exists=True, readable=True),
    required=True,
    help="Model filepath",
)
def publish(model_name, model_directory):
    # ... existing code ...
    pass


if __name__ == "__main__":
    cli()

pre-commit

Notes on using pre-commit, a framework for managing and maintaining multi-language pre-commit hooks.

Install

# The tool can be installed directly via pip.
pip3 install pre-commit
# Integrate it into the current project
pre-commit install

.pre-commit-config.yaml

The pre-commit tool controls commit-time checks through the .pre-commit-config.yaml config file.

  • Create the config file in the repository root, or generate a template with pre-commit sample-config; see the official docs for what each template field means.

  • Configure plugins as needed; here the code formatter black serves as the example.

[!TIP]
Once configured, you can attach black's badge Code style: black, indicating the project passes that check.

Likewise, the pre-commit badge pre-commit can be attached.

repos:
  # Black, the code formatter, natively supports pre-commit
  - repo: https://github.com/psf/black
    rev: "22.6.0" # Keep in sync with blacken-docs
    hooks:
      - id: black

  # Also code format the docs
  - repo: https://github.com/asottile/blacken-docs
    rev: "v1.12.1"
    hooks:
      - id: blacken-docs
        additional_dependencies:
          - black==22.6.0 # keep in sync with black hook

pre-commit cli

# Run against the whole repository
pre-commit run -a
# Register hooks for the commit-msg stage
pre-commit install --hook-type commit-msg
# Uninstall specific hooks
pre-commit uninstall --hook-type commit-msg --hook-type pre-commit
# Uninstall the pre-commit hook
pre-commit uninstall
# Run one configured hook: black, to format the code
pre-commit run black

Commonly used plugins

Below is a .pre-commit-config.yaml used in a real project, with notes attached to selected fields.

# .pre-commit-config.yaml
# To use:
#
#     pre-commit run -a
#
# Or:
#
#     pre-commit install  # (runs every time you commit in git)
#
# To update this file:
#
#     pre-commit autoupdate
#
# See https://github.com/pre-commit/pre-commit

# (optional: default false) set to true to have pre-commit stop running hooks after the first failure
fail_fast: false

# (optional: default ^$) global file exclude pattern. The below ignores sub/dir1, sub/dir2, *.drawio.
exclude: |
  (?x)(
    ^(sub/dir1) |
    ^(sub/dir2) |
    ^.*\.drawio
  )

# (optional: default '') global file include pattern.
files: '' # set '' as default

# A list of repository mappings.
repos:
  # Standard hooks
  - repo: https://github.com/pre-commit/pre-commit-hooks # the repository url to git clone from
    # the revision or tag to clone at
    rev: v4.3.0
    hooks:
      # which hook from the repository to use.
      - id: check-added-large-files
      - id: check-case-conflict
      - id: check-docstring-first
      - id: check-merge-conflict
      - id: check-toml
      - id: debug-statements
      - id: end-of-file-fixer
        types_or: [c, c++, cuda, proto, textproto, java, python]
      - id: mixed-line-ending
      - id: requirements-txt-fixer
      - id: trailing-whitespace
      - id: check-yaml
        exclude: ^deploy(\/[^\/]+)*\/templates\/.*$
      - id: check-shebang-scripts-are-executable

  # Check yaml
  # - support gitlab reference syntax
  - repo: https://github.com/macisamuele/language-formatters-pre-commit-hooks
    rev: v2.10.0
    hooks:
      - id: pretty-format-yaml
        args:
          - --autofix
          - --offset=2
          # - --preserve-quotes

  # Check yaml
  - repo: https://github.com/lyz-code/yamlfix
    rev: 1.13.0
    hooks:
      - id: yamlfix
        args:
          - -c .yamlfix.toml

  # Upgrade old Python syntax
  - repo: https://github.com/asottile/pyupgrade
    rev: v2.37.3
    hooks:
      - id: pyupgrade
        # (optional) list of additional parameters to pass to the hook.
        args: [--py37-plus]

  # Ruff, the Python auto-correcting linter written in Rust. Ruff can be used to
  # replace Flake8 (plus dozens of plugins), isort, pydocstyle, yesqa, eradicate,
  # pyupgrade, and autoflake
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.0.281
    hooks:
      - id: ruff
        args:
          # this should be placed after black, isort, and similar
          # tools when auto-fixing is enabled
          - --fix
          - --show-fixes

  # Nicely sort includes
  - repo: https://github.com/pycqa/isort
    rev: 5.12.0
    hooks:
      - id: isort
        name: isort (python)
        args:
          - --profile=black
          # - --line-length=88

  # Another nicely sort includes
  - repo: https://github.com/timothycrosley/isort
    rev: 5.12.0
    hooks:
      - id: isort
        additional_dependencies: [toml]

  # Black, the code formatter, natively supports pre-commit
  - repo: https://github.com/psf/black
    rev: 22.6.0 # Keep in sync with blacken-docs
    hooks:
      - id: black
        args:
          - --line-length=88

  # Also code format the docs
  - repo: https://github.com/asottile/blacken-docs
    rev: v1.12.1
    hooks:
      - id: blacken-docs
        # (optional) a list of dependencies that will be installed in the environment where this
        # hook gets run. One useful application is to install plugins for hooks such as eslint.
        additional_dependencies:
          - black==22.6.0 # keep in sync with black hook

  # Black mirror, 2x faster black mirror
  - repo: https://github.com/psf/black-pre-commit-mirror
    rev: 23.7.0
    hooks:
      - id: black
        args:
          - --line-length=88

  # Changes tabs to spaces
  - repo: https://github.com/Lucas-C/pre-commit-hooks
    rev: v1.3.1
    hooks:
      - id: remove-tabs

  # Autoremoves unused imports
  - repo: https://github.com/hadialqattan/pycln
    rev: v2.1.1
    hooks:
      - id: pycln
        # (optional) confines the hook to the commit, merge-commit, push, prepare-commit-msg,
        # commit-msg, post-checkout, post-commit, post-merge, post-rewrite, or manual stage.
        # See https://pre-commit.com/#confining-hooks-to-run-at-certain-stages
        stages: [manual]

  # Checking for common mistakes
  - repo: https://github.com/pre-commit/pygrep-hooks
    rev: v1.9.0
    hooks:
      - id: python-check-blanket-noqa
      - id: python-check-blanket-type-ignore
      - id: python-no-log-warn
      - id: python-use-type-annotations

  # Automatically remove noqa that are not used
  - repo: https://github.com/asottile/yesqa
    rev: v1.4.0
    hooks:
      - id: yesqa
        additional_dependencies: &flake8_dependencies
          - flake8-bugbear
          - pep8-naming

  # Flake8 also supports pre-commit natively (same author)
  - repo: https://github.com/PyCQA/flake8
    rev: 5.0.4
    hooks:
      - id: flake8
        # (optional) file exclude pattern.
        exclude: ^(docs/.*|tools/.*)$
        additional_dependencies: *flake8_dependencies
        args:
          - --max-line-length=120
          - --show-source
          - --exclude=.git, __pycache__, build, dist, docs, tools, venv, .venv
          - --extend-ignore=E203, E722, B950
          - --extend-select=B9

  # PyLint has native support - very slow
  - repo: https://github.com/PyCQA/pylint
    rev: v2.14.5
    hooks:
      - id: pylint

  # CMake formatting
  - repo: https://github.com/cheshirekow/cmake-format-precommit
    rev: v0.6.13
    hooks:
      - id: cmake-format
        additional_dependencies: [pyyaml]
        # (optional) override the default file types to run on (AND).
        # See https://pre-commit.com/#filtering-files-with-types
        types: [file]
        files: (\.cmake|CMakeLists.txt)(.in)?$

  # Check static types with mypy
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v0.971
    hooks:
      - id: mypy
        args: []
        exclude: ^(tests|docs|setup.py)/
        additional_dependencies: [nox, rich]

  # Checks the manifest for missing files (native support)
  - repo: https://github.com/mgedmin/check-manifest
    rev: 0.48
    hooks:
      - id: check-manifest
        # This is a slow hook, so only run this if --hook-stage manual is passed
        stages: [manual]
        additional_dependencies: [cmake, ninja]

  # Check json with comments
  - repo: https://gitlab.com/bmares/check-json5
    rev: v1.0.0
    hooks:
      - id: check-json5

  # Check for spelling
  - repo: https://github.com/codespell-project/codespell
    rev: v2.2.1
    hooks:
      - id: codespell
        exclude: |
          (?x)(
            ^(package-lock.json)
          )
        args:
          - --skip=".vscode/"
          - --ignore-words-list="Transer,transer"
          - --check-filenames
          - --write-changes # auto fix in place
        # using the pyproject.toml as the config file
        additional_dependencies:
          - tomli
        # stages: [manual]

  # Check for spelling
  - repo: https://github.com/crate-ci/typos
    rev: v1.16.2
    hooks:
      - id: typos

  # Check for common shell mistakes
  - repo: https://github.com/shellcheck-py/shellcheck-py
    rev: v0.8.0.4
    hooks:
      - id: shellcheck

  # Clang format the codebase automatically
  - repo: https://github.com/pre-commit/mirrors-clang-format
    rev: v14.0.6
    hooks:
      - id: clang-format
        # (optional: default []) list of file types to run on (OR). See Filtering files with types.
        types_or: [c++, c, cuda]

  # Commitizen is a tool designed that obeys a standard way of committing rules.
  - repo: https://github.com/commitizen-tools/commitizen
    rev: v2.32.1
    hooks:
      - id: commitizen
        # (optional: default (all stages)) confines the hook to the commit, merge-commit,
        # push, prepare-commit-msg, commit-msg, post-checkout, post-commit, post-merge,
        # post-rewrite, or manual stage. See Confining hooks to run at certain stages.
        stages: [commit-msg]
      - id: commitizen-branch
        stages: [push]

  # Check for markdown
  - repo: https://github.com/igorshubovych/markdownlint-cli
    rev: v0.35.0
    hooks:
      - id: markdownlint-fix
        # needs an rc config, such as https://github.com/msclock/blog_hexo/blob/master
        # related options for rc config, see https://github.com/DavidAnson/markdownlint#optionsconfig

Integrate with CI

links:

gitlab ci

Filter out the plugins you need and configure them together with the gitlab ci templates.

github action

To run pre-commit on github, simply use the corresponding action.

# Make commits from pre-commit
name: pre-commit
on:
  workflow_dispatch:
  push:
    branches:
      - master
  pull_request:
    branches:
      - master

jobs:
  pre-commit:
    runs-on: ubuntu-latest
    permissions:
      contents: write

    steps:
      - name: Checkout repository
        uses: actions/checkout@v3
        with:
          token: ${{ secrets.PERSONAL_TOKEN }}
          # Avoid checking out the repository in a detached state.
          ref: ${{ github.head_ref }}

      - name: Set up python
        uses: actions/setup-python@v3

      - name: Run pre-commit on codebase
        uses: pre-commit/action@v3.0.0

      - name: Auto fixes from pre-commit on failure
        uses: stefanzweifel/git-auto-commit-action@v4
        if: failure()
        with:
          commit_message: "ci: auto fixes from pre-commit"
# Make pr from pre-commit
name: pre-commit update
on:
  # every day at midnight
  schedule:
    - cron: "0 0 * * *"
  # on demand
  workflow_dispatch:

jobs:
  pre-commit-auto-update:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v3

      - name: Update pre-commit hooks
        uses: browniebroke/pre-commit-autoupdate-action@main

      - name: Create pull request on pre-commit updates
        uses: peter-evans/create-pull-request@v3
        if: ${{ always() }}
        with:
          branch: update/pre-commit-hooks
          title: Update pre-commit hooks
          commit-message: "ci: update pre-commit hooks"
          body: Update available versions of pre-commit hooks to latest version.
          delete-branch: true

dynaconf

dynaconf is a package dedicated to project configuration management, following the 12factor config principle.

Configuration example

# Configuration layout
tree server/configs/
server/configs/
├── __init__.py
├── config.py
├── default_settings.yaml
└── settings.yaml

0 directories, 4 files
# default_settings.yaml
# The configuration below is loaded first as the defaults,
# then "configs/settings.yaml" is read on top of it.

# environment settings for the default configuration
default:
  base: "@jinja {{this.current_env | lower}}"
  logging: "debug" # defaults to debug; python log levels: debug < info < warning < error < critical; messages below the configured level are not emitted
  open_gpu_mode: true # whether to schedule AI models in GPU mode
  debug: true # whether to start the service in hot-reload mode
  port: 5000 # exposed http service port
  protocol: "application/x-protobuf-json" # http protocol

  # default environment settings for the global data cache directories
  global_dir:
    # folder name for global data
    cache_data: "cache_data"
    # folder name for cached dwg files under the global data path, defaults to cache_data/dwg_file_cache
    dwg_file_cache: "dwg_file_cache"
    # folder name for cached intermediate results under the global data path, defaults to cache_data/mid_data_cache
    mid_data_cache: "mid_data_cache"
    # folder name for cached result data under the global data path, defaults to cache_data/result_data_cache
    result_data_cache: "result_data_cache"
    # folder name for cached object-detection data under the global data path, defaults to cache_data/detection_data_cache
    detection_data_cache: "detection_data_cache"
    # folder name for cached message data under the global data path, defaults to cache_data/message_data_cache
    message_data_cache: "message_data_cache"
    # folder name for cached original data under the global data path, defaults to cache_data/origin_data_cache
    origin_data_cache: "origin_data_cache"
    # log directory under the global data path, defaults to .log
    log_dir: ".log"
    # global log name, defaults to .log/service.log
    log_name: "service"
    # project log path, default: cache_data/log_data_cache
    log_data_cache: "log_data_cache"
    # project log path, default: cache_data/cpp_parameters_cache
    cpp_parameters_cache: "cpp_parameters_cache"
    # project log path, default: cache_data/backup_dara_cache
    backup_dara_cache: "backup_dara_cache"

# settings.yaml
# Values below override those loaded from "configs/default_settings.yaml"

config:
  dynaconf_merge: true
  base: "production"
# config.py
import json
import pathlib
import sys

from dynaconf import Dynaconf, Validator

CONFIG_PATH = pathlib.Path(__file__).resolve().parent

# Create a Dynaconf instance to manage settings
settings = Dynaconf(
    # Switch between environments using the "LAUNCH_ENV" environment variable
    env_switcher="LAUNCH_ENV",
    # List of configuration files to load
    settings_files=[
        CONFIG_PATH / "default_settings.yaml",  # Default configuration
        CONFIG_PATH / "settings.yaml",  # Main configuration
    ],
    # Enable layered environments (development, production, default)
    environments=True,
    # Prefix for environment variables (e.g., "CONF_FOO=bar" becomes settings.foo == "bar")
    envvar_prefix="CONF",
    lowercase_read=True,
)

# Set the environment to "config"
settings.setenv("config")

# Register validators to ensure required settings exist
settings.validators.register(
    Validator("base", must_exist=True),
    Validator("logging", must_exist=True),
    Validator("open_gpu_mode", must_exist=True),
    Validator("port", must_exist=True),
    Validator("protocol", must_exist=True),
    Validator("global_dir", must_exist=True),
)

# Convert settings to a JSON string with custom formatting options
final_settings = json.dumps(
    settings.to_dict(),
    indent=4,
    ensure_ascii=False,
    sort_keys=False,
    separators=(",", ":"),
)

# Write the final settings to a JSON file named ".settings.json"
with open(".settings.json", "w") as f:
    f.write(final_settings)

# Disable debug mode when running in a debugger
if sys.gettrace() is not None:
    settings.debug = False

Once configured, inspect the effective settings for the active environment:

# list the effective settings environment
dynaconf -i server.configs.config.settings list
Working in config environment
BASE<str> 'production'
LOGGING<str> 'debug'
OPEN_GPU_MODE<bool> True
DEBUG<bool> True
PORT<int> 5000
PROTOCOL<str> 'application/x-protobuf-json'
GLOBAL_DIR<dict> {'backup_dara_cache': 'backup_dara_cache',
                  'cache_data': 'cache_data',
                  'cpp_parameters_cache': 'cpp_parameters_cache',
                  'detection_data_cache': 'detection_data_cache',
                  'dwg_file_cache': 'dwg_file_cache',
                  'log_data_cache': 'log_data_cache',
                  'log_dir': '.log',
                  'log_name': 'service',
                  'message_data_cache': 'message_data_cache',
                  'mid_data_cache': 'mid_data_cache',
                  'origin_data_cache': 'origin_data_cache',
                  'result_data_cache': 'result_data_cache'}

loguru

loguru is a package that makes Python logging much more convenient.

links:

Install

pip install loguru

Usage

import loguru

loguru.logger.add("file.log", format="{time} {level} {message}", level="DEBUG")

loguru.logger.debug("This is a debug message")
loguru.logger.info("This is an info message")
loguru.logger.warning("This is a warning message")
loguru.logger.error("This is an error message")

pydantic

links:

pyscript

links:

gitpython

links: