扩展 PyTorch

原文： https://pytorch.org/docs/stable/notes/extending.html

在本说明中，我们将介绍扩展 torch.nn ， torch.autograd 以及使用我们的 C 库编写自定义 C 扩展的方法。

扩展 `torch.autograd`

向 autograd 添加操作需要为每个操作实现一个新的 Function 子类。回想一下 Function 是 autograd 用于计算结果和梯度并编码操作历史的工具。每个新功能都需要您实现 2 种方法：

forward() -执行该操作的代码。如果指定默认值，它可以根据需要选择任意数量的参数，其中一些参数是可选的。此处接受各种 Python 对象。跟踪历史记录的Tensor自变量(即使用requires_grad=True的自变量）将被转换为在调用之前不跟踪历史的自变量，并且它们的使用将被记录在图形中。请注意，此逻辑不会遍历列表/字典/任何其他数据结构，而只会考虑直接作为调用参数的Tensor。您可以返回单个Tensor输出，或者如果有多个输出，则返回Tensor的tuple。另外，请参考 Function 的文档，以找到仅可从 forward() 调用的有用方法的描述。
backward() -梯度公式。将给与Tensor参数一样多的参数，每个参数代表梯度 w.r.t。该输出。它应该返回与输入一样多的Tensor，每个输入都包含梯度 w.r.t。其相应的输入。如果您的输入不需要梯度(needs_input_grad是布尔值的元组，指示每个输入是否需要梯度计算），或者是非Tensor对象，则可以返回python:None。另外，如果您为 forward() 设置了可选参数，则返回的梯度可能比输入的梯度多，只要它们都是python:None。

注意

用户有责任正确使用前向 <cite>ctx</cite> 中的特殊功能，以确保新的 Function 与Autograd Engine一起正常工作。

保存正向输入或输出以供稍后在向后使用时，必须使用 save_for_backward() 。
mark_dirty() 必须用于标记任何由正向功能修改的输入。
mark_non_differentiable() 必须用于告知发动机输出是否不可微。

您可以在下面找到 torch.nn 中Linear功能的代码，并附带以下注释：

# Inherit from Function
class LinearFunction(Function):

    # Note that both forward and backward are @staticmethods
    @staticmethod
    # bias is an optional argument
    def forward(ctx, input, weight, bias=None):
        ctx.save_for_backward(input, weight, bias)
        output = input.mm(weight.t())
        if bias is not None:
            output += bias.unsqueeze(0).expand_as(output)
        return output

    # This function has only a single output, so it gets only one gradient
    @staticmethod
    def backward(ctx, grad_output):
        # This is a pattern that is very convenient - at the top of backward
        # unpack saved_tensors and initialize all gradients w.r.t. inputs to
        # None. Thanks to the fact that additional trailing Nones are
        # ignored, the return statement is simple even when the function has
        # optional inputs.
        input, weight, bias = ctx.saved_tensors
        grad_input = grad_weight = grad_bias = None

        # These needs_input_grad checks are optional and there only to
        # improve efficiency. If you want to make your code simpler, you can
        # skip them. Returning gradients for inputs that don't require it is
        # not an error.
        if ctx.needs_input_grad[0]:
            grad_input = grad_output.mm(weight)
        if ctx.needs_input_grad[1]:
            grad_weight = grad_output.t().mm(input)
        if bias is not None and ctx.needs_input_grad[2]:
            grad_bias = grad_output.sum(0)

        return grad_input, grad_weight, grad_bias

现在，为了使使用这些自定义操作更容易，我们建议为它们的apply方法加上别名：

linear = LinearFunction.apply

在这里，我们给出了一个由非 Tensor 参数参数化的函数的附加示例：

class MulConstant(Function):
    @staticmethod
    def forward(ctx, tensor, constant):
        # ctx is a context object that can be used to stash information
        # for backward computation
        ctx.constant = constant
        return tensor * constant

    @staticmethod
    def backward(ctx, grad_output):
        # We return as many input gradients as there were arguments.
        # Gradients of non-Tensor arguments to forward must be None.
        return grad_output * ctx.constant, None

Note

backward的输入，即grad_output，也可以是跟踪历史的张量。因此，如果backward通过可区分的操作实现(例如，调用另一个自定义function），则高阶导数将起作用。

您可能想检查实现的向后方法是否实际计算了函数的派生类。通过与使用较小有限差分的数值近似进行比较，可以实现：

from torch.autograd import gradcheck

# gradcheck takes a tuple of tensors as input, check if your gradient
# evaluated with these tensors are close enough to numerical
# approximations and returns True if they all verify this condition.
input = (torch.randn(20,20,dtype=torch.double,requires_grad=True), torch.randn(30,20,dtype=torch.double,requires_grad=True))
test = gradcheck(linear, input, eps=1e-6, atol=1e-4)
print(test)

有关有限差分梯度比较的更多详细信息，请参见数字梯度检查。

扩展 `torch.nn`

nn 导出两种接口-模块及其功能版本。您可以通过两种方式对其进行扩展，但是我们建议对所有层使用模块，这些模块可容纳任何参数或缓冲区，并建议使用功能形式的无参数操作，例如激活函数，缓冲池等。

上面的部分已经完全介绍了添加操作的功能版本。

添加 `Module`

由于 nn 大量利用了 autograd ，因此添加新的 Module 需要实现执行该操作的 Function 并可以计算梯度。从现在开始，假设我们要实现Linear模块，并且已经实现了如上清单所示的功能。只需很少的代码即可添加。现在，需要实现两个功能：

__init__(可选）-接受参数，例如内核大小，功能数量等，并初始化参数和缓冲区。
forward() -实例化 Function 并使用它执行操作。它与上面显示的功能包装非常相似。

这是可以实现Linear模块的方式：

class Linear(nn.Module):
    def __init__(self, input_features, output_features, bias=True):
        super(Linear, self).__init__()
        self.input_features = input_features
        self.output_features = output_features

        # nn.Parameter is a special kind of Tensor, that will get
        # automatically registered as Module's parameter once it's assigned
        # as an attribute. Parameters and buffers need to be registered, or
        # they won't appear in .parameters() (doesn't apply to buffers), and
        # won't be converted when e.g. .cuda() is called. You can use
        # .register_buffer() to register buffers.
        # nn.Parameters require gradients by default.
        self.weight = nn.Parameter(torch.Tensor(output_features, input_features))
        if bias:
            self.bias = nn.Parameter(torch.Tensor(output_features))
        else:
            # You should always register all possible parameters, but the
            # optional ones can be None if you want.
            self.register_parameter('bias', None)

        # Not a very smart way to initialize weights
        self.weight.data.uniform_(-0.1, 0.1)
        if bias is not None:
            self.bias.data.uniform_(-0.1, 0.1)

    def forward(self, input):
        # See the autograd section for explanation of what happens here.
        return LinearFunction.apply(input, self.weight, self.bias)

    def extra_repr(self):
        # (Optional)Set the extra information about this module. You can test
        # it by printing an object of this class.
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )

编写自定义 C ++扩展

有关详细说明和示例，请参见 PyTorch 教程。

可在 torch.utils.cpp_extension 上找到文档。

编写自定义 C 扩展

可以在此 GitHub 存储库上找到示例。

我们一直在努力

apachecn/AiLearning

扩展 PyTorch

扩展 torch.autograd

扩展 torch.nn

添加 Module

编写自定义 C ++扩展

编写自定义 C 扩展

扩展 `torch.autograd`

扩展 `torch.nn`

添加 `Module`