PyTorch: Custom Network Layer Examples

Updated: 2020-01-07 13:36:13 Author: xholes


Custom Autograd Functions

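For a shallow network, we can write the forward and backward passes by hand. As the network grows, especially in deep learning, its structure becomes complex, and so do the forward and backward passes, so writing them by hand becomes very difficult. Fortunately, PyTorch ships an automatic differentiation package, autograd, that solves this problem. With autograd, the forward pass of the network defines a computational graph: the nodes are tensors, and the edges are the functions that transform one tensor into another. With the graph in place, computing the gradients of tensors becomes easy. For example, if x is a tensor with x.requires_grad = True, then after a backward pass x.grad is another tensor that holds the gradient of some scalar value with respect to x.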
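The behavior of requires_grad and x.grad described above can be sketched with a tiny graph. This is a minimal illustration; the tensor values and the function y = sum(x^2) are chosen only for the example.

```python
import torch

# x is a leaf tensor whose operations are tracked by autograd
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# The forward pass builds the graph: y = sum(x^2), a scalar
y = (x ** 2).sum()

# The backward pass fills x.grad with dy/dx = 2 * x
y.backward()

print(x.grad)     # tensor([2., 4., 6.])
print(y.grad_fn)  # the graph node that produced y
```

Here y.grad_fn is the node that autograd walks back through when y.backward() is called.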

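At the lowest level, each primitive autograd operator is really two functions that operate on tensors. The forward function computes output tensors from input tensors; the backward function receives the gradient of the output tensors with respect to some scalar and computes the gradient of the input tensors with respect to that same scalar. In PyTorch we can easily define our own autograd operator by subclassing torch.autograd.Function and implementing the forward and backward functions.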

forward(): the forward pass. It can take any number of arguments, and they may be arbitrary Python objects.

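backward(): the backward pass (the gradient formula). The number of gradients returned must match the number of inputs to forward, and they must be returned in the same order.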

import torch
from torch.autograd import Function

# Inherit from Function
class LinearFunction(Function):

  # Note that both forward and backward are @staticmethods
  @staticmethod
  # bias is an optional argument
  def forward(ctx, input, weight, bias=None):
    # ctx plays a role similar to self here; whatever is stored on ctx is available in backward
    ctx.save_for_backward(input, weight, bias)
    output = input.mm(weight.t())
    if bias is not None:
      output += bias.unsqueeze(0).expand_as(output)
    return output

  # This function has only a single output, so it gets only one gradient
  @staticmethod
  def backward(ctx, grad_output):
    # This is a pattern that is very convenient - at the top of backward
    # unpack saved_tensors and initialize all gradients w.r.t. inputs to
    # None. Thanks to the fact that additional trailing Nones are
    # ignored, the return statement is simple even when the function has
    # optional inputs.
    input, weight, bias = ctx.saved_tensors
    grad_input = grad_weight = grad_bias = None

    # These needs_input_grad checks are optional and there only to
    # improve efficiency. If you want to make your code simpler, you can
    # skip them. Returning gradients for inputs that don't require it is
    # not an error.
    if ctx.needs_input_grad[0]:
      grad_input = grad_output.mm(weight)
    if ctx.needs_input_grad[1]:
      grad_weight = grad_output.t().mm(input)
    if bias is not None and ctx.needs_input_grad[2]:
      grad_bias = grad_output.sum(0)

    return grad_input, grad_weight, grad_bias

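# Calling the custom autograd function
linear = LinearFunction.apply(*args)  # forward pass; args is (input, weight) or (input, weight, bias)
linear.sum().backward()  # backward pass; backward() with no arguments needs a scalar, so reduce first
# Alternatively, linear.grad_fn.apply(grad_output) runs the backward step directly,
# where grad_output is the gradient with respect to linear (not the forward args)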
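A hand-written backward is easy to get wrong, so it is worth checking it numerically. The sketch below restates a compact version of LinearFunction so that it runs on its own, then compares its analytic gradients with finite differences using torch.autograd.gradcheck. The tensor shapes and tolerances are assumptions made for the example.

```python
import torch
from torch.autograd import Function, gradcheck

class LinearFunction(Function):
    @staticmethod
    def forward(ctx, input, weight, bias=None):
        ctx.save_for_backward(input, weight, bias)
        output = input.mm(weight.t())
        if bias is not None:
            output = output + bias.unsqueeze(0).expand_as(output)
        return output

    @staticmethod
    def backward(ctx, grad_output):
        input, weight, bias = ctx.saved_tensors
        grad_input = grad_output.mm(weight)
        grad_weight = grad_output.t().mm(input)
        grad_bias = grad_output.sum(0) if bias is not None else None
        # One gradient per forward input, in the same order
        return grad_input, grad_weight, grad_bias

# gradcheck needs double precision inputs that require gradients
inputs = (
    torch.randn(4, 3, dtype=torch.double, requires_grad=True),  # input
    torch.randn(2, 3, dtype=torch.double, requires_grad=True),  # weight
    torch.randn(2, dtype=torch.double, requires_grad=True),     # bias
)
ok = gradcheck(LinearFunction.apply, inputs, eps=1e-6, atol=1e-4)
print(ok)
```

gradcheck returns True when the analytic and numerical gradients agree within the given tolerance, and raises an error otherwise.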

For a non-parameterized operation (the weight is a constant and does not need to be updated), the Function can be defined as:

class MulConstant(Function):
  @staticmethod
  def forward(ctx, tensor, constant):
    # ctx is a context object that can be used to stash information
    # for backward computation
    ctx.constant = constant
    return tensor * constant

  @staticmethod
  def backward(ctx, grad_output):
    # We return as many input gradients as there were arguments.
    # Gradients of non-Tensor arguments to forward must be None.
    return grad_output * ctx.constant, None
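As a usage sketch, the following repeats MulConstant so that the block is self-contained and applies it to a small tensor. The input values and the constant 5.0 are arbitrary choices for illustration.

```python
import torch
from torch.autograd import Function

class MulConstant(Function):
    @staticmethod
    def forward(ctx, tensor, constant):
        # constant is a plain Python number, so it is stored on ctx directly
        ctx.constant = constant
        return tensor * constant

    @staticmethod
    def backward(ctx, grad_output):
        # Gradient for the tensor, None for the non-tensor constant
        return grad_output * ctx.constant, None

x = torch.ones(3, requires_grad=True)
y = MulConstant.apply(x, 5.0)
y.sum().backward()
print(x.grad)  # tensor([5., 5., 5.])
```

Each element of x receives the gradient 5.0, because d(5x)/dx = 5.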

Higher-Order Derivatives

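# create_graph=True keeps the graph of the first derivative so that it can be differentiated again
grad_x = torch.autograd.grad(y, x, create_graph=True)

grad_grad_x = torch.autograd.grad(grad_x[0], x)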
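A small worked example makes the two calls concrete. This sketch assumes y = x^3 evaluated at x = 2, so the first derivative is 3x^2 = 12 and the second derivative is 6x = 12.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 3

# First derivative; create_graph=True makes the result differentiable
(grad_x,) = torch.autograd.grad(y, x, create_graph=True)

# Second derivative, obtained by differentiating the first derivative
(grad_grad_x,) = torch.autograd.grad(grad_x, x)

print(grad_x.item(), grad_grad_x.item())  # 12.0 12.0
```

Without create_graph=True, the first derivative would not be part of a graph, and the second call would fail.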

Custom Modules

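Computational graphs and autograd are a very convenient way to define complex networks and compute gradients, but for large networks they are still a bit too low-level. When building a network, we often want to organize the computation into layers (with parameters updated layer by layer). Other deep learning frameworks such as TensorFlow provide higher-level abstractions for this. PyTorch provides a similar package, nn, which defines a set of Modules that are roughly equivalent to layers. A Module takes input tensors and computes output tensors, and it can also hold learnable parameters.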

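Sometimes we want to use a new Module that does not exist in the nn package, so we have to define our own. A custom Module must subclass nn.Module and define a forward function, which takes input tensors and produces output tensors using other modules or other autograd operations. There is no need to override backward, because nn relies on autograd. This also means that every operation used in a custom Module's forward must have a corresponding autograd function whose backward can be called.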

import torch
import torch.nn as nn

class Linear(nn.Module):
  def __init__(self, input_features, output_features, bias=True):
    super(Linear, self).__init__()
    self.input_features = input_features
    self.output_features = output_features

    # nn.Parameter is a special kind of Tensor, that will get
    # automatically registered as Module's parameter once it's assigned
    # as an attribute. Parameters and buffers need to be registered, or
    # they won't appear in .parameters() (doesn't apply to buffers), and
    # won't be converted when e.g. .cuda() is called. You can use
    # .register_buffer() to register buffers.
    # (Very important: parameters must require gradients!) nn.Parameters require gradients by default.
    self.weight = nn.Parameter(torch.Tensor(output_features, input_features))
    if bias:
      self.bias = nn.Parameter(torch.Tensor(output_features))
    else:
      # You should always register all possible parameters, but the
      # optional ones can be None if you want.
      self.register_parameter('bias', None)

    # Not a very smart way to initialize weights
    self.weight.data.uniform_(-0.1, 0.1)
    # Check self.bias rather than the boolean bias flag, which is never None
    if self.bias is not None:
      self.bias.data.uniform_(-0.1, 0.1)

  def forward(self, input):
    # See the autograd section for explanation of what happens here.
    return LinearFunction.apply(input, self.weight, self.bias)

  def extra_repr(self):
    # (Optional)Set the extra information about this module. You can test
    # it by printing an object of this class.
    return 'in_features={}, out_features={}, bias={}'.format(
      self.input_features, self.output_features, self.bias is not None
    )
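When a layer can be written entirely with built-in tensor operations, no custom Function is needed at all, because autograd derives the backward pass. The following sketch shows this with a hypothetical ScaledTanh module; the name, the parameter scale, and the shapes are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

class ScaledTanh(nn.Module):
    """Hypothetical layer: a learnable per-feature scale applied to tanh(input)."""

    def __init__(self, features):
        super().__init__()
        # Assigning an nn.Parameter as an attribute registers it with the Module
        self.scale = nn.Parameter(torch.ones(features))

    def forward(self, input):
        # Only built-in ops are used, so autograd supplies the backward pass
        return self.scale * torch.tanh(input)

layer = ScaledTanh(3)
names = [name for name, _ in layer.named_parameters()]

out = layer(torch.full((2, 3), 0.5))
out.sum().backward()

print(names)                   # ['scale']
print(layer.scale.grad.shape)  # torch.Size([3])
```

The parameter appears in named_parameters() and receives a gradient even though the class defines no backward.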

Similarities and Differences Between Function and Module

Both Function and Module can be used to extend PyTorch so that it meets the needs of a network, but there are some very important differences between them:

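A Function usually defines a single operation. Because it cannot hold parameters, it is suited to operations such as activation functions and pooling. A Module holds parameters, so it is suited to defining a layer, such as a linear or convolutional layer, and also to defining a whole network.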

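A Function requires you to define forward and backward as static methods and to write the gradient formula yourself (the old-style API also required __init__). A Module only needs __init__ and forward; its backward is handled by autograd.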

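Loosely speaking, a Module is composed of a series of Functions. During forward, the Functions and tensors (Variables in older PyTorch versions) form the computational graph; during backward, calling each Function's backward yields the result, so a Module does not need to define backward itself.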

A Module contains not only Functions but also the corresponding parameters, as well as other functions and variables, which a Function does not have.

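Module is the basic way PyTorch organizes neural networks. A Module holds the model's parameters and its computation logic. A Function carries out the actual operation and defines the forward and backward computation.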

Module is the base class for all neural networks; every model in PyTorch must be a subclass of Module. Modules can be nested to form a tree: a Module nests other Modules by holding them as attributes.
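The tree structure can be inspected with named_modules(). In the sketch below, the classes Block and Net and their layer sizes are hypothetical and exist only to show the nesting.

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, features):
        super().__init__()
        self.fc = nn.Linear(features, features)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.fc(x))

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Assigning Modules as attributes registers them as children
        self.block1 = Block(4)
        self.block2 = Block(4)
        self.head = nn.Linear(4, 1)

    def forward(self, x):
        return self.head(self.block2(self.block1(x)))

net = Net()
names = [name for name, _ in net.named_modules()]
n_params = sum(p.numel() for p in net.parameters())

print(names)
# ['', 'block1', 'block1.fc', 'block1.act', 'block2', 'block2.fc', 'block2.act', 'head']
print(n_params)  # 45
```

The empty name is the root module itself. The parameter count is 2 * (16 + 4) for the two 4x4 linear layers plus 4 + 1 for the head.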

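Function is the core class of PyTorch's autograd mechanism. A Function has no parameters, that is, it is stateless: it only takes inputs and returns the corresponding outputs. In the backward direction, it takes the gradients with respect to its outputs and returns the gradients with respect to its inputs.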

When loss.backward() is called, it is the backward() functions defined in the Function subclasses that are executed.
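One way to observe this is to record a message inside a custom backward. The sketch below uses a hypothetical Square function and a module-level list named calls; both are illustrative assumptions.

```python
import torch
from torch.autograd import Function

calls = []  # records when the custom backward runs

class Square(Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        calls.append("Square.backward")
        (x,) = ctx.saved_tensors
        # d(x^2)/dx = 2x, chained with the incoming gradient
        return grad_output * 2 * x

x = torch.tensor([1.0, -3.0], requires_grad=True)
loss = Square.apply(x).sum()

calls_before = list(calls)  # still empty: only the forward pass has run
loss.backward()

print(calls_before)  # []
print(calls)         # ['Square.backward']
print(x.grad)        # tensor([ 2., -6.])
```

The list is empty until loss.backward() is called, at which point autograd invokes Square.backward once.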

