package autograd

Implements reverse mode automatic differentiation

The main types in this package are lamp.autograd.Variable and lamp.autograd.Op. The computational graph built by this package consists of vertices representing values (as lamp.autograd.Variable) and vertices representing operations (as lamp.autograd.Op).

Variables contain the value of a Rn => Rm function. A Variable may also hold the partial derivative of a single scalar with respect to its own value. A Variable whose value is a scalar (m=1) can trigger the computation of the partial derivatives of all the intermediate upstream Variables. Computing partial derivatives of non-scalar Variables is not supported.

A constant Variable may be created with the const or param factory methods in this package. const may be used for constants which do not need their partial derivatives computed. param, on the other hand, creates Variables which will fill in their partial derivatives. Further Variables may be created by the methods in this class, eventually expressing more complex Rn => Rm functions.

Example
lamp.Scope.root{ implicit scope =>
  // x is constant (depends on no other variables) and won't compute a partial derivative
  val x = lamp.autograd.const(STen.eye(3, STenOptions.d))
  // y is constant but will compute a partial derivative
  val y = lamp.autograd.param(STen.ones(List(3,3), STenOptions.d))

  // z is a Variable with x and y dependencies
  val z = x+y

  // w is a Variable with z as a direct and x, y as transitive dependencies
  val w = z.sum
  // w is a scalar (number of elements is 1), thus we can call backprop() on it.
  // calling backprop will fill out the partial derivatives of the upstream variables
  w.backprop()

  // partialDerivative is empty since we created `x` with `const`
  assert(x.partialDerivative.isEmpty)

  // `y`'s partial derivative is defined and is computed
  // it holds the partial derivative of `w`, the scalar on which we called backprop(), with respect to `y`
  assert(y.partialDerivative.isDefined)

}
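
The derivative in the example can also be worked out by hand. The following is a short derivation from the definitions of `x`, `y`, `z` and `w` in the example; it is derived from the formulas rather than from running the library, so treat the expected content of `y.partialDerivative` as an expectation, not as recorded output.

```latex
z_{ij} = x_{ij} + y_{ij}, \qquad
w = \sum_{i=1}^{3} \sum_{j=1}^{3} z_{ij}
  = \sum_{i=1}^{3} \sum_{j=1}^{3} \left( x_{ij} + y_{ij} \right)

\frac{\partial w}{\partial y_{ij}}
  = \frac{\partial w}{\partial z_{ij}} \cdot \frac{\partial z_{ij}}{\partial y_{ij}}
  = 1 \cdot 1 = 1
  \quad \text{for all } i, j
```

Under this derivation the partial derivative of `w` with respect to `y` has the same 3 x 3 shape as `y`, and every element equals 1.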

This package may be used to compute the derivative of any function, provided the function can be composed out of the provided methods. A particular use case is gradient based optimization.

See also

https://arxiv.org/pdf/1811.05031.pdf for a review of the algorithm

lamp.autograd.Op for how to implement a new operation

Linear Supertypes
AnyRef, Any

Type Members

  1. case class Add(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
  2. case class ArcTan(scope: Scope, a: Variable) extends Op with Product with Serializable
  3. case class ArgMax(scope: Scope, a: Variable, dim: Long, keepDim: Boolean) extends Op with Product with Serializable
  4. case class Assign(scope: Scope, abandon: Variable, keep: Variable) extends Op with Product with Serializable
  5. case class AvgPool2D(scope: Scope, input: Variable, kernelSize: Long, stride: Long, padding: Long) extends Op with Product with Serializable

    2D avg pooling

    input

    batch x in_channels x h x w

  6. case class BatchNorm(scope: Scope, input: Variable, weight: Variable, bias: Variable, runningMean: STen, runningVar: STen, training: Boolean, momentum: Double, eps: Double) extends Op with Product with Serializable
  7. case class BatchNorm2D(scope: Scope, input: Variable, weight: Variable, bias: Variable, runningMean: STen, runningVar: STen, training: Boolean, momentum: Double, eps: Double) extends Op with Product with Serializable

    Batch Norm 2D. The 0-th dimension holds the samples and the 1-st dimension holds the features; everything else is averaged out.

  8. case class BatchedMatMul(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
  9. case class BinaryCrossEntropyWithLogitsLoss(scope: Scope, input: Variable, target: STen, posWeights: Option[STen], reduction: Reduction) extends Op with Product with Serializable

    input: (N,T), where the T>=1 columns are multiple independent tasks
    target: same shape as input, float with values in [0,1]
    posWeights: shape (T)

  10. case class CappedShiftedNegativeExponential(scope: Scope, a: Variable, shift: Double) extends Op with Product with Serializable
  11. case class CastToPrecision(scope: Scope, a: Variable, precision: FloatingPointPrecision) extends Op with Product with Serializable
  12. case class Cholesky(scope: Scope, input: Variable) extends Op with Product with Serializable
  13. case class CholeskySolve(scope: Scope, b: Variable, factor: Variable, upper: Boolean) extends Op with Product with Serializable
  14. case class Concatenate(scope: Scope, a: Seq[Variable], dim: Long) extends Op with Product with Serializable
  15. case class ConstAdd(scope: Scope, a: Variable, b: Double) extends Op with Product with Serializable
  16. case class ConstMult(scope: Scope, a: Variable, b: Double) extends Op with Product with Serializable
  17. sealed trait Constant extends Variable

    A variable whose parent is empty

  18. case class ConstantWithGrad(value: STen, pd: STen) extends Constant with Product with Serializable

    A variable whose parent is empty but whose partial derivative is defined

  19. case class ConstantWithoutGrad(value: STen) extends Constant with Product with Serializable

    A variable whose parent and partial derivatives are empty

  20. case class Convolution(scope: Scope, input: Variable, weight: Variable, bias: Variable, stride: Array[Long], padding: Array[Long], dilation: Array[Long], transposed: Boolean, outputPadding: Array[Long], groups: Long) extends Op with Product with Serializable

    1D/2D/3D convolution

    input

    batch x in_channels x height x width

    weight

    out_channels x in_channels x kernel_size x kernel_size

    bias

    out_channels

    returns

    Variable with Tensor of size batch x out_channels x L' (length depends on stride/padding/dilation)
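
    The dependence of the output length on stride, padding and dilation can be made concrete. The following is a minimal sketch, assuming the output-size convention documented for PyTorch's non-transposed convolutions; this page does not state that lamp follows it, so treat that as an assumption. `ConvShape` and `outputLength` are hypothetical names for illustration and are not part of lamp.

    ```scala
    object ConvShape {

      /** Output length along one spatial dimension of a non-transposed convolution.
        * Assumed convention: floor((L + 2*padding - dilation*(kernel-1) - 1) / stride) + 1
        */
      def outputLength(
          input: Long,
          kernel: Long,
          stride: Long = 1,
          padding: Long = 0,
          dilation: Long = 1
      ): Long = {
        require(input > 0 && kernel > 0 && stride > 0 && dilation > 0 && padding >= 0)
        // Span covered by the kernel once the gaps introduced by dilation are included
        val effectiveKernel = dilation * (kernel - 1) + 1
        require(input + 2 * padding >= effectiveKernel, "kernel does not fit into the padded input")
        // The numerator is non-negative, so integer division is the floor
        (input + 2 * padding - effectiveKernel) / stride + 1
      }
    }
    ```

    For example, under this convention a kernel of size 3 with padding 1 and stride 1 preserves the input length, while stride 2 roughly halves it.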

  21. case class Cos(scope: Scope, a: Variable) extends Op with Product with Serializable
  22. case class Cross(scope: Scope, a: Variable, b: Variable, dim: Int) extends Op with Product with Serializable
  23. case class Debug(scope: Scope, a: Variable, callback: (STen, Boolean, Boolean) => Unit) extends Op with Product with Serializable
  24. case class Diag(scope: Scope, a: Variable, diagonal: Long) extends Op with Product with Serializable
  25. case class Div(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
  26. case class Dropout(scope: Scope, a: Variable, prob: Double, train: Boolean) extends Op with Product with Serializable
  27. case class ElementWiseMaximum(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
  28. case class ElementWiseMinimum(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
  29. case class Embedding(scope: Scope, input: Variable, weight: Variable) extends Op with Product with Serializable
  30. case class EqWhere(scope: Scope, a: Variable, b: Long) extends Op with Product with Serializable
  31. case class EuclideanDistance(scope: Scope, a: Variable, b: Variable, dim: Int) extends Op with Product with Serializable
  32. case class Exp(scope: Scope, a: Variable) extends Op with Product with Serializable
  33. case class Expand(scope: Scope, a: Variable, shape: List[Long]) extends Op with Product with Serializable
  34. case class ExpandAs(scope: Scope, a: Variable, as: STen) extends Op with Product with Serializable
  35. case class Flatten(scope: Scope, input: Variable, startDim: Int, endDim: Int) extends Op with Product with Serializable
  36. case class Gelu(scope: Scope, a: Variable) extends Op with Product with Serializable
  37. case class GraphMemoryAllocationReport(parameterTensorCount: Long, parameterTensorStorage: Long, constantTensorCount: Long, constantTensorStorage: Long, intermediateTensorCount: Long, intermediateTensorStorage: Long) extends Product with Serializable
  38. case class HardSwish(scope: Scope, a: Variable) extends Op with Product with Serializable
  39. case class IndexAdd(scope: Scope, src: Variable, index: Variable, dim: Int, maxIndex: Long) extends Op with Product with Serializable
  40. case class IndexAddToTarget(scope: Scope, target: Variable, src: Variable, index: Variable, dim: Int) extends Op with Product with Serializable
  41. case class IndexFill(scope: Scope, input: Variable, dim: Long, index: Variable, fill: Double) extends Op with Product with Serializable
  42. case class IndexSelect(scope: Scope, input: Variable, dim: Long, index: Variable) extends Op with Product with Serializable
  43. case class Inv(scope: Scope, a: Variable) extends Op with Product with Serializable
  44. case class LayerNormOp(scope: Scope, input: Variable, weight: Variable, bias: Variable, normalizedShape: List[Long], eps: Double) extends Op with Product with Serializable
  45. case class LeakyRelu(scope: Scope, a: Variable, slope: Double) extends Op with Product with Serializable
  46. case class Log(scope: Scope, a: Variable) extends Op with Product with Serializable
  47. case class Log1p(scope: Scope, a: Variable) extends Op with Product with Serializable
  48. case class LogDet(scope: Scope, a: Variable) extends Op with Product with Serializable
  49. case class LogSoftMax(scope: Scope, a: Variable, dim: Int) extends Op with Product with Serializable
  50. case class MaskFill(scope: Scope, input: Variable, mask: Variable, fill: Double) extends Op with Product with Serializable
  51. case class MaskSelect(scope: Scope, input: Variable, mask: Variable) extends Op with Product with Serializable
  52. case class MatMul(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
  53. case class MaxPool1D(scope: Scope, input: Variable, kernelSize: Long, stride: Long = 1, padding: Long = 0, dilation: Long = 1) extends Op with Product with Serializable

    1D max pooling

    input

    batch x in_channels x L

  54. case class MaxPool2D(scope: Scope, input: Variable, kernelSize: Long, stride: Long, padding: Long, dilation: Long) extends Op with Product with Serializable

    2D max pooling

    input

    batch x in_channels x h x w

  55. case class Mean(scope: Scope, a: Variable, dim: List[Int], keepDim: Boolean) extends Op with Product with Serializable
  56. case class Minus(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
  57. case class MseLoss(scope: Scope, input: Variable, target: STen, reduction: Reduction) extends Op with Product with Serializable
  58. case class Mult(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
  59. case class NllLoss(scope: Scope, input: Variable, target: STen, weights: STen, reduction: Reduction, ignore: Long) extends Op with Product with Serializable
  60. case class Norm2(scope: Scope, a: Variable, dim: List[Int], keepDim: Boolean) extends Op with Product with Serializable
  61. case class OneHot(scope: Scope, a: Variable, numClasses: Int) extends Op with Product with Serializable
  62. trait Op extends AnyRef

    Represents an operation in the computational graph

    Short outline of reverse autograd from scalar values

    y = f1 o f2 o .. o fn

    One of these subexpressions (f_i) has value w2 and arguments w1. We can write dy/dw1 = dy/dw2 * dw2/dw1. dw2/dw1 is the Jacobian of f_i at the current value of w1. dy/dw2 is the Jacobian of y with respect to w2 at the current value of w2.

    The current values of w1 and w2 are computed in a forward pass. The value dy/dy is 1, and starting from it dy/dw2 is computed recursively in the backward pass. The Jacobian function dw2/dw1 is derived symbolically and hard-coded.

    The anonymous function which Ops must implement is dy/dw2 => dy/dw2 * dw2/dw1. Its argument (dy/dw2) comes down from the backward pass, and the Op must compute dy/dw2 * dw2/dw1 from it.

    The shape of dy/dw2 is the shape of the value of the operation (w2). The shape of dy/dw2 * dw2/dw1 is the shape of the parameter variable with respect to which the derivative is taken, i.e. w1, since we are computing dy/dw1.

    How to implement an operation
    // Each concrete realization of the operation corresponds to an instance of an Op
    // The Op instance holds handles to the input variables (here a, b), to be used in the backward pass
    // The forward pass is effectively done in the constructor of the Op
    // The backward pass is triggered and orchestrated by [[lamp.autograd.Variable.backward]]
    case class Mult(scope: Scope, a: Variable, b: Variable) extends Op {

      // List all parameters which support partial derivatives, here both a and b
      val params = List(
        // partial derivative of the first argument
        a.zipBackward { (p, out) =>
          // p is the incoming partial derivative, out is where the result is accumulated into
          // Intermediate tensors are released due to the enclosing Scope.root
          Scope.root { implicit scope => out += (p * b.value).unbroadcast(a.sizes) }
        },
        // partial derivative of the second argument
        b.zipBackward { (p, out) =>
          Scope.root { implicit scope => out += (p * a.value).unbroadcast(b.sizes) }
        }
      )

      // The value of this operation, i.e. the forward pass
      val value = Variable(this, a.value.*(b.value)(scope))(scope)

    }
    See also

    https://en.wikipedia.org/wiki/Automatic_differentiation#Reverse_accumulation

    http://www.cs.cmu.edu/~wcohen/10-605/notes/autodiff.pdf
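
    The outline above can be illustrated without any tensor machinery. The following is a minimal, self-contained sketch of reverse mode differentiation on scalars; it does not use lamp and is not how lamp is implemented. `TinyAutograd`, `Node`, `leaf`, `add`, `mult` and `backprop` are hypothetical names chosen for this illustration. Each operation stores, per input, the function dy/dw2 => dy/dw2 * dw2/dw1 described above.

    ```scala
    object TinyAutograd {

      // A vertex of the graph: a value, plus for each input the function
      // dy/dw2 => dy/dw2 * dw2/dw1 that pushes the derivative back to that input.
      final class Node(val value: Double, val inputs: List[(Node, Double => Double)]) {
        // Accumulated dy/d(this); filled in by backprop
        var grad: Double = 0.0
      }

      // A vertex without inputs
      def leaf(v: Double): Node = new Node(v, Nil)

      // w2 = a + b, so dw2/da = 1 and dw2/db = 1
      def add(a: Node, b: Node): Node =
        new Node(a.value + b.value, List((a, (p: Double) => p), (b, (p: Double) => p)))

      // w2 = a * b, so dw2/da = b and dw2/db = a
      def mult(a: Node, b: Node): Node =
        new Node(
          a.value * b.value,
          List((a, (p: Double) => p * b.value), (b, (p: Double) => p * a.value))
        )

      // Backward pass: seed dy/dy = 1, then visit every vertex after all of its consumers
      def backprop(y: Node): Unit = {
        val visited = scala.collection.mutable.Set.empty[Node]
        var order = List.empty[Node]
        // Depth-first post-order; prepending yields an order from the output towards the leaves
        def visit(n: Node): Unit =
          if (visited.add(n)) {
            n.inputs.foreach { case (in, _) => visit(in) }
            order = n :: order
          }
        visit(y)
        y.grad = 1.0
        order.foreach { n =>
          n.inputs.foreach { case (in, back) => in.grad += back(n.grad) }
        }
      }
    }
    ```

    As a check, for z = (x + y) * x the derivatives are dz/dx = 2x + y and dz/dy = x. Building `mult(add(x, y), x)` from `leaf(2.0)` and `leaf(3.0)` and calling `backprop` on it therefore leaves 7.0 in `x.grad` and 2.0 in `y.grad`. Accumulating with `+=` matters here because `x` feeds into the result along two paths.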

  63. case class PInv(scope: Scope, a: Variable, rcond: Double) extends Op with Product with Serializable
  64. case class Pow(scope: Scope, a: Variable, exponent: Variable) extends Op with Product with Serializable
  65. case class PowConst(scope: Scope, a: Variable, exponent: Double) extends Op with Product with Serializable
  66. sealed trait Reduction extends AnyRef
  67. case class Relu(scope: Scope, a: Variable) extends Op with Product with Serializable
  68. case class RepeatInterleave(scope: Scope, self: Variable, repeats: Variable, dim: Int) extends Op with Product with Serializable
  69. case class Reshape(scope: Scope, a: Variable, shape: Array[Long]) extends Op with Product with Serializable
  70. case class ScaledDotProductAttention(scope: Scope, query: Variable, key: Variable, valueIn: Variable, isCausal: Boolean) extends Op with Product with Serializable
  71. case class ScatterAdd(scope: Scope, src: Variable, index: Variable, dim: Int, maxIndex: Long) extends Op with Product with Serializable
  72. case class Select(scope: Scope, a: Variable, dim: Long, index: Long) extends Op with Product with Serializable
  73. case class Sigmoid(scope: Scope, a: Variable) extends Op with Product with Serializable
  74. case class Sin(scope: Scope, a: Variable) extends Op with Product with Serializable
  75. case class Slice(scope: Scope, a: Variable, dim: Long, start: Long, end: Long, step: Long) extends Op with Product with Serializable
  76. case class SmoothL1Loss(scope: Scope, input: Variable, target: STen, reduction: Reduction, beta: Double) extends Op with Product with Serializable
  77. case class Softplus(scope: Scope, a: Variable, beta: Double, threshold: Double) extends Op with Product with Serializable
  78. case class SparseFromValueAndIndex(scope: Scope, values: Variable, indices: STen, dim: Seq[Long]) extends Op with Product with Serializable
  79. case class SquaredFrobeniusMatrixNorm(scope: Scope, a: Variable) extends Op with Product with Serializable
  80. case class Stack(scope: Scope, a: Seq[Variable], dim: Long) extends Op with Product with Serializable
  81. case class Sum(scope: Scope, a: Variable, dim: List[Int], keepDim: Boolean) extends Op with Product with Serializable
  82. case class Tan(scope: Scope, a: Variable) extends Op with Product with Serializable
  83. case class Tanh(scope: Scope, a: Variable) extends Op with Product with Serializable
  84. case class ToDense(scope: Scope, sparse: Variable) extends Op with Product with Serializable
  85. case class Transpose(scope: Scope, a: Variable, dim1: Int = 0, dim2: Int = 1) extends Op with Product with Serializable
  86. sealed trait Variable extends AnyRef

    A value of a tensor valued function, a vertex in the computational graph.

    A Variable may be constant, i.e. depends on no other Variables. Constant variables may or may not need their partial derivatives computed.

  87. case class VariableNonConstant(op1: Op, value: STen, pd: STen) extends Variable with Product with Serializable

    A variable whose parent and partial derivative are both non-empty

  88. case class Variance(scope: Scope, a: Variable, dim: List[Int]) extends Op with Product with Serializable
  89. case class View(scope: Scope, a: Variable, shape: Array[Long]) extends Op with Product with Serializable
  90. case class WeightNorm(scope: Scope, v: Variable, g: Variable, dim: Long) extends Op with Product with Serializable
  91. case class Where(scope: Scope, condition: STen, trueBranch: Variable, falseBranch: Variable) extends Op with Product with Serializable

Value Members

  1. def const(m: Double, tOpt: STenOptions = STenOptions.d)(implicit scope: Scope): Constant
  2. def const(m: STen): Constant
  3. def param(m: Double, tOpt: STenOptions = STenOptions.d)(implicit scope: Scope): ConstantWithGrad
  4. def param(m: STen)(implicit scope: Scope): ConstantWithGrad
  5. object Autograd
  6. object Constant
  7. case object Mean extends Reduction with Product with Serializable
  8. case object NoReduction extends Reduction with Product with Serializable
  9. case object Sum extends Reduction with Product with Serializable
  10. object Variable
  11. object VariableNonConstant extends Serializable
