package autograd
Implements reverse mode automatic differentiation
The main types in this package are lamp.autograd.Variable and lamp.autograd.Op. The computational graph built by this package consists of vertices representing values (as lamp.autograd.Variable) and vertices representing operations (as lamp.autograd.Op).
Variables contain the value of a Rn => Rm function. Variables may also contain the partial derivative of a single scalar with respect to their value. A Variable whose value is a scalar (m=1) can trigger the computation of partial derivatives of all the intermediate upstream Variables. Starting the backward pass from a non-scalar Variable is not supported.
A constant Variable may be created with the const or param factory method in this package. const may be used for constants which do not need their partial derivatives to be computed. param, on the other hand, creates Variables which will fill in their partial derivatives. Further variables may be created by the methods in this class, eventually expressing more complex Rn => Rm functions.
Example
lamp.Scope.root{ implicit scope =>
  // x is constant (depends on no other variables) and won't compute a partial derivative
  val x = lamp.autograd.const(STen.eye(3, STenOptions.d))
  // y is constant but will compute a partial derivative
  val y = lamp.autograd.param(STen.ones(List(3,3), STenOptions.d))

  // z is a Variable with x and y dependencies
  val z = x+y

  // w is a Variable with z as a direct and x, y as transitive dependencies
  val w = z.sum
  // w is a scalar (number of elements is 1), thus we can call backprop() on it.
  // calling backprop will fill out the partial derivatives of the upstream variables
  w.backprop()

  // partialDerivative is empty since we created `x` with `const`
  assert(x.partialDerivative.isEmpty)

  // `y`'s partial derivative is defined and is computed
  // it holds the partial derivative of `w`, the scalar which we called backprop() on, with respect to `y`
  assert(y.partialDerivative.isDefined)
}
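The gradient that the example fills in can be worked out by hand. This is a sketch of the arithmetic only, assuming `sum` adds up all elements of `z`; the indices i and j are introduced here for illustration.

```latex
w = \sum_{i=1}^{3}\sum_{j=1}^{3} z_{ij}
  = \sum_{i=1}^{3}\sum_{j=1}^{3} \left( x_{ij} + y_{ij} \right),
\qquad
\frac{\partial w}{\partial y_{ij}} = 1 \quad \text{for all } i, j .
```

Under that assumption `y.partialDerivative` should hold a 3 x 3 tensor of ones, and `w` itself evaluates to 3 + 9 = 12.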
This package may be used to compute the derivative of any function, provided the function can be composed out of the provided methods. A particular use case is gradient based optimization.
- See also
https://arxiv.org/pdf/1811.05031.pdf for a review of the algorithm
lamp.autograd.Op for how to implement a new operation
Type Members
- case class Add(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
- case class ArcTan(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class ArgMax(scope: Scope, a: Variable, dim: Long, keepDim: Boolean) extends Op with Product with Serializable
- case class Assign(scope: Scope, abandon: Variable, keep: Variable) extends Op with Product with Serializable
- case class AvgPool2D(scope: Scope, input: Variable, kernelSize: Long, stride: Long, padding: Long) extends Op with Product with Serializable
2D avg pooling
- input
batch x in_channels x h x w
- case class BatchNorm(scope: Scope, input: Variable, weight: Variable, bias: Variable, runningMean: STen, runningVar: STen, training: Boolean, momentum: Double, eps: Double) extends Op with Product with Serializable
- case class BatchNorm2D(scope: Scope, input: Variable, weight: Variable, bias: Variable, runningMean: STen, runningVar: STen, training: Boolean, momentum: Double, eps: Double) extends Op with Product with Serializable
Batch Norm 2D. The 0-th dimension holds the samples, the 1-st dimension holds the features, everything else is averaged out.
- case class BatchedMatMul(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
- case class BinaryCrossEntropyWithLogitsLoss(scope: Scope, input: Variable, target: STen, posWeights: Option[STen], reduction: Reduction) extends Op with Product with Serializable
input: (N,T) where T>=1 are multiple independent tasks
target: same shape as input, float with values in [0,1]
posWeights: shape (T)
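For orientation, the per-element loss can be written out as below. This is a sketch that assumes the op follows the usual libtorch definition of binary cross entropy with logits; the symbols x (input), y (target), p (posWeights) and the indices n, t are notation introduced here, not taken from this page.

```latex
\ell_{n,t} = -\left[ \, p_t \, y_{n,t} \, \log \sigma(x_{n,t})
           + \left(1 - y_{n,t}\right) \log\left(1 - \sigma(x_{n,t})\right) \right],
\qquad
\sigma(x) = \frac{1}{1 + e^{-x}}
```

With `posWeights` empty, p_t is taken as 1. The `reduction` argument then decides whether the elements are returned as they are, summed, or averaged.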
- case class CappedShiftedNegativeExponential(scope: Scope, a: Variable, shift: Double) extends Op with Product with Serializable
- case class CastToPrecision(scope: Scope, a: Variable, precision: FloatingPointPrecision) extends Op with Product with Serializable
- case class Cholesky(scope: Scope, input: Variable) extends Op with Product with Serializable
- case class CholeskySolve(scope: Scope, b: Variable, factor: Variable, upper: Boolean) extends Op with Product with Serializable
- case class Concatenate(scope: Scope, a: Seq[Variable], dim: Long) extends Op with Product with Serializable
- case class ConstAdd(scope: Scope, a: Variable, b: Double) extends Op with Product with Serializable
- case class ConstMult(scope: Scope, a: Variable, b: Double) extends Op with Product with Serializable
- sealed trait Constant extends Variable
A variable whose parent is empty
- case class ConstantWithGrad(value: STen, pd: STen) extends Constant with Product with Serializable
A variable whose parent is empty but whose partial derivative is defined
- case class ConstantWithoutGrad(value: STen) extends Constant with Product with Serializable
A variable whose parent and partial derivatives are empty
- case class Convolution(scope: Scope, input: Variable, weight: Variable, bias: Variable, stride: Array[Long], padding: Array[Long], dilation: Array[Long], transposed: Boolean, outputPadding: Array[Long], groups: Long) extends Op with Product with Serializable
1D/2D/3D convolution
- input
batch x in_channels x height x width
- weight
out_channels x in_channels x kernel_size x kernel_size
- bias
out_channels
- returns
Variable with Tensor of size batch x out_channels x L' (length depends on stride/padding/dilation)
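How the output length L' depends on stride, padding and dilation can be sketched with a small helper. The formula is the usual libtorch convention for a non-transposed convolution, which this op is assumed to follow; `ConvShape` and `outLength` are hypothetical names and are not part of lamp.

```scala
// Hypothetical helper, not part of lamp: output length along one spatial
// dimension of a non-transposed convolution (libtorch convention assumed).
object ConvShape {
  def outLength(
      in: Long,
      kernel: Long,
      stride: Long = 1,
      padding: Long = 0,
      dilation: Long = 1
  ): Long = {
    // Padded input length minus the extent covered by the dilated kernel
    val numerator = in + 2 * padding - dilation * (kernel - 1) - 1
    require(numerator >= 0, "kernel does not fit into the padded input")
    // Integer division floors because the numerator is non-negative
    numerator / stride + 1
  }
}
```

For example, `ConvShape.outLength(in = 32, kernel = 3, padding = 1)` keeps the length at 32, while a stride of 2 without padding shortens it to 15.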
- case class Cos(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class Cross(scope: Scope, a: Variable, b: Variable, dim: Int) extends Op with Product with Serializable
- case class Debug(scope: Scope, a: Variable, callback: (STen, Boolean, Boolean) => Unit) extends Op with Product with Serializable
- case class Diag(scope: Scope, a: Variable, diagonal: Long) extends Op with Product with Serializable
- case class Div(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
- case class Dropout(scope: Scope, a: Variable, prob: Double, train: Boolean) extends Op with Product with Serializable
- case class ElementWiseMaximum(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
- case class ElementWiseMinimum(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
- case class Embedding(scope: Scope, input: Variable, weight: Variable) extends Op with Product with Serializable
- case class EqWhere(scope: Scope, a: Variable, b: Long) extends Op with Product with Serializable
- case class EuclideanDistance(scope: Scope, a: Variable, b: Variable, dim: Int) extends Op with Product with Serializable
- case class Exp(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class Expand(scope: Scope, a: Variable, shape: List[Long]) extends Op with Product with Serializable
- case class ExpandAs(scope: Scope, a: Variable, as: STen) extends Op with Product with Serializable
- case class Flatten(scope: Scope, input: Variable, startDim: Int, endDim: Int) extends Op with Product with Serializable
- case class Gelu(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class GraphMemoryAllocationReport(parameterTensorCount: Long, parameterTensorStorage: Long, constantTensorCount: Long, constantTensorStorage: Long, intermediateTensorCount: Long, intermediateTensorStorage: Long) extends Product with Serializable
- case class HardSwish(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class IndexAdd(scope: Scope, src: Variable, index: Variable, dim: Int, maxIndex: Long) extends Op with Product with Serializable
- case class IndexAddToTarget(scope: Scope, target: Variable, src: Variable, index: Variable, dim: Int) extends Op with Product with Serializable
- case class IndexFill(scope: Scope, input: Variable, dim: Long, index: Variable, fill: Double) extends Op with Product with Serializable
- case class IndexSelect(scope: Scope, input: Variable, dim: Long, index: Variable) extends Op with Product with Serializable
- case class Inv(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class LayerNormOp(scope: Scope, input: Variable, weight: Option[Variable], bias: Option[Variable], normalizedShape: List[Long], eps: Double) extends Op with Product with Serializable
- case class LeakyRelu(scope: Scope, a: Variable, slope: Double) extends Op with Product with Serializable
- case class Log(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class Log1p(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class LogDet(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class LogSoftMax(scope: Scope, a: Variable, dim: Int) extends Op with Product with Serializable
- case class MaskFill(scope: Scope, input: Variable, mask: Variable, fill: Double) extends Op with Product with Serializable
- case class MaskSelect(scope: Scope, input: Variable, mask: Variable) extends Op with Product with Serializable
- case class MatMul(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
- case class MaxPool1D(scope: Scope, input: Variable, kernelSize: Long, stride: Long = 1, padding: Long = 0, dilation: Long = 1) extends Op with Product with Serializable
1D max pooling
- input
batch x in_channels x L
- case class MaxPool2D(scope: Scope, input: Variable, kernelSize: Long, stride: Long, padding: Long, dilation: Long) extends Op with Product with Serializable
2D max pooling
- input
batch x in_channels x h x w
- case class Mean(scope: Scope, a: Variable, dim: List[Int], keepDim: Boolean) extends Op with Product with Serializable
- case class Minus(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
- case class MseLoss(scope: Scope, input: Variable, target: STen, reduction: Reduction) extends Op with Product with Serializable
- case class Mult(scope: Scope, a: Variable, b: Variable) extends Op with Product with Serializable
- case class NllLoss(scope: Scope, input: Variable, target: STen, weights: STen, reduction: Reduction, ignore: Long) extends Op with Product with Serializable
- case class Norm2(scope: Scope, a: Variable, dim: List[Int], keepDim: Boolean) extends Op with Product with Serializable
- case class OneHot(scope: Scope, a: Variable, numClasses: Int) extends Op with Product with Serializable
- trait Op extends AnyRef
Represents an operation in the computational graph
Short outline of reverse autograd from scalar values
y = f1 o f2 o .. o fn
One of these subexpressions (f_i) has value w2 and arguments w1. We can write dy/dw1 = dy/dw2 * dw2/dw1. dw2/dw1 is the Jacobian of f_i at the current value of w1. dy/dw2 is the Jacobian of y with respect to w2 at the current value of w2.
The current values of w1 and w2 are computed in a forward pass. The value dy/dy is 1 and from this dy/dw2 is recursed in the backward pass. The Jacobian function of dw2/dw1 is computed symbolically and hard coded.
The anonymous function which Ops must implement is dy/dw2 => dy/dw2 * dw2/dw1. The argument of that function (dy/dw2) is coming down from the backward pass. The Op must implement dy/dw2 * dw2/dw1.
The shape of dy/dw2 is the shape of the value of the operation (w2). The shape of dy/dw2 * dw2/dw1 is the shape of the parameter variable with respect to which the derivative is taken, i.e. w1 since we are computing dy/dw1.
How to implement an operation
// Each concrete realization of the operation corresponds to an instance of an Op
// The Op instance holds handles to the input variables (here a, b), to be used in the backward pass
// The forward pass is effectively done in the constructor of the Op
// The backward pass is triggered and orchestrated by [[lamp.autograd.Variable.backward]]
case class Mult(scope: Scope, a: Variable, b: Variable) extends Op {

  // List all parameters which support partial derivatives, here both a and b
  val params = List(
    // partial derivative of the first argument
    a.zipBackward { (p, out) =>
      // p is the incoming partial derivative, out is where the result is accumulated into
      // Intermediate tensors are released due to the enclosing Scope.root
      Scope.root { implicit scope => out += (p * b.value).unbroadcast(a.sizes) }
    },
    // partial derivative of the second argument ..
    b.zipBackward { (p, out) =>
      Scope.root { implicit scope => out += (p * a.value).unbroadcast(b.sizes) }
    }
  )

  // The value of this operation, i.e. the forward pass
  val value = Variable(this, a.value.*(b.value)(scope))(scope)
}
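The outline above can be tried out on plain scalars without any tensor library. The following is a minimal sketch, not lamp's implementation: every name in it (`ToyAutograd`, `Node`, `Leaf`, `Plus`, `Times`, `Pullback`) is hypothetical. Each node stores, per input, the function dy/dw2 => dy/dw2 * dw2/dw1, and `backprop` seeds dy/dy = 1 and pushes partial derivatives from outputs to inputs.

```scala
// Toy scalar reverse-mode autograd. All names are hypothetical, not part of lamp.
object ToyAutograd {
  // Maps the incoming dy/dw2 to this input's contribution dy/dw2 * dw2/dw1
  type Pullback = Double => Double

  sealed trait Node {
    def value: Double                       // computed in the forward pass
    var grad: Double = 0.0                  // accumulates dy/d(this)
    def parents: List[(Node, Pullback)]     // inputs with their pullbacks
  }

  final class Leaf(val value: Double) extends Node {
    val parents: List[(Node, Pullback)] = Nil
  }

  final class Plus(a: Node, b: Node) extends Node {
    val value: Double = a.value + b.value
    val parents: List[(Node, Pullback)] =
      List((a, (p: Double) => p), (b, (p: Double) => p))
  }

  final class Times(a: Node, b: Node) extends Node {
    val value: Double = a.value * b.value
    val parents: List[(Node, Pullback)] =
      List((a, (p: Double) => p * b.value), (b, (p: Double) => p * a.value))
  }

  def backprop(y: Node): Unit = {
    val visited = scala.collection.mutable.Set.empty[Node]
    val order = scala.collection.mutable.ListBuffer.empty[Node]
    // Depth-first post-order: every node is appended after all of its inputs
    def visit(n: Node): Unit =
      if (visited.add(n)) {
        n.parents.foreach { case (in, _) => visit(in) }
        order += n
      }
    visit(y)
    y.grad = 1.0 // dy/dy = 1
    // Reversed order: a node's grad is complete before it is pushed to its inputs
    order.reverseIterator.foreach { n =>
      n.parents.foreach { case (in, pullback) => in.grad += pullback(n.grad) }
    }
  }
}
```

With `a = new Leaf(2.0)`, `b = new Leaf(3.0)` and `y = new Plus(new Times(a, b), a)`, calling `ToyAutograd.backprop(y)` leaves `a.grad` at b + 1 = 4 and `b.grad` at a = 2. The nodes are plain classes rather than case classes so that the visited set compares them by identity.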
- case class PInv(scope: Scope, a: Variable, rcond: Double) extends Op with Product with Serializable
- case class Pow(scope: Scope, a: Variable, exponent: Variable) extends Op with Product with Serializable
- case class PowConst(scope: Scope, a: Variable, exponent: Double) extends Op with Product with Serializable
- sealed trait Reduction extends AnyRef
- case class Relu(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class RepeatInterleave(scope: Scope, self: Variable, repeats: Variable, dim: Int) extends Op with Product with Serializable
- case class Reshape(scope: Scope, a: Variable, shape: Array[Long]) extends Op with Product with Serializable
- case class ScaledDotProductAttention(scope: Scope, query: Variable, key: Variable, valueIn: Variable, isCausal: Boolean) extends Op with Product with Serializable
- case class ScatterAdd(scope: Scope, src: Variable, index: Variable, dim: Int, maxIndex: Long) extends Op with Product with Serializable
- case class Select(scope: Scope, a: Variable, dim: Long, index: Long) extends Op with Product with Serializable
- case class Sigmoid(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class Sin(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class Slice(scope: Scope, a: Variable, dim: Long, start: Long, end: Long, step: Long) extends Op with Product with Serializable
- case class SmoothL1Loss(scope: Scope, input: Variable, target: STen, reduction: Reduction, beta: Double) extends Op with Product with Serializable
- case class Softplus(scope: Scope, a: Variable, beta: Double, threshold: Double) extends Op with Product with Serializable
- case class SparseFromValueAndIndex(scope: Scope, values: Variable, indices: STen, dim: Seq[Long]) extends Op with Product with Serializable
- case class SquaredFrobeniusMatrixNorm(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class Stack(scope: Scope, a: Seq[Variable], dim: Long) extends Op with Product with Serializable
- case class Sum(scope: Scope, a: Variable, dim: List[Int], keepDim: Boolean) extends Op with Product with Serializable
- case class Tan(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class Tanh(scope: Scope, a: Variable) extends Op with Product with Serializable
- case class ToDense(scope: Scope, sparse: Variable) extends Op with Product with Serializable
- case class Transpose(scope: Scope, a: Variable, dim1: Int = 0, dim2: Int = 1) extends Op with Product with Serializable
- sealed trait Variable extends AnyRef
A value of a tensor valued function, a vertex in the computational graph.
A Variable may be constant, i.e. depends on no other Variables. Constant variables may or may not need their partial derivatives computed.
- case class VariableNonConstant(op1: Op, value: STen, pd: STen) extends Variable with Product with Serializable
A variable whose parent and partial derivative are both non-empty
- case class Variance(scope: Scope, a: Variable, dim: List[Int]) extends Op with Product with Serializable
- case class View(scope: Scope, a: Variable, shape: Array[Long]) extends Op with Product with Serializable
- case class WeightNorm(scope: Scope, v: Variable, g: Variable, dim: Long) extends Op with Product with Serializable
- case class Where(scope: Scope, condition: STen, trueBranch: Variable, falseBranch: Variable) extends Op with Product with Serializable
Value Members
- def const(m: Double, tOpt: STenOptions = STenOptions.d)(implicit scope: Scope): Constant
- def const(m: STen): Constant
- def param(m: Double, tOpt: STenOptions = STenOptions.d)(implicit scope: Scope): ConstantWithGrad
- def param(m: STen)(implicit scope: Scope): ConstantWithGrad
- object Autograd
- object Constant
- case object Mean extends Reduction with Product with Serializable
- case object NoReduction extends Reduction with Product with Serializable
- case object Sum extends Reduction with Product with Serializable
- object Variable
- object VariableNonConstant extends Serializable