Packages

package nn

Provides building blocks for neural networks

Notable types:

  • nn.GenericModule, the base type of modules (nn.Module is an alias for Variable => Variable modules)
  • nn.Optimizer, the base type of optimizers

Optimizers:

  • nn.AdamW
  • nn.RAdam
  • nn.SGDW
  • nn.Shampoo
  • nn.Yogi

Modules facilitating composing other modules (sketched in code below):

  • nn.Sequential composes a homogeneous list of modules (analogous to List)
  • nn.sequence composes a heterogeneous list of modules (analogous to tuples)
  • nn.EitherModule composes two modules in a scala.Either

Examples of neural network building blocks, layers etc.:

  • nn.Linear, nn.Conv1D, nn.Conv2D, nn.BatchNorm, nn.Dropout, nn.Embedding
  • recurrent modules: nn.RNN, nn.GRU, nn.LSTM
  • attention and transformer modules: nn.MultiheadAttention, nn.TransformerEncoder, nn.TransformerDecoder
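
For orientation, a minimal sketch of both composition styles. This assumes lamp's Fun constructor and the relu/tanh ops on Variable, and that the sequence object exposes a two-argument apply returning a Seq2:

  import lamp.autograd.Variable
  import lamp.nn._

  // Two simple Variable => Variable modules built from Fun:
  val f1 = Fun { implicit scope => (v: Variable) => v.relu }
  val f2 = Fun { implicit scope => (v: Variable) => v.tanh }

  val homogeneous   = Sequential(f1, f2) // all members share one module type
  val heterogeneous = sequence(f1, f2)   // member types may differ; yields a Seq2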

Package Members

  1. package bert
  2. package graph
  3. package languagemodel

Type Members

  1. case class AdamW(parameters: Seq[(STen, PTag)], weightDecay: OptimizerHyperparameter, learningRate: OptimizerHyperparameter = simple(0.001), beta1: OptimizerHyperparameter = simple(0.9), beta2: OptimizerHyperparameter = simple(0.999), eps: Double = 1e-8, clip0: Option[Double] = None, debias: Boolean = true, mixedPrecision: Boolean = false) extends Optimizer with Product with Serializable

    See also

    https://arxiv.org/pdf/1711.05101.pdf Algorithm 2
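
    A scalar sketch of the decoupled update in Algorithm 2 of the paper above (the class itself operates on STen tensors and takes OptimizerHyperparameter values):

    // Illustrative one-step AdamW update for a single scalar parameter.
    case class State(m: Double = 0d, v: Double = 0d, t: Int = 0)

    def step(
        theta: Double, grad: Double, s: State,
        lr: Double = 0.001, beta1: Double = 0.9, beta2: Double = 0.999,
        eps: Double = 1e-8, weightDecay: Double = 0.01
    ): (Double, State) = {
      val t = s.t + 1
      val m = beta1 * s.m + (1 - beta1) * grad        // first moment estimate
      val v = beta2 * s.v + (1 - beta2) * grad * grad // second moment estimate
      val mHat = m / (1 - math.pow(beta1, t))         // bias correction (debias = true)
      val vHat = v / (1 - math.pow(beta2, t))
      // The weight decay term is decoupled from the adaptive gradient step:
      val next = theta - lr * mHat / (math.sqrt(vHat) + eps) - lr * weightDecay * theta
      (next, State(m, v, t))
    }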

  2. case class AdversarialTraining(eps: Double) extends LossCalculation[Variable] with Product with Serializable
  3. case class BatchNorm(weight: Constant, bias: Constant, runningMean: Constant, runningVar: Constant, training: Boolean, momentum: Double, eps: Double, forceTrain: Boolean, forceEval: Boolean, evalIfBatchSizeIsOne: Boolean) extends Module with Product with Serializable
  4. case class BatchNorm2D(weight: Constant, bias: Constant, runningMean: Constant, runningVar: Constant, training: Boolean, momentum: Double, eps: Double) extends Module with Product with Serializable
  5. case class Conv1D(weights: Constant, bias: Constant, stride: Long, padding: Long, dilation: Long, groups: Long) extends Module with Product with Serializable
  6. case class Conv2D(weights: Constant, bias: Constant, stride: Long, padding: Long, dilation: Long, groups: Long) extends Module with Product with Serializable
  7. case class Conv2DTransposed(weights: Constant, bias: Constant, stride: Long, padding: Long, dilation: Long) extends Module with Product with Serializable
  8. case class Debug(fun: (STen, Boolean, Boolean) => Unit) extends Module with Product with Serializable
  9. case class DependentHyperparameter(default: Double)(pf: PartialFunction[PTag, Double]) extends OptimizerHyperparameter with Product with Serializable
  10. case class Dropout(prob: Double, training: Boolean) extends Module with Product with Serializable
  11. case class EitherModule[A, B, M1 <: GenericModule[A, B], M2 <: GenericModule[A, B]](members: Either[M1 with GenericModule[A, B], M2 with GenericModule[A, B]]) extends GenericModule[A, B] with Product with Serializable
  12. case class Embedding(weights: Constant) extends Module with Product with Serializable

    Learnable mapping from classes to dense vectors. Equivalent to L * W, where L is the n x C one-hot encoded matrix of the classes, * is matrix multiplication, and W is the C x dim dense matrix. W is learnable; L is never computed directly. C is the number of classes and n is the size of the batch.

    Input is a long tensor with values in [0,C-1]. Input shape is arbitrary, (*). Output shape is (* x D) where D is the embedding dimension.
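
    For illustration, the lookup is equivalent to the one-hot product without ever materializing L (plain Scala, C = 3 classes, dim = 2):

    val W = Array(
      Array(0.1, 0.2), // class 0
      Array(0.3, 0.4), // class 1
      Array(0.5, 0.6)  // class 2
    )
    val input  = Array(2L, 0L)              // n = 2 class indices in [0, C-1]
    val output = input.map(i => W(i.toInt)) // n x dim, same as (one-hot L) * W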

  13. case class FreeRunningRNN[T, M <: StatefulModule[Variable, Variable, T]](module: M with StatefulModule[Variable, Variable, T], timeSteps: Int) extends StatefulModule[Variable, Variable, T] with Product with Serializable

    Wraps a (sequence x batch) long -> (sequence x batch x dim) double stateful module and runs it in greedy (argmax) generation mode over timeSteps steps.
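
    An illustrative greedy decoding loop in plain Scala (a dummy next-token scorer stands in for the wrapped module):

    // Greedy (argmax) generation: each output token is fed back as the next input.
    def nextLogits(token: Int): Array[Double] =
      Array.tabulate(5)(i => if (i == (token + 1) % 5) 1.0 else 0.0)
    def argmax(xs: Array[Double]): Int = xs.indexOf(xs.max)
    val timeSteps = 4
    val generated =
      Iterator.iterate(0)(t => argmax(nextLogits(t))).take(timeSteps + 1).toVector
    // Vector(0, 1, 2, 3, 4)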

  14. case class Fun(fun: (Scope) => (Variable) => Variable) extends Module with Product with Serializable
  15. case class GRU(weightXh: Constant, weightHh: Constant, weightXr: Constant, weightXz: Constant, weightHr: Constant, weightHz: Constant, biasR: Constant, biasZ: Constant, biasH: Constant) extends StatefulModule[Variable, Variable, Option[Variable]] with Product with Serializable

    Inputs of size (sequence length * batch * in dim). Outputs of size (sequence length * batch * hidden dim).

  16. case class GenericFun[A, B](fun: (Scope) => (A) => B) extends GenericModule[A, B] with Product with Serializable
  17. trait GenericModule[A, B] extends AnyRef

    Base type of modules

    Modules are functions of type (Seq[lamp.autograd.Constant], A) => B, where the Seq[lamp.autograd.Constant] arguments are optimizable parameters and A is a non-optimizable input.

    Modules provide a way to build composite functions while also keeping track of the parameter list of the composite function.

    Example
    import lamp._
    import lamp.autograd._
    import lamp.nn._

    case object Weights extends LeafTag
    case object Bias extends LeafTag
    case class Linear(weights: Constant, bias: Option[Constant]) extends Module {

      // Expose the optimizable parameters, each tagged for identification:
      override val state = List(
        weights -> Weights
      ) ++ bias.toList.map(b => (b, Bias))

      def forward[S: Sc](x: Variable): Variable = {
        val v = x.mm(weights)
        bias.map(_ + v).getOrElse(v)
      }
    }

    Some other attributes of modules are attached by type classes, e.g. the nn.TrainingMode and nn.Load type classes.

    Type parameters

    A: the argument type of the module
    B: the value type of the module

    See also

    nn.Module is an alias for simple Variable => Variable modules

  18. trait InitState[M, C] extends AnyRef

    Type class describing how to initialize recurrent neural networks

  19. implicit class InitStateSyntax[M, C] extends AnyRef
  20. case class LSTM(weightXi: Constant, weightXf: Constant, weightXo: Constant, weightHi: Constant, weightHf: Constant, weightHo: Constant, weightXc: Constant, weightHc: Constant, biasI: Constant, biasF: Constant, biasO: Constant, biasC: Constant) extends StatefulModule[Variable, Variable, Option[(Variable, Variable)]] with Product with Serializable

    Inputs of size (sequence length * batch * vocab). Outputs of size (sequence length * batch * output dim).

  21. case class LayerNorm(scale: Constant, bias: Constant, eps: Double, normalizedShape: List[Long]) extends Module with Product with Serializable
  22. trait LeafTag extends PTag
  23. trait LearningRateSchedule[State] extends AnyRef
  24. case class LiftedModule[M <: Module](mod: M with Module) extends StatefulModule[Variable, Variable, Unit] with Product with Serializable
  25. case class Linear(weights: Constant, bias: Option[Constant]) extends Module with Product with Serializable
  26. trait Load[M] extends AnyRef

    Type class describing how to load the contents of the state of modules from external tensors

  27. implicit class LoadSyntax[M] extends AnyRef
  28. trait LossCalculation[I] extends AnyRef

    Loss and gradient calculation

    Takes the samples, target, module, and loss function, and computes the loss and the gradients.

  29. trait LossFunction extends AnyRef
  30. case class MappedState[A, B, C, D, M <: StatefulModule[A, B, C]](statefulModule: M with StatefulModule[A, B, C], map: (C) => D) extends StatefulModule2[A, B, C, D] with Product with Serializable
  31. case class ModelWithOptimizer[I, M <: GenericModule[I, Variable]](model: SupervisedModel[I, M], optimizer: Optimizer) extends Product with Serializable
  32. type Module = GenericModule[Variable, Variable]
  33. case class MultiheadAttention(wQ: Constant, wK: Constant, wV: Constant, wO: Constant, dropout: Double, train: Boolean, numHeads: Int, linearized: Boolean, causalMask: Boolean) extends GenericModule[(Variable, Variable, Variable, Option[STen]), Variable] with Product with Serializable

    Multi-head scaled dot product attention module

    Input: (query, key, value, maxLength) where

    • query: batch x num queries x query dim
    • key: batch x num k-v x key dim
    • value: batch x num k-v x value dim
    • maxLength: 1D or 2D long tensor for attention masking
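
    For orientation, a plain-Scala sketch of single-head scaled dot product attention, softmax(Q K^T / sqrt(dk)) V, without masking, dropout, or the learned projections wQ/wK/wV/wO:

    def attention(
        q: Array[Array[Double]], // num queries x dk
        k: Array[Array[Double]], // num k-v x dk
        v: Array[Array[Double]]  // num k-v x dv
    ): Array[Array[Double]] = {
      val dk = k.head.length.toDouble
      // scores: num queries x num k-v, scaled by sqrt(key dim)
      val scores = q.map(qi =>
        k.map(kj => qi.zip(kj).map { case (a, b) => a * b }.sum / math.sqrt(dk))
      )
      // row-wise softmax (numerically stabilized by subtracting the row max)
      val weights = scores.map { row =>
        val m = row.max
        val e = row.map(x => math.exp(x - m))
        val s = e.sum
        e.map(_ / s)
      }
      // weighted sum of values: num queries x dv
      weights.map(w => v.transpose.map(col => w.zip(col).map { case (a, b) => a * b }.sum))
    }
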
  34. trait Optimizer extends AnyRef
  35. trait OptimizerHyperparameter extends AnyRef
  36. trait PTag extends AnyRef

    A small trait to mark parameters for unique identification

  37. class PerturbedLossCalculation[I] extends LossCalculation[I]

    Evaluates the gradient at the current point + eps, where eps is I * N(0, noiseLevel)

  38. case class RAdam(parameters: Seq[(STen, PTag)], weightDecay: OptimizerHyperparameter, learningRate: OptimizerHyperparameter = simple(0.001), beta1: OptimizerHyperparameter = simple(0.9), beta2: OptimizerHyperparameter = simple(0.999), eps: Double = 1e-8, clip0: Option[Double] = None) extends Optimizer with Product with Serializable

    Rectified Adam optimizer algorithm

  39. case class RNN(weightXh: Constant, weightHh: Constant, biasH: Constant) extends StatefulModule[Variable, Variable, Option[Variable]] with Product with Serializable

    Inputs of size (sequence length * batch * in dim). Outputs of size (sequence length * batch * hidden dim).

  40. case class Recursive[A, M <: GenericModule[A, A]](member: M with GenericModule[A, A], n: Int) extends GenericModule[A, A] with Product with Serializable
  41. case class ResidualModule[M <: Module](transform: M with Module) extends Module with Product with Serializable
  42. case class SGDW(parameters: Seq[(STen, PTag)], learningRate: OptimizerHyperparameter, weightDecay: OptimizerHyperparameter, momentum: Option[OptimizerHyperparameter] = None, clip0: Option[Double] = None) extends Optimizer with Product with Serializable
  43. case class Seq2[T1, T2, T3, M1 <: GenericModule[T1, T2], M2 <: GenericModule[T2, T3]](m1: M1 with GenericModule[T1, T2], m2: M2 with GenericModule[T2, T3]) extends GenericModule[T1, T3] with Product with Serializable
  44. case class Seq2Seq[S0, S1, M1 <: StatefulModule2[Variable, Variable, S0, S1], M2 <: StatefulModule[Variable, Variable, S1]](encoder: M1 with StatefulModule2[Variable, Variable, S0, S1], decoder: M2 with StatefulModule[Variable, Variable, S1]) extends StatefulModule2[(Variable, Variable), Variable, S0, S1] with Product with Serializable
  45. case class Seq3[T1, T2, T3, T4, M1 <: GenericModule[T1, T2], M2 <: GenericModule[T2, T3], M3 <: GenericModule[T3, T4]](m1: M1 with GenericModule[T1, T2], m2: M2 with GenericModule[T2, T3], m3: M3 with GenericModule[T3, T4]) extends GenericModule[T1, T4] with Product with Serializable
  46. case class Seq4[T1, T2, T3, T4, T5, M1 <: GenericModule[T1, T2], M2 <: GenericModule[T2, T3], M3 <: GenericModule[T3, T4], M4 <: GenericModule[T4, T5]](m1: M1 with GenericModule[T1, T2], m2: M2 with GenericModule[T2, T3], m3: M3 with GenericModule[T3, T4], m4: M4 with GenericModule[T4, T5]) extends GenericModule[T1, T5] with Product with Serializable
  47. case class Seq5[T1, T2, T3, T4, T5, T6, M1 <: GenericModule[T1, T2], M2 <: GenericModule[T2, T3], M3 <: GenericModule[T3, T4], M4 <: GenericModule[T4, T5], M5 <: GenericModule[T5, T6]](m1: M1 with GenericModule[T1, T2], m2: M2 with GenericModule[T2, T3], m3: M3 with GenericModule[T3, T4], m4: M4 with GenericModule[T4, T5], m5: M5 with GenericModule[T5, T6]) extends GenericModule[T1, T6] with Product with Serializable
  48. case class Seq6[T1, T2, T3, T4, T5, T6, T7, M1 <: GenericModule[T1, T2], M2 <: GenericModule[T2, T3], M3 <: GenericModule[T3, T4], M4 <: GenericModule[T4, T5], M5 <: GenericModule[T5, T6], M6 <: GenericModule[T6, T7]](m1: M1 with GenericModule[T1, T2], m2: M2 with GenericModule[T2, T3], m3: M3 with GenericModule[T3, T4], m4: M4 with GenericModule[T4, T5], m5: M5 with GenericModule[T5, T6], m6: M6 with GenericModule[T6, T7]) extends GenericModule[T1, T7] with Product with Serializable
  49. case class SeqLinear(weight: Constant, bias: Constant) extends Module with Product with Serializable

    Inputs of size (sequence length * batch * in dim). Outputs of size (sequence length * batch * output dim). Applies a linear function to each time step.

  50. case class Sequential[A, M <: GenericModule[A, A]](members: M with GenericModule[A, A]*) extends GenericModule[A, A] with Product with Serializable
  51. case class Shampoo(parameters: Seq[(STen, PTag)], learningRate: OptimizerHyperparameter = simple(0.001), clip0: Option[Double] = None, eps: Double = 1e-4, diagonalThreshold: Int = 256, updatePreconditionerEveryNIterations: Int = 100, momentum: OptimizerHyperparameter = simple(0d)) extends Optimizer with Product with Serializable

    See also

    https://arxiv.org/pdf/1802.09568.pdf Algorithm 1

  52. class SimpleLossCalculation[I] extends LossCalculation[I]
  53. type StatefulModule[A, B, C] = GenericModule[(A, C), (B, C)]
  54. type StatefulModule2[A, B, C, D] = GenericModule[(A, C), (B, D)]
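
    For illustration, these aliases simply pair the input and output with a state value; e.g. a pass-through module that counts its applications in the state slot (a sketch using GenericFun):

    import lamp.autograd.Variable
    import lamp.nn._

    val counting: StatefulModule[Variable, Variable, Int] =
      GenericFun[(Variable, Int), (Variable, Int)](_ => { case (x, n) => (x, n + 1) })
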
  55. case class StatefulSeq2[T1, T2, T3, S1, S2, M1 <: StatefulModule[T1, T2, S1], M2 <: StatefulModule[T2, T3, S2]](m1: M1 with StatefulModule[T1, T2, S1], m2: M2 with StatefulModule[T2, T3, S2]) extends StatefulModule[T1, T3, (S1, S2)] with Product with Serializable
  56. case class StatefulSeq3[T1, T2, T3, T4, S1, S2, S3, M1 <: StatefulModule[T1, T2, S1], M2 <: StatefulModule[T2, T3, S2], M3 <: StatefulModule[T3, T4, S3]](m1: M1 with StatefulModule[T1, T2, S1], m2: M2 with StatefulModule[T2, T3, S2], m3: M3 with StatefulModule[T3, T4, S3]) extends StatefulModule[T1, T4, (S1, S2, S3)] with Product with Serializable
  57. case class StatefulSeq4[T1, T2, T3, T4, T5, S1, S2, S3, S4, M1 <: StatefulModule[T1, T2, S1], M2 <: StatefulModule[T2, T3, S2], M3 <: StatefulModule[T3, T4, S3], M4 <: StatefulModule[T4, T5, S4]](m1: M1 with StatefulModule[T1, T2, S1], m2: M2 with StatefulModule[T2, T3, S2], m3: M3 with StatefulModule[T3, T4, S3], m4: M4 with StatefulModule[T4, T5, S4]) extends StatefulModule[T1, T5, (S1, S2, S3, S4)] with Product with Serializable
  58. case class StatefulSeq5[T1, T2, T3, T4, T5, T6, S1, S2, S3, S4, S5, M1 <: StatefulModule[T1, T2, S1], M2 <: StatefulModule[T2, T3, S2], M3 <: StatefulModule[T3, T4, S3], M4 <: StatefulModule[T4, T5, S4], M5 <: StatefulModule[T5, T6, S5]](m1: M1 with StatefulModule[T1, T2, S1], m2: M2 with StatefulModule[T2, T3, S2], m3: M3 with StatefulModule[T3, T4, S3], m4: M4 with StatefulModule[T4, T5, S4], m5: M5 with StatefulModule[T5, T6, S5]) extends StatefulModule[T1, T6, (S1, S2, S3, S4, S5)] with Product with Serializable
  59. case class SupervisedModel[I, M <: GenericModule[I, Variable]](module: M with GenericModule[I, Variable], lossFunction: LossFunction, lossCalculation: LossCalculation[I] = new SimpleLossCalculation[I], printMemoryAllocations: Boolean = false)(implicit tm: TrainingMode[M]) extends Product with Serializable
  60. implicit class ToLift[M <: Module] extends AnyRef
  61. implicit class ToMappedState[A, B, C, M <: StatefulModule[A, B, C]] extends AnyRef
  62. implicit class ToUnlift[A, B, C, D, M <: StatefulModule2[A, B, C, D]] extends AnyRef
  63. implicit class ToWithInit[A, B, C, M <: StatefulModule[A, B, C]] extends AnyRef
  64. trait TrainingMode[M] extends AnyRef

    Type class describing how to switch a module into training or evaluation mode

  65. implicit class TrainingModeSyntax[M] extends AnyRef
  66. case class Transformer(encoder: TransformerEncoder, decoder: TransformerDecoder) extends GenericModule[(Variable, Variable, Option[STen], Option[STen]), Variable] with Product with Serializable
  67. case class TransformerDecoder(blocks: Seq[TransformerDecoderBlock]) extends GenericModule[(Variable, Variable, Option[STen]), Variable] with Product with Serializable
  68. case class TransformerDecoderBlock(attentionDecoderDecoder: MultiheadAttention, attentionEncoderDecoder: MultiheadAttention, layerNorm1: LayerNorm, layerNorm2: LayerNorm, layerNorm3: LayerNorm, layerNorm4: LayerNorm, w1: Constant, b1: Constant, w2: Constant, b2: Constant, dropout: Double, train: Boolean) extends GenericModule[(Variable, Variable, Option[STen]), Variable] with Product with Serializable
  69. case class TransformerEmbedding(embedding: Embedding, addPositionalEmbedding: Boolean, positionalEmbedding: Constant) extends GenericModule[Variable, Variable] with Product with Serializable

    A module with positional and token embeddings

    Token embeddings are lookup embeddings. Positional embeddings are supplied as a constant; they are supposed to come from a fixed, unlearned derivation of the positions.

    Token and positional embeddings are summed.

    Gradients are not computed for positionalEmbedding.
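
    For illustration, the sum of a learnable token lookup and a fixed positional table (plain Scala):

    val tokenEmb = Array(Array(0.1, 0.2), Array(0.3, 0.4))    // C x D, learnable
    val posEmb   = Array(Array(0.0, 0.01), Array(0.02, 0.03)) // sequence x D, fixed
    val tokens   = Array(1, 0)
    val out = tokens.zipWithIndex.map { case (t, pos) =>
      tokenEmb(t).zip(posEmb(pos)).map { case (a, b) => a + b } // summed per position
    }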

  70. case class TransformerEncoder(blocks: Seq[TransformerEncoderBlock]) extends GenericModule[(Variable, Option[STen]), Variable] with Product with Serializable

    TransformerEncoder module

    Does *not* include the initial embedding or position encoding.

    Input is (data, maxLength) where data is a (batch, sequence, input dimension) double tensor and maxLength is a 1D or 2D long tensor used for attention masking.

    Attention masking is implemented similarly to chapter 11.3.2.1 in d2l.ai v1.0.0-beta0. It supports unmasked attention, attention on variable length input, and left-to-right attention.

    Output is (batch, sequence, output dimension).

  71. case class TransformerEncoderBlock(attention: MultiheadAttention, layerNorm1: LayerNorm, layerNorm2: LayerNorm, w1: Constant, b1: Constant, w2: Constant, b2: Constant, scale1: Constant, scale2: Constant, dropout: Double, train: Boolean, gptOrder: Boolean) extends GenericModule[(Variable, Option[STen]), Variable] with Product with Serializable

    A single block of the transformer self attention encoder using GELU

    Input is (data, maxLength) where data is a (batch, sequence, input dimension) double tensor and maxLength is a 1D or 2D long tensor used for attention masking.

    The order of operations depends on the gptOrder parameter (the two orders are sketched in code below). If gptOrder is true then:

    • y = attention(norm(input)) + input
    • result = mlp(norm(y)) + y
    • Note that in this case there is no normalization at the end of the transformer. One may want to add one separately. This is how GPT2 is defined in Hugging Face or nanoGPT.
    • Note that the residual connection has a path which does not flow through the normalization.
    • A dimension-wise learnable scale parameter is applied in each residual path.

    If gptOrder is false then:

    • y = norm(attention(input) + input)
    • result = norm(mlp(y) + y)
    • This follows chapter 11.7 in d2l.ai v1.0.0-beta0 (same as in https://arxiv.org/pdf/1706.03762.pdf).
    • Note that the residual connection has a path which flows through the normalization.

    Output is (batch, sequence, output dimension).
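
    A plain-function sketch of the two orders (norm, attn, and mlp stand in for the layer norm, attention, and MLP sublayers; the learnable residual scales of the gptOrder = true case are omitted):

    def gptOrderTrue(x: Double, norm: Double => Double, attn: Double => Double, mlp: Double => Double): Double = {
      val y = attn(norm(x)) + x // the residual path bypasses the normalization
      mlp(norm(y)) + y          // note: no normalization at the end of the block
    }

    def gptOrderFalse(x: Double, norm: Double => Double, attn: Double => Double, mlp: Double => Double): Double = {
      val y = norm(attn(x) + x) // the residual path flows through the normalization
      norm(mlp(y) + y)
    }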

  72. case class UnliftedModule[A, B, C, D, M <: StatefulModule2[A, B, C, D]](statefulModule: M with StatefulModule2[A, B, C, D])(implicit init: InitState[M, C]) extends GenericModule[A, B] with Product with Serializable
  73. case class WeightNormLinear(weightsV: Constant, weightsG: Constant, bias: Option[Constant]) extends Module with Product with Serializable
  74. case class WithInit[A, B, C, M <: StatefulModule[A, B, C]](module: M with StatefulModule[A, B, C], init: C) extends StatefulModule[A, B, C] with Product with Serializable
  75. case class WrapFun[A, B, M <: GenericModule[A, B], O](module: M, fun: (A, B) => O) extends GenericModule[A, (B, O)] with Product with Serializable
  76. case class Yogi(parameters: Seq[(STen, PTag)], weightDecay: OptimizerHyperparameter, learningRate: OptimizerHyperparameter = simple(0.01), beta1: OptimizerHyperparameter = simple(0.9), beta2: OptimizerHyperparameter = simple(0.999), eps: Double = 1e-3, clip0: Option[Double] = None, debias: Boolean = true) extends Optimizer with Product with Serializable

    The Yogi optimizer algorithm. I added the decoupled weight decay term following https://arxiv.org/pdf/1711.05101.pdf.

    See also

    https://papers.nips.cc/paper/2018/file/90365351ccc7437a1309dc64e4db32a3-Paper.pdf Algorithm 2

  77. case class simple(v: Double) extends OptimizerHyperparameter with Product with Serializable

Value Members

  1. def gradientClippingInPlace(gradients: Seq[Option[STen]], theta: STen): Unit
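
    Assuming this performs global-norm clipping with theta as the threshold, as the name and signature suggest, a scalar sketch (the real function mutates STen gradients in place):

    def clipGlobalNorm(gradients: Array[Double], theta: Double): Array[Double] = {
      val norm = math.sqrt(gradients.map(g => g * g).sum)
      if (norm > theta) gradients.map(_ * (theta / norm)) // scale down uniformly
      else gradients
    }
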
  2. def initLinear[S](in: Int, out: Int, tOpt: STenOptions)(implicit arg0: Sc[S]): Constant
  3. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _], T3 <: GenericModule[_, _], T4 <: GenericModule[_, _], T5 <: GenericModule[_, _], T6 <: GenericModule[_, _], T7 <: GenericModule[_, _], T8 <: GenericModule[_, _], T9 <: GenericModule[_, _], T10 <: GenericModule[_, _], T11 <: GenericModule[_, _], T12 <: GenericModule[_, _]](t1: T1, t2: T2, t3: T3, t4: T4, t5: T5, t6: T6, t7: T7, t8: T8, t9: T9, t10: T10, t11: T11, t12: T12, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2], arg2: Load[T3], arg3: Load[T4], arg4: Load[T5], arg5: Load[T6], arg6: Load[T7], arg7: Load[T8], arg8: Load[T9], arg9: Load[T10], arg10: Load[T11], arg11: Load[T12]): Unit
  4. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _], T3 <: GenericModule[_, _], T4 <: GenericModule[_, _], T5 <: GenericModule[_, _], T6 <: GenericModule[_, _], T7 <: GenericModule[_, _], T8 <: GenericModule[_, _], T9 <: GenericModule[_, _], T10 <: GenericModule[_, _], T11 <: GenericModule[_, _]](t1: T1, t2: T2, t3: T3, t4: T4, t5: T5, t6: T6, t7: T7, t8: T8, t9: T9, t10: T10, t11: T11, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2], arg2: Load[T3], arg3: Load[T4], arg4: Load[T5], arg5: Load[T6], arg6: Load[T7], arg7: Load[T8], arg8: Load[T9], arg9: Load[T10], arg10: Load[T11]): Unit
  5. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _], T3 <: GenericModule[_, _], T4 <: GenericModule[_, _], T5 <: GenericModule[_, _], T6 <: GenericModule[_, _], T7 <: GenericModule[_, _], T8 <: GenericModule[_, _], T9 <: GenericModule[_, _], T10 <: GenericModule[_, _]](t1: T1, t2: T2, t3: T3, t4: T4, t5: T5, t6: T6, t7: T7, t8: T8, t9: T9, t10: T10, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2], arg2: Load[T3], arg3: Load[T4], arg4: Load[T5], arg5: Load[T6], arg6: Load[T7], arg7: Load[T8], arg8: Load[T9], arg9: Load[T10]): Unit
  6. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _], T3 <: GenericModule[_, _], T4 <: GenericModule[_, _], T5 <: GenericModule[_, _], T6 <: GenericModule[_, _], T7 <: GenericModule[_, _], T8 <: GenericModule[_, _], T9 <: GenericModule[_, _]](t1: T1, t2: T2, t3: T3, t4: T4, t5: T5, t6: T6, t7: T7, t8: T8, t9: T9, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2], arg2: Load[T3], arg3: Load[T4], arg4: Load[T5], arg5: Load[T6], arg6: Load[T7], arg7: Load[T8], arg8: Load[T9]): Unit
  7. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _], T3 <: GenericModule[_, _], T4 <: GenericModule[_, _], T5 <: GenericModule[_, _], T6 <: GenericModule[_, _], T7 <: GenericModule[_, _], T8 <: GenericModule[_, _]](t1: T1, t2: T2, t3: T3, t4: T4, t5: T5, t6: T6, t7: T7, t8: T8, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2], arg2: Load[T3], arg3: Load[T4], arg4: Load[T5], arg5: Load[T6], arg6: Load[T7], arg7: Load[T8]): Unit
  8. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _], T3 <: GenericModule[_, _], T4 <: GenericModule[_, _], T5 <: GenericModule[_, _], T6 <: GenericModule[_, _], T7 <: GenericModule[_, _]](t1: T1, t2: T2, t3: T3, t4: T4, t5: T5, t6: T6, t7: T7, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2], arg2: Load[T3], arg3: Load[T4], arg4: Load[T5], arg5: Load[T6], arg6: Load[T7]): Unit
  9. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _], T3 <: GenericModule[_, _], T4 <: GenericModule[_, _], T5 <: GenericModule[_, _], T6 <: GenericModule[_, _]](t1: T1, t2: T2, t3: T3, t4: T4, t5: T5, t6: T6, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2], arg2: Load[T3], arg3: Load[T4], arg4: Load[T5], arg5: Load[T6]): Unit
  10. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _], T3 <: GenericModule[_, _], T4 <: GenericModule[_, _], T5 <: GenericModule[_, _]](t1: T1, t2: T2, t3: T3, t4: T4, t5: T5, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2], arg2: Load[T3], arg3: Load[T4], arg4: Load[T5]): Unit
  11. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _], T3 <: GenericModule[_, _], T4 <: GenericModule[_, _]](t1: T1, t2: T2, t3: T3, t4: T4, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2], arg2: Load[T3], arg3: Load[T4]): Unit
  12. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _], T3 <: GenericModule[_, _]](t1: T1, t2: T2, t3: T3, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2], arg2: Load[T3]): Unit
  13. def loadMultiple[T1 <: GenericModule[_, _], T2 <: GenericModule[_, _]](t1: T1, t2: T2, tensors: Seq[STen])(implicit arg0: Load[T1], arg1: Load[T2]): Unit
  14. object AdamW extends Serializable
  15. object BatchNorm extends Serializable
  16. object BatchNorm2D extends Serializable
  17. object Conv1D extends Serializable
  18. object Conv2D extends Serializable
  19. object Conv2DTransposed extends Serializable
  20. object Debug extends Serializable
  21. object Dropout extends Serializable
  22. object EitherModule extends Serializable
  23. object Embedding extends Serializable
  24. object FreeRunningRNN extends Serializable
  25. object Fun extends Serializable
  26. object GRU extends Serializable
  27. object GenericFun extends Serializable
  28. object GenericModule
  29. object InitState
  30. object LSTM extends Serializable
  31. object LayerNorm extends Serializable
  32. object LearningRateSchedule
  33. object LiftedModule extends Serializable
  34. object Linear extends Serializable
  35. object Load
  36. object LossFunctions
  37. object MLP

    Factory for multilayer fully connected feed forward networks

    The returned network has the following repeated structure: [linear -> batchnorm -> nonlinearity -> dropout]*

    The last block does not include the nonlinearity and the dropout.
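
    A sketch of the repeated structure only, with string placeholders standing in for the actual modules:

    def block(last: Boolean): List[String] =
      if (last) List("linear", "batchnorm")
      else List("linear", "batchnorm", "nonlinearity", "dropout")

    val numBlocks = 3
    val structure = List.fill(numBlocks - 1)(block(last = false)).flatten ++ block(last = true)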

  38. object MappedState extends Serializable
  39. object MultiheadAttention extends Serializable
  40. case object NoTag extends LeafTag with Product with Serializable
  41. object PTag
  42. object PositionalEmbedding
  43. object RAdam extends Serializable
  44. object RNN extends Serializable
  45. object Recursive extends Serializable
  46. object ResidualModule extends Serializable
  47. object SGDW extends Serializable
  48. object Seq2 extends Serializable
  49. object Seq2Seq extends Serializable
  50. object Seq3 extends Serializable
  51. object Seq4 extends Serializable
  52. object Seq5 extends Serializable
  53. object Seq6 extends Serializable
  54. object SeqLinear extends Serializable
  55. object Sequential extends Serializable
  56. object Shampoo extends Serializable
  57. object StatefulSeq2 extends Serializable
  58. object StatefulSeq3 extends Serializable
  59. object StatefulSeq4 extends Serializable
  60. object StatefulSeq5 extends Serializable
  61. object TrainingMode
  62. object Transformer extends Serializable
  63. object TransformerDecoder extends Serializable
  64. object TransformerDecoderBlock extends Serializable
  65. object TransformerEmbedding extends Serializable
  66. object TransformerEncoder extends Serializable
  67. object TransformerEncoderBlock extends Serializable
  68. object UnliftedModule extends Serializable
  69. object WeightNormLinear extends Serializable
  70. object WithInit extends Serializable
  71. object WrapFun extends Serializable
  72. object Yogi extends Serializable
  73. object sequence
  74. object statefulSequence
