Lamp is organized into the following components:

- the aten package, which links to the native libtorch shared library via JNI
- lamp.STen
- lamp.autograd
- lamp.nn
- lamp.data

Lamp has native CPU and GPU backed n-dimensional arrays with operations similar to those of numpy or pytorch. In fact, lamp.STen is a Scala binding to torch’s ATen tensor class.
import lamp._
// get a scope for zoned memory management
Scope.root{ implicit scope =>
// a 3D tensor, e.g. a color image
val img : STen = STen.rand(List(768, 1024, 3))
// get its shape
assert(img.shape == List(768, 1024, 3))
// select a channel
assert(img.select(dim=2,index=0).shape == List(768, 1024))
// manipulate with a broadcasting operation
val img2 = img / 2d
// take max, returns a tensor with 0 dimensions i.e. a scalar
assert(img2.max.shape == Nil)
// get a handle to metadata about data type, data layout and device
assert(img.options.isCPU)
val vec = STen.fromDoubleArray(Array(2d,1d,3d),List(3),CPU,DoublePrecision)
// broadcasting matrix multiplication
val singleChannel = (img matmul vec)
assert(singleChannel.shape == List(768L,1024L))
// svd
val (u,s,vt) = singleChannel.svd(false)
assert(u.shape == List(768,768))
assert(s.shape == List(768))
assert(vt.shape == List(768,1024))
val errorOfSVD = (singleChannel - ((u * s) matmul vt)).norm2(dim=List(0,1), keepDim=false)
assert(errorOfSVD.toDoubleArray.head < 1E-6)
}
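Data can also be copied between plain JVM arrays and native tensors. The following round-trip check is an addition to the walkthrough; it only uses the STen calls shown above (fromDoubleArray, toDoubleArray, shape):
Scope.root { implicit scope =>
  // copy a JVM array into a 2x2 native tensor on the CPU
  val t = STen.fromDoubleArray(Array(1d, 2d, 3d, 4d), List(2, 2), CPU, DoublePrecision)
  assert(t.shape == List(2, 2))
  // copy the contents back into a JVM array; the flat element order is preserved
  assert(t.toDoubleArray.toList == List(1d, 2d, 3d, 4d))
}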
We will use tabular data to build a multiclass classifier.
First, get some tabular data into JVM memory:
val testData = org.saddle.csv.CsvParser
.parseInputStreamWithHeader[Double](
new java.util.zip.GZIPInputStream(
getClass.getResourceAsStream("/mnist_test.csv.gz")
),
maxLines = 100L
)
.toOption
.get
// testData: org.saddle.Frame[Int, String, Double] = [99 x 785]
// label 1x1 1x2 1x3 1x4 28x24 28x25 28x26 28x27 28x28
// ------ ------ ------ ------ ------ ... ------ ------ ------ ------ ------
// 0 -> 7.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 1 -> 2.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 2 -> 1.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 3 -> 0.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 4 -> 4.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// ...
// 94 -> 1.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 95 -> 4.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 96 -> 1.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 97 -> 7.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 98 -> 6.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
//
val trainData = org.saddle.csv.CsvParser
.parseInputStreamWithHeader[Double](
new java.util.zip.GZIPInputStream(
getClass.getResourceAsStream("/mnist_train.csv.gz")
),
maxLines = 100L
)
.toOption
.get
// trainData: org.saddle.Frame[Int, String, Double] = [99 x 785]
// label 1x1 1x2 1x3 1x4 28x24 28x25 28x26 28x27 28x28
// ------ ------ ------ ------ ------ ... ------ ------ ------ ------ ------
// 0 -> 5.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 1 -> 0.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 2 -> 4.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 3 -> 1.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 4 -> 9.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// ...
// 94 -> 8.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 95 -> 0.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 96 -> 7.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 97 -> 8.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
// 98 -> 3.0000 0.0000 0.0000 0.0000 0.0000 ... 0.0000 0.0000 0.0000 0.0000 0.0000
//
Next we copy those JVM objects into native tensors. Note that we specify the device so that the data ends up where we want it. The feature matrix is copied into a 2D floating point tensor, and the target vector holding the class labels is copied into a 1D integer (long) tensor.
import lamp._
import lamp.autograd._
import org.saddle.Mat
implicit val scope : Scope = Scope.free // Use Scope.root, Scope.apply in non-doc code
// scope: Scope = lamp.Scope@65d4645b // Use Scope.root, Scope.apply in non-doc code
val device = CPU
// device: CPU.type = CPU
val precision = SinglePrecision
// precision: SinglePrecision.type = SinglePrecision
val testDataTensor =
lamp.saddle.fromMat(testData.filterIx(_ != "label").toMat, device, precision)
// testDataTensor: STen = STen(value = Tensor at 139981095781056; CPUFloatType)
val testTarget =
lamp.saddle.fromLongMat(
Mat(testData.firstCol("label").toVec.map(_.toLong)),
device
).squeeze
// testTarget: STen = STen(value = Tensor at 139981096643696; CPULongType)
val trainDataTensor =
lamp.saddle.fromMat(trainData.filterIx(_ != "label").toMat, device, precision)
// trainDataTensor: STen = STen(
// value = Tensor at 139981096640960; CPUFloatType
// )
val trainTarget =
lamp.saddle.fromLongMat(
Mat(trainData.firstCol("label").toVec.map(_.toLong)),
device
).squeeze
// trainTarget: STen = STen(value = Tensor at 139981096502144; CPULongType)
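As a quick sanity check (an addition to the walkthrough, using the shape accessor shown earlier), the copies should be 2D feature matrices of 99 rows by 784 pixel columns (785 columns minus the label) and 1D label vectors of length 99:
assert(trainDataTensor.shape == List(99, 784))
assert(trainTarget.shape == List(99))
assert(testDataTensor.shape == List(99, 784))
assert(testTarget.shape == List(99))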
Here we use the predefined modules in lamp.nn. In particular, lamp.nn.MLP will create a multilayer, fully connected, feed forward neural network. We have to specify the input dimension (784 in our case), the output dimension (10 in our case), and the sizes of the hidden layers. We also have to give it an aten.TensorOptions, which holds information about what kind of tensor it should allocate for the parameters: what type (float/double/long) and on which device (cpu/cuda). Then we compose that function with a softmax and provide a suitable loss function.
Loss functions in lamp are of the type (lamp.autograd.Variable, lamp.STen) => lamp.autograd.Variable. The first Variable argument is the output of the model, the second STen argument is the target; the result is a new Variable holding the loss. An autograd Variable holds a tensor and has the ability to compute partial derivatives. The training loop will take the loss and compute the partial derivatives of all learnable parameters.
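To make the role of Variable concrete, here is a minimal autograd sketch, separate from the classifier. It assumes lamp.autograd exposes param and const constructors and that Variable offers mm, sum, backprop and partialDerivative; consult the lamp.autograd scaladoc for the exact signatures.
Scope.root { implicit scope =>
  // a learnable parameter (tracks a partial derivative) and a constant input
  val w = param(STen.rand(List(3, 3)))
  val x = const(STen.rand(List(3, 3)))
  // a tiny computational graph reduced to a scalar loss
  val loss = x.mm(w).sum
  // reverse-mode pass: accumulates partial derivatives into the learnable parameters
  loss.backprop()
  // the partial derivative of the loss with respect to w, if it was computed
  println(w.partialDerivative.map(_.shape))
}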
import lamp.nn._
val tensorOptions = device.options(SinglePrecision)
// tensorOptions: STenOptions = STenOptions(
// value = TensorOptions at 139981096502320; TensorOptions(dtype=float, device=cpu (default), layout=Strided (default), requires_grad=false (default), pinned_memory=false (default), memory_format=(nullopt))
// )
val classWeights = STen.ones(List(10), tensorOptions)
// classWeights: STen = STen(value = Tensor at 139981096503184; CPUFloatType)
val model = SupervisedModel(
sequence(
MLP(in = 784, out = 10, List(64, 32), tensorOptions, dropout = 0.2),
Fun(implicit scope => _.logSoftMax(dim = 1))
),
LossFunctions.NLL(10, classWeights)
)
// model: SupervisedModel[Variable, Seq2[Variable, Variable, Variable, Seq2[Variable, Variable, Variable, Sequential[Variable, Seq4[Variable, Variable, Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable], nn.Dropout with GenericModule[Variable, Variable]]] with GenericModule[Variable, Variable], EitherModule[Variable, Variable, Seq4[Variable, Variable, Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable], nn.Dropout with GenericModule[Variable, Variable]], Seq2[Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable]]] with GenericModule[Variable, Variable]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable]]] = SupervisedModel(
// module = Seq2(
// m1 = Seq2(
// m1 = Sequential(
// members = List(
// Seq4(
// m1 = Linear(
// weights = ConstantWithGrad(
// value = STen(value = Tensor at 139981096993984; CPUFloatType),
// pd = STen(value = Tensor at 139981096999040; CPUFloatType)
// ),
// bias = None
// ),
// m2 = Sequential(
// members = ArraySeq(
// EitherModule(
// members = Left(
// value = BatchNorm(
// weight = ConstantWithGrad(
// value = STen(
// value = Tensor at 139981097015808; CPUFloatType
// ),
// pd = STen(
// value = Tensor at 139981097009952; CPUFloatType
// )
// ),
// bias = ConstantWithGrad(
// value = STen(
// value = Tensor at 139981097010224; CPUFloatType
// ),
// pd = STen(
// value = Tensor at 139981097011584; CPUFloatType
// )
// ),
// runningMean = ConstantWithoutGrad(
// value = STen(
// value = Tensor at 139981097011616; CPUFloatType
// )
// ),
// runningVar = ConstantWithoutGrad(
// value = STen(
// value = Tensor at 139981097056384; CPUFloatType
// )
// ),
// training = true,
// momentum = 0.1,
// eps = 1.0E-5,
// forceTrain = false,
// forceEval = false,
// ...
The predefined training loop in lamp.data.IOLoops needs a stream of batches, where each batch holds a subset of the training data.
In particular, the training loop expects a factory function of type IOLoops.TrainingLoopContext => lamp.data.BatchStream, where the BatchStream represents a stream of batches over the full set of training data.
This factory function is then called in each epoch.
Lamp provides a helper which chops the full set of training data into mini-batches.
import lamp.data._
val makeTrainingBatch = (_: IOLoops.TrainingLoopContext) =>
  BatchStream.minibatchesFromFull(
    minibatchSize = 200,
    dropLast = true,
    features = trainDataTensor,
    target = trainTarget,
    rng = new scala.util.Random()
  )
// makeTrainingBatch: IOLoops.TrainingLoopContext => AnyRef with BatchStream[(Variable, STen), Int, BufferPair] = <function1>
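A validation stream can be built the same way from the held-out test tensors. The sketch below is an addition; it assumes IOLoops.epochs accepts such a factory as Some(makeValidationBatch) for the validationBatchesOverEpoch parameter, which is left as None in the call below.
val makeValidationBatch = (_: IOLoops.TrainingLoopContext) =>
  BatchStream.minibatchesFromFull(
    minibatchSize = 200,
    dropLast = false,
    features = testDataTensor,
    target = testTarget,
    rng = new scala.util.Random()
  )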
With this we have everything to assemble the training loop:
import cats.effect.IO
val trainedModelIO = IOLoops.epochs(
model = model,
optimizerFactory = SGDW
.factory(
learningRate = simple(0.0001),
weightDecay = simple(0.001d)
),
trainBatchesOverEpoch = makeTrainingBatch,
validationBatchesOverEpoch = None,
epochs = 1
)
// trainedModelIO: IO[(Int, SupervisedModel[Variable, Seq2[Variable, Variable, Variable, Seq2[Variable, Variable, Variable, Sequential[Variable, Seq4[Variable, Variable, Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable], nn.Dropout with GenericModule[Variable, Variable]]] with GenericModule[Variable, Variable], EitherModule[Variable, Variable, Seq4[Variable, Variable, Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable], nn.Dropout with GenericModule[Variable, Variable]], Seq2[Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable]]] with GenericModule[Variable, Variable]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable]]], List[(Int, Double, Option[(Double, Double)])], Unit, SimpleLoopState)] = FlatMap(
// ioe = FlatMap(
// ioe = Delay(
// thunk = lamp.data.IOLoops$$$Lambda$2157/0x00000008014224e0@6062ad94,
// event = cats.effect.tracing.TracingEvent$StackTrace
// ),
// f = lamp.data.IOLoops$$$Lambda$2158/0x00000008014227b8@449f319e,
// event = cats.effect.tracing.TracingEvent$StackTrace
// ),
// f = lamp.data.IOLoops$$$Lambda$2159/0x0000000801422b90@2af73ea,
// event = cats.effect.tracing.TracingEvent$StackTrace
// )
Lamp provides many optimizers: SgdW, AdamW, RAdam, Shampoo, etc. They can take a custom learning rate schedule.
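Swapping optimizers only changes the factory passed to IOLoops.epochs. As a sketch, assuming AdamW exposes a factory with the same learningRate and weightDecay parameters as SGDW above (see its scaladoc for the full list of hyperparameters):
val adamFactory = AdamW.factory(
  learningRate = simple(0.001),
  weightDecay = simple(0.0001)
)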
For other capabilities of this training loop, see the scaladoc of IOLoops.epochs.
The IOLoops.epochs method returns an IO which, once executed, runs the training loop and yields the trained model:
import cats.effect.unsafe.implicits.global
val (epochOfModel, trainedModel, learningCurve, _, _) = trainedModelIO.unsafeRunSync()
// epochOfModel: Int = 0
// trainedModel: SupervisedModel[Variable, Seq2[Variable, Variable, Variable, Seq2[Variable, Variable, Variable, Sequential[Variable, Seq4[Variable, Variable, Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable], nn.Dropout with GenericModule[Variable, Variable]]] with GenericModule[Variable, Variable], EitherModule[Variable, Variable, Seq4[Variable, Variable, Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable], nn.Dropout with GenericModule[Variable, Variable]], Seq2[Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable]]] with GenericModule[Variable, Variable]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable]]] = SupervisedModel(
// module = Seq2(
// m1 = Seq2(
// m1 = Sequential(
// members = List(
// Seq4(
// m1 = Linear(
// weights = ConstantWithGrad(
// value = STen(value = Tensor at 139981096993984; CPUFloatType),
// pd = STen(value = Tensor at 139981096999040; CPUFloatType)
// ),
// bias = None
// ),
// m2 = Sequential(
// members = ArraySeq(
// EitherModule(
// members = Left(
// value = BatchNorm(
// weight = ConstantWithGrad(
// value = STen(
// value = Tensor at 139981097015808; CPUFloatType
// ),
// pd = STen(
// value = Tensor at 139981097009952; CPUFloatType
// )
// ),
// bias = ConstantWithGrad(
// value = STen(
// value = Tensor at 139981097010224; CPUFloatType
// ),
// pd = STen(
// value = Tensor at 139981097011584; CPUFloatType
// )
// ),
// runningMean = ConstantWithoutGrad(
// value = STen(
// value = Tensor at 139981097011616; CPUFloatType
// )
// ),
// runningVar = ConstantWithoutGrad(
// value = STen(
// value = Tensor at 139981097056384; CPUFloatType
// )
// ),
// training = true,
// momentum = 0.1,
// eps = 1.0E-5,
// forceTrain = false,
// forceEval = false,
// ...
// learningCurve: List[(Int, Double, Option[(Double, Double)])] = List(
// (0, NaN, None)
// )
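The learning curve pairs each epoch with its training loss and optional validation metrics (None here, since no validation stream was supplied); it can be inspected directly:
learningCurve.foreach { case (epoch, trainingLoss, validationMetrics) =>
  println(s"epoch=$epoch trainingLoss=$trainingLoss validation=$validationMetrics")
}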
val module = trainedModel.module
// module: Seq2[Variable, Variable, Variable, Seq2[Variable, Variable, Variable, Sequential[Variable, Seq4[Variable, Variable, Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable], nn.Dropout with GenericModule[Variable, Variable]]] with GenericModule[Variable, Variable], EitherModule[Variable, Variable, Seq4[Variable, Variable, Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable], nn.Dropout with GenericModule[Variable, Variable]], Seq2[Variable, Variable, Variable, Linear with GenericModule[Variable, Variable], Sequential[Variable, EitherModule[Variable, Variable, nn.BatchNorm, LayerNorm]] with GenericModule[Variable, Variable]]] with GenericModule[Variable, Variable]] with GenericModule[Variable, Variable], Fun with GenericModule[Variable, Variable]] with GenericModule[Variable, Variable] = Seq2(
// m1 = Seq2(
// m1 = Sequential(
// members = List(
// Seq4(
// m1 = Linear(
// weights = ConstantWithGrad(
// value = STen(value = Tensor at 139981096993984; CPUFloatType),
// pd = STen(value = Tensor at 139981096999040; CPUFloatType)
// ),
// bias = None
// ),
// m2 = Sequential(
// members = ArraySeq(
// EitherModule(
// members = Left(
// value = BatchNorm(
// weight = ConstantWithGrad(
// value = STen(
// value = Tensor at 139981097015808; CPUFloatType
// ),
// pd = STen(value = Tensor at 139981097009952; CPUFloatType)
// ),
// bias = ConstantWithGrad(
// value = STen(
// value = Tensor at 139981097010224; CPUFloatType
// ),
// pd = STen(value = Tensor at 139981097011584; CPUFloatType)
// ),
// runningMean = ConstantWithoutGrad(
// value = STen(
// value = Tensor at 139981097011616; CPUFloatType
// )
// ),
// runningVar = ConstantWithoutGrad(
// value = STen(
// value = Tensor at 139981097056384; CPUFloatType
// )
// ),
// training = true,
// momentum = 0.1,
// eps = 1.0E-5,
// forceTrain = false,
// forceEval = false,
// evalIfBatchSizeIsOne = false
// )
// )
// )
// )
// ...
We can use the trained model for prediction:
val bogusData = STen.ones(List(1,784),tensorOptions)
// bogusData: STen = STen(value = Tensor at 139981099905056; CPUFloatType)
val classProbabilities = module.forward(const(bogusData)).toDoubleArray.map(math.exp).toVector
// classProbabilities: Vector[Double] = Vector(
// 0.09999999680245637,
// 0.09999999680245637,
// 0.09999999680245637,
// 0.09999999680245637,
// 0.09999999680245637,
// 0.09999999680245637,
// 0.09999999680245637,
// 0.09999999680245637,
// 0.09999999680245637,
// 0.09999999680245637
// )
println(classProbabilities)
// Vector(0.09999999680245637, 0.09999999680245637, 0.09999999680245637, 0.09999999680245637, 0.09999999680245637, 0.09999999680245637, 0.09999999680245637, 0.09999999680245637, 0.09999999680245637, 0.09999999680245637)
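To turn these probabilities into a class prediction, take the index of the largest probability; this last step is plain Scala:
val predictedClass = classProbabilities.zipWithIndex.maxBy(_._1)._2
println(predictedClass)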