package saddle

Saddle

Saddle is a Scala Data Library.

Saddle provides array-backed, indexed one- and two-dimensional data structures.

These data structures are specialized on JVM primitives. With them one can often avoid the overhead of boxing and unboxing.

Basic operations also aim to be robust to missing values (NAs).

The building blocks are intended to be easily composed.

The foundational building blocks are Vec, Mat, Index, Series, and Frame.

Inspiration for Saddle comes from many sources, including the R programming language, the pandas data analysis library for Python, and the Scala collections library.

Linear Supertypes
AnyRef, Any

Package Members

  1. package array

    This package contains utilities for working with arrays that are specialized for numeric types.

  2. package binary

    Binary serialization for Frame[String,String,T] or Mat[T] with primitive T

    The layout of the binary format is as follows:

    • The first 6 bytes are "SADDLE"
    • The next unsigned byte is the major version
    • The next unsigned byte is the minor version
    • The next 4 bytes form a little-endian integer, HEADER_LENGTH
    • The next HEADER_LENGTH bytes form a UTF-8 string, the header
    • The header is a valid JSON object with the following fields:
      • v: positive integer, the version of the header structure
      • colix: a JSON array of strings, the column index of the frame
      • rowix: a JSON array of strings, the row index of the frame
      • numrows: positive integer, the number of rows
      • numcols: positive integer, the number of columns
      • Either one of rowix or numrows may be missing
      • Either one of colix or numcols may be missing
      • rowmajor: a boolean indicating whether the data is stored in row-major or column-major order
      • datatype: a string, one of "double", "long", "int", "float", "byte"
    • The header is padded with spaces (0x20) such that HEADER_LENGTH+12 is divisible by 16. The padding spaces are counted in HEADER_LENGTH.
    • The next width * numRows * numCols bytes form a little-endian primitive array in row-major or column-major order. numRows and numCols are determined from the rowix/numrows and colix/numcols header fields. width is determined from the datatype field (8 for double and long, 4 for int and float, 1 for byte).
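
    The fixed-size prefix and header padding described for package binary can be sketched with java.nio alone. This is an illustration of the documented layout, not the library's reader or writer: the object and method names (SaddleBinaryLayoutSketch, padHeader, width, writePrefix, readPrefix) are hypothetical, the version numbers passed in are placeholders, and the header is treated as opaque text rather than parsed as JSON.

```scala
import java.nio.{ByteBuffer, ByteOrder}
import java.nio.charset.StandardCharsets.{US_ASCII, UTF_8}

// Hypothetical helper names; only the byte layout follows the description above.
object SaddleBinaryLayoutSketch {
  // Magic string that opens the stream.
  val Magic: Array[Byte] = "SADDLE".getBytes(US_ASCII)

  // 6 magic bytes + major + minor + 4-byte HEADER_LENGTH.
  val PrefixLength: Int = 12

  /** Pads the header with spaces so that HEADER_LENGTH + 12 is divisible by 16. */
  def padHeader(json: String): Array[Byte] = {
    val raw = json.getBytes(UTF_8)
    val padding = (16 - (raw.length + PrefixLength) % 16) % 16
    raw ++ Array.fill[Byte](padding)(0x20.toByte)
  }

  /** Element width in bytes for each datatype named in the header. */
  def width(datatype: String): Int = datatype match {
    case "double" | "long" => 8
    case "int" | "float"   => 4
    case "byte"            => 1
    case other             => throw new IllegalArgumentException(s"unknown datatype: $other")
  }

  /** Writes the prefix and the padded header; the data payload would follow. */
  def writePrefix(json: String, major: Int, minor: Int): Array[Byte] = {
    val header = padHeader(json)
    val buf = ByteBuffer.allocate(PrefixLength + header.length).order(ByteOrder.LITTLE_ENDIAN)
    buf.put(Magic).put(major.toByte).put(minor.toByte).putInt(header.length).put(header)
    buf.array()
  }

  /** Reads back (major, minor, header text with padding trimmed). */
  def readPrefix(bytes: Array[Byte]): (Int, Int, String) = {
    val buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
    val magic = new Array[Byte](Magic.length)
    buf.get(magic)
    require(java.util.Arrays.equals(magic, Magic), "missing SADDLE magic bytes")
    val major = buf.get() & 0xff // unsigned byte
    val minor = buf.get() & 0xff // unsigned byte
    val header = new Array[Byte](buf.getInt())
    buf.get(header)
    (major, minor, new String(header, UTF_8).trim)
  }
}
```

    Because the padded header always ends on a 16-byte boundary, the payload that follows starts at an offset divisible by 16. Its size would be width(datatype) * numRows * numCols bytes.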
  3. package csv
  4. package groupby
  5. package index
  6. package io
  7. package linalg
  8. package macros
  9. package mat
  10. package npy
  11. package ops

    Provides type aliases for a few basic operations

  12. package scalar
  13. package spire
  14. package util

    Additional utilities that need a home

  15. package vec

    Factory methods to generate Vec instances

Type Members

  1. implicit class ArrToVec[T] extends AnyRef
  2. final class Buffer[V] extends AnyRef
  3. type CLM[C] = ClassTag[C]

    Shorthand for class manifest typeclass

  4. abstract class FillMethod extends AnyRef

    Filling method for NA values.

    Non-sealed because more variants could be added in the future.
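
    As a rough picture of what forward and backward filling of NA values means, the sketch below models NA as None over a plain Vector. It does not use the FillMethod API; FillSketch, fillForward and fillBackward are hypothetical names, and carrying the nearest known value along is the usual reading of these terms rather than something this page spells out.

```scala
// Hypothetical names; NA is modeled as None for illustration only.
object FillSketch {
  /** Replaces each NA with the most recent non-NA value before it, if any. */
  def fillForward[A](xs: Vector[Option[A]]): Vector[Option[A]] =
    xs.scanLeft(Option.empty[A])((last, x) => x.orElse(last)).tail

  /** Replaces each NA with the next non-NA value after it, if any. */
  def fillBackward[A](xs: Vector[Option[A]]): Vector[Option[A]] =
    fillForward(xs.reverse).reverse

  def main(args: Array[String]): Unit = {
    val xs = Vector(None, Some(1), None, Some(3), None)
    println(fillForward(xs))  // Vector(None, Some(1), Some(1), Some(3), Some(3))
    println(fillBackward(xs)) // Vector(Some(1), Some(1), Some(3), Some(3), None)
  }
}
```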

  5. class Frame[RX, CX, T] extends NumericOps[Frame[RX, CX, T]]

    Frame is an immutable container for 2D data which is indexed along both axes (rows, columns) by associated keys (i.e., indexes).

    The primary use case is homogeneous data, but a secondary concern is to support heterogeneous data that is homogeneous only within any given column.

    The row index, column index, and constituent value data are all backed ultimately by arrays.

    Frame is effectively a doubly-indexed associative map whose row keys and col keys each have an ordering provided by the natural (provided) order of their backing arrays.

    Several factory and access methods are provided. In the following examples, assume that:

    val f = Frame('a'->Vec(1,2,3), 'b'->Vec(4,5,6))

    The apply method takes a row key and a column key and returns a slice of the original Frame:

    f(0,'a') == Frame('a'->Vec(1))

    apply also accepts an org.saddle.index.Slice:

    f(0->1, 'b') == Frame('b'->Vec(4,5))
    f(0, *) == Frame('a'->Vec(1), 'b'->Vec(4))

    You may slice using the col and row methods respectively, as follows:

    f.col('a') == Frame('a'->Vec(1,2,3))
    f.row(0) == Frame('a'->Vec(1), 'b'->Vec(4))
    f.row(0->1) == Frame('a'->Vec(1,2), 'b'->Vec(4,5))

    You can achieve a similar effect with rowSliceBy and colSliceBy.

    The colAt and rowAt methods take an integer offset i into the Frame, and return a Series indexed by the opposing axis:

    f.rowAt(0) == Series('a'->1, 'b'->4)

    If there is a one-to-one relationship between offset i and key (i.e., no duplicate keys in the index), you may achieve the same effect via the key as follows:

    f.first(0) == Series('a'->1, 'b'->4)
    f.firstCol('a') == Series(1,2,3)

    The at method returns an instance of org.saddle.scalar.Scalar, which behaves much like an Option; it can be either an instance of org.saddle.scalar.NA or an org.saddle.scalar.Value case class:

    f.at(0, 0) == scalar.Scalar(1)

    The rowSlice and colSlice methods allow slicing the Frame for locations in [i, j) irrespective of the value of the keys at those locations.

    f.rowSlice(0,1) == Frame('a'->Vec(1), 'b'->Vec(4))

    Finally, the method raw accesses a value directly, which may reveal the underlying representation of a missing value (so be careful).

    f.raw(0,0) == 1

    Frame may be used in arithmetic expressions which operate on two Frames or on a Frame and a scalar value. In the former case, the two Frames will automatically align along their indexes:

    f + f.shift(1) == Frame('a'->Vec(NA,3,5), 'b'->Vec(NA,9,11))
    RX

    The type of row keys

    CX

    The type of column keys

    T

    The type of entries in the frame
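
    The result of f + f.shift(1) in the Frame description can be checked by hand on each column. The sketch below models a column as a Vector of Option[Int], with None standing in for NA. It covers only the positional case where both operands share the same index, so key-based alignment is not modeled, and ShiftAddSketch, shift and add are hypothetical names rather than Saddle API.

```scala
// Hypothetical names; a column is modeled as Vector[Option[Int]] with None as NA.
object ShiftAddSketch {
  type Col = Vector[Option[Int]]

  /** Moves values n positions toward the end, filling the front with NA. */
  def shift(col: Col, n: Int): Col = {
    require(n >= 0, "this sketch only handles non-negative shifts")
    (Vector.fill[Option[Int]](n)(None) ++ col).take(col.length)
  }

  /** Element-wise sum; NA on either side yields NA. */
  def add(a: Col, b: Col): Col =
    a.zip(b).map {
      case (Some(x), Some(y)) => Some(x + y)
      case _                  => None
    }

  def main(args: Array[String]): Unit = {
    val a: Col = Vector(Some(1), Some(2), Some(3))
    val b: Col = Vector(Some(4), Some(5), Some(6))
    println(add(a, shift(a, 1))) // Vector(None, Some(3), Some(5))
    println(add(b, shift(b, 1))) // Vector(None, Some(9), Some(11))
  }
}
```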

  6. trait Index[T] extends AnyRef

    Index provides a constant-time look-up of a value within array-backed storage, as well as operations to support joining and slicing.

  7. final class Mat[T] extends NumericOps[Mat[T]]

    Mat is an immutable container for 2D homogeneous data (a "matrix").

    It is backed by a single array. Data is stored in row-major order.

    Several element access methods are provided.

    The at method returns an instance of org.saddle.scalar.Scalar, which behaves much like an Option in that it can be either an instance of org.saddle.scalar.NA or an org.saddle.scalar.Value case class:

    val m = Mat(2,2,Array(1,2,3,4))
    m.at(0,0) == Value(1)

    The method raw accesses the underlying value directly.

    val m = Mat(2,2,Array(1,2,3,4))
    m.raw(0,0) == 1d

    Mat may be used in arithmetic expressions which operate on two Mats or on a Mat and a primitive value. A few examples:

    val m = Mat(2,2,Array(1,2,3,4))
    m * m == Mat(2,2,Array(1,4,9,16))
    m dot m == Mat(2,2,Array(7d,10,15,22))
    m * 3 == Mat(2, 2, Array(3,6,9,12))

    Note that Mat is generally compatible with EJML's DenseMatrix. Converting to it can be convenient for more complex linear algebra, or for working with a mutable data structure.
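
    Row-major storage in a single backing array means that element (r, c) sits at offset r * numCols + c. The sketch below shows that arithmetic on a plain Array and uses it to reproduce the matrix product from the examples above. RowMajorSketch, offset and dot are hypothetical names, and the code is a model of the storage scheme, not Mat's implementation.

```scala
// Hypothetical names; a plain Array stands in for the backing storage.
object RowMajorSketch {
  /** Flat-array offset of element (r, c) when each row holds numCols values. */
  def offset(r: Int, c: Int, numCols: Int): Int = r * numCols + c

  /** Product of two n x n matrices stored in row-major order. */
  def dot(a: Array[Int], b: Array[Int], n: Int): Array[Int] =
    Array.tabulate(n * n) { i =>
      val (r, c) = (i / n, i % n)
      (0 until n).map(k => a(offset(r, k, n)) * b(offset(k, c, n))).sum
    }

  def main(args: Array[String]): Unit = {
    val m = Array(1, 2, 3, 4) // rows are (1, 2) and (3, 4)
    println(m(offset(1, 0, 2)))          // 3
    println(dot(m, m, 2).mkString(", ")) // 7, 10, 15, 22
  }
}
```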

  8. type NUM[C] = Numeric[C]

    Shorthand for numeric typeclass

  9. trait Numeric[T] extends ORD[T]
  10. type ORD[C] = Order[C]

    Shorthand for ordering typeclass

  11. implicit class OptionToScalar[T] extends AnyRef
  12. sealed trait PctMethod extends AnyRef

    Trait which specifies what percentile method to use

  13. implicit class PrimitiveToScalar[T] extends AnyRef
  14. sealed trait RankTie extends AnyRef

    Trait which specifies how to break a rank tie

  15. type ST[C] = ScalarTag[C]

    Shorthand for scalar tag typeclass

  16. implicit class SeqToFrame[RX, CX, T] extends AnyRef

    Augments Seq with a toFrame method that returns a new Frame instance.

    For example,

    val t = IndexedSeq(("a", "x", 3), ("b", "y", 4))
    val f = t.toFrame
    
    res0: org.saddle.Frame[java.lang.String,java.lang.String,Int] =
    [2 x 2]
          x  y
          -- --
    a ->  3 NA
    b -> NA  4
    RX

    Type of row index elements of Frame

    CX

    Type of col index elements of Frame

    T

    Type of data elements of Frame
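
    The toFrame output above places each (row, column, value) triple in a grid and leaves NA wherever no triple supplies a cell. A minimal model of that pivot in plain Scala follows, with None standing in for NA. TriplesToGridSketch and pivot are hypothetical names, and ordering the keys by first appearance is an assumption made for the sketch.

```scala
// Hypothetical names; None marks a cell for which no triple exists.
object TriplesToGridSketch {
  /** Returns (row keys, column keys, grid of cells). */
  def pivot(triples: Seq[(String, String, Int)]): (Seq[String], Seq[String], Seq[Seq[Option[Int]]]) = {
    val rows = triples.map(_._1).distinct
    val cols = triples.map(_._2).distinct
    val cells = triples.map { case (r, c, v) => (r, c) -> v }.toMap
    (rows, cols, rows.map(r => cols.map(c => cells.get((r, c)))))
  }

  def main(args: Array[String]): Unit = {
    val (rows, cols, grid) = pivot(Seq(("a", "x", 3), ("b", "y", 4)))
    println(rows) // List(a, b)
    println(cols) // List(x, y)
    println(grid) // List(List(Some(3), None), List(None, Some(4)))
  }
}
```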

  17. implicit class SeqToFrame2[RX, CX, T] extends AnyRef
  18. implicit class SeqToIndex[X] extends AnyRef

    Augments Seq with a toIndex method that returns a new Index instance.

    For example,

    val i = IndexedSeq(1,2,3)
    val s = i.toIndex
    X

    Type of index elements

  19. implicit class SeqToMat[T] extends AnyRef
  20. implicit class SeqToSeries[T, X] extends AnyRef

    Augments Seq with a toSeries method that returns a new Series instance.

    For example,

    val p = IndexedSeq(1,2,3) zip IndexedSeq(4,5,6)
    val s = p.toSeries
    T

    Type of data elements of Series

    X

    Type of index elements of Series

  21. implicit class SeqToVec[T] extends AnyRef

    Augments Seq with a toVec method that returns a new Vec instance.

    For example,

    val s = IndexedSeq(1,2,3)
    val v = s.toVec
    T

    Type of elements of Vec

  22. class Series[X, T] extends NumericOps[Series[X, T]]

    Series is an immutable container for 1D homogeneous data which is indexed by an associated sequence of keys.

    Both the index and value data are backed by arrays.

    Series is effectively an associative map whose keys have an ordering provided by the natural (provided) order of the backing array.

    Several element access methods are provided.

    The apply method returns a slice of the original Series:

    val s = Series(Vec(1,2,3,4), Index('a','b','b','c'))
    s('a') == Series('a'->1)
    s('b') == Series('b'->2, 'b'->3)

    Other ways to slice a series involve implicitly constructing an org.saddle.index.Slice object and passing it to the Series apply method:

    s('a'->'b') == Series('a'->1, 'b'->2, 'b'->3)
    s(* -> 'b') == Series('a'->1, 'b'->2, 'b'->3)
    s('b' -> *) == Series('b'->2, 'b'->3, 'c'->4)
    s(*) == s

    The at method returns an instance of org.saddle.scalar.Scalar, which behaves much like an Option in that it can be either an instance of org.saddle.scalar.NA or an org.saddle.scalar.Value case class:

    s.at(0) == Scalar(1)

    The slice method allows slicing the Series for locations in [i, j) irrespective of the value of the keys at those locations.

    s.slice(2,4) == Series('b'->3, 'c'->4)

    To slice explicitly by labels, use the sliceBy method, which is inclusive of the key boundaries:

    s.sliceBy('b','c') == Series('b'->2, 'b'->3, 'c'->4)

    The method raw accesses the value directly, which may reveal the underlying representation of a missing value (so be careful).

    s.raw(0) == 1

    Series may be used in arithmetic expressions which operate on two Series or on a Series and a scalar value. In the former case, the two Series will automatically align along their indexes. A few examples:

    s * 2 == Series('a'->2, 'b'->4, ... )
    s + s.shift(1) == Series('a'->NA, 'b'->3, 'b'->5, ...)
    X

    Type of elements in the index, for which there must be an implicit Ordering and ST

    T

    Type of elements in the values array, for which there must be an implicit ST
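
    The difference between positional slicing over [i, j) and label slicing that includes both key boundaries can be seen on the same keys and values used in the Series examples. The sketch below uses plain Vectors, assumes the keys are sorted, and uses hypothetical names (SliceSketch, slice, sliceBy). It models the semantics only and is not how Series is implemented.

```scala
// Hypothetical names; keys and values mirror the Series example above.
object SliceSketch {
  val keys: Vector[Char] = Vector('a', 'b', 'b', 'c')
  val values: Vector[Int] = Vector(1, 2, 3, 4)

  /** Positional slice over the half-open range [i, j). */
  def slice(i: Int, j: Int): Vector[(Char, Int)] =
    keys.zip(values).slice(i, j)

  /** Label slice that includes both boundaries; assumes sorted keys. */
  def sliceBy(from: Char, to: Char): Vector[(Char, Int)] =
    keys.zip(values).filter { case (k, _) => k >= from && k <= to }

  def main(args: Array[String]): Unit = {
    println(slice(2, 4))        // Vector((b,3), (c,4))
    println(slice(0, 2))        // Vector((a,1), (b,2))
    println(sliceBy('a', 'b'))  // Vector((a,1), (b,2), (b,3))
  }
}
```

    Note that slice(0, 2) stops before position 2, while sliceBy('a', 'b') keeps every entry keyed 'b', including the duplicate.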

  23. trait Vec[T] extends NumericOps[Vec[T]]

    Vec is an immutable container for 1D homogeneous data (a "vector").

    It is backed by an array and indexed from 0 to length - 1.

    Several element access methods are provided.

    The apply() method returns a slice of the original vector:

    val v = Vec(1,2,3,4)
    v(0) == Vec(1)
    v(1, 2) == Vec(2,3)

    The at method returns an instance of org.saddle.scalar.Scalar, which behaves much like an Option in that it can be either an instance of org.saddle.scalar.NA or an org.saddle.scalar.Value case class:

    Vec[Int](1,2,3,na).at(0) == Scalar(1)
    Vec[Int](1,2,3,na).at(3) == NA

    The method raw accesses the underlying value directly.

    Vec(1d,2,3).raw(0) == 1d

    Vec may be used in arithmetic expressions which operate on two Vecs or on a Vec and a scalar value. A few examples:

    Vec(1,2,3,4) + Vec(2,3,4,5) == Vec(3,5,7,9)
    Vec(1,2,3,4) * 2 == Vec(2,4,6,8)

    Note, Vec is implicitly convertible to an array for convenience; this could be abused to mutate the contents of the Vec. Try to avoid this!

    T

    Type of elements within the Vec

  24. implicit class VecDoubleOps extends AnyRef

    Specialized methods for Vec[Double]

    Methods in this class do not filter out NAs, e.g. Vec(NA,1d).max2 == NA rather than 1d.

Value Members

  1. def *: SliceAll

    Syntactic sugar, placeholder for 'slice-all'

    val v = Vec(1,2,3, 4)
    val u = v(*)
  2. implicit def any2Slice[T](p: T): SliceDefault[T]
  3. def clock[T](op: => T): (Double, T)

    Allow timing of an operation

    clock { bigMat.T dot bigMat }
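
    A timing helper with the same shape as the clock signature, taking a by-name block and returning a (Double, T) pair, might look like the sketch below. The page does not state the unit of the Double, so reporting elapsed seconds here is an assumption, and ClockSketch is a hypothetical name.

```scala
// Hypothetical object; the unit of the first tuple element (seconds) is assumed.
object ClockSketch {
  /** Evaluates op once and returns (elapsed seconds, result). */
  def clock[T](op: => T): (Double, T) = {
    val start = System.nanoTime()
    val result = op // the by-name argument is evaluated here, exactly once
    val elapsedSeconds = (System.nanoTime() - start) / 1e9
    (elapsedSeconds, result)
  }

  def main(args: Array[String]): Unit = {
    val (seconds, total) = clock { (1 to 1000).sum }
    println(s"computed $total in $seconds s")
  }
}
```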
  4. def concat[T](vecs: IndexedSeq[Vec[T]])(implicit arg0: ST[T]): Vec[T]
  5. implicit val doubleOrd: doubleIsNumeric.type
  6. implicit val floatOrd: floatIsNumeric.type
  7. implicit val intOrd: intIsNumeric.type
  8. implicit val longOrd: longIsNumeric.type
  9. def na[T](implicit st: ST[T]): T

    na provides syntactic sugar for constructing primitives recognized as NA.

    A use case is:

    Vec[Int](1,2,na,4)

    The NA bit pattern for integral types is MinValue because this leaves a symmetric range of remaining values; e.g. the remaining Byte range is [-127, +127].
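
    The symmetry argument for using MinValue as the integral NA pattern can be checked directly. The sketch below treats Int.MinValue as a sentinel and skips it when summing. IntegralNaSketch, isNa and sumSkippingNa are hypothetical names, and the code shows only the sentinel idea, not how na or ST are implemented.

```scala
// Hypothetical names; MinValue of each integral type is used as the NA sentinel.
object IntegralNaSketch {
  val ByteNa: Byte = Byte.MinValue // -128
  val IntNa: Int = Int.MinValue

  def isNa(x: Int): Boolean = x == IntNa

  /** Sums the values that are not the NA sentinel. */
  def sumSkippingNa(xs: Array[Int]): Long =
    xs.iterator.filterNot(isNa).map(_.toLong).sum

  def main(args: Array[String]): Unit = {
    // With -128 reserved, the usable Byte values run from -127 to 127.
    println((ByteNa + 1, Byte.MaxValue))            // (-127,127)
    println(sumSkippingNa(Array(1, 2, IntNa, 4)))   // 7
  }
}
```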

  10. implicit def pair2Slice[T](p: (T, T)): SliceDefault[T]

    Syntactic sugar, allow '->' to generate an (inclusive) index slice

    val v = Vec(1,2,3,4)
    val u = v(0 -> 2)
  11. implicit def pair2SliceFrom[T](p: (T, SliceAll)): SliceFrom[T]

    Syntactic sugar, allow ' -> *' to generate an (inclusive) index slice, open on right

    val v = Vec(1,2,3,4)
    val u = v(1 -> *)
  12. implicit def pair2SliceTo[T](p: (SliceAll, T)): SliceTo[T]

    Syntactic sugar, allow '* -> ' to generate an (inclusive) index slice, open on left

    val v = Vec(1,2,3,4)
    val u = v(* -> 2)
  13. object Buffer
  14. case object FillBackward extends FillMethod with Product with Serializable
  15. case object FillForward extends FillMethod with Product with Serializable
  16. object Frame extends BinOpFrame
  17. object Index
  18. object Mat
  19. object PctMethod
  20. object RankTie
  21. object Series extends BinOpSeries
  22. object Vec
  23. object doubleIsNumeric extends Numeric[Double] with DoubleTotalOrderTrait
  24. object floatIsNumeric extends Numeric[Float] with FloatTotalOrderTrait
  25. object intIsNumeric extends Numeric[Int]
  26. object longIsNumeric extends Numeric[Long]
  27. object order extends OrderInstances
