object Umap
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @IntrinsicCandidate() @native()
- def edgeWeights(knnDistances: Mat[Double], knn: Mat[Int]): Mat[Double]
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @IntrinsicCandidate() @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @IntrinsicCandidate() @native()
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- def umap(data: Mat[Double], device: Device = CPU, precision: FloatingPointPrecision = DoublePrecision, k: Int = 10, numDim: Int = 2, knnMinibatchSize: Int = 1000, lr: Double = 0.1, iterations: Int = 500, minDist: Double = 0.0d, negativeSampleSize: Int = 5, randomSeed: Long = 42L, balanceAttractionsAndRepulsions: Boolean = true, repulsionStrength: Double = 1d, logger: Option[Logger] = None, positiveSamples: Option[Int] = None): (Mat[Double], Mat[Double], Double)
Dimension reduction similar to UMAP. For reference see https://arxiv.org/abs/1802.03426. This method does not follow the above paper exactly.
Maximizes the objective function: L(x) = L_attraction(x) + L_repulsion(x)
L_attraction(x) = sum over (i,j) edges: b_ij * ln(f(x_i, x_j)), where b_ij is the value of the 'UMAP graph' as in the above paper, x_i is the low dimensional coordinate of the i-th sample, and f(x, y) = 1 if ||x - y||_2 < minDist, or exp(-(||x - y||_2 - minDist)) otherwise.
L_repulsion(x) = sum over (i,j) edges: (1 - b_ij) * ln(1 - f(x_i, x_j)), evaluated with sampling. L_repulsion is evaluated by randomly sampling, in each iteration, from the (i,j) edges having b_ij = 0.
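The per-edge terms above can be sketched directly in Scala. This is a minimal illustration of the formulas, not the library's implementation; the names f, attraction and repulsion are hypothetical and simply follow the notation above:

```scala
object UmapObjectiveSketch {
  // f(x, y) as in the formulas above: 1 inside minDist, exponential decay beyond
  def f(x: Array[Double], y: Array[Double], minDist: Double): Double = {
    val d = math.sqrt(x.zip(y).map { case (a, b) => (a - b) * (a - b) }.sum)
    if (d < minDist) 1.0 else math.exp(-(d - minDist))
  }

  // attraction term of one edge (i, j): b_ij * ln(f(x_i, x_j))
  def attraction(b: Double, xi: Array[Double], xj: Array[Double], minDist: Double): Double =
    b * math.log(f(xi, xj, minDist))

  // repulsion term of one negatively sampled edge (b_ij = 0): ln(1 - f(x_i, x_j));
  // the max(..., 1e-12) guard avoids ln(0) when the two points coincide
  def repulsion(xi: Array[Double], xj: Array[Double], minDist: Double): Double =
    math.log(math.max(1.0 - f(xi, xj, minDist), 1e-12))
}
```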
Nearest neighbor search is performed by brute force. It may be batched, and may be evaluated on the GPU.
L(x) is optimized with Adam, a gradient-based method. Derivatives of L(x) are computed using reverse mode automatic differentiation (autograd). The optimization may be evaluated on the GPU.
The distance metric is always Euclidean.
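The brute-force Euclidean KNN described above can be sketched as follows. This is a self-contained illustration on plain arrays, not the library's implementation, which batches the distance matrix (knnMinibatchSize) and may run on the GPU:

```scala
object BruteForceKnnSketch {
  // For each row of `data`, return the indices of its k nearest rows under
  // squared Euclidean distance. Each point is its own nearest neighbor,
  // matching the convention that self is counted among the k neighbors.
  def knn(data: Array[Array[Double]], k: Int): Array[Array[Int]] =
    data.map { row =>
      data.indices
        .map { j =>
          val d = row.zip(data(j)).map { case (a, b) => (a - b) * (a - b) }.sum
          (j, d)
        }
        .sortBy(_._2)
        .take(k)
        .map(_._1)
        .toArray
    }
}
```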
Differences to the algorithm described in the UMAP paper:
- The paper describes a smooth approximation of the function 'f' (Definition 11.). That approximation is not used in this code.
- The paper describes an optimization procedure different from the approach taken here. They sample each edge according to b_ij and update the vertices one after the other. The current code updates all locations together according to the derivative of L(x).
- data
each row is a sample
- device
device to run the optimization and KNN search (GPU or CPU)
- precision
precision to run the KNN search, optimization is always in double precision
- k
number of nearest neighbors to retrieve; the point itself is counted as a nearest neighbor
- numDim
number of dimensions to project to
- knnMinibatchSize
KNN search may be batched if the device can't fit the whole distance matrix
- lr
learning rate
- iterations
number of epochs of optimization
- minDist
see above equations for the definition, see the UMAP paper for its effect
- negativeSampleSize
number of negative edges to select for each positive
- balanceAttractionsAndRepulsions
if true, the number of negative samples does not affect the relative strength of attractions and repulsions (see repulsionStrength)
- repulsionStrength
strength of repulsions compared to attractions
- returns
a triple of the layout, the UMAP graph (b), and the final optimization loss
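A hypothetical invocation might look like the sketch below. It assumes lamp and saddle are on the classpath, that the Umap object documented here is in scope, and uses saddle's mat.randn merely to generate placeholder data:

```scala
import org.saddle._
import lamp.CPU

// 100 samples with 20 features each, as random placeholder data
val data: Mat[Double] = mat.randn(100, 20)

val (layout, umapGraph, loss) = Umap.umap(
  data = data,
  device = CPU,
  k = 10,
  numDim = 2,
  iterations = 500
)
// layout is a 100 x 2 Mat[Double]; umapGraph holds the b_ij values
```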
- def umapCustomKnn(knn: Mat[Int], knnDistances: Mat[Double], device: Device = CPU, numDim: Int = 2, lr: Double = 0.1, iterations: Int = 500, minDist: Double = 0.0d, negativeSampleSize: Int = 5, randomSeed: Long = 42L, balanceAttractionsAndRepulsions: Boolean = true, repulsionStrength: Double = 1d, logger: Option[Logger] = None, positiveSamples: Option[Int] = None): (Mat[Double], Mat[Double], Double)
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
Deprecated Value Members
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable]) @Deprecated
- Deprecated
(Since version 9)