case class MultiheadAttention(wQ: Constant, wK: Constant, wV: Constant, wO: Constant, dropout: Double, train: Boolean, numHeads: Int, padToken: Long, linearized: Boolean) extends GenericModule[(Variable, Variable, Variable, STen), Variable] with Product with Serializable

Multi-head scaled dot product attention module

Input: (query, key, value, tokens) where

    query: batch x num queries x query dim
    key: batch x num k-v x key dim
    value: batch x num k-v x value dim
    tokens: batch x num queries, long type

The tokens tensor carries the padding information: positions whose token equals padToken are treated as padding and ignored by the attention.
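The masking idea described above can be sketched in plain Scala, without depending on this library. The sketch below is not the module's implementation: it handles a single head and a single batch element, uses nested arrays instead of tensors, omits the wQ, wK, wV and wO projections and dropout, and assumes self-attention so that tokens has one entry per key position. The names AttentionSketch, softmax and attention are hypothetical.

```scala
object AttentionSketch {
  type Matrix = Array[Array[Double]]

  // Numerically stable softmax over one row of scores.
  // Entries set to -Infinity (masked positions) receive weight 0.
  // Assumes at least one entry is finite, i.e. not every position is padding.
  def softmax(row: Array[Double]): Array[Double] = {
    val max = row.max
    val exps = row.map(v => math.exp(v - max))
    val sum = exps.sum
    exps.map(_ / sum)
  }

  // Scaled dot product attention for one head and one batch element.
  // q: num queries x dim, k: num k-v x dim, v: num k-v x value dim.
  // tokens: one token id per key position (assumption: self-attention).
  // Key positions whose token equals padToken are excluded from the softmax.
  def attention(
      q: Matrix,
      k: Matrix,
      v: Matrix,
      tokens: Array[Long],
      padToken: Long
  ): Matrix = {
    val scale = 1.0 / math.sqrt(k.head.length.toDouble)
    q.map { qi =>
      // One score per key position; padded positions are masked out.
      val scores = k.indices.map { j =>
        if (tokens(j) == padToken) Double.NegativeInfinity
        else qi.zip(k(j)).map { case (a, b) => a * b }.sum * scale
      }.toArray
      val weights = softmax(scores)
      // Weighted sum of the value rows.
      v.head.indices.map { d =>
        k.indices.map(j => weights(j) * v(j)(d)).sum
      }.toArray
    }
  }
}
```

Calling AttentionSketch.attention with tokens Array(1L, 2L, 0L) and padToken 0L gives the third key-value pair zero weight, so its value row does not contribute to the output, whatever it contains.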

Linear Supertypes
Serializable, Product, Equals, GenericModule[(Variable, Variable, Variable, STen), Variable], AnyRef, Any

Instance Constructors

  1. new MultiheadAttention(wQ: Constant, wK: Constant, wV: Constant, wO: Constant, dropout: Double, train: Boolean, numHeads: Int, padToken: Long, linearized: Boolean)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def apply[S](a: (Variable, Variable, Variable, STen))(implicit arg0: Sc[S]): Variable

    Alias of forward

    Definition Classes
    GenericModule
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native() @IntrinsicCandidate()
  7. val dropout: Double
  8. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  9. def forward[S](x: (Variable, Variable, Variable, STen))(implicit arg0: Sc[S]): Variable

    The implementation of the function.

    In addition to x, it can also use all the state to compute its value.

    Definition Classes
    MultiheadAttention → GenericModule
  10. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @IntrinsicCandidate()
  11. final def gradients(loss: Variable, zeroGrad: Boolean = true): Seq[Option[STen]]

    Computes the gradient of loss with respect to the parameters.

    Definition Classes
    GenericModule
  12. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  13. final def learnableParameters: Long

    Returns the total number of optimizable parameters.

    Definition Classes
    GenericModule
  14. val linearized: Boolean
  15. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  16. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @IntrinsicCandidate()
  17. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @IntrinsicCandidate()
  18. val numHeads: Int
  19. val padToken: Long
  20. final def parameters: Seq[(Constant, PTag)]

    Returns the state variables which need gradient computation.

    Definition Classes
    GenericModule
  21. def productElementNames: Iterator[String]
    Definition Classes
    Product
  22. val state: List[(Constant, LeafTag with Product with Serializable)]

    List of optimizable, or non-optimizable, but stateful parameters

    Stateful means that the state is carried over the repeated forward calls.

    Definition Classes
    MultiheadAttention → GenericModule
  23. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  24. val train: Boolean
  25. val wK: Constant
  26. val wO: Constant
  27. val wQ: Constant
  28. val wV: Constant
  29. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  30. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  31. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  32. final def zeroGrad(): Unit
    Definition Classes
    GenericModule

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable]) @Deprecated
    Deprecated
