ttnn.snake_beta

ttnn.snake_beta(input_tensor: ttnn.Tensor, alpha: ttnn.Tensor, beta: ttnn.Tensor, *, memory_config: ttnn.MemoryConfig = None, output_tensor: ttnn.Tensor = None) → ttnn.Tensor

Computes the SnakeBeta activation element-wise on input_tensor:

\[\text{output}_i = \text{input}_i + \frac{\sin^2(\text{alpha}_i \cdot \text{input}_i)}{\text{beta}_i}\]

This is the BigVGAN-style Snake activation with separate learnable alpha and beta parameters.
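
For checking results on the host, the formula can be sketched in plain Python. This is a reference sketch only, not the device kernel: the function name `snake_beta_reference` and the list-of-rows representation are assumptions made for illustration.

```python
import math

def snake_beta_reference(rows, alpha, beta):
    """Host-side reference (hypothetical helper): y = x + sin^2(alpha * x) / beta.

    rows  : list of rows, each a list of floats (last dimension = channels)
    alpha : per-channel frequency values, one per element of a row
    beta  : per-channel denominators, one per element of a row, all non-zero
    """
    out = []
    for row in rows:
        if not (len(row) == len(alpha) == len(beta)):
            raise ValueError("alpha and beta must match the last dimension")
        # alpha and beta are reused for every row, i.e. broadcast over all
        # dimensions except the last one.
        out.append([x + math.sin(a * x) ** 2 / b
                    for x, a, b in zip(row, alpha, beta)])
    return out

x = [[0.0, 1.0], [math.pi / 2, -1.0]]
y = snake_beta_reference(x, alpha=[1.0, 2.0], beta=[1.0, 0.5])
```

Device results in BFLOAT16 will differ from this float64 reference by rounding, so a comparison should use a tolerance rather than exact equality.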

Parameters:

input_tensor (ttnn.Tensor) – the input tensor. Must be rank >= 2 and in TILE layout.
alpha (ttnn.Tensor) – the frequency parameter tensor. Holds one value per element of the last dimension of input_tensor and is broadcast across all other dimensions.
beta (ttnn.Tensor) – the denominator parameter tensor. Same shape as alpha. Caller is responsible for ensuring beta != 0 (no internal epsilon).
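
Because the op applies no epsilon, any guard against beta == 0 has to happen before the tensor is uploaded. One possible host-side guard is sketched below. The helper name `guard_beta` and the default `eps` are hypothetical, and replacing zeros changes the result for those channels, so whether this is acceptable depends on the model.

```python
def guard_beta(beta, eps=1e-9):
    """Return a copy of beta with exact zeros replaced by eps (hypothetical helper)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    # Only exact zeros are replaced; non-zero values, including negative
    # ones, are passed through unchanged.
    return [eps if b == 0 else b for b in beta]

safe_beta = guard_beta([1.0, 0.0, -0.5])
```

The guarded values would then be converted to a tensor and passed as beta in the usual way.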

Keyword Arguments:

memory_config (ttnn.MemoryConfig, optional) – memory configuration for the output. Defaults to None.
output_tensor (ttnn.Tensor, optional) – preallocated output tensor. Defaults to None.

Note

alpha, beta, and input_tensor must all be TILE layout and share the same dtype (BFLOAT16 or FLOAT32).
alpha.shape == beta.shape; both may only have non-1 size on the last dimension, which must equal input_tensor.shape[-1].
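
The shape constraints above can be checked on the host before calling the op. The following is a minimal sketch operating on plain shape tuples; `check_snake_beta_shapes` is a hypothetical helper, not part of ttnn, and it reads the note literally (the last dimension of alpha and beta must equal that of the input).

```python
def check_snake_beta_shapes(input_shape, alpha_shape, beta_shape):
    """Raise ValueError if the shapes violate the documented constraints."""
    if len(input_shape) < 2:
        raise ValueError("input_tensor must have rank >= 2")
    if len(alpha_shape) == 0:
        raise ValueError("alpha must have at least one dimension")
    if tuple(alpha_shape) != tuple(beta_shape):
        raise ValueError("alpha and beta must have the same shape")
    # Every dimension except the last must have size 1.
    if any(d != 1 for d in alpha_shape[:-1]):
        raise ValueError("alpha/beta may only be non-1 on the last dimension")
    if alpha_shape[-1] != input_shape[-1]:
        raise ValueError("alpha/beta last dimension must equal input_shape[-1]")

# Both of these satisfy the constraints and return without raising.
check_snake_beta_shapes((1, 32, 48), (48,), (48,))
check_snake_beta_shapes((1, 32, 48), (1, 1, 48), (1, 1, 48))
```

Layout and dtype agreement are not covered by this sketch and would need to be checked on the tensors themselves.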

Example

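>>> import torch, ttnn
>>> device = ttnn.open_device(device_id=0)
>>> input_tensor = ttnn.from_torch(torch.randn(1, 32, 48, dtype=torch.bfloat16), layout=ttnn.TILE_LAYOUT, device=device)
>>> alpha = ttnn.from_torch(torch.ones(48, dtype=torch.bfloat16), layout=ttnn.TILE_LAYOUT, device=device)
>>> beta = ttnn.from_torch(torch.ones(48, dtype=torch.bfloat16), layout=ttnn.TILE_LAYOUT, device=device)
>>> output = ttnn.snake_beta(input_tensor, alpha, beta)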

Returns a tensor with the same layout as input_tensor.