# Table of Contents

- [Operators](#operators)
    - [Activation](#activation)
    - [AddN](#addn)
    - [ArgMin/ArgMax](#argminargmax)
    - [Batch2Space](#batch2space)
    - [BatchNorm](#batchnorm)
    - [Broadcast](#broadcast)
    - [Clip](#clip)
    - [Concat](#concat)
    - [Conv2d](#conv2d)
    - [Conv3d](#conv3d)
    - [DeConv2d](#deconv2d)
    - [DeConv1d](#deconv1d)
    - [DepthToSpace](#depthtospace)
    - [Dropout](#dropout)
    - [Add](#add)
    - [Sub](#sub)
    - [Multiply](#multiply)
    - [Div](#div)
    - [Pow](#pow)
    - [Minimum](#minimum)
    - [Maximum](#maximum)
    - [FloorDiv](#floordiv)
    - [Erf](#erf)
    - [FullyConnected](#fullyconnected)
    - [Gather](#gather)
    - [GatherElements](#gatherelements)
    - [GatherNd](#gathernd)
    - [GroupedConv1d](#groupedconv1d)
    - [GroupedConv2d](#groupedconv2d)
    - [L2Normalization](#l2normalization)
    - [LocalResponseNormalization](#localresponsenormalization)
    - [And](#and)
    - [Or](#or)
    - [LogSoftmax](#logsoftmax)
    - [Matmul](#matmul)
    - [MaxpooGrad](#maxpoograd)
    - [MaxpoolWithArgmax](#maxpoolwithargmax)
    - [MaxpoolWithArgmax2](#maxpoolwithargmax2)
    - [MaxUnpool2d](#maxunpool2d)
    - [Moments](#moments)
    - [NBG](#nbg)
    - [OneHot](#onehot)
    - [Pad](#pad)
    - [Pool2d](#pool2d)
        - [Classic Pool2d](#classic-pool2d)
        - [Global Pool2d](#global-pool2d)
        - [Adaptive Pool2d](#adaptive-pool2d)
    - [ReduceMin](#reducemin)
    - [ReduceMax](#reducemax)
    - [ReduceAny](#reduceany)
    - [ReduceAll](#reduceall)
    - [ReduceProd](#reduceprod)
    - [ReduceMean](#reducemean)
    - [ReduceSum](#reducesum)
    - [Greater](#greater)
    - [GreaterOrEqual](#greaterorequal)
    - [Less](#less)
    - [LessOrEqual](#lessorequal)
    - [NotEqual](#notequal)
    - [Equal](#equal)
    - [Reorg](#reorg)
    - [Reshape](#reshape)
    - [Resize](#resize)
    - [Resize1d](#resize1d)
    - [Reverse](#reverse)
    - [RoiAlign](#roialign)
    - [RoiPool](#roipool)
    - [ScatterND](#scatternd)
    - [Select](#select)
    - [DataConvert](#dataconvert)
    - [Neg](#neg)
    - [Abs](#abs)
    - [Sin](#sin)
    - [Exp](#exp)
    - [Log](#log)
    - [Sqrt](#sqrt)
    - [Rsqrt](#rsqrt)
    - [Square](#square)
    - [LogicalNot](#logicalnot)
    - [Floor](#floor)
    - [Ceil](#ceil)
    - [Cast](#cast)
    - [Slice](#slice)
    - [Softmax](#softmax)
    - [Space2Batch](#space2batch)
    - [SpaceToDepth](#spacetodepth)
    - [Split](#split)
    - [Squeeze](#squeeze)
    - [Stack](#stack)
    - [StridedSlice](#stridedslice)
    - [Svdf](#svdf)
    - [Tile](#tile)
    - [Topk](#topk)
    - [Transpose](#transpose)
    - [Unidirectional sequence lstm](#unidirectional-sequence-lstm)
    - [Unstack](#unstack)

# Operators

## Activation

Activation functions:

```
Relu(x)                : max(0, x)

Relu1(x)               : -1 if x <= -1; x if -1 < x < 1; 1 if x >= 1

Relu6(x)               : 0 if x <= 0; x if 0 < x < 6; 6 if x >= 6

Elu(x)                 : x if x >= 0 else alpha*(e^x - 1)

Tanh(x)                : (1 - e^{-2x})/(1 + e^{-2x})

Sigmoid(x)             : 1/(1 + e^{-x})

Swish(x)               : x * sigmoid(x)

HardSwish(x)           : 0 if x <= -3; x(x + 3)/6 if -3 < x < 3; x if x >= 3

HardSigmoid(x)         : min(max(alpha*x + beta, 0), 1)

SoftRelu(x)            : log(1 + e^x). Also known as SoftPlus.

Mish(x)                : x * tanh(softrelu(x))

LeakyRelu(x)           : alpha * x if x <= 0; x if x > 0. alpha is a scalar.

Prelu(x)               : alpha * x if x <= 0; x if x > 0. alpha is a tensor.
- axis                 : describes the axis of the inputs when coerced to 2D.

Linear(x, a, b)        : a*x + b.

Gelu(x)                : x * P(X <= x), where X ~ N(0, 1). https://tensorflow.google.cn/api_docs/python/tf/nn/gelu

Selu(x, alpha, gamma)  : gamma * x if x >= 0; gamma * alpha * (exp(x) - 1) if x < 0

Celu(x, alpha)         : x if x >= 0; alpha * (exp(x/alpha) - 1) if x < 0
```

## AddN

```
AddN(x) : Input0 + Input1 + ... + InputN
```

## ArgMin/ArgMax

Computes the indices of the **min/max** elements of the input tensor along the
provided **axis**. The type of the output tensor is integer.

## Batch2Space

This operation reshapes the batch dimension (dimension 0) into M + 1 dimensions
of shape **block_size** + [batch], and interleaves these blocks back into the grid
defined by the spatial dimensions [1, ..., M], to obtain a result with the same
rank as the input. This is the reverse transformation of Space2Batch.

- crop : crop the output tensor for ROI usage.
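
The Batch2Space rearrangement described above can be sketched in plain Python. This is only an illustration under stated assumptions: it follows TensorFlow-style `batch_to_space` ordering, uses a hypothetical single-channel nested-list layout `x[batch][row][col]` rather than TIM-VX's own tensor layout, and the function name and the `crop` argument shape are made up for the example.

```python
def batch_to_space(x, block_h, block_w, crop=((0, 0), (0, 0))):
    """Move batch entries into spatial blocks, then crop the result.

    x    : nested list indexed as x[batch][row][col] (hypothetical layout).
    crop : ((top, bottom), (left, right)) amounts removed from the output.
    """
    blocks = block_h * block_w
    if len(x) % blocks != 0:
        raise ValueError("batch must be divisible by block_h * block_w")
    out_batch = len(x) // blocks
    in_h, in_w = len(x[0]), len(x[0][0])

    out = [[[None] * (in_w * block_w) for _ in range(in_h * block_h)]
           for _ in range(out_batch)]
    for b, image in enumerate(x):
        # Input batch index b encodes (position inside the block, output batch).
        offset, n = divmod(b, out_batch)
        i, j = divmod(offset, block_w)  # row / column offset inside the block
        for h in range(in_h):
            for w in range(in_w):
                out[n][h * block_h + i][w * block_w + j] = image[h][w]

    (top, bottom), (left, right) = crop
    return [[row[left:len(row) - right] for row in img[top:len(img) - bottom]]
            for img in out]


# Four 1x1 images become one 2x2 image.
print(batch_to_space([[[1]], [[2]], [[3]], [[4]]], 2, 2))  # [[[1, 2], [3, 4]]]
```

The crop step runs after the interleave, which matches the description of `crop` as trimming the output tensor.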
## BatchNorm

Carries out batch normalization as described in the paper
https://arxiv.org/abs/1502.03167.

$$\hat x_i\leftarrow \frac{x_i-\mu_\mathcal{B}}{\sqrt{\sigma_\mathcal{B}^2+\epsilon}}$$

$$y_i=\gamma\hat x_i+\beta\equiv BN_{\gamma,\beta}(x_i)$$

## Broadcast

Broadcasts an array to a compatible shape. See also numpy.broadcast_to().

Input:
- input.

Attribute:
- shape: the shape to broadcast to.
- dimensions (optional): which dimension in the target shape each dimension
of the operand shape corresponds to. For BroadcastInDim.

## Clip

Clip(x) : min if x <= min; x if min < x < max; max if x >= max

## Concat

Concatenate a list of tensors into a single tensor.

- axis : Which axis to concat on.

## Conv2d

Performs a 2-D convolution operation, including classic Conv2D /
Depthwise Conv2D / Group Conv2D / Dilation Conv2D.

Input:
- input [WHCN or CWHN].
- kernel [ WHIcOc ] (Ic: Input Channels. Oc: Output Channels).
- bias [ O ]. Optional.

Attribute:
- weights : the output channel number for weight tensor.
- ksize : the height and width for weight tensor.
- padding : AUTO, VALID or SAME.
- pad : pad value for each spatial axis.
- stride : stride along each spatial axis.
- dilation : dilation value along each spatial axis of the filter.
- multiplier: functions similarly to the group attribute in other frameworks,
but the value is different. multiplier = weights / group.
- layout : WHCN or CWHN.

## Conv3d

Performs a 3-D convolution operation.

Input:
- input [WHDCN].
- kernel [ WHDIcOc ] (Ic: Input Channels. Oc: Output Channels).
- bias [ O ]. Optional.

Attribute:
- weights : the output channel number for weight tensor.
- ksize : the height and width for weight tensor.
- padding : AUTO, VALID or SAME.
- pad : pad value for each spatial axis. (left, right, top, bottom, front, rear).
- stride : stride along each spatial axis.
- dilation : dilation value along each spatial axis of the filter.
- multiplier: functions similarly to the group attribute in other frameworks,
but the value is different.
multiplier = weights / group.
- input_layout : WHDCN or WHCDN.
- kernel_layout : WHDIcOc

## DeConv2d

Performs the transpose of 2-D convolution operation.

This operation is sometimes called "deconvolution" after Deconvolutional Networks,
but is actually the transpose (gradient) of Conv2D rather than an actual deconvolution.

- oc_count_ : the out channel count for weight tensor.
- pad_type : SAME, VALID or AUTO.
- ksize : the height and width for weight tensor.
- padding : AUTO, VALID or SAME.
- pad : pad value for each spatial axis.
- stride : stride along each spatial axis.
- output_padding : specifies the amount of padding along the height and width of
the output tensor.
- group : the feature count of each group.
- input_layout : Layout for input, WHCN by default.
- kernel_layout: Layout for kernel, WHIO by default.

## DeConv1d

Performs the transpose of 1-D convolution operation.

This operation is sometimes called "deconvolution1d" after Deconvolutional Networks,
but is actually the transpose (gradient) of Conv1D rather than an actual deconvolution.

- weights : the channel number for weight tensor.
- ksize : the length for weight tensor.
- padding : AUTO, VALID or SAME.
- pad : pad value for each spatial axis.
- stride : stride along each spatial axis.
- output_padding : specifies the amount of padding along the spatial axis of
the output tensor.

## DepthToSpace

DepthToSpace rearranges (permutes) data from depth into blocks of spatial data.
This is the reverse transformation of SpaceToDepth.

Chunks of data of size block_size * block_size from depth are rearranged into
non-overlapping blocks of size block_size x block_size.

The width of the output tensor is input_width * block_size, whereas the height
is input_height * block_size. The depth of the input tensor must be divisible by
block_size * block_size.

- crop : crop the output tensor for ROI usage.
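
As an illustration of the DepthToSpace rearrangement, here is a minimal pure-Python sketch. It assumes the TensorFlow-style ordering of depth chunks and a hypothetical nested-list layout `x[row][col][channel]`; the function name is made up for the example and the `crop` attribute is left out. It is not TIM-VX's implementation or layout.

```python
def depth_to_space(x, block_size):
    """Rearrange depth chunks into block_size x block_size spatial blocks.

    x is a nested list indexed as x[row][col][channel] (hypothetical layout).
    """
    in_h, in_w, in_c = len(x), len(x[0]), len(x[0][0])
    chunk = block_size * block_size
    if in_c % chunk != 0:
        raise ValueError("depth must be divisible by block_size * block_size")
    out_c = in_c // chunk

    out = [[[None] * out_c for _ in range(in_w * block_size)]
           for _ in range(in_h * block_size)]
    for h in range(in_h):
        for w in range(in_w):
            for c in range(in_c):
                # Input channel c encodes (position inside block, output channel).
                offset, oc = divmod(c, out_c)
                i, j = divmod(offset, block_size)
                out[h * block_size + i][w * block_size + j][oc] = x[h][w][c]
    return out


# One pixel with depth 4 becomes a 2x2 block with depth 1.
print(depth_to_space([[[1, 2, 3, 4]]], 2))  # [[[1], [2]], [[3], [4]]]
```

Both spatial extents grow by `block_size` while the depth shrinks by `block_size * block_size`, so the element count is unchanged.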
## Dropout

The Dropout layer randomly sets input units to 0 with a frequency of rate at
each step during training time, which helps prevent overfitting.

TIM-VX focuses only on inference, so the Dropout operator just scales the input
tensor by **ratio**.

## Add

Add(x, y) : x + y. This operation supports broadcasting.

## Sub

Sub(x, y) : x - y. This operation supports broadcasting.

## Multiply

Multiply(x, y) : Multiplies two tensors, element-wise, also known as Hadamard
product. This operation supports broadcasting.

- scale: scaling the product.

## Div

Div(x, y) : x / y. This operation supports broadcasting.

## Pow

Pow(x, y) : x ^ y. This operation supports broadcasting.

## Minimum

Minimum(x, y) : min(x, y). This operation supports broadcasting.

## Maximum

Maximum(x, y) : max(x, y). This operation supports broadcasting.

## FloorDiv

FloorDiv(x, y): floor( x / y ). This operation supports broadcasting.

## Erf

Computes the Gauss error function of x element-wise.

- no parameters

## FullyConnected

Denotes a fully (densely) connected layer, which connects all elements in the
input tensor with each element in the output tensor.

- axis: Describes the axis of the inputs when coerced to 2D.
- weights: the output channel number for weight tensor.

## Gather

Gathers slices from input along **axis** according to **indices**.

## GatherElements

Gathers elements from input along **axis** according to **indices**.

- out[i][j][k] = input[index[i][j][k]][j][k] if axis = 0,
- out[i][j][k] = input[i][index[i][j][k]][k] if axis = 1,
- out[i][j][k] = input[i][j][index[i][j][k]] if axis = 2.

See https://github.com/onnx/onnx/blob/main/docs/Operators.md#GatherElements

## GatherNd

An operation similar to Gather but gathers across multiple axes at once.

## GroupedConv1d

Performs a grouped 1-D convolution operation.

Input:
- input [WCN].
- kernel [ WIcOc ] (Ic: Input Channels. Oc: Output Channels). Ic * group = C.
- bias [ O ]. Optional.

Attribute:
- weights : the output channel number for weight tensor.
- ksize : the height and width for weight tensor.
- padding : AUTO, VALID or SAME.
- pad : pad value for each spatial axis.
- stride : stride along each spatial axis.
- dilation : dilation value along each spatial axis of the filter.
- group: splits the convolution into n groups.
- layout : WCN or CWN.

## GroupedConv2d

Performs a grouped 2-D convolution operation.

Input:
- input [WHCN or CWHN].
- kernel [ WHIcOc ] (Ic: Input Channels. Oc: Output Channels).
- bias [ O ]. Optional.

Attribute:
- weights : the output channel number for weight tensor.
- ksize : the height and width for weight tensor.
- padding : AUTO, VALID or SAME.
- pad : pad value for each spatial axis.
- stride : stride along each spatial axis.
- dilation : dilation value along each spatial axis of the filter.
- group_number: splits the convolution into n groups.
- layout : WHCN or CWHN.

## L2Normalization

Applies L2 normalization along the axis dimension:

```
output[batch, row, col, channel] =
  input[batch, row, col, channel] /
  sqrt(sum_{c} pow(input[batch, row, col, c], 2))
```

## LocalResponseNormalization

Applies Local Response Normalization along the depth dimension:

```
sqr_sum[a, b, c, d] = sum(
    pow(input[a, b, c, d - depth_radius : d + depth_radius + 1], 2))
output = input / pow((bias + alpha * sqr_sum), beta)
```

## And

Returns the truth value of x AND y element-wise. This operation supports broadcasting.

## Or

Returns the truth value of x OR y element-wise. This operation supports broadcasting.

## LogSoftmax

Computes the log softmax activation on the input tensor element-wise, per batch.

```
logsoftmax = logits - log(reduce_sum(exp(logits), axis))
```

## Matmul

Multiplies matrix a by matrix b, producing a * b.

- transpose_a: If True, a is transposed before multiplication.
- transpose_b: If True, b is transposed before multiplication.
- adjoint_a: If True, a is conjugated and transposed before multiplication.
- adjoint_b: If True, b is conjugated and transposed before multiplication.
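
The effect of the four Matmul flags can be sketched on small nested-list matrices. This is a plain-Python illustration, not the TIM-VX API: the function names are hypothetical, and rejecting a transpose flag combined with the matching adjoint flag is an assumption borrowed from TensorFlow's `matmul`.

```python
def transpose(m):
    """Swap rows and columns of a nested-list matrix."""
    return [list(col) for col in zip(*m)]


def adjoint(m):
    """Conjugate transpose; for real values this equals the transpose."""
    return [[v.conjugate() for v in col] for col in zip(*m)]


def matmul(a, b, transpose_a=False, transpose_b=False,
           adjoint_a=False, adjoint_b=False):
    # Assumption: a transpose flag and the matching adjoint flag are exclusive.
    if (transpose_a and adjoint_a) or (transpose_b and adjoint_b):
        raise ValueError("set at most one of transpose/adjoint per operand")
    if transpose_a:
        a = transpose(a)
    elif adjoint_a:
        a = adjoint(a)
    if transpose_b:
        b = transpose(b)
    elif adjoint_b:
        b = adjoint(b)
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions do not match")
    # Each output element is the dot product of a row of a and a column of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul(a, b))                    # [[19, 22], [43, 50]]
print(matmul(a, b, transpose_b=True))  # [[17, 23], [39, 53]]
```

For real-valued inputs the adjoint and transpose flags give the same result; they differ only for complex values.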
## MaxpooGrad

Computes the gradient of a 2-D max pooling operation with respect to its input tensor. \
Similar to the TensorFlow XLA op SelectAndScatter, see https://tensorflow.google.cn/xla/operation_semantics?hl=en#selectandscatter.

- padding : AUTO, VALID or SAME.
- ksize : filter size.
- stride : stride along each spatial axis.
- round_type : CEILING or FLOOR.

Inputs:
- 0 : input tensor of 2-D Max pooling.
- 1 : gradient of 2-D Max pooling output tensor.

## MaxpoolWithArgmax

Performs a 2-D max pooling operation and returns the indices.

- padding : AUTO, VALID or SAME.
- ksize : filter size.
- stride : stride along each spatial axis.
- round_type : CEILING or FLOOR.

## MaxpoolWithArgmax2

Performs a 2-D max pooling operation and returns the indices (which start at the
beginning of the input tensor).

- padding : AUTO, VALID or SAME.
- ksize : filter size.
- stride : stride along each spatial axis.
- round_type : CEILING or FLOOR.

## MaxUnpool2d

Performs the upsampling that reverses a 2-D max pooling operation.

- stride : stride along each spatial axis.
- ksize : filter size.

## Moments

The mean and variance are calculated by aggregating the contents of x across axes.
If x is 1-D and axes = [0] this is just the mean and variance of a vector.

- axes : Axes along which to compute mean and variance.
- keep_dims : Produce moments with the same dimensionality as input.

## NBG

Network Binary Graph (NBG) is a precompilation technology that compiles a fused
graph into a binary file.

## OneHot

Create a one-hot tensor.

- depth : A scalar defining the depth of the one hot dimension.
- on_value : A scalar defining the value written at the positions selected by the indices.
- off_value : A scalar defining the value written at all other positions.
- axis : The axis to fill.

## Pad

Pads a tensor.

- const_val : the value to pad.
- pad_mode : the mode of pad.
- front_size : Add pad values to the left and top.
- back_size : Add pad values to the right and bottom.

## Pool2d

### Classic Pool2d

Performs a 2-D pooling operation.

- type : MAX, AVG, L2 or AVG_ANDROID.
- padding : AUTO, VALID or SAME.
- pad : Specify the number of pad values for left, right, top, and bottom.
- ksize : filter size.
- stride : stride along each spatial axis.
- round_type : CEILING or FLOOR.

### Global Pool2d

- type : MAX, AVG, L2 or AVG_ANDROID.
- input_size : input size (only [W, H]).
- round_type : CEILING or FLOOR.

### Adaptive Pool2d

Same as torch.nn.AdaptiveXXXPool2d.

- type : MAX, AVG, L2 or AVG_ANDROID.
- input_size : input size (only [W, H]).
- output_size : output size (only [W, H]).
- round_type : CEILING or FLOOR.

## ReduceMin

Reduces a tensor by computing the minimum of elements along given dimensions.

- axis : the dimensions to reduce.
- keep_dims : If keep_dims is true, the reduced dimensions are retained with
length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axis.

## ReduceMax

Reduces a tensor by computing the maximum of elements along given dimensions.

- axis : the dimensions to reduce.
- keep_dims : If keep_dims is true, the reduced dimensions are retained with
length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axis.

## ReduceAny

Reduces a tensor by computing the "logical or" of elements along given dimensions.

- axis : the dimensions to reduce.
- keep_dims : If keep_dims is true, the reduced dimensions are retained with
length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axis.

## ReduceAll

Reduces a tensor by computing the "logical and" of elements along given dimensions.

- axis : the dimensions to reduce.
- keep_dims : If keep_dims is true, the reduced dimensions are retained with
length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axis.

## ReduceProd

Reduces a tensor by computing the product of elements along given dimensions.

- axis : the dimensions to reduce.
- keep_dims : If keep_dims is true, the reduced dimensions are retained with
length 1.
Otherwise, the rank of the tensor is reduced by 1 for each entry in axis.

## ReduceMean

Reduces a tensor by computing the mean of elements along given dimensions.

- axis : the dimensions to reduce.
- keep_dims : If keep_dims is true, the reduced dimensions are retained with
length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axis.

## ReduceSum

Reduces a tensor by computing the sum of elements along given dimensions.

- axis : the dimensions to reduce.
- keep_dims : If keep_dims is true, the reduced dimensions are retained with
length 1. Otherwise, the rank of the tensor is reduced by 1 for each entry in axis.

## Greater

For input tensors x and y, computes x > y elementwise.

## GreaterOrEqual

For input tensors x and y, computes x >= y elementwise.

## Less

For input tensors x and y, computes x < y elementwise.

## LessOrEqual

For input tensors x and y, computes x <= y elementwise.

## NotEqual

For input tensors x and y, computes x != y elementwise.

## Equal

For input tensors x and y, computes x == y elementwise.

## Reorg

The layer used in YOLOv2. See also https://github.com/pjreddie/darknet/blob/master/src/reorg_layer.c

## Reshape

Given a tensor, this operation returns a tensor that has the same values as the
input, but with a newly specified shape.

- size : defines the shape of the output tensor.

## Resize

Resizes images to given size.

- type : NEAREST_NEIGHBOR, BILINEAR or AREA.
- factor : scale the input size. DO NOT use it with target_height / target_width together.
- align_corners : If True, the centers of the 4 corner pixels of the input and output
tensors are aligned, preserving the values at the corner pixels.
- half_pixel_centers : If True, the pixel centers are assumed to be at (0.5, 0.5).
This is the default behavior of image.resize in TF 2.0. If this parameter is True,
then align_corners parameter must be False.
- target_height / target_width : output height / width. DO NOT use it with factor together.
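
To make the `align_corners` and `half_pixel_centers` flags more concrete, the sketch below maps an output index back to a source coordinate and uses it for linear interpolation along a single axis. It follows the TensorFlow convention that the parameter descriptions refer to; whether TIM-VX matches it in every edge case is an assumption, and both function names are hypothetical.

```python
def source_coordinate(dst, in_size, out_size,
                      align_corners=False, half_pixel_centers=False):
    """Map an output index to a (fractional) input coordinate."""
    if align_corners and half_pixel_centers:
        raise ValueError("align_corners must be False with half_pixel_centers")
    if align_corners and out_size > 1:
        # First and last samples of input and output coincide.
        return dst * (in_size - 1) / (out_size - 1)
    scale = in_size / out_size
    if half_pixel_centers:
        # Pixel centers sit at index + 0.5.
        return (dst + 0.5) * scale - 0.5
    return dst * scale


def resize_linear(row, out_size, **flags):
    """Linear interpolation along one axis (a 1-D slice of BILINEAR)."""
    in_size = len(row)
    out = []
    for dst in range(out_size):
        src = source_coordinate(dst, in_size, out_size, **flags)
        src = min(max(src, 0.0), in_size - 1)  # clamp to the valid range
        lo = int(src)                          # floor, since src >= 0
        hi = min(lo + 1, in_size - 1)
        frac = src - lo
        out.append(row[lo] * (1 - frac) + row[hi] * frac)
    return out


print(resize_linear([0.0, 10.0], 3, align_corners=True))  # [0.0, 5.0, 10.0]
```

With `align_corners=True` the end values are preserved exactly, which is the behaviour the attribute description states for the corner pixels.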
## Resize1d

Resizes 1D tensors to the given size.

- type : NEAREST_NEIGHBOR, BILINEAR or AREA.
- factor : scale the input size. DO NOT use it with target_height / target_width together.
- align_corners : If True, the centers of the 4 corner pixels of the input and output
tensors are aligned, preserving the values at the corner pixels.
- half_pixel_centers : If True, the pixel centers are assumed to be at (0.5, 0.5).
This is the default behavior of image.resize in TF 2.0. If this parameter is True,
then align_corners parameter must be False.
- target_height / target_width : output height / width. DO NOT use it with factor together.

## Reverse

Reverses specific dimensions of a tensor.

- axis : The indices of the dimensions to reverse.

## RoiAlign

Select and scale the feature map of each region of interest to a unified output
size by average pooling sampling points from bilinear interpolation.

- output_height : specifies the height of the output tensor.
- output_width : specifies the width of the output tensor.
- height_ratio : specifies the ratio from the height of the original image to the
height of the feature map.
- width_ratio : specifies the ratio from the width of the original image to the
width of the feature map.
- height_sample_num : specifies the number of sampling points in the height dimension
used to compute the output.
- width_sample_num : specifies the number of sampling points in the width dimension
used to compute the output.

## RoiPool

Select and scale the feature map of each region of interest to a unified output
size by max-pooling.

- pool_type : only max-pooling (MAX) is supported.
- scale : The ratio of image to feature map (Range: 0 < scale <= 1).
- size : The size of roi pooling (height/width).

## ScatterND

Scatter updates into a new tensor according to indices.

- shape : The shape of the resulting tensor.

## Select

Using a tensor of booleans c and input tensors x and y, select values elementwise
from both input tensors: O[i] = C[i] ? x[i] : y[i].
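
The rule O[i] = C[i] ? x[i] : y[i] can be sketched for nested lists of any rank. This is a plain-Python illustration with a hypothetical function name; it assumes all three tensors have the same shape and does not model broadcasting.

```python
def select(c, x, y):
    """Pick x where c is true and y otherwise, element by element."""
    if isinstance(c, list):
        if not (len(c) == len(x) == len(y)):
            raise ValueError("c, x and y must have the same shape")
        # Recurse one dimension deeper for nested lists.
        return [select(ci, xi, yi) for ci, xi, yi in zip(c, x, y)]
    return x if c else y


print(select([True, False, True], [1, 2, 3], [10, 20, 30]))  # [1, 20, 3]
```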
## DataConvert

Change the format from input tensor to output tensor.

## Neg

Neg(x) : -x

## Abs

Abs(x) : x if x >= 0; -x if x < 0.

## Sin

Sin(x) : sin(x)

## Exp

Exp(x) : e^x

## Log

Log(x) : ln(x)

## Sqrt

Sqrt(x) : $$\sqrt{x}$$

## Rsqrt

Rsqrt(x) : $$\frac{1}{\sqrt{x}}$$

## Square

Square(x) : x^2

## LogicalNot

LogicalNot(x) : NOT x

## Floor

Returns the largest integer less than or equal to a given number.

## Ceil

Returns the smallest integer greater than or equal to a given number.

## Cast

Change the format from input tensor to output tensor. This operation ignores
the scale and zeroPoint of quantized tensors.

## Slice

Extracts a slice of specified size from the input tensor starting at a specified location.

- start : the beginning indices of the slice in each dimension.
- length : the size of the slice in each dimension.

## Softmax

Computes the softmax activation on the input tensor element-wise, per batch,
by normalizing the input vector so the maximum coefficient is zero:

```
output[batch, i] =
    exp((input[batch, i] - max(input[batch, :])) * beta) /
    sum_{k}{exp((input[batch, k] - max(input[batch, :])) * beta)}
```

## Space2Batch

This operation divides "spatial" dimensions [1, ..., M] of the input into a grid
of blocks of shape **block_size**, and interleaves these blocks with the "batch"
dimension (0) such that in the output, the spatial dimensions [1, ..., M] correspond
to the position within the grid, and the batch dimension combines both the position
within a spatial block and the original batch position. Prior to division into blocks,
the spatial dimensions of the input are optionally zero padded according to paddings.
This is the reverse transformation of Batch2Space.

- pad : the paddings for each spatial dimension of the input tensor.

## SpaceToDepth

SpaceToDepth rearranges blocks of spatial data into depth. More specifically,
this op outputs a copy of the input tensor where values from the height and
width dimensions are moved to the depth dimension.
This is the reverse transformation of DepthToSpace.

## Split

Splits a tensor along a given axis into num_splits subtensors.

- axis : the axis along which to split.
- slices : indicates the number of splits along the given axis.

## Squeeze

Removes dimensions of size 1 from the shape of a tensor.

- axis : the dimensions to squeeze.

## Stack

Packs the list of tensors in inputs into a tensor with rank one higher than
each tensor in inputs, by packing them along the **axis** dimension.
Dimensions below the dimension specified by axis will be packed together with other inputs.

## StridedSlice

Extracts a strided slice of a tensor. Same as TensorFlow.

Roughly speaking, this op extracts a slice of size (end - begin) / stride from
the given input tensor. Starting at the location specified by begin, the slice
continues by adding stride to the index until all dimensions are not less than end.
Note that a stride can be negative, which causes a reverse slice.

- begin_dims : the starts of the dimensions of the input tensor to be sliced.
- end_dims : the ends of the dimensions of the input tensor to be sliced.
- stride_dims : the strides of the dimensions of the input tensor to be sliced.
- begin_mask : if the ith bit of begin_mask is set, begin[i] is ignored and
the fullest possible range in that dimension is used instead.
- end_mask : if the ith bit of end_mask is set, end[i] is ignored and the fullest
possible range in that dimension is used instead.
- shrink_axis_mask : if the ith bit of shrink_axis_mask is set, the ith dimension
specification shrinks the dimensionality by 1, taking on the value at index begin[i].
In this case, the ith specification must define a slice of size 1,
e.g. begin[i] = x, end[i] = x + 1.

## Svdf

Performs an SVDF (singular value decomposition filter) operation.

- rank : The rank of the SVD approximation.
- num_units : corresponds to the number of units.
- spectrogram_length : corresponds to the fixed-size of the memory.

## Tile

Constructs a tensor by tiling a given tensor.
- multiples : Must be one of the following types: int32, int64.
Length must be the same as the number of dimensions in input.

## Topk

Finds values and indices of the k largest entries for the last dimension.

- k : Number of top elements to look for along the last dimension.

## Transpose

Transposes the input tensor, permuting the dimensions according to the **perm** tensor.

The returned tensor's dimension i corresponds to the input dimension perm[i].
If perm is not given, it is set to (n-1...0), where n is the rank of the input
tensor. Hence by default, this operation performs a regular matrix transpose on
2-D input Tensors.

## Unidirectional sequence lstm

For how to bind inputs and outputs, see unidirectional_sequence_lstm_test.cc as an example.

## Unstack

Unpacks the given dimension of a rank-R tensor into rank-(R-1) tensors.

- axis : An int. The axis to unstack along. Defaults to the first dimension.
Negative values wrap around, so the valid range is [-R, R).
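
The Unstack behaviour, including the wrap-around of negative axis values, can be sketched on nested lists. This is a plain-Python illustration with hypothetical helper names; it assumes a non-ragged, non-empty nested list and is not the TIM-VX API.

```python
def rank(t):
    """Number of nested list levels, i.e. the tensor rank R."""
    r = 0
    while isinstance(t, list):
        r += 1
        if not t:
            break
        t = t[0]
    return r


def unstack(t, axis=0):
    """Split a rank-R nested list into rank-(R-1) pieces along axis."""
    r = rank(t)
    if not -r <= axis < r:
        raise ValueError("axis must be in [-R, R)")
    axis %= r  # negative values wrap around
    if axis == 0:
        return list(t)
    # Unstack every sub-tensor along axis - 1, then regroup by slice index.
    parts = [unstack(sub, axis - 1) for sub in t]
    return [list(group) for group in zip(*parts)]


t = [[1, 2, 3], [4, 5, 6]]
print(unstack(t))      # [[1, 2, 3], [4, 5, 6]]
print(unstack(t, -1))  # [[1, 4], [2, 5], [3, 6]]
```

Unstacking a 2x3 tensor along the last axis yields three tensors of shape [2], one per column.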