As an **application developer**, I want to **be able to create new operators composed of built-in ops**, so that I can **simplify the lowering from high-level frameworks (TensorFlow, PyTorch) to tim-vx**, since I don't want to rewrite the same pattern for each framework.
As an **application developer**, I want to **be able to create my own operator with a standard OpenCL kernel**, so that I can **support novel operators not present in tim-vx**.
* Green components are implemented as the public API of tim-vx.
* Red components can be implemented outside of tim-vx.
* Gray components are implemented as private code inside tim-vx.
## **Composed operator**
If an operator can be composed from built-in operators, such as RNNCell, which is actually built from FullyConnected, Tanh, and DataConvert layers,
developers can add their own operator implementation before VSI introduces a high-performance built-in op.
[Implementation reference of RNNCell](https://github.com/VeriSilicon/TIM-VX/blob/main/src/tim/vx/ops/rnn_cell.cc)
**Keynotes for RNNCell**:
In the constructor of RNNCellImpl, the internal operators - fc/tanh/dataconvert - are created without any inner connections.
The inner connections are built up inside bindInput() and bindOutput().
### Layout Inference {todo}
A composed operator is internally a subgraph of tim-vx's built-in operators, so it should be easy to extend the original layout inference for built-in operators to composed operators - just run layout inference inside the subgraph.
TIM-VX provides two different approaches to integrating user operators:
1. Build from source: build the tim-vx source and the user operators' implementation as a single library;
2. Build from SDK: tim-vx is prebuilt as a standalone library with a set of standard headers; the user builds the operator implementation and links it with tim-vx.
From the tim-vx API point of view, a customized operator is registered at graph level, and the registration takes effect automatically the first time an instance of the customized operator is created. With this approach, users can easily override a built-in operator or support a new operator in a new model.
```c++
void CreateGraphWithCustomizedOperator() {
  // Create context/graph/tensor as before.
  auto conv = graph->CreateOperation<tim::vx::Conv2d>(...);
  auto post_detect =
      graph->CreateOperation<3rd_party::DetectionPostProcess>(...);
  // ...
}
```
Usually, a kernel takes two different kinds of parameters: "tensor-like" and scalar. The tensor-like parameters are typically output tensors from other operators or inputs to other operators.
In the operator's parameter list, only scalar parameters should be defined; "tensor-like" operands should be provided via bindInput/bindOutput.
```c++
: CustomOpBase(graph, input_num, output_num, CustomOpClass::kernel_id_,
               CustomOpClass::kernel_name_,
               ... /* any other parameters required by the C++ code,
                      not relevant to the CL kernel */) {
```
1.ParamTuple tuple_list_: the scalar-parameter tuple list matching the CL kernel signature; the param_transform() function is provided to transform tuple_list_ into param_list_.
2.uint32_t input_num/output_num: the number of kernel operation inputs/outputs.
3.static const char* kernel_name_: the OpenCL kernel name defined by the user, which must be unique.
4.static int32_t kernel_id_: the OpenCL kernel id.
5.const char* kernel_resource_: the OpenCL kernel source; its registration should be defined in the custom op class's initialization function. It can contain multiple functions adapted to several situations. For example:
2.SetupParams: the function for kernel selection and build options. func_name_ is the selected kernel function name provided in kernel_resource_ and is used to determine which kernel function to apply. build_option holds the compiler options used when compiling the custom op resource.
```c++
void SetupParams(
    std::vector<tim::vx::DataType> input_types,
    std::string& build_option) override {
  if (...) {
    func_name_ = "...";   /* it MUST be provided in kernel_resource_ */
    build_option = "..."; /* compile parameters */
  } else {
    ...
  }
}
```
3.SetupEnqueue: the function for setting the kernel's local size and global size.