* Add support for different input dtype of MaxPoolGrad.
Type: Code improvement
* Integrate API trace into the tim-vx source code as an experimental feature.
Type: New Feature
* Refine API trace code and documentation
Add missing traced APIs of tim::vx::Quantization
Type: Code improvement
* Split API replayer code out of the tracer.
This allows the replayer code to be compiled on machines that cannot access newer Boost libraries.
Type: Code improvement
Correct errors in the deconv1d unit test
Added a hint in the header indicating that padtype is not supported yet
Added 2 cases for deconv1d
Type: Code Improvement
Issue: github issue #585
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
1. Added copyright; added reference or const reference parameters for functions
2. Rewrote the function that determines whether there is a common input
3. Use std::remove_if instead of std::find before calling erase
4. Added a safety check to prevent access to deleted ops
Type: Code Improvement
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
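Item 3 of the commit above refers to the standard erase-remove idiom. The following is a minimal sketch of that idiom only; the `Op` struct, its `fused` flag, and the `RemoveFusedOps` helper are hypothetical names chosen for illustration and are not taken from the tim-vx sources.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical stand-in for a graph operation; not a tim-vx type.
struct Op {
  int id;
  bool fused;  // true once the op has been merged into another op
};

// Drops every null or fused op in a single pass.
// std::remove_if moves the kept elements to the front and returns the new
// logical end; erase() then trims the leftover tail. The relative order of
// the kept elements is preserved.
inline void RemoveFusedOps(std::vector<std::shared_ptr<Op>>& ops) {
  ops.erase(std::remove_if(ops.begin(), ops.end(),
                           [](const std::shared_ptr<Op>& op) {
                             return op == nullptr || op->fused;
                           }),
            ops.end());
}
```

Compared with a loop of std::find followed by erase, this visits the container once and avoids using iterators that an earlier erase has invalidated.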
Added Float16 type definition from third-party
Refine float16 bias handling in conv2d
Refine float16 case in conv2d
Caution: float16 headers are only included when building unit_test
Type: New Feature
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
Added op fusion for mean_stddev_normalization ops such as layernorm and
instance norm
Type: New Feature
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
If an Executable object is not bound to a concrete DeviceID,
it runs on the first device by default.
When running multiple executables on multiple devices, this behavior is not
expected. Fixed by attaching the device id to CompileOption.
Signed-off-by: xiang.zhang <xiang.zhang@verisilicon.com>
Remove incorrect layout comment from the depthwise conv unit test
Add comment on the layout condition in the base class for depthwise conv
Type: Code Improvement
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
Fixed a bug where the actual output shape was zero at dim 1 when the deconv1d output was set to be transient.
Reason: internal typing error
Type: Bug Fix
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
Convert float16 bias tensor to float32 to meet the requirements of NN
convolution in the driver
Caution: requires Clang 15.0 or later
Type: Code Improvement
Issue: bugzilla id:32785 | jira id VIVD-744
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
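The commit above does not show how the bias is widened. As an illustration only, the sketch below converts raw IEEE 754 binary16 bit patterns to float32 by bit manipulation; the function names `HalfBitsToFloat` and `BiasHalfToFloat` are hypothetical, and the actual change may rely on the third-party Float16 type or a compiler-provided half type instead.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Converts one IEEE 754 binary16 bit pattern to a float32 value.
// Hypothetical helper for illustration; not the tim-vx implementation.
inline float HalfBitsToFloat(std::uint16_t h) {
  const std::uint32_t sign = (static_cast<std::uint32_t>(h) & 0x8000u) << 16;
  const std::uint32_t exp = (h >> 10) & 0x1Fu;
  std::uint32_t mant = h & 0x3FFu;
  std::uint32_t bits = 0;
  if (exp == 0x1Fu) {
    // Infinity or NaN: all exponent bits set, payload carried over.
    bits = sign | 0x7F800000u | (mant << 13);
  } else if (exp != 0) {
    // Normal number: rebias the exponent from 15 to 127.
    bits = sign | ((exp + 112u) << 23) | (mant << 13);
  } else if (mant == 0) {
    // Signed zero.
    bits = sign;
  } else {
    // Subnormal half: shift until the implicit leading one appears.
    std::uint32_t e = 113u;
    while ((mant & 0x400u) == 0) {
      mant <<= 1;
      --e;
    }
    bits = sign | (e << 23) | ((mant & 0x3FFu) << 13);
  }
  float out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}

// Widens a whole float16 bias buffer to float32.
inline std::vector<float> BiasHalfToFloat(const std::vector<std::uint16_t>& bias) {
  std::vector<float> out;
  out.reserve(bias.size());
  for (std::uint16_t h : bias) {
    out.push_back(HalfBitsToFloat(h));
  }
  return out;
}
```

Every binary16 value is exactly representable in float32, so this widening loses no precision.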
Add another constructor for stridedslice when new_axis_mask is set
Layout inference needs to reconstruct the axis mapping when
new_axis_mask is set (TODO)
Type: New Feature
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
Remove unused values so that the project builds successfully with newer
compilers such as clang15
Type: Code Improvement
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
LayerNormalization can now handle a non-zero axis
Added a case to verify layernorm with axis 2
Modify layernorm opjson
Type: Code Improvement
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
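To make "layernorm along a given axis" concrete, here is a small CPU reference sketch. It assumes a row-major buffer viewed as [outer, axis_len, inner] and normalizes along the middle dimension without gamma or beta; the function name `NormalizeAlongAxis`, the default epsilon, and the memory layout are assumptions for illustration and do not describe how tim-vx or the driver implements the op.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Normalizes x along the middle dimension of a row-major
// [outer, axis_len, inner] view: y = (x - mean) / sqrt(var + eps).
// Hypothetical reference helper; scale (gamma) and shift (beta) are omitted.
inline std::vector<float> NormalizeAlongAxis(const std::vector<float>& x,
                                             std::size_t outer,
                                             std::size_t axis_len,
                                             std::size_t inner,
                                             float eps = 1e-5f) {
  std::vector<float> y(x.size());
  for (std::size_t o = 0; o < outer; ++o) {
    for (std::size_t i = 0; i < inner; ++i) {
      // Mean over the normalized axis.
      float mean = 0.0f;
      for (std::size_t a = 0; a < axis_len; ++a) {
        mean += x[(o * axis_len + a) * inner + i];
      }
      mean /= static_cast<float>(axis_len);
      // Biased (population) variance over the same axis.
      float var = 0.0f;
      for (std::size_t a = 0; a < axis_len; ++a) {
        const float d = x[(o * axis_len + a) * inner + i] - mean;
        var += d * d;
      }
      var /= static_cast<float>(axis_len);
      const float inv_std = 1.0f / std::sqrt(var + eps);
      for (std::size_t a = 0; a < axis_len; ++a) {
        const std::size_t idx = (o * axis_len + a) * inner + i;
        y[idx] = (x[idx] - mean) * inv_std;
      }
    }
  }
  return y;
}
```

Choosing a different axis only changes how the shape is split into outer, axis_len, and inner; the arithmetic stays the same.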
Record the constructor form of each operation in a json file so that acuity can call
timvx ops
Type: Documentation
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
Refine unidirectional_gru and gru_cell code to avoid including ovxlib files
in the headers of some ops
Introduce TranslateToVsibool function to support the above code refinement
Type: Code Improvement
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
Previously, all inputs were reversed to the default order pv, which caused
unnecessary transpose operations.
This commit handles only a const slope, transposing it if necessary.
Type: Code Improvement
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
Added missing ops that are already supported; changed the status of some
ops.
Type: Documentation
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
Support tensor cache when creating tensors
Tensors can be shared between different operations. If tensors have
identical data and quantization parameters, they should share the same
low-level tensor object to save memory.
tim-vx introduces a tensor cache whose key is the md5sum and whose value is
the low-level tensor object. If an incoming tensor has the same md5sum, the
cached tensor object is reused for tensor creation.
Type: New feature
Signed-off-by: Chen Xin <jack.chen@verisilicon.com>
Co-authored-by: Chen Xin <jack.chen@verisilicon.com>
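The caching scheme described above can be sketched as a map from a content digest to a shared tensor object. This is an illustration under stated assumptions, not the tim-vx code: `LowLevelTensor`, `TensorCache`, and `GetOrCreate` are hypothetical names, the quantization parameters are passed as a plain string, and std::hash stands in for md5sum only to keep the example free of external dependencies.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the low-level tensor object.
struct LowLevelTensor {
  std::vector<char> data;
};

class TensorCache {
 public:
  // Returns the cached tensor when data and quantization description match,
  // otherwise creates a new tensor and caches it.
  std::shared_ptr<LowLevelTensor> GetOrCreate(const std::vector<char>& data,
                                              const std::string& quant_desc) {
    const std::string key = MakeKey(data, quant_desc);
    auto it = cache_.find(key);
    if (it != cache_.end()) {
      if (it->second->data == data) {
        return it->second;  // same content: share the existing object
      }
      // Digest collision with different content: do not share.
      return std::make_shared<LowLevelTensor>(LowLevelTensor{data});
    }
    auto tensor = std::make_shared<LowLevelTensor>(LowLevelTensor{data});
    cache_.emplace(key, tensor);
    return tensor;
  }

 private:
  // Assumption: std::hash replaces md5sum here. It is not collision
  // resistant, which is why GetOrCreate re-checks the bytes on a hit.
  static std::string MakeKey(const std::vector<char>& data,
                             const std::string& quant_desc) {
    const std::string bytes(data.begin(), data.end());
    return std::to_string(std::hash<std::string>{}(bytes)) + "|" + quant_desc;
  }

  std::unordered_map<std::string, std::shared_ptr<LowLevelTensor>> cache_;
};
```

Including the quantization description in the key keeps two tensors with identical bytes but different quantization parameters from sharing one object.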
Update internal to 0e9393dbb4f653b9dfceaeaaa920d4deb8b27077
Update prebuilt-sdk to 6.4.14 release
Update cmakefiles to support above updates
Type: New Feature
Signed-off-by: Feiyue Chen <Feiyue.Chen@verisilicon.com>
If a graph has a free INPUT or OUTPUT, report a warning instead of an
error during the graph compile check
Type: Code refine
Signed-off-by: Chen Xin <jack.chen@verisilicon.com>
Co-authored-by: Chen Xin <jack.chen@verisilicon.com>