# Tengine

**Repository Path**: dqedda_admin/Tengine

## Basic Information

- **Project Name**: Tengine
- **Description**: Tengine is a lite, high-performance, modular inference engine for embedded devices
- **Primary Language**: C++
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2019-08-06
- **Last Updated**: 2023-02-10

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Tengine Overview

[![GitHub license](http://OAID.github.io/pics/apache_2.0.svg)](./LICENSE)

**Tengine**, developed by **OPEN** AI LAB, is a lite, high-performance, and modular inference engine for embedded devices.

Tengine is composed of six modules: **core/operator/serializer/executor/driver/wrapper**.

- [**core**](core) provides the basic components and functionalities of the system.
- [**operator**](operator) defines the schema of operators, such as convolution, relu, pooling, etc. Here is the current [**operator list**](doc/operator_ir.md).
- [**serializer**](serializer) loads saved models. The serializer framework is extensible to support different formats, including customized ones. Caffe/ONNX/Tensorflow/MXNet and Tengine models can be loaded directly by Tengine.
- [**executor**](executor) implements the code to run graphs and operators. The current version provides a highly optimized implementation for multiple A72 cores.
- [**driver**](driver) is the adapter for real hardware and provides services to the device executor through the HAL API. A single driver can create multiple devices.
- [**wrapper**](wrapper) provides API wrappers for different frameworks. Both the Caffe API wrapper and the Tensorflow API wrapper work now.

This version can load and run the Caffe models of **mobilenet** and **squeezenet** directly. For more details, please go to [**install**](doc/install.md).
`NOTE`: Old Caffe models have to be upgraded using **upgrade_net_proto_binary/upgrade_net_proto_binary** from Caffe's package.

## Performance

The data was collected on a **1.8 GHz A72** core of the RK3399 chip by repeatedly calling the forward interface and averaging the time cost (ms) per run.

- Single A72 core (1xA72)

|NN |Caffe(Openblas)|Tengine|
|----|---------------|-------|
|squeezenet|147|91|
|mobilenet|306|122|

- Two A72 cores (2xA72)

|NN |Caffe(Openblas)|Tengine|
|----|---------------|-------|
|squeezenet|102|51|
|mobilenet|232|65|

For details on running the benchmark, please visit the [**benchmark**](doc/benchmark.md) page.

## Build and Install

Please refer to the [**Linux build**](doc/install.md) and [**Android build**](https://github.com/OAID/Tengine/blob/master/doc/build_android.md) guides.

## Tengine examples and model zoo

Please visit [examples](examples/readme.md) for demos on classification/detection, and download models from the [**Tengine model zoo**](https://pan.baidu.com/s/1Ar9334MPeIV1eq4pM1eI-Q) (password: hhgc).

[**tengine applications**](https://github.com/OAID/Tengine-app) is a project for sharing Android/Linux applications powered by Tengine.

## Develop New Operator

It is easy to add a new operator to Tengine. Here is the guide on [**new operator**](doc/operator_dev.md).

## Support New Model Format

Tengine can be extended to support a new serialization format by building a new serializer module.
[How to build new serializer module](doc/serializer_dev.md)

## Communication && Tech Support

* Github issues
* QQ group: 829565581 (Question: Tengine, Answer: openailab)
* Tengine Community: http://www.tengine.org.cn/

## Release History

### version 1.3.2 - 2019/04/19

**tengine model 2.0**

**New APIs**

- get_graph_node_number()
- get_graph_node_by_idx()

**New features**

- Separate the CPU operators into an independent shared library: hclcpu.so
- Add Reference Operator
- Update Testcase & Update permute for MXNet
- Update LSTM/GRU MXNet serializer
- Support MXNet serializer in CMakeLists.txt
- Support TFLITE serializer in CMakeLists.txt
- Support eltwise in TFLITE serializer

**More operator support**

- RNN operator definition and blas implementation
- LSTM operator definition and blas implementation
- GRU operator definition and blas implementation

### version 1.0.0 - 2018/12/31

**tengine API 2.0**

- New API set for NN inference
- Simplified graph creation: just create_graph() instead of load_model() and create_runtime_graph()
- Support perf stat and tensor dump
- Support log redirect
- Support building the Android NN Driver with the new Tengine API
- By setting CONFIG_LEGACY_API=y in makefile.config, tengine API 1.0 still works

**More Tensorflow models support**

Support inceptionv3/v4, resnet_v2_101, and mobilenet v1/v2 models from [tensorflow](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/models.md)

### version 0.8.0 - 2018/11/15

**Support GPU/CPU Heterogeneous Computing**

By calling `set_graph_device(graph, "acl_opencl")`, operators that the GPU supports are scheduled to the GPU, while the remaining operators are scheduled to the CPU automatically.
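The placement rule described above can be sketched in a few lines of C++. This is only a conceptual illustration, not Tengine's scheduler: `NodePlacement`, `schedule_graph`, the operator names, and the fallback device name `"cpu"` are all hypothetical and are not part of the Tengine API. Only the device name `"acl_opencl"` comes from the text above.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical illustration only: these types and names are NOT part of
// the Tengine API. They model the rule "GPU if supported, else CPU".
struct NodePlacement {
    std::string op;      // operator type, e.g. "Convolution"
    std::string device;  // device chosen for this operator
};

// Assign each operator to the GPU device when that device supports it;
// every other operator falls back to the CPU.
std::vector<NodePlacement> schedule_graph(
    const std::vector<std::string>& ops,
    const std::unordered_set<std::string>& gpu_supported,
    const std::string& gpu_name = "acl_opencl",
    const std::string& cpu_name = "cpu") {  // "cpu" is an assumed name
    std::vector<NodePlacement> plan;
    plan.reserve(ops.size());
    for (const std::string& op : ops) {
        const bool on_gpu = gpu_supported.count(op) > 0;
        plan.push_back({op, on_gpu ? gpu_name : cpu_name});
    }
    return plan;
}
```

Under these assumptions, calling `schedule_graph` with the operators `Convolution`, `ReLU`, and `DetectionOutput`, and a GPU that supports only the first two, would place `DetectionOutput` on the CPU and the others on `acl_opencl`.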
Here is the guide to run [an MSSD example](https://github.com/OAID/Tengine/blob/master/doc/gpu_cpu_mssd.md) with GPU FP16.

**Using c++_shared for Android build**

As NDK toolchains will eventually drop gnustl, this version switches to c++_shared.

Please download the pre-built libraries with c++_shared from [Tengine Android Build Libraries](https://pan.baidu.com/s/1-zsqxXXcZEXmCip-nQzcIw) (password: *wtcz*).

**Support ACL in Android**

Update the cmake system to support ACL in the Android build. Please refer to the [Android build guide](https://github.com/OAID/Tengine/blob/master/doc/build_android.md).

**Bugfix**

- Fix the issue of loading a tengine model converted from MXNet

### version 0.7.2 - 2018/10/15

Serializer: update the ONNX module with the new onnx proto version

### version 0.7.0 - 2018/9/15

**New features**

- Serializer: support saving a model as C files
- ACL GPU: add FP16 support
- NN: mobilenet v2 support in examples
- Accuracy tools: yolov2 accuracy test
- Build:
  - support cross-building the arm32 library
  - support building on Raspberry Pi 3B
  - automatically clean the build directory when makefile.config changes

**Bug fix**

- A few memory leak issues in the library and examples
- A race condition between the front thread and the background working thread
- Tensorflow serializer issue: failure to load the inception_v3 model

### version 0.6.0 - 2018/7/02

Support the Tengine model file. protobuf is optional now.
Please refer to the [tengine_model examples](examples/tengine_model).

### version 0.5.0 - 2018/6/15

**New features**

- Support GPU: using ACL (Arm Compute Library) as a backend graph device
- Support blas operator implementation: Tengine can run on x86 without caffe now
- Support new NN: Inception-v3/vgg16/faster-rcnn/ssd/yolo-v2
- Support Android build: includes 32bit and 64bit
- Support cross-compile on x86 (experimental): debian example contributed by **mcharleb** and **Mani-Sadhasivam** @ Linaro
- Support Tensorflow serializer: load inception-v3 and mobilenet TF models directly
- Support Tensorflow wrapper: label_image.cpp from the tensorflow repo

**Others**

There is a single .so file now, and etc/config has been removed according to feedback from the field.

Tengine will automatically probe the CPU arch/part settings, and there is just one CPU driver now.

To assign CPUs manually when necessary:

    export TENGINE_CPU_LIST=1,2

Besides probing the CPU, a few CPUs are predefined in cpu_predefined.cpp, including rk3399/a63/kirin960/apq8096.

To use a predefined CPU, see below:

    const struct cpu_info* p_info = get_predefined_cpu("rk3399");
    create_cpu_device("rk3399", p_info);

### version 0.3.0 - 2018/2/6

- Introduce the driver/device model to support MT (Multi-Thread)
- Support new NN: Inception-v4
- Caffe Wrapper examples: squeezenet/mobilenet/mtcnn
- MXNet model load examples: squeezenet/mobilenet

### version 0.2.0 - 2018/1/24

- Support new operators: Eltwise, PReLU, Slice
- Support new NN: mtcnn, resnet and lighten_cnn
- Experimental caffe API wrapper: a caffe-based application just needs to be recompiled to use Tengine

### version 0.1.2 - 2017/12/30

Update documents, as well as a few fixes.

### version 0.1.0 - 2017/12/29

Initial release with single A72 support