## RKNPU2
RKNPU2 provides an advanced interface to access the Rockchip NPU.

## Supported Platforms
- RK3566/RK3568
- RK3588/RK3588S
- RV1103/RV1106
- RK3562

Note:

The rknn model must be generated using RKNN Toolkit 2: https://github.com/rockchip-linux/rknn-toolkit2

**For RK1808/RV1109/RV1126/RK3399Pro, please use:**

https://github.com/rockchip-linux/rknn-toolkit

https://github.com/rockchip-linux/rknpu

https://github.com/airockchip/RK3399Pro_npu

## ReleaseLog

### 1.5.0

- Support RK3562
- Support more NPU operator fusions, such as Conv-Silu/Conv-Swish/Conv-Hardswish/Conv-Sigmoid/Conv-Gelu, etc.
- Improve support for NHWC output layout
- RK3568/RK3588: the maximum input resolution is now up to 8192
- Improve support for Swish/DataConvert/Softmax/Lstm/LayerNorm/Gather/Transpose/Mul/Maxpool/Sigmoid/Pad
- Improve support for CPU operators (Cast, Sin, Cos, RMSNorm, ScalerND, GRU)
- Limited support for dynamic resolution
- Provide MATMUL API
- Add RV1103/RV1106 rknn_server application as proxy between PC and board
- Add more examples such as rknn_dynamic_shape_input_demo and video demo for yolov5
- Bug fix

### 1.4.0

- Support more NPU operators, such as Reshape, Transpose, MatMul, Max, Min, exGelu, exSoftmax13, Resize, etc.
- Add **Weight Share** function to reduce memory usage.
- Add **Weight Compression** function to reduce memory and bandwidth usage (RK3588/RV1103/RV1106).
- RK3588 supports storing weights or feature maps in SRAM, reducing system bandwidth consumption.
- RK3588 can run a single model on multiple cores at the same time.
- Add new output layout NHWC (C has alignment restrictions).
- Improve support for non-4D input.
- Add more examples such as rknn_yolov5_android_apk_demo and rknn_internal_mem_reuse_demo.
- Bug fix.
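
For readers unfamiliar with the NHWC layout mentioned in the 1.4.0 notes, the following minimal sketch shows how the same element is indexed in NCHW and in NHWC. `nchw_to_nhwc` is a hypothetical helper written for illustration only; it is not part of the RKNN API, and it ignores the channel alignment restrictions noted above.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the RKNN API): reorder a float tensor
 * from NCHW to NHWC. Both buffers must hold n * c * h * w elements.
 * Assumption: no channel padding or alignment is applied.
 */
static void nchw_to_nhwc(float *dst, const float *src,
                         size_t n, size_t c, size_t h, size_t w)
{
    for (size_t b = 0; b < n; ++b) {
        for (size_t y = 0; y < h; ++y) {
            for (size_t x = 0; x < w; ++x) {
                for (size_t ch = 0; ch < c; ++ch) {
                    /* NHWC: channel is the fastest-varying index. */
                    size_t dst_idx = ((b * h + y) * w + x) * c + ch;
                    /* NCHW: width is the fastest-varying index. */
                    size_t src_idx = ((b * c + ch) * h + y) * w + x;
                    dst[dst_idx] = src[src_idx];
                }
            }
        }
    }
}
```

In NHWC all channel values of one pixel sit next to each other in memory, which is often convenient for post-processing code that works pixel by pixel.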

### 1.3.0

- Support RV1103/RV1106 (Beta SDK)
- rknn_tensor_attr supports w_stride (renamed from stride) and h_stride
- Rename rknn_destroy_mem()
- Support more NPU operators, such as Where, Resize, Pad, Reshape, Transpose, etc.
- RK3588 supports multi-batch multi-core mode
- When RKNN_LOG_LEVEL=4, the MAC utilization and bandwidth usage of each layer are displayed.
- Bug fix
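
To illustrate what a width stride such as `w_stride` implies for buffer layout, here is a minimal sketch that copies a tightly packed image into a buffer whose rows are padded to the stride. It assumes the stride is measured in pixels, that a value of 0 means "equal to the width", and that the image is single-channel 8-bit. `copy_with_stride` is a hypothetical helper, not a function from the RKNN API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the RKNN API): copy a tightly packed
 * single-channel 8-bit image into a buffer whose rows are w_stride pixels
 * wide. Assumption: w_stride == 0 means the stride equals the width.
 * Returns 0 on success, -1 if the stride is smaller than the width.
 */
static int copy_with_stride(uint8_t *dst, const uint8_t *src,
                            uint32_t width, uint32_t height,
                            uint32_t w_stride)
{
    assert(dst != NULL && src != NULL);

    if (w_stride == 0) {
        w_stride = width;      /* treat 0 as "no padding" */
    }
    if (w_stride < width) {
        return -1;             /* a row cannot fit into the stride */
    }

    for (uint32_t y = 0; y < height; ++y) {
        uint8_t *row = dst + (size_t)y * w_stride;
        memcpy(row, src + (size_t)y * width, width);
        /* Zero the padding so the tail of each row is deterministic. */
        memset(row + width, 0, w_stride - width);
    }
    return 0;
}
```

The destination buffer must hold at least `w_stride * height` bytes. On a real board the stride value would come from the queried tensor attributes rather than being chosen by hand.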

### 1.2.0

- Support RK3588
- Support more operators, such as GRU, Swish, LayerNorm, etc.
- Reduce memory usage
- Improve zero-copy interface implementation
- Bug fix

### 1.1.0

- Support INT8+FP16 mixed quantization to improve model accuracy
- Support specifying input and output dtype, which can be solidified into the model
- Support multiple model inputs with different channel mean/std
- Improve the stability of multi-thread + multi-process runtime
- Support flushing the cache for fds that point to internal tensor memory allocated by users
- Improve dumping internal layer results of the model
- Add rknn_server application as proxy between PC and board
- Support more operators, such as HardSigmoid, HardSwish, Gather, ReduceMax, Elu
- Add LSTM support (the CIFG and peephole structures are not supported; the layer normalization and clip functions are not supported)
- Bug fix

### 1.0
- Optimize the performance of rknn_inputs_set()
- Add more functions for zero-copy
- Add new OP support; see the OP support list document for details.
- Add multi-process support
- Support per-channel quantized models
- Bug fix

### 0.7
- Optimize the performance of rknn_inputs_set(), especially for models whose input width is 8-byte aligned.
- Add new OP support; see the OP support list document for details.
- Bug fix

### 0.6
- Initial version