Compression Optimization Strategy for End-to-End ASR Model Based on Conformer
Graphical Abstract
Abstract
With the rise of deep learning, end-to-end speech recognition models have received increasing attention. Their performance has been further improved by the proposal of the Conformer framework, which is now widely used in the field of speech recognition. However, these models perform poorly on edge hardware because of their large memory and computation requirements. To reduce model size and computational cost as much as possible while keeping the loss of accuracy small, three compression and optimization strategies are adopted: model quantization, structured pruning based on weight channels, and singular value decomposition. The model quantization scheme is also improved. The influence of varying degrees of compression on model accuracy is explored, and combinations of these strategies are tested on different devices. With an increase in Word Error Rate of less than 3% relative to the baseline, model inference is approximately 3~4 times faster.