SoundAI Voice Kit is a one-stop development kit for intelligent voice interaction. It integrates a wide range of algorithms, including acoustic distribution network, beamforming, sound source direction, directional pickup, noise suppression, reverberation, echo cancellation, speech wake-up, speech recognition, semantic understanding, speech synthesis, and duplex communication. It is compatible with all mainstream intelligent voice hardware architectures and supports mainstream AI platforms such as DuerOS, Xiao Ai, AliGenie, Tencent Dingdang, and Amazon Alexa. It also helps customers worldwide quickly customize service skills and achieve rapid development and mass production of intelligent voice interaction hardware products.
Full Range: omnidirectional, 360° natural interaction within 5 meters
High Recognition: recognition rate in vertical scenes > 95%
Low Latency: response time in extreme environments < 1.3 s
Fewer False Alarms: on average, fewer than 1 false alarm per 72 h
Compatible with different hardware features, it suits a range of application scenarios and supports ring-shaped, linear, square, and L-shaped arrays with different numbers of elements. With beamforming, sound source localization, noise suppression, echo cancellation, speech enhancement, SSA, SSP, VAN, OpenAEC, and more, it delivers precise intelligent voice interaction in the local field, far field, distribution field, and ultra-far field.
It supports Duel-wake, Free-cut, OpenAEC, AKS, and other functions, and its wake-up rate exceeds 95%. In a typical home environment, it produces fewer than 0.5 false alarms per day.
Deeply customized for audio and video content, it fits the interaction needs of vertical scenes such as work, leisure, and travel, and supports Free-ask, One-shot, VAN, and other functions.
It provides precise semantic analysis for vertical domains. Integration with DuerOS gives it rich built-in content resources; integration with the Xiao Ai platform enables easy smart home control; and integration with additional voice platforms enables an integrated voice-and-touch experience.
It provides rich information analysis and mining services, supporting voiceprint recognition, age recognition, emotion recognition, gender recognition, humming recognition, abnormal sound detection, and other functions. It also provides high-quality speech synthesis that turns text into smooth, pleasant speech in the voices of various celebrities, boys, or girls.
Far-field voice interaction
Local-field voice interaction
Distribution-field voice interaction
Ultra-far-field voice interaction
Full chain of voice technology
Customization and flexible configuration
Rich experience and verification in mass production
Advanced acoustic technology
Support for mainstream intelligent hardware architectures
SLA of up to 99.99%
MI AI Speaker
MI AI Mini Speaker
Xiaodu AI Speaker