Original Text Information
Abstract
English
Recently, AI systems have increasingly focused on integration with various systems for classification and recognition, including IoT applications. This paper presents a user recognition system that integrates speech recognition and object detection. The speech recognition model incorporates preprocessing based on voice signal processing, using features such as the Mel spectrogram, Mel-frequency cepstral coefficients (MFCC), and chroma. This kind of signal processing plays an important role in recent speech recognition research and enables more precise word classification. Our model is based on a Convolutional Neural Network (CNN); because the CNN has a simple architecture, it requires little memory and offers fast inference. The chroma analysis captures voice pitch data, so we can classify the user's gender from it. You Only Look Once (YOLO)-based object detection has been actively researched recently; this model has low memory usage, high inference speed, and high accuracy. Our system integrates word classification, gender classification, and YOLO object detection, and it is used for user authentication in an administrator system. The user speaks a word to issue a simple command, the user's voice pattern and characteristics are classified, and the gender classification system determines the voice pitch and classifies the gender for further user recognition. Finally, we used the Qt framework to build an application that fuses these systems and makes them easily accessible to users.
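The pitch-based gender step mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the abstract describes chroma features fed to a CNN, whereas the sketch below swaps in a plain autocorrelation pitch estimate and a fixed threshold. The function names `estimate_pitch_hz` and `classify_by_pitch`, the 60 to 400 Hz search range, and the 165 Hz threshold are all assumptions chosen for illustration.

```python
import math


def estimate_pitch_hz(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency from the autocorrelation peak.

    Hypothetical helper: fmin and fmax are an assumed search range for speech.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]  # remove any DC offset

    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), n - 1)

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation: shorter lags sum more terms,
        # which favours the true period over its multiples.
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def classify_by_pitch(pitch_hz, threshold_hz=165.0):
    """Label a voice by pitch; the 165 Hz threshold is an assumption."""
    return "female" if pitch_hz >= threshold_hz else "male"


if __name__ == "__main__":
    sr = 8000
    # Synthetic 125 Hz tone standing in for a recorded voice frame.
    tone = [math.sin(2 * math.pi * 125 * i / sr) for i in range(2048)]
    pitch = estimate_pitch_hz(tone, sr)
    print(round(pitch, 1), classify_by_pitch(pitch))
```

A fixed threshold is only a rough stand-in; real voices overlap in pitch, which is presumably one reason the system described here relies on a learned classifier instead.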
Table of Contents
1. Introduction
2. Background knowledge
2-1. AI System
2-2. Speech recognition
2-3. Object detection
3. Suggestion
3-1. Setup environment
3-2. System UI
4. Real-time object detection result
5. Speech recognition result
6. Conclusion
References
