D7601-AI Voice Transcription Conference System

I. Solution Description
As customers' demand for meeting records and meeting security grows, the contents of important meetings such as conferences, training sessions, interviews and speeches must be recorded, communicated and kept secure. These scenarios mainly face the following problems:
Difficulty in publishing meeting records
1. Untimely publication: meeting records must be issued promptly, yet the conclusions of the meeting cannot be communicated in time.
2. Incomplete records: manual recording is a heavy workload that is tiring and error-prone; it takes more than 4 hours to organize 1 hour of recording.
3. Sensitive data leakage: meeting record documents cannot be stored, modified and managed in a unified manner.
Low efficiency of information reception
1. No real-time caption display: listeners are easily affected by accents and sound quality and cannot get accurate information in a short time, so real-time captioning is needed to assist.
2. No voice-text comparison for review: large-scale meetings take a long time and the audience receives scattered information, so the voice and text need to be compared after the meeting.
As one of the main channels through which people exchange information, voice carries a large amount of meaningful content. With the research and development of intelligent voice technology, machines can take over this work from people, addressing both the demanding record-keeping requirements of important meetings and the low efficiency of manual recording.

The AI voice transcription conference system developed by DSPPA is a fully offline intelligent system designed to solve the problems of important meetings being difficult to record, inefficient to process, and impossible to trace. Its core functions cover the following aspects.
● Greatly frees up human resources and reduces costs for enterprises and organizations.
● Conference content can be edited and revised in a portable way, and drafts are produced quickly.
● The meeting audio is matched with the transcribed text, which makes proofreading convenient.
● Conference subtitles are displayed in real time, supporting multi-directional information transmission.
● Meeting data is easy to manage, and meeting minutes can be traced by comparing audio and text.
● Offline deployment ensures data security; data is isolated from the Internet.

II. Solution Highlight Functions

III. System Connection Diagram

The meeting room has a floor area of XX square meters, X meters long and X meters wide, and is mainly used for meetings of various kinds, academic and technical exchanges, and internal training. To improve work efficiency and ensure unified control and accurate recall of meeting contents, a speech transcription system with advanced equipment and mature technology is installed. It works together with the meeting room's conference sound system to capture live audio or recordings in different scenarios and convert them into text in real time through speech recognition technology. For meetings where the speech content needs to be displayed, the system shows it on the screen in real time. (Can be edited according to the actual situation of the project.)

IV. System Functions
● High accuracy: relying on core voice technology, the accuracy rate for standard Mandarin reaches 98%+.
● Efficient meeting: the entire recording is transcribed automatically by machine, and the transcript of 1 hour of audio can be released in as little as 10 minutes.
● Conference security: using an independent professional server with no network connection effectively prevents leakage of conference content and information.
● Personalized identification: supports customized colloquial names of people and places, training for special accents, and customization for local languages.
● Role separation: during a meeting, the speech of the initiator, participants, chairman, host, secretary and others is separated by role and automatically recognized as text in real time. This improves meeting efficiency and greatly reduces the workload and pressure on the meeting recorder.
● Division of paragraphs and sentences: context-related semantic features are combined with speech features such as pauses and fundamental frequency information to divide the text into sentences and paragraphs.
● Smooth text: by combining generalized features with context-related semantic and phonological features, the system removes pause (filler) words, tone words and repeated words from the transcription results, making the smoothed text easier to read.
● Intelligent retrieval: recordings are automatically associated with their text, so the audio can be played back alongside the corresponding words. Full-text search is supported, making it easy to trace historical data.
● Conference information management: supports local meeting creation, meeting management, meeting record export, and knowledge base building.
● Applicable to a variety of scenarios: office meetings, work reports, academic lectures, training, interviews and other scenarios.
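The paragraph and sentence division described above can be illustrated with a minimal sketch. It uses pause length only, whereas the product is described as also using semantic features and fundamental frequency. The `Word` structure, the function name and both thresholds are assumptions made for illustration, not the system's actual interface or settings.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # start time in seconds
    end: float    # end time in seconds


# Hypothetical thresholds chosen for illustration only.
SENTENCE_PAUSE = 0.5   # a silence this long ends a sentence
PARAGRAPH_PAUSE = 2.0  # a silence this long also ends a paragraph


def segment(words, sentence_pause=SENTENCE_PAUSE, paragraph_pause=PARAGRAPH_PAUSE):
    """Group timed words into paragraphs, each a list of sentence strings."""
    paragraphs = []  # finished paragraphs
    sentences = []   # finished sentences of the current paragraph
    current = []     # words of the sentence being built
    for prev, word in zip([None] + words[:-1], words):
        if prev is not None:
            gap = word.start - prev.end
            if gap >= sentence_pause:
                sentences.append(" ".join(current))
                current = []
            if gap >= paragraph_pause:
                paragraphs.append(sentences)
                sentences = []
        current.append(word.text)
    if current:
        sentences.append(" ".join(current))
    if sentences:
        paragraphs.append(sentences)
    return paragraphs


if __name__ == "__main__":
    demo = [
        Word("good", 0.0, 0.3), Word("morning", 0.35, 0.8),
        Word("let's", 1.5, 1.8), Word("begin", 1.85, 2.2),
        Word("next", 4.5, 4.8), Word("topic", 4.85, 5.2),
    ]
    for paragraph in segment(demo):
        print(paragraph)
```

A pause-only rule is easy to tune per room or speaker, but it will split on hesitations and miss boundaries in fast speech, which is presumably why the system combines it with semantic features.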

V. Main Devices Introduction
D7601ES Voice Transcription Server Software Interface
The intelligent voice transcription system provides real-time speech recognition and recorded-file recognition to meet the need for real-time text transcription in daily meetings, reports and speeches, command and dispatch, and recording organization. It captures live audio or recordings in conference scenarios and converts them into text in real time through voice recognition technology.

D7601CS Voice Transcription Terminal Software
The system has basic functions such as conference audio management, real-time proofreading and editing, and real-time transcription to text. In addition, it provides key marks, keyword optimization and other functions to organize conference materials quickly and conveniently; automatic segmentation and deletion of pause words, tone words and repeated words to optimize the text automatically; and full-text search for easy retrieval of historical audio.
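As a rough illustration of the automatic text optimization mentioned above, the sketch below drops filler words and collapses immediately repeated words. It is a simple rule-based stand-in, not the product's method; the English filler list and the function name `smooth` are assumptions for the example.

```python
import re

# Hypothetical filler list for illustration; a real deployment would use
# a list suited to the language being transcribed.
FILLERS = {"um", "uh", "er", "ah", "hmm"}


def _key(token):
    """Lowercase a token and strip punctuation so words can be compared."""
    return re.sub(r"[^\w']", "", token).lower()


def smooth(text, fillers=FILLERS):
    """Remove filler words and immediate word repetitions from a transcript."""
    kept = []
    for token in text.split():
        key = _key(token)
        if key in fillers:
            continue  # drop pause/tone words
        if key and kept and _key(kept[-1]) == key:
            continue  # drop an immediate repeat, keeping the first occurrence
        kept.append(token)
    return " ".join(kept)


if __name__ == "__main__":
    print(smooth("um we we need to uh review the the budget"))
```

A rule like this also collapses legitimate repetitions such as "had had", which is one reason context-aware features are useful for this task.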

Voice Transcription Software for Large Screen
Depending on the needs of each conference, the system can not only edit the first draft of the real-time transcription results but also show those results on the display screen in real time. The font color and background color of the display can be adjusted as required.

D7601DS AI Voice Transcription Server

As the core equipment of the intelligent conference voice transcription system, the AI voice transcription server mainly provides recording processing, data transfer and other capabilities; the main control laptop is mainly used to run the client software, control the start and end of the conference, and provide the various operating functions.
The voice transcription server picks up audio directly from the conference host through the audio encoder, collects the live speech in the conference, automatically transcribes it with role separation, and sends the transcription results to the client software in real time. With simple manual interaction, the machine-transcribed content can be edited, modified and typeset. After the meeting, a meeting record faithful to the original speech is produced, and the key minutes of the meeting can be extracted from it.
For large-scale conferences, in addition to regular meeting use, the transcription results can also be displayed on the on-site screen in real time.
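To make the idea of a role-separated, traceable meeting record more concrete, here is a minimal sketch. It assumes the server delivers segments labeled with a role, start and end times, and text; the `Segment` structure and the functions `build_record` and `search` are hypothetical names for illustration, not the product's data format or API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    role: str     # e.g. "Host", "Secretary" (assumed labels)
    start: float  # offset into the recording, in seconds
    end: float
    text: str


def build_record(segments):
    """Merge consecutive segments spoken by the same role into one entry."""
    record = []
    for seg in sorted(segments, key=lambda s: s.start):
        if record and record[-1].role == seg.role:
            last = record[-1]
            record[-1] = Segment(last.role, last.start, seg.end,
                                 f"{last.text} {seg.text}")
        else:
            record.append(Segment(seg.role, seg.start, seg.end, seg.text))
    return record


def format_timestamp(seconds):
    """Render a second offset as HH:MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def search(record, keyword):
    """Return (timestamp, role, text) for entries containing the keyword."""
    needle = keyword.lower()
    return [(format_timestamp(entry.start), entry.role, entry.text)
            for entry in record if needle in entry.text.lower()]


if __name__ == "__main__":
    record = build_record([
        Segment("Host", 0.0, 4.0, "Welcome everyone."),
        Segment("Host", 4.5, 9.0, "Today we review the budget."),
        Segment("Secretary", 65.0, 70.0, "The budget draft is ready."),
    ])
    for hit in search(record, "budget"):
        print(hit)
```

Keeping the start offset on every entry is what allows a text hit to be traced back to the matching point in the audio.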

VI. Applications