Why do we need to split the audio file?
Processing a large audio file in one pass takes a long time. Here, processing can mean anything: for example, we may want to raise or lower the frequency of the audio or, as in this article, recognize the speech in the file. Splitting the file into small audio files called chunks keeps each processing step short.
pip3 install pydub
pip3 install audioread
pip3 install SpeechRecognition
There are two main steps in the program.
Step #1: Slice the audio file into small chunks at regular intervals. Slicing can be done with or without overlap. With overlap, each new chunk starts a fixed amount of time before the previous chunk ends, so a word that is cut off at a chunk boundary is still captured whole in the next chunk. For example, if the audio file is 22 seconds long, each chunk is 5 seconds long, and the overlap is 1.5 seconds, the chunks are timed as follows:
chunk1: 0 - 5 seconds
chunk2: 3.5 - 8.5 seconds
chunk3: 7 - 12 seconds
chunk4: 10.5 - 15.5 seconds
chunk5: 14 - 19 seconds
chunk6: 17.5 - 22 seconds
We can ignore this overlap by setting the overlap to 0.
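The chunk timing can be worked out with a few lines of arithmetic. The sketch below is a minimal illustration, not the article's original code: the function name `chunk_bounds` and its parameters are hypothetical, and times are in milliseconds because that is the unit pydub uses for slicing. It assumes the last chunk is simply cut short at the end of the file.

```python
def chunk_bounds(total_ms, chunk_ms=5000, overlap_ms=1500):
    """Return (start, end) pairs in milliseconds for overlapping chunks.

    Hypothetical helper: each chunk starts `chunk_ms - overlap_ms` after
    the previous one, and the final chunk is clipped to the file length.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    if not 0 <= overlap_ms < chunk_ms:
        raise ValueError("overlap_ms must be at least 0 and less than chunk_ms")

    step = chunk_ms - overlap_ms
    bounds = []
    start = 0
    while start < total_ms:
        end = min(start + chunk_ms, total_ms)
        bounds.append((start, end))
        if end == total_ms:  # this chunk reached the end of the audio
            break
        start += step
    return bounds


# A 22-second file, 5-second chunks, 1.5-second overlap.
for i, (start, end) in enumerate(chunk_bounds(22_000), start=1):
    print(f"chunk{i}: {start / 1000:g} - {end / 1000:g} seconds")
```

Calling `chunk_bounds(22_000, overlap_ms=0)` gives back-to-back chunks with no shared audio.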
Step #2: Process each sliced chunk in whatever way the task requires. Here, for demonstration purposes, each chunk is passed through Google's speech recognition engine and the recognized text is written to a separate file. Using the same engine to recognize speech from a microphone is a related topic that is not covered here.
Input: Geek.wav
Output: [Screenshot of cmd running the code]
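The recognition step might look roughly like the sketch below. It uses the SpeechRecognition package (`Recognizer`, `AudioFile`, `recognize_google`); the function names, the output file name `recognized.txt`, and the choice to skip chunks with no recognizable speech are assumptions made for illustration. The recognizer is passed in as a parameter so that a different engine can be swapped in.

```python
def recognize_with_google(wav_path):
    """Transcribe one WAV file; returns "" when no speech is understood.

    A network failure raises speech_recognition.RequestError to the caller.
    """
    # Imported here so the rest of the sketch loads without the package.
    import speech_recognition as sr  # pip3 install SpeechRecognition

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(wav_path)) as source:
        audio = recognizer.record(source)  # read the whole chunk into memory
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return ""


def transcribe_chunks(chunk_paths, text_path="recognized.txt",
                      recognize=recognize_with_google):
    """Write one line of recognized text per chunk and return the lines.

    `text_path` is an assumed file name; chunks that yield no text are skipped.
    """
    lines = []
    for path in chunk_paths:
        text = recognize(path)
        if text:
            lines.append(text)
    with open(text_path, "w", encoding="utf-8") as out:
        out.writelines(line + "\n" for line in lines)
    return lines
```

When chunks overlap, a word that falls inside the shared region can appear in two consecutive lines of the output file.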
Text File: recognized
Here's the implementation:
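The original listing is not reproduced here, so what follows is only a sketch of how the slicing part could be written with pydub, under stated assumptions. `AudioSegment.from_wav`, millisecond slicing, and `export` are real pydub calls; the function names, the `chunks` output folder, and the `chunkN.wav` naming scheme are hypothetical.

```python
from pathlib import Path


def export_chunks(audio, out_dir="chunks", chunk_ms=5000, overlap_ms=1500):
    """Slice a pydub AudioSegment into overlapping WAV files on disk.

    Returns the chunk file paths in order. The folder and file names are
    assumptions made for this sketch.
    """
    if not 0 <= overlap_ms < chunk_ms:
        raise ValueError("overlap_ms must be at least 0 and less than chunk_ms")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    total_ms = len(audio)          # pydub reports length in milliseconds
    step = chunk_ms - overlap_ms   # distance between consecutive chunk starts
    paths = []
    start = 0
    while start < total_ms:
        end = min(start + chunk_ms, total_ms)
        path = out_dir / f"chunk{len(paths) + 1}.wav"
        audio[start:end].export(str(path), format="wav")  # slice by ms, save
        paths.append(str(path))
        if end == total_ms:        # the last chunk reached the end of the file
            break
        start += step
    return paths


def split_wav_file(wav_path, **kwargs):
    """Load a WAV file with pydub and export its chunks."""
    # Imported here so export_chunks can be read without pydub installed.
    from pydub import AudioSegment  # pip3 install pydub

    return export_chunks(AudioSegment.from_wav(wav_path), **kwargs)
```

For the input used in this article, `split_wav_file("Geek.wav")` would write the chunk files into the `chunks` folder and return their paths, which can then be handed to whatever per-chunk processing is needed.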
All of these chunks are stored on the local system. We have now successfully sliced the audio file with overlap and recognized the content of the chunks.
Advantages of this method:
Disadvantages of this method: