Manipulate Audio Files in Python With 6 Powerful Tips

Introduction

Working with audio files may not be an everyday task for a Python enthusiast, but you may sometimes wonder whether you can manipulate audio files in Python for your own interest. For instance, if you really like a piece of music, you might want to cut out some parts of it and save it to your phone, so that you can listen to it while studying or exercising outdoors without having to skip those annoying ads.

In this post, I will introduce a simple yet useful library, pydub, for manipulating audio files in Python code.

Prerequisites:

You need to install pydub in your working environment. Below is the installation command via pip:

pip install pydub

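The library depends on ffmpeg to support most audio file formats. Note that pydub needs the ffmpeg program itself to be available on your PATH; as far as I know, the pip package named ffmpeg does not provide it. If you do not have ffmpeg yet, install it with your system's package manager, for example with one of the commands below. On Windows, download a build from the ffmpeg website and add its bin folder to your PATH.

# Linux (Debian/Ubuntu)
sudo apt-get install ffmpeg

# macOS with Homebrew
brew install ffmpeg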
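Since pydub hands the decoding and encoding of most formats to the ffmpeg executable, it can help to confirm that the program is reachable before going further. The snippet below is a minimal sketch using only the standard library; the helper name find_ffmpeg is my own and not part of pydub.

```python
import shutil


def find_ffmpeg():
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    # shutil.which searches the directories listed in the PATH environment variable
    return shutil.which("ffmpeg")


ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print("ffmpeg was not found on PATH; formats other than WAV will likely fail to load")
else:
    print("ffmpeg found at:", ffmpeg_path)
```

If the first message is printed, fix the ffmpeg installation before trying to load mp4 or mp3 files.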

Download Video from YouTube

Since I am going to use a funny video from YouTube for the demonstration below, I need to install another library, youtube_dl, to download the video to my local folder:

pip install youtube_dl

Below is the command to download the video from the given YouTube URL and save it under the given output file name. You can also use -f to specify the file format if the original video is available in multiple formats:

youtube-dl "https://www.youtube.com/watch?v=Zo6F_qtQCCc" -o "hongshaorou.mp4"
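If you would rather start the download from a Python script, the same command line can be assembled as a list of arguments. This is only a sketch: build_download_command is a hypothetical helper, and the value "mp4" is just one example of what can be passed to -f. The actual call is left commented out so that the snippet has no side effects.

```python
import subprocess


def build_download_command(url, output_name, file_format=None):
    """Assemble the youtube-dl command line as a list of arguments."""
    command = ["youtube-dl", url, "-o", output_name]
    if file_format is not None:
        # -f selects the format when the video is offered in several
        command += ["-f", file_format]
    return command


command = build_download_command(
    "https://www.youtube.com/watch?v=Zo6F_qtQCCc",
    "hongshaorou.mp4",
    file_format="mp4",
)
print(command)

# Uncomment to run the download (requires youtube-dl to be installed):
# subprocess.run(command, check=True)
```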

You will see progress messages in your terminal while the download runs, and the final output file will be saved to your current directory.

Now let’s import pydub and use this video to explore what we can do with this library.

from pydub import AudioSegment
import os

Extract Sound From A Video File

To load the video file, we can use the from_file method of the AudioSegment class (the code below assumes the downloaded file has been placed in c:\sounds):

base_dir = r"c:\sounds"
sound = AudioSegment.from_file(os.path.join(base_dir, "hongshaorou.mp4"))

There are also other methods such as from_mp3, from_wav and from_ogg, depending on the type of audio file you want to read. With the export method, you can easily save the audio track of the video in another format:

sound.export(os.path.join(base_dir, "hsr.mp3"), format="mp3")

There are some more parameters you can use to specify the bitrate, the metadata tags and the cover image when you save the file, e.g.:

sound.export(os.path.join(base_dir, "hsr.mp3"),
             format="mp3",
             bitrate="192k",
             tags={"album": "chinese cuisine", "artist": "not sure"},
             cover=os.path.join(base_dir, "hongshaorou.jpeg"))
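To get a feel for what the bitrate parameter means, the size of a constant-bitrate file can be estimated from the bitrate and the duration. The sketch below is plain arithmetic rather than a pydub call; estimate_size_bytes is a hypothetical helper, the 15-second duration is only an example, and the estimate ignores tags, cover art and container overhead.

```python
def estimate_size_bytes(bitrate_kbps, duration_seconds):
    """Rough size of a constant-bitrate audio stream, in bytes."""
    bits = bitrate_kbps * 1000 * duration_seconds  # kilobits per second -> total bits
    return bits // 8  # 8 bits per byte


# A 15-second clip encoded at 192 kbps
print(estimate_size_bytes(192, 15))  # 360000
```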

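You can also retrieve the media info of the exported file as shown below:

from pydub.utils import mediainfo
# Returns a dictionary describing the file (format, bitrate, tags and so on)
mediainfo(os.path.join(base_dir, "hsr.mp3"))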

Split/Cut Audio Clips

With an AudioSegment object, you can cut the audio just like slicing a list, specifying the start and end points in milliseconds. For instance, to cut our audio from 1:18 to 1:33 and save it as an mp3:

# 1 min 18 s and 1 min 33 s, converted to milliseconds
first_cut_point = (1*60 + 18) * 1000
last_cut_point = (1*60 + 33) * 1000

sound_clip = sound[first_cut_point:last_cut_point]

# Note: this overwrites the full-length hsr.mp3 exported earlier
sound_clip.export(os.path.join(base_dir, "hsr.mp3"), format="mp3")
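Working out the cut points by hand gets tedious when there are many of them. As a small convenience, a timestamp string can be converted to milliseconds with a helper like the one sketched below; to_milliseconds is a name made up for this example and is not part of pydub.

```python
def to_milliseconds(timestamp):
    """Convert a "m:ss" or "h:mm:ss" timestamp into milliseconds."""
    seconds = 0
    for part in timestamp.split(":"):
        # Each further field shifts the running total by a factor of 60
        seconds = seconds * 60 + int(part)
    return seconds * 1000


first_cut_point = to_milliseconds("1:18")
last_cut_point = to_milliseconds("1:33")
print(first_cut_point, last_cut_point)  # 78000 93000
```

The two values are the same as the ones computed manually above, so they can be used directly as slice boundaries.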

Increase/Reduce Sound Volume

You can make the sound louder or quieter by adding or subtracting decibels, as shown below:

# Increase the volume by 10 dB for the first 2 seconds
sound_clip_1 = sound_clip[:2000] + 10

# Reduce the volume by 5 dB for the last 3 seconds
sound_clip_2 = sound_clip[-3000:] - 5

# Combine multiple sound clips (the middle part is left unchanged)
final_clip = sound_clip_1 + sound_clip[2000:-3000] + sound_clip_2
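Decibels are a logarithmic unit, so adding 10 dB is not the same as making the samples 10 times larger. As a rough illustration, using the standard conversion in which a gain of g dB scales the amplitude by 10^(g/20), the sketch below shows what the two adjustments above amount to. The name db_to_amplitude_ratio is chosen here for illustration only.

```python
def db_to_amplitude_ratio(gain_db):
    """Convert a gain in decibels to the factor applied to the sample amplitude."""
    return 10 ** (gain_db / 20)


print(round(db_to_amplitude_ratio(10), 3))  # 3.162
print(round(db_to_amplitude_ratio(-5), 3))  # 0.562
```

In other words, +10 dB roughly triples the amplitude, while -5 dB reduces it to a bit more than half.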

Play Sound In Python

If you are running the code in Jupyter Lab, you can simply evaluate final_clip in a cell to get an inline player and hear how the result sounds.

Otherwise, you can use the playback module to play the sound as shown below:

from pydub.playback import play
play(final_clip)

Adding Silence In The Sound

Silence can be added to your sound clip as shown below:

# Add 1 second of silence before the first 5 seconds of the sound clip
AudioSegment.silent(duration=1000) + sound_clip[:5000]

Overlay Audio Onto Another Audio

The overlay method allows you to overlay one AudioSegment on top of another AudioSegment object. For instance:

sound_clip[5000:10000].overlay(final_clip[:5000])
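To make the behaviour of overlay more concrete, here is a toy model in plain Python, with short lists of integers standing in for audio samples. It is not pydub's implementation, only a sketch of the idea as I understand it: the overlapping samples are added together, the result keeps the length of the base clip, and the sums are clamped to the 16-bit sample range. The function name and the sample values are made up for illustration.

```python
def overlay_samples(base, other, position=0):
    """Mix `other` onto `base` starting at index `position`; the result keeps len(base)."""
    low, high = -32768, 32767  # range of a signed 16-bit sample
    mixed = list(base)
    for offset, sample in enumerate(other):
        index = position + offset
        if index >= len(mixed):
            break  # anything beyond the end of the base clip is dropped
        # Add the two samples and clamp the sum to the valid range
        mixed[index] = max(low, min(high, mixed[index] + sample))
    return mixed


base = [100, 200, 300, 400]
other = [10, 20, 30000, 40000, 50]
print(overlay_samples(base, other))  # [110, 220, 30300, 32767]
```

The practical takeaway is that the clip you call overlay on determines the length of the result, so it should be the longer of the two if you want to keep both in full.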

There are more useful functions for editing audio files; see the full pydub API documentation for details.

Conclusion

In this article, we have reviewed a few very useful functions in the pydub library, which allows you to manipulate audio files by converting audio formats and by combining, splitting or editing sound clips. With these tips, you should be able to create your own sound clips in a few lines of Python code. We have also used the youtube-dl library, which allows you to download videos from YouTube and some other video streaming websites. You may refer to this reddit discussion if you are wondering whether this is legal, but I believe it should be alright if you use it only for your personal exploration of Python programming.