Pandas Tips – Convert Columns To Rows

  Introduction

In one of my previous posts, Pandas tricks to split one row of data into multiple rows, we discussed a solution for splitting summary data from one row into multiple rows in order to standardize the data for further analysis. Similarly, there are many scenarios where we have aggregated data, like an Excel pivot table, and we need to unpivot it from wide to long format for better analysis. In this article, I will share a few tips for converting columns to rows with a pandas DataFrame.

Prerequisites

To run the later code examples, you need to have pandas installed in your working environment (reading .xlsx files with read_excel additionally requires the openpyxl package). Below is the pip command to install pandas:

pip install pandas

We will be using the data from this file for the later demonstration, so you may download it and examine what the data looks like with the code below:

import pandas as pd
import os
data_dir = "c:\\your_download_dir"
df = pd.read_excel(os.path.join(data_dir, "Sample-Data.xlsx"))

You shall see the sample sales data as per below:

pandas convert columns to rows, wide to long format, pandas melt

The sales amount has been summarized by product in the last 4 columns. With this wide data format, it would be difficult for us to do some analysis, for instance, finding the top salesman by month and product, or the best-selling products by month.

A better format is to transform the product columns into rows so that each row represents only one product and its sales amount. Now let’s explore the different ways to convert columns to rows with pandas.

Using Pandas Stack Method

The most immediate solution you may think of is the stack method, as it allows you to stack the columns vertically onto each other and turn them into multiple rows. For our case, we need to set the DataFrame index to “Salesman” and “Order Date”, so that the product columns are stacked based on this index. For instance:

df.set_index(["Salesman", "Order Date"]).stack()

If you check the result now, you shall see the below output:

pandas convert columns to rows, wide to long format, pandas melt

This is a MultiIndex Series whose index level names are [‘Salesman’, ‘Order Date’, None], so you can reset the index and name the values column “Amount”, and at the same time rename the unnamed level (which reset_index labels “level_2”) to “Product Desc” to make it more meaningful. E.g.:

df.set_index(["Salesman", "Order Date"])\
    .stack()\
    .reset_index(name='Amount')\
    .rename(columns={'level_2':'Product Desc'})

With the above code, you can see the output similar to below:

pandas convert columns to rows, wide to long format, pandas melt

 

If you do not want to have the 0 sales amount records, you can easily apply a filter to the DataFrame to have cleaner data.
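The filter mentioned above can be sketched as follows. This is a minimal example on a small made-up DataFrame that only mimics the column layout of the sample file; the salesman names and amounts are illustrative assumptions, not data from Sample-Data.xlsx.

```python
import pandas as pd

# Hypothetical stand-in for the sample file: two salesmen, two products
df = pd.DataFrame({
    "Salesman": ["Jack", "Zac"],
    "Order Date": pd.to_datetime(["2020-09-01", "2020-09-02"]),
    "Beer": [100, 0],
    "Red Wine": [0, 250],
})

# Same stack-based reshaping as in the article
long_df = (
    df.set_index(["Salesman", "Order Date"])
      .stack()
      .reset_index(name="Amount")
      .rename(columns={"level_2": "Product Desc"})
)

# Keep only the rows with a non-zero sales amount
non_zero = long_df[long_df["Amount"] != 0].reset_index(drop=True)
print(non_zero)
```

The boolean mask long_df["Amount"] != 0 is an ordinary pandas row filter, so the same line works on the output of melt or wide_to_long as well.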

Using Pandas Melt method

The melt method is a very powerful function to unpivot data from wide to long format. It is like the opposite operation to the pivot_table function, so if you are familiar with pivot_table function or the Excel pivot table, you shall be able to understand the parameters easily.

To achieve the same result as per the stack function, we can use the below code with melt method:

df.melt(id_vars=['Salesman', 'Order Date'], 
        value_vars=['Beer', 'Red Wine', 'Whisky', 'White Wine'],
        var_name="Product Desc",
        value_name='Amount')

The id_vars parameter specifies the columns that identify each row and stay as they are. The value_vars and var_name parameters specify the columns to unpivot and the name of the new column that holds their labels, and value_name is the name of the value column. To help you better understand these parameters, you can imagine how the data is generated via a pivot table in Excel; melt is the reverse process.
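To make the "reverse of a pivot table" idea concrete, here is a small sketch on made-up data (the names and amounts are assumptions for illustration). It leaves out value_vars, in which case melt unpivots every column that is not listed in id_vars, and then pivots the long data back to the wide layout.

```python
import pandas as pd

# Hypothetical wide data with the same kind of layout as the sample file
df = pd.DataFrame({
    "Salesman": ["Jack", "Zac"],
    "Order Date": ["2020-09-01", "2020-09-02"],
    "Beer": [100, 0],
    "Whisky": [30, 80],
})

# value_vars is omitted, so every column outside id_vars is unpivoted
melted = df.melt(id_vars=["Salesman", "Order Date"],
                 var_name="Product Desc",
                 value_name="Amount")

# Pivot back to the wide layout to confirm that melt reversed it
wide_again = (
    melted.pivot_table(index=["Salesman", "Order Date"],
                       columns="Product Desc",
                       values="Amount",
                       aggfunc="sum")
          .reset_index()
)
print(melted)
print(wide_again)
```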

pandas convert columns to rows, wide to long format, pandas melt

 

Using Pandas wide_to_long Method

The wide_to_long function is quite self-explanatory by its name. It uses pandas.melt under the hood, and it is designed to solve some particular problems. For instance, if your column names follow a certain pattern, such as including a year, number or date, you can specify the pattern and extract that info when converting those columns to rows.

Below is the code that generates the same output as our previous examples:

pd.wide_to_long(
    df, 
    stubnames="Amount", 
    i=["Salesman", "Order Date"], 
    j="Product Desc", 
    suffix=r"|Red Wine|White Wine|Whisky|Beer").reset_index()

The stubnames parameter specifies the common prefix (stub) of the wide columns, which also becomes the name of the new value column. The i parameter specifies the columns that identify each row, and j is the name of the new column that holds the suffixes of the stacked columns. Since our product column names do not share a prefix or follow any pattern, the suffix parameter simply lists all the product names as alternatives in a regular expression.

As wide_to_long returns a DataFrame with a MultiIndex, we need to reset the index to get a flat data structure.

You may not see the power of this function from the above example, but if you look at the below example from its official documentation, you will see how useful this function is for this type of problem.

pandas convert columns to rows, wide to long format, pandas melt, pandas wide_to_long
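As a rough illustration of the patterned-column case, the sketch below uses a made-up table with one sales column per year. The column names Sales2019 and Sales2020 and all figures are assumptions for this example only; it is not the example from the pandas documentation.

```python
import pandas as pd

# Hypothetical wide table: one sales column per year, named <stub><year>
df = pd.DataFrame({
    "Salesman": ["Jack", "Zac"],
    "Sales2019": [120, 90],
    "Sales2020": [150, 60],
})

long_df = pd.wide_to_long(
    df,
    stubnames="Sales",   # common prefix of the value columns
    i="Salesman",        # column(s) identifying each row
    j="Year",            # new column holding the extracted suffix
).reset_index()
print(long_df)
```

With the default suffix pattern (digits only), the year is pulled out of the column name and converted to a number, which is something stack and melt do not do on their own.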

Performance Consideration

When testing the performance of the above 3 methods, the wide_to_long method takes significantly longer than the other two methods, and melt seems to be the fastest. But the result may vary for large data sets, so you will need to evaluate again based on your own data.

#timeit for stack method
4.52 ms ± 329 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

#timeit for melt method
3.5 ms ± 238 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

#timeit for wide_to_long method
17.8 ms ± 709 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
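If you want to repeat a comparison like this outside of Jupyter, a sketch with the standard timeit module could look like the following. The synthetic DataFrame, its size and the helper names use_stack and use_melt are assumptions for illustration, so the timings will not match the figures above.

```python
import timeit
import pandas as pd

# Synthetic data standing in for the sample file (the size is arbitrary)
n = 1000
df = pd.DataFrame({
    "Salesman": [f"S{i % 10}" for i in range(n)],
    "Order Date": pd.date_range("2020-01-01", periods=n, freq="D"),
    "Beer": range(n),
    "Whisky": range(n),
})

def use_stack():
    return df.set_index(["Salesman", "Order Date"]).stack().reset_index(name="Amount")

def use_melt():
    return df.melt(id_vars=["Salesman", "Order Date"],
                   var_name="Product Desc", value_name="Amount")

# Average seconds per call over 20 runs of each approach
timings = {
    func.__name__: timeit.timeit(func, number=20) / 20
    for func in (use_stack, use_melt)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.2f} ms per call")
```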

Conclusion

In this article, we have reviewed 3 pandas methods to convert columns to rows when you need to unpivot your data or transform it from wide to long format for further analysis. A simple test shows that the melt method performs the best and wide_to_long takes the longest time, but bear in mind that the wide_to_long method has its specific use cases which the other functions may not be able to cover.

 

Manipulate Audio File in Python With 6 Powerful Tips

Introduction

Dealing with audio files may not be that common for a Python enthusiast, but sometimes you may wonder whether you can manipulate audio files in Python for your personal interest. For instance, if you really like some music, you may want to edit some parts of it and save it to your phone, so that you can listen to it during your study or outdoor exercise without those annoying ads.

In this post, I will introduce a simple yet useful library for manipulating audio files in Python code.

Prerequisites:

You need to install Pydub in your working environment. Below is the installation command via pip:

pip install pydub

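The library depends on ffmpeg to support most audio file formats. Note that ffmpeg is a standalone program rather than a Python package, so it needs to be installed through your operating system and be available on your PATH, for example:

#macOS (Homebrew)
brew install ffmpeg

#Debian/Ubuntu
sudo apt-get install ffmpeg

On Windows, you can download a build from the ffmpeg website and add its bin folder to your PATH.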

Download Video from YouTube

As I am going to use a funny video from YouTube for the later demonstration, I need to install another library, youtube_dl, to download the video into my local folder:

pip install youtube_dl

Below is the command to download the video from YouTube with the given URL and the output file name. You can also use -f to specify the file format if the original video has multiple formats:

youtube-dl "https://www.youtube.com/watch?v=Zo6F_qtQCCc" -o "hongshaorou.mp4"

You may see the below output messages from your terminal, and the final output file will be saved to your current directory:

Manipulate Audio File in Python, pydub, download youtube,cut video python

Now let’s import pydub and use this video to explore what we can do with this library.

from pydub import AudioSegment
import os

Extract Sound From A Video File

To load a video file, we can use the from_file function of the AudioSegment class:

base_dir = r"c:\sounds"
sound = AudioSegment.from_file(os.path.join(base_dir, "hongshaorou.mp4"))

There are also other functions such as from_mp3, from_wav or from_ogg etc., depending on what type of audio files you want to read. With the export function, you can easily convert the video file into another format:

sound.export(os.path.join(base_dir, "hsr.mp3"), format="mp3")

There are some more parameters you can use to specify the metadata when you save the file, e.g.:

sound.export(os.path.join(base_dir, "hsr.mp3"),
                           format="mp3",
                           bitrate="192k",
                           tags={"album": "chinese cuisine", "artist": "not sure"},
                           cover= os.path.join(base_dir,"hongshaorou.jpeg"))

And you can also retrieve the meta info as per below:

from pydub.utils import mediainfo
mediainfo(os.path.join(base_dir, "hsr.mp3"))

Split/Cut Audio Clips

With the AudioSegment object, you can cut the audio file like slicing a list by specifying the starting point and ending point in milliseconds. For instance, to cut our audio file from 1:18 to 1:33 and save it to mp3:

first_cut_point = (1*60 + 18) * 1000
last_cut_point = (1*60 + 33) * 1000

sound_clip = sound[first_cut_point:last_cut_point]

sound_clip.export(os.path.join(base_dir, "hsr.mp3"), format="mp3")
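Writing the cut points as arithmetic gets tedious when you have many of them. As a small convenience, the sketch below converts a timestamp string into milliseconds; the helper name to_milliseconds is hypothetical and not part of pydub.

```python
def to_milliseconds(timestamp: str) -> int:
    """Convert 'm:ss' or 'h:mm:ss' into milliseconds."""
    seconds = 0
    for part in timestamp.split(":"):
        # each step shifts the running total up by one unit (hours -> minutes -> seconds)
        seconds = seconds * 60 + int(part)
    return seconds * 1000

first_cut_point = to_milliseconds("1:18")
last_cut_point = to_milliseconds("1:33")
print(first_cut_point, last_cut_point)  # 78000 93000
```

The two values can then be used for slicing in the same way as above, e.g. sound[first_cut_point:last_cut_point].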

Increase/Reduce Sound Volume

You can make the sound louder or quieter by adding/subtracting the decibels as per below:

#increase volume by 10dB for the first 2 seconds
sound_clip_1 = sound_clip[:2000] + 10

#reduce volume by 5dB for the last 3 seconds
sound_clip_2 = sound_clip[-3000:] - 5

#combine multiple sound clips
final_clip = sound_clip_1 + sound_clip[2000:-3000] + sound_clip_2
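Decibels are a logarithmic unit, so adding 10 dB does not mean "10 units louder". To get a feeling for the numbers, the sketch below converts a gain in dB into the corresponding amplitude multiplier using the standard relation ratio = 10^(dB/20). The helper name db_to_amplitude_ratio is an assumption for this example and not a pydub function.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Amplitude multiplier that corresponds to a gain expressed in dB."""
    return 10 ** (db / 20)

# The two gains used in the example above
for gain in (10, -5):
    print(f"{gain:+d} dB -> amplitude x{db_to_amplitude_ratio(gain):.2f}")
```

So the +10 used above roughly triples the amplitude, while -5 reduces it to a bit more than half.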

Play Sound In Python

If you are running the code in Jupyter Lab, you can simply evaluate final_clip in a cell to get an audio player and hear the result.

Otherwise, you can use the playback module to play the sound as per below:

from pydub.playback import play
play(final_clip)

Adding Silence In The Sound

Silence can be added to your sound clip as per below:

#Adding 1 second silence before the sound clip
AudioSegment.silent(duration=1000) + sound_clip[:5000]

Overlay Audio Onto Another Audio

The overlay function allows you to overlay one AudioSegment to another AudioSegment object. For instance:

sound_clip[5000:10000].overlay(final_clip[:5000])

There are more useful functions for editing audio files; you can find them in the full API documentation.

Conclusion

In this article, we have reviewed a few very useful functions in the pydub library, which allow you to manipulate audio files, for example converting audio formats and combining, splitting or editing sound clips. With these tips, you should be able to create your own sound clips in a few lines of Python code. In this post, we have also used the youtube-dl library, which allows you to download videos from YouTube and some other video streaming websites. You may refer to this reddit discussion if you are wondering whether this is legal. But I believe it should be alright if you just use it for your personal exploration of Python programming.

Python generate QR code, Python read QR code, Photo by Lukas on Unsplash

Read and Generate QR Code With 5 Lines of Python Code

 Introduction

QR code is one of the most popular two-dimensional barcodes, widely used for document management, track and trace in the supply chain and logistics industry, mobile payment, and even “touchless” health declaration and contact tracing during the COVID-19 pandemic. Compared to a 1D barcode, a QR code can be very small in size yet hold more information, and it is also easier to scan, as you can scan it from any direction.

In this article, I will share how to use some Python packages to generate QR codes and read QR codes from images.

Generate QR code with Python

To generate QR code, we will use a Python package called qrcode. Below is the pip command to install this package:

#install qrcode together with pillow
pip install qrcode[pil]

#or install qrcode if you already have pillow installed
pip install qrcode

As it depends on the Pillow package, you will need to have that package installed as well. Once you have these packages ready, let’s import the modules at the beginning of our code:

import qrcode
from PIL import Image

Generating a QR code with this qrcode library can be easily done with 1 line of code:

img = qrcode.make('QR Code')

If you check the “img” object from Jupyter Notebook, you can see the below image:

Generate QR code Python, Read QR code Python

This make function provides a quick way to generate QR code with all the default parameters. To specify the parameters like the size, style or border of the boxes, you can use the QRCode class. For instance:

qr = qrcode.QRCode(
    version=1,
    error_correction=qrcode.constants.ERROR_CORRECT_L,
    box_size=10,
    border=4,
)

Here is the explanations for these parameters:

version – QR code has 40 different sizes, indicated by the version parameter above. Version 1 represents a 21×21 matrix, and you can use (v-1)*4 + 21 to calculate the size of the matrix for each version number.

error_correction – specifies the error correction level, which controls how many error correction code blocks are inserted to achieve the error correction capability. In other words, if you want your barcode to be readable even when it is damaged (or has a logo/image placed on it), you may increase the error correction level, but this also makes your barcode denser.

box_size – the number of pixels for each square box of the QR code

border – the thickness of the border around the QR code, measured in boxes (4 is the default)
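The size formula for the version parameter can be checked with a few lines of plain Python. The helper names matrix_size and image_pixels are hypothetical, and the pixel calculation assumes the image side is simply (matrix size + border on both sides) multiplied by box_size.

```python
def matrix_size(version: int) -> int:
    """Boxes per side for a QR code of the given version (1 to 40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR code versions range from 1 to 40")
    return (version - 1) * 4 + 21

def image_pixels(version: int, box_size: int = 10, border: int = 4) -> int:
    """Pixels per side, assuming the border is added on both sides."""
    return (matrix_size(version) + 2 * border) * box_size

for v in (1, 2, 10, 40):
    side = matrix_size(v)
    print(f"version {v}: {side}x{side} boxes, {image_pixels(v)} px per side")
```

For example, version 1 with box_size=10 and border=4 works out to (21 + 8) * 10 = 290 pixels per side under this assumption.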

Once you have a QRCode instance, you can use the below code to specify the barcode data and color, and generate an image:

#barcode content
qr.add_data('codeforests.com')

#auto adjust the size
qr.make(fit=True)

#specifying barcode color
img = qr.make_image(fill_color="#040359", back_color="#f7f7fa")

If you check the “img” object from Jupyter Notebook again, you shall see something similar to below:

 

Generate QR code Python, Read QR code Python

 

To use the same barcode style to generate new barcode, you can just clear the data and then re-generate a new image object:

 

qr.clear()
qr.add_data('Python Tutorials')

img2 = qr.make_image(fill_color="#015b82", back_color="TransParent")

When inspecting the “img2” in Jupyter Notebook, you shall see below:

Generate QR code Python, Read QR code Python, Python QR code with different color

 

You can simply use the “save” method to save it into an image file since it is a Pillow Image object:

img2.save("qr_code.png")

The qrcode package cannot directly generate multiple QR codes into one image, if you need that, you may use the Pillow package to combine the images. For instance:

#create a blank image
new_img = Image.new("RGBA", (600, 350), "#fcfcfc")

new_img.paste(img, (0, 0))

new_img.paste(img2, (300, 0))

new_img.save("multi-QR-code.png")

The above will create a new image and combine the two barcode images into one. If you check the saved image file, you shall see:

 

Generate QR code Python, Read QR code Python, generate multiple QR codes on one page

With this package, you can also generate styled QR codes, e.g. with rounded corners, radial gradients, embedded images or different color masks. You can take a look at the samples on its official site.

Read QR Code in Python

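To read QR codes, we will use another Python package called pyzbar. You can use the pip command below to install it (on Linux and macOS, the underlying zbar shared library also has to be installed separately, e.g. with your system package manager):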

pip install pyzbar

This library is also very easy to use: you can directly pass a Pillow Image object, numpy.ndarray or raw bytes to the decode method to detect the barcode. For instance:

import pyzbar.pyzbar as pyzbar
from pyzbar.pyzbar import ZBarSymbol

input_image = Image.open("multi-QR-code.png")

decoded_objects = pyzbar.decode(input_image, symbols=[ZBarSymbol.QRCODE])

The decode method returns a list of barcode objects detected from the image with their position info. You can use the symbols parameter to restrict what type of barcodes you want to detect. When this parameter is not specified, all its supported barcode types will be checked.

From the above, you can further loop through the list to get the actual content data of the barcodes:

for obj in decoded_objects:
    zbarData = obj.data.decode("utf-8")
    print(zbarData)

You shall see the below result:

Generate QR code Python, Read QR code Python

In your real-world project, if you need to read one barcode among the multiple barcodes from a document, you may try to use the symbols to restrict the barcode types, or use regular expression to validate the detected barcode data in order to find the correct one you need.
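The regular expression idea can be sketched as below. The list of decoded values and the invoice number format are made up for illustration; in a real script the strings would come from obj.data.decode("utf-8") in the loop shown earlier.

```python
import re

# Hypothetical payloads, as they might come out of obj.data.decode("utf-8")
decoded_values = ["codeforests.com", "Python Tutorials", "INV-2021-00042"]

# Assumed format of the barcode we are after: INV-<4-digit year>-<5 digits>
invoice_pattern = re.compile(r"INV-\d{4}-\d{5}")

# fullmatch requires the whole payload to follow the pattern
matches = [value for value in decoded_values if invoice_pattern.fullmatch(value)]
print(matches)  # ['INV-2021-00042']
```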

If you need to do a lot of image pre-processing or even read barcodes from video or a webcam, you may install OpenCV and use the detectAndDecodeMulti method of its QRCodeDetector class to read the QR codes.

Conclusion

In this article, we have reviewed two simple but useful packages: qrcode for generating QR codes, and pyzbar for reading the content of a QR code. There are many other Python packages for generating all sorts of one- or two-dimensional barcodes; some are pure Python packages and some are Python wrappers. You may take a look at the summary table from this blog if a specific barcode type you need is not supported by these two packages.

 

20 Tips for Using Python Pip

Introduction

Python has become one of the most popular programming languages thanks to its easy-to-use syntax as well as the thousands of open-source libraries developed by the Python community. For almost every problem you want to solve, you can find a solution among these third-party libraries, so you do not need to reinvent the wheel. The majority of these libraries are hosted in the repository called PyPI, and you can install them with the Python pip command.

The Python pip module helps you manage the downloading and installation of packages and resolve their dependency requirements. Although you have probably used pip for some time, you may not have spent much time reading through its user guide for some of the useful operations. In this article, we summarize 20 useful tips for managing Python third-party packages with pip.

Check the current pip version

Since Python version 3.4, the pip module has been included by default within the Python binary installer, so you do not need to install it separately once you have Python program installed. To check the version of the pip package, you can use the below:

pip --version

Sample output:

Python Pip version

Install package from Pypi

Installing a package is very simple with the pip command: use the “install” option followed by one or more package names:

pip install requests

By default, pip looks for the latest release and installs it for you together with its dependency packages. Sample output as per below:

Python Pip install package

You can also specify the version number of the package to be installed:

py -m pip install pip==21.1.1

Sample output:

Python Pip install package with version number

Pip also supports a list of version specifiers such as >=1.2, <2.0, ~=2.0, !=2.0 or ==1.9.* for matching the correct version of the package to be installed.

When you are not in a virtual environment, the package will be installed into the global folder (system-site) by default. You can use the “--user” option to install into the user-site folder instead in case of any permission issue. E.g.:

pip install --user requests

Output as per below:

Python Pip install package to user-site

Although you can specify your own customized installation path for your different projects, using virtual environment is still the best way to manage dependencies and conflicts.

Show package version and installation location

To check the basic information such as version number or installation location for an existing package, you can use the “show” option:

pip show colorama

You can see the below information about the package:

Python Pip show package version and installation location

You can also use the “--verbose” mode to display additional meta info.

List all the packages installed

To list out all the packages installed, you can use the “list” option:

py -m pip list

You shall see the output format similar to below:

Python Pip list packages

You can add the “--user” option to list all packages installed in your user-site, e.g.:

py -m pip list --user

When you are using a virtual environment created with “--system-site-packages” (allowing the virtual environment to access system-site packages), you can use the “list --local” option to show only the packages installed in your virtual environment:

py -m pip list --local

List all the outdated packages

To check if any installed packages are outdated, you can use the “--outdated” option:

py -m pip list -o
# or
py -m pip list --outdated

Below is the sample output:

Python Pip list outdated packages

Upgrade package to the latest version

Once you have identified the outdated packages, you can manually use the “--upgrade” option (or “-U” for short) to upgrade a package to the latest version. Multiple package names can be specified, separated by whitespace:

py -m pip install --upgrade pip
#or 
py -m pip install -U pip setuptools

Sample output as per below:

Python Pip upgrade package

Auto upgrade packages to the latest version

Pip does not have an option to auto upgrade the outdated packages, but you can make use of the result from “list -o” and create a simple script to achieve it, e.g.:

#in Windows command line
for /F "skip=2 delims= " %i in ('pip list -o --local') do pip install -U %i

#in Linux (skip the two header lines, then take the package name column)
pip list -o --local | tail -n +3 | awk '{print $1}' | xargs -n1 pip install -U
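If you prefer something that behaves the same on Windows and Linux, the same idea can be sketched in Python by asking pip for JSON output instead of parsing the table. The function names outdated_package_names and upgrade_outdated are hypothetical, and the sample JSON string below is hand-written for the demo.

```python
import json
import subprocess
import sys

def outdated_package_names(pip_json: str) -> list:
    """Extract package names from `pip list --outdated --format=json` output."""
    return [entry["name"] for entry in json.loads(pip_json)]

def upgrade_outdated() -> None:
    """Upgrade every outdated package in the current environment, one by one."""
    listing = subprocess.run(
        [sys.executable, "-m", "pip", "list", "--outdated", "--format=json"],
        capture_output=True, text=True, check=True,
    )
    for name in outdated_package_names(listing.stdout):
        subprocess.run([sys.executable, "-m", "pip", "install", "-U", name], check=True)

# Parsing demo on a hand-written sample; call upgrade_outdated() to run it for real
sample = '[{"name": "requests", "version": "2.25.0", "latest_version": "2.26.0", "latest_filetype": "wheel"}]'
print(outdated_package_names(sample))  # ['requests']
```

Using sys.executable makes sure the pip of the currently running interpreter (and therefore the current virtual environment) is used.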

Export installed packages

You can use the “freeze” option to export all your installed package names and versions into a text file, so that you can re-create exactly the same project environment on another PC. For instance:

py -m pip freeze -l > requirements_demo.txt

Result in the output text file:

Python Pip install requirement file
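The exported file is plain text with one "name==version" entry per line, so it is easy to inspect from a script. Below is a simplified sketch that reads such lines into a dictionary; the function name parse_pinned and the sample lines are assumptions, and the parser deliberately ignores anything that is not a simple pinned entry (options, URLs, environment markers).

```python
def parse_pinned(lines):
    """Map package name -> pinned version for simple 'name==version' lines."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if "==" not in line:
            continue                           # skip blanks, options and unpinned lines
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins

# Hand-written sample in the style of a pip freeze output
sample = ["requests==2.26.0", "# a comment", "", "numpy==1.21.2"]
print(parse_pinned(sample))  # {'requests': '2.26.0', 'numpy': '1.21.2'}
```

In practice you would pass the lines of the exported file, e.g. open("requirements_demo.txt").read().splitlines().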

Install multiple packages from requirement file

For the packages you’ve exported with “freeze” option, you can re-install all the packages in another environment with the below “-r” option:

py -m pip install -r requirements.txt

You may see the below output when you have package name “numpy” in your requirements.txt file:

Python Pip install package with requirement file

The requirements.txt file can also include other requirement files. This may be useful when you have a sub module that requires extra packages and can be run independently as a separate application. You may put the common packages in requirements.txt and the additional packages in requirements_module1.txt, then include the requirements.txt file in your module file.

E.g. the content in the requirements_module1.txt:

#opencv-python
#comment out some packages
python-dateutil

-r requirements.txt

When you run the “install” command:

py -m pip install -r requirements_module1.txt

You shall see the sample output as per below:

Python Pip install package with multiple requirement files

Uninstall packages

Uninstalling an existing package can be done with below command:

pip uninstall numpy

Output as per below:

Python Pip uninstall package

Install package from wheel file

When you have a binary wheel file downloaded in your local folder, you can also use the “install” option to install the wheel file directly:

py -m pip install --force-reinstall C:\Users\codef\Downloads\python_dateutil-2.8.2-py2.py3-none-any.whl

Output as per below:

pip install whl

Install package from non-Pypi index

If the package is not hosted in the PyPI index, you can manually specify the index url with “--index-url” or simply “-i”:

py -m pip install -i https://mirrors.huaweicloud.com/repository/pypi/simple/ colorama

The above command downloads and installs the package from the Huawei Cloud repository (a PyPI mirror):

Python Pip install package from Pypi mirrors

This is also helpful when you cannot access PyPI directly due to a firewall or proxy issue in your network: you can find a PyPI mirror repository and download the packages from there. Usually these mirrors synchronize with PyPI every few minutes, which should not cause any issue for your development work.

Configure global index url

To permanently save the index url so that you do not have to key in the url for every package installation, you can use the “config” option to set the url globally. e.g:

pip config set global.index-url https://mirrors.aliyun.com/pypi/simple

With the above setting, you can install package from the mirror repository as per normal without specifying the url option.

Check package compatibility

When you manually install packages, sometimes you may run into dependency packages that have an incompatible version installed. To check if you have any such issue, you can use the “check” option:

python -m pip check

You may see something similar to below when there is any conflict:

Python Pip check package compatibility

Download package into local folder

You can download the package wheel files into your local folder when you need:

pip download requests -d .\requests

The “-d” option allows you to specify the target folder where you want to save the wheel files. You may get multiple wheel files if the package has dependencies (you can use “--no-deps” when you do not want to download the dependency files).

Below is the sample result:

 

Python Pip download wheels file

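Install package from local folder

To install a package from wheel files in a local folder, you can use “-f” (“--find-links”) with the folder path. Adding “--no-index” tells pip not to consult PyPI, so the installation works fully offline:

pip install requests --no-index -f .\requests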

This is the same as installing the package from Pypi:

Python Pip install package offline

Conclusion

In this article, we have summarized some useful tips for using Python pip to manage the installation and upgrading of third-party packages for your Python projects. For more advanced usage of this module, you may refer to its official documentation.

5 Useful Tips for Reading Email From Outlook In Python

Introduction

Pywin32 is one of the most popular packages for automating your daily work with Microsoft Outlook, Excel, etc. In my previous post, we discussed how to use this package to read emails and save attachments from Outlook. As quite a few questions raised in the comments were not covered in the original post, this article reviews some advanced topics for reading emails from Outlook via the Python Pywin32 package.

If you have not yet read through the previous post, you may check it out from here.

Prerequisites:

This article assumes you have already installed the latest Pywin32 package and imported the necessary packages below in your script, and that you do not encounter any error when executing the GetNamespace method to establish the Outlook connection:

import win32com.client

#other libraries to be used in this script 
import os 
from datetime import datetime, timedelta

outlook = win32com.client.Dispatch('outlook.application') 
mapi = outlook.GetNamespace('MAPI')

When using below code to iterate the Accounts property, you shall see whichever accounts you have configured in your outlook:

for account in mapi.Accounts: 
    print(account.DeliveryStore.DisplayName)

#Assuming below accounts have been configured:
#abc@hotmail.com
#def@gmail.com

Now let’s move on to the topics we are going to discuss in this article.

Reading Email from Multiple Outlook Accounts

If you have multiple accounts configured in your Outlook application, you can access one of them through the Folders property by specifying the account name or index, e.g.:

for idx, folder in enumerate(mapi.Folders):
    #index starts from 1
    print(idx+1, folder)

#Assuming below output:
# 1  abc@hotmail.com
# 2  def@gmail.com

To access the sub folders under a particular email account, you can continue to use the Folders property and specify the sub folder name or index. Before that, you may want to check which sub folders are available and what their index values are, as per below:

for idx, folder in enumerate(mapi.Folders("abc@hotmail.com").Folders):
    print(idx+1, folder)
# or using index to access the folder
for idx, folder in enumerate(mapi.Folders(1).Folders): 
    print(idx+1, folder)

You shall see something similar to the below:

reading email from outlook with Python pywin32

With the above folder index and name, you shall be able to access the email messages as per below:

messages = mapi.Folders("abc@hotmail.com").Folders("Inbox").Items
# or
messages = mapi.Folders(1).Folders(2).Items
for msg in list(messages):
    print(msg.Subject)

Although the index does not change when you move your folders up or down in Outlook, using the folder name is still much better than the index in terms of code readability.

Filter Email Based on Receiving Time Window

When reading emails from outlook inbox, you may want to zoom into the emails within a specific receiving time window rather than scanning through thousands of emails you have received in the inbox. To filter emails based on certain conditions, you can use restrict method together with the logical operators.

For instance, to filter the emails received from 1st day of the current month until today 12am:

today = datetime.today()

# first day of the current month, 12am
start_time = today.replace(day=1, hour=0, minute=0, second=0).strftime('%Y-%m-%d %I:%M %p')

#today 12am
end_time = today.replace(hour=0, minute=0, second=0).strftime('%Y-%m-%d %I:%M %p')

messages = messages.Restrict("[ReceivedTime] >= '" + start_time
+ "' And [ReceivedTime] <= '" + end_time + "'")
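To see what the filter string looks like before handing it to Outlook, you can build it in a small helper and print it. The sketch below is an assumption-laden illustration: the function name received_between is hypothetical, the date is fixed so that the output is reproducible, and the 12-hour clock (%I) is used to go with the AM/PM marker. The date format Outlook accepts can depend on your regional settings, so you may need to adjust it.

```python
from datetime import datetime

def received_between(start: datetime, end: datetime) -> str:
    """Build a Restrict filter string for a ReceivedTime window."""
    fmt = "%Y-%m-%d %I:%M %p"
    return ("[ReceivedTime] >= '" + start.strftime(fmt)
            + "' And [ReceivedTime] <= '" + end.strftime(fmt) + "'")

# Fixed timestamp instead of datetime.today(), only to keep the demo reproducible
now = datetime(2021, 9, 15, 14, 30)
midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
month_start = midnight.replace(day=1)

time_filter = received_between(month_start, midnight)
print(time_filter)
```

The resulting string can be passed to messages.Restrict(...) in the same way as the concatenated filter above.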

With logical operators like AND, OR and NOT, you are able to combine multiple criteria together. For instance, to check the email with certain subject but not from a particular sender email:

messages = messages.Restrict("[Subject] = 'Sample Report'" 
                             + " And Not ([SenderEmailAddress] = 'abc@company.com')")

And you can also use the Restrict method as many times as you wish if it makes your code more readable than combining all conditions in one filter, e.g.:

messages = messages.Restrict("[Subject] = 'Sample Report'")
messages = messages.Restrict("Not ([SenderEmailAddress] = 'abc@company.com')")

Getting First N emails

When using Restrict method for filtering email messages, you would not be able to specify max number of emails you want to read. If you wish to get the first/last N emails based on the receiving time, you can use the Sort method to sort the messages based on certain email properties before you slice the list. Below is the sample code to get the latest 10 email messages based on the receiving time:

messages.Sort("[ReceivedTime]", Descending=True)

#read only the first 10 messages
for message in list(messages)[:10]:
    print(message.Subject, message.ReceivedTime, message.SenderEmailAddress)
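Note that list(messages) walks through every item before the slice is taken. As an alternative sketch, itertools.islice stops iterating after N items. The helper name first_n is hypothetical, and a plain Python list stands in for the sorted Outlook Items collection here, so treat the behavior with a real mailbox as an assumption to verify.

```python
from itertools import islice

def first_n(items, n):
    """Return at most n items from any iterable without walking through all of it."""
    return list(islice(items, n))

# Plain list standing in for the sorted Outlook Items collection
subjects = [f"Report {i}" for i in range(1, 26)]
print(first_n(subjects, 3))  # ['Report 1', 'Report 2', 'Report 3']
```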

Wildcard Matching for Filtering

With the default filter syntax of the Restrict method, you cannot do wildcard matching such as searching whether the email subject or body contains certain keywords. To achieve that, you will need to use a DASL query.

For instance, with the below DASL query syntax, you can filter email subject which contains “Sample Report” keyword:

messages = messages.Restrict("@SQL=(urn:schemas:httpmail:subject LIKE '%Sample Report%')")

You may want to check here to see which fields are supported in DASL queries and the correct namespace to use.

Include/Exclude Multiple Email Domains

To filter the emails only from a particular domain, you can use a DASL query similar to the previous example:

messages = messages.Restrict("@SQL=(urn:schemas:httpmail:senderemail LIKE '%company.com')")

And to exclude the emails from a few domains, you can use multiple conditions with logical operators:

messages = messages.Restrict("@SQL=(Not(urn:schemas:httpmail:senderemail LIKE '%@abc%') \
And Not(urn:schemas:httpmail:senderemail LIKE '%@123%') \
And Not(urn:schemas:httpmail:senderemail LIKE '%@xyz%'))")
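When the list of domains gets longer, typing the conditions by hand becomes error prone. The sketch below generates the same kind of query from a Python list; the function name exclude_domains_filter is hypothetical, and the domain fragments are the placeholders used above.

```python
def exclude_domains_filter(domains):
    """Build a DASL filter that rejects senders matching any of the given fragments."""
    field = "urn:schemas:httpmail:senderemail"
    clauses = [f"Not({field} LIKE '%@{domain}%')" for domain in domains]
    return "@SQL=(" + " And ".join(clauses) + ")"

query = exclude_domains_filter(["abc", "123", "xyz"])
print(query)
```

The generated string can then be passed to messages.Restrict(query).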

Conclusion

In this article, we have reviewed some advanced usage of the Pywin32 package for filtering emails. You may not find many Python tutorials for this package online, but you should be able to find the equivalent VBA code on its official website for most of the code in this article. If you cannot find a solution for your problem, you may check whether there is something implemented in VBA that you can convert into Python syntax.

Link for the previous post: Reading Email From Outlook In Python