
5 Useful Tips for Reading Email From Outlook In Python
Introduction
Pywin32 is one of the most popular packages for automating daily work in Microsoft Outlook, Excel and other Office applications. In my previous post, we discussed how to use this package to read emails and save attachments from Outlook. Because many questions raised in the comments were not covered in the original post, this article reviews some advanced topics for reading emails from Outlook with the Pywin32 package.
If you have not yet read the previous post, you may check it out from here.
Prerequisites:
This article assumes you have already installed the latest Pywin32 package and imported the packages below in your script, and that calling the GetNamespace method to establish the Outlook connection does not raise any error:
import win32com.client
# other libraries to be used in this script
import os
from datetime import datetime, timedelta

outlook = win32com.client.Dispatch('outlook.application')
mapi = outlook.GetNamespace('MAPI')
When you iterate the Accounts property with the code below, you should see every account you have configured in Outlook:
for account in mapi.Accounts:
    print(account.DeliveryStore.DisplayName)

# Assuming below accounts have been configured:
# abc@hotmail.com
# def@gmail.com
Now let’s move on to the topics we are going to discuss in this article.
Reading Email from Multiple Outlook Accounts
If you have multiple accounts configured in your Outlook application, you can access one of them through the Folders property by specifying the account name or the index of the account, e.g.:
for idx, folder in enumerate(mapi.Folders):
    # index starts from 1
    print(idx+1, folder)

# Assuming below output:
# 1 abc@hotmail.com
# 2 def@gmail.com
To access the subfolders under a particular email account, you can continue to use the Folders property and specify the subfolder name or the index of the folder. Before that, you may want to check which subfolders are available and their index values, as shown below:
for idx, folder in enumerate(mapi.Folders("abc@hotmail.com").Folders):
    print(idx+1, folder)

# or using index to access the folder
for idx, folder in enumerate(mapi.Folders(1).Folders):
    print(idx+1, folder)
The output lists each subfolder together with its index.
With the folder index and name, you can access the email messages as shown below:
messages = mapi.Folders("abc@hotmail.com").Folders("Inbox").Items
# or
messages = mapi.Folders(1).Folders(2).Items

for msg in list(messages):
    print(msg.Subject)
Although the index does not change when you move folders up or down in Outlook, using the folder name is still much better than the index for the readability of the code.
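To inspect nested folders rather than only the first level, the Folders collection can be walked recursively. The following is a minimal sketch: walk_folders is a hypothetical helper name, and it only assumes that each folder object exposes the Name and Folders properties used above.

```python
def walk_folders(folder, parent_path=""):
    """Yield (path, folder) pairs for a folder and all of its subfolders."""
    path = f"{parent_path}/{folder.Name}" if parent_path else folder.Name
    yield path, folder
    # Folders is the collection of direct subfolders of an Outlook folder
    for subfolder in folder.Folders:
        yield from walk_folders(subfolder, path)


# Usage (assumes `mapi` was created as shown in the prerequisites):
# for path, _ in walk_folders(mapi.Folders("abc@hotmail.com")):
#     print(path)
```

Printing the paths this way makes it easier to pick folder names instead of relying on index values.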
Filter Email Based on Receiving Time Window
When reading emails from the Outlook inbox, you may want to narrow down to the emails received within a specific time window rather than scanning through the thousands of emails in the inbox. To filter emails based on certain conditions, you can use the Restrict method together with logical operators.
For instance, to filter the emails received from the first day of the current month until 12am today:
today = datetime.today()
# first day of the current month, 12am
start_time = today.replace(day=1, hour=0, minute=0, second=0).strftime('%Y-%m-%d %I:%M %p')
# today 12am
end_time = today.replace(hour=0, minute=0, second=0).strftime('%Y-%m-%d %I:%M %p')
messages = messages.Restrict("[ReceivedTime] >= '" + start_time + "' And [ReceivedTime] <= '" + end_time + "'")
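The time window logic can also be kept in small functions so that it is testable without Outlook. This is only a sketch: month_to_date_window and received_between_filter are hypothetical helper names, and the date format is an assumption, since Outlook interprets date strings according to the locale settings of the machine.

```python
from datetime import datetime


def month_to_date_window(now):
    """Return (first day of the month at midnight, today at midnight)."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight.replace(day=1), midnight


def received_between_filter(start, end):
    """Build a Restrict filter for items received within [start, end]."""
    # Assumed date format; adjust it if your locale expects a different order
    fmt = "%Y-%m-%d %I:%M %p"
    return (
        f"[ReceivedTime] >= '{start.strftime(fmt)}'"
        f" And [ReceivedTime] <= '{end.strftime(fmt)}'"
    )


# Usage (assumes `messages` is an Outlook Items collection):
# start, end = month_to_date_window(datetime.today())
# messages = messages.Restrict(received_between_filter(start, end))
```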
With logical operators like AND, OR and NOT, you can combine multiple criteria. For instance, to find emails with a certain subject that are not from a particular sender:
messages = messages.Restrict("[Subject] = 'Sample Report'" + " And Not ([SenderEmailAddress] = 'abc@company.com')")
You can also call the Restrict method as many times as you wish if that makes your code more readable than combining all conditions in one filter, e.g.:
messages = messages.Restrict("[Subject] = 'Sample Report'")
messages = messages.Restrict("Not ([SenderEmailAddress] = 'abc@company.com')")
Getting First N emails
When using the Restrict method to filter email messages, you cannot specify the maximum number of emails you want to read. If you wish to get the first or last N emails based on the receiving time, you can use the Sort method to sort the messages by a given email property before you slice the list. Below is sample code to get the latest 10 email messages based on the receiving time:
messages.Sort("[ReceivedTime]", Descending=True)
# read only the first 10 messages
for message in list(messages)[:10]:
    print(message.Subject, message.ReceivedTime, message.SenderEmailAddress)
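Converting the whole collection to a list reads every message even though only ten are needed. As an alternative sketch, the Items collection can be stepped through with its GetFirst and GetNext methods and stopped early; latest_messages is a hypothetical helper name.

```python
def latest_messages(items, count):
    """Return up to `count` items, newest first, without copying the whole folder."""
    # Sort in place; the second argument True requests descending order
    items.Sort("[ReceivedTime]", True)
    result = []
    message = items.GetFirst()
    # GetNext returns None once the end of the collection is reached
    while message is not None and len(result) < count:
        result.append(message)
        message = items.GetNext()
    return result


# Usage (assumes `messages` is an Outlook Items collection):
# for message in latest_messages(messages, 10):
#     print(message.Subject, message.ReceivedTime)
```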
Wildcard Matching for Filtering
With the default filter syntax of the Restrict method used so far, you cannot do wildcard matching, such as checking whether the email subject or body contains certain keywords. To achieve that, you need to pass a DASL query to Restrict.
For instance, with the DASL query syntax below, you can filter for emails whose subject contains the keyword “Sample Report”:
messages = messages.Restrict("@SQL=(urn:schemas:httpmail:subject LIKE '%Sample Report%')")
You may want to check here to see which fields are supported in a DASL query and the correct namespace to use.
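When the keyword comes from user input, building the DASL string in a helper keeps the quoting in one place. This is a sketch under one assumption, stated in the code: a single quote inside the keyword is escaped by doubling it. subject_contains_filter is a hypothetical helper name.

```python
def subject_contains_filter(keyword):
    """Build a DASL filter matching subjects that contain `keyword`."""
    # Assumption: a single quote inside the keyword is escaped by doubling it
    escaped = keyword.replace("'", "''")
    return f"@SQL=(urn:schemas:httpmail:subject LIKE '%{escaped}%')"


# Usage (assumes `messages` is an Outlook Items collection):
# messages = messages.Restrict(subject_contains_filter("Sample Report"))
```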
Include/Exclude Multiple Email Domains
To filter for emails from a particular domain only, you can use a DASL query similar to the previous example:
messages = messages.Restrict("@SQL=(urn:schemas:httpmail:senderemail LIKE '%company.com')")
And to exclude the emails from a few domains, you can use multiple conditions with logical operators:
messages = messages.Restrict("@SQL=(Not(urn:schemas:httpmail:senderemail LIKE '%@abc%') \
And Not(urn:schemas:httpmail:senderemail LIKE '%@123%') \
And Not(urn:schemas:httpmail:senderemail LIKE '%@xyz%'))")
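If the list of domains to exclude is long or changes often, the same query can be generated from a Python list. A minimal sketch, where exclude_domains_filter is a hypothetical helper name and the clauses follow the pattern of the query above:

```python
def exclude_domains_filter(domains):
    """Build a DASL filter rejecting senders whose address contains '@<domain>'."""
    if not domains:
        raise ValueError("at least one domain is required")
    clauses = [
        f"Not(urn:schemas:httpmail:senderemail LIKE '%@{domain}%')"
        for domain in domains
    ]
    # All clauses must hold, so they are joined with And
    return "@SQL=(" + " And ".join(clauses) + ")"


# Usage (assumes `messages` is an Outlook Items collection):
# messages = messages.Restrict(exclude_domains_filter(["abc", "123", "xyz"]))
```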
Conclusion
In this article, we have reviewed some advanced usage of the Pywin32 package for filtering emails. You may not find many Python tutorials for this package online, but you should be able to find the equivalent VBA code on the official Microsoft website for most of the code you have seen in this article. If you cannot find a solution for your problem, you may check whether something similar has been implemented in VBA that you can convert into Python syntax.
Link to the previous post: Reading Email From Outlook In Python.
This is SO CLOSE to the complete solution I am looking for! This and your prior post on saving attachments have taken me far. I need to save attachments from emails in a shared inbox that were moved to a subfolder under the inbox. Two things: how do I navigate the subfolders, and how do I preserve the folder structure of the attachment when I save it to the local drive? My example:
messages = mapi.Folders("eXXX@oXXa.com").Folders("Inbox\Test1").Items
Then to save the attachment:
attachment.SaveASFile(os.path.join(outputDir, attachment.FileName))
I want to change outputDir based on the folder the email attachment is in. Do I need to check if the folder already exists?
Thanks in advance
Nick R
Hi Nick,
Let me just share some code to explain.
Assuming you want to iterate through all the subfolders (1 level only) under the Inbox folder, you can use the subfolder name to form an output folder structure, and then create the folder accordingly before you save the attachment. The folder has to be created before you save the attachment files, otherwise you will get an error.
for subfolder in mapi.Folders('eXXX@oXXa.com').Folders('Inbox').Folders:
    # mirror the Outlook subfolder name in the local output path
    output_dir = os.path.join(os.getcwd(), "Inbox", subfolder.Name)
    print(output_dir)
    messages = subfolder.Items
    # you may do some filtering here
    for m in messages:
        for att in m.Attachments:
            file_name = att.FileName
            createDir(output_dir)
            att.SaveAsFile(os.path.join(output_dir, file_name))
Below is a separate function to create the folder if it does not exist:
import errno

def createDir(dirPath):
    if not os.path.exists(dirPath):
        try:
            os.makedirs(dirPath)
        except OSError as exc:  # Guard against race condition
            if exc.errno != errno.EEXIST:
                print("could not create folder ", dirPath)
                raise
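For deeper nesting, note that a path such as "Inbox\Test1" cannot be passed to Folders in one call; each level needs its own Folders lookup. A minimal sketch of that idea follows, where get_folder_by_path is a hypothetical helper name and the folder names are placeholders:

```python
def get_folder_by_path(root, path):
    """Resolve a path such as 'Inbox/Test1' one folder level at a time."""
    folder = root
    # Accept both separators, then skip empty segments
    for name in path.replace("\\", "/").split("/"):
        if name:
            folder = folder.Folders(name)
    return folder


# Usage (assumes `mapi` was created as in the article):
# test1 = get_folder_by_path(mapi.Folders("eXXX@oXXa.com"), "Inbox/Test1")
# messages = test1.Items
```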
Hope this solves your problem
i want to filter out the attachments using filename like only files with certain filenames has to downloaded
Hi Venus,
Outlook does not expose any API to filter attachments by file name. You will have to first check whether the email has attachments (count > 0) and then match each file name against what you want before you save it.
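The check described above could look like the following sketch. save_matching_attachments is a hypothetical helper name, and the shell-style pattern matching via fnmatch is one possible way to compare file names, not something Outlook itself provides.

```python
import fnmatch
import os


def save_matching_attachments(message, pattern, output_dir):
    """Save attachments whose file name matches a shell-style pattern."""
    saved = []
    # Attachments.Count is 0 for messages without attachments
    if message.Attachments.Count == 0:
        return saved
    os.makedirs(output_dir, exist_ok=True)
    for attachment in message.Attachments:
        # Compare case-insensitively, e.g. pattern "report_*.xlsx"
        if fnmatch.fnmatch(attachment.FileName.lower(), pattern.lower()):
            target = os.path.join(output_dir, attachment.FileName)
            attachment.SaveAsFile(target)
            saved.append(target)
    return saved


# Usage (assumes `messages` is an Outlook Items collection):
# for msg in messages:
#     save_matching_attachments(msg, "report_*.xlsx", os.getcwd())
```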
how to get attachment name file
Thanks for your kind sharing.
I run into error for the dates of receiving email, with the below message:
File “<COMObject <unknown>>”, line 2, in Restrict
pywintypes.com_error: (-2147352567, ‘Exception occurred.’, (4096, ‘Microsoft Outlook’, ‘The property “ReceiveTime” is unknown.’, None, 0, -2147352567), None)
Hi Will,
Thanks for reading.
It should be “ReceivedTime”, I guess you have a typo in the code.
Hi,
I want to extract the email subject and body to excel, how can i do that?
Hi Rayan,
The easiest way would be using the csv module as per below:
import csv

with open("msg.csv", "w", newline='', encoding="utf-8") as csv_file:
    writer = csv.DictWriter(csv_file, ["Subject", "Body"])
    writer.writeheader()
    # one row per message, with subject and body as columns
    for msg in messages:
        writer.writerow({'Subject': msg.Subject, 'Body': msg.Body})
You can also take a look at this post of mine on writing to an Excel file.
How to get only first mail from inbox
Hi Kadir,
You can refer to this post, and get the first element from messages
hello..great post! I’m trying to get the sender’s email address when I use print(message.SenderEmailAddress) returns this code: /o=ExchangeLabs/ou=Exchange Administrative Group (FYDIBOHF23SPDLT)/cn=Recipients/cn=950159a8f618495d9a1d152095e7d740 do you know how to solve this?
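A string of this form is typically the legacy Exchange address that Outlook stores for senders on an Exchange server. One possible approach, sketched below, is to check the SenderEmailType property and resolve the address through the Sender entry; sender_smtp_address is a hypothetical helper name, and whether the lookup succeeds depends on the Exchange environment.

```python
def sender_smtp_address(message):
    """Return the sender's SMTP address, resolving Exchange entries if possible."""
    # "EX" marks an Exchange address, which is shown as a legacy DN string
    if message.SenderEmailType == "EX":
        exchange_user = message.Sender.GetExchangeUser()
        if exchange_user is not None:
            return exchange_user.PrimarySmtpAddress
    # Fall back to the raw property for SMTP senders or unresolved entries
    return message.SenderEmailAddress


# Usage (assumes `messages` is an Outlook Items collection):
# for message in messages:
#     print(sender_smtp_address(message))
```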
I am trying to create filter but I can’t understand how to achieve what I need. I need a filter to filter out emails that CONTAINS a certain string on the subject, instead of being equal like this sentence: