Resources

Best Tips for Python, Data Science and Automation

5 Useful Tips for Reading Email From Outlook In Python

Introduction Pywin32 is one of the most popular packages for automating your daily work with Microsoft Outlook, Excel and so on. In my previous post, we discussed how to use this package to read emails and save attachments from Outlook. As there were quite a few questions raised in the comments which were not covered in the original […]


8 Common Python Mistakes You Shall Avoid

Introduction Python is a very powerful programming language with easily understandable syntax, which allows you to learn it by yourself even if you do not come from a computer science background. Throughout the learning journey, you may still make lots of mistakes due to a lack of understanding of certain concepts. Learning how to fix these mistakes […]


15 Most Powerful Python One-liners You Can't Skip

Introduction A one-liner in Python refers to a short code snippet that achieves some powerful operation. It’s popular and widely used in the Python community as it makes the code more concise and easier to understand. In this article, I will be sharing some of the most commonly used Python one-liners that would definitely speed up your coding without […]


Web Scraping From Scratch With 3 Simple Steps

Introduction Web scraping or crawling refers to the technique of extracting information from a website and transforming it into structured data for later analysis. There are generally a few reasons why you may need to implement a web scraping script to automate the data collection process: There isn’t any public API available for you to […]


Read and write Google Sheet with 5 lines of Python code

Introduction Google Sheet is a very powerful tool in terms of collaboration, as it allows multiple users to work on the same rows of data simultaneously. It also provides fine-grained APIs in various programming languages for your application to connect and interact with Google Sheet. Sometimes when you just need some simple operations like reading/writing data […]

Photo by Yeshi Kangrang on Unsplash

Python recipes- suppress stdout and stderr messages

Introduction If you have worked on projects that require API calls to external parties or use 3rd party libraries, you may sometimes run into the problem that you get the correct return results but they come back with a lot of noise in stdout and stderr. For instance, […]

Get file names by extension from a directory

Whenever you work with directories and files, you will probably need a function that gets file names by extension from a particular directory. For instance, you may want to check and process all the Excel files in a folder, or do some housekeeping to remove all the old log files. In this article, I will explain a few ways of implementing such a function.

Let’s get started!

There are plenty of libraries/modules you can use to achieve this, but let’s start with the most commonly used ones.

Option 1

Since you will need to import the os module anyway to handle file operations, you can make use of the functions from this module.

For instance, you can list all the files and sub-directories under the current directory, and check whether each file name ends with a certain extension, as shown below:

import os

pyfiles = []
for file in os.listdir("."):
    if file.lower().endswith(".ipynb"):
        pyfiles.append(file)

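You can further sort the files by last modified time, from the latest to the earliest. Passing the bare file names to os.path.getmtime works here because they were listed from the current directory: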

pyfiles.sort(key=os.path.getmtime, reverse=True)

What if you want to check multiple file extensions? Don’t worry, you can still achieve it with a minor change to the if condition, since endswith also accepts a tuple of suffixes:

if file.lower().endswith((".ipynb", ".xlsx")):
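
The steps in Option 1 can be combined into one reusable function. The following is only a sketch: the function name list_files_by_extension and its parameters are made up for illustration. It joins the directory onto each name so that the sort also works when the directory is not the current one, and it skips sub-directories.

```python
import os


def list_files_by_extension(directory, extensions):
    """Return file names in `directory` ending with any of `extensions`, newest first."""
    # str.endswith accepts a tuple, so normalize the extensions into one
    suffixes = tuple(ext.lower() for ext in extensions)
    matches = [
        name
        for name in os.listdir(directory)
        if name.lower().endswith(suffixes)
        and os.path.isfile(os.path.join(directory, name))
    ]
    # getmtime needs the full path unless `directory` is the current directory
    matches.sort(
        key=lambda name: os.path.getmtime(os.path.join(directory, name)),
        reverse=True,
    )
    return matches
```

For example, list_files_by_extension(".", [".ipynb", ".xlsx"]) would return the notebooks and Excel files in the current directory.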

Option 2

The os module also has another function, scandir, which achieves the same result and also returns file type and file attribute information along with each entry.

files = []
for file in os.scandir("."):
    if file.name.lower().endswith((".ipynb", ".xlsx")):
        files.append(file.name)
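
To show what the extra information from scandir looks like in practice, here is a small sketch. The function name scan_files and the shape of the returned tuples are assumptions made for this example only. Each entry is an os.DirEntry, whose is_file() and stat() methods give the file type and attributes.

```python
import os


def scan_files(directory, extensions):
    """Return (name, size_in_bytes, modified_time) tuples for matching files."""
    suffixes = tuple(ext.lower() for ext in extensions)
    results = []
    # The context manager closes the scandir iterator promptly
    with os.scandir(directory) as entries:
        for entry in entries:
            # is_file() skips sub-directories whose names happen to match
            if entry.is_file() and entry.name.lower().endswith(suffixes):
                info = entry.stat()
                results.append((entry.name, info.st_size, info.st_mtime))
    return results
```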

 

Option 3

If you don’t like the way file names are matched in the above code, you can use fnmatch to do this job. For example:

import fnmatch
files = []
for file in os.listdir("."):
    if fnmatch.fnmatch(file, "*.ipynb") or fnmatch.fnmatch(file, "*.xlsx"):
        files.append(file)
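
One difference from the earlier options is worth a sketch: fnmatch.fnmatch applies os.path.normcase to both arguments, so it ignores case on Windows but not on Linux or macOS. If you want the same case-insensitive behaviour everywhere, one possible approach is to lower-case both sides and use fnmatch.fnmatchcase. The helper name match_names below is hypothetical.

```python
import fnmatch


def match_names(names, patterns):
    """Return the names matching any of the patterns, ignoring case."""
    matched = []
    for name in names:
        # fnmatchcase never applies OS-specific case rules, so lower-casing
        # both sides gives the same result on every platform
        if any(fnmatch.fnmatchcase(name.lower(), p.lower()) for p in patterns):
            matched.append(name)
    return matched
```

It can be fed with the output of os.listdir, e.g. match_names(os.listdir("."), ["*.ipynb", "*.xlsx"]).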

 

Option 4

Python also has a glob module, which lets you use Unix-style patterns to match files. To match the files with a certain extension, you can simply do the below:

import glob
files = glob.glob("*.ipynb")

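And then sort by os.path.getctime from the latest to the earliest. Note that getctime returns the creation time on Windows but the time of the last metadata change on Unix-like systems: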

files.sort(key=os.path.getctime, reverse=True)

If you want to match multiple file extensions, you can do something like the below:

files = []
file_types = ("*.ipynb", "*.xlsx")
for file_type in file_types:
    files.extend(glob.glob(file_type))

files.sort(key=os.path.getctime, reverse=True)
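
The glob approach can also be pointed at a directory other than the current one. The sketch below is one way to do it; the function name glob_by_patterns is an assumption for illustration, and it sorts by last modified time rather than getctime so that the ordering means the same thing on every platform.

```python
import glob
import os


def glob_by_patterns(directory, patterns):
    """Return paths in `directory` matching any pattern, most recently modified first."""
    found = set()  # a set avoids duplicates when patterns overlap
    for pattern in patterns:
        found.update(glob.glob(os.path.join(directory, pattern)))
    return sorted(found, key=os.path.getmtime, reverse=True)
```

Unlike the os.listdir examples, glob returns paths that already include the directory, so they can be passed straight to os.path.getmtime.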

As I mentioned earlier, there are far more ways of doing this and it would not be possible to list all of them, so I will stop here. Please leave a comment if you have better ideas.

 

How to swap key and value in a python dictionary

There are cases where you may want to swap the key and value pairs in a Python dictionary, so that you can do some operation keyed on the unique values of the original dictionary.

For instance, if you have the below dictionary:

contact = {"joe" : "contact@company.com", "john": "john@company.com"}

You can swap the keys and values of the dictionary with a dictionary comprehension:

contact = {val : key for key, val in contact.items()}
print(contact)

You will see the below output:

{'contact@company.com': 'joe', 'john@company.com': 'john'}

But if multiple names share the same email address in the above dictionary, only one name will be retained, because each later key overwrites the earlier one. For example:

contact = {"joe" : "contact@company.com", "jane" : "contact@company.com", "john": "john@company.com"}
contact = {val : key for key, val in contact.items()}

The output of the contact dictionary will be:

{'contact@company.com': 'jane', 'john@company.com': 'john'}

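So how do you keep all the keys that have the same value after reversing it?

You will need to use a list or set to collect all the keys that share the same value. Starting again from the original, unswapped dictionary:

contact = {"joe" : "contact@company.com", "jane" : "contact@company.com", "john": "john@company.com"}
email_contact = {}
for key, val in contact.items():
    email_contact.setdefault(val, []).append(key)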

(please refer to this article about the setdefault method)

And you will see the below output for the new dictionary email_contact:

{'contact@company.com': ['joe', 'jane'], 'john@company.com': ['john']}
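
As an alternative sketch of the same grouping, collections.defaultdict from the standard library creates the empty list automatically the first time a key is accessed, so the setdefault call is not needed. This is simply another way to write it, not a replacement for the approach above.

```python
from collections import defaultdict

contact = {
    "joe": "contact@company.com",
    "jane": "contact@company.com",
    "john": "john@company.com",
}

email_contact = defaultdict(list)
for name, email in contact.items():
    # Missing keys are created with an empty list on first access
    email_contact[email].append(name)

print(dict(email_contact))
# {'contact@company.com': ['joe', 'jane'], 'john@company.com': ['john']}
```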

That’s exactly what we want! Now we will be able to say “hi” to both Joe and Jane when sending an email to contact@company.com without missing any names.

 

As always, any comments or questions are welcome.

Handling the KeyError for python dictionary

KeyError is quite commonly seen when dealing with dictionary objects. It is raised when you try to access a key that does not exist in the dictionary. To avoid this error, you usually need to check whether the key exists before accessing the value.
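
As a minimal illustration of the error itself, the sketch below uses a cut-down example dictionary and shows the two most direct ways of guarding a lookup: testing membership with the in operator, and catching the exception.

```python
my_dict = {"name": "National University of Singapore"}

# Option A: test membership first, so the lookup is only done when it is safe
if "country" in my_dict:
    print(my_dict["country"])

# Option B: try the lookup and handle the failure
try:
    country = my_dict["country"]
except KeyError as err:
    country = None
    print(f"missing key: {err}")
# prints: missing key: 'country'
```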

For instance, you can check whether the key “country” has a value in my_dict and then check whether the value is “SGP”, as shown below. But the code does not look elegant.

my_dict = {"name" : "National University of Singapore", "address" : "21 Lower Kent Ridge Rd Singapore", "contact": "68741616"}
if my_dict.get("country") and my_dict["country"] == "SGP":
    print(f"country code is {my_dict['country']}")

You may also see people use the below approach to make the code more concise, passing in a default value to use if the key does not exist:

if my_dict.get("country", "") == "SGP":
    print(f"country code is {my_dict['country']}")

The Zen of Python tells us

Explicit is better than implicit.

So the above code does not really follow this principle. If you go through the Python documentation for dictionaries, there is indeed a way to get the value of a key while also setting a default value if the key is new to the dictionary. The below code shows how it works:

if my_dict.setdefault("country", "") == "SGP":
    print(f"country code is {my_dict['country']}")

By doing the above, the key “country” will be added to my_dict with the default value if the key did not previously exist, and the value of this key is then returned. Keep in mind that, unlike get, setdefault modifies the dictionary.
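
The difference between get and setdefault is easy to check with a short sketch. The dictionary here is a reduced version of the example, and the assert statements only confirm what each call does.

```python
my_dict = {"name": "National University of Singapore"}

# get() only reads: the dictionary is unchanged afterwards
assert my_dict.get("country", "") == ""
assert "country" not in my_dict

# setdefault() reads and, if the key is missing, also writes the default
assert my_dict.setdefault("country", "") == ""
assert my_dict["country"] == ""

# An existing value is never overwritten by setdefault()
my_dict["country"] = "SGP"
assert my_dict.setdefault("country", "") == "SGP"
```

So setdefault is a good fit when you do want the key to exist afterwards, and get is the safer choice when the dictionary must stay as it is.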

To extend the above setdefault method: if the value is a list of objects, you can also use this method to initialize it and then add to it.

my_dict.setdefault("faculty", []) # use list or set()
my_dict["faculty"].append("Arts")
my_dict["faculty"].append("Computer Science")

 

As always, any comments or questions are welcome.

 

How to send email from outlook in python

In the previous article, I explained how to read emails and save attachments from Outlook by using the pywin32 library. In this article, I will walk you through how to send email from Outlook with the same library.

Prerequisite:

You need to install the pywin32 library in your working environment.

pip install pywin32

and import this library in your script.

import win32com.client

Let’s get started!

You will first need to get hold of the Outlook application object by calling the below:

outlook = win32com.client.Dispatch('outlook.application')

In Outlook, emails, meeting invites, calendar entries, appointments etc. are all considered Item objects. Hence we can use the below to create an email object, where 0 is the value of the olMailItem constant:

mail = outlook.CreateItem(0)

For this mail item, there are various attributes we can set, such as To, CC, BCC, Subject, Body and HTMLBody, as well as the Attachments:

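mail.To = 'contact@company.com'
mail.CC = 'somebody@company.com'
mail.Subject = 'Sample Email'
# Body and HTMLBody hold the same message content, so the one assigned last takes effect
mail.Body = "This is the normal body"
mail.HTMLBody = '<h3>This is HTML Body</h3>'
# Attachments.Add expects the full path of an existing file
mail.Attachments.Add('c:\\sample.xlsx')
mail.Attachments.Add('c:\\sample2.xlsx')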

You can add multiple attachments by calling Attachments.Add multiple times.

Trigger to send out email from outlook

With the above attributes set, you should be able to send out the email, since all the necessary information has been provided. The below line of code triggers the Outlook application to send the email.

mail.Send()
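
Putting the steps together, the following sketch wraps them in a function. The names build_mail, connect_outlook and OL_MAIL_ITEM are hypothetical and not part of pywin32. The Outlook application object is passed in as a parameter, so the function itself does not need Outlook to be importable, and the win32com import is kept inside connect_outlook for the same reason.

```python
import os

OL_MAIL_ITEM = 0  # value of Outlook's olMailItem constant


def build_mail(outlook, to, subject, html_body, attachments=(), cc=""):
    """Create (but do not send) a mail item on the given Outlook application object."""
    mail = outlook.CreateItem(OL_MAIL_ITEM)
    mail.To = to
    if cc:
        mail.CC = cc
    mail.Subject = subject
    mail.HTMLBody = html_body
    for path in attachments:
        # Pass the full path so the file is found regardless of the working directory
        mail.Attachments.Add(os.path.abspath(path))
    return mail


def connect_outlook():
    """Return the Outlook application object (requires Windows, Outlook and pywin32)."""
    # Imported lazily so that this module can be loaded without pywin32 installed
    import win32com.client
    return win32com.client.Dispatch("Outlook.Application")
```

A possible way to use it is mail = build_mail(connect_outlook(), "contact@company.com", "Sample Email", "<h3>This is HTML Body</h3>"), followed by mail.Display() to review the draft in Outlook or mail.Send() to send it right away.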

You may also wonder: what if you just want to reply to a particular email instead of writing a new one? In this case, you will need to find the email message first and then use message.Reply() or message.ReplyAll() to reply to the original message. Do check out this article of mine.

Conclusion:

This is just a simple demo of how to send emails, and there are plenty of other things you can do with the pywin32 library. Do check my other related articles, such as this.

Last but not least, any comments or questions are welcome.

Fix the CompDocError when reading excel file with xlrd

CompDocError

You may have seen this CompDocError before if you have used the Python xlrd library to read an older version of the Excel file format (.xls). When the same file is opened directly in Microsoft Excel, the data is shown properly without any issue.

This usually happens if the Excel file is generated by a 3rd party application that did not strictly follow the standard Microsoft Excel format. Although the file is readable by Excel, opening it with the xlrd library fails due to the non-standard format or some missing metadata. As you may have no control over how the 3rd party application generates the file, you will need to find a way to handle this CompDocError in your code.

 

SOLUTIONS FOR COMPDOCERROR

 

Option 1:

If you look at the error message, the error is raised from line 427 of compdoc.py in your xlrd package (the exact line number may differ between xlrd versions). If you have confirmed that there is no problem with the data in your Excel file apart from the minor format issue, you can open compdoc.py and comment out the lines that raise the CompDocError exception.

while s >= 0:
    if self.seen[s]:
        pass
        #print("_locate_stream(%s): seen" % qname, file=self.logfile); dump_list(self.seen, 20, self.logfile)
        #raise CompDocError("%s corruption: seen[%d] == %d" % (qname, s, self.seen[s]))

Option 2:

You may notice that if you open your file in Microsoft Excel and save it, you will then be able to read it with xlrd and no exception will be raised. This is because Excel has already fixed the issues for you when saving the file. You can use the same approach in your code to fix this problem.

To do that, you can use the pywin32 library to open the native Excel application and re-save the file.

 

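import os
import win32com.client as win32

excel_app = win32.Dispatch('Excel.Application')
excel_app.DisplayAlerts = False  # do not show any alert when saving or closing
# Excel resolves relative paths against its own working directory, so pass an absolute path
wb = excel_app.Workbooks.Open(os.path.abspath("test.xls"))
wb.Save()
excel_app.Quit()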

 

Conclusion

 

Option 1 is good if your program only reads files generated from the same source. If your program needs to read Excel files from different sources, it may not be a good idea to always assume that the “CompDocError” can be ignored.

 

For option 2, when the quit method of excel_app is called, the entire Excel application will be closed without any alert. If you have other Excel files open at the time, they will all be closed together. So this solution is good if your program runs in a standalone environment, or if you have confirmed that no other process or person will be using Excel while your code runs.
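
One way to reduce that risk is to close only the workbook your code opened and leave the decision about quitting Excel to the caller. The sketch below is written under that assumption. The function name resave_workbook is hypothetical, and the Excel application object is passed in, for example the one returned by win32.Dispatch('Excel.Application').

```python
import os


def resave_workbook(excel_app, path):
    """Open `path` in the given Excel application object, save it and close it."""
    # Excel resolves relative paths against its own working directory,
    # so pass an absolute path
    wb = excel_app.Workbooks.Open(os.path.abspath(path))
    try:
        wb.Save()
    finally:
        # Close only this workbook; False means no further changes are saved on close
        wb.Close(False)
```

After calling resave_workbook(excel_app, "test.xls"), any other workbooks in that Excel instance stay open, and the caller can still quit the application if nothing else is using it.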

 

If you would like to understand more about how to read & write Excel files with xlrd, please check this article.