Resources

Best Tips for Python, Data Science and Automation

Python generate QR code, Python read QR code, Photo by Lukas on Unsplash

Read and Generate QR Code With 5 Lines of Python Code

Introduction QR code is the most popular two-dimensional barcode, widely used for document management, track and trace in the supply chain and logistics industry, mobile payment, and even “touchless” health declaration and contact tracing during the COVID-19 pandemic. Compared to a 1D barcode, a QR code can be very small in size but hold more […]


20 Tips for Using Python Pip

Introduction Python has become one of the most popular programming languages thanks to its easy-to-use syntax as well as the thousands of open-source libraries developed by the Python community. For almost every problem you want to solve, you can find a solution among these third-party libraries, so you do not need to reinvent […]


5 Useful Tips for Reading Email From Outlook In Python

Introduction Pywin32 is one of the most popular packages for automating your daily work with Microsoft Outlook, Excel etc. In my previous post, we discussed how to use this package to read emails and save attachments from Outlook. As quite a few questions were raised in the comments which were not covered in the original […]


8 Common Python Mistakes You Shall Avoid

Introduction Python is a very powerful programming language with easily understandable syntax, which allows you to learn it by yourself even if you do not come from a computer science background. Throughout the learning journey, you may still make lots of mistakes due to a lack of understanding of certain concepts. Learning how to fix these mistakes […]


15 Most Powerful Python One-liners You Can't Skip

Introduction A one-liner in Python refers to a short code snippet that achieves some powerful operation. One-liners are popular and widely used in the Python community as they make the code more concise and easier to understand. In this article, I will be sharing some of the most commonly used Python one-liners that would definitely speed up your coding without […]


Web Scraping From Scratch With 3 Simple Steps

Introduction Web scraping or crawling refers to the technique of extracting information from a website and transforming it into structured data for later analysis. There are generally a few reasons why you may need to implement a web scraping script to automate the data collection process: There isn’t any public API available for you to […]


How to send email with attachment via python smtplib

In one of my previous articles, I discussed how to send email from the Outlook application. That approach assumes you have already installed Outlook and configured your email account on the machine where you want to run your script. In this article, I will share how to automatically send email with attachments via a lower-level API, specifically Python’s smtplib, which is part of the standard library and needs no mail client set up in your environment.

For this article, I will demonstrate how to send an HTML-format email with an attachment from a Gmail account. So besides the smtplib module, we will need two other modules: ssl and email.

Let’s get started!

First, you will need the SMTP server and port info for sending email via a Google account. You can find this information from this link. For easy reading, I have captured it in the screenshot below.

codeforests - google smtp server configuration info

So we are going to use the server smtp.gmail.com and port 587, the port on which the connection is upgraded to TLS via STARTTLS. (You may search online to find out more about SSL and TLS; we will not discuss them much in this article.)

Let’s start by importing all the modules we need:

import smtplib, ssl
from email.mime.multipart import MIMEMultipart 
from email.mime.text import MIMEText 
from email.mime.application import MIMEApplication

As we are going to send the email in HTML format (which unlocks a lot of features such as adding styles, drawing tables etc.), we will need MIMEText for the body, plus MIMEMultipart and MIMEApplication for the attachment.

Build up the email message

To build up our email message, we need to create a MIMEMultipart object of type mixed so that we can send both text and attachments. Next, we specify the From, To, CC and Subject headers.

smtp_server = 'smtp.gmail.com'
smtp_port = 587 
#Replace with your own gmail account
gmail = '[email protected]'
password = 'your password'

message = MIMEMultipart('mixed')
message['From'] = 'Contact <{sender}>'.format(sender = gmail)
message['To'] = '[email protected]'
message['CC'] = '[email protected]'
message['Subject'] = 'Hello'

You probably do not want anybody to see your hard-coded password here, so you may consider putting this email account info into a separate configuration file. Check my other post on reading and writing configuration files.
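As a minimal sketch of that idea, the account details could be read with the standard configparser module. The section name gmail, the key names and the file name email.ini are assumptions made up for this illustration, and the INI text is inlined here only so the snippet runs on its own.

```python
import configparser

# Hypothetical contents of a file such as "email.ini" that is kept out of
# version control; the section and key names are made up for this sketch.
SAMPLE_CONFIG = """
[gmail]
account = your_account@gmail.com
password = your password
"""


def load_credentials(config_text):
    """Return (account, password) from the [gmail] section of an INI document."""
    # interpolation=None keeps a "%" inside a password from being treated specially
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)  # for a real file: parser.read("email.ini")
    section = parser["gmail"]
    return section["account"], section["password"]


gmail, password = load_credentials(SAMPLE_CONFIG)
print(gmail)
```

Environment variables read through os.environ would be another reasonable place for the same two values.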

For the HTML message content, we will wrap it into the MIMEText, and then attach it to our MIMEMultipart message:

msg_content = '<h4>Hi There,<br> This is a testing message.</h4>\n'
body = MIMEText(msg_content, 'html')
message.attach(body)

Let’s assume you want to attach a PDF file from your C drive. You can read it in binary mode and pass the content to MIMEApplication with pdf as the MIME subtype. Take note of the additional Content-Disposition header, where you specify the file name of your attachment.

import os

attachmentPath = "c:\\sample.pdf"
try:
    with open(attachmentPath, "rb") as attachment:
        p = MIMEApplication(attachment.read(), _subtype="pdf")
    # The filename parameter tells the mail client what to call the attachment
    p.add_header('Content-Disposition', 'attachment', filename=os.path.basename(attachmentPath))
    message.attach(p)
except OSError as e:
    # The file is missing or unreadable; report it and carry on without the attachment
    print(str(e))

If you have a list of attachments, you can loop through the list and attach them one by one with the above code.
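A sketch of such a loop is shown below. The helper name attach_files and the decision to skip unreadable files instead of stopping are assumptions of this example; it uses the generic octet-stream subtype so that it works for any file type.

```python
import os
import tempfile
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def attach_files(message, paths):
    """Attach every readable file in paths; return the paths that were skipped."""
    skipped = []
    for path in paths:
        try:
            with open(path, "rb") as attachment:
                part = MIMEApplication(attachment.read(), _subtype="octet-stream")
        except OSError:
            # Missing or unreadable file: remember it and move on to the next one
            skipped.append(path)
            continue
        part.add_header("Content-Disposition", "attachment",
                        filename=os.path.basename(path))
        message.attach(part)
    return skipped


# Usage with two throwaway files and one path that does not exist
with tempfile.TemporaryDirectory() as folder:
    paths = []
    for name in ("a.pdf", "b.pdf"):
        path = os.path.join(folder, name)
        with open(path, "wb") as f:
            f.write(b"dummy content")
        paths.append(path)
    demo_message = MIMEMultipart("mixed")
    skipped = attach_files(demo_message, paths + [os.path.join(folder, "missing.pdf")])
    attached_names = [part.get_filename() for part in demo_message.get_payload()]
    print(attached_names, len(skipped))
```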

Once everything is set properly, we can convert the message object into a string:

msg_full = message.as_string()

Send email

Here comes the most important part: we need to create the TLS context and use it to communicate with the SMTP server.

context = ssl.create_default_context()
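To see what the default context gives you, the short check below inspects two of its settings. It only reads attributes and does not open any network connection.

```python
import ssl

context = ssl.create_default_context()

# The default client context requires a valid server certificate ...
requires_certificate = context.verify_mode == ssl.CERT_REQUIRED
# ... and checks that the certificate matches the host name we connect to
checks_hostname = context.check_hostname

print(requires_certificate, checks_hostname)
```

This is why the default context is usually preferable to building an ssl.SSLContext by hand for a simple mail script.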

Next, we open the connection to the SMTP server and call starttls with the TLS context, which starts the handshake and encrypts the rest of the session.

After that we authenticate with our Gmail account. In the sendmail method, you specify the sender, the recipients (the To and CC addresses combined into one list; CC is optional) and the message string.

# Envelope recipients: all To and CC addresses, split on ";" in case several are given
to = message['To']
cc = message['CC']
recipients = to.split(";") + (cc.split(";") if cc else [])

with smtplib.SMTP(smtp_server, smtp_port) as server:
    server.ehlo()
    server.starttls(context=context)  # upgrade the connection to TLS
    server.ehlo()
    server.login(gmail, password)
    server.sendmail(gmail, recipients, msg_full)
    server.quit()

print("email sent out successfully")

Once sendmail completes, you disconnect from the server with server.quit().

With all the above, you should receive the email triggered from your code. You may want to wrap this code into a class, so that you can reuse it as a service library across multiple projects.
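As a starting point for such a reusable piece, the sketch below factors the message-building part into plain functions (a class would work equally well). The names split_addresses and build_message and the example.com addresses are made up for illustration, and nothing is sent.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def split_addresses(addresses):
    """Turn 'a@x.com; b@x.com' into ['a@x.com', 'b@x.com'], ignoring blanks."""
    if not addresses:
        return []
    return [item.strip() for item in addresses.split(";") if item.strip()]


def build_message(sender, to, subject, html, cc=None):
    """Return the MIME message and the full recipient list for sendmail."""
    message = MIMEMultipart("mixed")
    message["From"] = sender
    message["To"] = ", ".join(split_addresses(to))
    if cc:
        message["CC"] = ", ".join(split_addresses(cc))
    message["Subject"] = subject
    message.attach(MIMEText(html, "html"))
    # sendmail needs every To and CC address in one list
    recipients = split_addresses(to) + split_addresses(cc)
    return message, recipients


message, recipients = build_message(
    "me@example.com", "a@example.com; b@example.com", "Hello",
    "<h4>Hi There</h4>", cc="c@example.com")
print(recipients)
```

The returned pair can then be passed to server.sendmail(sender, recipients, message.as_string()) inside the connection block shown earlier.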

 

As always, please share if you have any questions or comments.


How to print colored message on command line terminal window

When you are developing a Python script that prints output messages to the terminal window, you may find it a little boring that all the messages are printed in black and white, especially if some messages are warnings and some are for information only. You may wonder how to print colored messages to make them look different, so that your users pay special attention to the warning or error messages.

In this article, I will be sharing with you a library which allows you to print colored message in your terminal.

Let’s get started!

The library I am going to introduce is called colorama, a small and clean library for styling your messages on Windows, Linux and macOS.

Prerequisite :

You will need to install this library, so that you will be able to run the following code in this article.

pip install colorama

To start using this library, you will need to import the modules, and call the init() method at the beginning of your script or your class initialization method.

import colorama
from colorama import Fore, Back, Style
colorama.init()

Print colored message with colorama

The init method also accepts some keyword arguments (**kwargs) to override its default behaviors. E.g. by default, the style is not reset after printing a message, so subsequent messages follow the same style. You can pass autoreset=True to the init method, so that the style is reset after each print statement.

Below are the options you can use when formatting the font, background and style.

Fore: BLACK, RED, GREEN, YELLOW, BLUE, MAGENTA, CYAN, WHITE, RESET.
Back: BLACK, RED, GREEN, YELLOW, BLUE, MAGENTA, CYAN, WHITE, RESET.
Style: DIM, NORMAL, BRIGHT, RESET_ALL

To use them, wrap your messages with the styles as per below:

print(Fore.CYAN + "Cyan messages will be printed out just for info only" + Style.RESET_ALL)
print(Fore.RED + "Red messages are meant to be warnings or errors" + Style.RESET_ALL)
print(Fore.YELLOW + Back.GREEN +  "Yellow messages are debugging info" + Style.RESET_ALL)

This is how it looks in your terminal:

Python printed colored message with colorama

As I mentioned earlier, if you don’t set autoreset to True, you will need to reset the style at the end of each message, so that different messages can apply different styles.

What if you want to apply the styles when asking for the user’s input? Let’s see an example:

print(Fore.YELLOW)
choice = input("Enter YES to confirm:")
print(Style.RESET_ALL)
if choice.strip().upper() in ["YES", "Y"]:
    print(Fore.GREEN + "You have just confirmed to proceed." + Style.RESET_ALL)
else:
    print(Fore.RED + "You did not enter yes, let's stop here" + Style.RESET_ALL)

Because the input call sits between Fore.YELLOW and Style.RESET_ALL, both the prompt printed by your script and whatever the user types are shown in the same style.

Let’s put all the above into a script and run it in the terminal to check how it looks.

Python printed colored message with colorama

Yes, that’s exactly what we want to achieve! Now you can wrap your printing statement into a method e.g.: print_colored_message, so that you do not need to repeat the code everywhere.
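One possible shape for such a helper is sketched below. The level names (info, warning, error) and the colorize function are assumptions of this example. The fallback branch, which is also an addition of this sketch, uses the same ANSI escape sequences that the colorama constants hold, so the snippet still runs where colorama is not installed.

```python
try:
    from colorama import Fore, Style
    COLORS = {"info": Fore.CYAN, "warning": Fore.YELLOW, "error": Fore.RED}
    RESET = Style.RESET_ALL
except ImportError:
    # Fallback: the raw ANSI sequences behind the colorama constants above
    COLORS = {"info": "\x1b[36m", "warning": "\x1b[33m", "error": "\x1b[31m"}
    RESET = "\x1b[0m"


def colorize(text, level="info"):
    """Wrap text in the color for the given level and reset the style afterwards."""
    return COLORS[level] + text + RESET


def print_colored_message(text, level="info"):
    # Call colorama.init() once at program start so this also works on Windows
    print(colorize(text, level))


print_colored_message("Just for info")
print_colored_message("Something went wrong", level="error")
```

Keeping colorize separate from the printing makes the styled string easy to reuse, for example in a log line or an input prompt.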

As always, please share if you have any comments or questions.

 


Python how to unpack tuple, list and dictionary

There are various cases where you want to unpack Python objects such as a tuple, list or dictionary into individual variables, so that you can easily access the individual items. In this article I will share how to unpack these different Python objects and how this is useful when working with *args and **kwargs in a function.

Let’s get started.

Unpack python tuple objects

Let’s say we have a tuple object called shape which describes the height, width and channel of an image. We can unpack it into 3 separate variables as below:

shape = (500, 300, 3)
height, width, channel = shape
print(height, width, channel)

And you can see each item inside the tuple has been assigned to the individual variables with a meaningful name, which increases the readability of your code. Below is the output:

500 300 3

It’s definitely more elegant than accessing each item by index, e.g. shape[0], shape[1], shape[2].

What if we just need a few items from a big tuple which has many items? Here we need to introduce _ (the so-called unnamed variable, by convention a placeholder for values you do not need) and * (which collects an arbitrary number of items).

For example, if we just want to extract the first and the last item from the below tuple, we can let the rest of the items go into the unnamed variable.

toto_result = (4,11,14,23,28,47,24)
first, *_, last = toto_result
print(first, last)

So the above will give the below output:

4 24

If you are curious about what is inside the “_”, you can try to print it out, and you will see it is actually a list of the items between the first and the last item.

[11, 14, 23, 28, 47]
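The starred name does not have to sit in the middle. The sketch below reuses toto_result to show it at the end and at the start; the variable names rest, winning_numbers, additional and nothing_left are made up for this illustration.

```python
toto_result = (4, 11, 14, 23, 28, 47, 24)

# Starred name at the end: everything after the first item
first, *rest = toto_result

# Starred name at the start: everything before the last item
*winning_numbers, additional = toto_result

# The starred name always receives a list, and it may be empty
only, *nothing_left = (1,)

print(rest)
print(winning_numbers, additional)
print(nothing_left)
```

Only one starred name is allowed per assignment target, otherwise Python cannot decide how to split the items.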

The most popular use case of packing and unpacking is passing parameters to a function which accepts an arbitrary number of arguments (*args). Let’s look at an example:

def sum(*numbers):
    total = 0
    for n in numbers:
        total += n
    return total

The above sum function accepts any number of arguments and sums up their values. The * in the parameter list packs all the positional arguments passed to the function into a tuple called numbers. (Note that naming the function sum shadows Python’s built-in sum for the rest of the script.) If you want to sum up all the items in toto_result, passing toto_result in directly would not work.

toto_result = (4,11,14,23,28,47,24)
#sum(toto_result) would raise TypeError

So what we can do is unpack the items from the tuple and then pass them to the sum function:

total = sum(*toto_result)
print(total)
#output should be 151

Unpack python list objects

Unpacking a list object works the same way as unpacking a tuple. If we replace the tuples with lists in the above examples, everything works just as well.

shape = [500, 300, 3]
height, width, channel = shape
print(height, width, channel)
#output shall be 500 300 3

toto_result = [4,11,14,23,28,47,24]
first, *_, last = toto_result
print(first, last)
#output shall be 4 24

total = sum(*toto_result)
print(total)
#output should be also 151

Unpack python dictionary objects

Unlike a list or tuple, unpacking a dictionary is mostly useful when you want to pass the dictionary as keyword arguments into a function (**kwargs).

For instance, in the below function, you can pass in all your keyword arguments one by one.

def print_header(**headers):
    for header in headers:
        print(header, headers[header])

print_header(Host="www.codeforests.com", referer="https://www.codeforests.com")

Or if you have a dictionary like below, you can just unpack it and pass to the function:

headers = {'Host': 'www.codeforests.com', 'referer' : 'https://www.codeforests.com'}
print_header(**headers)

It will generate the same result as previously, but the code is more concise.

Host www.codeforests.com
referer https://www.codeforests.com

With this unpacking operator, you can also combine multiple dictionaries as per below:

headers = {'Host': 'www.codeforests.com', 'referer' : 'https://www.codeforests.com'}
extra_header = {'user-agent': 'Mozilla/5.0'}

new_header = {**headers, **extra_header}

The output of the new_header will be like below:

{'Host': 'www.codeforests.com',
 'referer': 'https://www.codeforests.com',
 'user-agent': 'Mozilla/5.0'}
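One detail worth knowing when combining dictionaries this way is what happens if the same key appears in both. The sketch below uses a made-up override dictionary to show that the value from the later dictionary wins, while the original dictionaries are left untouched.

```python
headers = {'Host': 'www.codeforests.com', 'referer': 'https://www.codeforests.com'}
# Hypothetical second dictionary that repeats the 'referer' key
override = {'referer': 'https://www.google.com', 'user-agent': 'Mozilla/5.0'}

# Keys are taken left to right, so values from override replace those from headers
merged = {**headers, **override}

print(merged['referer'])
print(list(merged))
```

If the defaults should win instead, simply swap the order to {**override, **headers}.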

Conclusion

The unpacking operation is very useful, especially when dealing with *args and **kwargs. One thing worth noting is the unnamed variable (_) which I mentioned earlier. Please use it with caution: the Python interactive interpreter also uses _ to store the result of the last evaluated expression, so once you assign to _ yourself in an interactive session, it no longer shows that last result. Do take note of this potential conflict. See the below example:

codeforests interactive interpreter conflicts

As always, any comments or questions are welcome.

Get file names by extension from a directory

Whenever you work with directories and files, you will probably need a function that gets file names by extension from a particular directory. For instance, you may want to check and process all the Excel files in a folder, or do some housekeeping to remove all the old log files. In this article, I will explain a few ways of implementing such a function.

Let’s get started!

There are actually plenty of libraries/modules you can use to achieve it, but let’s start with the most commonly used libraries/modules.

Option 1

Since you will need to import the os module anyway if you need to handle the file operations, you can make use of the functions from this module.

For instance, you can list all the files and sub-directories under the current directory, and check whether each name ends with a certain file extension as per below:

import os

pyfiles = []
for file in os.listdir("."):
    if file.lower().endswith(".ipynb"):
        pyfiles.append(file)

You can further sort the files by last modified time from latest to the earliest.

pyfiles.sort(key=os.path.getmtime, reverse=True)

What if you want to check multiple file extensions? Don’t worry, you can still achieve it with a minor change to the if condition, since endswith also accepts a tuple of suffixes:

if file.lower().endswith((".ipynb", ".xlsx")):
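The snippets above work on the current directory, where a bare file name is also a valid path. For any other directory, os.path.getmtime needs the full path, so a reusable version might look like the sketch below. The function name list_files and the sample file names are made up for this example.

```python
import os
import tempfile


def list_files(directory, extensions):
    """Return full paths of files in directory matching extensions, newest first."""
    matches = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)  # getmtime needs the full path
        if os.path.isfile(path) and name.lower().endswith(extensions):
            matches.append(path)
    matches.sort(key=os.path.getmtime, reverse=True)
    return matches


# Usage with a throwaway folder; os.utime sets fixed modification times
with tempfile.TemporaryDirectory() as folder:
    for age, name in enumerate(["new.xlsx", "old.IPYNB", "skip.txt"]):
        path = os.path.join(folder, name)
        open(path, "w").close()
        os.utime(path, (1_000_000 - age, 1_000_000 - age))
    found = [os.path.basename(p) for p in list_files(folder, (".ipynb", ".xlsx"))]
    print(found)
```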

Option 2

The os module also has another function, scandir, which achieves the same and additionally returns file type and file attribute information with each entry.

files = []
for file in os.scandir("."):
    if file.name.lower().endswith((".ipynb", ".xlsx")):
        files.append(file.name)
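To make use of that extra information, the sketch below filters out sub-directories with is_file() and reads the size from stat(). The function name files_with_size and the sample names are assumptions for illustration.

```python
import os
import tempfile


def files_with_size(directory, extensions):
    """Return (name, size in bytes) for each regular file with a matching extension."""
    result = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # is_file() excludes sub-directories whose name happens to match
            if entry.is_file() and entry.name.lower().endswith(extensions):
                result.append((entry.name, entry.stat().st_size))
    return result


# Usage: one matching file, one folder with a misleading name, one other file
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "report.xlsx"), "wb") as f:
        f.write(b"12345")
    os.mkdir(os.path.join(folder, "archive.xlsx"))
    open(os.path.join(folder, "notes.txt"), "w").close()
    sizes = files_with_size(folder, (".ipynb", ".xlsx"))
    print(sizes)
```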

 

Option 3

If you don’t like the way the file names are matched in the above code, you can use fnmatch to do this job. For example:

import fnmatch
files = []
for file in os.listdir("."):
    if fnmatch.fnmatch(file, "*.ipynb") or fnmatch.fnmatch(file, "*.xlsx"):
        files.append(file)

 

Option 4

Python has a glob module which lets you use Unix-style patterns to match files. To match the files with a certain extension, you can simply do the below:

import glob
files = glob.glob("*.ipynb")

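And then sort by creation time from the latest to the earliest (note that os.path.getctime returns the creation time on Windows, but the time of the last metadata change on Unix systems):

files.sort(key=os.path.getctime, reverse=True)

If you want to match multiple file extensions, you can do something like the below: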

files = []
file_types = ("*.ipynb", "*.xlsx")
for file_type in file_types:
    files.extend(glob.glob(file_type))

files.sort(key=os.path.getctime, reverse=True)

As I mentioned earlier, there are far more ways of doing it and it would not be possible to list all of them, so I will stop here. Please leave your comments if you have better ideas.
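One of those further ways, not covered in the options above, is the standard pathlib module. The sketch below also searches sub-folders through rglob; the function name find_by_suffix and the sample files are made up for this example.

```python
import tempfile
from pathlib import Path


def find_by_suffix(directory, suffixes):
    """Return all files under directory (recursively) whose extension is in suffixes."""
    wanted = {suffix.lower() for suffix in suffixes}
    return [path for path in Path(directory).rglob("*")
            if path.is_file() and path.suffix.lower() in wanted]


# Usage with a throwaway folder that contains one sub-folder
with tempfile.TemporaryDirectory() as folder:
    root = Path(folder)
    (root / "sub").mkdir()
    (root / "top.xlsx").touch()
    (root / "sub" / "deep.IPYNB").touch()
    (root / "note.txt").touch()
    names = sorted(path.name for path in find_by_suffix(root, [".ipynb", ".xlsx"]))
    print(names)
```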

 

How to swap key and value in a python dictionary

There are cases where you may want to swap the key and value pairs in a Python dictionary, so that you can do some operations using the unique values of the original dictionary as keys.

For instance, if you have the below dictionary:

contact = {"joe" : "[email protected]", "john": "[email protected]"}

you can swap key and value of the dictionary by:

contact = {val : key for key, val in contact.items()}
print(contact)

You will see the below output:

{'[email protected]': 'joe', '[email protected]': 'john'}

But for the above dictionary, if multiple names share the same email address, then only one name (the last one encountered) will be retained, e.g.:

contact = {"joe" : "[email protected]", "jane" : "[email protected]", "john": "[email protected]"}
contact = {val : key for key, val in contact.items()}

Output of the contact dictionary will be :

{'[email protected]': 'jane', '[email protected]': 'john'}

So how do we keep all the keys that have the same value after reversing it?

You will need to use a list or set to collect all the keys if the value is the same, e.g.:

email_contact = {}
for key, val in contact.items():
    email_contact.setdefault(val, []).append(key)

(please refer to this article about the setdefault method)

And you will see the below output for the new dictionary email_contact:

{'[email protected]': ['joe', 'jane'], '[email protected]': ['john']}

That’s exactly what we want! Now we shall be able to say “hi” to both Joe and Jane when sending email to [email protected] without missing any names.
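The same grouping can also be written with collections.defaultdict, shown here as an alternative to setdefault. The example.com addresses and the function name invert are made up for this sketch.

```python
from collections import defaultdict

# Hypothetical addresses for illustration
contact = {"joe": "family@example.com",
           "jane": "family@example.com",
           "john": "john@example.com"}


def invert(mapping):
    """Swap keys and values, collecting all keys that share a value in a list."""
    inverted = defaultdict(list)  # a missing key starts out as an empty list
    for key, val in mapping.items():
        inverted[val].append(key)
    return dict(inverted)


email_contact = invert(contact)
# Addresses that are shared by more than one name
shared = {email: names for email, names in email_contact.items() if len(names) > 1}

print(email_contact)
print(shared)
```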

 

As always, any comments or questions are welcome.


Handling the KeyError for python dictionary


The KeyError is quite commonly seen when dealing with dictionary objects. It shows up when you try to access a key that does not exist in the dictionary. Usually, to avoid this error, we need to check whether the key exists before accessing the value.

For instance, you can check whether the key “country” exists in my_dict and then check whether the value is “SGP” like the below. But the code does not look elegant.

my_dict = {"name" : "National University of Singapore", "address" : "21 Lower Kent Ridge Rd Singapore", "contact": "68741616"}
if my_dict.get("country") and my_dict["country"] == "SGP":
    print(f"country code is {my_dict['country']}")

You may also see people use the below way to make the code more concise, passing in a default value to be returned if the key does not exist:

if my_dict.get("country", "") == "SGP":
    print(f"country code is {my_dict['country']}")

The Zen of Python tells us

Explicit is better than implicit.

So the above code does not quite follow this principle. If you go through the Python documentation for dictionaries, there is indeed a method, setdefault, that returns the value of a key and at the same time stores a default value if the key is new to the dictionary. The below code shows how it works:

if my_dict.setdefault("country", "") == "SGP":
    print(f"country code is {my_dict['country']}")

By doing the above, the key “country” is added to my_dict with the default value if it did not exist previously, and the value of this key is then returned. Be aware that, unlike get, this modifies the dictionary.
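The difference between the two methods can be checked with a small experiment. The shortened my_dict and the helper variable names below are made up for this sketch.

```python
my_dict = {"name": "National University of Singapore"}

# get returns the fallback value but leaves the dictionary untouched
country = my_dict.get("country", "")
keys_after_get = list(my_dict)

# setdefault returns the same fallback and also stores it under the key
country_again = my_dict.setdefault("country", "")
keys_after_setdefault = list(my_dict)

# For a key that already exists, setdefault returns the stored value unchanged
name = my_dict.setdefault("name", "something else")

print(keys_after_get)
print(keys_after_setdefault)
print(name)
```

So get is the safer choice for a pure lookup, while setdefault fits when the key is meant to exist afterwards.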

To extend the above setdefault usage: if the value is a list of objects, you can also use this method to initialize the list and then add items to it.

my_dict.setdefault("faculty", []) # use list or set()
my_dict["faculty"].append("Arts")
my_dict["faculty"].append("Computer Science")

 

As always, any comments or questions are welcome.

 


How to send email from outlook in python

In the previous article, I explained how to read emails and save attachments from Outlook by using the pywin32 library. In this article, I will walk you through how to send email from Outlook with the same library.

Prerequisite:

You need to install the pywin32 library in your working environment.

pip install pywin32

and import this library in your script.

import win32com.client

Let’s get started!

You will first need to initiate the outlook application by calling the below:

outlook = win32com.client.Dispatch('outlook.application')

In Outlook, emails, meeting invites, calendar entries, appointments etc. are all considered Item objects. Hence we can use the below to create an email object (0 is the item type constant for a mail item):

mail = outlook.CreateItem(0)

For this mail item, there are various attributes we can set, such as To, CC, BCC, Subject, Body and HTMLBody, as well as the Attachments:

mail.To = '[email protected]'
mail.CC = '[email protected]'
mail.Subject = 'Sample Email'
mail.Body = "This is the normal body"
# Setting HTMLBody also updates Body, so assign it last if you want the HTML version sent
mail.HTMLBody = '<h3>This is HTML Body</h3>'
mail.Attachments.Add('c:\\sample.xlsx')
mail.Attachments.Add('c:\\sample2.xlsx')

You can add multiple attachments by calling the Attachments.Add multiple times.

Trigger to send out email from outlook

With the above attributes set, you should be able to send out the email since all the necessary info is provided. The below line of code triggers Outlook to send the email.

mail.Send()

You may also wonder: what if you just want to reply to a particular email instead of writing a new one? In this case, you will need to find the email message first and then use message.Reply() or message.ReplyAll() to reply to the original message. Do check out this article of mine.

Conclusion:

This is just a sample demo of how to send emails. There are plenty of other things you can do with the pywin32 library, so do check my other related articles, such as this.

Last but not least, any comments or questions are welcome.

Fix the CompDocError when reading excel file with xlrd

CompDocError

You may have seen this CompDocError before if you have used the Python xlrd library to read an older-format Excel file (.xls), even though opening the same file directly in Microsoft Excel shows the data properly without any issue.

This usually happens when the Excel file is generated by a 3rd party application that does not strictly follow the Microsoft Excel standard format. The file is readable by Excel, but opening it with the xlrd library fails due to the non-standard format or some missing metadata. As you may have no control over how the 3rd party application generates the file, you will need to find a way to handle this CompDocError in your code.

 

SOLUTIONS FOR COMPDOCERROR

 

Option 1:

If you look at the error message, the error is raised from line 427 of compdoc.py in your xlrd package (the line number may differ in your xlrd version). Since you have confirmed there is no problem with the data in your Excel file apart from the minor format issue, you can open compdoc.py and comment out the lines that raise the CompDocError exception.

while s >= 0:
    if self.seen[s]:
        pass
        #print("_locate_stream(%s): seen" % qname, file=self.logfile); dump_list(self.seen, 20, self.logfile)
        #raise CompDocError("%s corruption: seen[%d] == %d" % (qname, s, self.seen[s]))

Option 2:

You may notice that if you open your file in Microsoft Excel and save it, xlrd can then read it and no exception is raised. This is because Excel fixes the issues for you when saving the file. You can use the same approach in your code to fix this problem.

To do that, you can use the pywin32 library to open the native Excel application and re-save the file.

 

import os
import win32com.client as win32

excel_app = win32.Dispatch('Excel.Application')
# Excel resolves relative paths against its own working directory, so pass an absolute path
wb = excel_app.Workbooks.Open(os.path.abspath("test.xls"))
excel_app.DisplayAlerts = False #do not show any alert when closing the excel
wb.Save()
excel_app.Quit()

 

Conclusion

 

Option 1 is good if your program only reads files generated from the same source. If your program needs to read different Excel files from different sources, it may not be a good idea to always assume the “CompDocError” can be ignored.

 

With option 2, calling excel_app.Quit() closes the entire Excel application without any alert. If you have other Excel files open at the time, they will all be closed together. So this solution is good if your program runs in a standalone environment, or if you have confirmed that no other process or person will be using Excel while your code is running.

 

If you would like to understand more about how to read & write excel file with xlrd, please check this article.