Tutorials

Python regular expression match, search and findall

Python beginners are sometimes confused by the match and search functions in the regular expression module (re), since they accept the same parameters and return the same result in most simple use cases. In this article, let’s discuss the difference between these two functions.

match vs search in Python regular expression

Let’s start with an example. Say we want to find the words ending with “ese” in a list of languages and get the part that comes before “ese”. Both match and search below return the same match object:

import re
languages = "Japanese,English"
# raw strings keep the backslash in \w intact
m = re.match(r"\w+(?=ese)", languages)
# m is: <re.Match object; span=(0, 5), match='Japan'>

m = re.search(r"\w+(?=ese)", languages)
# m is: <re.Match object; span=(0, 5), match='Japan'>

But if the order of the languages changes, e.g. languages = “English,Japanese”, you will see different results:

languages = "English,Japanese"
m = re.match(r"\w+(?=ese)", languages)
# m is None
m = re.search(r"\w+(?=ese)", languages)
# m is: <re.Match object; span=(8, 13), match='Japan'>

The reason is that the match function only tries to match at the beginning of the string, while the search function scans through the string and returns the first location where the pattern matches. Hence, if the pattern you want to match may not start at the beginning, you should use the search function.
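Because both functions return None when nothing matches, it helps to check the result before reading from it. The following is a minimal sketch of that check; the helper name first_ese_stem is made up for illustration and is not part of the re module.

```python
import re

def first_ese_stem(text):
    """Return the part before "ese" of the first such word, or None."""
    m = re.search(r"\w+(?=ese)", text)
    # search returns None when the pattern is not found anywhere
    return m.group() if m else None

print(first_ese_stem("English,Japanese"))  # Japan
print(first_ese_stem("English,French"))    # None
```

Calling m.group() directly on a None result would raise an AttributeError, which is why the check comes first.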

If you want to restrict the match to the beginning of the string, you can also achieve that with the search function by putting “^” at the start of your pattern:

languages = "English,Japanese,Chinese"
m = re.search(r"^\w+(?=ese)", languages)
# m is None
m = re.search(r"\w+(?=ese)", languages)
# m is: <re.Match object; span=(8, 13), match='Japan'>
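One caveat worth knowing: search with “^” and match are not fully interchangeable once the re.MULTILINE flag is involved. The sketch below uses a made-up two-line string to show the difference; it is only an illustration, not something the example above depends on.

```python
import re

text = "English\nJapanese"

# match only looks at the very start of the string, even with re.MULTILINE
at_start = re.match(r"\w+(?=ese)", text, re.MULTILINE)

# with re.MULTILINE, "^" also matches right after each newline
per_line = re.search(r"^\w+(?=ese)", text, re.MULTILINE)

print(at_start)          # None
print(per_line.group())  # Japan
```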

findall in Python regular expression

You may also notice that when there are multiple occurrences of the pattern, the search function only returns the first match. This may not be what you want when you need the full list of matches. To return all the non-overlapping occurrences as a list of strings, you can use the findall function:

languages = "English,Japanese,Chinese,Burmese"
m = re.findall(r"\w+(?=ese)", languages)
# m is: ['Japan', 'Chin', 'Burm']
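If you also need the position of each match, or the whole word rather than the part before “ese”, two small variations may help. This is a sketch on the same sample string; the variable names stems and words are just for illustration.

```python
import re

languages = "English,Japanese,Chinese,Burmese"

# finditer yields match objects, so the span of each match is available too
stems = [(m.group(), m.span()) for m in re.finditer(r"\w+(?=ese)", languages)]
print(stems)  # [('Japan', (8, 13)), ('Chin', (17, 21)), ('Burm', (25, 29))]

# consuming "ese" instead of only looking ahead returns the whole word
words = re.findall(r"\w+ese\b", languages)
print(words)  # ['Japanese', 'Chinese', 'Burmese']
```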

 

 

python read and write json file

Read and write json file in python

The JSON file format is commonly used in most programming languages to store data or to exchange data between back end and front end, or between different applications and systems. In this article, I will explain how to read and write a JSON file in Python.

Read from a JSON file

Python has a built-in json module which makes reading and writing JSON pretty easy. First, let’s assume we have the below example.json file to read.

{
"link": "www.codeforests.com",
"name": "ken", 
"member": true, 
"hobbies": ["jogging", "watching movie"]
}

To read the file, we can simply use the load function and pass in the file object.

import json
with open("example.json") as f:
    example = json.load(f)

Now you can access the example dictionary for the data, e.g.

print(example["hobbies"])

The output would be:

['jogging', 'watching movie']

Write into JSON file

Let’s continue with the previous example and add one more hobby to the hobbies list, then save the JSON object into a file.

This time, you can use json.dump and pass in the file object to write to:

example["hobbies"].append("badminton")
with open("example.json", "w") as f:
    json.dump(example, f)
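To see the read and write steps working together, here is a small round-trip sketch. It writes to a temporary folder instead of your working directory, and the indent=4 argument is an optional extra (not used above) that makes the saved file easier to read.

```python
import json
import os
import tempfile

example = {
    "link": "www.codeforests.com",
    "name": "ken",
    "member": True,  # written to the file as JSON true
    "hobbies": ["jogging", "watching movie"],
}
example["hobbies"].append("badminton")

# a throwaway location, so the sketch does not touch your own files
path = os.path.join(tempfile.mkdtemp(), "example.json")

with open(path, "w") as f:
    json.dump(example, f, indent=4)

with open(path) as f:
    loaded = json.load(f)

print(loaded["hobbies"])  # ['jogging', 'watching movie', 'badminton']
```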

If you look at the json documentation, there are two more functions: json.loads and json.dumps. The main difference between these two and json.load & json.dump is that loads and dumps work with the str representation of the JSON object instead of a file, e.g.:

obj = json.loads('{"json":"obj"}')
print(obj)
print(json.dumps({"json":"obj"}))
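When the string comes from outside your program, it may not be valid JSON, in which case json.loads raises json.JSONDecodeError. A minimal sketch of handling that follows; the helper name parse_or_none is hypothetical.

```python
import json

def parse_or_none(text):
    """Parse a JSON string, returning None if it is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

print(parse_or_none('{"json":"obj"}'))  # {'json': 'obj'}
print(parse_or_none('{"json":'))        # None
```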

 

How to read and write configuration (.ini) file in python

There are several file formats you can use for your configuration file; the most commonly used formats are .ini, .json and .yaml. In this article, I will share how to read and write your configurations in the .ini file format.

Read .ini file

Below is an example of an ini file. You can define as many sections (e.g. [LOGIN]) as you want to separate the different configuration info.

[LOGIN]
user = admin
#Please change to your real password
password = admin

[SERVER]
host = 192.168.0.1
port = 8088

In Python, there is already a module, configparser, to read and parse the information from the ini file into dictionary-like objects. Assuming you have saved the above as config.ini in your current folder, you can use the below lines of code to read it.

import configparser

config = configparser.ConfigParser()
config.read("config.ini")
login = config['LOGIN']
server = config['SERVER']

You can assign each of the sections to a separate variable and then access its values like a dictionary. The output should be the same as below:

[Image: codeforests read ini file]

Note that a line starting with the # symbol (or ;) is treated as a comment and omitted when parsing the keys and values.

Also, all the values are read as strings, so you will need to do your own data type conversion after reading them.
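The points above can be sketched as follows. To keep the snippet self-contained it parses the same content from a string with read_string instead of reading config.ini, and it uses the getint helper of configparser for the conversion. The "timeout" key is hypothetical and only there to show the fallback argument.

```python
import configparser

ini_text = """
[LOGIN]
user = admin
#Please change to your real password
password = admin

[SERVER]
host = 192.168.0.1
port = 8088
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

login = config["LOGIN"]
server = config["SERVER"]

print(config.sections())     # ['LOGIN', 'SERVER']
print(login["user"])         # admin
print(repr(server["port"]))  # '8088' (a string)

port = server.getint("port")  # converted to the integer 8088
# "timeout" does not exist in the file, so the fallback value is returned
timeout = server.getint("timeout", fallback=30)
```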

Write .ini file

Now let’s see how we can write to an ini file.

You will still need the configparser library; the idea is to set the keys and values on the configparser object and then save it into a file.

config = configparser.ConfigParser()
# create the section only when it does not exist yet
if not config.has_section("INFO"):
    config.add_section("INFO")
config.set("INFO", "link", "www.codeforests.com")
config.set("INFO", "name", "ken")

with open("example.ini", 'w') as configfile:
    config.write(configfile)

This creates the example.ini file with the below content:

[INFO]
link = www.codeforests.com
name = ken
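As an alternative sketch, a section can also be filled by assigning a dictionary to it. The snippet below writes into an in-memory buffer rather than example.ini, purely so that the result can be inspected and parsed back without touching the disk.

```python
import configparser
import io

config = configparser.ConfigParser()
# assigning a dict creates the section and sets all its keys in one step
config["INFO"] = {"link": "www.codeforests.com", "name": "ken"}

buffer = io.StringIO()
config.write(buffer)
text = buffer.getvalue()
print(text)

# parse the written text back to confirm it round-trips
check = configparser.ConfigParser()
check.read_string(text)
print(check["INFO"]["name"])  # ken
```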

I have written two separate articles to cover the .json and .yaml formats; please have a look if you are interested.

As always, any comments or questions are welcome.

openpyxl write excel with styles

Openpyxl library to write excel file with styles

Openpyxl is probably the most popular Python library for reading and writing Excel xlsx files. In this article, I will share how to write data into an Excel file with some formatting.

Let’s get started with openpyxl.

If you have not yet installed the openpyxl library in your working environment, you can use the below command to install it.

pip install openpyxl

Then import the library and the style classes at the beginning of the script:

import openpyxl
from openpyxl.styles import Alignment, Border, Side, Font

Now I am going to create a new workbook with the sheet name “Demo”:

workbook = openpyxl.Workbook()
sheet = workbook.active
sheet.title = "Demo"

Assume you have the below data that you want to write into the Excel file:

raw_data = [["University Name", "No. of Students", "Address", "Contact"],
 ["National University of Singapore", "35908", "21 Lower Kent Ridge Rd, Singapore 119077", "68741616"],
 ["Nanyang Technological University", "31687", "50 Nanyang Ave, 639798", "67911744"],
 ["Singapore Management University", "8182", "81 Victoria St, Singapore 188065", "68280100"]]

You can loop through the list to get each value and assign it to a particular cell. Note that Excel row and column indexes always start from 1.

for row_idx, rec in enumerate(raw_data):
    for col_idx, val in enumerate(rec):
        sheet.cell(row=row_idx+1, column=col_idx+1).value = val

If you save your data now via the below code, you will see that the saved Excel file does not come with any styling, only the default formatting:

workbook.save("Demo.xlsx")

[Image: the saved Demo.xlsx with default formatting]

As you can see, the format does not look good and some of the column widths need to be adjusted in order to see the full content. Let’s apply some styling to the cells.

Let’s draw borders for each of the cells. You can specify the color of the border as well as the border style. For more border styles, you can refer to the openpyxl documentation. You can also use a different style for each side of the border.

thin = Side(border_style="thin", color="303030")
black_border = Border(top=thin, left=thin, right=thin, bottom=thin)

You can also set a different width for each column, as below:

sheet.column_dimensions["A"].width = 27
sheet.column_dimensions["B"].width = 12
sheet.column_dimensions["C"].width = 33
sheet.column_dimensions["D"].width = 8
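Instead of hard-coding the widths, you could derive a starting point from the data itself. The helper below is a rough sketch in plain Python; the name suggest_widths and the padding of 2 are assumptions for illustration, and the result is only an approximation because the real rendered width depends on the font and on text wrapping.

```python
from string import ascii_uppercase

def suggest_widths(rows, padding=2):
    """Map column letters to a width based on the longest value per column."""
    widths = {}
    for row in rows:
        for col_idx, val in enumerate(row):
            letter = ascii_uppercase[col_idx]  # only covers columns A to Z
            widths[letter] = max(widths.get(letter, 0), len(str(val)) + padding)
    return widths

# a made-up two-column sample
sample = [["Name", "Contact"], ["Singapore Management University", "68280100"]]
print(suggest_widths(sample))
```

With the worksheet from this article, each entry could then be applied in a loop via sheet.column_dimensions[letter].width = width.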

Define your own font style:

font = Font(name='Calibri', size=9, bold=True, color='07101c')

Define the alignment style. You can use a different alignment style for different columns; here I define just one style for all cells.

align = Alignment(horizontal="center", wrap_text=True, vertical="center")

Next, let’s apply the above styles to each of the cells and save the workbook:

row_num = len(raw_data)  # number of rows written to the sheet
for label in ["A", "B", "C", "D"]:
    for row_idx in range(row_num):
        idx = label + str(row_idx + 1)  # cell reference, e.g. "A1"
        sheet[idx].alignment = align
        sheet[idx].font = font
        sheet[idx].border = black_border

workbook.save("Demo.xlsx")

The final output should be similar to the below, which looks much better with the styling.

[Image: openpyxl write excel with styles]

As always, any comments or questions are welcome.

python read email from outlook and save attachment

How to read email from outlook in python

There are always scenarios where you want a program to automatically read emails from Outlook and do some processing based on certain criteria. The most common use case is to automatically process email attachments when receiving scheduled reports. In this article, I will explain how to use Python to read emails from Outlook and save the attachment files into a specified folder.

Prerequisites:

In order to access the native Outlook application, we will need to make use of the pywin32 library. Make sure you have installed this library and imported it into your script.

import win32com.client
#other libraries to be used in this script
import os
from datetime import datetime, timedelta

Let’s get started!

As when communicating with any other system or app, you need to initiate a session first. By calling the GetNamespace function, you can get the Outlook session for the subsequent operations.

outlook = win32com.client.Dispatch('outlook.application')
mapi = outlook.GetNamespace("MAPI")

If you have configured multiple accounts in Outlook, you need to pass in the account name when accessing its folders; we can cover this topic in another article. For this article, let’s assume we only have one account configured in Outlook. You can list the configured accounts as below:

for account in mapi.Accounts:
    print(account.DeliveryStore.DisplayName)

To access the inbox folder, you need to pass the folder type 6 to the below function. You may refer to this doc to understand the full list of folder types, such as the drafts, outbox, sent and deleted items folders.

inbox = mapi.GetDefaultFolder(6)
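The bare number 6 is not very readable, so you might prefer named constants. The sketch below mirrors a few members of Outlook’s OlDefaultFolders enumeration as a Python IntEnum; the Python class and member names are my own naming for illustration, only the numeric values come from Outlook.

```python
from enum import IntEnum

class OlDefaultFolders(IntEnum):
    """A few Outlook default folder types (a partial list)."""
    DELETED_ITEMS = 3
    OUTBOX = 4
    SENT_MAIL = 5
    INBOX = 6
    DRAFTS = 16

# int() gives the plain number to pass along,
# e.g. mapi.GetDefaultFolder(int(OlDefaultFolders.INBOX))
print(int(OlDefaultFolders.INBOX))  # 6
```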

What if your email is in a sub folder under your inbox? The folder returned by GetDefaultFolder has a Folders attribute through which you can access a sub folder by its name. For instance, to access “your_sub_folder” under the inbox folder:

inbox = mapi.GetDefaultFolder(6).Folders["your_sub_folder"]

Read email from outlook

Now you have access to the inbox and its sub folders. You can view all the messages by getting the items as below. But you may want to filter the messages by certain criteria, such as the received date, sender or subject. To do that, we can apply some filter conditions to the messages.

messages = inbox.Items

Use the Restrict function to filter your email messages. For instance, we can filter for messages received in the past 24 hours, sent by “contact@codeforests.com”, with the subject “Sample Report”:

received_dt = datetime.now() - timedelta(days=1)
# %I is the 12-hour clock, which pairs with the AM/PM marker %p
received_dt = received_dt.strftime('%m/%d/%Y %I:%M %p')
messages = messages.Restrict("[ReceivedTime] >= '" + received_dt + "'")
messages = messages.Restrict("[SenderEmailAddress] = 'contact@codeforests.com'")
messages = messages.Restrict("[Subject] = 'Sample Report'")
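The filter strings can also be assembled in one place, which makes them easier to print and inspect before handing them to Outlook. The helper below is a sketch in plain Python; the name build_filter is hypothetical, it uses the 12-hour %I so that the hour agrees with the AM/PM marker, and it joins the conditions with AND on the assumption that you want a single Restrict call instead of three.

```python
from datetime import datetime

def build_filter(received_after, sender, subject):
    """Build one filter string from the three conditions used above."""
    stamp = received_after.strftime("%m/%d/%Y %I:%M %p")
    conditions = [
        f"[ReceivedTime] >= '{stamp}'",
        f"[SenderEmailAddress] = '{sender}'",
        f"[Subject] = '{subject}'",
    ]
    return " AND ".join(conditions)

# a fixed date is used here only to make the output predictable
print(build_filter(datetime(2024, 1, 2, 14, 30),
                   "contact@codeforests.com", "Sample Report"))
```

The resulting string could then be passed as messages.Restrict(build_filter(...)).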

Save attachment files

With all the above filters, we should only have the messages that we are interested in. Let’s loop through the messages and save their attachments.

# Let's assume we want to save the email attachments to the below directory
outputDir = r"C:\attachment"
try:
    for message in list(messages):
        try:
            s = message.Sender
            for attachment in message.Attachments:
                attachment.SaveAsFile(os.path.join(outputDir, attachment.FileName))
                print(f"attachment {attachment.FileName} from {s} saved")
        except Exception as e:
            print("error when saving the attachment:" + str(e))
except Exception as e:
    print("error when processing emails messages:" + str(e))
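Scheduled reports often arrive with the same attachment name every time, so two emails would target the same path in the output directory. One possible way around that is to pick a free file name first. The helper below is a sketch using only the standard library; the name unique_path and the _1, _2 suffix scheme are assumptions for illustration, and the demo runs against a temporary folder rather than C:\attachment.

```python
import os
import tempfile

def unique_path(directory, filename):
    """Return a path in directory that does not exist yet."""
    os.makedirs(directory, exist_ok=True)  # create the folder if needed
    base, ext = os.path.splitext(filename)
    candidate = os.path.join(directory, filename)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(directory, f"{base}_{counter}{ext}")
        counter += 1
    return candidate

demo_dir = tempfile.mkdtemp()
first = unique_path(demo_dir, "report.xlsx")
open(first, "w").close()  # pretend the first attachment was saved
second = unique_path(demo_dir, "report.xlsx")
print(os.path.basename(second))  # report_1.xlsx
```

In the loop above, the result of unique_path(outputDir, attachment.FileName) could be passed to SaveAsFile in place of the os.path.join call.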

There are other attributes such as Body, Size, Subject and LastModificationTime; please check the Microsoft documentation for more details.

If the particular problem you are trying to solve is not covered in this article, you may check my other post, 5 Tips For Reading Email From Outlook In Python. You may also be interested in how to send email from Outlook in Python; please check this article.

As always, any comments or questions are welcome. Follow me on twitter for more updates.