Python comprehension Photo by Karsten Würth on Unsplash

Python comprehensions for list, set and dictionary

Introduction

Python comprehension is a set of looping and filtering instructions for evaluating expressions and producing sequence output. It is commonly used to construct list, set or dictionary objects which are known as list comprehension, set comprehension and dictionary comprehension. Comparing to use map or filter or nested loops to generate the same result, comprehension has more concise syntax and improved readability. In this article, I will be explaining these three types of comprehensions with some practical examples.

Python comprehension basic syntax

You may have seen people using list/set/dict comprehension in their code, but if you have not yet checked the Python documentation, below is the syntax for Python comprehension.

Assignment Expression for target in iterable [if condition]

It requires a single assignment expression with at least one for statement, and following by zero or more for or if statements.

With this basic understanding, let’s dive into the examples.

List comprehension

List comprehension uses for and if statements to construct a list literal. The new list can be derived from any sequence like string, list, set and dictionary etc. For instance, if we have a list like below:

words = [
    "Serendipity",
    "Petrichor",
    "Supine",
    "Solitude",
    "Aurora",
    "Idyllic",
    "Clinomania",
    "Pluviophile",
    "Euphoria",
    "Sequoia"]

Single loop/if statement

You can use list comprehension to derive a new list which only keeps the elements with length less than 8 from the original list:

short_words = [word for word in words if len(word) < 8 ]

If you examine the short_words, you shall see only the short words (length less than 8) were selected to form a new list:

['Supine', 'Aurora', 'Idyllic', 'Sequoia']

Multiple if statements

As described earlier in the syntax, you can have multiple if conditions to filter the elements:

short_s_words = [word for word in words if len(word) < 8 if word.startswith("S") ]
#short_s_words = [word for word in words if len(word) < 8 and word.startswith("S") ]

The above two would generate the same result as per below:

['Supine', 'Sequoia']

Similarly, you can also use or in the if statement:

short_or_s_words = [word for word in words if len(word) < 8 or word.startswith("S") ]

You shall see the below result for the short_or_s_words variable:

['Serendipity', 'Supine', 'Solitude', 'Aurora', 'Idyllic', 'Sequoia']

Multiple loop/if statements

Sometimes you may have nested data structure and you would like to make it a flatten structure. For instance, to transform a nested list into a flatten list, you can make use of the list comprehension with multiple loops as per below:

lat_long = [[1.291173,103.810535], [1.285387,103.846082], [1.285803,103.845392]]
[x for pos in lat_long for x in pos]

Python will evaluate these expressions from left to right until the innermost block is reached. You shall see the nested list has been transformed into a flatten list:

[1.291173, 103.810535, 1.285387, 103.846082, 1.285803, 103.845392]

And similarly if you have multiple sequences to be iterated through, you can have multiple for statements in your comprehension or use zip function depends on what kind of results you what to achieve:

[(word, num) for word in words if word.startswith("S") for num in range(4) if num%2 == 0]

The above code would generate the output as per below:

[('Serendipity', 0),
 ('Serendipity', 2),
 ('Supine', 0),
 ('Supine', 2),
 ('Solitude', 0),
 ('Solitude', 2),
 ('Sequoia', 0),
 ('Sequoia', 2)]

If you use zip as per below, it will generates some different results.

[(word, num) for word, num in zip(words, range(len(words))) if word.startswith("S") and num%2 == 0]

Another practical example would be using list comprehension to return the particular type of files from the current and its sub folders. For instance, below code would list out all all the ipynb files from current and its sub folder but excluding the checkpoints folder:

import os

[os.path.join(d[0], f) for d in os.walk(".") if not ".ipynb_checkpoints" in d[0]
             for f in d[2] if f.endswith(".ipynb")]

Generate tuples from list comprehension

As you can see from the above examples, the list comprehension supports to generate list of tuples, but do take note that you have to use parenthesis e.g.: (word, len(word)) in the expression to indicate the expected output to be a tuple, otherwise there will be a syntax error:

[(word, len(word)) for word in words]

Set comprehension

Similar to list comprehension, the set comprehension uses the same syntax but constructs a set literal. For instance:

words_set = set(words)
short_words_set = {word for word in words_set if len(word) < 8}

The only difference between list comprehension and set comprehension is the square braces “[]” changed to curly braces “{}”.  And you shall see the same result as previous example except the data type now is a set:

{'Aurora', 'Idyllic', 'Sequoia', 'Supine'}

And same as list comprehension, any iterables can be used in the set comprehension to derive a new set. So using the list directly in below will also produce the same result as the above example.

short_words_set = {word for word in words if len(word) < 8}

Due to the nature of the set data structure, you shall expect the duplicate values to be removed when forming a new set with set comprehension.

Dictionary comprehension

With enough explanation in above, i think we shall directly jump into the examples, since everything is the same as list and set comprehension except the data type.

Below is an example:

dict_words = {word: len(word) for word in words}

It produces a new dictionary as per below:

{'Serendipity': 11,
 'Petrichor': 9,
 'Supine': 6,
 'Solitude': 8,
 'Aurora': 6,
 'Idyllic': 7,
 'Clinomania': 10,
 'Pluviophile': 11,
 'Euphoria': 8,
 'Sequoia': 7}

And similarly, you can do some filtering with if statements:

s_words_dict = {word: length for word, length in dict_words.items() if word.startswith("S")}

You can see only the keys starting with “s” were selected to form a new dictionary:

{'Serendipity': 11, 'Supine': 6, 'Solitude': 8, 'Sequoia': 7}

You can check another usage of dictionary comprehension from this post – How to swap key and value in a python dictionary

Limitations and constraints

With all the above examples, you may find comprehension makes our codes more concise and clearer comparing to use map and filter:

list(map(lambda x: x, filter(lambda word: len(word) < 8, words)))

But do bear in mind not to overuse it, especially if you have more than two loop/if statements, you shall consider to move the logic into a function, rather than put everything into a singe line which causes the readability issue.

The Python comprehension is designed to be simple and only support the for and if statements, so you will not be able to put more complicated logic inside a single comprehension.

Finally, if you have a large set of data, you shall avoid using comprehension as it may exhaust all the memory and causes your program to crash. An alternative way is to use the generator expression, which has the similar syntax but it produces a generator for later use. For instance:

w_generator = ((word, length) for word, length in dict_words.items() if word.startswith("S"))

It returns a generator and you can consume the items one by one:

for x in w_generator:
    print(x)

You can see the same result would be produced:

('Serendipity', 11)
('Supine', 6)
('Solitude', 8)
('Sequoia', 7)

Conclusion

In this article, we have reviewed though the basic syntax of the Python comprehension with some practical examples for list comprehension, set comprehension and dictionary comprehension. Although it is so convenient to use Python comprehension, you will still need to think about the readability/clarity of your code to other people and also the potential memory issues when you are working on a huge data set.

You may also like

0 0 vote
Article Rating
Subscribe
Notify of
guest
0 Comments
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x