Data Structures in Python

Learn via video course
FREE
View all courses
Python Course for Beginners With Certification: Mastering the Essentials
Python Course for Beginners With Certification: Mastering the Essentials
by Rahul Janghu
1000
4.90
Start Learning
Python Course for Beginners With Certification: Mastering the Essentials
Python Course for Beginners With Certification: Mastering the Essentials
by Rahul Janghu
1000
4.90
Start Learning
Topics Covered

In the vast and evolving programming world, data structures in Python are the essential building blocks that help store and organize information efficiently.

This article delves into the core data structures in Python, such as Lists, Dictionaries, Tuples, and Sets, elucidating their distinct features and the ways they can be leveraged to tackle diverse programming problems.

Grasping these fundamental data structures in Python is crucial for boosting your coding efficiency and enhancing your ability to solve complex challenges.

Python Lists

In Python, data structures such as lists act as dynamic arrays capable of containing a varied range of elements. This is in contrast to arrays in many other programming languages that mandate uniformity in element type. The flexibility of these data structures in Python makes lists incredibly valuable for a multitude of programming tasks, paralleling the functionality of vectors in C++ or ArrayLists in Java.

Creating a Python list is straightforward and can be accomplished as illustrated below.

Example:

Output:

Python lists are indexed starting from 0, allowing for easy access to any element based on its position. The flexibility of lists extends to accessing elements, including slicing techniques and even negative indexing.

Example:

Output:

This example demonstrates the creation of a simple list, a nested list, and the flexibility of accessing elements directly or through negative indexing, showcasing the adaptability and power of Python lists in managing collections of data.

Dictionary in Python

One of the key data structures in Python is the dictionary, which maps keys to values and operates akin to hash tables in other programming languages. As an integral part of data structures in Python, dictionaries provide a highly efficient method of data storage with a time complexity of O(1) for retrieving values.

Dictionaries in Python can be created using curly braces {} with key-value pairs separated by colons, or through dictionary comprehension for more dynamic constructions.

Example:

Output:

This example showcases the creation of dictionaries, accessing elements using keys, the utility of the get() method for safe access, and the power of dictionary comprehension for generating dictionaries dynamically.

Tuple in Python

Tuples in Python are akin to lists in that they can hold a collection of items. However, a key distinction lies in their immutability; once a tuple is created, its contents cannot be altered, meaning items cannot be added or removed. This characteristic makes tuples a secure option for storing data that should remain constant throughout the execution of a program.

Creating a tuple in Python can be done simply by separating values with commas. While parentheses are commonly seen enclosing tuple elements, they are not mandatory unless needed for syntactical clarity.

Example:

Output:

This example highlights the process of creating and using tuples, from constructing them with string values or converting a list into a tuple, to accessing their elements via positive and negative indexing.

Sets in Python

Sets in Python are a powerful data structures in python that offer a unique way to store data - they are unordered, mutable, and specifically designed to prevent duplicate elements. Python sets are underpinned by a hashtable, a highly efficient data structures in python that allows for rapid insertion, deletion, and lookup operations, typically in O(1) average time complexity.

In Python, sets are often implemented using a dictionary-like structure internally, with dummy values to represent the elements of the set. This clever use of dictionaries allows Python to optimize the performance of sets significantly.

Example:

Output:

This example illustrates the creation of a set with both numeric and string elements, showcasing the automatic elimination of duplicate entries. It also demonstrates how to iterate over a set and perform membership tests.

Frozen sets in Python

Frozen sets in Python stand out as immutable collections, meaning once they're created, their elements cannot be changed. This characteristic distinguishes them from regular sets, which are mutable and allow modifications such as adding or removing elements. Frozen sets support only those operations that don't alter the set itself, making them ideal for use as dictionary keys or elements of another set, thanks to their hashable property.

To create a frozenset, simply pass an iterable to the frozenset() constructor. When no argument is given, an empty frozenset is generated.

Example:

Output:

This example highlights the key differences between regular and frozen sets. While you can freely modify a regular set, attempting to alter a frozenset, such as adding new elements, will result in an error, underscoring the immutable nature of frozensets. This immutability makes frozensets a reliable choice for data that should remain constant throughout the execution of a program.

Python Strings

In Python, strings are sequences of Unicode characters, essentially serving as immutable arrays. This means that once a string is created, the characters within it cannot be changed. In Python, there is no distinct type for a single character; it is considered a string with a length of one, contrasting with certain other programming languages that distinguish between characters and strings. This immutability implies that any attempt to modify a string will result in the creation of an entirely new string, rather than an alteration of the original.

Example:

Output:

This example illustrates basic string operations in Python, such as creating a string and accessing its first and last characters. The ability to use negative indexing allows for easy access to the end of the string, highlighting the flexibility and power of string handling in Python programming.

Python Bytearray

In Python, the bytearray type offers a mutable sequence of integers, each in the range of 0 to 255. This feature is particularly useful when you need to work with data that requires changes, such as manipulating binary data or files. Unlike strings or tuples, bytearrays can be modified in place, making them a versatile tool for operations that involve byte-level modifications.

Here's how to work with Python bytearray:

Example:

Output:

This example demonstrates creating a bytearray, accessing and modifying its elements, and appending new elements to it. The ability to directly manipulate the sequence makes bytearrays incredibly useful for processing and transforming binary data in various applications.

Collection module in Python

The collections module in Python is a built-in module that provides alternatives to Python's general-purpose built-in containers, such as dict, list, set, and tuple. It includes a set of specialized container datatypes that offer various alternatives to Python's built-in containers. Here's an overview and examples of some of the most commonly used containers within the collections module:

Counters

Counters are a subclass of dictionary designed to count hashable objects. They are an incredibly useful tool for tallying elements and implementing counting algorithms.

Example:

Output:

OrderedDict

An OrderedDict is a subclass of the dictionary that maintains the sequence of items as they are added, facilitating the insertion and removal of elements.

Example:

Output:

DefaultDict

A DefaultDict is a dictionary subclass that calls a factory function to supply missing values, simplifying handling of missing keys.

Example:

Output:

ChainMap

A ChainMap groups multiple dictionaries into a single view, making it convenient to manage multiple scopes (e.g., variable scopes in programming languages).

Example:

Output:

NamedTuple

NamedTuples enable the creation of objects akin to tuples that are indexable, iterable, and whose fields can be accessed via attribute lookup.

Example:

Output:

Deque

A deque (double-ended queue) is a generalization of stacks and queues which supports memory-efficient and fast appends and pops from either side.

Example:

Output:

UserDict

UserDict is a wrapper around dictionary objects for easier dictionary subclassing.

Example:

Output:

UserString

UserString is a wrapper around string objects for easier string subclassing.

Example:

Output:

Linked List in Python

A linked list is a linear collection of data elements where each element points to the next, allowing for a dynamic data structure with efficient insertions and deletions.

Example:

Output:

Stack in Python

A stack is a linear data structure that follows the Last In, First Out (LIFO) principle.

Example: Python Stack Implementation

Output:

Queue

Similar to a stack, a queue in Python is a fundamental data structure that organizes items sequentially, adhering to a First In, First Out (FIFO) principle.

Implementing Queues in Python

Queues can be implemented in Python using several methods:

  • Using a list
  • Using collections.deque
  • Using queue.Queue

Implementation Using List

In a list-based implementation, the append() method is used for enqueue operations, while the pop(0) method is employed for dequeue operations, as shown below:

Output:

Implementation Using collections.deque

The collections.deque is preferred for queue implementations due to its efficient O(1) time complexity for append and pop operations from both ends, compared to the list's O(n) for certain operations.

Output:

Implementation Using queue.Queue

The queue.Queue class provides a FIFO queue implementation suitable for multi-threading environments, with methods to check if the queue is full or empty.

Output:

Each of these implementations showcases the versatility of queues in Python, offering a range of options to fit different scenarios and performance requirements.

Priority Queue in Python

Priority queues in Python are specialized data structures using Python that organize elements such that each is assigned a certain priority. This concept in data structures using Python mirrors real-world situations where tasks of higher importance are addressed first, irrespective of the sequence in which they come up.

Here's a straightforward implementation of a priority queue using Python's list, without the need for any external library.

Output:

Heap queue

The heapq module in Python provides an array of functions for managing heaps, which are a specialized data structure using Python that follows the heap queue or priority queue algorithm. In these data structures using Python, heaps are implemented as binary trees where the value of each parent node is less than or equal to the values of its children, ensuring the smallest element is consistently positioned at the root, heap[0]. The module facilitates efficient operations—such as inserting and extracting the smallest element—within O(log n) time, making it exceptionally suitable for tasks that demand regular retrieval of the smallest item.

Below is an illustration of how to use the heapq module to create and manipulate a heap:

Output:

Binary Tree

A binary tree is a specialized form of a tree utilized in data structures using python, where each node can have no more than two children, commonly identified as the left and right child. In the realm of data structures using Python, this structure is vital for a multitude of computing tasks, ranging from straightforward data storage to the execution of intricate algorithms.

Let's illustrate this with an example of creating and adding nodes to a binary tree in Python.

Example: Defining a Node in a Binary Tree

Consider constructing a simple binary tree as follows:

This structure forms a tree that looks like this:

Tree traversal is a critical operation for accessing or printing all the nodes of a binary tree. There are several strategies for this, including depth-first and breadth-first traversals. Depth-first traversal includes Inorder (Left, Root, Right), Preorder (Root, Left, Right), and Postorder (Left, Right, Root) traversals.

Example: Tree Traversal Methods

Using the previously defined tree, we can perform different traversals:

Output:

These traversal methods allow us to visit all the nodes in a structured way, demonstrating the versatility and efficiency of binary trees in data management and algorithm design. The complexity of traversing a binary tree is O(n), where n is the number of nodes, since each node is visited once.

Graph

Graphs, as one of the more complex data structures in Python, are characterized by their capacity to depict intricate connections through nodes (or vertices) and edges that link these nodes. This type of data structures in Python is defined by a collection of vertices (V) and edges (E) which join vertex pairs, thereby showcasing the interlinked attributes of the data they represent.

Graphs can be represented in Python using two primary methods: the Adjacency Matrix and the Adjacency List.

Adjacency Matrix

An Adjacency Matrix takes the form of a square matrix and serves to depict a finite graph. The matrix entries denote the adjacency between vertex pairs; a non-zero value implies the presence of an edge connecting those vertices. In the matrix representation, if there is an edge between vertex (i) and vertex (j), then the matrix cell ([i][j]) is set to (1), or a weight in the case of a weighted graph. This method is excellent for representing dense graphs, where the number of edges is large.

Output:

Adjacency List

The Adjacency List is a more space-efficient way to represent graphs, especially sparse ones. It consists of an array of lists, where the index of the array represents a vertex and the list at each index contains the vertices adjacent to that vertex. This method allows for faster and more efficient storage of graphs when you have a large number of vertices but few edges.

Output:

Both methods have their advantages and are chosen based on the specific requirements of the application, such as the need for fast edge lookup versus space efficiency.

Conclusion

  1. Python features a broad range of built-in data structures for diverse programming needs.
  2. It supports mutable, immutable, and hashable types like lists, tuples, and dictionaries for efficient data manipulation.
  3. Specialized structures such as sets and the collections module enhance unique data storage abilities.
  4. Advanced constructs like linked lists and heaps cater to complex data arrangements and algorithms.
  5. Mastering these structures' properties and uses is crucial for optimal data organization and code performance.

Learn More