Assigning data to a Structured Array

Learn via video courses
Topics Covered

Overview

While a homogenous array of numbers can frequently adequately represent our data, this is not always the case. Structured arrays are subsets of NumPy arrays. They handle complex and heterogeneous data, as opposed to standard NumPy arrays, which store homogeneous data. For example, the following code will generate a NumPy structured array:

As described in the tuples, this NumPy structured array will have two columns with two separate data types. Now that we've established a NumPy structured array, we'll need to assign data to it before we can operate on it. Let's look at a few different approaches to allocate data to a NumPy structured array.

Introduction

Before we begin, we will assume that you have successfully installed Python on your system and are familiar with the foundations of the NumPy library, particularly the numpy.dtype() function. If you have any questions about the numpy.dtype() method, it is highly recommended that you read the official documentation before proceeding. You might be attempting to assign data to NumPy structured arrays considering you don't know what NumPy structured arrays are. Before we get started with the conversions, let's go over what NumPy structured arrays are in the first place. If you're already familiar with it, go on to "How to assign data to Structured Arrays?"

What are NumPy Structured Arrays?

Consider a scenario in which there are only five individuals on the globe. Their names and ages are all we are aware of about them. Lists are the simplest way to store this information. However, it will not be the ideal approach and will be extremely memory inefficient. This is where NumPy structured arrays come in.

Numpy Structured Array is generated with the help of a structured data type. A Structured data type might have numerous kinds, each with its own name. Numpy Structured Array can effectively store and access the same data. It accomplishes this by storing the entire array in the same memory region as a contiguous array.

Let's construct a Numpy Structured Array employing a Structured data type, then.

Note: Jupyter Notebook has been used over here and is the preferable IDE

Code:

Output:

The above code generated an empty Structured Array, which may be comprehended and depicted as follows:

empty-structured-array2

Now that you have a basic understanding of what NumPy structured arrays are and how to build them let's look at several methods of assigning data to NumPy structured arrays.

How to Assign Data to a Structured Array

We can operate on a structured array once it has been created from a structured data type. There are several methods for assigning values to a structured array, including assigning data from Python Native Types (Tuples) or scalar values, leveraging other structured arrays, or allocating values using subarrays.

Let's proceed with allocating data from Python naïve types, such as tuples, to our NumPy structured arrays.

1. Assignment from Python Native Types (Tuples)

Using Python tuples is the easiest approach to allocate values to a NumPy structured array. Each allocated value ought to be a tuple with a length equal to the total count of fields in the array rather than a list or array, as these will cause NumPy's broadcasting rules to be triggered. If this seems unfamiliar, please read NumPy's broadcasting official documentation. The members of the tuple are allocated to the array's subsequent fields, moving from left to right. Let us try to understand this with the assistance of an example.

Code:

Output:

Note: The abbreviated string format codes used to assign dtypes to structured arrays over here may appear confusing, but they are based on clear and simple logic. The first (optional) letter is < or >, which signifies "little endian" or "big-endian" and sets the significant bit ordering standard. The following letter indicates the data type: characters, bytes, ints, floating points, and more. The final character or characters reflect the object's size in bytes.

Let's try to figure out what's going on behind the hood in the preceding example. Here, the characters "i4" and "f4" stand for "4-byte (or 32-bit) integer" and "4-byte (or 32-bit) float," respectively. When we assigned values to our NumPy structured arrays at the first index, the first value of the tuple was assigned with data type 32-bit integers and the second value with 32-bit floating point integers.

2. Assignment From Scalars

A structured array can also be assigned a value from a scalar. A scalar allocated to a structured array will be allocated to all fields. In simplest terms, it means that the same value should be provided to each field but with the data type provided by the dtype parameter. This occurs when a scalar is allocated to a structured array or when an unstructured array is allocated to a structured array. Let us try to illustrate this with an example.

Code:

Output:

array([(7, 7.), (7, 7.), (7, 7.), (7, 7.), (7, 7.)],
      dtype=[('f0', '<i4'), ('f1', '<f4')])import numpy as np
structured_array = np.zeros(3, dtype=[('a', 'i8'), ('b', 'f4'), ('c', 'S3')])
structured_array_with_broadcasting = np.ones(3, dtype=[('x', 'f4'), ('y', 'S3'), ('z', 'O')])
structured_array_with_broadcasting=structured_array[1:2]
structured_array_with_broadcasting

In the preceding example, we saw how a single scalar value (in our instance, 7) was assigned to each field with its associated data type.

3. Assignment From Other Structured Arrays

As you can see in the preceding example, we can allocate a scalar to a NumPy structured array; but what if we wish to assign a NumPy structured array from another NumPy structured array? When two NumPy structured arrays are assigned to each other, the source elements are transformed to tuples and then allocated to the destination elements. That is, regardless of field names, the very first field of the input array is allocated to the first field of the target array, and so on. Structured arrays with various field counts cannot be allocated to one another. There is no impact on the destination structure's bytes that are not present in any of the fields. Let us try to grasp this with a code example.

Case 1:

Code:

Output:

Case 2:

Code:

Output:

In the previous two situations, we observed that when both NumPy structured arrays had the same amount of fields, the compiler produced no issue, and data was allocated from one structured array to another structured array (as depicted by Case 1). However, when we attempted to allocate data from a structured array with a different number of fields than the one to which we are allocating data, the compiler threw an error, as seen in Case 2.

4. Assignment Involving Subarrays

We assigned a whole NumPy structured array to another NumPy structured array, as seen in the preceding part. But what if we wish to allocate a NumPy structured array's portion or subarray to another NumPy structured array? The rules for allocating a NumPy structured subarray to a NumPy structured array are identical to those discussed in the preceding section. In addition to the rules indicated in the preceding section, when allocating to fields that are subarrays, the allocated value is the first broadcast to the form of the subarray. Let us now attempt to comprehend this using an example.

A quick recap, Broadcasting is a useful technique that enables NumPy to execute arithmetic operations on arrays of varied sizes. We frequently have a smaller dimension array and a bigger dimension array, and we wish to utilize the smaller dimension array to execute some operations on the larger dimension array numerous times. NumPy broadcasting accomplishes this without duplicating data and frequently results in effective algorithm implementations.

Case 1: Code:

Output:

Case 2: Code:

Output:

When the size of the subarray was similar to the size of the NumPy structured array to which we were assigning the data in the prior two cases, no NumPy broadcasting was performed, and data were correctly allocated to the structured array. However, when we attempted to allocate the smaller size subarray to the larger size NumPy structured array to which we were assigning the data, the content of the subarray was broadcasted first to match the shape of our NumPy structured array, and the data was successfully assigned to our structured array.

This brings us to the conclusion of our article. Kudos! You should now understand how to allocate data to NumPy structured arrays.

Conclusion

This blog taught us:

  • There are four ways to allocate data to a NumPy structured array.
  • Instead of a list or array, each allocated value ought to be a tuple with a length equal to the entire number of fields in the array; otherwise, it will be handled as a scalar operation and must broadcast each item before allocating.
  • The compiler issued an error when we attempted to allocate data from a structured array with a dissimilar number of fields than the one to which we were allocating data.
  • A NumPy structured subarray can be allocated to a NumPy structured array.