
Data structure alignment


Foreword

Knowing how data is stored in memory and how the CPU accesses it helps us improve system performance and reduce memory usage, especially when developing for an embedded system with limited resources.

Now, let’s get started.

What is data structure alignment?

Data structure alignment is the way data is arranged and accessed in computer memory. It consists of two separate but related issues: data alignment and data structure padding.

Wikipedia

Data structure alignment: when the compiler lays out data in memory, it arranges the data so that the CPU can access it efficiently. Data structure alignment involves 2 separate concepts:

  • Data alignment: place each variable at an address that is a multiple of its alignment, which is typically its own size, up to the word size.
  • Data structure padding: to keep each address at the required multiple, the compiler sometimes inserts meaningless bytes between 2 variables. These bytes are the “padding”.
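To make the two concepts concrete, here is a minimal sketch in C. The struct `sample` and the helper `padding_before_value` are hypothetical names invented for this illustration. The code only asks the compiler where it placed each member, so the exact numbers depend on your target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct, used only for illustration. */
struct sample {
    uint8_t  flag;   /* 1 byte */
    uint32_t value;  /* 4 bytes, normally placed on an aligned address */
};

/* Data structure padding: number of unused bytes the compiler
 * inserted between `flag` and `value`. */
static inline size_t padding_before_value(void)
{
    return offsetof(struct sample, value) - sizeof(uint8_t);
}

/* Data alignment: the offset of `value` is a multiple of its alignment. */
static_assert(offsetof(struct sample, value) % _Alignof(uint32_t) == 0,
              "value must sit on an aligned offset");
```

On most 32-bit and 64-bit targets I would expect `padding_before_value()` to return 3, but that is a property of the target ABI, not something the C language promises.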

The next sections go into more detail about how this arrangement happens and how it can improve system performance.

About the system

Before getting into details, we will need to go through some fundamental knowledge about our system.

Word size and address size

A processor does not access memory one byte at a time, but in blocks of 2, 4, 8, 16, or 32 bytes (depending on the system). The reason is that accessing an address on a multi-byte boundary is much faster than accessing one on a single-byte boundary.

Word size: this is the number of bits that a CPU can process at one time. In modern CPUs, the word size can be 8, 16, 24, 32, or 64 bits, depending on the system. We usually name a system after its word size. For example:

  • 16-bit system: 1 Word = 16 bits = 2 bytes
  • 32-bit system: 1 Word = 32 bits = 4 bytes
  • 64-bit system: 1 Word = 64 bits = 8 bytes

Figure 01: Word size

Address size: this is the number of bits used to store a memory address, which determines the size of the address space. For example, if we use 4 bytes (32 bits) to store the address, we will have 2^32 = 4 294 967 296 different addresses.

Figure 02: Address size

In modern CPUs, the word size is usually (but not always) the same as the address size. This allows one memory address to be stored efficiently in one word. For example:

  • 32-bit system: size of 1 word = size of 1 address = 32 bits.
  • 64-bit system: size of 1 word = size of 1 address = 64 bits.

Figure 03: Address size and Word size
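The arithmetic above can be checked with a few lines of C. This is only a sketch: `address_count` and `pointer_bits` are hypothetical helper names, and `pointer_bits` reports the width of a data pointer on whatever machine compiles the code, which usually (but not always) matches the word size.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* This sketch assumes 8-bit bytes, as the rest of the article does. */
static_assert(CHAR_BIT == 8, "8-bit bytes assumed");

/* Number of distinct addresses for a given address width.
 * Only valid for widths below 64, because the result is a uint64_t. */
static inline uint64_t address_count(unsigned address_bits)
{
    return (uint64_t)1 << address_bits;
}

/* Width, in bits, of a data pointer on the machine compiling this code. */
static inline unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}
```

For example, `address_count(32)` gives 4 294 967 296, the same number as in the address size example.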

Why do we need it?

Let’s take a look at the example below to see how accessing misaligned data can slow down the system.

Figure 04: A 4-byte variable in a system without alignment and in a system with alignment

As can be seen in Figure 04, reading the 4-byte variable takes 5 steps in the system without alignment, compared with only 2 steps in the system with alignment.
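In code, checking whether an address is aligned is a single modulo operation. The sketch below also shows one common way to read a 4-byte value from an address that may be misaligned: copying it with `memcpy` and letting the compiler choose a safe access sequence. `is_aligned` and `load_u32` are hypothetical helper names, not part of any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* True when `address` is a multiple of `alignment`. */
static inline bool is_aligned(uintptr_t address, size_t alignment)
{
    assert(alignment != 0);
    return address % alignment == 0;
}

/* Read a 32-bit value from any address, aligned or not.
 * memcpy avoids dereferencing a misaligned uint32_t pointer. */
static inline uint32_t load_u32(const void *src)
{
    uint32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

With these helpers, `is_aligned(8, 4)` is true and `is_aligned(6, 4)` is false, so a 4-byte variable stored at address 6 would be the misaligned case from the figure.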

How does it work?

Here are 3 steps I find useful when dealing with structure padding:

  1. Place each variable in the struct at an address that is evenly divisible by the size of that variable (if the system word size is bigger than the variable size). If the system word size is smaller than the variable size, place the variable at an address that is evenly divisible by the word size.

Figure 05: Different data types and their aligned addresses.

  2. Add padding to the unused bytes.
  3. Calculate the final size of the struct.
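The 3 steps can be written down as a small calculator. This is a sketch of the rule as described here, not of any particular compiler: each member is aligned to its own size, capped at the word size. I also assume that the final size is rounded up to the largest member alignment, which is what C compilers typically do so that arrays of the struct stay aligned. `padded_struct_size` and its helpers are hypothetical names, and real ABIs can differ from this simplified rule.

```c
#include <assert.h>
#include <stddef.h>

/* Round `value` up to the next multiple of `alignment`. */
static inline size_t round_up(size_t value, size_t alignment)
{
    assert(alignment > 0);
    return (value + alignment - 1) / alignment * alignment;
}

/* Step 1 rule: a member is aligned to its own size, capped at the word size. */
static inline size_t member_alignment(size_t member_size, size_t word_size)
{
    return member_size < word_size ? member_size : word_size;
}

/* Apply steps 1 to 3 to a list of (nonzero) member sizes, in bytes. */
static inline size_t padded_struct_size(const size_t *member_sizes,
                                        size_t count, size_t word_size)
{
    size_t offset = 0;     /* next free byte in the struct */
    size_t max_align = 1;  /* largest member alignment seen so far */

    for (size_t i = 0; i < count; i++) {
        size_t align = member_alignment(member_sizes[i], word_size);
        offset = round_up(offset, align);  /* step 2: pad up to the aligned offset */
        offset += member_sizes[i];         /* step 1: place the member */
        if (align > max_align)
            max_align = align;
    }
    return round_up(offset, max_align);    /* step 3: final size, with tail padding */
}
```

For a hypothetical struct with a 1-byte member followed by an 8-byte member, this gives 12 bytes with a 4-byte word and 16 bytes with an 8-byte word.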

Let’s take a look at some examples below to see how it works.

Example 01: Calculate the size of struct_01 in the 32-bit system and 64-bit system

Step 1 and 2: Put the variables in the appropriate places and add padding to the unused bytes.

Figure 06: Example 01

Step 3: Calculate the final size

  • For 32-bit system: sizeof(struct_01) = 12 bytes
  • For 64-bit system: sizeof(struct_01) = 16 bytes

Example 02: Calculate the size of struct_02 in the 32-bit system and 64-bit system

Step 1 and 2: Put the variables in the appropriate places and add padding to the unused bytes.

Figure 07: Example 02

Step 3: Calculate the final size

  • For 32-bit system: sizeof(struct_02) = 20 bytes
  • For 64-bit system: sizeof(struct_02) = 24 bytes

Example 03: Calculate the size of struct_03 in the 32-bit system and 64-bit system

Step 1 and 2: Put the variables in the appropriate places and add padding to the unused bytes.

Figure 08: Example 03

Step 3: Calculate the final size

  • For 32-bit system: sizeof(struct_03) = 6 bytes
  • For 64-bit system: sizeof(struct_03) = 6 bytes

Alignment and padding attributes in C

Some compilers, such as IAR or KeilC, provide struct attributes that give you more control over data alignment.

PACK ATTRIBUTE

When a structure has the pack attribute, no padding is inserted between its members.

Example 04: Calculate the size of struct_04 in the 32-bit system and 64-bit system

Step 1: Put the variables in the appropriate places, with no padding.

Figure 09: Example 04

Step 2: Calculate the final size

  • For 32-bit system: sizeof(struct_04) = 15 bytes
  • For 64-bit system: sizeof(struct_04) = 15 bytes
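The spelling of the pack attribute depends on the compiler. As a sketch, the code below uses the GCC/Clang form `__attribute__((packed))`; check your compiler manual for the IAR or Keil equivalent. The struct `packed_sample` is hypothetical. Its members simply add up to 1 + 4 + 2 + 8 = 15 bytes, so the packed size is the same on 32-bit and 64-bit targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct, packed with the GCC/Clang attribute syntax. */
struct packed_sample {
    uint8_t  a;  /* offset 0 */
    uint32_t b;  /* offset 1, no padding before it */
    uint16_t c;  /* offset 5 */
    uint64_t d;  /* offset 7 */
} __attribute__((packed));

/* With no padding, the size is just the sum of the member sizes. */
static_assert(sizeof(struct packed_sample) == 15,
              "packed struct must have no padding");
```

Keep in mind that `b`, `c` and `d` now sit on misaligned offsets, so accessing them can be slower, and on some CPUs a direct misaligned access is not allowed at all.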

ALIGNED ATTRIBUTE

When a struct has the aligned attribute, the value of the attribute overrides the word size. The 3 steps therefore become:

  1. Place each variable in the struct at an address that is evenly divisible by the size of that variable (if the aligned value is bigger than the variable size). If the aligned value is smaller than the variable size, place the variable at an address that is evenly divisible by the aligned value.
  2. Add padding to the unused bytes.
  3. Calculate the final size of the struct.

Remember that the final size is always evenly divisible by the aligned value.

Example 05: Calculate the size of struct_05 in the 32-bit system and 64-bit system

Step 1 and 2: Put the variables in the appropriate places.

Figure 10: Example 05

Step 3: Calculate the final size

  • For 32-bit system: sizeof(struct_05) = 24 bytes
  • For 64-bit system: sizeof(struct_05) = 24 bytes
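Here is a sketch of the aligned attribute, again assuming the GCC/Clang syntax `__attribute__((aligned(8)))`. Both structs are hypothetical. In GCC and Clang the attribute applies to the struct as a whole: it raises the alignment of the struct and rounds its size up to a multiple of the aligned value. How individual members are placed may differ on other compilers, so treat this as an illustration of the final size rule only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: 9 bytes of members, typically 12 bytes with padding. */
struct plain_sample {
    uint32_t a;
    uint32_t b;
    uint8_t  c;
};

/* Same members, with the whole struct aligned to 8 bytes. */
struct aligned_sample {
    uint32_t a;
    uint32_t b;
    uint8_t  c;
} __attribute__((aligned(8)));

/* The final size is always evenly divisible by the aligned value. */
static_assert(_Alignof(struct aligned_sample) == 8, "struct alignment is 8");
static_assert(sizeof(struct aligned_sample) % 8 == 0, "size is a multiple of 8");
```

The 9 bytes of members are rounded up to the next multiple of 8, so `sizeof(struct aligned_sample)` is 16.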

USE “PACK” AND “ALIGNED” ATTRIBUTES TOGETHER

When using the “pack” and “aligned” attributes together, we will:

  1. “Pack” the struct first.
  2. Add padding bytes at the end to make sure the size of our structure is evenly divisible by the aligned value.

Example 06: Calculate the size of struct_06 in the 32-bit system and 64-bit system

Pack the struct first, then add padding bytes at the end so that its size is evenly divisible by the aligned value (8 in this example).

Figure 11: Example 06
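The combination can be sketched the same way, assuming the GCC/Clang syntax `__attribute__((packed, aligned(8)))`. The struct `packed_aligned_sample` and the helper `tail_padding` are hypothetical. The members are packed into 15 bytes, and the compiler then adds padding at the end to reach the next multiple of 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: packed first, then aligned to 8 bytes as a whole. */
struct packed_aligned_sample {
    uint8_t  a;  /* offset 0 */
    uint32_t b;  /* offset 1 */
    uint16_t c;  /* offset 5 */
    uint64_t d;  /* offset 7, last member ends at byte 15 */
} __attribute__((packed, aligned(8)));

/* Number of padding bytes added after the last member. */
static inline size_t tail_padding(void)
{
    return sizeof(struct packed_aligned_sample)
         - (offsetof(struct packed_aligned_sample, d) + sizeof(uint64_t));
}

static_assert(sizeof(struct packed_aligned_sample) % 8 == 0,
              "size must be a multiple of the aligned value");
```

Here the packed members take 15 bytes and `tail_padding()` returns 1, so the final size is 16 bytes.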

More thoughts

Another interesting article about how data structure alignment affects system performance: https://www.ibm.com/developerworks/library/pa-dalign/

WRITTEN BY

Trung Do

Firmware engineer, blogger and a makerholic 
