Struct Field Alignment

When optimizing Go programs for performance, struct layout and memory alignment often go unnoticed, yet they have a measurable impact on memory usage and cache efficiency. Go automatically aligns struct fields according to platform-specific rules, inserting padding to satisfy alignment constraints. Understanding and controlling this alignment can reduce memory footprint, improve cache locality, and boost performance in tight loops and high-throughput data pipelines.

Why Alignment Matters

Modern CPUs are sensitive to memory layout. When data is misaligned or spans multiple cache lines, it incurs additional access cycles and can disrupt performance. In Go, struct fields are aligned according to their type requirements, and the compiler inserts padding bytes to meet these constraints. If fields are arranged without care, unnecessary padding may inflate struct size significantly, affecting memory use and bandwidth.

Consider the following two structs:

type PoorlyAligned struct {
    flag  bool  // 1 byte, then 7 padding bytes so count is 8-byte aligned
    count int64 // 8 bytes
    id    byte  // 1 byte, then 7 trailing padding bytes
}

type WellAligned struct {
    count int64 // 8 bytes
    flag  bool  // 1 byte
    id    byte  // 1 byte, then 6 trailing padding bytes
}

On a 64-bit system, PoorlyAligned occupies 24 bytes: 7 padding bytes after flag (so count is 8-byte aligned) plus 7 trailing bytes after id (so the struct size is a multiple of 8). WellAligned fits into 16 bytes by ordering fields from largest to smallest alignment requirement.
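You can verify these numbers with the unsafe package; a minimal sketch (the printed values assume a 64-bit platform):

package main

import (
    "fmt"
    "unsafe"
)

type PoorlyAligned struct {
    flag  bool
    count int64
    id    byte
}

type WellAligned struct {
    count int64
    flag  bool
    id    byte
}

func main() {
    var p PoorlyAligned
    var w WellAligned
    fmt.Println(unsafe.Sizeof(p))         // 24
    fmt.Println(unsafe.Sizeof(w))         // 16
    fmt.Println(unsafe.Offsetof(p.count)) // 8: flag is followed by 7 padding bytes
    fmt.Println(unsafe.Alignof(w.count))  // 8: int64 must start at an 8-byte boundary
}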

Benchmarking Impact

We benchmarked both struct layouts by allocating 10 million instances of each and measuring allocation time and memory usage:

var result int64 // package-level sink so the compiler cannot optimize the loops away

func BenchmarkPoorlyAligned(b *testing.B) {
    for i := 0; i < b.N; i++ {
        items := make([]PoorlyAligned, 10_000_000)
        for j := range items {
            items[j].count = int64(j)
            result += items[j].count
        }
    }
}

func BenchmarkWellAligned(b *testing.B) {
    for i := 0; i < b.N; i++ {
        items := make([]WellAligned, 10_000_000)
        for j := range items {
            items[j].count = int64(j)
            result += items[j].count
        }
    }
}
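Bytes per op and allocs per op appear in the output only when memory statistics are enabled, e.g. by running the suite with:

go test -bench=. -benchmem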

Benchmark Results

Benchmark          Iterations   Time per op (ns)   Bytes per op   Allocs per op
PoorlyAligned-14   177          20,095,621         240,001,029    1
WellAligned-14     186          19,265,714         160,006,148    1

The WellAligned version reduced memory usage by 80 MB for 10 million structs (8 bytes saved per instance) and ran about 4% faster than the poorly aligned version. (The -14 suffix in the benchmark names is the GOMAXPROCS value of the machine that ran them.) This highlights that thoughtful field arrangement improves memory efficiency and can yield modest performance gains in allocation-heavy code paths.

Avoiding False Sharing in Concurrent Workloads

In addition to memory layout efficiency, struct alignment also plays a crucial role in concurrent systems. When multiple goroutines access different fields of the same struct that reside on the same CPU cache line, they may suffer from false sharing—where changes to one field cause invalidations in the other, even if logically unrelated.

On modern CPUs, a typical cache line is 64 bytes wide. When a field is accessed, the CPU loads the entire 64-byte line containing it, not just that field. Two unrelated fields that sit within the same 64-byte block therefore share a line even when separate goroutines use them independently. If one goroutine writes to its field, the line is invalidated on the other core and must be reloaded, and this repeated invalidation is what degrades performance.

To illustrate, we compared two structs—one vulnerable to false sharing, and another with padding to separate fields across cache lines:

type SharedCounterBad struct {
    a int64
    b int64
}

type SharedCounterGood struct {
    a int64
    _ [56]byte // padding: b starts 64 bytes after a, so they can never share a 64-byte cache line
    b int64
}
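Hardcoding a 56-byte pad assumes an 8-byte field and a 64-byte cache line. As an alternative sketch (not part of the benchmarks here), the golang.org/x/sys/cpu package exposes a CacheLinePad type sized for the target architecture, so the arithmetic is not hardcoded; SharedCounterPortable is a hypothetical name:

import "golang.org/x/sys/cpu"

type SharedCounterPortable struct {
    a int64
    _ cpu.CacheLinePad // occupies a full cache line, so a and b can never share one
    b int64
}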

Each field is incremented by a separate goroutine 1 million times:

func BenchmarkFalseSharing(b *testing.B) {
    var c SharedCounterBad
    var wg sync.WaitGroup

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        wg.Add(2)
        go func() {
            for i := 0; i < 1_000_000; i++ {
                c.a++
            }
            wg.Done()
        }()
        go func() {
            for i := 0; i < 1_000_000; i++ {
                c.b++
            }
            wg.Done()
        }()
        wg.Wait()
    }
}
BenchmarkNoFalseSharing is identical, except c is declared as a SharedCounterGood (see the complete file below).

Benchmark Results:

Benchmark        Time per op (ns)   Bytes per op   Allocs per op
FalseSharing     996,234            55             2
NoFalseSharing   958,180            58             2

Placing padding between the two fields prevented false sharing and yielded a measurable improvement: the padded version completed roughly 3.8% faster (between 3% and 6% across re-runs), which can matter in tight concurrent loops or high-frequency counters. The small difference in bytes per op is allocation noise from the goroutine closures rather than an effect of false sharing; cache-line invalidation costs time, not memory.

The complete benchmark file:
package perf

import (
    "sync"
    "testing"
)

// types-simple-start
type PoorlyAligned struct {
    flag  bool  // 1 byte, then 7 padding bytes so count is 8-byte aligned
    count int64 // 8 bytes
    id    byte  // 1 byte, then 7 trailing padding bytes
}

type WellAligned struct {
    count int64 // 8 bytes
    flag  bool  // 1 byte
    id    byte  // 1 byte, then 6 trailing padding bytes
}
// types-simple-end

var result int64 // package-level sink so the compiler cannot optimize the loops away

// simple-start
func BenchmarkPoorlyAligned(b *testing.B) {
    for i := 0; i < b.N; i++ {
        items := make([]PoorlyAligned, 10_000_000)
        for j := range items {
            items[j].count = int64(j)
            result += items[j].count
        }
    }
}

func BenchmarkWellAligned(b *testing.B) {
    for i := 0; i < b.N; i++ {
        items := make([]WellAligned, 10_000_000)
        for j := range items {
            items[j].count = int64(j)
            result += items[j].count
        }
    }
}
// simple-end


// types-shared-start
type SharedCounterBad struct {
    a int64
    b int64
}

type SharedCounterGood struct {
    a int64
    _ [56]byte // padding: b starts 64 bytes after a, so they can never share a 64-byte cache line
    b int64
}
// types-shared-end

// shared-start

func BenchmarkFalseSharing(b *testing.B) {
    var c SharedCounterBad
    var wg sync.WaitGroup

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        wg.Add(2)
        go func() {
            for i := 0; i < 1_000_000; i++ {
                c.a++
            }
            wg.Done()
        }()
        go func() {
            for i := 0; i < 1_000_000; i++ {
                c.b++
            }
            wg.Done()
        }()
        wg.Wait()
    }
}
// shared-end

func BenchmarkNoFalseSharing(b *testing.B) {
    var c SharedCounterGood
    var wg sync.WaitGroup

    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        wg.Add(2)
        go func() {
            for i := 0; i < 1_000_000; i++ {
                c.a++
            }
            wg.Done()
        }()
        go func() {
            for i := 0; i < 1_000_000; i++ {
                c.b++
            }
            wg.Done()
        }()
        wg.Wait()
    }
}

When To Align Structs

Always. Aligning structs is free to implement and often improves memory efficiency without changing any logic; only the field order needs to be adjusted.

Guidelines for struct alignment:

  • Order fields by decreasing size to reduce internal padding. Larger fields first help prevent unnecessary gaps caused by alignment rules.
  • Group same-sized fields together to optimize memory layout. This ensures fields can be packed tightly without additional padding.
  • Use padding deliberately to separate fields accessed by different goroutines. Preventing false sharing can improve performance in concurrent applications.
  • Avoid interleaving small and large fields. Mixing sizes leads to inefficient memory usage due to extra alignment padding between fields.
  • Use the fieldalignment linter to verify. This tool catches suboptimal layouts automatically during development; see the example below.
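A minimal sketch of running the analyzer, assuming it is installed from golang.org/x/tools (the -fix flag applies the suggested reordering in place):

go install golang.org/x/tools/go/analysis/passes/fieldalignment/cmd/fieldalignment@latest
fieldalignment ./...
fieldalignment -fix ./...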