Struct Field Alignment¶
When optimizing Go programs for performance, struct layout and memory alignment often go unnoticed, yet they have a measurable impact on memory usage and cache efficiency. Go aligns struct fields according to platform-specific rules and inserts padding bytes to satisfy alignment constraints. Controlling that layout is not just a low-level detail: it determines how much memory your program uses, how well it exploits CPU caches, and whether subtle performance penalties accumulate in tight loops or high-throughput systems.
Why Alignment Matters¶
Modern CPUs are tuned for predictable memory access. When struct fields are misaligned or split across cache lines, the processor often has to do extra work to fetch the data. That can mean additional memory cycles, more cache misses, and slower performance overall. These costs are easy to overlook in everyday code but show up quickly in code that’s sensitive to throughput or latency. In Go, struct fields are aligned according to their type requirements, and the compiler inserts padding bytes to meet these constraints. If fields are arranged without care, unnecessary padding may inflate struct size significantly, affecting memory use and bandwidth.
Consider the following two structs:
type PoorlyAligned struct {
	flag  bool
	count int64
	id    byte
}

type WellAligned struct {
	count int64
	flag  bool
	id    byte
}
On a 64-bit system, PoorlyAligned requires 24 bytes because of the padding inserted after flag and at the end of the struct, whereas WellAligned fits into 16 bytes by ordering fields from largest to smallest alignment requirement.
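You can verify these figures on your own machine with the unsafe package; the following is a minimal sketch, assuming a typical 64-bit platform where int64 has 8-byte alignment:

package main

import (
	"fmt"
	"unsafe"
)

// Same definitions as above, repeated so the snippet compiles on its own.
type PoorlyAligned struct {
	flag  bool
	count int64
	id    byte
}

type WellAligned struct {
	count int64
	flag  bool
	id    byte
}

func main() {
	fmt.Println(unsafe.Sizeof(PoorlyAligned{})) // 24 on a typical 64-bit platform
	fmt.Println(unsafe.Sizeof(WellAligned{}))   // 16 on a typical 64-bit platform

	// Field offsets expose where the compiler inserted padding.
	fmt.Println(unsafe.Offsetof(PoorlyAligned{}.count)) // 8: seven padding bytes follow flag
	fmt.Println(unsafe.Offsetof(WellAligned{}.flag))    // 8: packed immediately after count
}

unsafe.Alignof reports the alignment requirement of each field type if you want to see why a particular gap appears.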
Benchmarking Impact¶
We benchmarked both struct layouts by allocating 10 million instances of each and measuring allocation time and memory usage:
func BenchmarkPoorlyAligned(b *testing.B) {
	for b.Loop() {
		var items = make([]PoorlyAligned, 10_000_000)
		for j := range items {
			items[j].count = int64(j)
		}
	}
}

func BenchmarkWellAligned(b *testing.B) {
	for b.Loop() {
		var items = make([]WellAligned, 10_000_000)
		for j := range items {
			items[j].count = int64(j)
		}
	}
}
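The figures below were collected with allocation reporting enabled; a typical invocation (the benchmark filter here is illustrative) is:

go test -bench=Aligned -benchmem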
Benchmark Results
Benchmark | Iterations | Time per op (ns) | Bytes per op | Allocs per op |
---|---|---|---|---|
PoorlyAligned-14 | 177 | 20,095,621 | 240,001,029 | 1 |
WellAligned-14 | 186 | 19,265,714 | 160,006,148 | 1 |
In a test with 10 million structs, the WellAligned version used 80MB less memory than its poorly aligned counterpart, and it also ran a bit faster. This isn't just about saving RAM; it shows how struct layout directly affects allocation behavior and memory bandwidth. When you're working with large volumes of data or performance-critical paths, reordering fields for better alignment can lead to measurable gains with minimal effort.
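The arithmetic lines up with the struct sizes measured earlier: 10,000,000 × 24 bytes is roughly 240 MB for PoorlyAligned, while 10,000,000 × 16 bytes is roughly 160 MB for WellAligned, which is exactly what the bytes-per-op column reports (the small remainder is slice and benchmark bookkeeping).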
Avoiding False Sharing in Concurrent Workloads¶
In addition to memory layout efficiency, struct alignment also plays a crucial role in concurrent systems. When multiple goroutines access different fields of the same struct that reside on the same CPU cache line, they may suffer from false sharing—where changes to one field cause invalidations in the other, even if logically unrelated.
On modern CPUs, a typical cache line is 64 bytes wide. When a struct is accessed in memory, the CPU loads the entire cache line that contains it, not just the specific field. This means that two unrelated fields within the same 64-byte block will both reside in the same line—even if they are used independently by separate goroutines. If one goroutine writes to its field, the cache line becomes invalidated and must be reloaded on the other core, leading to degraded performance due to false sharing.
To illustrate, we compared two structs—one vulnerable to false sharing, and another with padding to separate fields across cache lines:
type SharedCounterBad struct {
	a int64
	b int64
}

type SharedCounterGood struct {
	a int64
	_ [56]byte // Padding to prevent a and b from sharing a cache line
	b int64
}
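A quick check with unsafe.Offsetof makes the difference concrete. The golang.org/x/sys/cpu module also provides a cache-line-sized pad type, so the pad size does not have to be hard-coded; the sketch below shows both, and the x/sys dependency is optional, included here only to illustrate the alternative:

package main

import (
	"fmt"
	"unsafe"

	"golang.org/x/sys/cpu" // optional dependency; provides a cache-line-sized pad type
)

type SharedCounterBad struct {
	a int64
	b int64
}

type SharedCounterGood struct {
	a int64
	_ [56]byte
	b int64
}

// SharedCounterPadded lets the library choose the pad size for the target architecture.
type SharedCounterPadded struct {
	a int64
	_ cpu.CacheLinePad
	b int64
}

func main() {
	fmt.Println(unsafe.Offsetof(SharedCounterBad{}.b))    // 8: a and b share one 64-byte line
	fmt.Println(unsafe.Offsetof(SharedCounterGood{}.b))   // 64: b starts on the next line
	fmt.Println(unsafe.Offsetof(SharedCounterPadded{}.b)) // architecture-dependent, past the first line
}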
Each field is incremented by a separate goroutine 1 million times:
func BenchmarkFalseSharing(b *testing.B) {
	var c SharedCounterBad // (1)
	var wg sync.WaitGroup
	for b.Loop() {
		wg.Add(2)
		go func() {
			for i := 0; i < 1_000_000; i++ {
				c.a++
			}
			wg.Done()
		}()
		go func() {
			for i := 0; i < 1_000_000; i++ {
				c.b++
			}
			wg.Done()
		}()
		wg.Wait()
	}
}
The FalseSharing and NoFalseSharing benchmarks are identical, except that the NoFalseSharing benchmark uses SharedCounterGood.
Benchmark Results:
Benchmark | Time per op (ns) | Bytes per op | Allocs per op |
---|---|---|---|
FalseSharing | 996,234 | 55 | 2 |
NoFalseSharing | 958,180 | 58 | 2 |
Placing padding between the two fields prevented false sharing, resulting in a measurable performance improvement: the padded version completed about 3.8% faster in this run (anywhere from 3% to 6% across re-runs), which can make a difference in tight concurrent loops or high-frequency counters. The bytes-per-op and allocations-per-op figures are essentially identical for both versions; the benefit of the padding shows up in time, not in memory, because false sharing costs cache-coherence traffic rather than allocations.
The complete benchmark file:
package perf

import (
	"sync"
	"testing"
)

// types-simple-start
type PoorlyAligned struct {
	flag  bool
	count int64
	id    byte
}

type WellAligned struct {
	count int64
	flag  bool
	id    byte
}

// types-simple-end

// simple-start
func BenchmarkPoorlyAligned(b *testing.B) {
	for b.Loop() {
		var items = make([]PoorlyAligned, 10_000_000)
		for j := range items {
			items[j].count = int64(j)
		}
	}
}

func BenchmarkWellAligned(b *testing.B) {
	for b.Loop() {
		var items = make([]WellAligned, 10_000_000)
		for j := range items {
			items[j].count = int64(j)
		}
	}
}

// simple-end

// types-shared-start
type SharedCounterBad struct {
	a int64
	b int64
}

type SharedCounterGood struct {
	a int64
	_ [56]byte // Padding to prevent a and b from sharing a cache line
	b int64
}

// types-shared-end

// shared-start
func BenchmarkFalseSharing(b *testing.B) {
	var c SharedCounterBad // (1)
	var wg sync.WaitGroup
	for b.Loop() {
		wg.Add(2)
		go func() {
			for i := 0; i < 1_000_000; i++ {
				c.a++
			}
			wg.Done()
		}()
		go func() {
			for i := 0; i < 1_000_000; i++ {
				c.b++
			}
			wg.Done()
		}()
		wg.Wait()
	}
}

// shared-end

func BenchmarkNoFalseSharing(b *testing.B) {
	var c SharedCounterGood
	var wg sync.WaitGroup
	for b.Loop() {
		wg.Add(2)
		go func() {
			for i := 0; i < 1_000_000; i++ {
				c.a++
			}
			wg.Done()
		}()
		go func() {
			for i := 0; i < 1_000_000; i++ {
				c.b++
			}
			wg.Done()
		}()
		wg.Wait()
	}
}
When To Align Structs¶
Always align structs. It's free to implement and often leads to better memory efficiency without changing any logic—only field order needs to be adjusted.
Guidelines for struct alignment:
- Order fields from largest to smallest. Starting with larger fields helps the compiler avoid inserting padding to meet alignment requirements. Smaller fields can fill in the gaps naturally.
- Group fields of the same size together. This lets the compiler pack them more efficiently and minimizes wasted space.
- Insert padding intentionally when needed. In concurrent code, separating fields that are accessed by different goroutines can prevent false sharing—a subtle but costly issue where multiple goroutines compete over the same cache line.
- Avoid interleaving small and large fields. Mixing sizes leads to inefficient memory usage due to extra alignment padding between fields.
- Use the fieldalignment linter to verify. This tool helps catch suboptimal layouts automatically during development; an example invocation is shown below.
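The analyzer ships with golang.org/x/tools and can be run on its own; a typical invocation looks like the following (the package pattern is an assumption, adjust it to your module):

go install golang.org/x/tools/go/analysis/passes/fieldalignment/cmd/fieldalignment@latest
fieldalignment ./...      # report structs whose fields could be reordered to reduce padding
fieldalignment -fix ./... # rewrite struct definitions with the suggested order

Review the rewritten code before committing it, since the most compact field order is not always the most readable one.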