Verilog For Loops: Unrolled at Compile Time

How Verilog for loops differ from their software cousins - they're unrolled by the synthesizer into parallel hardware, not executed iteratively at runtime.

The Software Look-Alike

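Verilog's for loop looks almost exactly like C's. The one visible difference is that Verilog has no ++ operator, so the increment is written out as i = i + 1: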

for (i = 0; i < 8; i = i + 1) begin
    // body
end

The same three parts: initializer, condition, increment. The body re-runs as long as the condition holds.

In a testbench, this behaves exactly like you'd expect from software. The simulator steps through each iteration in turn:
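A minimal sketch of such a testbench, assuming a module name (tb_for_demo) chosen here purely for illustration:

```verilog
// Hypothetical testbench name; any simulator-only module would do.
module tb_for_demo;
    integer i;

    initial begin
        // The simulator executes the body once per iteration, in order.
        for (i = 0; i < 4; i = i + 1) begin
            $display("i = %0d", i);
        end
        $finish;
    end
endmodule
```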

Four iterations, four lines of output. No surprises.

The surprise comes when you put a for loop inside synthesizable code.

The Unroll

A for loop in a synthesizable always block does not become a runtime loop in hardware. The synthesizer unrolls it at elaboration time - it expands the loop into N copies of the body, where N is the iteration count:
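A minimal sketch of such a loop, assuming an 8-bit input named data and a 4-bit result named count (the module name popcount8 is hypothetical):

```verilog
// Hypothetical module: counts the 1 bits in an 8-bit input.
module popcount8 (
    input  wire [7:0] data,
    output reg  [3:0] count
);
    integer i;

    // Combinational block: count is assigned on every pass, so no latch is inferred.
    always @(*) begin
        count = 0;
        for (i = 0; i < 8; i = i + 1) begin
            if (data[i])
                count = count + 1;
        end
    end
endmodule
```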

That looks like a loop. In simulation, the simulator does loop through eight iterations. In synthesis, the loop is unrolled into eight parallel checks of data[0] through data[7], all happening at the same time. The synthesizer sees:

count = 0;
if (data[0]) count = count + 1;
if (data[1]) count = count + 1;
if (data[2]) count = count + 1;
...
if (data[7]) count = count + 1;

…and then turns the sequence into an adder tree. The runtime behavior is "look at all 8 bits at once and count how many are 1", in a single combinational sweep.

The implication: a for loop in synthesizable Verilog is not free. A 64-iteration loop becomes 64 copies of the body in hardware. If the body is complex, you've just built a large combinational block with a long critical path. Use loops when N is small (a handful to a few dozen iterations). For larger counts, you usually want a clocked counter and a state machine that does one iteration per cycle.

Constant Bounds Are Required

The synthesizer can only unroll the loop if it knows N at elaboration time. That means the loop bounds must be constants:

// Works - bound is constant
for (i = 0; i < 8; i = i + 1) ...

// Works - bound is a parameter
for (i = 0; i < WIDTH; i = i + 1) ...

// Doesn't synthesize - bound depends on a runtime signal
for (i = 0; i < dynamic_count; i = i + 1) ...

The last form might still work in simulation, but the synthesizer will reject it. If you genuinely need a runtime-counted loop, you build it with a clocked state machine and a counter register - hardware doesn't have variable-trip-count loops in the way software does.
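One way to sketch the clocked alternative, assuming an 8-bit dynamic_count input and a start/busy handshake (the module name, port names, and widths are all illustrative assumptions, not taken from any particular design):

```verilog
// Hypothetical module: performs dynamic_count "iterations", one per clock.
module runtime_loop (
    input  wire       clk,
    input  wire       rst,            // synchronous reset
    input  wire       start,          // begin a new run when idle
    input  wire [7:0] dynamic_count,  // runtime trip count
    output reg        busy,
    output reg  [7:0] iterations      // iterations completed in this run
);
    reg [7:0] remaining;

    always @(posedge clk) begin
        if (rst) begin
            busy       <= 1'b0;
            remaining  <= 8'd0;
            iterations <= 8'd0;
        end else if (!busy) begin
            // Idle: latch the trip count; a count of zero is skipped.
            if (start && dynamic_count != 8'd0) begin
                busy       <= 1'b1;
                remaining  <= dynamic_count;
                iterations <= 8'd0;
            end
        end else begin
            // Busy: one "loop iteration" per clock cycle.
            iterations <= iterations + 8'd1;
            remaining  <= remaining - 8'd1;
            if (remaining == 8'd1)
                busy <= 1'b0;         // this was the last iteration
        end
    end
endmodule
```

The loop body would go where iterations is incremented. The hardware cost stays fixed no matter how large dynamic_count is; the price is that the work now takes dynamic_count clock cycles instead of one.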

generate for vs Procedural for

A separate but related construct is generate for, which uses a genvar and lives outside always blocks:

genvar i;
generate
    for (i = 0; i < 8; i = i + 1) begin : g
        bit_inverter inv(.x(in[i]), .y(out[i]));
    end
endgenerate

That stamps out 8 instances of bit_inverter (covered in Module Instantiation). It's structural - you're saying "make 8 copies of this sub-module" - not behavioral.

Quick distinction:

  • Procedural for (inside always) - unrolls statements within a single behavioral block.
  • Generate for (outside always) - replicates entire structural constructs: instances, assign statements, named blocks.

Use the one that matches what you're replicating.
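To make the distinction concrete, here is a sketch that performs the same bitwise inversion both ways. The module and signal names are hypothetical, and a plain assign stands in for the bit_inverter instance so the example is self-contained:

```verilog
// Hypothetical module: the same inversion written procedurally and structurally.
module invert_both_ways (
    input  wire [7:0] in,
    output reg  [7:0] out_proc,
    output wire [7:0] out_gen
);
    // Procedural for: statements unrolled inside one always block.
    integer i;
    always @(*) begin
        for (i = 0; i < 8; i = i + 1)
            out_proc[i] = ~in[i];
    end

    // Generate for: eight separate continuous assignments, one per named block.
    genvar g;
    generate
        for (g = 0; g < 8; g = g + 1) begin : inv
            assign out_gen[g] = ~in[g];
        end
    endgenerate
endmodule
```

Both outputs should synthesize to the same eight inverters; the difference is in what kind of construct each loop is allowed to contain.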

When for Shines: Vector Operations

Loops are at their best when you're doing the same operation to every bit of a vector. Population count, parity, byte reversal, lookup-table generation:
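As one sketch, assuming a 32-bit bit-reversal (the module name bit_reverse32 and its port names are hypothetical):

```verilog
// Hypothetical module: reverses the bit order of a 32-bit word.
module bit_reverse32 (
    input  wire [31:0] in,
    output reg  [31:0] out
);
    integer i;

    always @(*) begin
        for (i = 0; i < 32; i = i + 1)
            out[i] = in[31 - i];   // after unrolling, this is pure wiring
    end
endmodule
```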

32 iterations, each doing one bit assignment - much more readable than writing out 32 hand-rolled wire assignments. The synthesizer unrolls cleanly.

while, repeat, forever

Beyond for, Verilog has three other loop constructs - mostly for testbenches:

// Run until a condition fails
while (!done) begin
    @(posedge clk);
    cycles = cycles + 1;
end

// Run N times - simpler than for when you don't need a counter
repeat (8) @(posedge clk);

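// Run forever - clock generators, monitoring loops
// (forever is a procedural statement, so each one sits in its own initial block)
initial begin
    clk = 0;
    forever #5 clk = ~clk;
end
initial forever begin
    @(posedge clk);
    $display("count=%0d", count);
end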

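while, repeat, and forever are synthesizable only in narrow cases, and tool support varies. The usual ones are repeat with a constant count and while with an iteration count the tool can bound at elaboration time; both are unrolled just like for. Bodies that wait on @(posedge clk), as in the snippets above, are testbench code. In testbenches these loops are useful tools; in synthesizable RTL prefer a counted for, or an explicit state machine when the work has to spread across clock cycles.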

Procedural for in Testbenches

In a testbench, for loops behave the way software does. Use them freely:
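A minimal sketch, assuming two 2-bit stimulus signals named a and b (the testbench name tb_sweep is hypothetical, and no design under test is attached):

```verilog
// Hypothetical testbench: drives every combination of two 2-bit values.
module tb_sweep;
    reg [1:0] a, b;
    integer   i, j;   // integers, not 2-bit regs: a 2-bit counter would wrap
                      // from 3 to 0 and the "< 4" condition would never fail

    initial begin
        for (i = 0; i < 4; i = i + 1) begin
            for (j = 0; j < 4; j = j + 1) begin
                a = i[1:0];
                b = j[1:0];
                #1 $display("a=%b b=%b", a, b);
            end
        end
        $finish;
    end
endmodule
```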

Nested loops sweep every combination of two 2-bit inputs. The simulator runs the iterations sequentially. No unrolling concerns - testbenches don't synthesize.

Common Mistakes

Using a for loop in synthesizable code with a non-constant bound. The synthesizer will reject it. If the bound is runtime, build a counter and a state machine.

Forgetting that the loop body becomes parallel hardware. A 64-iteration loop with a multiplier in the body is 64 parallel multipliers - probably not what you want. For wide datapaths, build a single multiplier and feed it sequentially.

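Reusing the name i for both a loop integer and a reg. Declaring both in the same scope is a redeclaration error. If the integer is declared inside a named block and the reg at module level, the integer shadows the reg inside that block, so the loop silently uses the integer. Pick clear names to avoid the confusion.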

What Comes Next

You now have every procedural construct Verilog offers. The next chapter pulls it all together into the patterns digital designers actually ship: Clocked Logic - flip-flops, registers, and pipelines - and Finite State Machines - the standard idiom for any controller with multiple operating modes.

Frequently Asked Questions

How do for loops work in Verilog?

Syntactically they look like C: for (i = 0; i < N; i = i + 1) statement;. But for synthesizable code, the loop is unrolled at elaboration time - the synthesizer expands it into N copies of the body. There's no runtime loop counter and no looping in the hardware. For testbenches, for loops behave like their software cousins because the simulator can step through them sequentially.

Is a for loop synthesizable in Verilog?

Yes, but only when the loop bounds are constants known at elaboration time. The synthesizer unrolls the loop into N parallel copies of the body. If the bounds depend on a runtime signal, the loop isn't synthesizable - you have to convert it to a clocked sequential design.

What is the difference between for and generate for in Verilog?

A for loop inside an always block is a procedural construct that synthesizes by unrolling. A generate for loop (with genvar) is an explicit elaboration-time construct that stamps out structural hardware - multiple module instances, multiple wires, multiple assign statements. Use for inside procedural blocks; use generate for outside them to replicate structure.

Does Verilog have a while loop?

Yes - while (condition) statement;. It's synthesizable only when the synthesizer can prove the loop terminates with a bounded number of iterations. In practice that's rare, so while shows up mostly in testbenches and simulation-only code. For synthesizable iteration, use a counted for loop instead.
