CS641 Branches in MIPS and x86 code

Note: There are no branch delay slots with the x86 family: it’s too hard to do with variable length instructions

So in x86 and x86-64, branches typically just stall the pipeline.

Compilers try hard to remove branches.

Look at MIPS case first:

Our cross-compiler setup: mips-gcc and mips-gcc-O2 are just shell scripts:

sf06.cs.umb.edu$ more mips-gcc

mips-gcc-compiler -march=r2000 $*

 

sf06.cs.umb.edu$ more mips-gcc-O2

# -O2 optimization turns on -fdelayed-branch, which

# allows instruction reordering to fill in branch delay slots

# to start, disallow delayed branch slot use with -O2

mips-gcc-compiler -march=r2000 -O2 -fno-delayed-branch $*

 

Example showing use of “if” to avoid division: good idea?

 

// do x%N without division: cheap mod, for x below 2*N

// used in Weiss, DS & Problem Solving using Java (CS310 text), pg. 794

// note: on some processors, x%N may be faster, but not our MIPS

int mod1(int x, int N)

{

    if (x >= N)    // conditional subtraction instead of x%N

        x -= N;

    return x;

}

 

Compile it, allowing branch-delay slot use by MIPS C compiler:

 

mips-gcc-compiler -march=r2000 -O2

...

        slt     $2,$4,$5

        bne     $2,$0,$L34

        nop                 <--branch delay slot with nop

        subu    $4,$4,$5

$L34:

        j       $31

        move    $2,$4       <--branch delay slot with out-of-order instruction

 

Same code, compile with x86: 6 instructions either way, no branch (gcc –O2 compile on sf06)

The trick: with the conditional subtraction, the compiler uses the conditional move instruction “cmovge”, which tests condition codes set by the cmpl and copies conditionally from EAX to EDI. This is a fairly new instruction, not in most x86 textbooks.

 

 

 

with conditional subtraction:

        movl    %edi, %eax

        subl    %esi, %eax

        cmpl    %esi, %edi

        cmovge  %eax, %edi

        movl    %edi, %eax

        ret

 

with %:

        movl    %edi, %edx

        movl    %edi, %eax

        sarl    $31, %edx        <--sign extension by right shift 31

        idivl   %esi             <--integer division: divide edx:eax by esi, remainder in edx

        movl    %edx, %eax

        ret

 

One place this matters in compare routine for sorting. We need to return negative, 0, or positive from the function

Ex. (note it is not correct to subtract and then check sign)

 

int intcompare(int *x, int *y)

{

if (*x < *y)

         return -1;

       else if (*x > *y)

         return 1;

       else return 0;

}

Looks like 2 branches, but MIPS and x86-64 can do it with one (maybe zero?)

 

MIPS: -O2

        lw      $4,0($4)

        lw      $5,0($5)

        nop

        slt     $2,$4,$5

        bne     $2,$0,$L33

        li      $3,-1                   # 0xffffffffffffffff

        slt     $3,$5,$4

$L33:

        j       $31

        move    $2,$3

 

gcc -S -O2 : x86-64 can do

intcompare:

.LFB3:

        movl    (%rsi), %eax

        cmpl    %eax, (%rdi)

        movl    $-1, %edx

        jl      .L3

        setg    %al

        movzbl  %al, %edx

.L3:

        movl    %edx, %eax

        ret

.LF