Optimization issue for target's offset field of load operation in DAGSelection

Dan
I am working on an experimental target and trying to make sure that
the load instruction's offset field is used as effectively as possible.
Something appears to control the offset range the architecture accepts,
and to decide when an offset is too large and must be lowered into a
separate sequence of operations during DAG instruction selection.

Can someone point me to where this is decided?

For example, changing the array index from 63 to 64 changes the
generated code: with 63 the address+offset is folded into the load
instruction, while with 64 the address computation is emitted as a
separate add followed by the load. My architecture supports larger
offsets, so 63/64 should not be the dividing line.

Is there a limit on offset ranges that applies to all targets, or is
some constraint in my target's definition causing this?

Suggestions?

long long array[10000];
long long func()
{
  return array[63];
  // return array[64];
}

Here is the difference in the .ll code with the 63 or 64 as the index:

  %0 = load i64* getelementptr inbounds ([10000 x i64]* @array, i32 0, i64 63), align 8
  ret i64 %0


  %0 = load i64* getelementptr inbounds ([10000 x i64]* @array, i32 0, i64 64), align 8
  ret i64 %0


Here is the instruction selection output for index 63:


ISEL: Starting pattern match on root node: 0x3d9ad80: i64,ch = load
0x3d866f8, 0x3d9aa80, 0x3d9ac80<LD8[getelementptr inbounds ([10000 x
i64]* @array, i32 0, i64 63)]> [ORD=2] [ID=6]

  Initial Opcode index to 813
  Skipped scope entry (due to false predicate) at index 822, continuing at 876
  Skipped scope entry (due to false predicate) at index 877, continuing at 931
  TypeSwitch[i64] from 934 to 937
  Morphed node: 0x3d9ad80: i64,ch = LDWri 0x3d9a880, 0x3d9ab80,
0x3d866f8<Mem:LD8[getelementptr inbounds ([10000 x i64]* @array, i32
0, i64 63)]> [ORD=2]

ISEL: Match complete!
===== Instruction selection ends:

Here is the instruction selection output for index 64:


ISEL: Match complete!
ISEL: Starting pattern match on root node: 0x2d2cda0: i64,ch = load
0x2d18718, 0x2d2caa0, 0x2d2cca0<LD8[getelementptr inbounds ([10000 x
i64]* @array, i32 0, i64 64)]> [ORD=2] [ID=6]

  Initial Opcode index to 813
  Skipped scope entry (due to false predicate) at index 822, continuing at 876
  Skipped scope entry (due to false predicate) at index 877, continuing at 931
  TypeSwitch[i64] from 934 to 937
  Morphed node: 0x2d2cda0: i64,ch = LDWri 0x2d2caa0, 0x2d2cba0,
0x2d18718<Mem:LD8[getelementptr inbounds ([10000 x i64]* @array, i32
0, i64 64)]> [ORD=2]

ISEL: Match complete!
ISEL: Starting pattern match on root node: 0x2d2caa0: i64 = add
0x2d2c8a0, 0x2d2c9a0 [ORD=1] [ID=5]

  Initial Opcode index to 1473
  Match failed at index 1482
  Continuing at 1498
  Morphed node: 0x2d2caa0: i64 = ADD 0x2d2c8a0, 0x2d2c9a0 [ORD=1]
  etc
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Re: Optimization issue for target's offset field of load operation in DAGSelection

Krzysztof Parzyszek
On 7/9/2013 12:17 PM, Dan wrote:
> I am working on an experimental target and trying to make sure that
> the load offset field is used to the best way. There appears to be
> some control over the architecture's offset range and whether the
> offset is too large and needs to be lowered/converted into a separate
> sequence of operations in DAGSelection?
>
> Can someone point me to what might be the case?

Instruction patterns can have predicates on each operand to make sure
that the operand meets the required criteria.  For example, in
lib/Target/PowerPC/PPCInstrInfo.td, there is a definition of ADDI:

def ADDI   : DForm_2<14, (outs gprc:$rD),
                          (ins gprc_nor0:$rA, symbolLo:$imm),
                      "addi $rD, $rA, $imm", IntSimple,
                      [(set i32:$rD, (add i32:$rA, immSExt16:$imm))]>;

The "immSExt16" is a predicate, and it's defined in the same file:

def immSExt16  : PatLeaf<(imm), [{
   // immSExt16 predicate - True if the immediate fits in a 16-bit
   // sign extended field.  Used by instructions like 'addi'.
   if (N->getValueType(0) == MVT::i32)
     return (int32_t)N->getZExtValue() == (short)N->getZExtValue();
   else
     return (int64_t)N->getZExtValue() == (short)N->getZExtValue();
}]>;
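
As a rough standalone illustration of what such a predicate checks, the
range test can be modeled outside of TableGen. The helper name
fitsSignedBits is hypothetical and is not part of LLVM; it is only a
sketch of "does this value fit in an N-bit sign-extended field". In
real backend code, llvm::isInt<N>() from llvm/Support/MathExtras.h
performs this kind of check.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): returns true if Value can be
// represented as a Bits-wide two's-complement immediate.
// Assumes 1 <= Bits <= 63.
constexpr bool fitsSignedBits(int64_t Value, unsigned Bits) {
  const int64_t Min = -(int64_t{1} << (Bits - 1));
  const int64_t Max = (int64_t{1} << (Bits - 1)) - 1;
  return Value >= Min && Value <= Max;
}

// A few spot checks for the 16-bit case that immSExt16 covers.
inline void checkSExt16Examples() {
  assert(fitsSignedBits(32767, 16));   // largest positive value
  assert(fitsSignedBits(-32768, 16));  // most negative value
  assert(!fitsSignedBits(32768, 16));  // one past the top of the range
}
```

A predicate for a wider or narrower field would differ only in the bit
width it tests against.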


Here the ADDI will be generated only if the immediate operand
satisfies the predicate.  Otherwise, the ADDI pattern won't match, and
the instruction selector will attempt to match other patterns.  Most
likely the immediate will be matched by itself (materialized into a
register), and then the pattern for ADD (register+register) will match
on the result.
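
Applied to the original question, the same reasoning may explain the
63/64 boundary.  With 8-byte elements, index 63 is byte offset 504 and
index 64 is byte offset 512.  The thread does not state which predicate
the experimental target's load pattern uses, so the 10-bit signed field
below is purely an assumption that happens to be consistent with that
boundary (an unsigned 9-bit field would be as well).  The names
LoadForm, chooseLoadForm and fitsSignedBits are hypothetical and only
model the selector's choice; they are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the two outcomes seen in the ISEL logs.
enum class LoadForm {
  RegPlusImm,   // offset folded into the load (reg+imm form)
  AddThenLoad   // separate ADD computes the address, then the load
};

// Hypothetical helper: true if Value fits a Bits-wide signed immediate.
// Assumes 1 <= Bits <= 63.
constexpr bool fitsSignedBits(int64_t Value, unsigned Bits) {
  const int64_t Min = -(int64_t{1} << (Bits - 1));
  const int64_t Max = (int64_t{1} << (Bits - 1)) - 1;
  return Value >= Min && Value <= Max;
}

// Models the pattern choice: the reg+imm pattern matches only when the
// operand predicate accepts the offset; otherwise a fallback is used.
constexpr LoadForm chooseLoadForm(int64_t ByteOffset, unsigned FieldBits) {
  return fitsSignedBits(ByteOffset, FieldBits) ? LoadForm::RegPlusImm
                                               : LoadForm::AddThenLoad;
}

inline void checkArrayExample() {
  const int64_t ElemSize = 8;  // sizeof(long long) in the example
  // Assumed 10-bit signed field: range is [-512, 511].
  assert(chooseLoadForm(63 * ElemSize, 10) == LoadForm::RegPlusImm);
  assert(chooseLoadForm(64 * ElemSize, 10) == LoadForm::AddThenLoad);
  // With a 16-bit predicate, both offsets would be folded.
  assert(chooseLoadForm(64 * ElemSize, 16) == LoadForm::RegPlusImm);
}
```

If something like this is the cause, widening the predicate on the
offset operand of the load pattern in the target's .td file would move
the dividing line.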

-Krzysztof


--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation