x86-64 backend generates aligned ADDPS with unaligned address

x86-64 backend generates aligned ADDPS with unaligned address

Frank Winter
When I compile the attached IR with LLVM 3.6

llc -march=x86-64 -o f.S f.ll

it generates an aligned ADDPS with an unaligned address. See the attached
f.S; here is an extract:

         addq    $12, %r9         # 12 is not a multiple of 16, so (%r9)
                                  # is not 16-byte aligned for xmm0
         xorl    %esi, %esi
         .align  16, 0x90
.LBB0_1:                                # %loop2
                                         # =>This Inner Loop Header: Depth=1
         movq    offset_array3(,%rsi,8), %rdi
         movq    offset_array2(,%rsi,8), %r10
         movss   -28(%rax), %xmm0
         movss   -8(%rax), %xmm1
         movss   -4(%rax), %xmm2
         unpcklps        %xmm0, %xmm2    # xmm2 = xmm2[0],xmm0[0],xmm2[1],xmm0[1]
         movss   (%rax), %xmm0
         unpcklps        %xmm0, %xmm1    # xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
         unpcklps        %xmm2, %xmm1    # xmm1 = xmm1[0],xmm2[0],xmm1[1],xmm2[1]
         addps   (%r9), %xmm1          # the unaligned address is used by an
                                       # alignment-checking op: segfault


Frank


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: x86-64 backend generates aligned ADDPS with unaligned address

Reid Kleckner-2
This load instruction assumes the default ABI alignment for the <4 x float> type, which is 16:
  %15 = load <4 x float>* %14

You can set the alignment of loads to something lower than 16 in your frontend, and this will make LLVM use movups instructions:
  %15 = load <4 x float>* %14, align 4

If some LLVM mid-level pass is introducing this load without proving that the vector is 16-byte aligned, then that's a bug.

On Wed, Jul 29, 2015 at 1:02 PM, Frank Winter <[hidden email]> wrote:




Re: x86-64 backend generates aligned ADDPS with unaligned address

Frank Winter
No, I generated this IR myself. So I need to emit it with explicit
alignment info whenever the pointers are not aligned to the default ABI
alignment. I wasn't aware of this. Thanks!

Frank


On 07/29/2015 04:54 PM, Reid Kleckner wrote:


