rotate

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

rotate

Reed Kotler
in C or C++, how can I get clang/llvm to try and do a "rotate".

(want to test this code in the mips16 port)

i.e. emit rotr node.

tia.

reed

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Michael Gottesman
I can get clang/llvm to emit a rotate instruction on x86-64 when compiling C by just using -Os and the rotate from Hacker's Delight i.e.,

======
#include <stdlib.h>
#include <stdint.h>

uint32_t ror(uint32_t input, size_t rot_bits)
{
  return (input >> rot_bits) | (input << ((sizeof(input) << 3) - rot_bits));
}
======

Then compile with (assuming you are on OS X):
======
ISYSROOT=$(xcodebuild -sdk macosx -version PlatformPath)/Developer/SDKs/MacOSX10.8.sdk
$(xcrun -find clang) -isysroot $ISYSROOT ror.c -c -S -Os -o -
======

yielding an assembly output of:
======
        .section __TEXT,__text,regular,pure_instructions
        .globl _rotr
_rotr:                                  ## @rotr
        .cfi_startproc
## BB#0:
        pushq %rbp
Ltmp2:
        .cfi_def_cfa_offset 16
Ltmp3:
        .cfi_offset %rbp, -16
        movq %rsp, %rbp
Ltmp4:
        .cfi_def_cfa_register %rbp
        movb %sil, %cl
        rorl %cl, %edi                              <==== Rotate instruction
        movl %edi, %eax
        popq %rbp
        ret
        .cfi_endproc
.subsections_via_symbols
======

I hope this helps.

Michael

On Jul 28, 2012, at 8:29 PM, reed kotler <[hidden email]> wrote:

> in C or C++, how can I get clang/llvm to try and do a "rotate".
>
> (want to test this code in the mips16 port)
>
> i.e. emit rotr node.
>
> tia.
>
> reed
>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Reed Kotler
Nice!

Clever compiler..


On 07/28/2012 08:55 PM, Michael Gottesman wrote:

> I can get clang/llvm to emit a rotate instruction on x86-64 when compiling C by just using -Os and the rotate from Hacker's Delight i.e.,
>
> ======
> #include<stdlib.h>
> #include<stdint.h>
>
> uint32_t ror(uint32_t input, size_t rot_bits)
> {
>    return (input>>  rot_bits) | (input<<  ((sizeof(input)<<  3) - rot_bits));
> }
> ======
>
> Then compile with (assuming you are on OS X):
> ======
> ISYSROOT=$(xcodebuild -sdk macosx -version PlatformPath)/Developer/SDKs/MacOSX10.8.sdk
> $(xcrun -find clang) -isysroot $ISYSROOT ror.c -c -S -Os -o -
> ======
>
> yielding an assembly output of:
> ======
> .section __TEXT,__text,regular,pure_instructions
> .globl _rotr
> _rotr:                                  ## @rotr
> .cfi_startproc
> ## BB#0:
> pushq %rbp
> Ltmp2:
> .cfi_def_cfa_offset 16
> Ltmp3:
> .cfi_offset %rbp, -16
> movq %rsp, %rbp
> Ltmp4:
> .cfi_def_cfa_register %rbp
> movb %sil, %cl
> rorl %cl, %edi<==== Rotate instruction
> movl %edi, %eax
> popq %rbp
> ret
> .cfi_endproc
> .subsections_via_symbols
> ======
>
> I hope this helps.
>
> Michael
>
> On Jul 28, 2012, at 8:29 PM, reed kotler<[hidden email]>  wrote:
>
>> in C or C++, how can I get clang/llvm to try and do a "rotate".
>>
>> (want to test this code in the mips16 port)
>>
>> i.e. emit rotr node.
>>
>> tia.
>>
>> reed
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> [hidden email]         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Michael Gottesman
*NOTE* IIRC compiling this with -O0 on x86-64 can yield the wrong result since clang will emit shifts and on intel shifts are mod the register size:

=====
        .section __TEXT,__text,regular,pure_instructions
        .globl _ror
        .align 4, 0x90
_ror:                                   ## @ror
        .cfi_startproc
## BB#0:
        pushq %rbp
Ltmp2:
        .cfi_def_cfa_offset 16
Ltmp3:
        .cfi_offset %rbp, -16
        movq %rsp, %rbp
Ltmp4:
        .cfi_def_cfa_register %rbp
        movl %edi, -4(%rbp)
        movq %rsi, -16(%rbp)
        movl -4(%rbp), %edi
        movq -16(%rbp), %rsi
        movl %esi, %eax
        movl %eax, %ecx
                                        ## kill: CL<def> ECX<kill>
        shrl %cl, %edi
        movl -4(%rbp), %eax
        movabsq $32, %rsi
        subq -16(%rbp), %rsi
        movl %esi, %edx
        movl %edx, %ecx
                                        ## kill: CL<def> ECX<kill>
        shll %cl, %eax
        orl %eax, %edi
        movl %edi, %eax
        popq %rbp
        ret
        .cfi_endproc


.subsections_via_symbols
=====

Michael

On Jul 28, 2012, at 9:04 PM, reed kotler <[hidden email]> wrote:

> Nice!
>
> Clever compiler..
>
>
> On 07/28/2012 08:55 PM, Michael Gottesman wrote:
>> I can get clang/llvm to emit a rotate instruction on x86-64 when compiling C by just using -Os and the rotate from Hacker's Delight i.e.,
>>
>> ======
>> #include<stdlib.h>
>> #include<stdint.h>
>>
>> uint32_t ror(uint32_t input, size_t rot_bits)
>> {
>>   return (input>>  rot_bits) | (input<<  ((sizeof(input)<<  3) - rot_bits));
>> }
>> ======
>>
>> Then compile with (assuming you are on OS X):
>> ======
>> ISYSROOT=$(xcodebuild -sdk macosx -version PlatformPath)/Developer/SDKs/MacOSX10.8.sdk
>> $(xcrun -find clang) -isysroot $ISYSROOT ror.c -c -S -Os -o -
>> ======
>>
>> yielding an assembly output of:
>> ======
>> .section __TEXT,__text,regular,pure_instructions
>> .globl _rotr
>> _rotr:                                  ## @rotr
>> .cfi_startproc
>> ## BB#0:
>> pushq %rbp
>> Ltmp2:
>> .cfi_def_cfa_offset 16
>> Ltmp3:
>> .cfi_offset %rbp, -16
>> movq %rsp, %rbp
>> Ltmp4:
>> .cfi_def_cfa_register %rbp
>> movb %sil, %cl
>> rorl %cl, %edi<==== Rotate instruction
>> movl %edi, %eax
>> popq %rbp
>> ret
>> .cfi_endproc
>> .subsections_via_symbols
>> ======
>>
>> I hope this helps.
>>
>> Michael
>>
>> On Jul 28, 2012, at 8:29 PM, reed kotler<[hidden email]>  wrote:
>>
>>> in C or C++, how can I get clang/llvm to try and do a "rotate".
>>>
>>> (want to test this code in the mips16 port)
>>>
>>> i.e. emit rotr node.
>>>
>>> tia.
>>>
>>> reed
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> [hidden email]         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Andy Gibbs
> *NOTE* IIRC compiling this with -O0 on x86-64 can yield the wrong result
> since clang will emit shifts and on intel shifts are mod the register
> size [...snip...]

I remember finding the same thing (although I haven't tried it on a recent
clang version) and what I wondered was whether there was mileage in having
an explicit intrinsic for rotation (like there is for bit counting, as in
__builtin_clz and __builtin_ctz and so on).  Microsoft's compiler has an
explicit intrinsic in the form of _rotl8 and _rotl16 (IIRC -- this is from
memory!).  It would be nice to have a __builtin_rotl family in clang, in
my opinion, but it would need back-end support from llvm.  I would expect
it to find use in code relating to hashing and cryptology.  I know that
the compiler *should* optimise

uint32_t ror(uint32_t input, size_t rot_bits) {
  return (input >> rot_bits) | (input << ((sizeof(input) << 3) - rot_bits));
}

and such macros / inline functions are common, but an intrinsic specifies
intent better and could provide for better code optimisation.

Would this be worthwhile pursuing?

Cheers
Andy

 

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Michael Gottesman
After some reflection, I believe the problem was actually with shld/shrd and a similar
bit of C code.

On Jul 29, 2012, at 1:02 PM, Andy Gibbs <[hidden email]> wrote:

>> *NOTE* IIRC compiling this with -O0 on x86-64 can yield the wrong result
>> since clang will emit shifts and on intel shifts are mod the register
>> size [...snip...]
>
> I remember finding the same thing (although I haven't tried it on a recent
> clang version) and what I wondered was whether there was mileage in having
> an explicit intrinsic for rotation (like there is for bit counting, as in
> __builtin_clz and __builtin_ctz and so on).  Microsoft's compiler has an
> explicit intrinsic in the form of _rotl8 and _rotl16 (IIRC -- this is from
> memory!).  It would be nice to have a __builtin_rotl family in clang, in
> my opinion, but it would need back-end support from llvm.  I would expect
> it to find use in code relating to hashing and cryptology.  I know that
> the compiler *should* optimise
>
> uint32_t ror(uint32_t input, size_t rot_bits) {
> return (input >> rot_bits) | (input << ((sizeof(input) << 3) - rot_bits));
> }
>
> and such macros / inline functions are common, but an intrinsic specifies
> intent better and could provide for better code optimisation.
>
> Would this be worthwhile pursuing?
>
> Cheers
> Andy
>
>

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Cameron McInally
In reply to this post by Andy Gibbs
Hey Andy,

I proposed a similar patch to LLVM (left circular shift) around 10/2011. Parts of my patch did make it into trunk about a year after, but others did not. 

At that time, my solution was to add a binary operator to the IRBuilder, since LCS fits in nicely with the other shift operators. But, it is quite cumbersome to merge :*(. I would be happy to resend the original patch if you'd like.

-Cameron

On Sun, Jul 29, 2012 at 4:02 PM, Andy Gibbs <[hidden email]> wrote:
... 
 It would be nice to have a __builtin_rotl family in clang, in
my opinion, but it would need back-end support from llvm. 

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Andy Gibbs
On Monday, July 30, 2012 12:16 AM, Cameron McInally wrote:

> Hey Andy,
>
> I proposed a similar patch to LLVM (left circular shift) around 10/2011.
> Parts of my patch did make it into trunk about a year after, but others
> did not.
>
> At that time, my solution was to add a binary operator to the IRBuilder,
> since LCS fits in nicely with the other shift operators. But, it is quite
> cumbersome to merge :*(. I would be happy to resend the original patch
> if you'd like.

Yes, I would be interested.  Thank you!  I don't know if was rejected before
that I'll have any better luck this time, but I can try...

Andy


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Cameron McInally
Oh, no. I should have been more clear. The patch was not rejected, just lost in the daily shuffle.

I already have my employer's approval to send this upstream, so I will prepare a patch against trunk this morning.

> I proposed a similar patch to LLVM (left circular shift) around 10/2011.
> Parts of my patch did make it into trunk about a year after, but others
> did not.

And now that I reread this... it should have been "month after", not "year after". 

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Joerg Sonnenberger
In reply to this post by Michael Gottesman
On Sat, Jul 28, 2012 at 09:11:13PM -0700, Michael Gottesman wrote:
> *NOTE* IIRC compiling this with -O0 on x86-64 can yield the wrong
> result since clang will emit shifts and on intel shifts are mod the
> register size:

The original code is depending on UB, if it doesn't mask.

Joerg
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Cameron McInally
In reply to this post by Cameron McInally
Andy,

Here is the left circular shift operator patch. I apologize to the reviewer in advance. The patch has a good bit of fine detail. Any comments/criticisms?

Some caveats...

1) This is just the bare minimum needed to make the left circular shift operator work (e.g. no instruction combining).

2) I tried my best to select operator names in the existing style; please feel free to change them as appropriate.

Ty,
Cameron

On Tue, Jul 31, 2012 at 8:05 AM, Cameron McInally <[hidden email]> wrote:
Oh, no. I should have been more clear. The patch was not rejected, just lost in the daily shuffle.

I already have my employer's approval to send this upstream, so I will prepare a patch against trunk this morning.

> I proposed a similar patch to LLVM (left circular shift) around 10/2011.
> Parts of my patch did make it into trunk about a year after, but others
> did not.

And now that I reread this... it should have been "month after", not "year after". 


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

CShl.txt (19K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-commits] rotate

Eli Friedman-2
On Tue, Jul 31, 2012 at 8:42 AM, Cameron McInally
<[hidden email]> wrote:

> Andy,
>
> Here is the left circular shift operator patch. I apologize to the reviewer
> in advance. The patch has a good bit of fine detail. Any
> comments/criticisms?
>
> Some caveats...
>
> 1) This is just the bare minimum needed to make the left circular shift
> operator work (e.g. no instruction combining).
>
> 2) I tried my best to select operator names in the existing style; please
> feel free to change them as appropriate.

We intentionally haven't included a rotate instruction in LLVM in the
past; the justification is that it's generally straightforward for the
backend to form rotate operations, and making the optimizer
effectively handle the new rotation instruction adds a substantial
amount of complexity.  You're going to need to make a strong argument
that the current approach is insufficient if you want to commit a
patch like this.

-Eli
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-commits] rotate

Cameron McInally
Well, we like the operator because it maps cleanly to Fortran's ISHFTC intrinsic. There must also be other compilers out there, maybe catering to cryptos, that have their own intrinsic for circular shift in other languages. It seems wasteful for an optimizer to break apart an intrinsic into its elemental pieces in order for LLVM to put them back together. This was done in our compiler for some time and it cluttered up the interface code.

Just curious... what kind of optimizations are done on ISD::ROTL/ROTR? We're able to preform certain InstCombines and other peeps when we see a binary operator. I do not have any experience trying to optimize ISD::ROTL.

On Tue, Jul 31, 2012 at 12:17 PM, Eli Friedman <[hidden email]> wrote:
On Tue, Jul 31, 2012 at 8:42 AM, Cameron McInally
<[hidden email]> wrote:
> Andy,
>
> Here is the left circular shift operator patch. I apologize to the reviewer
> in advance. The patch has a good bit of fine detail. Any
> comments/criticisms?
>
> Some caveats...
>
> 1) This is just the bare minimum needed to make the left circular shift
> operator work (e.g. no instruction combining).
>
> 2) I tried my best to select operator names in the existing style; please
> feel free to change them as appropriate.

We intentionally haven't included a rotate instruction in LLVM in the
past; the justification is that it's generally straightforward for the
backend to form rotate operations, and making the optimizer
effectively handle the new rotation instruction adds a substantial
amount of complexity.  You're going to need to make a strong argument
that the current approach is insufficient if you want to commit a
patch like this.

-Eli


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-commits] rotate

Andy Gibbs
On Tuesday, July 31, 2012 7:01 PM, Cameron McInally wrote:
> [...snip...]
> There must also be other compilers out there, maybe catering to cryptos,
> that have their own intrinsic for circular shift [...snip...]

This is possibly inconsequent, but Microsoft's compiler for one has rotate
intrinsics... (just my 2 cents!)

Andy





_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-commits] rotate

Arnaud A. de Grandmaison
In reply to this post by Eli Friedman-2
On 07/31/2012 06:17 PM, Eli Friedman wrote:

> On Tue, Jul 31, 2012 at 8:42 AM, Cameron McInally
> <[hidden email]> wrote:
>> Andy,
>>
>> Here is the left circular shift operator patch. I apologize to the reviewer
>> in advance. The patch has a good bit of fine detail. Any
>> comments/criticisms?
>>
>> Some caveats...
>>
>> 1) This is just the bare minimum needed to make the left circular shift
>> operator work (e.g. no instruction combining).
>>
>> 2) I tried my best to select operator names in the existing style; please
>> feel free to change them as appropriate.
> We intentionally haven't included a rotate instruction in LLVM in the
> past; the justification is that it's generally straightforward for the
> backend to form rotate operations, and making the optimizer
> effectively handle the new rotation instruction adds a substantial
> amount of complexity.  You're going to need to make a strong argument
> that the current approach is insufficient if you want to commit a
> patch like this.
>
Well,

I believe something is currently broken with respect to forming rotate
instructions :

For example, using a recent clang/llvm on linux/x86_64 :

uint32_t ror32(uint32_t input, size_t rot_bits) {
  return (input >> rot_bits) | (input << ((sizeof(input) << 3) - rot_bits));
}

uint32_t rol32(uint32_t input, size_t rot_bits) {
  return (input << rot_bits) | (input >> ((sizeof(input) << 3) - rot_bits));
}

gives the expected ror and rol instructions, but their 16bits counter
parts :

uint16_t ror16(uint16_t input, size_t rot_bits) {
  return (input >> rot_bits) | (input << ((sizeof(input) << 3) - rot_bits));
}

uint16_t rol16(uint16_t input, size_t rot_bits) {
  return (input << rot_bits) | (input >> ((sizeof(input) << 3) - rot_bits));
}

fail miserably :

        .globl  ror16
        .align  16, 0x90
        .type   ror16,@function
ror16:                                  # @ror16
        .cfi_startproc
# BB#0:                                 # %entry
        movb    %sil, %cl
        movl    %edi, %eax
        shrl    %cl, %eax
        movl    $16, %ecx
        subl    %esi, %ecx
                                        # kill: CL<def> CL<kill> ECX<kill>
        shll    %cl, %edi
        orl     %eax, %edi
        movzwl  %di, %eax
        ret
.Ltmp2:
        .size   ror16, .Ltmp2-ror16
        .cfi_endproc

        .globl  rol16
        .align  16, 0x90
        .type   rol16,@function
rol16:                                  # @rol16
        .cfi_startproc
# BB#0:                                 # %entry
        movb    %sil, %cl
        movl    %edi, %eax
        shll    %cl, %eax
        movl    $16, %ecx
        subl    %esi, %ecx
                                        # kill: CL<def> CL<kill> ECX<kill>
        shrl    %cl, %edi
        orl     %eax, %edi
        movzwl  %di, %eax
        ret
.Ltmp3:
        .size   rol16, .Ltmp3-rol16
        .cfi_endproc

At a quick first glance, this seems to be related to the values being
promoted from i16 to i32 in the IR optimization passes, but this may not
be the only reason.

--
Arnaud de Grandmaison

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Chris Lattner-2
In reply to this post by Andy Gibbs

On Jul 31, 2012, at 3:04 AM, Andy Gibbs <[hidden email]> wrote:

> On Monday, July 30, 2012 12:16 AM, Cameron McInally wrote:
>> Hey Andy,
>>
>> I proposed a similar patch to LLVM (left circular shift) around 10/2011.
>> Parts of my patch did make it into trunk about a year after, but others
>> did not.
>>
>> At that time, my solution was to add a binary operator to the IRBuilder,
>> since LCS fits in nicely with the other shift operators. But, it is quite
>> cumbersome to merge :*(. I would be happy to resend the original patch
>> if you'd like.
>
> Yes, I would be interested.  Thank you!  I don't know if was rejected before
> that I'll have any better luck this time, but I can try…

Why do we need a new builtin for rotates?  What problem does it solve?

-Chris
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: rotate

Cameron McInally
The initial inspiration for my local mod was to create a way for our optimizer to get at ISD::ROTL directly. This would allow us to always produce an x86 rol instruction for a user intrinsic call, even at -O0.

I suppose an LLVM intrinsic would do the same thing. But, it is nice to have an operator so LLVM can optimize common cases, like cshift by a constant. I personally feel that it fits in with LSHL, LSHR, and ASHR.

That being said, I'd be just as happy with any other way to convey the same functionality. It just seems less nice to have to unfold a circular shift intrinsic into the Hacker's Delight algorithm (5 or more operations) at our opt/LLVM interface, just to have LLVM try to put it back together.

On Tue, Jul 31, 2012 at 1:58 PM, Chris Lattner <[hidden email]> wrote:
...

Why do we need a new builtin for rotates?  What problem does it solve?

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev