[llvm-dev] About CodeGen quality


ORiordan, Martin via llvm-dev
Hi All,

  Is there a known issue with LLVM generating poor code for certain language constructs, such as C bitfields?
Our custom backend generates inefficient code for bitfield accesses, so I am wondering where
I should look first.

  Thanks.

Regards,
chenwj

--
Wei-Ren Chen (陳韋任)
Homepage: https://people.cs.nctu.edu.tw/~chenwj

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
It would probably help if you explained which backend you are working on (assuming it's a publicly available one). An example with source that anyone can compile, along with the generated "bad code" and what you would expect to see as "good code", would also help a lot.

From what I've seen, LLVM is not noticeably worse (or better) than other compilers here. But it's not an area that I've spent a lot of time on, and the outcome is determined by the combination of generic LLVM operations and the target implementation. There are lots of clever tricks one can do at the machine-code level that LLVM can't "know" in a generic way, since they depend on specific instructions. Most of my experience comes from x86 and ARM, both of which are fairly well-established architectures with a good number of people supporting the code-gen part. If you are using a different target, it may be missing target-specific optimisations that the compiler could otherwise do.

I probably can't help much myself; I'm just trying to help you make the question as clear as possible, so that those who may be able to help have enough information to work with.

--
Mats

Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
Hi Mats,

  It's a private backend, so I will try to describe what I am dealing with.

    struct S {
      unsigned int a : 8;
      unsigned int b : 8;
      unsigned int c : 8;
      unsigned int d : 8;

      unsigned int e;
    };

Suppose we want to read S->b. The size of struct S is 64 bits, and LLVM seems to treat it as an i64.
Below is the IR corresponding to the read of S->b, IIRC.

    %0 = load i64, i64* %ptr, align 4
    %1 = lshr i64 %0, 8
    %2 = and i64 %1, 255
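
To make the three IR instructions above concrete, here is a minimal standalone C++ sketch of the same computation. It is an illustration only: the function name read_b_wide is made up, and the sketch assumes a little-endian target whose ABI allocates bitfields starting from the least significant bit (as the common x86-64 and AArch64 ABIs do). Bitfield layout is implementation-defined, so other targets may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Same layout as the struct S in the post.
struct S {
  unsigned int a : 8;
  unsigned int b : 8;
  unsigned int c : 8;
  unsigned int d : 8;

  unsigned int e;
};

static_assert(sizeof(S) == 8, "sketch assumes the struct occupies 64 bits");

// Hypothetical helper mirroring the IR: load the whole struct as one
// 64-bit value, shift right by 8, then mask with 255.
unsigned read_b_wide(const S *s) {
  std::uint64_t bits;
  std::memcpy(&bits, s, sizeof bits);           // the i64 load
  std::uint64_t shifted = bits >> 8;            // the lshr by 8
  return static_cast<unsigned>(shifted & 255);  // the and with 255
}
```

Under those assumptions, read_b_wide(&s) returns the same value as s.b for any s.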

Our target doesn't support i64 loads, so we have the following code in XXXISelLowering.cpp:

    setOperationAction(ISD::LOAD, MVT::i64, Custom);
  
This transforms the i64 load into a v2i32 load during type legalization. During operation legalization,
the v2i32 load is found to be misaligned (4 vs. 8 bytes), so stack load/store instructions are generated. This is one problem.

Besides that, our target has bitset/bitextract instructions, and we would like to use them for
bitfield accesses too, but we don't know how to do that.

Thanks.

Regards,
chenwj


Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
I understand the problem, but I can't offer any useful help. Most likely you need to add some code to help instruction selection, or something along those lines, but it's not an area that I'm familiar with.

--
Mats

Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
I may be out to lunch here but this sounds like something that SROA converts into an i64 load. I wonder if disabling it produces IR that is easier for your target to handle. 
Of course, this isn't to say that simply disabling SROA is a viable solution, but it may give you some ideas as to where to go in terms of looking for a solution. 

You also may be able to combine such patterns in the SDAG (before legalization) into loads that your target can handle.

This is all kind of speculative but hopefully it sheds some light on what might be going on. 

Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
On 6/15/2017 4:06 AM, 陳韋任 via llvm-dev wrote:
> Suppose we want to read S->b. The size of struct S is 64 bits, and LLVM seems to treat it as an i64.
> Below is the IR corresponding to the read of S->b, IIRC.
>
>     %0 = load i64, i64* %ptr, align 4
>     %1 = lshr i64 %0, 8
>     %2 = and i64 %1, 255

This looks fine.


> Our target doesn't support i64 loads, so we have the following code in XXXISelLowering.cpp:
>
>     setOperationAction(ISD::LOAD, MVT::i64, Custom);
>
> This transforms the i64 load into a v2i32 load during type legalization.

If misaligned load v2i32 isn't legal, don't generate it.  If it is legal, you might need to mess with your implementation of allowsMisalignedMemoryAccesses.

> Besides that, our target has bitset/bitextract instructions, and we would like to use them for
> bitfield accesses too, but we don't know how to do that.

This is generally implemented by pattern-matching the shift and mask operations.  ARM has instructions like this if you're looking for inspiration; look for UBFX, SBFX and BFI.
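
As a rough illustration of what "pattern-matching the shift and mask operations" means, here is a standalone C++ sketch that recognizes `(x >> shift) & mask` as an unsigned bitfield extract. It does not use any LLVM API; in a real backend the same check would be made on the shift and AND nodes during instruction selection. The names BitExtract, matchBitExtract and applyBitExtract are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of an unsigned bitfield extract: take `width`
// bits starting at bit `lsb` (the operands a UBFX-style instruction needs).
struct BitExtract {
  unsigned lsb;
  unsigned width;
};

// Recognize `(x >> shift) & mask` on a 64-bit value as a bitfield extract.
// Matches only when `mask` is a contiguous run of ones starting at bit 0
// (0x1, 0xff, 0xffff, ...) and the field lies entirely inside the value.
bool matchBitExtract(unsigned shift, std::uint64_t mask, BitExtract &out) {
  if (shift >= 64 || mask == 0)
    return false;
  if ((mask & (mask + 1)) != 0)  // mask is not of the form 2^n - 1
    return false;
  unsigned width = 0;
  for (std::uint64_t m = mask; m != 0; m >>= 1)
    ++width;                     // count the ones in the run
  if (shift + width > 64)
    return false;
  out.lsb = shift;
  out.width = width;
  return true;
}

// Reference semantics of the extract, used to check a match.
std::uint64_t applyBitExtract(std::uint64_t x, BitExtract be) {
  std::uint64_t field = x >> be.lsb;
  if (be.width == 64)
    return field;
  return field & ((std::uint64_t{1} << be.width) - 1);
}
```

For the S->b read discussed in this thread, the shift is 8 and the mask is 255, which this sketch matches as an extract with lsb 8 and width 8. A mask such as 0xF0 is rejected because its ones do not start at bit 0.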

-Eli
-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project


Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
>> Our target doesn't support i64 loads, so we have the following code in XXXISelLowering.cpp:
>>
>>     setOperationAction(ISD::LOAD, MVT::i64, Custom);
>>
>> This transforms the i64 load into a v2i32 load during type legalization.
>
> If misaligned load v2i32 isn't legal, don't generate it.  If it is legal, you might need to mess with your implementation of allowsMisalignedMemoryAccesses.

Will check that. Just a little more explanation about the misaligned part: we declare i64 as 8-byte aligned in the DataLayout, but in "%0 = load i64, i64* %ptr, align 4" the alignment is 4. In the operation legalization stage, it goes through

    SelectionDAGLegalize::LegalizeLoadOps -> TargetLowering::expandUnalignedLoad

We didn't expect an i64 load to be only 4-byte aligned, so how can I know beforehand that I would be generating a misaligned v2i32 load? Another question: what is the usual way to handle an i64 load when it is not natively supported? Is it correct to transform the i64 load into a v2i32 load? An example from an existing backend would be great.

>> Besides that, our target has bitset/bitextract instructions, and we would like to use them for
>> bitfield accesses too, but we don't know how to do that.
>
> This is generally implemented by pattern-matching the shift and mask operations.  ARM has instructions like this if you're looking for inspiration; look for UBFX, SBFX and BFI.

Thanks. Having an example is good. :-)

Regards,
chenwj



Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
On 6/15/2017 1:37 PM, 陳韋任 wrote:
> Will check that. Just a little more explanation about the misaligned part: we declare i64 as 8-byte aligned in the DataLayout, but in "%0 = load i64, i64* %ptr, align 4" the alignment is 4. In the operation legalization stage, it goes through
>
>     SelectionDAGLegalize::LegalizeLoadOps -> TargetLowering::expandUnalignedLoad
>
> We didn't expect an i64 load to be only 4-byte aligned, so how can I know beforehand that I would be generating a misaligned v2i32 load? Another question: what is the usual way to handle an i64 load when it is not natively supported? Is it correct to transform the i64 load into a v2i32 load? An example from an existing backend would be great.

You can get the alignment by casting the SDNode to LoadSDNode, then calling getAlignment().

I think all in-tree backends which don't have 64-bit integer registers use the default expansion for an i64 load, which splits the load into two i32 loads.
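
To show what splitting an i64 load into two i32 loads amounts to, here is a small standalone C++ sketch. It is not LLVM code and the function names are invented for illustration. It assumes a little-endian target, where the low 32 bits live at offset 0 and the high 32 bits at offset 4, and it only needs 4-byte alignment for each access.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Load one 32-bit word; memcpy keeps this well-defined for any alignment.
static std::uint32_t load32(const unsigned char *p) {
  std::uint32_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// A 64-bit load expressed as two 32-bit loads plus a recombine.
// Little-endian assumption: low half at offset 0, high half at offset 4.
std::uint64_t load64_split(const unsigned char *p) {
  std::uint64_t lo = load32(p);
  std::uint64_t hi = load32(p + 4);
  return (hi << 32) | lo;
}

// For the bitfield read in this thread, only the low word is needed:
// bits 8..15 of the 64-bit value are bits 8..15 of the low 32-bit half.
std::uint32_t read_b_narrow(const unsigned char *p) {
  return (load32(p) >> 8) & 255u;
}
```

Once the load is split this way, the high half is unused by the S->b read, so only a single 32-bit load followed by the shift and mask is actually required.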

-Eli

Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
Forgot to reply to all

Hi Eli


>> Suppose we want to read S->b. The size of struct S is 64 bits, and LLVM seems to treat it as an i64.
>> Below is the IR corresponding to the read of S->b, IIRC.
>>
>>     %0 = load i64, i64* %ptr, align 4
>>     %1 = lshr i64 %0, 8
>>     %2 = and i64 %1, 255
>
> This looks fine.

Why can't we expect InstCombine to simplify this to an 8-bit load, assuming each of %0 and %1 has only one use?

Thanks
Ehsan





Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
Here is the complete IR I am dealing with, in case it helps the discussion.

-------------------------------------------------------------

    %struct.A = type { %struct.Z*, %struct.Z*, %struct.Z*, %struct.Z*, %struct.Z* }
    %struct.Z = type { %union.X, [180 x %union.Y] }
    %union.X = type { %struct.anon }
    %struct.anon = type { i64 }
    %union.Y = type { %struct.anon.0 }
    %struct.anon.0 = type { i32 }
    %struct.D = type { i64 }

    ; Function Attrs: norecurse nounwind
    define void @func(%struct.A* noalias nocapture readonly %a, %struct.D* noalias nocapture readonly %d) local_unnamed_addr #0 {
    entry:
      %a2 = getelementptr inbounds %struct.A, %struct.A* %a, i32 0, i32 1
      %0 = load %struct.Z*, %struct.Z** %a2, align 4, !tbaa !1
      %1 = getelementptr inbounds %struct.D, %struct.D* %d, i32 0, i32 0
      %bf.load = load i64, i64* %1, align 4
      %bf.lshr = lshr i64 %bf.load, 8
      %2 = trunc i64 %bf.lshr to i32
      %bf.cast = and i32 %2, 255
      %3 = getelementptr inbounds %struct.Z, %struct.Z* %0, i32 0, i32 1, i32 %bf.cast, i32 0, i32 0
      %bf.load1 = load i32, i32* %3, align 4
      %bf.clear2 = and i32 %bf.load1, 65535
      store i32 %bf.clear2, i32* %3, align 4
      ret void
    }

-------------------------------------------------------------


Regards,
chenwj


Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
On 6/15/2017 11:13 PM, Ehsan Amiri wrote:

>>>     %0 = load i64, i64* %ptr, align 4
>>>     %1 = lshr i64 %0, 8
>>>     %2 = and i64 %1, 255
>>
>> This looks fine.
>
> Why can't we expect InstCombine to simplify this to an 8-bit load, assuming each of %0 and %1 has only one use?


We don't aggressively narrow loads and stores in IR because it tends to block other optimizations.  See https://reviews.llvm.org/D30416.

-Eli


Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
I guess it tends not to block cross block optimization opportunity, or it just happen

Regards,
chenwj

Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
For this specific case, modifying the source code (or the frontend) to use plain struct members instead of
bitfields seems an easy fix, since all the bitfields are 8 bits wide. Or is that not an option for you?
For the general case, you may want to enhance your backend to emit bitextract/bitset.
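
A minimal sketch of the source-level rewrite suggested above, assuming the fields really are all 8 bits wide. The struct name SBytes and the function read_b are hypothetical. The size check relies on a typical ABI where both structs occupy 8 bytes, and the field-for-field correspondence additionally assumes a little-endian target that allocates bitfields from the least significant bit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The bitfield layout from the original post.
struct S {
  unsigned int a : 8;
  unsigned int b : 8;
  unsigned int c : 8;
  unsigned int d : 8;

  unsigned int e;
};

// Hypothetical rewrite using ordinary byte-sized members.
struct SBytes {
  std::uint8_t a, b, c, d;
  unsigned int e;
};

static_assert(sizeof(S) == sizeof(SBytes), "sketch assumes identical size");

// Reading b is now a plain one-byte access with no shift or mask.
unsigned read_b(const SBytes *s) { return s->b; }
```

With byte-sized members the compiler can address each field directly, so no 64-bit load or shift-and-mask sequence is needed for this particular struct.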

-----
Hiroshi Inoue <[hidden email]>
IBM Research - Tokyo


"llvm-dev" <[hidden email]> wrote on 2017/06/15 20:06:49:

> Hi Mats,
>
>   It's a private backend, so I will try to describe what I am dealing with.
>
>     struct S {
>       unsigned int a : 8;
>       unsigned int b : 8;
>       unsigned int c : 8;
>       unsigned int d : 8;
>
>       unsigned int e;
>     };

Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev


2017-06-17 3:19 GMT+08:00 Hiroshi 7 Inoue <[hidden email]>:

> For this specific case, modifying the source code (or the frontend) to use plain struct members instead of
> bitfields seems an easy fix, since all the bitfields are 8 bits wide. Or is that not an option for you?
> For the general case, you may want to enhance your backend to emit bitextract/bitset.

The test case shown here was modified from the original; the bitfields are not always 8 bits wide.
And sure, emitting bitextract/bitset would improve the code quality.

Regards,
chenwj


Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
On 6/16/2017 11:44 AM, 陳韋任 wrote:
> I guess it tends not to block cross block optimization opportunity, or it just happen
>
> Regards,
> chenwj
>
> 2017-06-17 1:49 GMT+08:00 Friedman, Eli via llvm-dev <[hidden email]>:
>> We don't aggressively narrow loads and stores in IR because it tends to block other optimizations.  See https://reviews.llvm.org/D30416.


I guess my previous reply was a bit too terse.

We don't want to narrow loads early in the optimization pipeline because it tends to hide relationships between bitfields.  In particular, overlapping loads/stores are more difficult to optimize.  We want EarlyCSE/GVN/DSE/etc. to see the full-width loads and stores to make them more effective.
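
One way to picture why full-width accesses help is the following standalone C++ sketch. It is an illustration under assumptions, not a description of what any particular LLVM pass does: the function names are invented, and the 32-bit word with field a in bits 0..7 and field b in bits 8..15 follows the layout assumed elsewhere in this thread. When both bitfield stores are written as full-width read-modify-write sequences on the same address, the second load can be forwarded from the first store and the first store becomes dead, leaving a single combined update.

```cpp
#include <cassert>
#include <cstdint>

// s.a = x; s.b = y; written as two full-width read-modify-write sequences.
void store_a_then_b(std::uint32_t *word, std::uint32_t x, std::uint32_t y) {
  std::uint32_t w = *word;                  // load the whole word
  w = (w & ~0xffu) | (x & 0xffu);           // insert a into bits 0..7
  *word = w;                                // store the whole word
  w = *word;                                // load again, same address and width
  w = (w & ~0xff00u) | ((y & 0xffu) << 8);  // insert b into bits 8..15
  *word = w;                                // store the whole word
}

// What remains after forwarding the second load from the first store and
// dropping the first store: one load, one combined insert, one store.
void store_a_then_b_merged(std::uint32_t *word, std::uint32_t x,
                           std::uint32_t y) {
  std::uint32_t w = *word;
  w = (w & ~0xffffu) | (x & 0xffu) | ((y & 0xffu) << 8);
  *word = w;
}
```

Both functions leave the same value in memory. Because every access in the first version has the same address and width, the relationship between them is easy to see; with a mix of narrowed and full-width accesses, the partial overlaps would have to be reasoned about first.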

Currently, we do narrowing in DAGCombine.  D30416 is a proposal to move that slightly earlier, to the late stages of the IR optimization pipeline, so we can avoid the limitations of SelectionDAG.

If you read the whole discussion on that patch, it goes into more detail about this.

-Eli

Re: [llvm-dev] About CodeGen quality

ORiordan, Martin via llvm-dev
> If you read the whole discussion on that patch, it goes into more detail about this.

Thanks. I was only reading the summary of the patch. :-)

Regards,
chenwj
