Alias analysis and instruction level parallelism


Alias analysis and instruction level parallelism

Pertti Kellomäki-2
I am pretty excited about the recent activity on dependence
analysis. The only remaining problem from our point of view
is how to get the alias information to the back end instruction
scheduler. If I understand things correctly, the alias information
basically gets lost in the process of lowering to target
instructions.

We are interested in the DSP domain, so we really need to get
SIMD style parallelism to work, and this needs alias information.
In one of the earlier threads on alias analysis someone commented
that preserving the alias information would not really be that
difficult, but possibly tedious.

My initial reaction is that if one were to decorate MachineInstr's
with the LLVM level pointer values that they use for reading
and writing memory, then one should be able to use those
values and the AliasAnalysis interface to query dependences
between MachineInstr's. I am not intimately familiar with
how the lowering is done, so if there are some obvious
problems with this approach, please let me know.
--
Pertti
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: Alias analysis and instruction level parallelism

Duncan Sands
Hi,

> My initial reaction is that if one were to decorate MachineInstr's
> with the LLVM level pointer values that they use for reading
> and writing memory,

this is already the case: SrcValue and SVOffset.

Ciao,

Duncan.

Re: Alias analysis and instruction level parallelism

Pertti Kellomäki-2
Duncan Sands wrote:
>> My initial reaction is that if one were to decorate MachineInstr's
>> with the LLVM level pointer values that they use for reading
>> and writing memory,
>
> this is already the case: SrcValue and SVOffset.

Ah, that's right. I went back and read the discussion from
January, and Florian Brandner explains there that the real
culprit is the lowering of GEP in the codegen prepare pass.

Florian, if you have worked more on this, we would be very interested
in your work. Serialization caused by memory references is a big
obstacle for us at the moment.
--
Pertti

Re: Alias analysis and instruction level parallelism

Florian Brandner-3
On Wednesday 02 April 2008 11:54:26 Pertti Kellomäki wrote:
> Ah, that's right. I went back and read the discussion from
> January, and Florian Brandner explains there that the real
> culprit is the lowering of GEP in the codegen prepare pass.

Codegenprepare rewrites GEP instructions into a sequence of PtrToInt casts,
additions, and multiplies, followed by a final IntToPtr cast. The basic alias
analysis pass does not analyse pointers past an IntToPtr cast, and thus returns
"may alias".

I have tried to overcome this. Basically, you only need to find the base object
of the original GEP, but this is possible only in some simple cases. If
pointer arithmetic is involved, it is hard to distinguish the GEP base from
other calculations. You might even end up analyzing pointer calculations
that did not originate from a GEP.

Another option would be to save the GEP instructions and provide a mapping
between them and the lowered arithmetic. It's not very clean, but it would be
safe and would probably work well.

florian




Re: Alias analysis and instruction level parallelism

Pertti Kellomäki-2
Florian Brandner wrote:
> Another option would be to save the GEP instructions and provide a mapping
> between them and the lowered arithmetic. It's not very clean, but it would be
> safe and would probably work well.

In my opinion this seems like the obvious way to go. The information
is already there, so reverse engineering it back does not sound too
clever. Of course, if the reverse engineering provided useful
information about pointers that do not originate from GEP instructions,
then it might be worthwhile. But my feeling is that, at least for our
purposes, GEPs really are the interesting case. My understanding is
that parallelizing C compilers try to convert pointer arithmetic back
into array accesses in order to make sense of it, essentially turning
it into GEPs. Certainly the dependence literature is heavily array
based.
--
Pertti

Re: Alias analysis and instruction level parallelism

Devang Patel
In reply to this post by Pertti Kellomäki-2

On Apr 2, 2008, at 1:14 AM, Pertti Kellomäki wrote:

> I am pretty excited about the recent activity on dependence
> analysis. The only remaining problem from our point of view
> is how to get the alias information to the back end instruction
> scheduler. If I understand things correctly, the alias information
> basically gets lost in the process of lowering to target
> instructions.
>
> We are interested in the DSP domain, so we really need to get
> SIMD style parallelism to work, and this needs alias information.

If you handle this at the LLVM IR level, then you have access to all the
alias info, and GEP instructions are directly available to you. LLVM IR
supports vector types for SIMD-style parallelism, and target-specific
code generators lower them appropriately.

-
Devang

>
> In one of the earlier threads on alias analysis someone commented
> that preserving the alias information would not really be that
> difficult, but possibly tedious.
>
>
> My initial reaction is that if one were to decorate MachineInstr's
> with the LLVM level pointer values that they use for reading
> and writing memory, then one should be able to use those
> values and the AliasAnalysis interface to query dependences
> between MachineInstr's. I am not intimately familiar with
> how the lowering is done, so if there are some obvious
> problems with this approach, please let me know.
> --
> Pertti



Re: Alias analysis and instruction level parallelism

Dan Gohman-2
In reply to this post by Pertti Kellomäki-2

On Apr 3, 2008, at 12:31 AM, Pertti Kellomäki wrote:
> Florian Brandner wrote:
>> Another option would be to save the GEP instructions and provide a
>> mapping between them and the lowered arithmetic. It's not very clean,
>> but it would be safe and would probably work well.
>
> In my opinion this seems like the obvious way to go.

I think this is trickier than it sounds; the reason GEPs are lowered is to
allow strength-reduction and other things to do transformations on them.
It would require those passes to know how to update the mapping.

Dan



Re: Alias analysis and instruction level parallelism

Pertti Kellomäki-2
In reply to this post by Devang Patel
Devang Patel wrote:
> If you handle this at the LLVM IR level, then you have access to all the
> alias info, and GEP instructions are directly available to you. LLVM IR
> supports vector types for SIMD-style parallelism, and target-specific
> code generators lower them appropriately.

Our target is a statically scheduled VLIW style processor,
which may have custom FUs designed by the user. This means
that the instruction scheduler needs to have very detailed
knowledge of the available FUs, latencies, the interconnection
network etc.  The main problem is not exploiting some set of
vector style FUs, but rather packing parallel operations
into wide instructions. We would love to do this at the level
of LLVM IR, but it does not seem to be possible.
--
Pertti

Re: Alias analysis and instruction level parallelism

Pertti Kellomäki-2
In reply to this post by Dan Gohman-2
Dan Gohman wrote:
> I think this is trickier than it sounds; the reason GEPs are lowered is to
> allow strength-reduction and other things to do transformations on them.
> It would require those passes to know how to update the mapping.

Yes, I do appreciate the amount of work involved, and I am
very open to other suggestions.

What the backend really needs to know is what loads and
stores are independent of each other. When looking at a
scheduling region (basic block for now), we already know
at the LLVM IR level which loads and stores could potentially
be scheduled at the same cycle, so essentially we can
anticipate which alias queries the backend would make.

An alternative game plan would then be to identify the
loads and stores of interest, do the alias queries
at the LLVM IR level, and store the independence info
somewhere. The back end would then trace the target memory
references back to LLVM IR loads and stores, and consult
the stored independence information while scheduling.

It seems to me that this should be safe, at least if
one is careful about which passes are run between
the alias analysis and the back end. I cannot think
of anything offhand that could make two independent
memory references dependent, which is what would
have to happen for things to go bad.

There is a potential combinatorial explosion built into
this, but I don't think it would turn up in practice.
--
Pertti

Re: Alias analysis and instruction level parallelism

Dan Gohman-2

On Apr 3, 2008, at 2:20 PM, Pertti Kellomäki wrote:
> Dan Gohman wrote:
>> I think this is trickier than it sounds; the reason GEPs are lowered is to
>> allow strength-reduction and other things to do transformations on them.
>> It would require those passes to know how to update the mapping.
>
> Yes, I do appreciate the amount of work involved, and I am
> very open to other suggestions.

To clarify, I was responding to the approach of mapping
address values back to their original GEP instructions
here.

>
>
> What the backend really needs to know is what loads and
> stores are independent of each other. When looking at a
> scheduling region (basic block for now), we already know
> at the LLVM IR level which loads and stores could potentially
> be scheduled at the same cycle, so essentially we can
> anticipate which alias queries the backend would make.
>
> An alternative game plan would then be to identify the
> loads and stores of interest, do the alias queries
> at the LLVM IR level, and store the independence info
> somewhere. The back end would then trace the target memory
> references back to LLVM IR loads and stores, and consult
> the stored independence information while scheduling.
>
> It seems to me that this should be safe, at least if
> one is careful about which passes are run between
> the alias analysis and the back end. I cannot think
> of anything offhand that could make two independent
> memory references dependent, which is what would
> have to happen for things to go bad.

I agree. I think what you're describing here and what
I described in a separate email are very similar :-).

The current SelectionDAG already has a way to store the
information you want here, using its chain dependencies.
It needs improvement to do a better job of identifying
interesting loads and stores and making alias queries --
the things you're describing here.

If we do that, and then preserve this dependency information
when SelectionDAG nodes are translated to MachineInstrs,
we'll have a very useful system.

Dan



Re: Alias analysis and instruction level parallelism

Chris Lattner
In reply to this post by Pertti Kellomäki-2

On Apr 3, 2008, at 2:20 PM, Pertti Kellomäki wrote:

> Dan Gohman wrote:
>> I think this is trickier than it sounds; the reason GEPs are lowered is to
>> allow strength-reduction and other things to do transformations on them.
>> It would require those passes to know how to update the mapping.
>
> Yes, I do appreciate the amount of work involved, and I am
> very open to other suggestions.

How about a much simpler approach?  Here's a silly but reasonable
example:

int A[100];

void test(int x) {
   while (x)
     A[--x] = 0;
}

we compile this into (x86 with pic codegen):

..
LBB1_1: ## bb.preheader
        movl L_A$non_lazy_ptr, %ecx
        leal -4(%ecx,%eax,4), %ecx
        xorl %edx, %edx
        .align 4,0x90
LBB1_2: ## bb
        movl $0, (%ecx)
        addl $4294967292, %ecx
        incl %edx
        cmpl %eax, %edx
        jne LBB1_2 ## bb
...

This is pretty reasonable code, but what does the output of LSR look like?

$ llvm-gcc t.c -S -o - -O3 -emit-llvm | llvm-as | llc -print-isel-input -relocation-model=pic
...
        %tmp = mul i32 %x, 4 ; <i32> [#uses=1]
        %tmp2 = add i32 ptrtoint ([100 x i32]* @A to i32), %tmp ; <i32> [#uses=1]
        %tmp4 = add i32 %tmp2, -4 ; <i32> [#uses=1]
        br label %bb
bb: ; preds = %bb.preheader, %bb
        %iv.5 = phi i32 [ %tmp4, %bb.preheader ], [ %iv.5.inc, %bb ] ; <i32> [#uses=2]
...
        %iv.57 = inttoptr i32 %iv.5 to i32* ; <i32*> [#uses=1]
        store i32 0, i32* %iv.57, align 4
...
        %iv.5.inc = add i32 %iv.5, -4 ; <i32> [#uses=1]
        br i1 %exitcond, label %return, label %bb
return: ; preds = %bb, %entry
        ret void


Wow, that's horrible!  No wonder basicaa gets confused; I don't blame
it.  However, none of this is needed.  It would be much better for LSR
to lower the code into:


    %tmp4 = getelementptr [100 x i32]* @A, i32 0, i32 %x   ; i32*
    br label %bb
bb:
    %iv.5 = phi i32* [ %tmp4, %bb.preheader ], [ %iv.5.inc, %bb ]
..
    store i32 0, i32* %iv.5
..
    %iv.5.inc = getelementptr i32* %iv.5, i32 -1
    br i1 %exitcond, label %return, label %bb
return: ; preds = %bb, %entry
    ret void

With this code, basicaa will have little problem understanding what is  
going on, and generating this from LSR should not be very hard at  
all.  Better yet, no new crazy infrastructure change is required.

What do you think?

-Chris





