SSA or not SSA?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

SSA or not SSA?

Matthieu Moy
Hi,

Silly question from an LLVM newbie: the LLVM LRM say that the bytecode
is "is an SSA based representation". Indeed, my experience with
llvm-gcc is that the generated code is not necessarily SSA, while
the one given by "llvm-gcc -O1" is.

Is this assumption correct?

Is there a non-SSA to SSA translator available?

Thanks,

--
Matthieu
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: SSA or not SSA?

Patrick Meredith
All register uses are SSA.  Memory is not in SSA.  The mem2reg pass  
which promotes stack variables to registers effectively converts non-
SSA to SSA.  There was a reg2mem pass, written by Andrew Lenharth, I'm  
not sure if it's still being maintained.

On Jul 7, 2008, at 8:47 AM, Matthieu Moy wrote:

> Hi,
>
> Silly question from an LLVM newbie: the LLVM LRM say that the bytecode
> is "is an SSA based representation". Indeed, my experience with
> llvm-gcc is that the generated code is not necessarily SSA, while
> the one given by "llvm-gcc -O1" is.
>
> Is this assumption correct?
>
> Is there a non-SSA to SSA translator available?
>
> Thanks,
>
> --
> Matthieu
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: SSA or not SSA?

Matthieu Moy
[ sorry for the late reply ]

Patrick Meredith <[hidden email]> wrote:
> All register uses are SSA.  Memory is not in SSA.  The mem2reg pass  
> which promotes stack variables to registers effectively converts non-
> SSA to SSA.  There was a reg2mem pass, written by Andrew Lenharth, I'm  
> not sure if it's still being maintained.

What is the difference between register and memory? I don't see such
distinction in the LRM.

Do you mean for example

register = i32
memory = i32*

?

Thanks,

> On Jul 7, 2008, at 8:47 AM, Matthieu Moy wrote:
>
> > Hi,
> >
> > Silly question from an LLVM newbie: the LLVM LRM say that the bytecode
> > is "is an SSA based representation". Indeed, my experience with
> > llvm-gcc is that the generated code is not necessarily SSA, while
> > the one given by "llvm-gcc -O1" is.
> >
> > Is this assumption correct?
> >
> > Is there a non-SSA to SSA translator available?
> >
> > Thanks,

--
Matthieu
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: SSA or not SSA?

Patrick Meredith
Memory is what the i32* points too.  The i32* itself is in a  
register.  You can store to it as many times as you want, but you  
can't change the address, because that would violate SSA.

On Jul 17, 2008, at 4:26 AM, Matthieu Moy wrote:

> [ sorry for the late reply ]
>
> Patrick Meredith <[hidden email]> wrote:
>> All register uses are SSA.  Memory is not in SSA.  The mem2reg pass
>> which promotes stack variables to registers effectively converts non-
>> SSA to SSA.  There was a reg2mem pass, written by Andrew Lenharth,  
>> I'm
>> not sure if it's still being maintained.
>
> What is the difference between register and memory? I don't see such
> distinction in the LRM.
>
> Do you mean for example
>
> register = i32
> memory = i32*
>
> ?
>
> Thanks,
>
>> On Jul 7, 2008, at 8:47 AM, Matthieu Moy wrote:
>>
>>> Hi,
>>>
>>> Silly question from an LLVM newbie: the LLVM LRM say that the  
>>> bytecode
>>> is "is an SSA based representation". Indeed, my experience with
>>> llvm-gcc is that the generated code is not necessarily SSA, while
>>> the one given by "llvm-gcc -O1" is.
>>>
>>> Is this assumption correct?
>>>
>>> Is there a non-SSA to SSA translator available?
>>>
>>> Thanks,
>
> --
> Matthieu
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: SSA or not SSA?

Matthieu Moy
Patrick Meredith <[hidden email]> writes:

> Memory is what the i32* points too.  The i32* itself is in a  
> register.  You can store to it as many times as you want, but you  
> can't change the address, because that would violate SSA.

Thanks,

For the record, I finally understood by making a few experiments:

This is (obviously) valid:

  define i32 @main() {
         %some-variable = add i32 1, 1   ; integer with the value 1+1
         %some-other-variable = add i32 %some-variable, 1
         ret i32 %some-other-variable
  }

This is not (the assembler complains with ``Redefinition of value
'some-variable' of type 'i32' ''):

  define i32 @main() {
         %some-variable = add i32 1, 1   ; integer with the value 1+1
         %some-variable = add i32 %some-variable, 1
         ret i32 %some-variable
  }

but this one is:

  define i32 @main() {
         %some-pointer = alloca i32 ; declares an int in memory (stack)
         store i32 42, i32* %some-pointer
         store i32 54, i32* %some-pointer
         %retval = load i32* %some-pointer
         ret i32 %retval
  }

In short, "Single Assignment" means "Single '='", not "Single
'store'".

--
Matthieu
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: SSA or not SSA?

matthijs
In reply to this post by Matthieu Moy
Hi Matthieu

> What is the difference between register and memory? I don't see such
> distinction in the LRM.
>
> Do you mean for example
>
> register = i32
> memory = i32*
Not really. Registers are all values that have a name starting with % and are
by definition local to a function. These are in SSA form, so you can't assign
them twice (instead, you would simply use a new register, since you have
infinite of them). The result of an instruction is always stored to a
register.

Memory, on the other hand, cannot be accessed directly and has to be
explicitely allocated. The only way to access memory is to obtain a pointer to
it (Usually in a register) and execute a load or a store instruction. For
example, the following C code:

int run(){
        int i = foo();
        i += 1;
        return i;
}

will be initially mapped to have every local variable in memory, since that
does not require complicated stuff to comply with SSA (which only gets
interesting when the code is more complicated). So,

        %i = alloca i32, align 4                ; <i32*> [#uses=4]
        %call = call i32 (...)* @foo( )         ; <i32> [#uses=1]
        store i32 %call, i32* %i
        %tmp = load i32* %i                     ; <i32> [#uses=1]
        %add = add i32 %tmp, 1                  ; <i32> [#uses=1]
        store i32 %add, i32* %i
        %tmp1 = load i32* %i                    ; <i32> [#uses=1]
        ret i32 %tmp1

Here you can see that the variable i is allocated in memory, allocated by the
first alloca instruction. There, %i contains the _address_ of i, not its
value. Note that %i is again a register and is in SSA form. To get at its
value, load and store instructions are used. For every intermediate value (ie,
evaluated expressions), registers are used (%call, %tmp, %add, %tmp1).
However, the above code is not very efficiently mappable on a regular
processor, since accessing memory all the time is not necessary. We can use
the "mem2reg" optimization pass, which reduces the above to:

  %call = call i32 (...)* @foo( )         ; <i32> [#uses=1]
  %add = add i32 %call, 1                 ; <i32> [#uses=1]
  ret i32 %add

Now, no memory is involved and all values are in SSA. This code is a lot
easier for a codegen to map, since the lifetimes of various values are a lot
shorter and dataflow is clearer.

Hope that this helps a bit? Also, did you see
http://www.llvm.org/docs/tutorial/LangImpl7.html already?

Gr.

Matthijs

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

signature.asc (196 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: SSA or not SSA?

Daniel Berlin
In reply to this post by Matthieu Moy
On Thu, Jul 17, 2008 at 6:34 AM, Matthieu Moy <[hidden email]> wrote:

> Patrick Meredith <[hidden email]> writes:
>
>> Memory is what the i32* points too.  The i32* itself is in a
>> register.  You can store to it as many times as you want, but you
>> can't change the address, because that would violate SSA.
>
> Thanks,
>
> For the record, I finally understood by making a few experiments:
>
> This is (obviously) valid:
>
>  define i32 @main() {
>         %some-variable = add i32 1, 1   ; integer with the value 1+1
>         %some-other-variable = add i32 %some-variable, 1
>         ret i32 %some-other-variable
>  }
>
> This is not (the assembler complains with ``Redefinition of value
> 'some-variable' of type 'i32' ''):
>
>  define i32 @main() {
>         %some-variable = add i32 1, 1   ; integer with the value 1+1
>         %some-variable = add i32 %some-variable, 1
>         ret i32 %some-variable
>  }
>
> but this one is:
>
>  define i32 @main() {
>         %some-pointer = alloca i32 ; declares an int in memory (stack)
>         store i32 42, i32* %some-pointer
>         store i32 54, i32* %some-pointer
>         %retval = load i32* %some-pointer
>         ret i32 %retval
>  }
>
> In short, "Single Assignment" means "Single '='", not "Single
> 'store'".

Sure, because store is not an assignment.
It would be impossible to correctly rename memory for all programs,
since may-alias is undecidable at compile time and must-alias is
uncomputable at compile time.
The best you can do are conservative approximations, which would mean
sometimes your ssa form would be wrong.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: SSA or not SSA?

Jonathan S. Shapiro-2
[Note: moderately off-topic; this is an attempt to explain the rationale
of SSA that is not particular to LLVM. The relevance is that it may help
some new front-end writers grok the LLVM IR faster. ]

On Thu, 2008-07-17 at 08:13 -0400, Daniel Berlin wrote:

> On Thu, Jul 17, 2008 at 6:34 AM, Matthieu Moy <[hidden email]> wrote:
> > Patrick Meredith <[hidden email]> writes:
> > In short, "Single Assignment" means "Single '='", not "Single
> > 'store'".
>
> Sure, because store is not an assignment.
> It would be impossible to correctly rename memory for all programs,
> since may-alias is undecidable at compile time and must-alias is
> uncomputable at compile time.
> The best you can do are conservative approximations, which would mean
> sometimes your ssa form would be wrong.

What you say is true in the general case, but there are a few languages
for which this is not true, notably Haskell (because the I/O monad
remains purely functional from a semantics perspective). Also, in
strongly typed languages we can often induce a partition on the heap by
type or by potential interference. The SafeCode work uses both
techniques in a nicely integrated way.

Going back to the original question though, perhaps the following will
help explain what is going on with SSA conceptually:

SSA is a compromise. In an idealized (though perhaps boring) world, what
a compiler writer would like to see is a world in which every identifier
is bound (note: not assigned) exactly once and is never mutated. Such an
intermediate form would be purely functional. This would make various
forms of dependency analysis exceptionally easy, and it would allow us
to substitute expressions for their variables everywhere (term
substitution), or conversely it would let us do that substitution in
reverse (a.k.a. common sub-expression elimination). I'll call this
"static single binding" form (SSB).

In practice, given the languages that we need to compile in the real
world, we can't get there. We need writable locations in the heap and
also writable variables on the stack.

Even though we need these things, restricting our IR graph so that it
only allows mutation in restricted places is still very useful. It
concentrates all of the hard parts of things like dependency analysis on
a few places, and it still lets us do term substitution and common
sub-expression most of the time. It gives us an IR form where we can
analyze which parts of the code are dependent on assignment and which
are not. For these reasons we want to stay as close to the "pure" form
SSB that we can, admitting only enough pragmatism to express the
languages we are trying to compile.

So a typical SSA form makes two compromises relative to SSB.

  1. Since we cannot (in general) analyze the heap for aliasing issues,
     we allow updates in the heap arbitrarily. This is why stores
     to i32* are okay where stores to i32 are not.

     Additionally, stores to heap are intimately tied up with data
     structure representations, and therefore with the semantics of
     locations, where stores to stack are much less so. The issue of
     location semantics generally stops us from "renaming" heap
     locations in the kinds of ways that we can rename stack-based
     values.

  2. We allow updates (assignments) on stack locations, but only in
     specialized and carefully defined SSA graph nodes.


Don't know if any of that helps. None of it is particular to LLVM, but
perhaps it will help you think about how to generate IR more
effectively.


shap

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev