[RFC] 3-bit Waymarking

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

[RFC] 3-bit Waymarking

Gabor Greif-2
Hi devs,

after my intentionally "playful" EuroLLVM presentation (*) I think it
would be time to get serious about merging to ToT. But we should
probably find out whether an optimized algorithm is desired at all.

So I'd solicit comments from the code owners (Use.{h,cpp}) and anybody
who is interested. For closer scrutiny, the code is here:
<http://llvm.org/viewvc/llvm-project/llvm/branches/ggreif/waymark-64-new/>

I do not have the equipment to perform a compile-time measurement. How
do folks benchmark for this nowadays? Is it a viable alternative to
bring the changes to ToT and compare speedups/slowdowns in the nightly
builds retrospectively?

Thanks for any input,

cheers,

    Gabor

(*) Some even say "bizarre"
(https://twitter.com/alexr/statuses/453486315083157504) :
<http://llvm.org/devmtg/2014-04/PDFs/LightningTalks/waymark.pdf>
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] 3-bit Waymarking

Chris Lattner-2

On Apr 22, 2014, at 7:28 AM, Gabor Greif <[hidden email]> wrote:

> Hi devs,
>
> after my intentionally "playful" EuroLLVM presentation (*) I think it
> would be time to get serious about merging to ToT. But we should
> probably find out whether an optimized algorithm is desired at all.
>
> So I'd solicit comments from the code owners (Use.{h,cpp}) and anybody
> who is interested. For closer scrutiny, the code is here:
> <http://llvm.org/viewvc/llvm-project/llvm/branches/ggreif/waymark-64-new/>
>
> I do not have the equipment to perform a compile-time measurement. How
> do folks benchmark for this nowadays? Is it a viable alternative to
> bring the changes to ToT and compare speedups/slowdowns in the nightly
> builds retrospectively?

I saw the slides, it looks very interesting.  Have you actually measured any memory wins from this?

-Chris
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] 3-bit Waymarking

Gabor Greif-2
On 4/22/14, Chris Lattner <[hidden email]> wrote:

>
> On Apr 22, 2014, at 7:28 AM, Gabor Greif <[hidden email]> wrote:
>
>> Hi devs,
>>
>> after my intentionally "playful" EuroLLVM presentation (*) I think it
>> would be time to get serious about merging to ToT. But we should
>> probably find out whether an optimized algorithm is desired at all.
>>
>> So I'd solicit comments from the code owners (Use.{h,cpp}) and anybody
>> who is interested. For closer scrutiny, the code is here:
>> <http://llvm.org/viewvc/llvm-project/llvm/branches/ggreif/waymark-64-new/>
>>
>> I do not have the equipment to perform a compile-time measurement. How
>> do folks benchmark for this nowadays? Is it a viable alternative to
>> bring the changes to ToT and compare speedups/slowdowns in the nightly
>> builds retrospectively?
>
> I saw the slides, it looks very interesting.  Have you actually measured any
> memory wins from this?

Hi Chris,

there are no memory savings, Use has still 3 pointers (the 4->3
reduction happened back in 2008). What should be faster with the new
algorithm are the "value_use_iterator::operator->" operators (i.e.
finding all Users of a Value).

Cheers,

    Gabor


>
> -Chris
>
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] 3-bit Waymarking

Chris Lattner-2
On Apr 22, 2014, at 12:26 PM, Gabor Greif <[hidden email]> wrote:

>>> I do not have the equipment to perform a compile-time measurement. How
>>> do folks benchmark for this nowadays? Is it a viable alternative to
>>> bring the changes to ToT and compare speedups/slowdowns in the nightly
>>> builds retrospectively?
>>
>> I saw the slides, it looks very interesting.  Have you actually measured any
>> memory wins from this?
>
> Hi Chris,
>
> there are no memory savings, Use has still 3 pointers (the 4->3
> reduction happened back in 2008). What should be faster with the new
> algorithm are the "value_use_iterator::operator->" operators (i.e.
> finding all Users of a Value).

Ah, ok.  If you're looking for performance benchmarks, you could look at LTO time of something large, in that a majority of the time is spent in the mid-level optimizer.

-Chris
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Fwd: [RFC] 3-bit Waymarking

Jay Foad-2
In reply to this post by Gabor Greif-2
I forgot to Cc this (and some other unimportant replies) to the list.

Jay.

---------- Forwarded message ----------
From: Jay Foad <[hidden email]>
Date: 23 April 2014 09:50
Subject: Re: [LLVMdev] [RFC] 3-bit Waymarking
To: Gabor Greif <[hidden email]>


On 22 April 2014 15:28, Gabor Greif <[hidden email]> wrote:
> Hi devs,
>
> after my intentionally "playful" EuroLLVM presentation (*) I think it
> would be time to get serious about merging to ToT. But we should
> probably find out whether an optimized algorithm is desired at all.
>
> So I'd solicit comments from the code owners (Use.{h,cpp}) and anybody
> who is interested. For closer scrutiny, the code is here:
> <http://llvm.org/viewvc/llvm-project/llvm/branches/ggreif/waymark-64-new/>

Hi Gabor,

I've looked at your slides. I have two comments about the new
waymarking scheme, as described in your "Comparison" slide.


1. I assert without proof :-) that most Values have two Uses, so by
far the worst thing about the current scheme is that it takes *2*
accesses to get from the penultimate Use to the User. When you upgrade
from 2 to 3 bits you get four more tags to play with, so how about
using one or more of them as additional Full Stop tags, that tell you
exactly how far away the user is? E.g. if you created four more full
stop tags you could go from this access pattern:

2 bits: ...654654343221

to this:

3 bits: ...543432211111

I realise that this is asymptotically worse than your scheme, but I
reckon it might be better in the very common case where there are few
Uses.


2. Your "number of accesses" is a bit misleading. I think all the
numbers should be increased by one, because after following the
waymarks we *always* look at the next word as well, to see whether it
is the start of the User, or a pointer to the User. So another way of
spending the four new tags would be to have different kinds of full
stop:

S = full Stop, User follows inline
P = full stoP, Pointer to User follows

This avoids the extra access in the (hopefully common) case where the
User follows inline.


I reckon these two ideas can be combined, e.g. you could have five
full stop bits as follows:

P = full stoP, Pointer to User follows
S0 = full Stop, User follows inline after this Use
S1 = full Stop, User follows inline after 1 more Use
S2 = full Stop, User follows inline after 2 more Uses
S3 = full Stop, User follows inline after 3 more Uses

This would optimise for the case where the Uses are inline, and there
are <= 4 of them.

Jay.


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] 3-bit Waymarking

Jay Foad-2
In reply to this post by Gabor Greif-2
On 22 April 2014 15:28, Gabor Greif <[hidden email]> wrote:
Hi devs,

after my intentionally "playful" EuroLLVM presentation (*) I think it
would be time to get serious about merging to ToT. But we should
probably find out whether an optimized algorithm is desired at all.

So I'd solicit comments from the code owners (Use.{h,cpp}) and anybody
who is interested. For closer scrutiny, the code is here:
<http://llvm.org/viewvc/llvm-project/llvm/branches/ggreif/waymark-64-new/>

Just to stimulate discussion, here's another playful/bizarre idea for improving the current 2-bit waymarking scheme.

Use this encoding:

00 = s (stop)
01 = 1
10 = 2
11 = 3

Change the order of fields in a Use to be prev, next, value. I think we can guarantee that the first word of a User has 00 in its low order bits, so if you try to read just beyond the end of the Use array, it'll look like a stop tag.

Then the sequence goes like this, where I've used a colon to separate the real tags (in Uses) from the fake tag (in the first word of the User):

encoding: ...s121s112s32s22s12s3s1s:s
accesses: ...5876576546546545434332:

To decode this, skip digits until you reach a stop tag, and then read all the digits up to the next stop tag. The digits tell you how far to skip from the terminating stop tag to the start of the User, using a "3-adic numeration system" (http://en.wikipedia.org/wiki/Bijective_numeration):

0 = (empty string)
1 = 1
2 = 2
3 = 3
4 = 11
5 = 12
6 = 13
7 = 21
...
13 = 111
14 = 112
15 = 113
16 = 121
...

This is asymptotically better than the current system because it's (more or less) using base 3 instead of base 2 to encode the skip distances. However, it's probably not very good in practice, because it needs more accesses when there are only one or two Uses, which is probably the common case. Oh well.

Jay.

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: [RFC] 3-bit Waymarking

Duncan P. N. Exon Smith
In reply to this post by Chris Lattner-2
On 2014 Apr 22, at 19:46, Chris Lattner <[hidden email]> wrote:

> On Apr 22, 2014, at 12:26 PM, Gabor Greif <[hidden email]> wrote:
>>>> I do not have the equipment to perform a compile-time measurement. How
>>>> do folks benchmark for this nowadays? Is it a viable alternative to
>>>> bring the changes to ToT and compare speedups/slowdowns in the nightly
>>>> builds retrospectively?
>>>
>>> I saw the slides, it looks very interesting.  Have you actually measured any
>>> memory wins from this?
>>
>> Hi Chris,
>>
>> there are no memory savings, Use has still 3 pointers (the 4->3
>> reduction happened back in 2008). What should be faster with the new
>> algorithm are the "value_use_iterator::operator->" operators (i.e.
>> finding all Users of a Value).
>
> Ah, ok.  If you're looking for performance benchmarks, you could look at LTO time of something large, in that a majority of the time is spent in the mid-level optimizer.

In case you're looking for examples of "something large", clang
is a good option as long as you have enough RAM not to swap.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev