PBQP & CalcSpillWeights

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

PBQP & CalcSpillWeights

Arnaud A. de Grandmaison
Hi All,

I finally had a chance to get back to my pbqp trials, now using the 3.0
release. I still hit the same assert : "Attempting to spill already spilled
value."

This is triggered because in RegAllocPBQP::mapPBQPToRegAlloc, a vreg node is
either :
 - a physical register (problem.isPRegOption(vreg, alloc)),
 - or a spill (problem.isSpillOption(vreg, alloc))

The problem is that pass CalcSpillWeights can 'hint' that it is a poor
idea to spill this specific register with :

CalcSpillWeights.cpp / VirtRegAuxInfo::CalculateWeightAndHint :
  // Mark li as unspillable if all live ranges are tiny.
  if (li.isZeroLength(LIS.getSlotIndexes())) {
    li.markNotSpillable();
    ...

This hint makes the register non spillable at all for the spiller (that's the
assert above), not just a bad-idea-to-spill-but-feasible. The pbqp allocator
does not cope with this distinction and allways attempts to spill it.

I would need some guidance on how to modify the pbqp to handle this case
properly.

Best regards,
--
Arnaud de Grandmaison
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: PBQP & CalcSpillWeights

Lang Hames
Hi Arnaud,

LiveInterval::markNotSpillable() sets the live interval's spill weight
to infinity. For well-formed PBQP graphs (i.e. ones that have some
finite-cost solution), PBQP should never chose to spill such an
interval. The two possibilities for this crash are that the input
graph has no finite-cost solution, or that you've exposed a bug in the
PBQP solver.

>From memory your target is not public, so I won't be able to reproduce
the crash myself. Is that correct?

If that's the case, I could add functionality to dump the PBQP graphs
during allocation. I think they should give me enough information to
debug the issue. Would you be able to share the PBQP graphs?

Cheers,
Lang.

On Wed, Mar 21, 2012 at 8:40 AM, Arnaud de Grandmaison
<[hidden email]> wrote:

>
> Hi All,
>
> I finally had a chance to get back to my pbqp trials, now using the 3.0
> release. I still hit the same assert : "Attempting to spill already spilled
> value."
>
> This is triggered because in RegAllocPBQP::mapPBQPToRegAlloc, a vreg node is
> either :
>  - a physical register (problem.isPRegOption(vreg, alloc)),
>  - or a spill (problem.isSpillOption(vreg, alloc))
>
> The problem is that pass CalcSpillWeights can 'hint' that it is a poor
> idea to spill this specific register with :
>
> CalcSpillWeights.cpp / VirtRegAuxInfo::CalculateWeightAndHint :
>  // Mark li as unspillable if all live ranges are tiny.
>  if (li.isZeroLength(LIS.getSlotIndexes())) {
>    li.markNotSpillable();
>    ...
>
> This hint makes the register non spillable at all for the spiller (that's the
> assert above), not just a bad-idea-to-spill-but-feasible. The pbqp allocator
> does not cope with this distinction and allways attempts to spill it.
>
> I would need some guidance on how to modify the pbqp to handle this case
> properly.
>
> Best regards,
> --
> Arnaud de Grandmaison
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: PBQP & CalcSpillWeights

Arnaud A. de Grandmaison
Hi Lang,

> From memory your target is not public, so I won't be able to reproduce
> the crash myself. Is that correct?
Correct.
 
> If that's the case, I could add functionality to dump the PBQP graphs
> during allocation. I think they should give me enough information to
> debug the issue. Would you be able to share the PBQP graphs?
I can share the pbqp graph if you send me a patch with the added functionality
you want. On my side, I already checked the node cost for this register, which
is correctly set to inf for spill, as well as the edge costs, and they look
ok.

I also attached my target's pbqp related file in case you want to double check
what I did. This is llvm-3.0 based. It comprises 2 passes : the
FemtoPBQPBuilder, plus a FemtoPairablepass, which undo some of the coalescer's
work. The insertRegCopy may specifically need a double check, as I am not 100%
sure to have updated correctly the LiveInterval information.

In terms of registers, the Femto target is simplistic : a single register
class GR16, for data and pointers, all i16. It has 16 registers, R0 to R15;
R15 is used as stack pointer, and R14 potentialy as framepointer. A pair is
constituted from a register + its successor, i.e. (R0, R1), (R1,R2), (R2, R3),
... are valid pairs. This is an instruction encoding constraint, as we only
have 16bits wide instructions. Pairs involving R15 are never allowed, those
with R14 may be allowed, depending on each function.

Most instructions have no pairing constraints, and do not appear here, being
handled by the PBQPBuilder default base class. A few instructions have 1 or 2
pairs of registers, and are all handled by the 2 passes above.

Thanks for your help,
--
Arnaud de Grandmaison
Senior CPU engineer
Business Unit Digital Tuner

Parrot S.A.
174, quai de Jemmapes
75010 Paris - France
Phone: +33 1 48 03 84 59
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

FemtoRegisterAllocator.cpp (17K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: PBQP & CalcSpillWeights

Lang Hames
Hi Arnaud,

Thanks for attaching those files. I'll take a look at them.

Commit r153483 adds an option to the PBQP allocator,
"-pbqp-dump-graphs", to dump the PBQP graph for each round of each
function in a compilation unit. The generated files are named "<module
id>.<function>.<round>.pbqpgraph", and contain a simple text
representation of the PBQP graph. The PBQP allocator has been stable
for some time - I think this patch should apply cleanly to the 3.0
version.

Can you run the failing test case through the allocator with this
patch applied, passing the "-regalloc=pbqp -pbqp-dump-graphs
-debug-only=regalloc" options. I'll need to take a look at the last
graph dumped before the assertion is triggered.

Cheers,
Lang.

On Mon, Mar 26, 2012 at 7:35 AM, Arnaud de Grandmaison
<[hidden email]> wrote:

> Hi Lang,
>
>> From memory your target is not public, so I won't be able to reproduce
>> the crash myself. Is that correct?
> Correct.
>
>> If that's the case, I could add functionality to dump the PBQP graphs
>> during allocation. I think they should give me enough information to
>> debug the issue. Would you be able to share the PBQP graphs?
> I can share the pbqp graph if you send me a patch with the added functionality
> you want. On my side, I already checked the node cost for this register, which
> is correctly set to inf for spill, as well as the edge costs, and they look
> ok.
>
> I also attached my target's pbqp related file in case you want to double check
> what I did. This is llvm-3.0 based. It comprises 2 passes : the
> FemtoPBQPBuilder, plus a FemtoPairablepass, which undo some of the coalescer's
> work. The insertRegCopy may specifically need a double check, as I am not 100%
> sure to have updated correctly the LiveInterval information.
>
> In terms of registers, the Femto target is simplistic : a single register
> class GR16, for data and pointers, all i16. It has 16 registers, R0 to R15;
> R15 is used as stack pointer, and R14 potentialy as framepointer. A pair is
> constituted from a register + its successor, i.e. (R0, R1), (R1,R2), (R2, R3),
> ... are valid pairs. This is an instruction encoding constraint, as we only
> have 16bits wide instructions. Pairs involving R15 are never allowed, those
> with R14 may be allowed, depending on each function.
>
> Most instructions have no pairing constraints, and do not appear here, being
> handled by the PBQPBuilder default base class. A few instructions have 1 or 2
> pairs of registers, and are all handled by the 2 passes above.
>
> Thanks for your help,
> --
> Arnaud de Grandmaison
> Senior CPU engineer
> Business Unit Digital Tuner
>
> Parrot S.A.
> 174, quai de Jemmapes
> 75010 Paris - France
> Phone: +33 1 48 03 84 59
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: PBQP & CalcSpillWeights

Arnaud A. de Grandmaison
Hi Lang,

I have reduced the testcase as much as possible. The log of the run and the  
dumped graphes are attached.

Cheers,
--
Arnaud de Grandmaison

On Tuesday, March 27, 2012 01:20:35 Lang Hames wrote:

> Hi Arnaud,
>
> Thanks for attaching those files. I'll take a look at them.
>
> Commit r153483 adds an option to the PBQP allocator,
> "-pbqp-dump-graphs", to dump the PBQP graph for each round of each
> function in a compilation unit. The generated files are named "<module
> id>.<function>.<round>.pbqpgraph", and contain a simple text
> representation of the PBQP graph. The PBQP allocator has been stable
> for some time - I think this patch should apply cleanly to the 3.0
> version.
>
> Can you run the failing test case through the allocator with this
> patch applied, passing the "-regalloc=pbqp -pbqp-dump-graphs
> -debug-only=regalloc" options. I'll need to take a look at the last
> graph dumped before the assertion is triggered.
>
> Cheers,
> Lang.
>
> On Mon, Mar 26, 2012 at 7:35 AM, Arnaud de Grandmaison
>
> <[hidden email]> wrote:
> > Hi Lang,
> >
> >> From memory your target is not public, so I won't be able to reproduce
> >> the crash myself. Is that correct?
> >
> > Correct.
> >
> >> If that's the case, I could add functionality to dump the PBQP graphs
> >> during allocation. I think they should give me enough information to
> >> debug the issue. Would you be able to share the PBQP graphs?
> >
> > I can share the pbqp graph if you send me a patch with the added
> > functionality you want. On my side, I already checked the node cost for
> > this register, which is correctly set to inf for spill, as well as the
> > edge costs, and they look ok.
> >
> > I also attached my target's pbqp related file in case you want to double
> > check what I did. This is llvm-3.0 based. It comprises 2 passes : the
> > FemtoPBQPBuilder, plus a FemtoPairablepass, which undo some of the
> > coalescer's work. The insertRegCopy may specifically need a double
> > check, as I am not 100% sure to have updated correctly the LiveInterval
> > information.
> >
> > In terms of registers, the Femto target is simplistic : a single
> > register
> > class GR16, for data and pointers, all i16. It has 16 registers, R0 to
> > R15; R15 is used as stack pointer, and R14 potentialy as framepointer.
> > A pair is constituted from a register + its successor, i.e. (R0, R1),
> > (R1,R2), (R2, R3), ... are valid pairs. This is an instruction encoding
> > constraint, as we only have 16bits wide instructions. Pairs involving
> > R15 are never allowed, those with R14 may be allowed, depending on each
> > function.
> >
> > Most instructions have no pairing constraints, and do not appear here,
> > being handled by the PBQPBuilder default base class. A few instructions
> > have 1 or 2 pairs of registers, and are all handled by the 2 passes
> > above.
> >
> > Thanks for your help,
> > --
> > Arnaud de Grandmaison
> > Senior CPU engineer
> > Business Unit Digital Tuner
> >
> > Parrot S.A.
> > 174, quai de Jemmapes
> > 75010 Paris - France
> > Phone: +33 1 48 03 84 59

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

invalid-spill.tar.gz (9K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: PBQP & CalcSpillWeights

Lang Hames
Hi Arnaud,

Apologies for the delayed reply.

Thank you for the excellent test case - it exposed a subtle bug in the colorability heuristic. This has been fixed in r153958.

In case you are curious, the bug was as follows: the PBQP solver applies applies a simplification step to each matrix. When all elements of a matrix row or column are equal, the value for those elements is "pushed out" to the corresponding element of the cost vector, and the row/column is zeroed out. This is an attempt to turn the matrices into zero matrices which can be eliminated from the problem. This simplification step runs even for rows/columns that are all infinite. When an all infinite row/column is encountered, it will be zeroed out, and the infinite cost attached to some register in the cost vector. Unfortunately the infinite-cost elements of vectors were not being taken into consideration when determining colorability, only the infinities in the matrices were. In your test case this led to a node being falsely assumed to be colorable when it was not, and pushed to the coloring stack too early. By the time it was encountered in the coloring phase it had already been over-constrained, and no finite cost solutions were available.

In future I hope to make the simplification step strip infinite rows/columns from the problem altogether (along with their corresponding vector elements). This would speed the solver up further by avoiding consideration of such impossible options.

With the fix from r153958 applied the solver now finds a zero-cost solution for the test case you sent me. This should translate to a valid register allocation for your test case. Please try it out and let me know if it works for you.

Cheers,
Lang.

On Tue, Mar 27, 2012 at 5:05 AM, Arnaud de Grandmaison <[hidden email]> wrote:
Hi Lang,

I have reduced the testcase as much as possible. The log of the run and the
dumped graphes are attached.

Cheers,
--
Arnaud de Grandmaison

On Tuesday, March 27, 2012 01:20:35 Lang Hames wrote:
> Hi Arnaud,
>
> Thanks for attaching those files. I'll take a look at them.
>
> Commit r153483 adds an option to the PBQP allocator,
> "-pbqp-dump-graphs", to dump the PBQP graph for each round of each
> function in a compilation unit. The generated files are named "<module
> id>.<function>.<round>.pbqpgraph", and contain a simple text
> representation of the PBQP graph. The PBQP allocator has been stable
> for some time - I think this patch should apply cleanly to the 3.0
> version.
>
> Can you run the failing test case through the allocator with this
> patch applied, passing the "-regalloc=pbqp -pbqp-dump-graphs
> -debug-only=regalloc" options. I'll need to take a look at the last
> graph dumped before the assertion is triggered.
>
> Cheers,
> Lang.
>
> On Mon, Mar 26, 2012 at 7:35 AM, Arnaud de Grandmaison
>
> <[hidden email]> wrote:
> > Hi Lang,
> >
> >> From memory your target is not public, so I won't be able to reproduce
> >> the crash myself. Is that correct?
> >
> > Correct.
> >
> >> If that's the case, I could add functionality to dump the PBQP graphs
> >> during allocation. I think they should give me enough information to
> >> debug the issue. Would you be able to share the PBQP graphs?
> >
> > I can share the pbqp graph if you send me a patch with the added
> > functionality you want. On my side, I already checked the node cost for
> > this register, which is correctly set to inf for spill, as well as the
> > edge costs, and they look ok.
> >
> > I also attached my target's pbqp related file in case you want to double
> > check what I did. This is llvm-3.0 based. It comprises 2 passes : the
> > FemtoPBQPBuilder, plus a FemtoPairablepass, which undo some of the
> > coalescer's work. The insertRegCopy may specifically need a double
> > check, as I am not 100% sure to have updated correctly the LiveInterval
> > information.
> >
> > In terms of registers, the Femto target is simplistic : a single
> > register
> > class GR16, for data and pointers, all i16. It has 16 registers, R0 to
> > R15; R15 is used as stack pointer, and R14 potentialy as framepointer.
> > A pair is constituted from a register + its successor, i.e. (R0, R1),
> > (R1,R2), (R2, R3), ... are valid pairs. This is an instruction encoding
> > constraint, as we only have 16bits wide instructions. Pairs involving
> > R15 are never allowed, those with R14 may be allowed, depending on each
> > function.
> >
> > Most instructions have no pairing constraints, and do not appear here,
> > being handled by the PBQPBuilder default base class. A few instructions
> > have 1 or 2 pairs of registers, and are all handled by the 2 passes
> > above.
> >
> > Thanks for your help,
> > --
> > Arnaud de Grandmaison
> > Senior CPU engineer
> > Business Unit Digital Tuner
> >
> > Parrot S.A.
> > 174, quai de Jemmapes
> > 75010 Paris - France
> > Phone: <a href="tel:%2B33%201%2048%2003%2084%2059" value="+33148038459" target="_blank">+33 1 48 03 84 59


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: PBQP & CalcSpillWeights

Arnaud A. de Grandmaison
Hi Lang,

Thanks a lot for taking time to look into this. I will test the fix soon and
let you know the results.

Cheers,
--
Arnaud de Grandmaison

On Tuesday, April 03, 2012 17:30:33 Lang Hames wrote:

> Hi Arnaud,
>
> Apologies for the delayed reply.
>
> Thank you for the excellent test case - it exposed a subtle bug in the
> colorability heuristic. This has been fixed in r153958.
>
> In case you are curious, the bug was as follows: the PBQP solver applies
> applies a simplification step to each matrix. When all elements of a matrix
> row or column are equal, the value for those elements is "pushed out" to
> the corresponding element of the cost vector, and the row/column is zeroed
> out. This is an attempt to turn the matrices into zero matrices which can
> be eliminated from the problem. This simplification step runs even for
> rows/columns that are all infinite. When an all infinite row/column is
> encountered, it will be zeroed out, and the infinite cost attached to some
> register in the cost vector. Unfortunately the infinite-cost elements of
> vectors were not being taken into consideration when determining
> colorability, only the infinities in the matrices were. In your test case
> this led to a node being falsely assumed to be colorable when it was not,
> and pushed to the coloring stack too early. By the time it was encountered
> in the coloring phase it had already been over-constrained, and no finite
> cost solutions were available.
>
> In future I hope to make the simplification step strip infinite rows/columns
> from the problem altogether (along with their corresponding vector
> elements). This would speed the solver up further by avoiding consideration
> of such impossible options.
>
> With the fix from r153958 applied the solver now finds a zero-cost solution
> for the test case you sent me. This should translate to a valid register
> allocation for your test case. Please try it out and let me know if it
> works for you.
>
> Cheers,
> Lang.
>
> On Tue, Mar 27, 2012 at 5:05 AM, Arnaud de Grandmaison
> <[hidden email]<mailto:arnaud.allarddegrandmaison@pa
> rrot.com>> wrote: Hi Lang,
>
> I have reduced the testcase as much as possible. The log of the run and the
> dumped graphes are attached.
>
> Cheers,
> --
> Arnaud de Grandmaison
>
> On Tuesday, March 27, 2012 01:20:35 Lang Hames wrote:
> > Hi Arnaud,
> >
> > Thanks for attaching those files. I'll take a look at them.
> >
> > Commit r153483 adds an option to the PBQP allocator,
> > "-pbqp-dump-graphs", to dump the PBQP graph for each round of each
> > function in a compilation unit. The generated files are named "<module
> > id>.<function>.<round>.pbqpgraph", and contain a simple text
> > representation of the PBQP graph. The PBQP allocator has been stable
> > for some time - I think this patch should apply cleanly to the 3.0
> > version.
> >
> > Can you run the failing test case through the allocator with this
> > patch applied, passing the "-regalloc=pbqp -pbqp-dump-graphs
> > -debug-only=regalloc" options. I'll need to take a look at the last
> > graph dumped before the assertion is triggered.
> >
> > Cheers,
> > Lang.
> >
> > On Mon, Mar 26, 2012 at 7:35 AM, Arnaud de Grandmaison
> >
> >
<[hidden email]<mailto:[hidden email]>>
wrote:

> > > Hi Lang,
> > >
> > >> From memory your target is not public, so I won't be able to
> > >> reproduce
> > >> the crash myself. Is that correct?
> > >
> > > Correct.
> > >
> > >> If that's the case, I could add functionality to dump the PBQP
> > >> graphs
> > >> during allocation. I think they should give me enough information
> > >> to
> > >> debug the issue. Would you be able to share the PBQP graphs?
> > >
> > > I can share the pbqp graph if you send me a patch with the added
> > > functionality you want. On my side, I already checked the node cost
> > > for
> > > this register, which is correctly set to inf for spill, as well as
> > > the
> > > edge costs, and they look ok.
> > >
> > > I also attached my target's pbqp related file in case you want to
> > > double check what I did. This is llvm-3.0 based. It comprises 2
> > > passes : the FemtoPBQPBuilder, plus a FemtoPairablepass, which undo
> > > some of the coalescer's work. The insertRegCopy may specifically
> > > need a double check, as I am not 100% sure to have updated
> > > correctly the LiveInterval information.
> > >
> > > In terms of registers, the Femto target is simplistic : a single
> > > register
> > > class GR16, for data and pointers, all i16. It has 16 registers, R0
> > > to
> > > R15; R15 is used as stack pointer, and R14 potentialy as
> > > framepointer.
> > > A pair is constituted from a register + its successor, i.e. (R0,
> > > R1),
> > > (R1,R2), (R2, R3), ... are valid pairs. This is an instruction
> > > encoding
> > > constraint, as we only have 16bits wide instructions. Pairs
> > > involving
> > > R15 are never allowed, those with R14 may be allowed, depending on
> > > each
> > > function.
> > >
> > > Most instructions have no pairing constraints, and do not appear
> > > here,
> > > being handled by the PBQPBuilder default base class. A few
> > > instructions
> > > have 1 or 2 pairs of registers, and are all handled by the 2 passes
> > > above.
> > >
> > > Thanks for your help,
> > > --
> > > Arnaud de Grandmaison
> > > Senior CPU engineer
> > > Business Unit Digital Tuner
> > >
> > > Parrot S.A.
> > > 174, quai de Jemmapes
> > > 75010 Paris - France
> > > Phone: +33 1 48 03 84 59<tel:%2B33%201%2048%2003%2084%2059>

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: PBQP & CalcSpillWeights

Arnaud A. de Grandmaison
Hi Lang,

The assert is not triggered any longer on my testcases :)

I however still get my wrong allocation in some non trivial cases : the
pairing constraint is not fulfilled.

I have tried to modify the 'ensure pairable' pass (the pass undoing some
of the coalescer's work) to always insert register copies for
instructions with the pairable constraint, instead of being smart and
inserting the copy only when needed. This had no visible effect.
Although I am deriving from PBQPBuilder, the PBQP seems to be coalescing
some register copy, without taking into account that the source or dest
reg may have different constraints. In which part of pbqp would this be
happening ?

Cheers,
--
Arnaud de Grandmaison

On 04/05/2012 05:23 PM, Arnaud de Grandmaison wrote:

> Hi Lang,
>
> Thanks a lot for taking time to look into this. I will test the fix soon and
> let you know the results.
>
> Cheers,
> --
> Arnaud de Grandmaison
>
> On Tuesday, April 03, 2012 17:30:33 Lang Hames wrote:
>> Hi Arnaud,
>>
>> Apologies for the delayed reply.
>>
>> Thank you for the excellent test case - it exposed a subtle bug in the
>> colorability heuristic. This has been fixed in r153958.
>>
>> In case you are curious, the bug was as follows: the PBQP solver applies
>> applies a simplification step to each matrix. When all elements of a matrix
>> row or column are equal, the value for those elements is "pushed out" to
>> the corresponding element of the cost vector, and the row/column is zeroed
>> out. This is an attempt to turn the matrices into zero matrices which can
>> be eliminated from the problem. This simplification step runs even for
>> rows/columns that are all infinite. When an all infinite row/column is
>> encountered, it will be zeroed out, and the infinite cost attached to some
>> register in the cost vector. Unfortunately the infinite-cost elements of
>> vectors were not being taken into consideration when determining
>> colorability, only the infinities in the matrices were. In your test case
>> this led to a node being falsely assumed to be colorable when it was not,
>> and pushed to the coloring stack too early. By the time it was encountered
>> in the coloring phase it had already been over-constrained, and no finite
>> cost solutions were available.
>>
>> In future I hope to make the simplification step strip infinite rows/columns
>> from the problem altogether (along with their corresponding vector
>> elements). This would speed the solver up further by avoiding consideration
>> of such impossible options.
>>
>> With the fix from r153958 applied the solver now finds a zero-cost solution
>> for the test case you sent me. This should translate to a valid register
>> allocation for your test case. Please try it out and let me know if it
>> works for you.
>>
>> Cheers,
>> Lang.
>>
>> On Tue, Mar 27, 2012 at 5:05 AM, Arnaud de Grandmaison
>> <[hidden email]<mailto:arnaud.allarddegrandmaison@pa
>> rrot.com>> wrote: Hi Lang,
>>
>> I have reduced the testcase as much as possible. The log of the run and the
>> dumped graphes are attached.
>>
>> Cheers,
>> --
>> Arnaud de Grandmaison
>>
>> On Tuesday, March 27, 2012 01:20:35 Lang Hames wrote:
>>> Hi Arnaud,
>>>
>>> Thanks for attaching those files. I'll take a look at them.
>>>
>>> Commit r153483 adds an option to the PBQP allocator,
>>> "-pbqp-dump-graphs", to dump the PBQP graph for each round of each
>>> function in a compilation unit. The generated files are named "<module
>>> id>.<function>.<round>.pbqpgraph", and contain a simple text
>>> representation of the PBQP graph. The PBQP allocator has been stable
>>> for some time - I think this patch should apply cleanly to the 3.0
>>> version.
>>>
>>> Can you run the failing test case through the allocator with this
>>> patch applied, passing the "-regalloc=pbqp -pbqp-dump-graphs
>>> -debug-only=regalloc" options. I'll need to take a look at the last
>>> graph dumped before the assertion is triggered.
>>>
>>> Cheers,
>>> Lang.
>>>
>>> On Mon, Mar 26, 2012 at 7:35 AM, Arnaud de Grandmaison
>>>
>>>
> <[hidden email]<mailto:[hidden email]>>
> wrote:
>>>> Hi Lang,
>>>>
>>>>> From memory your target is not public, so I won't be able to
>>>>> reproduce
>>>>> the crash myself. Is that correct?
>>>> Correct.
>>>>
>>>>> If that's the case, I could add functionality to dump the PBQP
>>>>> graphs
>>>>> during allocation. I think they should give me enough information
>>>>> to
>>>>> debug the issue. Would you be able to share the PBQP graphs?
>>>> I can share the pbqp graph if you send me a patch with the added
>>>> functionality you want. On my side, I already checked the node cost
>>>> for
>>>> this register, which is correctly set to inf for spill, as well as
>>>> the
>>>> edge costs, and they look ok.
>>>>
>>>> I also attached my target's pbqp related file in case you want to
>>>> double check what I did. This is llvm-3.0 based. It comprises 2
>>>> passes : the FemtoPBQPBuilder, plus a FemtoPairablepass, which undo
>>>> some of the coalescer's work. The insertRegCopy may specifically
>>>> need a double check, as I am not 100% sure to have updated
>>>> correctly the LiveInterval information.
>>>>
>>>> In terms of registers, the Femto target is simplistic : a single
>>>> register
>>>> class GR16, for data and pointers, all i16. It has 16 registers, R0
>>>> to
>>>> R15; R15 is used as stack pointer, and R14 potentialy as
>>>> framepointer.
>>>> A pair is constituted from a register + its successor, i.e. (R0,
>>>> R1),
>>>> (R1,R2), (R2, R3), ... are valid pairs. This is an instruction
>>>> encoding
>>>> constraint, as we only have 16bits wide instructions. Pairs
>>>> involving
>>>> R15 are never allowed, those with R14 may be allowed, depending on
>>>> each
>>>> function.
>>>>
>>>> Most instructions have no pairing constraints, and do not appear
>>>> here,
>>>> being handled by the PBQPBuilder default base class. A few
>>>> instructions
>>>> have 1 or 2 pairs of registers, and are all handled by the 2 passes
>>>> above.
>>>>
>>>> Thanks for your help,
>>>> --
>>>> Arnaud de Grandmaison
>>>> Senior CPU engineer
>>>> Business Unit Digital Tuner
>>>>
>>>> Parrot S.A.
>>>> 174, quai de Jemmapes
>>>> 75010 Paris - France
>>>> Phone: +33 1 48 03 84 59<tel:%2B33%201%2048%2003%2084%2059>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>


_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: PBQP & CalcSpillWeights

Lang Hames
Hi Arnaud,

I'm glad to hear that your test case is working.

I however still get my wrong allocation in some non trivial cases : the
pairing constraint is not fulfilled.

I have tried to modify the 'ensure pairable' pass (the pass undoing some
of the coalescer's work) to always insert register copies for
instructions with the pairable constraint, instead of being smart and
inserting the copy only when needed. This had no visible effect.
Although I am deriving from PBQPBuilder, the PBQP seems to be coalescing
some register copy, without taking into account that the source or dest
reg may have different constraints. In which part of pbqp would this be
happening ?


If you're deriving PBQPBuilder (and not PBQPBuilderWithCoalescing) then PBQP won't attempt any coalescing, though obviously it can "get lucky" and assign registers that are connected by a copy the same register.

Those copies shouldn't be necessary - as long as you don't have a circular chain of paired variables (a,b,c,...,a) with all-infinite spill costs you should be safe.

Can you send me a PBQP graph for your failing test case? I'll see if I can figure out where the solver is going wrong.

- Lang.
 
Cheers,
--
Arnaud de Grandmaison

On 04/05/2012 05:23 PM, Arnaud de Grandmaison wrote:
> Hi Lang,
>
> Thanks a lot for taking time to look into this. I will test the fix soon and
> let you know the results.
>
> Cheers,
> --
> Arnaud de Grandmaison
>
> On Tuesday, April 03, 2012 17:30:33 Lang Hames wrote:
>> Hi Arnaud,
>>
>> Apologies for the delayed reply.
>>
>> Thank you for the excellent test case - it exposed a subtle bug in the
>> colorability heuristic. This has been fixed in r153958.
>>
>> In case you are curious, the bug was as follows: the PBQP solver applies
>> applies a simplification step to each matrix. When all elements of a matrix
>> row or column are equal, the value for those elements is "pushed out" to
>> the corresponding element of the cost vector, and the row/column is zeroed
>> out. This is an attempt to turn the matrices into zero matrices which can
>> be eliminated from the problem. This simplification step runs even for
>> rows/columns that are all infinite. When an all infinite row/column is
>> encountered, it will be zeroed out, and the infinite cost attached to some
>> register in the cost vector. Unfortunately the infinite-cost elements of
>> vectors were not being taken into consideration when determining
>> colorability, only the infinities in the matrices were. In your test case
>> this led to a node being falsely assumed to be colorable when it was not,
>> and pushed to the coloring stack too early. By the time it was encountered
>> in the coloring phase it had already been over-constrained, and no finite
>> cost solutions were available.
>>
>> In future I hope to make the simplification step strip infinite rows/columns
>> from the problem altogether (along with their corresponding vector
>> elements). This would speed the solver up further by avoiding consideration
>> of such impossible options.
>>
>> With the fix from r153958 applied the solver now finds a zero-cost solution
>> for the test case you sent me. This should translate to a valid register
>> allocation for your test case. Please try it out and let me know if it
>> works for you.
>>
>> Cheers,
>> Lang.
>>
>> On Tue, Mar 27, 2012 at 5:05 AM, Arnaud de Grandmaison
>> <[hidden email]<mailto:[hidden email]
>> rrot.com>> wrote: Hi Lang,
>>
>> I have reduced the testcase as much as possible. The log of the run and the
>> dumped graphes are attached.
>>
>> Cheers,
>> --
>> Arnaud de Grandmaison
>>
>> On Tuesday, March 27, 2012 01:20:35 Lang Hames wrote:
>>> Hi Arnaud,
>>>
>>> Thanks for attaching those files. I'll take a look at them.
>>>
>>> Commit r153483 adds an option to the PBQP allocator,
>>> "-pbqp-dump-graphs", to dump the PBQP graph for each round of each
>>> function in a compilation unit. The generated files are named "<module
>>> id>.<function>.<round>.pbqpgraph", and contain a simple text
>>> representation of the PBQP graph. The PBQP allocator has been stable
>>> for some time - I think this patch should apply cleanly to the 3.0
>>> version.
>>>
>>> Can you run the failing test case through the allocator with this
>>> patch applied, passing the "-regalloc=pbqp -pbqp-dump-graphs
>>> -debug-only=regalloc" options. I'll need to take a look at the last
>>> graph dumped before the assertion is triggered.
>>>
>>> Cheers,
>>> Lang.
>>>
>>> On Mon, Mar 26, 2012 at 7:35 AM, Arnaud de Grandmaison
>>>
>>>
> <[hidden email]<mailto:[hidden email]>>
> wrote:
>>>> Hi Lang,
>>>>
>>>>> From memory your target is not public, so I won't be able to
>>>>> reproduce
>>>>> the crash myself. Is that correct?
>>>> Correct.
>>>>
>>>>> If that's the case, I could add functionality to dump the PBQP
>>>>> graphs
>>>>> during allocation. I think they should give me enough information
>>>>> to
>>>>> debug the issue. Would you be able to share the PBQP graphs?
>>>> I can share the pbqp graph if you send me a patch with the added
>>>> functionality you want. On my side, I already checked the node cost
>>>> for
>>>> this register, which is correctly set to inf for spill, as well as
>>>> the
>>>> edge costs, and they look ok.
>>>>
>>>> I also attached my target's pbqp related file in case you want to
>>>> double check what I did. This is llvm-3.0 based. It comprises 2
>>>> passes : the FemtoPBQPBuilder, plus a FemtoPairablepass, which undo
>>>> some of the coalescer's work. The insertRegCopy may specifically
>>>> need a double check, as I am not 100% sure to have updated
>>>> correctly the LiveInterval information.
>>>>
>>>> In terms of registers, the Femto target is simplistic : a single
>>>> register
>>>> class GR16, for data and pointers, all i16. It has 16 registers, R0
>>>> to
>>>> R15; R15 is used as stack pointer, and R14 potentialy as
>>>> framepointer.
>>>> A pair is constituted from a register + its successor, i.e. (R0,
>>>> R1),
>>>> (R1,R2), (R2, R3), ... are valid pairs. This is an instruction
>>>> encoding
>>>> constraint, as we only have 16bits wide instructions. Pairs
>>>> involving
>>>> R15 are never allowed, those with R14 may be allowed, depending on
>>>> each
>>>> function.
>>>>
>>>> Most instructions have no pairing constraints, and do not appear
>>>> here,
>>>> being handled by the PBQPBuilder default base class. A few
>>>> instructions
>>>> have 1 or 2 pairs of registers, and are all handled by the 2 passes
>>>> above.
>>>>
>>>> Thanks for your help,
>>>> --
>>>> Arnaud de Grandmaison
>>>> Senior CPU engineer
>>>> Business Unit Digital Tuner
>>>>
>>>> Parrot S.A.
>>>> 174, quai de Jemmapes
>>>> 75010 Paris - France
>>>> Phone: <a href="tel:%2B33%201%2048%2003%2084%2059" value="+33148038459">+33 1 48 03 84 59<tel:%2B33%201%2048%2003%2084%2059>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>




_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: PBQP & CalcSpillWeights

Arnaud A. de Grandmaison
Hi Lang,

Thanks for proposing to look into this testcase.

Regarding the copies, this was done to be on the safe side. I would normally only insert copies where needed in order to express the pairable constraint. For example, in a case like 'instr %vreg1, %vreg2, %vreg1', I need to insert a %vreg1 copy in order to express the pairing constraint between the first and third arguments of instr.

Cheers,
--
Arnaud

On 04/19/2012 02:32 AM, Lang Hames wrote:
Hi Arnaud,

I'm glad to hear that your test case is working.

I however still get my wrong allocation in some non trivial cases : the
pairing constraint is not fulfilled.

I have tried to modify the 'ensure pairable' pass (the pass undoing some
of the coalescer's work) to always insert register copies for
instructions with the pairable constraint, instead of being smart and
inserting the copy only when needed. This had no visible effect.
Although I am deriving from PBQPBuilder, the PBQP seems to be coalescing
some register copy, without taking into account that the source or dest
reg may have different constraints. In which part of pbqp would this be
happening ?


If you're deriving PBQPBuilder (and not PBQPBuilderWithCoalescing) then PBQP won't attempt any coalescing, though obviously it can "get lucky" and assign registers that are connected by a copy the same register.

Those copies shouldn't be necessary - as long as you don't have a circular chain of paired variables (a,b,c,...,a) with all-infinite spill costs you should be safe.

Can you send me a PBQP graph for your failing test case? I'll see if I can figure out where the solver is going wrong.

- Lang.
 
Cheers,
--
Arnaud de Grandmaison

On 04/05/2012 05:23 PM, Arnaud de Grandmaison wrote:
> Hi Lang,
>
> Thanks a lot for taking time to look into this. I will test the fix soon and
> let you know the results.
>
> Cheers,
> --
> Arnaud de Grandmaison
>
> On Tuesday, April 03, 2012 17:30:33 Lang Hames wrote:
>> Hi Arnaud,
>>
>> Apologies for the delayed reply.
>>
>> Thank you for the excellent test case - it exposed a subtle bug in the
>> colorability heuristic. This has been fixed in r153958.
>>
>> In case you are curious, the bug was as follows: the PBQP solver applies
>> applies a simplification step to each matrix. When all elements of a matrix
>> row or column are equal, the value for those elements is "pushed out" to
>> the corresponding element of the cost vector, and the row/column is zeroed
>> out. This is an attempt to turn the matrices into zero matrices which can
>> be eliminated from the problem. This simplification step runs even for
>> rows/columns that are all infinite. When an all infinite row/column is
>> encountered, it will be zeroed out, and the infinite cost attached to some
>> register in the cost vector. Unfortunately the infinite-cost elements of
>> vectors were not being taken into consideration when determining
>> colorability, only the infinities in the matrices were. In your test case
>> this led to a node being falsely assumed to be colorable when it was not,
>> and pushed to the coloring stack too early. By the time it was encountered
>> in the coloring phase it had already been over-constrained, and no finite
>> cost solutions were available.
>>
>> In future I hope to make the simplification step strip infinite rows/columns
>> from the problem altogether (along with their corresponding vector
>> elements). This would speed the solver up further by avoiding consideration
>> of such impossible options.
>>
>> With the fix from r153958 applied the solver now finds a zero-cost solution
>> for the test case you sent me. This should translate to a valid register
>> allocation for your test case. Please try it out and let me know if it
>> works for you.
>>
>> Cheers,
>> Lang.
>>
>> On Tue, Mar 27, 2012 at 5:05 AM, Arnaud de Grandmaison
>> <[hidden email]<mailto:[hidden email]
>> rrot.com>> wrote: Hi Lang,
>>
>> I have reduced the testcase as much as possible. The log of the run and the
>> dumped graphes are attached.
>>
>> Cheers,
>> --
>> Arnaud de Grandmaison
>>
>> On Tuesday, March 27, 2012 01:20:35 Lang Hames wrote:
>>> Hi Arnaud,
>>>
>>> Thanks for attaching those files. I'll take a look at them..
>>>
>>> Commit r153483 adds an option to the PBQP allocator,
>>> "-pbqp-dump-graphs", to dump the PBQP graph for each round of each
>>> function in a compilation unit. The generated files are named "<module
>>> id>.<function>.<round>.pbqpgraph", and contain a simple text
>>> representation of the PBQP graph. The PBQP allocator has been stable
>>> for some time - I think this patch should apply cleanly to the 3.0
>>> version.
>>>
>>> Can you run the failing test case through the allocator with this
>>> patch applied, passing the "-regalloc=pbqp -pbqp-dump-graphs
>>> -debug-only=regalloc" options. I'll need to take a look at the last
>>> graph dumped before the assertion is triggered.
>>>
>>> Cheers,
>>> Lang.
>>>
>>> On Mon, Mar 26, 2012 at 7:35 AM, Arnaud de Grandmaison
>>>
>>>
> <[hidden email]<mailto:[hidden email]>>
> wrote:
>>>> Hi Lang,
>>>>
>>>>> From memory your target is not public, so I won't be able to
>>>>> reproduce
>>>>> the crash myself. Is that correct?
>>>> Correct.
>>>>
>>>>> If that's the case, I could add functionality to dump the PBQP
>>>>> graphs
>>>>> during allocation. I think they should give me enough information
>>>>> to
>>>>> debug the issue. Would you be able to share the PBQP graphs?
>>>> I can share the pbqp graph if you send me a patch with the added
>>>> functionality you want. On my side, I already checked the node cost
>>>> for
>>>> this register, which is correctly set to inf for spill, as well as
>>>> the
>>>> edge costs, and they look ok.
>>>>
>>>> I also attached my target's pbqp related file in case you want to
>>>> double check what I did. This is llvm-3.0 based. It comprises 2
>>>> passes : the FemtoPBQPBuilder, plus a FemtoPairablepass, which undo
>>>> some of the coalescer's work. The insertRegCopy may specifically
>>>> need a double check, as I am not 100% sure to have updated
>>>> correctly the LiveInterval information.
>>>>
>>>> In terms of registers, the Femto target is simplistic : a single
>>>> register
>>>> class GR16, for data and pointers, all i16. It has 16 registers, R0
>>>> to
>>>> R15; R15 is used as stack pointer, and R14 potentialy as
>>>> framepointer.
>>>> A pair is constituted from a register + its successor, i.e.. (R0,
>>>> R1),
>>>> (R1,R2), (R2, R3), ... are valid pairs. This is an instruction
>>>> encoding
>>>> constraint, as we only have 16bits wide instructions. Pairs
>>>> involving
>>>> R15 are never allowed, those with R14 may be allowed, depending on
>>>> each
>>>> function.
>>>>
>>>> Most instructions have no pairing constraints, and do not appear
>>>> here,
>>>> being handled by the PBQPBuilder default base class. A few
>>>> instructions
>>>> have 1 or 2 pairs of registers, and are all handled by the 2 passes
>>>> above.
>>>>
>>>> Thanks for your help,
>>>> --
>>>> Arnaud de Grandmaison
>>>> Senior CPU engineer
>>>> Business Unit Digital Tuner
>>>>
>>>> Parrot S.A.
>>>> 174, quai de Jemmapes
>>>> 75010 Paris - France
>>>> Phone: <a moz-do-not-send="true" href="tel:%2B33%201%2048%2003%2084%2059" value="+33148038459">+33 1 48 03 84 59<tel:%2B33%201%2048%2003%2084%2059>
> _______________________________________________
> LLVM Developers mailing list
> [hidden email]         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>





-- 
Arnaud de Grandmaison
Senior CPU engineer
Business Unit Digital Tuner

Parrot S.A.
174, quai de Jemmapes
75010 Paris - France
Phone: +33 1 48 03 84 59

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

test4.tar.gz (25K) Download Attachment