[llvm-dev] Accelerating TLI getLibFunc lookups

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

[llvm-dev] Accelerating TLI getLibFunc lookups

Shawn Webb via llvm-dev

TLDR: Figuring out whether a declaration is a TLI LibFunc is slow.  We hammer that path in CGP.  I'm proposing storing the ID of a TLI LibFunc in the same IntID field in Function we use for IntrinsicID to make that fast.

Looking into a compile time issue during codegen (LLC) for a large IR file, I came across something interesting.  Due to the presence of a very large number of intrinsics in the particular example, we were spending almost 30% of time in CodeGenPrep::optimizeCallInst, and within that, almost all of it in the FortifiedLibCallSimplifier.  Now, since the IR file in question has no fortified libcalls, that seemed a bit odd.

Looking into, it turns out that figuring out that an arbitrary direct call is *not* a call to a LibCall requires a full name normalization and table lookup that a successful one does.  We could simply make the lookup itself faster - it looks like we could probably tablegen a near optimal character switch lookup table - but that still leaves us with the normalization.  We could cache the lookup, but then we have an analysis invalidation problem for all users of TLI.  Not unsolvable, but not fun if we have a better option. 

Instead, I noticed that we have no overlap between intrinsics, and target library functions.  Assuming we're happy with that, and don't see that changing in the future, that gives us an opportunity.  We could cache the libfunc ID into the Function itself, just like we do for intrinsics today.

What would that look like in practice you ask?

  • We'd move the definition of LibFunc into include/IR/TargetLibraryFunctions.h/def (only the enum, not the rest of TLI)
  • We'd change IntID field in GlobalValue to be union of IntrinsicID and LibFunc.
  • We'd change Function::getIntrisicID to check the HasLLVMReservedName flag (already existing), and return Intrinsic::not_inrinsic value if not set.
  • We'd add a corresponding getLibFuncID, and isLibFunc function to Function.
  • We'd modify recalculateIntrinsicID to compute the libfunc enum as well.

The tradeoff is that function construction and renaming would become slightly slower, but determining whether a function was a library function would become fast.  We could also populate the value lazily, but that seems like complexity with little benefit. 

Thoughts?  Objections?  Better ideas?

If folks are on board with this, I'm happy to prepare a patch.

Philip


_______________________________________________
LLVM Developers mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Accelerating TLI getLibFunc lookups

Shawn Webb via llvm-dev
> -----Original Message-----
> From: llvm-dev <[hidden email]> On Behalf Of Philip Reames
> via llvm-dev
> Sent: den 24 april 2019 02:50
> To: llvm-dev <[hidden email]>
> Subject: [llvm-dev] Accelerating TLI getLibFunc lookups
>
> TLDR: Figuring out whether a declaration is a TLI LibFunc is slow.  We
> hammer that path in CGP.  I'm proposing storing the ID of a TLI LibFunc in
> the same IntID field in Function we use for IntrinsicID to make that fast.
> Looking into a compile time issue during codegen (LLC) for a large IR file,
> I came across something interesting.  Due to the presence of a very large
> number of intrinsics in the particular example, we were spending almost 30%
> of time in CodeGenPrep::optimizeCallInst, and within that, almost all of it
> in the FortifiedLibCallSimplifier.  Now, since the IR file in question has
> no fortified libcalls, that seemed a bit odd.
> Looking into, it turns out that figuring out that an arbitrary direct call
> is *not* a call to a LibCall requires a full name normalization and table
> lookup that a successful one does.  We could simply make the lookup itself
> faster - it looks like we could probably tablegen a near optimal character
> switch lookup table - but that still leaves us with the normalization.  We
> could cache the lookup, but then we have an analysis invalidation problem
> for all users of TLI.  Not unsolvable, but not fun if we have a better
> option.
> Instead, I noticed that we have no overlap between intrinsics, and target
> library functions.  Assuming we're happy with that, and don't see that
> changing in the future, that gives us an opportunity.  We could cache the
> libfunc ID into the Function itself, just like we do for intrinsics today.
> What would that look like in practice you ask?
> • We'd move the definition of LibFunc into
> include/IR/TargetLibraryFunctions.h/def (only the enum, not the rest of
> TLI)
> • We'd change IntID field in GlobalValue to be union of IntrinsicID and
> LibFunc.
> • We'd change Function::getIntrisicID to check the HasLLVMReservedName flag
> (already existing), and return Intrinsic::not_inrinsic value if not set.
> • We'd add a corresponding getLibFuncID, and isLibFunc function to
> Function.
> • We'd modify recalculateIntrinsicID to compute the libfunc enum as well.
> The tradeoff is that function construction and renaming would become
> slightly slower, but determining whether a function was a library function
> would become fast.  We could also populate the value lazily, but that seems
> like complexity with little benefit.
> Thoughts?  Objections?  Better ideas?
> If folks are on board with this, I'm happy to prepare a patch.
> Philip

So if we know that an there are no intrinsic calls being simplified
by the  FortifiedLibCallSimplifier, then I guess that we could
early out and return false inside the if

  IntrinsicInst *II = dyn_cast<IntrinsicInst>(CI);
  if (II) {
     ...
  }

that is before the FortifiedLibCallSimplifier simplifications.

That would at least reduce the amount of time spend in the
FortifiedLibCallSimplifier trying to lookup intrinsics.

And if there are some intrinsics that should be dealt with by
the FortifiedLibCallSimplifier we could make sure we fallthrough
by adding some cases to the switch inside that if statement.

Such a solution is ofcourse not as general as the one you are
suggesting, but it might be a simple solution if the problem
is that we try to lookup intrinsic calls inside the
FortifiedLibCallSimplifier.

/Björn
_______________________________________________
LLVM Developers mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] Accelerating TLI getLibFunc lookups

Shawn Webb via llvm-dev

On 4/24/19 7:55 AM, Björn Pettersson A wrote:

>> -----Original Message-----
>> From: llvm-dev <[hidden email]> On Behalf Of Philip Reames
>> via llvm-dev
>> Sent: den 24 april 2019 02:50
>> To: llvm-dev <[hidden email]>
>> Subject: [llvm-dev] Accelerating TLI getLibFunc lookups
>>
>> TLDR: Figuring out whether a declaration is a TLI LibFunc is slow.  We
>> hammer that path in CGP.  I'm proposing storing the ID of a TLI LibFunc in
>> the same IntID field in Function we use for IntrinsicID to make that fast.
>> Looking into a compile time issue during codegen (LLC) for a large IR file,
>> I came across something interesting.  Due to the presence of a very large
>> number of intrinsics in the particular example, we were spending almost 30%
>> of time in CodeGenPrep::optimizeCallInst, and within that, almost all of it
>> in the FortifiedLibCallSimplifier.  Now, since the IR file in question has
>> no fortified libcalls, that seemed a bit odd.
>> Looking into, it turns out that figuring out that an arbitrary direct call
>> is *not* a call to a LibCall requires a full name normalization and table
>> lookup that a successful one does.  We could simply make the lookup itself
>> faster - it looks like we could probably tablegen a near optimal character
>> switch lookup table - but that still leaves us with the normalization.  We
>> could cache the lookup, but then we have an analysis invalidation problem
>> for all users of TLI.  Not unsolvable, but not fun if we have a better
>> option.
>> Instead, I noticed that we have no overlap between intrinsics, and target
>> library functions.  Assuming we're happy with that, and don't see that
>> changing in the future, that gives us an opportunity.  We could cache the
>> libfunc ID into the Function itself, just like we do for intrinsics today.
>> What would that look like in practice you ask?
>> • We'd move the definition of LibFunc into
>> include/IR/TargetLibraryFunctions.h/def (only the enum, not the rest of
>> TLI)
>> • We'd change IntID field in GlobalValue to be union of IntrinsicID and
>> LibFunc.
>> • We'd change Function::getIntrisicID to check the HasLLVMReservedName flag
>> (already existing), and return Intrinsic::not_inrinsic value if not set.
>> • We'd add a corresponding getLibFuncID, and isLibFunc function to
>> Function.
>> • We'd modify recalculateIntrinsicID to compute the libfunc enum as well.
>> The tradeoff is that function construction and renaming would become
>> slightly slower, but determining whether a function was a library function
>> would become fast.  We could also populate the value lazily, but that seems
>> like complexity with little benefit.
>> Thoughts?  Objections?  Better ideas?
>> If folks are on board with this, I'm happy to prepare a patch.
>> Philip
> So if we know that an there are no intrinsic calls being simplified
> by the  FortifiedLibCallSimplifier, then I guess that we could
> early out and return false inside the if
>
>   IntrinsicInst *II = dyn_cast<IntrinsicInst>(CI);
>   if (II) {
>      ...
>   }
>
> that is before the FortifiedLibCallSimplifier simplifications.
>
> That would at least reduce the amount of time spend in the
> FortifiedLibCallSimplifier trying to lookup intrinsics.
>
> And if there are some intrinsics that should be dealt with by
> the FortifiedLibCallSimplifier we could make sure we fallthrough
> by adding some cases to the switch inside that if statement.
>
> Such a solution is ofcourse not as general as the one you are
> suggesting, but it might be a simple solution if the problem
> is that we try to lookup intrinsic calls inside the
> FortifiedLibCallSimplifier.
Your framing does work for this particular use case. This is actually
how I figured out what was going on.  But while CGP is the codepath
which showed up hot in this example, I'm sure there are others. 
Filtering within getLibFunc is also a possibility, which is slightly
more general.  But both variants leave open the same problem for a
module which is heavy on non-intrinsic non-libfunc calls.  The advantage
of my proposed scheme is that all calls are treated equally. 
>
> /Björn
_______________________________________________
LLVM Developers mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev