Mul & div support for wider-than-legal types

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Mul & div support for wider-than-legal types

Paweł Bylica
Hi LLVM,
  1. Can mul and/or div support be added for big integer types like i256?
  2. What are the limits?
  3. If yes, how should it be done?
I have experience only with X86 target and know that mul i128 works and div i128 is lowered to function call from compile-rt like library (what works only if you link with such library). Can that support be extended?

- Paweł

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Mul & div support for wider-than-legal types

Joerg Sonnenberger
On Fri, Mar 20, 2015 at 09:06:24AM +0000, Paweł Bylica wrote:
> Hi LLVM,
>
>    1. Can mul and/or div support be added for big integer types like i256?
>    2. What are the limits?
>    3. If yes, how should it be done?
>
> I have experience only with X86 target and know that mul i128 works and div
> i128 is lowered to function call from compile-rt like library (what works
> only if you link with such library). Can that support be extended?

mul can be inlined easily if necessary for arbitrary sizes, but div is
very expensive. Note that div can be expanded to calls even for smaller
sizes, since many (older) architectures don't have hardware division.

What's your use case?

Joerg

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Mul & div support for wider-than-legal types

Tim Northover-2
> mul can be inlined easily if necessary for arbitrary sizes, but div is very expensive.

Shall I file a bug for "implement FFT in LLVM"?

Tim.
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Mul & div support for wider-than-legal types

Joerg Sonnenberger
On Fri, Mar 20, 2015 at 08:06:11PM -0700, Tim Northover wrote:
> > mul can be inlined easily if necessary for arbitrary sizes, but div is very expensive.
>
> Shall I file a bug for "implement FFT in LLVM"?

I didn't say it is the most efficient algorithm :) That's why I asked
about the purpose.

Joerg
_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Mul & div support for wider-than-legal types

Paweł Bylica
In reply to this post by Joerg Sonnenberger
On Sat, Mar 21, 2015 at 3:20 AM Joerg Sonnenberger <[hidden email]> wrote:
On Fri, Mar 20, 2015 at 09:06:24AM +0000, Paweł Bylica wrote:
> Hi LLVM,
>
>    1. Can mul and/or div support be added for big integer types like i256?
>    2. What are the limits?
>    3. If yes, how should it be done?
>
> I have experience only with X86 target and know that mul i128 works and div
> i128 is lowered to function call from compile-rt like library (what works
> only if you link with such library). Can that support be extended?

mul can be inlined easily if necessary for arbitrary sizes, but div is
very expensive. Note that div can be expanded to calls even for smaller
sizes, since many (older) architectures don't have hardware division.

What's your use case?

Joerg

I would be happy to have mul and div up to i512 on x86_64 target. My current solution is to replace that instructions with call to my helper functions.

If I understand correctly one big mul can be composed of 4 half-precision mul instructions? Would such solution be accepted by LLVM?

- Paweł

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Mul & div support for wider-than-legal types

Paweł Bylica
In reply to this post by Joerg Sonnenberger
On Sun, Mar 22, 2015 at 5:57 AM Joerg Sonnenberger <[hidden email]> wrote:
On Fri, Mar 20, 2015 at 08:06:11PM -0700, Tim Northover wrote:
> > mul can be inlined easily if necessary for arbitrary sizes, but div is very expensive.
>
> Shall I file a bug for "implement FFT in LLVM"?

I didn't say it is the most efficient algorithm :) That's why I asked
about the purpose.

Joerg

 Can you suggest an algorithm and implementation then?

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Mul & div support for wider-than-legal types

Joerg Sonnenberger
On Sun, Mar 22, 2015 at 03:18:14PM +0000, Paweł Bylica wrote:

> On Sun, Mar 22, 2015 at 5:57 AM Joerg Sonnenberger <[hidden email]>
> wrote:
>
> > On Fri, Mar 20, 2015 at 08:06:11PM -0700, Tim Northover wrote:
> > > > mul can be inlined easily if necessary for arbitrary sizes, but div is
> > very expensive.
> > >
> > > Shall I file a bug for "implement FFT in LLVM"?
> >
> > I didn't say it is the most efficient algorithm :) That's why I asked
> > about the purpose.
> >
> > Joerg
> >
>
>  Can you suggest an algorithm and implementation then?

For short multi-word multiplications, school book approach works fine.
For medium sizes problems, Karatsuba or Toom-Cook is better. For truely
larger problems, it boils down to FFT. Of course, all this needs careful
benchmarking if you want to select a specific one.

Joerg

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Mul & div support for wider-than-legal types

Paweł Bylica
Can it be a first step?

- Paweł

On Sun, Mar 22, 2015 at 5:35 PM Joerg Sonnenberger <[hidden email]> wrote:
On Sun, Mar 22, 2015 at 03:18:14PM +0000, Paweł Bylica wrote:
> On Sun, Mar 22, 2015 at 5:57 AM Joerg Sonnenberger <[hidden email]>
> wrote:
>
> > On Fri, Mar 20, 2015 at 08:06:11PM -0700, Tim Northover wrote:
> > > > mul can be inlined easily if necessary for arbitrary sizes, but div is
> > very expensive.
> > >
> > > Shall I file a bug for "implement FFT in LLVM"?
> >
> > I didn't say it is the most efficient algorithm :) That's why I asked
> > about the purpose.
> >
> > Joerg
> >
>
>  Can you suggest an algorithm and implementation then?

For short multi-word multiplications, school book approach works fine.
For medium sizes problems, Karatsuba or Toom-Cook is better. For truely
larger problems, it boils down to FFT. Of course, all this needs careful
benchmarking if you want to select a specific one.

Joerg

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Reply | Threaded
Open this post in threaded view
|

Re: Mul & div support for wider-than-legal types

Paweł Bylica
I played with legalizing mul instruction. I implemented naive expanding that requires 8 multiplication of half-size types (http://reviews.llvm.org/D8770). This, of course, does not scale up very well, and it probably doable only up to i512. After this exercise, I have more questions than before.

Is information coming from TargetLowering::getOperationAction() related to mul type legalization? Where do that come from? Can a target request a "custom" lowering for that case?

Depending on a target, some muls divs and other operations are lowered to a call to a function from a "compiler-rt"-like library. The set of this built-in functions seems to be sealed. If I would like implement a support for e.g. mul i256, should I attack it in a similar manner? I.e. by adding a function to such library?

Another solution I think about is to inject a helper function to a module that requires long multiplication. That function should have a weak linkage to be possible merged later by a linker.

Any comments very welcome,
Paweł Bylica

On Wed, Apr 1, 2015 at 4:20 PM Paweł Bylica <[hidden email]> wrote:
Can it be a first step?

- Paweł


On Sun, Mar 22, 2015 at 5:35 PM Joerg Sonnenberger <[hidden email]> wrote:
On Sun, Mar 22, 2015 at 03:18:14PM +0000, Paweł Bylica wrote:
> On Sun, Mar 22, 2015 at 5:57 AM Joerg Sonnenberger <[hidden email]>
> wrote:
>
> > On Fri, Mar 20, 2015 at 08:06:11PM -0700, Tim Northover wrote:
> > > > mul can be inlined easily if necessary for arbitrary sizes, but div is
> > very expensive.
> > >
> > > Shall I file a bug for "implement FFT in LLVM"?
> >
> > I didn't say it is the most efficient algorithm :) That's why I asked
> > about the purpose.
> >
> > Joerg
> >
>
>  Can you suggest an algorithm and implementation then?

For short multi-word multiplications, school book approach works fine.
For medium sizes problems, Karatsuba or Toom-Cook is better. For truely
larger problems, it boils down to FFT. Of course, all this needs careful
benchmarking if you want to select a specific one.

Joerg

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev