AESOP autoparallelizing compiler

Timothy Mattausch Creech
Hi,
 We would like to inform the community that we're releasing a version of our research compiler, "AESOP", developed at UMD using LLVM. AESOP is a distance-vector-based autoparallelizing compiler for shared-memory machines. The source code and some further information are available at

 http://aesop.ece.umd.edu

The main components of the released implementation are loop memory dependence analysis and parallel code generation using calls to POSIX threads. Since we currently have only a 2-person development team, we are still on LLVM 3.0, and some of the code could use some cleanup. Still, we hope that the work will be of interest to some.
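
For anyone new to the approach: a "distance vector" records, for each enclosing loop, how many iterations apart the two endpoints of a dependence are, and a loop whose dependences all have distance 0 can have its iterations run in parallel. A minimal hand-written illustration in C (not AESOP output):

===============================================
#define N 1024
double a[N], b[N];

void examples(void)
{
  int i;
  /* Distance 0: each iteration touches only its own elements,
     so the iterations are independent and can run in parallel. */
  for (i = 0; i < N; i++)
    b[i] = a[i] * 2.0;

  /* Distance 1: iteration i writes a[i], which iteration i+1
     reads as a[i-1], so the loop carries a dependence and must
     stay serial (or be transformed first). */
  for (i = 1; i < N; i++)
    a[i] = a[i - 1] + 1.0;
}
===============================================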

We would welcome any feedback, comments or questions!

Thanks,
Tim Creech

Re: AESOP autoparallelizing compiler

Jiong Wang
On 03/03/2013 02:09 PM, Timothy Mattausch Creech wrote:
> [...]
> The main components of the released implementation are loop memory dependence analysis and parallel code generation using calls to POSIX threads.

    Interesting! I happen to have just finished the initial TileGX backend
support; TileGX is a many-core processor. I am looking forward to testing
AESOP on TileGX silicon.


--
Regards,
Jiong. Wang
Tilera Corporation.


Re: AESOP autoparallelizing compiler

Sebastian Dreßler
In reply to this post by Timothy Mattausch Creech
Hi,

On 03/03/2013 07:09 AM, Timothy Mattausch Creech wrote:
> [...]
> The main components of the released implementation are loop memory
> dependence analysis and parallel code generation using calls to POSIX
> threads.

The loop memory dependence analysis sounds very interesting to me. Could
you provide some more information regarding its capabilities?


Cheers,
Sebastian


--
Mit freundlichen Grüßen / Kind regards

Sebastian Dreßler

Zuse Institute Berlin (ZIB)
Takustraße 7
D-14195 Berlin-Dahlem
Germany

[hidden email]
Phone: +49 30 84185-261

http://www.zib.de/

Re: AESOP autoparallelizing compiler

Timothy Mattausch Creech
In reply to this post by Jiong Wang
Hi Jiong,
  I actually work day-to-day with Tilera processors and I was very pleased to see your recent mail about the TileGx patch! I have access to a Tile-Gx 8036 myself and am certainly planning to add native TileGx support to AESOP in the near future. (Shouldn't be hard: mostly it will require us to finally upgrade from LLVM 3.0 and compile our runtime dependencies for it.) I expect that we will use Tilera's own barrier implementations (in libtmc) directly in our codegen.

-Tim


Re: AESOP autoparallelizing compiler

Timothy Mattausch Creech
In reply to this post by Sebastian Dreßler
Hi Sebastian,
  Sure! The bulk of our loop memory dependence analysis (LMDA) was written by Aparna Kotha (CCd). It computes dependences between all instructions, computes the resulting direction vectors in the function, and then associates them all with loops.

At a high level, the dependence analysis consults AliasAnalysis and ScalarEvolution before resorting to recovering the effective affine expressions and performing dependence tests (e.g., Banerjee) on them. If it cannot rule out a dependence, it will additionally consult an ArrayPrivatization analysis to see whether an involved memory object can be made thread-private. It is probably also worth mentioning that the LMDA has been written to function well not only with IR from source code, but also with low-level IR from a binary-to-IR translator in a separate project; this has required new techniques specific to that problem. Aparna can provide more information on the techniques used in our LMDA.
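
As a concrete illustration of the kind of test involved, here is a toy standalone version (not AESOP's implementation) of the GCD and Banerjee-style bounds tests for the simplest case: two 1-D affine accesses A[a1*i + b1] and A[a2*i + b2] in the same loop 0 <= i < U.

===============================================
#include <stdio.h>

static long gcd(long x, long y)
{
  while (y) { long t = x % y; x = y; y = t; }
  return x;
}

/* Conservatively answer: may A[a1*i + b1] and A[a2*j + b2]
   touch the same element for some 0 <= i, j < U? */
static int may_depend(long a1, long b1, long a2, long b2, long U)
{
  long g, lo, hi, d = b2 - b1;

  /* GCD test: a1*i - a2*j = b2 - b1 has an integer solution
     only if gcd(a1, a2) divides b2 - b1. */
  g = gcd(a1 < 0 ? -a1 : a1, a2 < 0 ? -a2 : a2);
  if (g != 0 && d % g != 0)
    return 0;                      /* provably independent */

  /* Banerjee-style bounds: is b2 - b1 within the range that
     a1*i - a2*j can take over the iteration space? */
  lo = (a1 < 0 ? a1 * (U - 1) : 0) - (a2 > 0 ? a2 * (U - 1) : 0);
  hi = (a1 > 0 ? a1 * (U - 1) : 0) - (a2 < 0 ? a2 * (U - 1) : 0);
  return lo <= d && d <= hi;
}

int main(void)
{
  /* A[2i] vs A[2i+1]: gcd 2 does not divide 1 -> independent. */
  printf("A[2i] vs A[2i+1]: %s\n",
         may_depend(2, 0, 2, 1, 1024) ? "may depend" : "independent");
  /* A[i] vs A[i-1]: overlap with distance 1 -> may depend. */
  printf("A[i]  vs A[i-1]:  %s\n",
         may_depend(1, 0, 1, -1, 1024) ? "may depend" : "independent");
  return 0;
}
===============================================

The real analysis of course works on LLVM IR, handles multidimensional subscripts and direction vectors, and falls back to the ArrayPrivatization analysis as described above.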

-Tim


Re: AESOP autoparallelizing compiler

Sebastian Dreßler
Hi Tim,

This sounds pretty neat. I was asking because we are currently researching
methods for modeling the performance of task execution, and loop analysis
is rather hard, so maybe we could benefit from this part of AESOP. In any
case, the whole compiler is very interesting for us!


Cheers,
Sebastian


--
Mit freundlichen Grüßen / Kind regards

Sebastian Dreßler

Zuse Institute Berlin (ZIB)
Takustraße 7
D-14195 Berlin-Dahlem
Germany

[hidden email]
Phone: +49 30 84185-261

http://www.zib.de/

Re: AESOP autoparallelizing compiler

Hal Finkel
In reply to this post by Timothy Mattausch Creech
This sounds very interesting; thanks for sharing this with the community. I'd also like to know more about this.

Also, out of curiosity, two quick questions:

1. Why are you using the old induction-variable simplification?

2. Are you generating OpenMP runtime calls (I see some omp references in the CodeGenerationPass); if so, for what runtime?

Sincerely,
Hal


Re: AESOP autoparallelizing compiler

Timothy Mattausch Creech
On Sun, Mar 03, 2013 at 02:06:44PM -0600, Hal Finkel wrote:

> [...]
>
> 1. Why are you using the old induction-variable simplification?
>
> 2. Are you generating OpenMP runtime calls (I see some omp references in the CodeGenerationPass); if so, for what runtime?

Hi Hal,
  1. We are still using the old indvars simplification because we were depending on "-enable-iv-rewrite" and canonical induction variables. I think "-enable-iv-rewrite" is still there in 3.0, but it was behaving differently from 2.9, so we just kept using the 2.9 implementation.
  2. We only link to OpenMP (any implementation should be fine) to get the number of threads to use (omp_get_max_threads()). This makes our binaries obey the OMP_NUM_THREADS environment variable, if present. OpenMP is _not_ used for parallelization: AESOP generates pthreads calls and does very lightweight static scheduling of loop iterations at the IR level.

One thing to note is that we do not use pthreads' barriers. We still link to a (spinning) barrier implementation which was part of a now-defunct sister project. We should eventually do away with this dependence, but haven't had time yet.
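
To make answer 2 concrete, here is a rough hand-written C analogue of what our generated code does (illustrative only: the real codegen works at the IR level and synchronizes with the spinning barrier mentioned above rather than creating and joining threads for each loop):

===============================================
#include <pthread.h>
#include <stdio.h>
#include <omp.h>      /* only for omp_get_max_threads() */

#define N 1024
#define MAX_THREADS 64

static double A[N], B[N];

typedef struct { long lo, hi; } range_t;

/* Outlined loop body: runs a contiguous chunk of the iteration space. */
static void *body(void *arg)
{
  range_t *r = (range_t *)arg;
  for (long i = r->lo; i < r->hi; i++)
    B[i] = A[i] + i * 3;           /* the parallelized loop body */
  return NULL;
}

int main(void)
{
  for (long i = 0; i < N; i++)
    A[i] = (double)i;

  /* Thread count comes from OpenMP, so OMP_NUM_THREADS is obeyed. */
  int nthreads = omp_get_max_threads();
  if (nthreads > MAX_THREADS) nthreads = MAX_THREADS;

  pthread_t tid[MAX_THREADS];
  range_t rng[MAX_THREADS];

  /* Lightweight static scheduling: contiguous chunks of iterations. */
  long chunk = (N + nthreads - 1) / nthreads;
  for (int t = 0; t < nthreads; t++) {
    rng[t].lo = t * chunk;
    rng[t].hi = (t + 1) * chunk < N ? (t + 1) * chunk : N;
  }

  /* Workers take chunks 1..n-1; the main thread runs chunk 0 itself. */
  for (int t = 1; t < nthreads; t++)
    pthread_create(&tid[t], NULL, body, &rng[t]);
  body(&rng[0]);
  for (int t = 1; t < nthreads; t++)
    pthread_join(tid[t], NULL);

  printf("B[10] = %f\n", B[10]);
  return 0;
}
===============================================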

Thanks,
Tim


Re: AESOP autoparallelizing compiler

陳韋任 (Wei-Ren Chen)
In reply to this post by Timothy Mattausch Creech
Hi Timothy,

  Do you have data showing how much parallelism AESOP can extract from
benchmarks? :)

Regards,
chenwj

--
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj


Re: AESOP autoparallelizing compiler

Timothy Mattausch Creech
Hi Wei-Ren,
  Sorry for the slow response. We're working on a short tech report which will be up on the website in April. This will contain a "results" section, including results from the SPEC benchmarks which we can't include in the source distribution.

Briefly, I can say that we get good speedups on some of the NAS and SPEC benchmarks, such as 3.6x+ speedups on 4 cores on the serial versions of NAS "CG" (Fortran) and "lbm" (C) from CPU2006. (These are of course among our best results.)

-Tim

Re: AESOP autoparallelizing compiler

rahul
Hi Timothy,

Today I happened to download the code and do some experiments.
I actually wanted to see how you handle interprocedural alias analysis,
so I set the inline threshold to zero and tried the following example:
===============================================
#include <stdio.h>

#define N 1024
void func(double *A, double *B) 
{
  int i;
  for (i=1; i<N-2; i++) {
    B[i] = A[i] + i*3;
  }
}

void func1(double *A, double *B)
{
  func(A,B);
}

int main(void)
{
  double data[N];
  double data1[N];
  double result=0;
  int i;
  
  for (i=0; i<N; i++) {
    result += i*3;
    data[i] = result;
  }
  func1(data, data1);
  
  printf(" Data[10] = %lf\n", data[10]);
  printf("Data1[10] = %lf\n", data1[10]);
  
  return 0;
}
===============================================
I got the following parallelization info after compiling:

    Loop main:for.body has a loop carried scalar dependence, hence will not be parallelized 
    Loop func:for.body carries no dependence, hence is being parallelized 

Since the arguments A and B to function "func" may alias, it shouldn't have parallelized.

Can you please let me know how you compute dependences?


Thanks,
Rahul



--
Regards,
Rahul Patil.


Re: AESOP autoparallelizing compiler

Timothy Mattausch Creech
Hi Rahul,
  Thanks for your interest!

  Our work does not attempt to make any significant contributions to alias analysis; AESOP simply acts as a client of the existing LLVM AA. Furthermore, the options passed to the AESOP frontend scripts are obeyed at compile time, but at link time certain transformations occur unconditionally.

Here, AESOP has actually thwarted your experiment by performing inlining itself just before link time. It does this unconditionally, as part of a standard set of pre-passes meant to help the analysis as much as possible; you can disable it in ~/aesop/blank/Makefile. As a result, AESOP knows that A and B do not alias and is correctly parallelizing here.

Best,
Tim
 

Re: AESOP autoparallelizing compiler

rahul
Hi Timothy,

Thanks for the quick reply :)

I disabled inlining in ~/aesop/blank/Makefile. Now function "main" calls function "func1" with two arguments that alias, and AESOP still goes ahead and parallelizes the loop in function "func":
==============
func1(data, data+1);  /* the two arguments now overlap */
==============
So, irrespective of whether AESOP inlines func1/func, with the call above the loop in function "func" (called from func1) should not have been parallelized: the store to B[i] now writes A[i+1], which the next iteration reads as A[i], i.e., a loop-carried flow dependence of distance 1.

Thanks,
Rahul



--
Regards,
Rahul Patil.
