Which transform passes to apply?

Which transform passes to apply?

Josh Klontz
Hello, I'm a new LLVM user working on a C++ EDSL for image processing. I have a function that, after applying createInstructionCombiningPass() and createDeadCodeEliminationPass(), looks like this:

define void @jitcv_sum_64sf1001(%Matrix* %src, %Matrix* %dst, i32 %len) {
entry:
  br label %loop_i

loop_i:                                           ; preds = %loop_i_end, %entry
  %i = phi i32 [ 0, %entry ], [ %increment_i, %loop_i_end ]
  %0 = getelementptr inbounds %Matrix* %dst, i32 0, i32 2
  %dst_columns = load i32* %0
  %dst_yRem = urem i32 %i, %dst_columns
  %dst_y = urem i32 %i, %dst_columns
  %1 = sub i32 %i, %dst_y
  %2 = add i32 %1, %dst_yRem

  %3 = getelementptr inbounds %Matrix* %src, i32 0, i32 0
  %4 = load i8** %3
  %src_data = bitcast i8* %4 to double*
  %5 = getelementptr double* %src_data, i32 %2
  %6 = load double* %5
  %accumulate = fadd double %6, 0.000000e+00
  %7 = getelementptr inbounds %Matrix* %dst, i32 0, i32 0
  %8 = load i8** %7
  %dst_data = bitcast i8* %8 to double*
  %9 = getelementptr double* %dst_data, i32 %i
  store double %accumulate, double* %9
  br label %loop_i_end

loop_i_end:                                       ; preds = %loop_i
  %increment_i = add i32 %i, 1
  %loop_i_test = icmp eq i32 %increment_i, %len
  br i1 %loop_i_test, label %loop_i_exit, label %loop_i

loop_i_exit:                                      ; preds = %loop_i_end
  ret void
}

My question is which optimization pass(es) are needed to simplify the four instructions from %dst_yRem to %2 (the two identical urem instructions and the sub/add that use them). I've tried running the same passes again, and also tried createInstructionSimplifierPass(), with no luck.

Many Thanks,

Josh

Re: Which transform passes to apply?

Duncan Sands
Hi Josh,

On 03/12/12 02:58, Josh Klontz wrote:
> Hello, I'm a new LLVM user working on a C++ EDSL for image processing. I have
> a function, which after applying createInstructionCombiningPass() and
> createDeadCodeEliminationPass() looks like:
>

...
>    *%dst_yRem = urem i32 %i, %dst_columns
>    %dst_y = urem i32 %i, %dst_columns
>    %1 = sub i32 %i, %dst_y
>    %2 = add i32 %1, %dst_yRem*
...

> My question is which optimization pass(es) are needed to simplify the
> instructions in bold. I've tried running the same passes again and also
> tried createInstructionSimplifierPass() with no luck.

GVN or EarlyCSE.

Ciao, Duncan.
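
To see why merging the two urem instructions matters, here is a minimal C++ sketch. It is not LLVM code, and the function names are made up for illustration. It mirrors the four instructions from the question. Once a CSE-style pass such as GVN or EarlyCSE recognizes %dst_yRem and %dst_y as the same value r, the remaining sub/add pair has the form (i - r) + r, which is just i. The sketch assumes columns is nonzero, as urem requires.

```cpp
#include <cstdint>

// Mirrors the IR as emitted by the frontend: two separate remainders.
std::uint32_t index_as_emitted(std::uint32_t i, std::uint32_t columns) {
    std::uint32_t y_rem = i % columns;  // %dst_yRem = urem i32 %i, %dst_columns
    std::uint32_t y     = i % columns;  // %dst_y    = urem i32 %i, %dst_columns
    std::uint32_t t     = i - y;        // %1 = sub i32 %i, %dst_y
    return t + y_rem;                   // %2 = add i32 %1, %dst_yRem
}

// What remains once both remainders are known to be one value r:
// (i - r) + r == i, so the index is simply i and columns is unused.
std::uint32_t index_simplified(std::uint32_t i, std::uint32_t /*columns*/) {
    return i;
}
```

Both functions return the same value for every i and every nonzero columns, which is why the whole sequence can disappear once the duplicate urem is eliminated.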

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: Which transform passes to apply?

Josh Klontz
Thanks Duncan, GVN/EarlyCSE worked as suggested. Any pointers on how to optimize out:
%accumulate = fadd double %6, 0.000000e+00

Using the 3.1 release and the C++ API, I can't figure out how FPMathOperator, TargetOptions, or IRBuilder::SetDefaultFPMathTag work. I also don't see any floating-point math transformation passes. I did see IRBuilder::SetFastMathFlags; do I need to update to 3.2 and use this call?

Re: Which transform passes to apply?

David Tweed
Hi,

If you want LLVM to do this, you'll definitely have to enable unsafe-math
flags, since the transformation isn't strictly valid in IEEE floating point
(the only web reference I can find quickly is http://dlang.org/float.html).
Duncan has been doing some work implementing things in that area that I
haven't kept up with. However, it might be worth tracking the no-op
operations you want to remove at your own DSL level, so that you have more
control over when these fp-unsafe optimizations are applied. (About a year
ago I was doing something with automatic differentiation, which throws up
lots of "multiply by 1"s, "add 0"s, etc., and found this was the way to go
then, but as mentioned there's been some activity in the area I haven't been
following.)
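
As a rough illustration of handling this at the DSL level, the sketch below folds "add 0" and "multiply by 1" while the expression tree is being built, before any IR is generated. Everything here (Expr, constant, variable, add, mul) is a hypothetical mini-EDSL invented for the example, not part of LLVM or of Josh's project. Note that the add-zero fold deliberately ignores the sign of zero, which is exactly the fp-unsafe choice the frontend is taking responsibility for.

```cpp
#include <memory>
#include <string>

// Hypothetical expression node for a tiny EDSL (illustration only).
struct Expr {
    enum Kind { Const, Var, Add, Mul } kind;
    double value;                         // used when kind == Const
    std::string name;                     // used when kind == Var
    std::shared_ptr<const Expr> lhs, rhs; // used for Add and Mul
};
using ExprPtr = std::shared_ptr<const Expr>;

ExprPtr constant(double v) {
    return std::make_shared<Expr>(Expr{Expr::Const, v, "", nullptr, nullptr});
}

ExprPtr variable(const std::string& n) {
    return std::make_shared<Expr>(Expr{Expr::Var, 0.0, n, nullptr, nullptr});
}

// True if e is a constant equal to v (-0.0 compares equal to 0.0 here).
bool is_const(const ExprPtr& e, double v) {
    return e->kind == Expr::Const && e->value == v;
}

// Builds a + b, dropping an additive zero on either side.
ExprPtr add(ExprPtr a, ExprPtr b) {
    if (is_const(a, 0.0)) return b;
    if (is_const(b, 0.0)) return a;
    return std::make_shared<Expr>(Expr{Expr::Add, 0.0, "", a, b});
}

// Builds a * b, dropping a multiplicative one on either side.
ExprPtr mul(ExprPtr a, ExprPtr b) {
    if (is_const(a, 1.0)) return b;
    if (is_const(b, 1.0)) return a;
    return std::make_shared<Expr>(Expr{Expr::Mul, 0.0, "", a, b});
}
```

With these builders, add(x, constant(0.0)) returns the original x node, so no fadd is ever emitted for it. A real frontend might make the fold optional so users can opt out when signed zeros matter.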

Regards,
Dave
 

Re: Which transform passes to apply?

Duncan Sands
Hi Josh,

On 04/12/12 06:23, Josh Klontz wrote:
> Thanks Duncan, GVN/EarlyCSE worked as suggested. Any pointers on how to
> optimize out:
> %accumulate = fadd double %6, 0.000000e+00

instcombine; however, this transform is only correct if you are adding -0.0,
so you won't get it for +0.0 without "fast math", and in 3.2 I think it will
only be done by codegen (llc).  The situation should be better in 3.3, since
a bunch of fast-math support has started going into the IR-level transforms.
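
The difference between adding +0.0 and -0.0 can be checked with a small standalone C++ sketch. The function names are made up for illustration, and it assumes the default IEEE 754 behaviour: round-to-nearest, and no -ffast-math or similar compiler flag. The one input where x + 0.0 differs from x is x = -0.0, because (-0.0) + (+0.0) yields +0.0.

```cpp
#include <cmath>

// x + (+0.0): not an identity, since (-0.0) + (+0.0) is +0.0.
double add_positive_zero(double x) { return x + 0.0; }

// x + (-0.0): returns x unchanged for every non-NaN x, both zeros included.
double add_negative_zero(double x) { return x + (-0.0); }

// True when a and b are equal and also agree on the sign bit,
// so that +0.0 and -0.0 are told apart.
bool same_value_and_sign(double a, double b) {
    return a == b && std::signbit(a) == std::signbit(b);
}
```

Calling add_positive_zero(-0.0) gives a result whose sign bit is clear, while add_negative_zero(-0.0) keeps the sign bit set. This is why an accumulator initialized with -0.0 can be folded away without any unsafe-math setting.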

>
> Using the 3.1 release and the C++ API, I can't figure out how
> FPMathOperator, TargetOptions, nor IRBuilder::SetDefaultFPMathTag work. I
> also don't see any floating point math transformation passes. I did see
> IRBuilder::SetFastMathFlags, do I need to update to 3.2 and use this call?

Please open a bug report since even in mainline this transform isn't done yet
(as fast-math support at the IR level only just got going).

Ciao, Duncan.


Re: Which transform passes to apply?

Josh Klontz
Filed under http://llvm.org/bugs/show_bug.cgi?id=14513. The -0.0 trick did the job in my case. Thanks again!