[llvm-dev] Sub-optimal register allocation


Tim Northover via llvm-dev
Hi,

I am using Halide and trying to generate a simplified version of the inner kernel of a GEMM operation, similar to this. Basically, it multiplies a 12x1 column vector by a 1x4 row vector and accumulates the product into a 12x4 accumulator tile. I am targeting 32-bit ARM NEON.
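
For readers who want the computation spelled out, here is a minimal scalar sketch of that update in plain C++. It is not the Halide code from this post; the names (`rank1_update`, `Tile`, `Column`, `Row`) are made up for illustration, and the `float` element type is an assumption, since the element type is not stated above.

```cpp
#include <array>
#include <cstddef>

// Tile shape taken from the description: 12x1 column times 1x4 row.
constexpr std::size_t kRows = 12;
constexpr std::size_t kCols = 4;

// Assumption: 32-bit float elements (not stated in the post).
using Column = std::array<float, kRows>;
using Row    = std::array<float, kCols>;
using Tile   = std::array<std::array<float, kCols>, kRows>;

// Rank-1 update: acc[i][j] += a[i] * b[j] for every cell of the tile.
// A GEMM inner kernel repeats this once per step along the shared dimension.
void rank1_update(Tile& acc, const Column& a, const Row& b) {
    for (std::size_t i = 0; i < kRows; ++i) {
        for (std::size_t j = 0; j < kCols; ++j) {
            acc[i][j] += a[i] * b[j];
        }
    }
}
```

The point of the sketch is only that the 48 accumulator cells stay live across every iteration of the surrounding loop, which is why they would ideally stay in registers.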

Ideally, all the accumulators and operands should fit in the q registers, without spilling to the stack. However, the generated ARM assembly uses the registers in a sub-optimal way, and keeps spilling registers onto the stack and reloading them.
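
As a rough check of that expectation, the register budget can be worked out at compile time. This is a sketch under one assumption: the elements are 32 bits wide (for example float32), which is not stated above. The 16 q registers (q0 to q15) of 128 bits each are what 32-bit ARM NEON provides; the helper name `q_regs_for` is invented for this example.

```cpp
#include <cstddef>

constexpr std::size_t kQRegBytes = 16;  // one q register is 128 bits
constexpr std::size_t kNumQRegs  = 16;  // q0..q15 on 32-bit ARM NEON
constexpr std::size_t kElemBytes = 4;   // assumption: 32-bit elements

// Number of q registers needed to hold `elements` values (rounded up).
constexpr std::size_t q_regs_for(std::size_t elements) {
    return (elements * kElemBytes + kQRegBytes - 1) / kQRegBytes;
}

constexpr std::size_t kAccRegs = q_regs_for(12 * 4);  // 12x4 accumulator tile
constexpr std::size_t kColRegs = q_regs_for(12);      // 12x1 column operand
constexpr std::size_t kRowRegs = q_regs_for(4);       // 1x4 row operand
constexpr std::size_t kTotalRegs = kAccRegs + kColRegs + kRowRegs;

static_assert(kAccRegs == 12, "accumulators take 12 q registers");
static_assert(kColRegs == 3, "column operand takes 3 q registers");
static_assert(kRowRegs == 1, "row operand takes 1 q register");
static_assert(kTotalRegs <= kNumQRegs, "tile plus operands fit in q0..q15");
```

Under that assumption the total is exactly 16, so everything fits, but with no q register left over for temporaries.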

The relevant part of the LLVM IR is here, and the corresponding arm32 assembly is here.

Any help on how to solve this, or on what might be causing it, would be greatly appreciated.

Thanks a lot,
Mohamed



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev