GSoC Proposal: Inter-Procedure Program Slicing in LLVM

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

GSoC Proposal: Inter-Procedure Program Slicing in LLVM

liuml07
Hi all,

This is a GSoC 2013 proposal for LLVM project. Please see the formatted version at here: http://pacman.cs.tsinghua.edu.cn/~liuml07/files/gsoc2013-proposal-program-slicing.pdf

Program slicing has been used in many applications, the criteria of which is a pair of statement and variables. I would like to write an inter-procedural program slicing pass in LLVM, which is able to calculate C program slices of source code effectively. There is no previous work implemented in LLVM, which considers both the dynamic program slicing and source code of the sliced program. Program slice contains all statements in a program that directly or indirectly act the value of a variable occurrence [5]. Program slicing has been used in many applications, e.g., program verification, testing, maintenance, automatic parallelization of program execution, automatic integration of program versions.

While it's straightforward to implement the slicer in the back-end of compiler using SSA form, the source code of the original program instead of intermediate
representation is preferred in most cases. Moreover, we can further narrow the notion of slice, which contains statements that influence the value of a
variable occurrence for speci c program inputs. This is referred as dynamic program slicing [1]. However, there is no previous work implemented in LLVM
which solved the two problems.

There are two public projects which implement the backwards static slicing in LLVM.
  • Giri Written by John Criswell from UIUC, a subproject of LLVM. The Giri code contains the static backwards intra-procedure slicing passes, and runs with an older version of LLVM. It also only backtracks until it hits a load. Additional code must be written to backtrack further to find potentially reaching stores.
  • LLVMSlicer This implementation is a complete static backwards slicer from Masaryk University. It works on the well de ned data and control flow equations in a white paper by F. Tip [4]. However, this code was written for special purpose, thus it's not general enough to be use by others. They implemented the Andersen's alias algorithm [2], callgraph, and modi es analysis to support the slicer, instead of using the LLVM APIs.

They eliminate the useless IR statements and keep the ones a ect the values of the criteria. However, neither of them generates the compilable source code
slice, which is heavily needed in reality. There are several ways to do this. One is to generate the source code from sliced IR using llc tool. The issue is that
the IR is not concerned with high-level semantic. The generated source code is different from the original program and not suitable for human reading. An-
other approach is deleting the source code according to sliced IR, with line number information (in meta-data of each instruction). The naive script deleting
sliced source code one by one fails to handle tricky cases. I think a better source code slicer is to take use of the AST info.

There are three main parts of this work.
  1. First, borrow an implementation of static back-wards slicing from Giri or LLVMSlicer, and use the LLVM callgraph, mod/ref and alias interfaces as much as possible.
  2. Second, implement the dynamic program slicing using the approach 3 in the paper [1].
  3. Third, generate the source code of the sliced program. To make the sliced source code compile directly, we need to employ clang front-end

The final result of this summer of code is to make this pass work e effectively and documented well. Further, I'll write test cases and behave as the active
maintainer for this project. My long-term plan is to add more features, e.g. Objective-C/C++ support, thing slicing [3], to this project.

Any comment is highly appreciated.


[1] H AGRAWAL. Dynamic Program Slicing. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI),1990.
2] L.O. Andersen. Program analysis and specialization for the C programming language. PhD thesis, University of Cophenhagen, Germany, 1994.
[3] M. Sridharan, S.J. Fink, and R. Bodik. Thin slicing. In PLDI'07, 2007.
[4] F. Tip. A survey of program slicing techniques.Journal of Programming Languages, 1995.
[5] M. Weiser. Program slicing. In Proceedings of the 5th International Conference on Software Engineering, ICSE, pages 439{449. IEEE, 1981.

--
Mingliang LIU (刘明亮 in Chinese)

PACMAN Group,  Dept. of Computer Science & Technology
Tsinghua University, Beijing 100084, China
Email: [hidden email]
Homepage: http://pacman.cs.tsinghua.edu.cn/~liuml07/

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

gsoc2013-proposal-program-slicing.pdf (152K) Download Attachment