[llvm-dev] LLVM Call Graph may not cover all calls

[llvm-dev] LLVM Call Graph may not cover all calls

David Jones via llvm-dev
Hi there,
   I am working with opt-6.0 and trying to generate a call graph of libsndfile, but the call graph does not seem to cover all call relationships.
   Specifically, I am doing static analysis on CVE-2014-8130, a division by zero in TIFFWriteScanline in libtiff/tif_write.c (see https://security-tracker.debian.org/tracker/CVE-2014-8130).
   In theory, the main function in tiffdither.c calls fsdither, and fsdither calls TIFFWriteScanline:   main (tiffdither.c) -> fsdither (tiffdither.c) -> TIFFWriteScanline (tif_write.c)
   I want a call graph of the buggy program tiffdither, but the generated call graph is missing the edge fsdither -> TIFFWriteScanline.
   In short, the call graph shows TIFFWriteScanline as being called only by an external node.
   I have already compiled tiffdither and attached the resulting bitcode. I have also written a small Python script to help analyze the dot file.
   To generate the dot file I run  opt-6.0 -analyze -dot-callgraph tiffdither.bc, then set dotPath in dotHandle.py to point at it. Feel free to modify the Python code for your own analysis.
   I can't figure out why this happens, and I would really appreciate any help!
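
For readers without the attachment, here is a minimal sketch of the kind of dot analysis described above. It is not dotHandle.py. The sample text, node IDs, and helper names (parse_callgraph, callers_of) are made up for illustration, and the line shapes are assumed to match what opt's -dot-callgraph output usually looks like. Edges refer to node IDs rather than function names, so the sketch maps IDs to labels first and falls back to the raw ID when a node has no label line.

```python
import re
from collections import defaultdict

# Hypothetical sample, hand-written in the assumed shape of opt's output.
SAMPLE_DOT = '''
digraph "Call graph" {
    label="Call graph";
    Node0x1 [shape=record,label="{external node}"];
    Node0x1 -> Node0x2;
    Node0x1 -> Node0x4;
    Node0x2 [shape=record,label="{main}"];
    Node0x2 -> Node0x3;
    Node0x3 [shape=record,label="{fsdither}"];
    Node0x4 [shape=record,label="{TIFFWriteScanline}"];
}
'''

NODE_RE = re.compile(r'^\s*(Node0x[0-9a-fA-F]+)\s*\[.*label="\{(.*?)\}"')
EDGE_RE = re.compile(r'^\s*(Node0x[0-9a-fA-F]+)\s*->\s*(Node0x[0-9a-fA-F]+)')


def parse_callgraph(dot_text):
    """Return {caller_name: [callee_name, ...]} from dot text."""
    labels = {}   # node ID -> function name
    edges = []    # (source ID, destination ID)
    for line in dot_text.splitlines():
        node = NODE_RE.match(line)
        if node:
            labels[node.group(1)] = node.group(2)
            continue
        edge = EDGE_RE.match(line)
        if edge:
            edges.append((edge.group(1), edge.group(2)))
    graph = defaultdict(list)
    for src, dst in edges:
        # Fall back to the raw ID so an unlabeled node never becomes "".
        graph[labels.get(src, src)].append(labels.get(dst, dst))
    return dict(graph)


def callers_of(graph, callee):
    """List every caller that has an edge to callee."""
    return sorted(c for c, callees in graph.items() if callee in callees)


graph = parse_callgraph(SAMPLE_DOT)
print(callers_of(graph, "TIFFWriteScanline"))
```

On this sample, the only caller reported for TIFFWriteScanline is the external node, which is the symptom described in this message.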

Thanks & Regards,
Chaz

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

tiffdither.bc (2M) Download Attachment
dotHandle.py (2K) Download Attachment

Re: [llvm-dev] LLVM Call Graph may not cover all calls

David Jones via llvm-dev
What are you generating the call graph from? Generally a compiler only acts on a per-translation-unit basis, so it can't form the complete program call graph across multiple files (which would explain why it's missing the edge that crosses the boundary between .c files).

On Thu, Nov 8, 2018, 8:43 AM changze cui via llvm-dev <[hidden email]> wrote:
Hi there,
   I am working with opt-6.0 and trying to generate a call graph of libsndfile, but the call graph does not seem to cover all call relationships.
   Specifically, I am doing static analysis on CVE-2014-8130, a division by zero in TIFFWriteScanline in libtiff/tif_write.c (see https://security-tracker.debian.org/tracker/CVE-2014-8130).
   In theory, the main function in tiffdither.c calls fsdither, and fsdither calls TIFFWriteScanline:   main (tiffdither.c) -> fsdither (tiffdither.c) -> TIFFWriteScanline (tif_write.c)
   I want a call graph of the buggy program tiffdither, but the generated call graph is missing the edge fsdither -> TIFFWriteScanline.
   In short, the call graph shows TIFFWriteScanline as being called only by an external node.
   I have already compiled tiffdither and attached the resulting bitcode. I have also written a small Python script to help analyze the dot file.
   To generate the dot file I run  opt-6.0 -analyze -dot-callgraph tiffdither.bc, then set dotPath in dotHandle.py to point at it. Feel free to modify the Python code for your own analysis.
   I can't figure out why this happens, and I would really appreciate any help!

Thanks & Regards,
Chaz

Re: [llvm-dev] LLVM Call Graph may not cover all calls

David Jones via llvm-dev
Looks like this is old LLVM IR. I don't have an old version of LLVM that can read it (I only have LLVM built from Subversion). I'm not sure what version it is, but if it's not past the upgrade horizon, sharing bitcode rather than textual IR might be more effective, since LLVM has backwards-compatibility guarantees for bitcode.

In any case, I'd suggest you try to create a reduced test case that demonstrates the problem. Strip out unrelated instructions, functions, etc., while preserving the IR, and see if it still produces the problem. When you can't strip out anything else without losing the problem, you have something someone else can more easily look at and explain. Equally, by doing this you may find something critical that helps you understand what's going on.
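
The reduction loop described above can be sketched as a greedy minimizer. This is only an illustration under stated assumptions: reduce_greedy and still_reproduces are hypothetical names, and the toy predicate stands in for the real check, which would be something like re-running opt on the stripped module and testing whether the expected edge is still missing.

```python
def reduce_greedy(items, still_reproduces):
    """Drop items one at a time for as long as the problem keeps reproducing."""
    kept = list(items)
    changed = True
    while changed:
        changed = False
        for i in range(len(kept)):
            candidate = kept[:i] + kept[i + 1:]
            if still_reproduces(candidate):
                # This item was not needed; restart from the smaller list.
                kept = candidate
                changed = True
                break
    return kept


# Toy stand-in for the real check: pretend the problem only shows up
# while these two functions are both still in the module.
needed = {"wav_read_header", "wav_w64_read_fmt_chunk"}


def toy_check(functions):
    return needed <= set(functions)


functions = [
    "psf_open_file",
    "wav_open",
    "wav_read_header",
    "wav_w64_read_fmt_chunk",
    "unrelated_a",
    "unrelated_b",
]
print(reduce_greedy(functions, toy_check))
```

The loop stops when removing any single remaining item makes the check fail, which matches the "can't strip out anything else" stopping condition.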

On Wed, Nov 14, 2018 at 7:22 PM changze cui <[hidden email]> wrote:
 Hi Dave,
    As you guessed, I do use wllvm for the compilation and extract-bc to get the bitcode.
    The call graph now works fine for CVE-2014-8130 after I recompiled the program. I don't know why; it is strange.
    However, the call graph still has problems for CVE-2017-16942: some edges are missing. I followed your advice and checked the IR, and everything is in there. I also tried recompiling the program, but that did not help.
    According to the source code, the call chain for CVE-2017-16942 is:
           psf_open_file -> wav_open -> wav_read_header -> wav_w64_read_fmt_chunk (this is the buggy function!)
    The IR shows the same call relationships (see the attached file 16942.ll).
    But the call graph generated by opt is missing   psf_open_file -> wav_open  and   wav_read_header -> wav_w64_read_fmt_chunk.
    I also noticed something interesting: some nodes that appear in edges do not show up in the node list, so an edge may look like   psf_open_file -> ""   (I am currently using pydot to process the dot file generated by opt). Maybe this is related to the missing call relationships? I have no idea.
    I have attached the dot file and the analysis result. The dot file is generated by opt, and the analysis result shows the caller-to-callee map (map[caller] = [callee1 callee2 callee3 ...]).
    Do you have any other ideas?
    Thanks a lot!

Regards,
Chaz
    

On Sun, Nov 11, 2018 at 5:01 AM David Blaikie <[hidden email]> wrote:


On Fri, Nov 9, 2018 at 10:39 PM changze cui <[hidden email]> wrote:
Hi David,
    Thanks for your reply!
    Actually, I compiled the program into an executable and used extract-bc to get the .bc file from the executable.

I can't say I'd heard of extract-bc. Googling around, I came across this: https://github.com/travitch/whole-program-llvm - is that what you're using? And did you build the program with 'wllvm'?
 
Also, I can trigger the CVE from the executable, which means the function TIFFWriteScanline is inside the executable. So I think it is one translation unit and dot-callgraph is supposed to handle this.

Did you take a look at the LLVM IR (llvm-dis will give you a textual representation of a bitcode file) to make sure everything's in there that you expect to be? Are there function definitions (not only declarations) of all the entities you want in the call graph? Are they calling each other directly, etc?
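
A rough way to run those checks over the llvm-dis output is sketched below. It is a regex heuristic over textual IR, not an LLVM API. The IR fragment is hand-written in LLVM 6 era typed-pointer syntax, and classify_calls is a hypothetical helper. One possibility worth checking (an assumption, not something confirmed in this thread) is that a call whose callee is a bitcast of a function, or a plain function pointer, is not recorded as an edge to that function and is attributed to the external node instead.

```python
import re

# Hypothetical fragment, written by hand for illustration only.
SAMPLE_IR = '''
declare i32 @helper(i32)

define i32 @wav_open(i8* %p) {
  %r = call i32 @helper(i32 1)
  ret i32 %r
}

define i32 @psf_open_file(i32* %q, i32 (i8*)* %fp, i8* %p) {
  %a = call i32 bitcast (i32 (i8*)* @wav_open to i32 (i32*)*)(i32* %q)
  %b = call i32 %fp(i8* %p)
  ret i32 %a
}
'''

DEFINE_RE = re.compile(r'^define\b.*?@([\w.$]+)\s*\(')
DECLARE_RE = re.compile(r'^declare\b.*?@([\w.$]+)\s*\(')
CALL_RE = re.compile(r'\bcall\b')
# The leftmost alternative to match after "call" decides the kind of callee.
CALLEE_RE = re.compile(
    r'@(?P<direct>[\w.$]+)\s*\('
    r'|\bbitcast\s*\(.*?@(?P<cast>[\w.$]+)\s+to\b'
    r'|%(?P<indirect>[\w.$]+)\s*\('
)


def classify_calls(ir_text):
    """Return (defined, declared, calls); calls holds (caller, kind, callee)."""
    defined, declared, calls = set(), set(), []
    current = None  # name of the function body being scanned
    for line in ir_text.splitlines():
        m = DEFINE_RE.match(line)
        if m:
            current = m.group(1)
            defined.add(current)
            continue
        m = DECLARE_RE.match(line)
        if m:
            declared.add(m.group(1))
            continue
        if line.startswith('}'):
            current = None
            continue
        call = CALL_RE.search(line)
        if current is None or call is None:
            continue
        callee = CALLEE_RE.search(line, call.end())
        if callee:
            for kind in ("direct", "cast", "indirect"):
                if callee.group(kind):
                    calls.append((current, kind, callee.group(kind)))
    return defined, declared, calls


defined, declared, calls = classify_calls(SAMPLE_IR)
for caller, kind, callee in calls:
    print(caller, kind, callee, "(defined)" if callee in defined else "")
```

Any edge that is expected in the call graph but shows up here as "cast" or "indirect", or whose callee is only declared, is a candidate explanation for a missing edge.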

- Dave
 
    Do you have other ideas?

Thanks & Regards,
Chaz

On Sat, Nov 10, 2018 at 1:44 AM David Blaikie <[hidden email]> wrote:
What are you generating the call graph from? Generally a compiler only acts on a per-translation-unit basis, so it can't form the complete program call graph across multiple files (which would explain why it's missing the edge that crosses the boundary between .c files).