[llvm-dev] XRay TID mismatch when forking

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

[llvm-dev] XRay TID mismatch when forking

Tom Stellard via llvm-dev

Hello,

 

There seems to be a bug with the value of the thread ID (TID) that is generated by XRay when calling fork(). The value of the TID of the child has the same TID as its parent (this is seen in the “thread: “ field of the YAML output produced by the llvm-xray tool). This is a problem as the trace generated contains the same TID for the parent and children processes, making it difficult to distinguish between threads running in different processes.


I am running linux on x86_64 machine and am using clang 7.0.

henry@OptiPlex-9010:~$ cat /etc/issue
Ubuntu 16.04.4 LTS \n \l

henry@OptiPlex-9010:~$ uname -a
Linux OptiPlex-9010 4.13.0-43-generic #48~16.04.1-Ubuntu SMP Thu May 17 12:56:46 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Together with this email I include sample code that should allow you to reproduce the problem, as well as the output I obtain from running that code, and that shows the problem described above.

Below is a test file I wrote to show the bug:

The parent process prints its own tid, calls fork ten times and for each fork, the child prints own TID using the gettid syscall. The syscall code is based on that of the sanitizer code – to make sure that I am using the same mechanism as xray to obtain the thread ID from the platform.

=================

#include <stdio.h>

#include <unistd.h>

#include <stdint.h>

#include <sys/syscall.h>

 

//taken from sanitizer

 

#define SYSCALL(name) __NR_ ## name

 

static uintptr_t internal_syscall(uint64_t nr) {

  uint64_t retval;

  asm volatile("syscall" : "=a"(retval) : "a"(nr) : "rcx", "r11",

               "memory", "cc");

  return retval;

}

 

/////////////

 

int main()

{

                printf("\nTID of parent: %lu\n", internal_syscall(SYSCALL(gettid)));

                for(int i = 0; i < 10; i++)

                {

                                if(fork())

                                {

                                }

                                else

                                {

                                                printf("\nTID of child: %lu\n", internal_syscall(SYSCALL(gettid)));

                                                break;

                                }

                }

                return 0;

}


I compiled the test file using clang 7.0 and ran it:

=================

 

./clang++ -fxray-instrument test.cpp –o test

XRAY_OPTIONS="patch_premain=true xray_mode=xray-basic verbosity=1" ./test


The output shows that logger prints out the parent TID 11 times, but making the gettid syscall in the test shows that the TIDs of the children are different than their parent process.

Here is the output:

=================

 


==24020==XRay: Log file in 'xray-log.test.yLWPSD'

 

TID of parent: 24020

 

TID of child: 24021

 

TID of child: 24022

 

TID of child: 24023

 

TID of child: 24024

 

TID of child: 24025

 

TID of child: 24026

 

TID of child: 24027

 

TID of child: 24028

 

TID of child: 24029

 

TID of child: 24030

==24021==Cleaned up log for TID: 24020

==24022==Cleaned up log for TID: 24020

==24024==Cleaned up log for TID: 24020

==24026==Cleaned up log for TID: 24020

==24025==Cleaned up log for TID: 24020

==24023==Cleaned up log for TID: 24020

==24028==Cleaned up log for TID: 24020

==24027==Cleaned up log for TID: 24020

==24029==Cleaned up log for TID: 24020

==24030==Cleaned up log for TID: 24020

==24020==Cleaned up log for TID: 24020            

=================

       

The llvm-xray tool reports the same TID across all threads when running the tool on the generated trace file.

=================

---                                                                                                                                                                  

header:                                                                                                                                                              

  version:         2                                                                                                                                                

  type:            0                                                                                                                                                 

  constant-tsc:    true                                                                                                                                             

  nonstop-tsc:     true                                                                                                                                             

  cycle-frequency: 3400000000                                                                                                                                        

records:                                                                                                                                                            

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                     

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                     

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                     

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                     

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                     

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                     

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                     

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                     

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-enter, tsc: 4300740411623868 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind: function-exit, tsc: 4300740412091416 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 1, thread: 24065, kind: function-exit, tsc: 4300740412505836 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind: function-exit, tsc: 4300740412671366 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 1, thread: 24065, kind: function-exit, tsc: 4300740412775572 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind: function-exit, tsc: 4300740412934286 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind: function-exit, tsc: 4300740413141394 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 2, thread: 24065, kind: function-exit, tsc: 4300740413259282 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind: function-exit, tsc: 4300740413338498 }                                                      

  - { type: 0, func-id: 1, function: main, cpu: 0, thread: 24065, kind: function-exit, tsc: 4300740413514362 }                                                       

  - { type: 0, func-id: 1, function: main, cpu: 3, thread: 24065, kind: function-exit, tsc: 4300740413559132 }                                                      

- { type: 0, func-id: 1, function: main, cpu: 1, thread: 24065, kind: function-exit, tsc: 4300740413725842 }                                                      


Is this an issue that can be fixed? If not, is there any advice on how to make xray generate different TIDs when forking on my end?


Attached below is the file I tested with.

Thanks,

Henry



 

 

 



_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

test.cpp (820 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] XRay TID mismatch when forking

Tom Stellard via llvm-dev
Hi Henry,
On Thu, Jun 7, 2018 at 11:51 PM Henry Zhu via llvm-dev
<[hidden email]> wrote:

>
> Hello,
>
>
>
> There seems to be a bug with the value of the thread ID (TID) that is generated by XRay when calling fork(). The value of the TID of the child has the same TID as its parent (this is seen in the “thread: “ field of the YAML output produced by the llvm-xray tool). This is a problem as the trace generated contains the same TID for the parent and children processes, making it difficult to distinguish between threads running in different processes.
>
>
> I am running linux on x86_64 machine and am using clang 7.0.
>
[snip]
>
> Is this an issue that can be fixed? If not, is there any advice on how to make xray generate different TIDs when forking on my end?
>

XRay's basic mode doesn't get the thread ID for every event, and
actually caches the thread ID the first time it creates a thread-local
buffer it uses. This means because you effectively have a
single-threaded application, then forks a number of times, it will get
a copy of the thread ID in the parent process' thread.

There's ways to fix this but will make basic mode a tad more expensive
by getting the thread ID every time we write an event.

We can also attempt to write different files per-process, which means
it's also going to be a bit more expensive to check from the XRay
handler whether the process' ID is different from the cached data we
get from the thread-local structure.

The implementation of the Basic Mode handlers are in
https://github.com/llvm-project/llvm-project-20170507/blob/master/compiler-rt/lib/xray/xray_basic_logging.cc
-- in particular that's 'getThreadLocalData()' which caches the thread
ID.

Note that we haven't considered the multi-process use-case very much
with XRay instrumentation -- and that we may need to make some
adjustments on how to handle multi-process programming effectively.

We *could* have multi-process detection done somehow as an option. If
you have time to give that a shot, then I'd be happy to review the
changes to make that happen.

Cheers

--
Dean
_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reply | Threaded
Open this post in threaded view
|

Re: [llvm-dev] XRay TID mismatch when forking

Tom Stellard via llvm-dev


> On 27 Jun 2018, at 03:14, Henry Zhu <[hidden email]> wrote:
>
> I will give it a go by adding a multiprocess option as a flag. Should I update the TID each time an event is written?
>

Thanks — looking forward to the patch!

I suspect for Basic mode, it should be OK to rely on pthread caching the thread ID anyway (or at least is cheap enough to call anytime).

> Also, this issue occurs in FDR mode as well.

FDR mode is going to be a little tricky.

FDR mode writes down a metadata record at the beginning of each buffer with repetitive information (like thread ID and in this case process ID) written down once. We can certainly add a new metadata record in FDR mode to capture the process ID, but we’ll have to also update the LLVM tools. See my response on the other thread about how this is tied together.

>
> Thanks
>
> On Fri, Jun 8, 2018 at 5:02 AM, Dean Michael Berris <[hidden email]> wrote:
> Hi Henry,
> On Thu, Jun 7, 2018 at 11:51 PM Henry Zhu via llvm-dev
> <[hidden email]> wrote:
> >
> > Hello,
> >
> >
> >
> > There seems to be a bug with the value of the thread ID (TID) that is generated by XRay when calling fork(). The value of the TID of the child has the same TID as its parent (this is seen in the “thread: “ field of the YAML output produced by the llvm-xray tool). This is a problem as the trace generated contains the same TID for the parent and children processes, making it difficult to distinguish between threads running in different processes.
> >
> >
> > I am running linux on x86_64 machine and am using clang 7.0.
> >
> [snip]
> >
> > Is this an issue that can be fixed? If not, is there any advice on how to make xray generate different TIDs when forking on my end?
> >
>
> XRay's basic mode doesn't get the thread ID for every event, and
> actually caches the thread ID the first time it creates a thread-local
> buffer it uses. This means because you effectively have a
> single-threaded application, then forks a number of times, it will get
> a copy of the thread ID in the parent process' thread.
>
> There's ways to fix this but will make basic mode a tad more expensive
> by getting the thread ID every time we write an event.
>
> We can also attempt to write different files per-process, which means
> it's also going to be a bit more expensive to check from the XRay
> handler whether the process' ID is different from the cached data we
> get from the thread-local structure.
>
> The implementation of the Basic Mode handlers are in
> https://github.com/llvm-project/llvm-project-20170507/blob/master/compiler-rt/lib/xray/xray_basic_logging.cc
> -- in particular that's 'getThreadLocalData()' which caches the thread
> ID.
>
> Note that we haven't considered the multi-process use-case very much
> with XRay instrumentation -- and that we may need to make some
> adjustments on how to handle multi-process programming effectively.
>
> We *could* have multi-process detection done somehow as an option. If
> you have time to give that a shot, then I'd be happy to review the
> changes to make that happen.
>
> Cheers
>
> --
> Dean
>

-- Dean

_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev