[llvm-dev] Minimal UBSAN runtime with ASAN?


David Jones via llvm-dev

Hello, 


In my organization, we've been using ASAN and most of the UBSAN checks in the default developer build mode with great success. I'd like to enable the few remaining UBSAN checks too, but noticed that they cause significant binary size overhead (up to 2x in some cases), mostly in .rodata and .data. These checks are: null, function, vptr, object-size.
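
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of how the per-section overhead could be computed. It assumes the SysV-style table printed by binutils `size -A`; the helper names and the sample numbers are made up for illustration and are not taken from our builds.

```python
def parse_size_output(text):
    """Parse `size -A` style output into {section_name: size_in_bytes}."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        # Section rows look like "<name> <size> <addr>"; names start with a dot.
        if len(fields) >= 2 and fields[0].startswith(".") and fields[1].isdigit():
            sizes[fields[0]] = int(fields[1])
    return sizes


def overhead(baseline, sanitized, sections=(".rodata", ".data")):
    """Return the sanitized/baseline size ratio for each requested section."""
    return {
        name: sanitized[name] / baseline[name]
        for name in sections
        if baseline.get(name) and name in sanitized  # skip missing or empty sections
    }


# Hypothetical numbers, for illustration only.
BASELINE = """\
app  :
section      size      addr
.text      400000   4198400
.rodata    100000   4600000
.data       20000   4800000
Total      520000
"""

SANITIZED = """\
app  :
section      size      addr
.text      600000   4198400
.rodata    250000   4800000
.data       50000   5100000
Total      900000
"""

if __name__ == "__main__":
    ratios = overhead(parse_size_output(BASELINE), parse_size_output(SANITIZED))
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x")
```

Comparing sections separately, rather than total file size, makes it easier to see whether the growth really sits in .rodata and .data.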


Inspecting .rodata, it looks like there are a lot of strings with file and type names. I tried `-fsanitize-undefined-strip-path-components=-1` from [1]. It appeared to have no effect when `-fsanitize=function` and `-fsanitize=address` are used at the same time (I filed bug [2]). Disabling `-fsanitize=function` and using `-fsanitize-undefined-strip-path-components=-1` reduces the size overhead to 1.4x, which is still quite significant.
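
To make the meaning of `=-1` concrete, here is a small model of the stripping rule as I read it in the Clang documentation [1]: a positive N drops the first N path components (the root counts as one), and a negative N keeps only the last |N|. This is a sketch of the documented behavior, not Clang's code; the function name is hypothetical, and the fallback for an N larger than the path is my assumption.

```python
def strip_path_components(path, n):
    """Model of -fsanitize-undefined-strip-path-components=N for POSIX-style paths."""
    if n == 0:
        return path  # default: the file name is emitted unchanged

    parts = [p for p in path.split("/") if p]
    if path.startswith("/"):
        parts.insert(0, "/")  # the root directory counts as one component

    # A positive n drops the first n components; a negative n keeps the last |n|.
    kept = parts[n:]
    if not kept:
        # Assumption: when n exceeds the component count, keep the file name.
        kept = parts[-1:]

    if kept and kept[0] == "/":
        return "/" + "/".join(kept[1:])
    return "/".join(kept)


if __name__ == "__main__":
    for n in (0, 1, 2, -1, -2):
        print(n, strip_path_components("/code/library/file.cpp", n))
```

With `-1`, only the base file name should remain in each emitted source location, which is why I expected it to shrink .rodata.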


I've considered `-fsanitize-trap`; it causes very little size overhead, but in some cases it is hard to work with.


I noticed that [3] added a minimal runtime for UBSAN. It works similarly to `-fsanitize-trap`, but prints a slightly more informative message, which would suffice. Out of the box, I didn't notice the measurable binary size reduction mentioned in that change, but when it is combined with `-fdata-sections -ffunction-sections -Wl,--gc-sections -Wl,--print-gc-sections`, the size bloat of .rodata and .data is almost eliminated. Note that in this case those flags don't help without `-fsanitize-minimal-runtime`.
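
The configurations compared above can be written down as a small flag builder. This is only a sketch: the function and parameter names are hypothetical, it does not invoke the compiler, and whether a given Clang version accepts every remaining check together with the minimal runtime is not something this snippet verifies. It drops vptr when the minimal runtime is requested, since the driver rejects that combination.

```python
# The checks under discussion in this thread.
UBSAN_CHECKS = ("null", "function", "vptr", "object-size")


def clang_flags(checks=UBSAN_CHECKS, minimal_runtime=False,
                gc_sections=False, strip_paths=False):
    """Build the sanitizer-related clang flags for one build configuration."""
    checks = list(checks)
    if minimal_runtime:
        # vptr is not allowed together with the minimal runtime.
        checks = [c for c in checks if c != "vptr"]

    flags = ["-fsanitize=" + ",".join(checks)]
    if strip_paths:
        flags.append("-fsanitize-undefined-strip-path-components=-1")
    if minimal_runtime:
        flags.append("-fsanitize-minimal-runtime")
    if gc_sections:
        # Let the linker discard unreferenced data and function sections.
        flags += ["-fdata-sections", "-ffunction-sections", "-Wl,--gc-sections"]
    return flags


if __name__ == "__main__":
    print(" ".join(clang_flags(strip_paths=True)))
    print(" ".join(clang_flags(minimal_runtime=True, gc_sections=True)))
```

Adding `-Wl,--print-gc-sections` to the second configuration shows which sections the linker actually discarded.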


Unfortunately, there is a restriction in the driver that prevents the minimal UBSAN runtime from being used when ASAN is also enabled. I don't completely understand the reasons for this restriction. When I removed it, both ASAN and UBSAN still seemed to function in my tests.


I'd like to allow using minimal UBSAN runtime with ASAN. Are there reasons against it? I'd volunteer to do the work here.


Also, the vptr UBSAN check is disallowed when the minimal UBSAN runtime is used. Could someone clarify why?


-- Igor




  1. https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html#additional-configuration
  2. https://bugs.llvm.org/show_bug.cgi?id=39347
  3. https://reviews.llvm.org/D36810








_______________________________________________
LLVM Developers mailing list
[hidden email]
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] Minimal UBSAN runtime with ASAN?

David Jones via llvm-dev

Hi Igor, 
Yes, please send the patches for the clang driver and compiler-rt.
It might require some refactoring to get the minimal ubsan-rt working with asan.

As for vptr UBSAN: I guess that vptr checking actually requires quite non-trivial run-time support and is not included in the minimal one.


(In reply to Igor Sugak via llvm-dev, Sun, Oct 21, 2018 at 11:46 AM.)

Re: [llvm-dev] Minimal UBSAN runtime with ASAN?

David Jones via llvm-dev
Hi,

The idea of minimal ubsan, as mentioned in https://reviews.llvm.org/D36810, is to provide a hardened runtime suitable for production. That's why the driver rejects -fsanitize-minimal-runtime with ASan: otherwise you would get all the sanitizer_common stuff in your binary, with demangling, logging redirection, and whatnot. I think this is a good property and don't want to change it.

The vptr checker has non-trivial runtime support. It could be implemented in the minimal runtime.

I'm sure there are ways to optimize the full ubsan metadata: some kind of string compression, or a relative ABI (offsets instead of pointers). Maybe use debug info (e.g., instead of emitting a file name string to .rodata, emit a single byte with the debug source location of that file/line, and pass a pointer to it to the ubsan handler).

If not, I guess it would be acceptable to add a flag to skip file path strings and replace them with nullptr in ubsan calls.
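
To illustrate the "offsets instead of pointers" idea mentioned above, here is a toy model of a shared file-name table in which each source location stores a small offset rather than a full pointer to its own string. It is only a sketch of one possible shape: the class and method names are hypothetical, and this is not how ubsan encodes its metadata today.

```python
class FileNameTable:
    """Toy string table: each distinct file name is stored once, NUL-terminated."""

    def __init__(self):
        self._blob = bytearray()  # all names, back to back
        self._offsets = {}        # name -> offset of its first byte in the blob

    def intern(self, name):
        """Return the offset of `name`, appending it only on first use."""
        if name not in self._offsets:
            self._offsets[name] = len(self._blob)
            self._blob += name.encode() + b"\0"
        return self._offsets[name]

    def lookup(self, offset):
        """Recover the file name stored at `offset`."""
        end = self._blob.index(b"\0", offset)
        return self._blob[offset:end].decode()

    def size(self):
        """Total bytes used by the table."""
        return len(self._blob)


if __name__ == "__main__":
    table = FileNameTable()
    # Three check sites, two of them in the same file.
    sites = [(table.intern("a.cpp"), 10), (table.intern("b.cpp"), 20),
             (table.intern("a.cpp"), 30)]
    for offset, line in sites:
        print(f"{table.lookup(offset)}:{line}")
    print("table bytes:", table.size())
```

In this model the per-site cost is one offset plus the line number, and the string bytes are paid once per file rather than once per check.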

(In reply to Kostya Serebryany, Tue, Oct 23, 2018 at 10:36 AM.)

Re: [llvm-dev] Minimal UBSAN runtime with ASAN?

David Jones via llvm-dev

Thank you Evgenii and Kostya for replying. 


I agree with Evgenii that the current UBSAN metadata could be optimized, but it seemed to me that allowing the minimal runtime with ASAN would be simpler to implement.


Evgenii, quick question. You wrote:


"That's why the driver rejects -fsanitize-minimal-runtime with ASan, because otherwise you are getting all the sanitizer_common stuff in your binary, with demangling and logging redirection and whatnot. I think this is a good property and don't want to change it."


I don't know in detail how the sanitizers interact with each other, but would it be possible to preserve the current -fsanitize-minimal-runtime behavior when it's used on its own (i.e., no extra sanitizer_common stuff), but allow ASAN when explicitly requested via `-fsanitize=address`? When I remove the restriction in the driver and use the minimal runtime without ASAN, I don't see any extra instrumentation or symbols from sanitizer_common, and it still seems to work. But I might be missing something. We could add a test that no extra stuff is added to the binary by mistake.

On the other hand, I wanted to clarify: is the concern here that someone might enable ASAN by mistake and run it in production when only the UBSAN minimal runtime was supposed to be used?
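
The test mentioned above could look roughly like the following sketch, which scans `nm` output for symbols that should not appear in a binary built with only the minimal runtime. The prefix list is my assumption about how sanitizer_common and ASAN symbols are named, and the sample symbol tables are made up for illustration.

```python
# Assumed prefixes: `_ZN11__sanitizer` is the Itanium mangling of names in
# namespace `__sanitizer`; the other two cover C-linkage entry points.
FORBIDDEN_PREFIXES = ("_ZN11__sanitizer", "__sanitizer_", "__asan_")


def unexpected_symbols(nm_output):
    """Return the symbols in `nm` output that start with a forbidden prefix."""
    bad = []
    for line in nm_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        symbol = fields[-1]  # nm prints the symbol name last
        if symbol.startswith(FORBIDDEN_PREFIXES):
            bad.append(symbol)
    return bad


# Hypothetical symbol tables, for illustration only.
MINIMAL_ONLY = """\
0000000000401130 T main
0000000000401200 T __ubsan_handle_type_mismatch_minimal
                 U abort
"""

WITH_COMMON = MINIMAL_ONLY + "0000000000402000 T _ZN11__sanitizer6PrintfEPKcz\n"

if __name__ == "__main__":
    print(unexpected_symbols(MINIMAL_ONLY))
    print(unexpected_symbols(WITH_COMMON))
```

A real test would run `nm` on a binary built with `-fsanitize-minimal-runtime` and fail if the returned list is not empty.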


-- Igor 


(In reply to the message from Evgenii Stepanov to Kostya Serebryany, cc Igor Sugak, Vitaly Buka, and LLVM Dev, sent Tuesday, October 23, 2018 12:16:26 PM.)

Re: [llvm-dev] Minimal UBSAN runtime with ASAN?

David Jones via llvm-dev


On Tue, Oct 23, 2018 at 3:49 PM, Igor Sugak <[hidden email]> wrote:

On the other hand, I wanted to clarify: is the concern here that someone might enable ASAN by mistake and run it in production when only the UBSAN minimal runtime was supposed to be used?


Yes, -fsanitize-minimal-runtime is meant to guarantee that only production-ready sanitizer bits are being used. That does not look very important to me now. After all, if we accidentally include ASan in a production build and don't catch it, we've got bigger problems with the release process.

OK, let's allow the minimal runtime and asan at the same time. No new runtime libraries, though: simply link the minimal ubsan runtime into the full ubsan runtime, and into everything else the full ubsan runtime is linked into.

 


