Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Ramkumar Ramachandra
Hi,

We have been seeing sporadic crashes since we migrated to MCJIT on Win64. The same tests pass without issues on Mac64 and Linux64. The failure is this assertion in RuntimeDyldELF.cpp:

  RealOffset <= INT32_MAX && RealOffset >= INT32_MIN

I haven't managed to catch the failure in Visual Studio to debug it. Any tips on how to make progress?

Oh, and we're on LLVM 3.5.

Thanks.

Ram

_______________________________________________
LLVM Developers mailing list
[hidden email]         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Re: Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Reid Kleckner-2
That sounds like a PC-relative relocation failure. Usually this happens when the relocation target is more than 2 GB away from the source. Try using the large code model or tweaking the memory manager.
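
A minimal sketch of the check behind that assertion, assuming the usual x86-64 rule that a 32-bit PC-relative relocation stores target + addend - patch site in a signed 32-bit field. The function names are hypothetical and this is not the RuntimeDyldELF source; only the asserted condition is taken from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Displacement a PC-relative relocation would encode: target plus addend,
// minus the address of the field being patched. Unsigned arithmetic wraps,
// and the cast reinterprets the result as a signed distance.
inline std::int64_t pcRelOffset(std::uint64_t target, std::int64_t addend,
                                std::uint64_t site) {
  return static_cast<std::int64_t>(target + static_cast<std::uint64_t>(addend) -
                                   site);
}

// True when the displacement fits a signed 32-bit field.
inline bool fitsInt32(std::int64_t realOffset) {
  return realOffset <= INT32_MAX && realOffset >= INT32_MIN;
}

// Narrow the displacement to 32 bits, asserting the same condition that
// is quoted in the thread.
inline std::int32_t encodePCRel32(std::uint64_t target, std::int64_t addend,
                                  std::uint64_t site) {
  std::int64_t RealOffset = pcRelOffset(target, addend, site);
  assert(RealOffset <= INT32_MAX && RealOffset >= INT32_MIN);
  return static_cast<std::int32_t>(RealOffset);
}
```

With the patch site at 0x10000000, a target at 0x10001000 gives a small displacement that fits, while a target at 0x7FF000000000 lies far outside the 2 GB window and would trip the assertion in `encodePCRel32`.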

It turns out it's surprisingly hard to portably allocate some memory and then allocate some more within a 2 GB offset of the first allocation in a 64-bit process. For various reasons that I don't understand, reserving 2 GB of address space upfront and allocating from that is not workable for some MCJIT clients.
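
The "reserve upfront" approach mentioned above can be sketched as a bump allocator over one contiguous block. This is an illustration under assumptions, not MCJIT's memory manager: `SectionPool` is a hypothetical name, and the backing store is ordinary heap memory, whereas a real JIT would need pages from the OS that can be made executable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pool: every block handed out comes from one contiguous
// reservation, so any two blocks are closer together than the pool size.
class SectionPool {
public:
  explicit SectionPool(std::size_t bytes) : storage_(bytes), used_(0) {}

  // Returns nullptr when the pool is exhausted. `align` must be a power of two.
  std::uint8_t *allocate(std::size_t size, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(storage_.data());
    // Round the next free address up to the requested alignment.
    std::uintptr_t next =
        (base + used_ + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
    std::size_t start = static_cast<std::size_t>(next - base);
    if (start > storage_.size() || size > storage_.size() - start)
      return nullptr;
    used_ = start + size;
    return storage_.data() + start;
  }

private:
  std::vector<std::uint8_t> storage_;  // the single upfront reservation
  std::size_t used_;                   // bytes consumed so far
};
```

Because every block comes from the same reservation, the distance between any two of them is smaller than the pool size, so a pool under 2 GB keeps 32-bit PC-relative references among them in range. The cost is the one noted in the message: the address space has to be committed to in advance.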


Re: Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Ramkumar Ramachandra
With the large code model we get about half as many crashes; the rest crash in the same way. Either the large code model still takes the crashing code path and the number of crashes went down only by chance, or somewhere in the flow the large code model is not being translated into ELF::R_X86_64_PC64. I'm digging into this further, but any hints along the way would be appreciated.
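
For comparison, a sketch of why a 64-bit PC-relative field cannot overflow the way the 32-bit one does. The x86-64 psABI defines R_X86_64_PC64 with the same S + A - P formula as the 32-bit form, stored in 8 bytes. The helper names are hypothetical, and whether the large code model actually produces this relocation on every path is the open question in the message, not something the sketch settles.

```cpp
#include <cassert>
#include <cstdint>

// Store `value` into `field` as `bytes` little-endian bytes.
inline void storeLE(unsigned char *field, std::uint64_t value, int bytes) {
  assert(field != nullptr && bytes > 0 && bytes <= 8);
  for (int i = 0; i < bytes; ++i)
    field[i] = static_cast<unsigned char>(value >> (8 * i));
}

// 32-bit form: fails when S + A - P does not fit a signed 32-bit field.
inline bool patchPCRel32(unsigned char *field, std::uint64_t target,
                         std::int64_t addend, std::uint64_t site) {
  std::int64_t disp = static_cast<std::int64_t>(
      target + static_cast<std::uint64_t>(addend) - site);
  if (disp > INT32_MAX || disp < INT32_MIN)
    return false;
  storeLE(field, static_cast<std::uint64_t>(disp), 4);
  return true;
}

// 64-bit form: same formula, but all 64 bits are stored, so there is
// nothing to range-check.
inline void patchPCRel64(unsigned char *field, std::uint64_t target,
                         std::int64_t addend, std::uint64_t site) {
  storeLE(field, target + static_cast<std::uint64_t>(addend) - site, 8);
}
```

A pair of addresses that `patchPCRel32` rejects is still patched correctly by `patchPCRel64`, which is why any remaining assertion failures suggest that some reference is still going through a 32-bit relocation.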

Thanks.

Ram


Re: Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Keno Fischer-2
This might be related to GOT relocations. I rewrote that part of RuntimeDyldELF because I was seeing this issue. Have you tried trunk?
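
One way to picture why GOT handling can matter here, as a sketch rather than a description of LLVM's implementation: with a GOT-based access, the instruction carries a 32-bit PC-relative displacement to the GOT slot, and the slot holds the full 64-bit address. The target can then be anywhere, but the slot itself must sit within 2 GB of the code. The names below are hypothetical, the addend is omitted for brevity, and the 8-byte slot alignment is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Result of routing one reference through a GOT slot (hypothetical layout).
struct GotReference {
  std::int32_t disp32;      // what the instruction encodes: slot - site
  std::uint64_t slotValue;  // what the 8-byte slot holds: the real target
};

// Returns false when the slot is out of 32-bit range of the instruction.
// The target address never enters the range check.
inline bool bindViaGot(std::uint64_t site, std::uint64_t slot,
                       std::uint64_t target, GotReference &out) {
  assert(slot % 8 == 0);  // assumption: slots are 8-byte aligned
  std::int64_t disp = static_cast<std::int64_t>(slot - site);
  if (disp > INT32_MAX || disp < INT32_MIN)
    return false;
  out.disp32 = static_cast<std::int32_t>(disp);
  out.slotValue = target;
  return true;
}
```

In this picture a far-away target is harmless as long as the GOT is placed near the code, whereas a GOT that is allocated far from the code fails the range check even when the target itself is close.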


Re: Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Lang Hames

On Fri, May 22, 2015 at 4:14 PM, Keno Fischer <[hidden email]> wrote:
This might be related to GOT relocations. I rewrote that part of RuntimeDyldELF because I was seeing this issue. Have you tried trunk?

I didn't notice that you were running 3.5 the first time I read this. Keno's diagnosis is very likely to be correct. You should try trunk if you're able to.

- Lang.


Re: Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Dale Martin

This sounds pretty serious, and it won't be easy for us to upgrade, particularly not to trunk. Are there plans to take bug fixes like this into LLVM 3.5.x point releases? (Do I remember correctly that 3.5.x is supposed to have some kind of long-term support? Where is that process documented?)


Thanks,

  Dale




Re: Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Eric Christopher
Hi Dale,

I don't think that Keno's rewrite is suitable for a bug-fix release. In the last year we have moved to doing some dot releases for our older releases, but these are strictly bug-fix-only and low-risk, since we don't want to break anything new.

The release documentation is located here:


for future reference. There's no official long-term support strategy beyond the information on that page. Previously we released every 6 months without dot releases at all, so this is a fairly new trial for us. Backporting of patches is at the discretion of the author, the code owner, and the release manager.

Keno: perfectly happy to entertain a backport of your patch if you want to do such a thing, but IIRC it was a bit more than a simple bug fix.

-eric


Re: Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Keno Fischer-2
The commits in question are r234839, and you'll probably also want r236341. I don't think these are the kinds of commits that should generally be backported; the change is not really small or self-contained. If you're willing, you can probably carry these patches yourself (we will be doing so on top of 3.6 until 3.7 is released), but do note that in my experience using MCJIT with the large code model does not quite work yet (it's on my todo list to work out exactly why and fix it). Also, I believe the memory allocation scheme for MCJIT was rewritten slightly between 3.5 and trunk, so there may be additional problems I don't know about.


Re: Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Dale Martin
That implies that 3.6 is not really usable on Windows, doesn't it, since the legacy JIT was removed? (At least if you might need the large code model.)


Re: Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Keno Fischer-2
Correct, though this is certainly not the only issue preventing LLVM 3.6 from being usable on Windows. I think we got a few of them backported to 3.6.1, but there are a few more still remaining.

On Sat, May 23, 2015 at 6:08 PM, Dale Martin <[hidden email]> wrote:
That implies that 3.6 is not really useable on Windows, doesn't it, since the legacy JIT was removed? (At least if you could need the large code model.)

Sent from my iPhone

On May 23, 2015, at 5:25 PM, Keno Fischer <[hidden email]> wrote:

The commits in question are r234839 and you'll probably also want r236341. I don't think these are the kinds of commits that should generally be back ported. It's not really a small self-contained commit. If you're willing you can probably carry these patches yourself (we will be doing so on top of 3.6 until 3.7 is released), but do note that in my experience using MCJIT with the large code model does not quite work yet (it's on my todo list to work out exactly why and fit). Also, I believe the memory allocation scheme for MCJIT was rewritten slightly between 3.5 and trunk, so there may be additional problems I don't know about.

On Sat, May 23, 2015 at 5:12 PM, Eric Christopher <[hidden email]> wrote:
Hi Dale,

I don't think that Keno's rewrite is applicable for a bug fix release. We have, in the last year, moved to having some dot releases for our older releases, but these are definitely bug fix only and low risk as we don't want to break anything new. 

The release documentation is located here:


for future reference. There's no official long term support strategy past the information on that page, previously we released every 6 mos without dot releases at all so this is a fairly new trial for us. Backporting of patches is at the discretion of the author, the code owner, and the release manager.

Keno: perfectly happy to entertain a backport of your patch if you want to do such a thing, but IIRC it was a bit more than a simple bug fix.

-eric

On Sat, May 23, 2015 at 7:28 AM Dale Martin <[hidden email]> wrote:

​This sounds pretty serious and it won't be easy for us to upgrade - particularly not to trunk.  Are there plans to take bug fixes like this into llvm 3.5.x point releases?  (Do I remember right that 3.5.x is supposed to have some kind of long term support?  Where is that process documented?)


Thanks,

  Dale



From: Lang Hames <[hidden email]>
Sent: Friday, May 22, 2015 7:55 PM
To: Keno Fischer
Cc: Ramkumar Ramachandra; Peng Cheng; LLVMdev; Dale Martin
Subject: Re: [LLVMdev] Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows
 

On Fri, May 22, 2015 at 4:14 PM, Keno Fischer <[hidden email]> wrote:
This might be related to GOT relocations. I rewrote that part of RuntimeDyldELFbecause I was seeing this issue. Have you tried trunk?

I didn't notice that you were running 3.5 the first time I read this. Keno's diagnosis is very likely to be correct. You should try trunk if you're able to.

- Lang.

On Fri, May 22, 2015 at 5:10 PM, Ramkumar Ramachandra <[hidden email]> wrote:
So it appears that we get about half as many crashes with the large code model. The rest crash in the same way. This could mean either that the large code model still takes the crashing code path and the number of crashes only went down by chance, or that somewhere in the flow the large code model is not taken to mean ELF::R_X86_64_PC64. I'm digging into this issue further, but any hints along the way would be appreciated.

Thanks.

Ram
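
For readers following along, here is a minimal sketch of the check that the assertion expresses, assuming the usual form of a 32-bit PC-relative relocation (target plus addend minus the address being patched). The helper names pcRelOffset and fitsInPCRel32 are hypothetical and are not taken from RuntimeDyldELF; the sketch only shows why a target more than 2 GB away from the patched location cannot be encoded.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM code): the displacement a PC-relative
// relocation has to encode. The arithmetic is done modulo 2^64 and the
// result is reinterpreted as a signed value, which is the behaviour on
// the two's-complement targets discussed in this thread.
inline int64_t pcRelOffset(uint64_t Target, int64_t Addend, uint64_t Place) {
  return static_cast<int64_t>(Target + static_cast<uint64_t>(Addend) - Place);
}

// Hypothetical helper: true when the displacement fits in a signed 32-bit
// field, i.e. when the asserted condition from the subject line holds.
inline bool fitsInPCRel32(uint64_t Target, int64_t Addend, uint64_t Place) {
  const int64_t RealOffset = pcRelOffset(Target, Addend, Place);
  return RealOffset <= INT32_MAX && RealOffset >= INT32_MIN;
}
```

If two sections end up more than 2 GB apart in the address space, fitsInPCRel32 returns false for relocations between them, which is the situation the assertion reports. That would also be consistent with the failures being sporadic, since where allocations land can vary from run to run.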

Re: Sporadic "RealOffset <= INT32_MAX && RealOffset >= INT32_MIN" failures with MCJIT on Windows

Nicholas Chapman-2
In reply to this post by Ramkumar Ramachandra
There was a crash on Windows when more than 4 KB was allocated on the stack at once, caused by the chkstk call offset being too large. Maybe you are seeing that.
It's fixed in trunk, but not in any stable release.

Nick C.
