EXCEPTION_ACCESS_VIOLATION (0xc0000005) with jbyte_disjoint_arraycopy #887

Closed
gschrader opened this issue Aug 31, 2023 · 9 comments
Labels
bug Something isn't working Waiting on OP

Comments

@gschrader

Please provide a brief summary of the bug

We have our unit tests running in an ADO agent, split across several jobs. One particular job is always causing the JVM to crash once it hits a certain test.

The log file does not contain any Java frames, so it's hard to tell what code could be causing the crash. The problematic frame is not always the jbyte stub; we've also seen the jint, jshort and jlong variants. It's always the JUnit "Test worker" thread that crashes. If we ignore the test, the crash appears to move to the next test.

We upgraded to the latest Temurin release; we were initially using 17.0.6+10.

We've upgraded several libraries (e.g. Spring Boot to 2.7.15) and the JDBC driver, to no avail.

I'm speculating it could be memory related so I'm currently running the job to grab a JVM heap dump.

hs_err_pid.log file will be attached
hs_err_pid6484.log
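
For context on the stub names above: the jbyte, jshort, jint and jlong prefixes match the element types byte, short, int and long, and the arraycopy stubs are HotSpot's bulk copy routines for primitive arrays. The following is a minimal sketch of copies of that kind. The class and method names are hypothetical, nothing here is taken from the crashing test suite, and whether the JIT actually routes a given call through one of these stubs is an implementation detail of the JVM.

```java
import java.util.Arrays;

// Hypothetical illustration: bulk copies of primitive arrays, one per element type
// seen in the crash reports. Source and destination never overlap here.
public class PrimitiveArrayCopies {

    // byte[] copy into a freshly allocated destination array
    static byte[] copyBytes(byte[] src) {
        byte[] dst = new byte[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    // short[] copy
    static short[] copyShorts(short[] src) {
        short[] dst = new short[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    // int[] copy
    static int[] copyInts(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    // long[] copy; Arrays.copyOf allocates the destination and copies in one call
    static long[] copyLongs(long[] src) {
        return Arrays.copyOf(src, src.length);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(copyBytes(new byte[] {1, 2, 3})));
        System.out.println(Arrays.toString(copyShorts(new short[] {4, 5, 6})));
        System.out.println(Arrays.toString(copyInts(new int[] {7, 8, 9})));
        System.out.println(Arrays.toString(copyLongs(new long[] {10L, 11L, 12L})));
    }
}
```

Copies like these are ordinary and safe in Java code, so a crash inside such a stub usually points at a bad address reaching the copy from elsewhere rather than at the copy call itself.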

Please provide steps to reproduce where possible

Unfortunately I don't think we can reproduce it in isolation; the test works just fine on its own.

Expected Results

The JVM to not crash or at least give us more clues as to why it is crashing.

Actual Results

The JVM crashes with EXCEPTION_ACCESS_VIOLATION (0xc0000005)

What Java Version are you using?

openjdk version "17.0.8" 2023-07-18
OpenJDK Runtime Environment Temurin-17.0.8+7 (build 17.0.8+7)
OpenJDK 64-Bit Server VM Temurin-17.0.8+7 (build 17.0.8+7, mixed mode, sharing)

What is your operating system and platform?

Windows Server 2016, 64-bit, Build 14393 (10.0.14393.5786)

How did you install Java?

zip from https://adoptium.net

Did it work before?

It was working in my java17 branch before I rebased on master. Nothing obvious jumps out as to what change could have caused it.

Did you test with the latest update version?

We updated to the latest release and it still occurs

Did you test with other Java versions?

We haven't yet

Relevant log output

2023-08-31T01:01:52.9395521Z #
2023-08-31T01:01:52.9395768Z # A fatal error has been detected by the Java Runtime Environment:
2023-08-31T01:01:52.9395984Z #
2023-08-31T01:01:53.0406065Z #  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00000284498f7ff6, pid=6484, tid=6428
2023-08-31T01:01:53.0406543Z #
2023-08-31T01:01:53.0406801Z # JRE version: OpenJDK Runtime Environment Temurin-17.0.8+7 (17.0.8+7) (build 17.0.8+7)
2023-08-31T01:01:53.0407279Z # Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.8+7 (17.0.8+7, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, windows-amd64)
2023-08-31T01:01:53.0407643Z # Problematic frame:
2023-08-31T01:01:53.0407901Z # v  ~StubRoutines::jbyte_disjoint_arraycopy
2023-08-31T01:01:53.0408102Z #
2023-08-31T01:01:53.0408382Z # Core dump will be written. Default location: E:\CI\ado\1\_work\1\s\hs_err_pid6484.mdmp
2023-08-31T01:01:53.0408659Z #
2023-08-31T01:01:53.0408876Z # An error report file with more information is saved as:
2023-08-31T01:01:53.0409152Z # E:\CI\ado\1\_work\1\s\hs_err_pid6484.log
2023-08-31T01:01:53.0409513Z Compiled method (c2) 26953161 111034   !   4       sun.nio.ch.IOUtil::write (178 bytes)
2023-08-31T01:01:53.0409881Z  total in heap  [0x000002844bd8ff10,0x000002844bd92d40] = 11824
2023-08-31T01:01:53.0410567Z  relocation     [0x000002844bd90068,0x000002844bd90198] = 304
2023-08-31T01:01:53.0410862Z  main code      [0x000002844bd901a0,0x000002844bd91680] = 5344
2023-08-31T01:01:53.0411170Z  stub code      [0x000002844bd91680,0x000002844bd91720] = 160
2023-08-31T01:01:53.0411450Z  oops           [0x000002844bd91720,0x000002844bd91730] = 16
2023-08-31T01:01:53.0411750Z  metadata       [0x000002844bd91730,0x000002844bd91898] = 360
2023-08-31T01:01:53.0412042Z  scopes data    [0x000002844bd91898,0x000002844bd92750] = 3768
2023-08-31T01:01:53.0412338Z  scopes pcs     [0x000002844bd92750,0x000002844bd92ba0] = 1104
2023-08-31T01:01:53.0413419Z  dependencies   [0x000002844bd92ba0,0x000002844bd92be8] = 72
2023-08-31T01:01:53.0413718Z  handler table  [0x000002844bd92be8,0x000002844bd92cf0] = 264
2023-08-31T01:01:53.0414045Z  nul chk table  [0x000002844bd92cf0,0x000002844bd92d40] = 80
2023-08-31T01:01:53.0414496Z Compiled method (c2) 26953174 51242       4       org.hibernate.engine.internal.NaturalIdXrefDelegate::cacheNaturalIdCrossReference (117 bytes)
2023-08-31T01:01:53.0414959Z  total in heap  [0x000002844c63f510,0x000002844c647450] = 32576
2023-08-31T01:01:53.0415255Z  relocation     [0x000002844c63f668,0x000002844c63f968] = 768
2023-08-31T01:01:53.0415562Z  main code      [0x000002844c63f980,0x000002844c644420] = 19104
2023-08-31T01:01:53.0415850Z  stub code      [0x000002844c644420,0x000002844c644528] = 264
2023-08-31T01:01:53.0416142Z  oops           [0x000002844c644528,0x000002844c644530] = 8
2023-08-31T01:01:53.0416427Z  metadata       [0x000002844c644530,0x000002844c644760] = 560
2023-08-31T01:01:53.0416738Z  scopes data    [0x000002844c644760,0x000002844c6463e0] = 7296
2023-08-31T01:01:53.0417030Z  scopes pcs     [0x000002844c6463e0,0x000002844c646f30] = 2896
2023-08-31T01:01:53.0417341Z  dependencies   [0x000002844c646f30,0x000002844c646f88] = 88
2023-08-31T01:01:53.0417648Z  handler table  [0x000002844c646f88,0x000002844c6472b8] = 816
2023-08-31T01:01:53.0418058Z  nul chk table  [0x000002844c6472b8,0x000002844c647450] = 408
2023-08-31T01:01:53.0418482Z Compiled method (c2) 26953175 47879   !   4       oracle.jdbc.driver.OraclePreparedStatement::basicBindString (314 bytes)
2023-08-31T01:01:53.0418903Z  total in heap  [0x000002844ca5a290,0x000002844ca5ba78] = 6120
2023-08-31T01:01:53.0419203Z  relocation     [0x000002844ca5a3e8,0x000002844ca5a4c8] = 224
2023-08-31T01:01:53.0419506Z  main code      [0x000002844ca5a4e0,0x000002844ca5b100] = 3104
2023-08-31T01:01:53.0419795Z  stub code      [0x000002844ca5b100,0x000002844ca5b160] = 96
2023-08-31T01:01:53.0420114Z  oops           [0x000002844ca5b160,0x000002844ca5b188] = 40
2023-08-31T01:01:53.0420402Z  metadata       [0x000002844ca5b188,0x000002844ca5b228] = 160
2023-08-31T01:01:53.0420698Z  scopes data    [0x000002844ca5b228,0x000002844ca5b690] = 1128
2023-08-31T01:01:53.0421005Z  scopes pcs     [0x000002844ca5b690,0x000002844ca5b8f0] = 608
2023-08-31T01:01:53.0421312Z  dependencies   [0x000002844ca5b8f0,0x000002844ca5b910] = 32
2023-08-31T01:01:53.0421627Z  handler table  [0x000002844ca5b910,0x000002844ca5ba00] = 240
2023-08-31T01:01:53.0421923Z  nul chk table  [0x000002844ca5ba00,0x000002844ca5ba78] = 120
2023-08-31T01:01:53.1394441Z #
2023-08-31T01:01:53.1394798Z # If you would like to submit a bug report, please visit:
2023-08-31T01:01:53.1395166Z #   https://github.com/adoptium/adoptium-support/issues
2023-08-31T01:01:53.1395390Z #
2023-08-31T01:02:09.4409839Z 
2023-08-31T01:02:09.4410821Z Unexpected exception thrown.
2023-08-31T01:02:09.4416238Z org.gradle.internal.remote.internal.MessageIOException: Could not write '/127.0.0.1:62552'.
2023-08-31T01:02:09.4416788Z 	at org.gradle.internal.remote.internal.inet.SocketConnection.flush(SocketConnection.java:140)
2023-08-31T01:02:09.4417269Z 	at org.gradle.internal.remote.internal.hub.MessageHub$ConnectionDispatch.run(MessageHub.java:333)
2023-08-31T01:02:09.4417785Z 	at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:64)
2023-08-31T01:02:09.4418627Z 	at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:49)
2023-08-31T01:02:09.4419094Z 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
2023-08-31T01:02:09.4419575Z 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
2023-08-31T01:02:09.4419981Z 	at java.base/java.lang.Thread.run(Thread.java:833)
2023-08-31T01:02:09.4420273Z Caused by: java.io.IOException: Connection reset by peer
2023-08-31T01:02:09.4420645Z 	at java.base/sun.nio.ch.SocketDispatcher.write0(Native Method)
2023-08-31T01:02:09.4421206Z 	at java.base/sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:54)
2023-08-31T01:02:09.4421605Z 	at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:132)
2023-08-31T01:02:09.4422063Z 	at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:76)
2023-08-31T01:02:09.4422507Z 	at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:53)
2023-08-31T01:02:09.4423004Z 	at java.base/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:532)
2023-08-31T01:02:09.4423774Z 	at org.gradle.internal.remote.internal.inet.SocketConnection$SocketOutputStream.writeWithNonBlockingRetry(SocketConnection.java:279)
2023-08-31T01:02:09.4424934Z 	at org.gradle.internal.remote.internal.inet.SocketConnection$SocketOutputStream.writeBufferToChannel(SocketConnection.java:267)
2023-08-31T01:02:09.4425518Z 	at org.gradle.internal.remote.internal.inet.SocketConnection$SocketOutputStream.flush(SocketConnection.java:261)
2023-08-31T01:02:09.4426023Z 	at org.gradle.internal.remote.internal.inet.SocketConnection.flush(SocketConnection.java:138)
2023-08-31T01:02:09.4426387Z 	... 6 more
2023-08-31T01:02:09.7410504Z 
2023-08-31T01:02:09.7411164Z > Task :suite:testSuiteBase1
2023-08-31T01:02:09.7411467Z tests passed: 2215, tests failed: 0, test skipped: 9
@gschrader gschrader added the bug Something isn't working label Aug 31, 2023
@karianna
Contributor

karianna commented Sep 4, 2023

@gschrader This is looking like an Oracle JDBC driver issue (it's a byte buffer copy operation from Java to native). I recommend bringing this to Oracle support in the first instance.
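
As an illustration only (not taken from the Oracle driver or from the crash logs), a byte buffer copy from Java to native memory can be sketched as below. The class and method names are hypothetical; the sketch assumes the simplest case, a heap byte array copied in bulk into a direct (off-heap) ByteBuffer, which is the kind of step I/O and driver code performs before handing data to native code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a Java-heap-to-native byte copy.
public class HeapToDirectCopy {

    // Copies a heap byte[] into newly allocated native (direct) memory
    // and returns the buffer ready for reading.
    static ByteBuffer toDirect(byte[] payload) {
        ByteBuffer direct = ByteBuffer.allocateDirect(payload.length);
        direct.put(payload); // bulk copy from the Java heap into off-heap memory
        direct.flip();       // switch the buffer from writing to reading
        return direct;
    }

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);
        ByteBuffer direct = toDirect(payload);

        // Copy back out of native memory to check the round trip.
        byte[] back = new byte[direct.remaining()];
        direct.get(back);
        System.out.println(new String(back, StandardCharsets.UTF_8));
    }
}
```

If the native side of such a copy is handed an address that is no longer valid, the failure surfaces inside the copy routine rather than in the Java code that requested it, which would be consistent with a log that shows no Java frames.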

@gschrader
Author

Thanks @karianna, we're narrowing in on the cause. We think it's something in our code that was recently committed that JDK 11 had no problem with but 17 does. We're currently doing a bisect to narrow it down further.

We've ruled out later JDK 11 releases. Confirmed it happens as well with the latest Zulu JDK 17. Nothing jumped out from the heap dump so I don't think it's memory related.

Can I ask why you think it is related to the Oracle JDBC driver? If it's the blurb in the Relevant log output section, we have other examples where no driver shows up there.

I'll follow up when we learn more and then I can get clarification where I need to go to bring it to "Oracle support".

@SimonCanJer

We've faced a similar case: somebody wanted to run an executable JAR from a batch file, with OpenJDK 17 included in the delivered package together with the executable JAR and the launching batch file. The irregularity is that the crash happened when the package had been downloaded from FTP as a directory. Remarkably, it happens ONLY with the downloaded package, even on the same machine it was uploaded from. It does not occur when the package is zipped before upload and unzipped after download; in that case it works great. It sounds strange, but it is a fact.

@gschrader
Author

@SimonCanJer I'm not sure I totally understand your case; it sounds like the JDK in your repackaged application was corrupted when downloaded. I did just confirm the SHA256 of my download and it appears to match.

@SimonCanJer

SimonCanJer commented Sep 19, 2023 via email

@karianna
Contributor

I think the oracle.jdbc.driver.OraclePreparedStatement::basicBindString (314 bytes) line is hinting that the issue is at that boundary.

@gschrader
Author

I'm not sure what happened to my last response from a couple of weeks ago, I must have not pressed the submit button ☹️

We have narrowed this down to a test that goes against a Spring controller that does file uploads/downloads; the test itself doesn't crash, but the next one that runs does. I thought perhaps it was due to the mix of WebFlux/WebMVC that we have going on, but now I think it might be as simple as a missing DirtiesContext.

As for the Oracle driver, I had pasted a bunch of other examples of the crash that didn't have the Oracle driver in the log, so I think it might be a red herring. Anyway, I'll know more in a while once this next test run either completes or crashes.

@gschrader
Author

The DirtiesContext didn't make a difference; however, changing the code to be pure WebMVC seems to have prevented the crash.

Ideally the JVM wouldn't crash or at least provide some more breadcrumbs to figure it out. Maybe it's a Spring issue but I doubt the mixture of code would be supported so I'm not going to try to explain it to the Spring team.

So I think closing this issue makes sense.

@codespearhead

For reference:

quarkus --version && gradle --version
# 3.10.2
# ------------------------------------------------------------
# Gradle 8.7
# ------------------------------------------------------------

# Build time:   2024-03-22 15:52:46 UTC
# Revision:     650af14d7653aa949fce5e886e685efc9cf97c10

# Kotlin:       1.9.22
# Groovy:       3.0.17
# Ant:          Apache Ant(TM) version 1.10.13 compiled on January 4 2023
# JVM:          21.0.3 (Eclipse Adoptium 21.0.3+9-LTS)
# OS:           Windows 10 10.0 amd64

If I try to run a sample Quarkus project with Gradle [1], I get the exact same error under Git Bash, but not under PowerShell.

[1] quarkus create --gradle && cd code-with-quarkus && quarkus dev

If I run quarkus dev --console=plain, it works as expected under Git Bash. All combinations work under WSL.
