Artifact fails to download with invalid certificate: BadDER
#12000
Comments
To clarify, you have set a custom CA bundle like this: https://www.pantsbuild.org/docs/proxies#setting-up-a-certificate-authority but those aren't working because of the subjectAltName issue? |
Yes I have, and no, it doesn't fix this (it actually makes things worse 😅). I should have led with that; with the setup my company uses, I'm well accustomed to having to configure an alternate certificate bundle for tooling on my system. So much so, that I've written a rather nice set of scripts similar to I've verified that this works with a number of different tools (openssl s_client, curl), including things that are All this is to say, I'm getting very accustomed to this song and dance. So much so that I'm used to everything up to and including the packet capture I described (to make sure the client is behaving / the SSL request is well formed / to steal the target URL so I can test with However, this only seems to make things worse: The rust engine only seems to crash sooner - even with debugging turned on I can't seem to capture anything that specifies why (other than the cryptic "BadDER" output - more on this later). I believe this is due to the rust engine attempting to parse the bundle and failing on certificates in the bundle. Now, keep in mind this bundle consists of:
Per the issue I linked above (rustls/rustls#127), As I mentioned before, I can't share these certificates because they are authorities so they could be used for signing and therefore malicious purposes. However, what I can tell you is that they're (1) autogenerated by standard microsoft tooling - not by me, by software controlled by other people in my very large organization and (2) work just fine with every other tool that uses SSL on my system. They are not poorly formed certificates causing parsing errors - they're valid and fully formed. I've even verified that they come over the wire in the SSL packets fully formed (along with the SSL Inspection dynamically generated cert, also valid and fully formed). Sorry for the long answer, just trying to be thorough. I genuinely want to figure out why this isn't working, and help however I can. I'll try anything you all suggest. |
Oops, forgot one last detail. The python that pants finds and uses to create its virtual environment (3.8.9) is installed via asdf, which is a wrapper around pyenv and is installed by building it from source via python-build. This system uses the openssl provided by homebrew when building the openssl module; the one I mentioned intrinsically finds the bundle at |
Hmm, this is a tricky one. I appreciate the thorough debugging and all the information you've provided. To be clear - it's a deep part of our project philosophy that things should work as users expect. So we wouldn't treat "you have to regenerate your certificates a certain way" as an acceptable solution, if those certs work in practice for every other tool. As you say, for better or worse, "what openssl supports" is the de-facto standard. Will have to think through this, but to debug and fix it would be super helpful to have an example of a cert that exhibits this behavior. How sure are we that the crux is "commonName set without subjectAltName"? Could you identify a cert in your bundle that has that property? And can you provide any more detail as to how that cert was generated? We can then try and generate a similar cert to test on. |
We're not. As a rust novice, I'm just entirely going off of what @tdyas was suggesting in slack when I posed this question (which spawned this issue); he's the one who found the issue from
After a little bit of digging, tada!
This is the Entrust L1K intermediate certificate. I realized that this is a fully public certificate that is available elsewhere, but it has similar properties to the certificates used to sign the SSL Inspection certificates in my use case. My company has certificates derived from this certificate for other purposes, so my bundle also contains this one.
The SSL Inspection system essentially dynamically generates a new cert for any outgoing SSL packet using its own CA, but using all the details from the certificate from the original request. You could simulate this by creating your own self-signed CA, and then generating a self-signed certificate with the details of the original request? Looks like squid can potentially simulate the same type of proxying our system does? https://wiki.squid-cache.org/ConfigExamples/Intercept/SslBumpExplicit I think what @tdyas is suggesting has merit, mostly because when I set |
We have some infrastructure for generating certs for testing (https://github.com/pantsbuild/pants/tree/main/src/python/pants/engine/internals/fs_test_data/tls) and I'm playing around with those to see if I can reproduce. |
In case anyone else wants to take a concurrent swing at it:
|
There is some code linked from the rustls issue that may solve the issue by supplying a custom certificate verification step to rustls. paritytech/x509-signature#4 (comment) |
While this is helpful (and keep in mind I can’t fully trace the rust execution), it also feels really heavy handed. It basically solves the problem by disabling SSL altogether, other than checking that the certificate is in the CA bundle. The comments about not checking expiration don’t inspire confidence - it sort of feels like it defeats the purpose of using SSL in the first place. I think what you’d want would be an implementation of this that strives for parity with OpenSSL. I’m still unclear exactly why the two diverge here, or if we’ve even fully bisected the issue. I think it’s fair that our running assumption about subjectAltName is valid, but, based on the discussion in the rustls issue, it sounds like a bug on their side. Regardless of if the specification says that the SAN extension should be included, the majority of systems use OpenSSL which apparently ignores it if it’s missing (if that’s even the issue). It’s great and all for them to stick to their principles / the spec here, but not in the face of practicality (especially when the majority of users won’t know or have control over their CA bundle. It’s important to keep in mind we’re all power users here). |
I'm pushing up against my limited knowledge of TLS here, but isn't subjectAltName relevant only to the end-entity cert? The CA certs we test against don't set it (see above), but the server cert does, and as far as I can tell (again, from limited understanding) it isn't necessary for the CA certs, as those aren't being verified against a hostname. When I look at the certs we test against, only the server cert has a subjectAltName; the others do not. |
Meanwhile, naively appending that Entrust L1K intermediate certificate you pasted above into the test's |
@cjntaylor I hate to impose, but if it's not too much trouble, this would be super-helpful in diagnosis: Could you do the following?
It'll be instructive to see if this fails on loading the certs, and if so which cert is the issue. The idea is to isolate whether this has to do with some property of the custom certs, or whether it's specifically to do with the rewritten ssl packets. There might be more than one thing going on, of course... In your experimentation, the fact that pantsd got the error on startup would indicate that it's an issue with reading the custom cert bundle. But if you don't set it, I'd expect an error because the end-entity cert has been rewritten to something for which the client has no chain of trust, but I'm not sure I'd expect that error to be "Bad DER"... So I'm trying to get a tighter handle on this. Again, sorry to impose, but hopefully this will help reproduce the error without access to the real certs, so we can then work on, and verify, a fix. |
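One way to isolate which cert in a bundle is rejected is to split the bundle into individual PEM blocks and feed them to the loader one at a time. The following is only a sketch under stated assumptions: `split_pem_bundle` and `find_rejected` are hypothetical helper names, and `loader` is a stand-in for whatever actually consumes the bundle (it is not a pants or rustls API); it is assumed to raise on a cert it cannot handle.

```python
import re
from typing import Callable, List, Tuple

# Matches one PEM certificate block, including its BEGIN/END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_bundle(bundle_text: str) -> List[str]:
    """Return each PEM certificate block in the bundle, in file order."""
    return PEM_BLOCK.findall(bundle_text)


def find_rejected(
    blocks: List[str], loader: Callable[[str], None]
) -> List[Tuple[int, str]]:
    """Pass each block to `loader` on its own; collect (index, error text) for failures.

    `loader` is hypothetical: any callable that raises when it rejects a cert.
    """
    rejected = []
    for index, block in enumerate(blocks):
        try:
            loader(block)
        except Exception as exc:  # broad on purpose: we only want to locate the culprit
            rejected.append((index, str(exc)))
    return rejected
```

Writing each rejected block out to its own file would then give a minimal bundle to test against, without having to share the rest of the certificates.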
I'm even less knowledgeable than you, but that's what I thought too. I've just echoed what I have so far as it's all I've had to go on
You're never imposing, I'm happy to help. I'd been thinking after you posted the reproduction method that I should get a testing environment set up and I have. Aha! I did a bit of certificate "bijecting" and found the culprit. My non-working bundle contains my public signing key for my Apple developer account 🤦♂️ Apologies for a bit of a goose chase; this breaks because the certificate in question isn't actually a CA, so the bundle isn't actually valid from However, this wasn't entirely pointless. The reason this works everywhere else is that openssl silently ignores any non-CA certificates in the bundle, so this hasn't been an issue until now. This will affect anyone else running homebrew on macOS - my fancy bundle generator is just a derived rewrite of a ruby function that the openssl formula runs on install. It'll affect anyone trying to use that bundle or anything built from it the same way, as the script doesn't distinguish which certificates are CAs and which aren't. I'll leave this open for now but feel free to mark this |
First of all, great news that it's now working! Phew! As for automatically filtering out non-CA certs. We could certainly do that in order to emulate openssl. Or we could, as you say, at least make the error message more useful. So I'll leave this open for now. Thanks for persevering and debugging! |
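To make the "filter out non-CA certs" idea concrete: whether an X.509 certificate is a CA is recorded in its BasicConstraints extension, a DER SEQUENCE whose optional first element is a BOOLEAN `cA` that defaults to FALSE. The sketch below only decodes that extension value, and assumes it has already been extracted from the certificate (extraction is not shown; a vetted X.509 library would be the safer choice in practice). `read_tlv` and `basic_constraints_is_ca` are illustrative names, not part of pants or rustls.

```python
from typing import Tuple


def read_tlv(data: bytes, offset: int = 0) -> Tuple[int, bytes, int]:
    """Read one DER tag-length-value; return (tag, value, offset of the next element)."""
    tag = data[offset]
    first = data[offset + 1]
    if first < 0x80:
        # Short form: the length fits in the low 7 bits of this byte.
        length, start = first, offset + 2
    else:
        # Long form: the low 7 bits give the number of length bytes that follow.
        count = first & 0x7F
        length = int.from_bytes(data[offset + 2 : offset + 2 + count], "big")
        start = offset + 2 + count
    end = start + length
    if end > len(data):
        raise ValueError("truncated DER element")
    return tag, data[start:end], end


def basic_constraints_is_ca(ext_value: bytes) -> bool:
    """True if a DER-encoded BasicConstraints value carries cA = TRUE."""
    tag, body, _ = read_tlv(ext_value)
    if tag != 0x30:  # BasicConstraints is a SEQUENCE
        raise ValueError("expected SEQUENCE")
    if not body:  # empty SEQUENCE: cA takes its default, FALSE
        return False
    inner_tag, inner_value, _ = read_tlv(body)
    # 0x01 is BOOLEAN; DER encodes TRUE as 0xFF and FALSE as 0x00.
    return inner_tag == 0x01 and inner_value != b"\x00"
```

A bundle filter built on this would keep only the certs for which the check returns True, which is roughly the behaviour attributed to openssl above.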
Well...I spoke too soon. It's only half working. My company (for reasons I won't get into) hard cuts our internet mainline every night so there are limits on the testing I can do against external targets. Removing the culprit certificates does let me set 20210503094100_pants_debug.log This seems to have brought things to parity whether or not I set
Regardless of logging level, there doesn't seem to ever be any additional information regarding the warning |
You can enable trace level logging for rustls and see if that gets more out of the logging. Try adding |
I suspected this might work and I gave it a shot; no dice. It outputs exactly the same session warning DecodeError in the same place. All that is added is additional information regarding the setup and negotiation of the connection to the server over SSL. It does contain a dump of the certificate bytes returned by the exchange - but for that reason, I'm not comfortable sharing the logs from this level of debugging (yes, I'm aware these are "public keys", but I'm 99% sure I'd still have to get authorization to share the contents anyways - they're technically "private" to our intranet - they have no utility or exposure elsewhere). I did another packet capture with the fixed CA bundle in place (and Next things to try? |
So |
More details to share. I did a bit more digging and I found something interesting. As mentioned, when First thought was to try to parse it with openssl as a DER (
Hmm 🤔. Dumping it with Parsing cert/packet capture/good on the left, There's a consistent problem between the good and the
(That operation is bitwise OR, for clarity). So what I think is happening is actually that somehow, zero bytes are getting decoded improperly. In some cases, they're being omitted and dropped from the certificate. In others, bytes that should be hex 0x30 are getting mangled. I've found all three of these cases:
I actually think the first case might be a sort of "combination" of the latter two (the byte both gets mangled, and misinterpreted?) All this said, I've had one red herring already so I'm trying to be extra extra diligent to make sure I don't get down the rabbit hole too far this time too quickly. Thoughts? The only reason I don't blame the injection server (which definitely plays a role here) is because I've run packet captures CONCURRENTLY between these two. The comparisons I'm making are from certificates I've hand extracted the bytes from pcap captures of the SSL exchange via wireshark that were generated by running the pants test. So I know for sure that at a minimum, the binary string in the log and the certificate I dump are at least correlated. Also, as the packet capture is done at the level of response from the server, I'm hard pressed to come up with any reason why the server could be the fault point here, because I believe I've captured the "on the wire" data. One step further: I can curl the URL that pants tries to access (https://github.com/pantsbuild/pex/releases/download/v2.1.35/pex). It involves the same SSL injection server. The same certs. I've done the same capture. It produces the same dumped certificates. Byte for byte. It works perfectly, no issues. So I'm 99% sure that the issue is on the Turns out |
I think I may have actually determined that this is a bit of a red-herring, or at least lower priority. Now that I've fixed my bundle and I've been able to pull apart the certificates used in a few different places, I can tell you things I didn't know for sure before:
Anyways, I think the issue may actually be the byte issues I found. As for why there are byte issues, I couldn't tell you, but I'm still actively trying to figure that out. |
Ooooof, thanks for the excellent debugging. This is frustrating. If we can narrow this down to that bytestream decoding issue, and produce a test that proves this, then we can submit that bug to rustls or webpki, possibly with a fix. |
And yes, this subjectAltName thing seems like a red herring. If we can reproduce this bytestream parsing issue in a standalone way then we've got something to work with. I assume the cert in question is private? |
Does this specific cert have some property that is different from the others? Presumably ~all certs have zero bytes in them, so it must be some other thing that causes the parser to fail (or possibly causes a different parser to be called). For example, could it be ecdsa vs rsa or something? |
It may or may not be coincidence that 0x30 has special significance in the DER-encoding:
|
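For reference, 0x30 is the DER tag byte for a SEQUENCE, and every X.509 certificate starts with one, followed by a length field. That gives a quick completeness check for a dumped byte string: compare the size the outer header claims against the number of bytes actually present. This is a minimal sketch; `der_declared_size` and `missing_bytes` are hypothetical helper names, and only the definite-length forms used by DER are handled.

```python
def der_declared_size(data: bytes) -> int:
    """Total size (header plus content) that the outermost DER element claims to have."""
    if len(data) < 2:
        raise ValueError("too short to contain a DER header")
    first = data[1]
    if first < 0x80:
        # Short form: one length byte.
        return 2 + first
    # Long form: the low 7 bits give the number of length bytes that follow.
    count = first & 0x7F
    if count == 0 or len(data) < 2 + count:
        raise ValueError("unsupported or truncated length field")
    return 2 + count + int.from_bytes(data[2 : 2 + count], "big")


def missing_bytes(data: bytes) -> int:
    """How many bytes short the data is of what its own header declares."""
    return der_declared_size(data) - len(data)
```

As a worked example, a header of `30 82 05 04` declares 0x0504 = 1284 content bytes after a 4-byte header, so 1288 bytes in total; a 1283-byte string with that header would be five bytes short.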
Since this has gotten so complicated, I've decided to do the following though; this is the "correct" certificate I pulled from the packet dump:
These are "public" of sorts, and I don't think it's going to be possible to reasonably debug this at this point without some access to the relevant certificates. So I've decided to share the server certificate only - mostly because this is almost entirely derived from the public github certificate. I'll hold back the intermediate and root CA as private (just know that there are two more certificates, that form the complete chain for this one).
Relevant here: Both of the other certificates in the chain also have missing bytes. Specifically, they were both missing a handful of This is the corrupt certificate I pulled from the byte string. @benjyw , can you confirm that my decoding method is valid - I lifted this:
I pulled out the byte string, put treated it as a literal in python, and dumped it as a binary file in python: cert = b"0\x82\x05\x040\x82\x03\xec\xa0\x03\x02\x01\x02\x02\x10\xc0>%7\x86\xfbS(\xa6;\xa9\r~\x99kT0\r\x06\t*\x86H\x86\xf7\r\x01\x01\x0b\x05\00k1\x0b0\t\x06\x03U\x04\x06\x13\x02US1<0:\x06\x03U\x04\n\x133Johns Hopkins University Applied Physics Laboratory1\x1e0\x1c\x06\x03U\x04\x03\x13\x15JHUAPL SSL Inspection0\x1e\x17\r210325000000Z\x17\r220330235959Z0f1\x0b0\t\x06\x03U\x04\x06\x13\x02US1\x130\x11\x06\x03U\x04\x08\x13\nCalifornia1\x160\x14\x06\x03U\x04\x07\x13\rSan Francisco1\x150\x13\x06\x03U\x04\n\x13\x0cGitHub, Inc.1\x130\x11\x06\x03U\x04\x03\x13\ngithub.com0\x82\x01\"0\r\x06\t*\x86H\x86\xf7\r\x01\x01\x01\x05\0\x03\x82\x01\x0f\00\x82\x01\n\x02\x82\x01\x01\0\xa0\x04\xed\x84A\xe1\x92\xa3\x0b\x08r\x08\x89\xd0r\x0e#*\xcf\xf5\\\xd3\xef@\xe7\xa6\x1c\xc9\x1f5'\xd7\\v\xe3\xfe\xb04\x9f\x82\x91\xf3\x96\x01\n\xca/\xfe\xcd0\xde\x1b\x02z#[?\xe9\xf7\x04+l\xdcL\xa1\x94?\xb3\xd2f\xaf=\x8b\xe4\xc6\x0c\xe0x\x1e\xfc\x06KzR\xac\xdc\xd2G\xf9\xb3\xa9\xc2&\xd1K'Y\xb3'\xf62\xbd\xd7\xc4\xbfT\x99^\xd9\xc9\xb3UbXC\x92\xdd\xadW[\x86\x99\xcc\x8f\x7f!\xf7\xdc?\x8e\xd1\xa60\x80\x1a4\x85\xd6|\xd1\"i\x04\xd4\xdb\xe5\xba\xc0\x05\xbf 
\xea{\\\x89\xdf\xf9\xd2\xb7\x98\x99\x8fY\x83N\xd1\xd2AZ0\xc5\x8f\x17\xdb2\xd60\x9c*8\x982\xed\x0b@\x0e%@\x03\t\xbd\xd3\xe5\xf0\xce\xa0\xd4\xecS\xd8\x8d\x9c\x90)Y\xc0O\xd3\x08\xd3\x996e\x1b\xdat\xde\xfcS\xfc\xe8z5\x11\xe9CA\xab';\xec\x13\xce\xf7;]\xf1=Sb\xea\xa0\x94\xb3%\xfc\xb5b\x8b'\xe9\xdc\xd3\x0c\xa2{\x02\x03\x01\0\x01\xa3\x82\x01\xa70\x82\x01\xa30%\x06\x03U\x1d\x11\x04\x1e0\x1c\x82\ngithub.com\x82\x0ewww.github.com0\x0e\x06\x03U\x1d\x0f\x01\x01\xff\x04\x04\x03\x02\x03\xa80\x13\x06\x03U\x1d%\x04\x0c0\n\x06\x08+\x06\x01\x05\x05\x07\x03\x010\x0c\x06\x03U\x1d\x13\x01\x01\xff\x04\x020\00\x1f\x06\x03U\x1d#\x04\x180\x16\x80\x14\x1at$\xdd#\xa9\x7f\xd1\x11\xc6E\xfcQ\xe7h\xfa\xd4\x01\xdd\xbf0\x1d\x06\x03U\x1d\x0e\x04\x16\x04\x14\x85\xcf\xadA\x8b\xcb\xf2`|\xc1\x8c\xe3\n\xd4\x8cxG\xb5\xa4\x080\x82\x01\x05\x06\n+\x06\x01\x04\x01\xd6y\x02\x04\x02\x04\x81\xf6\x04\x81\xf3\0\xf1\0v\0)y\xbe\xf0\x9e99!\xf0Vs\x9fc\xa5w\xe5\xbeW}\x9c`\n\xf8\xf9M]&\\%]\xc7\x84\0\0\x01xj\xbf\xbd\x1a\0\0\x04\x03\0G0E\x02!\0\x9e\xe6\x88D\x7f\xfc4E\x9c2M\x9f\xab\x94\x86\x06\xae\xddc-\xe2\xf5_c\x97F\x8a\x0b\xa59\xd8\xd7\x02 HT'\xd1\xc62\xb5\xbf\x81w\xd7\xeb\x15h\xac\xf2\xc8\xee\xc9\x01\xad\x1f\xcc4\x0c\xee\xc9\x10rD\x98Y\0w\0\"EE\x07YU$V\x96?\xa1/\xf1\xf7m\x86\xe0#&c\xad\xc0K\x7f]\xc6\x83\\n\xe2\x0f\x02\0\0\x01xj\xbf\xbd9\0\0\x04\x03\0H0F\x02!\0\x98\0\x12J\tA\x18\xaf\x06\\(\xef\x1e\xbb\xde\x85l\x7fX\xa9\xd3\xde\x96\xb2\x16j\x99\x10\xae/\xf2i\x02!\0\xdd\xc5\xf8\xad\xbd\xf0h\xb0\xcb\xab\x80\xb8\xf0\xd4\xa8Rg0\xe7\xa3\xf0;\xf9\xb6\xbb\t\xd0\xa6\xb6\xfe\xca\x1d0\r\x06\t*\x86H\x86\xf7\r\x01\x01\x0b\x05\0\x03\x82\x01\x01\0\\\x1cn\x87c0\xa8\xae|\xdd%\xe6\xe9'F<\xe5\xfc\xd0\xdf\x96\xd1\xfe\xeb\xfc\xb1\xc8\x8c\x82mP2\xc8\xa7\x15y\x94.\xc1\x8e\xa5\0&]\x1b\xaa\xd4\x04\xee\xfd\xcf\x11\xa1.\xaa\x9f}\x11S\xa3\x8c\xd1H\x9cD\x8bf`\x98-\xeeE\x9c\x16\xb935\xdbK\\\x01\xf4P{\x86\xa9\xfeS\x1d;(V\x90\xf9\x99h>\xc0\x9e:\xdds\x04\xbe\xe9\x12\xc7\x82K]\x8bd\x1f\xd5xerd\xb2\xe1\xf2\x01\xfekH\xea<\x99\xf9\xf1m5^\x0cQ\xc8\xbb] 
\x8a`\"f\xdd\x84\xb9\xbbS\x16\xfb6\xb0\x92\x8c\xca}\xaaw7\xe8_\x84(\xa3\xa4\xfd\x96$xa\xe0\xda=\xfb\xd8S3\xb0\xa9;\x1a\x88\x11D\xdc\x8e\xb0\xe1\x15\xcem\xf1 \xfe=\xe4\xee\t\xd2\xf6)\xe7\xb3\03.TM\xdd/\xe1\xfb\x86v\xe9\xe6\xba\x92mb+\xbfK1\xf6\x07\xc5\x9d\xa4\t\xf2\xb9\xd3\xa8\xf5\xba\x0eX$DE \xf0uG\xc1-m\xa4\05T\x13\r\x05_"
with open("injection_cert.der", "wb") as f:
    f.write(cert)
This *should* be a binary encoded DER certificate, but of course it doesn't decode properly. Manually adding only the missing bytes yields this:
It's nearly identical except for two bytes near the end, each of which is missing its upper nibble 0x3 (byte 0x4d5 is 0x03 and should be 0x33, and byte 0x503 is 0x05 and should be 0x35).
I don't see anything out of the ordinary. Do you? It looks like a standard RSA certificate to me...
Interesting 🤔. We've escaped the limits of my understanding of TLS 😅 , but that certainly does seem relevant. Maybe an issue with escaping 0x30 bytes / not interpreting the escape properly? It's odd though, that this wouldn't have shown up until now. I feel like I'm grasping at straws as to why the bytestream gets mangled, but it definitely seems to, if the Certificate byte string I lifted is accurate... |
I'm looking at the bytestring in Python. Which bytes in the original 1283-length byte sequence are malformed? |
In other words, what are the exact transformations I would need to do to that |
Oh NM I can produce this myself, I missed that you posted the "good" cert. |
Eh, it's a good exercise anyways: Using incremental offsets (so the count is relative after each modification is made), counting from 0x00: cert = b"0\x82\x05\x040\x82\x03\xec\xa0\x03\x02\x01\x02\x02\x10\xc0>%7\x86\xfbS(\xa6;\xa9\r~\x99kT0\r\x06\t*\x86H\x86\xf7\r\x01\x01\x0b\x05\00k1\x0b0\t\x06\x03U\x04\x06\x13\x02US1<0:\x06\x03U\x04\n\x133Johns Hopkins University Applied Physics Laboratory1\x1e0\x1c\x06\x03U\x04\x03\x13\x15JHUAPL SSL Inspection0\x1e\x17\r210325000000Z\x17\r220330235959Z0f1\x0b0\t\x06\x03U\x04\x06\x13\x02US1\x130\x11\x06\x03U\x04\x08\x13\nCalifornia1\x160\x14\x06\x03U\x04\x07\x13\rSan Francisco1\x150\x13\x06\x03U\x04\n\x13\x0cGitHub, Inc.1\x130\x11\x06\x03U\x04\x03\x13\ngithub.com0\x82\x01\"0\r\x06\t*\x86H\x86\xf7\r\x01\x01\x01\x05\0\x03\x82\x01\x0f\00\x82\x01\n\x02\x82\x01\x01\0\xa0\x04\xed\x84A\xe1\x92\xa3\x0b\x08r\x08\x89\xd0r\x0e#*\xcf\xf5\\\xd3\xef@\xe7\xa6\x1c\xc9\x1f5'\xd7\\v\xe3\xfe\xb04\x9f\x82\x91\xf3\x96\x01\n\xca/\xfe\xcd0\xde\x1b\x02z#[?\xe9\xf7\x04+l\xdcL\xa1\x94?\xb3\xd2f\xaf=\x8b\xe4\xc6\x0c\xe0x\x1e\xfc\x06KzR\xac\xdc\xd2G\xf9\xb3\xa9\xc2&\xd1K'Y\xb3'\xf62\xbd\xd7\xc4\xbfT\x99^\xd9\xc9\xb3UbXC\x92\xdd\xadW[\x86\x99\xcc\x8f\x7f!\xf7\xdc?\x8e\xd1\xa60\x80\x1a4\x85\xd6|\xd1\"i\x04\xd4\xdb\xe5\xba\xc0\x05\xbf 
\xea{\\\x89\xdf\xf9\xd2\xb7\x98\x99\x8fY\x83N\xd1\xd2AZ0\xc5\x8f\x17\xdb2\xd60\x9c*8\x982\xed\x0b@\x0e%@\x03\t\xbd\xd3\xe5\xf0\xce\xa0\xd4\xecS\xd8\x8d\x9c\x90)Y\xc0O\xd3\x08\xd3\x996e\x1b\xdat\xde\xfcS\xfc\xe8z5\x11\xe9CA\xab';\xec\x13\xce\xf7;]\xf1=Sb\xea\xa0\x94\xb3%\xfc\xb5b\x8b'\xe9\xdc\xd3\x0c\xa2{\x02\x03\x01\0\x01\xa3\x82\x01\xa70\x82\x01\xa30%\x06\x03U\x1d\x11\x04\x1e0\x1c\x82\ngithub.com\x82\x0ewww.github.com0\x0e\x06\x03U\x1d\x0f\x01\x01\xff\x04\x04\x03\x02\x03\xa80\x13\x06\x03U\x1d%\x04\x0c0\n\x06\x08+\x06\x01\x05\x05\x07\x03\x010\x0c\x06\x03U\x1d\x13\x01\x01\xff\x04\x020\00\x1f\x06\x03U\x1d#\x04\x180\x16\x80\x14\x1at$\xdd#\xa9\x7f\xd1\x11\xc6E\xfcQ\xe7h\xfa\xd4\x01\xdd\xbf0\x1d\x06\x03U\x1d\x0e\x04\x16\x04\x14\x85\xcf\xadA\x8b\xcb\xf2`|\xc1\x8c\xe3\n\xd4\x8cxG\xb5\xa4\x080\x82\x01\x05\x06\n+\x06\x01\x04\x01\xd6y\x02\x04\x02\x04\x81\xf6\x04\x81\xf3\0\xf1\0v\0)y\xbe\xf0\x9e99!\xf0Vs\x9fc\xa5w\xe5\xbeW}\x9c`\n\xf8\xf9M]&\\%]\xc7\x84\0\0\x01xj\xbf\xbd\x1a\0\0\x04\x03\0G0E\x02!\0\x9e\xe6\x88D\x7f\xfc4E\x9c2M\x9f\xab\x94\x86\x06\xae\xddc-\xe2\xf5_c\x97F\x8a\x0b\xa59\xd8\xd7\x02 HT'\xd1\xc62\xb5\xbf\x81w\xd7\xeb\x15h\xac\xf2\xc8\xee\xc9\x01\xad\x1f\xcc4\x0c\xee\xc9\x10rD\x98Y\0w\0\"EE\x07YU$V\x96?\xa1/\xf1\xf7m\x86\xe0#&c\xad\xc0K\x7f]\xc6\x83\\n\xe2\x0f\x02\0\0\x01xj\xbf\xbd9\0\0\x04\x03\0H0F\x02!\0\x98\0\x12J\tA\x18\xaf\x06\\(\xef\x1e\xbb\xde\x85l\x7fX\xa9\xd3\xde\x96\xb2\x16j\x99\x10\xae/\xf2i\x02!\0\xdd\xc5\xf8\xad\xbd\xf0h\xb0\xcb\xab\x80\xb8\xf0\xd4\xa8Rg0\xe7\xa3\xf0;\xf9\xb6\xbb\t\xd0\xa6\xb6\xfe\xca\x1d0\r\x06\t*\x86H\x86\xf7\r\x01\x01\x0b\x05\0\x03\x82\x01\x01\0\\\x1cn\x87c0\xa8\xae|\xdd%\xe6\xe9'F<\xe5\xfc\xd0\xdf\x96\xd1\xfe\xeb\xfc\xb1\xc8\x8c\x82mP2\xc8\xa7\x15y\x94.\xc1\x8e\xa5\0&]\x1b\xaa\xd4\x04\xee\xfd\xcf\x11\xa1.\xaa\x9f}\x11S\xa3\x8c\xd1H\x9cD\x8bf`\x98-\xeeE\x9c\x16\xb935\xdbK\\\x01\xf4P{\x86\xa9\xfeS\x1d;(V\x90\xf9\x99h>\xc0\x9e:\xdds\x04\xbe\xe9\x12\xc7\x82K]\x8bd\x1f\xd5xerd\xb2\xe1\xf2\x01\xfekH\xea<\x99\xf9\xf1m5^\x0cQ\xc8\xbb] 
\x8a`\"f\xdd\x84\xb9\xbbS\x16\xfb6\xb0\x92\x8c\xca}\xaaw7\xe8_\x84(\xa3\xa4\xfd\x96$xa\xe0\xda=\xfb\xd8S3\xb0\xa9;\x1a\x88\x11D\xdc\x8e\xb0\xe1\x15\xcem\xf1 \xfe=\xe4\xee\t\xd2\xf6)\xe7\xb3\03.TM\xdd/\xe1\xfb\x86v\xe9\xe6\xba\x92mb+\xbfK1\xf6\x07\xc5\x9d\xa4\t\xf2\xb9\xd3\xa8\xf5\xba\x0eX$DE \xf0uG\xc1-m\xa4\05T\x13\r\x05_"
fixed_cert = cert[:0x2e] + b"\x30" + cert[0x2e:]
fixed_cert = fixed_cert[:0x13b] + b"\x30" + fixed_cert[0x13b:]
fixed_cert = fixed_cert[:0x2ab] + b"\x30" + fixed_cert[0x2ab:]
fixed_cert = fixed_cert[:0x4d3] + b"\x00" + fixed_cert[0x4d3:]
fixed_cert = fixed_cert[:0x4d4] + b"\x33" + fixed_cert[0x4d5:]
fixed_cert = fixed_cert[:0x501] + b"\x00" + fixed_cert[0x501:]
fixed_cert = fixed_cert[:0x502] + b"\x35" + fixed_cert[0x503:]
with open("server.der", "wb") as f:
    f.write(fixed_cert)
That's all the modifications. The other certs in the chain also have missing |
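The hand-derived offsets above could also be cross-checked mechanically by diffing the logged byte string against the packet-capture copy. A minimal sketch using the standard library's `difflib`; `byte_edits` is a hypothetical helper name, and the offsets it reports are relative to the unmodified "bad" input rather than incremental.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def byte_edits(bad: bytes, good: bytes) -> List[Tuple[str, int, bytes, bytes]]:
    """List the edits that turn `bad` into `good`.

    Each entry is (operation, offset in `bad`, bytes in `bad`, bytes in `good`),
    where operation is "insert", "delete" or "replace".
    """
    # autojunk=False so frequently repeated byte values are not skipped as "junk".
    matcher = SequenceMatcher(None, bad, good, autojunk=False)
    edits = []
    for op, a0, a1, b0, b1 in matcher.get_opcodes():
        if op != "equal":
            edits.append((op, a0, bad[a0:a1], good[b0:b1]))
    return edits
```

Run against the two files written above (`injection_cert.der` and `server.der`), this should list the same handful of positions as the manual slicing.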
Cool, so the one thing left to eliminate is whether the differences are some artifact of the logging output rather than a true representation of a mis-parsed cert. It's unlikely though, this does seem likely to be the underlying problem. |
Agreed. I've been wondering about this and I don't know how to rule that out |
Unfortunately, it does look like this is a red herring, due to an issue with the debug string of the object, which is what's logged. I wrote a little Rust code to load the good cert from a file into a Certificate object and then write both its individual bytes (as an array of ints) and its debug string form to stdout. The list of ints is good, but the debug-string copy is bad once you read it back in via Python. |
Yeah, I looked at the debug output code, and it emits So, alas, this is not it. |
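The round-trip problem can be reproduced in Python alone. Assuming the debug output writes a NUL byte as `\0` (which is how it appears in the logged byte string earlier in this thread), any NUL followed by an ASCII digit becomes ambiguous: Python reads `\0` plus up to two more octal digits as a single octal escape. The escaper below is a hypothetical stand-in written to mimic that style, not the actual Rust formatting code.

```python
import ast


def naive_debug_escape(data: bytes) -> str:
    # Hypothetical escaper: NUL is written as \0, printable ASCII is kept literal,
    # and everything else is written as \xNN.
    out = []
    for b in data:
        if b == 0x00:
            out.append("\\0")
        elif b == 0x5C:  # backslash
            out.append("\\\\")
        elif b == 0x22:  # double quote
            out.append('\\"')
        elif 0x20 <= b < 0x7F:
            out.append(chr(b))
        else:
            out.append(f"\\x{b:02x}")
    return "".join(out)


def read_back_as_python(escaped: str) -> bytes:
    # Treat the escaped text as the body of a Python bytes literal.
    return ast.literal_eval('b"' + escaped + '"')


original = bytes([0x05, 0x00, 0x30, 0x6B])  # NUL followed by the byte 0x30, ASCII "0"
escaped = naive_debug_escape(original)       # the text \x05\00k
round_tripped = read_back_as_python(escaped)
```

Here `round_tripped` is one byte shorter than `original`: Python parses `\00` as one octal escape for NUL, so the 0x30 disappears. By the same mechanism a NUL followed by 0x33 (ASCII "3") reads back as the single byte 0x03, which matches the "missing 0x30" and "0x03 should be 0x33" symptoms described above.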
OK, regrouping and trying to think of how else to debug. |
To be sure I have this right: the cert posted above is the one with the BadDER error, and we know this because it was logged on error? If so then I note that this cert has a "Subject Alternative Name" set, so that whole thing is definitely not the problem. |
Correct. The first cert is the exact response the “server” replies with when the code attempts to download pex. The reason I say it that way is that it’s a dynamically generated certificate, from the SSL inspection server. Normally you wouldn’t be able to snag them easily but I yanked it out of a packet capture of the request. I don’t know if the inspection server regenerates certificates for each request - I can try to look into this if we think it’d be helpful. But yeah, it has a SAN, and it’s well formed on the wire in the response from the server: #12000 (comment) And I can curl the same URL just fine. So that’s the same bundle, just OpenSSL instead of rustls. The cert is okay. It’s something with the rust side itself 🤷 |
To clarify. The BadDER error was logged along with a poorly formatted version of that cert at trace level. I simultaneously packet captured the request in wireshark, and then dumped the certificate segment as a DER. We know they match, and we know they’re the same request (I’ve verified it targeted the url). |
Cool. I gotta do other stuff for a bit but will return to this tomorrow. We can't actually test by using this cert as a server cert in the unit test, because we don't have the private key it was signed with. But maybe we can generate a key with the same set of attributes. |
I've tweaked the test cert to have the exact same X509v3 extension values as the problematic one, with the exception of the custom "CT Precertificate SCTs" extension, and things still work. So it may be that the issue is those Precertificate SCTs. I don't see an easy way to generate those in a test cert. Another angle to try is - do you have information on the vendor of the SSL inspection system that is generating these certs? We might be able to set up a demo system for testing. |
@cjntaylor just checking in to see if we can get any info on the vendor of the SSL inspection system, so we can attempt to reproduce that another way. Right now I may have narrowed the problem down to the Precertificate SCTs, but those are a little tricky to generate and inject into a cert for testing. And we wouldn't have confidence that if we saw an error it was the same error as you're getting, rather than a mistake in encoding the cert. |
Apologies for the delay. Right now I have very little information about the SSL Inspection system that is in use, but I will try to find out more. As you might imagine, this is a rather large system as it needs to handle all SSL traffic for the entire network, so it's deployed at a level and by folks I don't normally have access to. That said, I'll see if I can get some more information. Is there something about the Precertificate SCTs that is causing an issue? I don't know that I've seen them before on other requests, and I'm curious as to why they'd be giving rustls hangups (or why you'd think that'd be the case). |
Doing a little bit of digging by backtracing the error (and I'm sure you've done the same), would you agree that the problem I'm getting is that it's erroring right here: If I traced it correctly, the BadDer error is caught by this function (chained there): Which is translating it into the "fatal alert DecodeError" right here Do I have that right? |
https://www.forcepoint.com/product/ngfw-next-generation-firewall That doesn't look to be all that helpful. Very enterprise, very inaccessible (and expensive). But I've verified that's actually what is in use. At a minimum though, I think we can be fairly certain that it "does the right thing" for most applications. At least, as much as I'll be able to expect a system/restriction like this to provide. I think in my case, I'll have to live with the environment it creates. Frustratingly, this is only an issue in the deep internals of what appears to be rustls / webpki for now, so it's hard for me to argue that it's a fundamental issue when anything openssl based works fine. |
Thanks! I'll do a little more poking. I don't have strong evidence that this has to do with the Precertificate SCTs, other than:
|
I suspect you're right about the backtrace and where the error is coming from. But I cannot reproduce the error so it's hard to be certain. But this does give me a couple of ideas. |
Would it be helpful to ask the Rustls folks to take a look at this thread? They may have ideas for things we're missing. |
I've realized that this is probably not due to Precertificate SCTs as those have been copied verbatim from the original github cert (I checked). |
Plus I can now reproduce the failure in a rustls test which ignores the SCTs. |
Was able to file an issue with a repro against webpki: briansmith/webpki#232 Hopefully they can take it from there, I am now definitely up against my limited knowledge of Rust and of TLS. |
My company uses SSL inspection as part of its security envelope. This system requires the equivalent of a self-signed intermediate certificate to be added to the system certificate store as a root certificate authority in order for any SSL based connection to function. Any SSL based connection will be intercepted and rewritten; the new connection will be signed by this CA instead of its original authority and packets will be rewritten to match.
I've been unable to use pants in this context, which, when attempting to download its artifacts, reports the following error:
With the help of some very nice folks on slack, we've already traced this issue to the rust part of the system, in the rustls crate used by reqwest. I've attached the relevant logs, and I'd be happy to generate / capture anything else you'd find helpful. Long story short, it looks like I'm running into the issue reported here: rustls/rustls#127. Rustls diverges from openssl behaviour; when attempting to process the certificate, it attempts to read the subjectAltName from it. In my case, the final dynamic inspection certificate has one, but the two previous signing CA certificates in chain only have a common name. Since this extension is missing, when rustls attempts to read the field, it fails, implodes, and the request fails with the above error (or so we currently think).
The certificates probably should have subjectAltName extensions set. However, they're generated dynamically, and not really in my control - they're handled at a much higher level by a different department in a large organization; it will take months if not years to get something like that fixed (if I even can get someone to care). Most people don't run into this issue because openssl is more permissive in this case and doesn't fault when the subjectAltName is missing. So if it's possible, it would be really helpful if a pants side fix could be implemented to address this.
SSL Intercept systems like this are becoming increasingly common at large corporations, to prevent in-network malicious actors from messaging home in SSL packets. I promise you this isn't a one-off issue.
I'm a full stack developer, and normally I'd just figure out the solution, and issue an MR with a proposed fix (since this affects me, it seems like the right thing to do). However, I'm not at all familiar with rust, and this is a security issue - I'm not the person you want poking around in the source code trying to fix this. I'd be happy to help in any other way I can, and I can test / debug / log anything that would be helpful.
I have captured the SSL exchange at the packet level, and I have confirmed that the certificates in the exchange are fully formed and valid. As the certificates are authorities, and could be used for signing purposes, I can't actually share any of these, but, I can report back any information that would be helpful, or try to capture any details that might be useful.
Thanks in advance for any help 😄
20210501194000_pants_debug.log