Form-Data not treated as binary when request is intercepted #20143

mlauinger · 2022-02-10T12:19:32Z

Current behavior

Currently, when you intercept a request with a body of type multipart/form-data which includes a binary file, as well as a considerable amount of text, it is possible that the body will not be detected as type binary. Therefore the toString() method is called on the body which breaks the binary file for server-side processing.

See request.ts

I think this issue was first introduced with cypress 7.0.0, at least the issue occured for us after upgrading from 6.7.1 to 7.0.0. It does not matter if you do anything with the interceptor or not. In our case we just wanted to wait for the response before continuing the test.

Desired behavior

Form-Data bodies should be treated as binary.

Test code to reproduce

Send a form-data request with two fields, one small binary file (e.g. an empty PDF) and one field containing around 1000 words from Lorem Ipsum.

Run the test with DEBUG=cypress:net-stubbing* and you wont see the debug line eq.body contained non-utf8 characters, treating as binary content

Reduce the text fields content to like 100 words and you will see the debug message.

Cypress Version

7.0.0+

Other

No response

The text was updated successfully, but these errors were encountered:

BlueWinds · 2022-02-15T19:54:04Z

Thanks for the report, and for opening a pull request! We'll get a second reviewer from the team here in the next few days, and try to get it merged in time for the next release (currently planned as 9.5.1).

flotwig · 2022-02-23T20:37:30Z

@mlauinger Merging #20144 revealed some unintended breakage in cypress-example-recipes that's indicative of real-world breakage for users of cy.intercept(): https://circleci.com/gh/cypress-io/cypress/1283292?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link

The patch to cypress-example-recipes required for it to pass after #20144: cypress-io/cypress-example-recipes@149ad0e

In light of this, we're going to revert #20144 for the time being, and look for another solution.

The main issue here seems to be our use of istextorbinary. From the README:

The contents check (with the default options) will check 24 bytes at the start, middle, and end of the buffer. History has shown that checking all three locations is mandatory for accuracy, and that anything less is not accurate. This technique offers superior performance while still offering superior accuracy. Alternatives generally just do 1000 bytes at the start, which is slower, and inaccurate.

This has the end effect of making the types of req.body and res.body very unpredictable - it could be a string, or it could be a buffer, depending on this heuristic which is totally dependent on things like the ordering of the parts in a multipart/form-data body and the length of text and/or binary components.

I think that it would make sense to do 2 things here to fix this behavior:

Get rid of istextorbinary since it makes the behavior hard to predict, and
Add some API to req/res for users to retrieve the body as a buffer or as a string. Here is a non-exhaustive list of API options:
a. Make req/res.body always a string. Add req/res.getBodyBuffer() for retrieving the raw bytes.
b. Make req/res.body always an ArrayBuffer. Add req/res.getBodyString() for decoding it as a UTF8 string.
I think (a) would be more preferable since the use case of body being a string is probably more common than that of it being binary content.

Additionally, a future improvement to cy.intercept() could automatically detect multipart/form-data responses and use something like multer to give users a nice API to interact with the different parts, and could automatically determine binary/string based on each part's MIME type. Like req/res.formData.attachement. But, this doesn't cover all use cases for binary message bodies, so I think it is less important than just getting (1) and (2) fixed.

Thoughts @mlauinger @BlueWinds ?

BlueWinds · 2022-02-24T16:27:34Z

The main issue here seems to be our use of istextorbinary. From the README:

This has the end effect of making the types of req.body and res.body very unpredictable - it could be a string, or it could be a buffer, depending on this heuristic which is totally dependent on things like the ordering of the parts in a multipart/form-data body and the length of text and/or binary components.

I think that it would make sense to do 2 things here to fix this behavior:
1. Get rid of `istextorbinary` since it makes the behavior hard to predict, and

2. Add some API to `req/res` for users to retrieve the body as a buffer or as a string. Here is a non-exhaustive list of API options:
   a. Make `req/res.body` always a `string`. Add `req/res.getBodyBuffer()` for retrieving the raw bytes.
   b. Make `req/res.body` always an `ArrayBuffer`. Add `req/res.getBodyString()` for decoding it as a UTF8 string.
   I think (a) would be more preferable since the use case of `body` being a string is probably more common than that of it being binary content.

I think this all sounds very reasonable. I was noticing istextorbinary while reviewing this PR, and thinking much the same thing - the unpredictability in our API is a not particularly friendly. I'm a fan of req.body and req.raw being for strings and buffers respectively.

Additionally, a future improvement to cy.intercept() could automatically detect multipart/form-data responses and use something like multer to give users a nice API to interact with the different parts, and could automatically determine binary/string based on each part's MIME type. Like req/res.formData.attachement. But, this doesn't cover all use cases for binary message bodies, so I think it is less important than just getting (1) and (2) fixed.

This seems less important, as you said, and I'm not sure adding an additional library is necessary on our end. Something to consider for a later date.

strigefleur · 2023-01-26T09:53:38Z

Meanwhile, what would be the current way to mock multipart/form-data response within intercept scope?

Doesn't seem to work atm:

const formData = new FormData();
formData.append('some', 'text');

cy.intercept('some-api', formData);

Harshithkumar · 2023-07-17T12:56:28Z

Still the Issue persist in 12.17.1V as well. Can somebody help and fix this issue ?

jordymolenaar · 2024-05-07T13:16:26Z

Just encountered this in v13.7.3
Adding an Authorization header with cy.intercept breaks our pdf file upload!

Tzinov15 · 2024-10-23T21:15:04Z

I think we are hitting something similar. We have a GraphQL request going out (using graphql-request) which automatically detects a File type parameter and updates the graphql request to be include a Content-Type header of multipart/form-data. When trying to mock this (and in particular use Cypress's recommended path of alias GraphQL requests using req.body.operationName), this particular upload file request gets intercepted and it's body is a ArrayBuffer which we can't do anything with

as opposed to our other mocked requests that come through fine

and have a readable operationName

stoonya · 2024-11-08T16:52:53Z

I think we are hitting something similar. We have a GraphQL request going out (using graphql-request) which automatically detects a File type parameter and updates the graphql request to be include a Content-Type header of multipart/form-data. When trying to mock this (and in particular use Cypress's recommended path of alias GraphQL requests using req.body.operationName), this particular upload file request gets intercepted and it's body is a ArrayBuffer which we can't do anything with
as opposed to our other mocked requests that come through fine and have a readable `operationName`

Not sure if it works, but you could do something like this:

function isFormData(request) {
  return request.headers['content-type'].toString().includes('form-data')
}

cy.intercept('POST', '<your GQL relative path>', (req) => {
  if(!isFormData) {
    // your regular mocking logic for JSON-type requests
  } else {
    const bodyString = Cypress.Blob.arrayBufferToBinaryString(req.body)
    
    // parse and work with that string
    // if needed, you can do the opposite conversion: Cypress.Blob.binaryStringToArrayBuffer
  }
}

Tzinov15 · 2024-11-12T20:43:50Z

Ah, this would work, slightly cleaner than our current approach which is to just check if the operationName field can be read off the body and if not, assume it's a form-data request. Thanks! Regardless, would be great if Cypress could handle this

oscarmangan · 2024-12-02T13:59:40Z

Also facing an issue with this v13.8.1. We are uploading a file via a multipart/form-data request with other fields and the cy.intercept() we need to use is corrupting the file. In DevTools we're seeing characters like 'ÿ' replaced with '�' and the server-side processing is failing to parse correctly.

Can this be fixed?

mlauinger mentioned this issue Feb 10, 2022

fix: treat form-data bodies as binary #20144

Merged

4 tasks

flotwig closed this as completed in #20144 Feb 17, 2022

flotwig mentioned this issue Feb 23, 2022

fix: revert "fix: treat form-data bodies as binary" #20343

Merged

flotwig reopened this Feb 23, 2022

flotwig added topic: cy.intercept() type: bug labels Feb 23, 2022

davidmunechika mentioned this issue Apr 26, 2022

Multipart form-data for API Post Request Issue #21173

Open

cypress-bot bot added the stage: backlog label Apr 29, 2022

nagash77 added E2E Issue related to end-to-end testing Triaged Issue has been routed to backlog. This is not a commitment to have it prioritized by the team. labels Jul 18, 2023

jennifer-shehane removed the stage: backlog label Nov 28, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Form-Data not treated as binary when request is intercepted #20143

Form-Data not treated as binary when request is intercepted #20143

mlauinger commented Feb 10, 2022

BlueWinds commented Feb 15, 2022

flotwig commented Feb 23, 2022 •

edited

Loading

BlueWinds commented Feb 24, 2022

strigefleur commented Jan 26, 2023

Harshithkumar commented Jul 17, 2023

jordymolenaar commented May 7, 2024

Tzinov15 commented Oct 23, 2024

stoonya commented Nov 8, 2024

Tzinov15 commented Nov 12, 2024

oscarmangan commented Dec 2, 2024

Form-Data not treated as binary when request is intercepted #20143

Form-Data not treated as binary when request is intercepted #20143

Comments

mlauinger commented Feb 10, 2022

Current behavior

Desired behavior

Test code to reproduce

Cypress Version

Other

BlueWinds commented Feb 15, 2022

flotwig commented Feb 23, 2022 • edited Loading

BlueWinds commented Feb 24, 2022

strigefleur commented Jan 26, 2023

Harshithkumar commented Jul 17, 2023

jordymolenaar commented May 7, 2024

Tzinov15 commented Oct 23, 2024

stoonya commented Nov 8, 2024

Tzinov15 commented Nov 12, 2024

oscarmangan commented Dec 2, 2024

flotwig commented Feb 23, 2022 •

edited

Loading