Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reading big WKT geometry cause "memory access out of bounds" in Chrome and "zsh: bus error" in Node env #26

Closed
sanak opened this issue Oct 14, 2023 · 9 comments · Fixed by #33
Labels
bug Something isn't working

Comments

@sanak
Copy link
Contributor

sanak commented Oct 14, 2023

Chrome env error is reproducible from the following step.

  1. Access to https://sanak.github.io/geos-wasm-tester/
  2. Open Chrome DevTools
  3. Select misc - fme.xml from TestXml, then click Load button
  4. Select # 1 - Big Bad Nasty buffer from TestCase
  5. Check Chrome DevTools console error

Node env error is reproducible from the following step.

  1. Download reproducible test case file from https://github.com/sanak/geos-wasm/blob/wkt-big-geometry/test/tests/wkt_big_geometry.spec.mjs
  2. Put downloaded file to test/tests/ folder
  3. Run npx tape test/tests/wkt_big_geometry.spec.mjs

From the following changes, I could solve above issue.

  1. Change WKT Reader/Writer string type to number

    --- a/src/allCFunctions.mjs
    +++ b/src/allCFunctions.mjs
    @@ -3197,7 +3197,7 @@ pointer  * @param {number} s - The coordinate sequence pointer
      * @see {@link https://libgeos.org/doxygen/geos__c_8h.html#a0a0f7c1b9f6a9f7c3c4d1b5a7b6f9e2e|GEOSWKTReader_read_r}
      * @alias module:geos
       */
    -  geos.GEOSWKTReader_read_r = Module.cwrap('GEOSWKTReader_read_r', 'number', ['number', 'number', 'string'])
    +  geos.GEOSWKTReader_read_r = Module.cwrap('GEOSWKTReader_read_r', 'number', ['number', 'number', 'number'])
     
       /**
      * Creates a GEOSWKTWriter object.
    @@ -3248,7 +3248,7 @@ pointer  * @param {number} s - The coordinate sequence pointer
      * @returns {string} The WKT representation of the geometry, or null on error.
      * @alias module:geos
       */
    -  geos.GEOSWKTWriter_write_r = Module.cwrap('GEOSWKTWriter_write_r', 'string', ['number', 'number', 'number'])
    +  geos.GEOSWKTWriter_write_r = Module.cwrap('GEOSWKTWriter_write_r', 'number', ['number', 'number', 'number'])
  2. Run npm run build

  3. Switch to old WKT read/write style

    --- a/test/tests/wkt_big_geometry.spec.mjs
    +++ b/test/tests/wkt_big_geometry.spec.mjs
    @@ -15,12 +15,12 @@ const wkt = 'POLYGON ((377888.21875 5686717.5, 377888.09375 5686717.5, 377886.46
     // Run the tests
     test('Test WKT read/write big geometry', (t) => {
       // Convert the WKT string to a GEOS geometry pointer
    -  const geomPtr = geos.GEOSWKTReader_read(reader, wkt)
    -  // const size = wkt.length + 1
    -  // const wktPtr = geos.Module._malloc(size)
    -  // geos.Module.stringToUTF8(wkt, wktPtr, size)
    -  // const geomPtr = geos.GEOSWKTReader_read(reader, wktPtr)
    -  // geos.Module._free(wktPtr)
    +  // const geomPtr = geos.GEOSWKTReader_read(reader, wkt)
    +  const size = wkt.length + 1
    +  const wktPtr = geos.Module._malloc(size)
    +  geos.Module.stringToUTF8(wkt, wktPtr, size)
    +  const geomPtr = geos.GEOSWKTReader_read(reader, wktPtr)
    +  geos.Module._free(wktPtr)
     
       // Check if the pointer is valid
       t.ok(geomPtr, 'GEOSWKTReader_read should return a valid pointer')
    @@ -29,10 +29,10 @@ test('Test WKT read/write big geometry', (t) => {
       geos.GEOSWKTWriter_setRoundingPrecision(writer, 5)
     
       // Convert the GEOS geometry pointer back to a WKT string
    -  const result = geos.GEOSWKTWriter_write(writer, geomPtr)
    -  // const resultPtr = geos.GEOSWKTWriter_write(writer, geomPtr)
    -  // const result = geos.Module.UTF8ToString(resultPtr)
    -  // geos.GEOSFree(resultPtr)
    +  // const result = geos.GEOSWKTWriter_write(writer, geomPtr)
    +  const resultPtr = geos.GEOSWKTWriter_write(writer, geomPtr)
    +  const result = geos.Module.UTF8ToString(resultPtr)
    +  geos.GEOSFree(resultPtr)
  4. Run npx tape test/tests/wkt_big_geometry.spec.mjs

@sanak
Copy link
Contributor Author

sanak commented Oct 14, 2023

@chrispahm
This is just my assumption, but I guess that string type uses stack memory in WebAssembly, and big geometry (113,282 bytes in above test case) consumes over the stack size (default 64KB).

Since WKT geometry (and other WKB Hex and GeoJSON) size can be further bigger, so I think that using heap memory by _malloc with number type is safe than using string type.

But changing string type to number breaks current API, so something like the following changes may be good to keep current API (at least non-re API).

diff --git a/src/allCFunctions.mjs b/src/allCFunctions.mjs
index 3d5aca7..bef3ba7 100644
--- a/src/allCFunctions.mjs
+++ b/src/allCFunctions.mjs
@@ -3185,7 +3185,17 @@ pointer  * @param {number} s - The coordinate sequence pointer
  * @see {@link https://libgeos.org/doxygen/geos__c_8h.html#a0a0f7c1b9f6a9f7c3c4d1b5a7b6f9e2e|GEOSWKTReader_read_r}
  * @alias module:geos
   */
-  geos.GEOSWKTReader_read = null
+  geos.GEOSWKTReader_read = (reader, wkt) => {
+    if (typeof wkt !== 'string' || wkt.length === 0) {
+      return 0
+    }
+    const size = wkt.length + 1
+    const wktPtr = Module._malloc(size)
+    Module.stringToUTF8(wkt, wktPtr, size)
+    const geomPtr = geos.GEOSWKTReader_read_r(geos._ctx, reader, wktPtr)
+    Module._free(wktPtr)
+    return geomPtr
+  }
 
   /**
  * Reads a WKT string and returns a GEOSGeometry object.
@@ -3197,7 +3207,7 @@ pointer  * @param {number} s - The coordinate sequence pointer
  * @see {@link https://libgeos.org/doxygen/geos__c_8h.html#a0a0f7c1b9f6a9f7c3c4d1b5a7b6f9e2e|GEOSWKTReader_read_r}
  * @alias module:geos
   */
-  geos.GEOSWKTReader_read_r = Module.cwrap('GEOSWKTReader_read_r', 'number', ['number', 'number', 'string'])
+  geos.GEOSWKTReader_read_r = Module.cwrap('GEOSWKTReader_read_r', 'number', ['number', 'number', 'number'])
 
   /**
  * Creates a GEOSWKTWriter object.
@@ -3238,7 +3248,12 @@ pointer  * @param {number} s - The coordinate sequence pointer
  * @returns {string} The WKT representation of the geometry, or null on error.
  * @alias module:geos
   */
-  geos.GEOSWKTWriter_write = null
+  geos.GEOSWKTWriter_write = (writer, g) => {
+    const wktPtr = geos.GEOSWKTWriter_write_r(geos._ctx, writer, g)
+    const wkt = Module.UTF8ToString(wktPtr)
+    geos.GEOSFree_r(geos._ctx, wktPtr)
+    return wkt
+  }
 
   /**
  * Converts a GEOSGeometry object to a WKT string using a context handle.
@@ -3248,7 +3263,7 @@ pointer  * @param {number} s - The coordinate sequence pointer
  * @returns {string} The WKT representation of the geometry, or null on error.
  * @alias module:geos
   */
-  geos.GEOSWKTWriter_write_r = Module.cwrap('GEOSWKTWriter_write_r', 'string', ['number', 'number', 'number'])
+  geos.GEOSWKTWriter_write_r = Module.cwrap('GEOSWKTWriter_write_r', 'number', ['number', 'number', 'number'])
 
   /**
  * Sets whether the output WKT string should be trimmed of unnecessary spaces.
@@ -5135,7 +5150,9 @@ pointer  * @param {number} s - The coordinate sequence pointer
   Object.keys(geos).forEach(property => {
     if (typeof geos[property] === 'function' && property.endsWith('_r')) {
       const nonReFunctionName = property.slice(0, -2)
-      geos[nonReFunctionName] = (...args) => geos[property](geos._ctx, ...args)
+      if (typeof geos[nonReFunctionName] !== 'function') {
+        geos[nonReFunctionName] = (...args) => geos[property](geos._ctx, ...args)
+      }
     }
   })

How do you think about above ?

@chrispahm
Copy link
Owner

I'm wondering if it's really necessary/ a good idea to keep the current API, since a) we would have to keep maintaining it in the future and b) other users will run into the memory access out of bounds issue and would then need to manually find out how to fix that!

Since there are not many users of this library (according to npm), I think we could also introduce a breaking change and signal that the API changed by releasing a new major version (as pointed out by https://semver.org/ "MAJOR version when you make incompatible API changes").

What do you think? I guess it would make our lives easier as all we would have to change would be a few changes of "string" → "number" in the allCFunctions.mjs as well as a few updates to the tests!

@chrispahm chrispahm added the bug Something isn't working label Nov 1, 2023
@sanak
Copy link
Contributor Author

sanak commented Nov 2, 2023

@chrispahm Thanks for reply!

I'm wondering if it's really necessary/ a good idea to keep the current API, since a) we would have to keep maintaining it in the future and b) other users will run into the memory access out of bounds issue and would then need to manually find out how to fix that!

Since there are not many users of this library (according to npm), I think we could also introduce a breaking change and signal that the API changed by releasing a new major version (as pointed out by https://semver.org/ "MAJOR version when you make incompatible API changes").

Okay, it sounds good for me to upgrade major version for breaking API changes. 👍

What do you think? I guess it would make our lives easier as all we would have to change would be a few changes of "string" → "number" in the allCFunctions.mjs as well as a few updates to the tests!

Okay.

I guess that some functions (ex. GEOSRelate, GEOSGeomType etc.) return type can be kept as "string", because GEOS returns maximum 10-20 bytes as the result in such functions and it is enough less than default stack size (64KB).

Otherwise, it is okay to replace "string" to "number", but WKT reader/writer called amount is not a few in the tests at this moment,

and replacing the all calls to raw pointer access will increase the test code, so something adding helper methods (which are similar with original GEOSGeomFromWKB or PR #19's geosGeomToGeojson) may be convenient for that.

@mtrcn
Copy link

mtrcn commented Mar 25, 2024

@sanak @chrispahm any plan to fix this issue? because we are using this library for geometry validation and getting this error a lot.

@sanak
Copy link
Contributor Author

sanak commented Mar 29, 2024

Hi @chrispahm, @mtrcn
Sorry, I was away for a while...

At this moment, there are 3 patterns to fix this issue.

  1. Simple fix, but needs API change (1st comment)
    - const geomPtr = geos.GEOSWKTReader_read(reader, wkt)
    + const size = wkt.length + 1
    + const wktPtr = geos.Module._malloc(size)
    + geos.Module.stringToUTF8(wkt, wktPtr, size)
    + const geomPtr = geos.GEOSWKTReader_read(reader, wktPtr)
    + geos.Module._free(wktPtr)
  2. Keep API, but only in non-re function (2nd comment)
    const geomPtr = geos.GEOSWKTReader_read(reader, wkt)
  3. Simple fix with adding helper methods (3rd comment)
    import { geosGeomFromWkt } from 'geos-wasm/helpers'
    const geomPtr = geosGeomFromWkt(wkt, geos)

But, do you have a preference from above No.1~3 or another idea ?

From next week, I will have a time, so I am okay to create PR about this (after #28, #29 completion).

@mtrcn
Copy link

mtrcn commented Apr 3, 2024

Hello @sanak, I think the first option seems more future-proof to me but it will require you to update tests as I understand.

@chrispahm
Copy link
Owner

Hi @mtrcn, @sanak! Perfect let's move on with option 1 then!

@sanak
Copy link
Contributor Author

sanak commented Apr 3, 2024

@chrispahm @mtrcn
Thanks for confirmation!

... but it will require you to update tests as I understand.

Well, once it was done, updating tests will not be so hard. 😊

Hi @mtrcn, @sanak! Perfect let's move on with option 1 then!

Oh, yes. 😊
I will create a PR about this (at least by this weekend), so please wait for a while. 😄🙇

chrispahm added a commit that referenced this issue Apr 7, 2024
Change WKT Reader/Writer 'string' type to 'number' (#26)
@chrispahm
Copy link
Owner

Alright, just published v2.0.0 on npm! Thanks @sanak for providing the fix and everyone else for the discussion!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants