js string should use Uint8Array typed arrays #483

Open
timotheecour opened this issue Dec 26, 2020 · 3 comments

timotheecour commented Dec 26, 2020

links

proc arrayTypeForElemType(typ: PType): string =
  # XXX This should also support tyEnum and tyBool
  case typ.kind
  ...
  of tyInt8: "Int8Array"
  of tyUInt8: "Uint8Array"
  of tyFloat64, tyFloat: "Float64Array"
  else: ""

note

In some benchmarks I did, though, TextEncoder.encode seems slow.
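For reference, a rough timing harness along these lines could be used to reproduce that kind of comparison. This is only a sketch: `encodeAsciiLoop` and `timeIt` are hypothetical helpers (not part of Nim's JS backend), the manual loop assumes ASCII-only input, and the numbers will vary with the engine, string length, and iteration count.

```javascript
// Hypothetical baseline: copies UTF-16 code units straight into a Uint8Array.
// Only valid for ASCII input (every code unit < 128).
function encodeAsciiLoop(str) {
  const out = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    out[i] = str.charCodeAt(i);
  }
  return out;
}

// Hypothetical helper: calls fn(arg) `iterations` times and returns elapsed milliseconds.
function timeIt(fn, arg, iterations) {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) fn(arg);
  return Date.now() - start;
}

const encoder = new TextEncoder();
const sample = "hello world ".repeat(100);

const encoderMs = timeIt((s) => encoder.encode(s), sample, 10000);
const loopMs = timeIt(encodeAsciiLoop, sample, 10000);
console.log(`TextEncoder.encode: ${encoderMs} ms, manual loop: ${loopMs} ms`);
```

A single run like this is noisy; repeating it with different string sizes would give a fairer picture.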

Ref
#156

timotheecour commented Dec 26, 2020

  • Why does cstrToNimstr differ from this implementation?

In particular, this part:

// Surrogate Pair
c = 0x10000 + ((c & 0x03FF) << 10) + (str.charCodeAt(++i) & 0x03FF);

seems different?
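To make the quoted arithmetic concrete, here is a small worked example. It is a sketch only: `combineSurrogates` is a hypothetical helper name, and the check against `codePointAt` is there just to confirm the formula on one sample character.

```javascript
// Combine a UTF-16 surrogate pair into one code point,
// using the same arithmetic as the quoted line.
function combineSurrogates(high, low) {
  return 0x10000 + ((high & 0x03FF) << 10) + (low & 0x03FF);
}

const s = "\uD83D\uDE00"; // U+1F600, stored as two UTF-16 code units
const combined = combineSurrogates(s.charCodeAt(0), s.charCodeAt(1));

console.log(combined.toString(16));         // 1f600
console.log(combined === s.codePointAt(0)); // true
```

The low 10 bits of each surrogate carry the payload: the high surrogate supplies bits 10 to 19 and the low surrogate bits 0 to 9, offset by 0x10000.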

https://github.com/google/closure-library/blob/master/closure/goog/crypt/crypt.js#L110

/**
 * Converts a JS string to a UTF-8 "byte" array.
 * @param {string} str 16-bit unicode string.
 * @return {!Array<number>} UTF-8 byte array.
 */
goog.crypt.stringToUtf8ByteArray = function(str) {
  'use strict';
  // TODO(user): Use native implementations if/when available
  var out = [], p = 0;
  for (var i = 0; i < str.length; i++) {
    var c = str.charCodeAt(i);
    if (c < 128) {
      out[p++] = c;
    } else if (c < 2048) {
      out[p++] = (c >> 6) | 192;
      out[p++] = (c & 63) | 128;
    } else if (
        ((c & 0xFC00) == 0xD800) && (i + 1) < str.length &&
        ((str.charCodeAt(i + 1) & 0xFC00) == 0xDC00)) {
      // Surrogate Pair
      c = 0x10000 + ((c & 0x03FF) << 10) + (str.charCodeAt(++i) & 0x03FF);
      out[p++] = (c >> 18) | 240;
      out[p++] = ((c >> 12) & 63) | 128;
      out[p++] = ((c >> 6) & 63) | 128;
      out[p++] = (c & 63) | 128;
    } else {
      out[p++] = (c >> 12) | 224;
      out[p++] = ((c >> 6) & 63) | 128;
      out[p++] = (c & 63) | 128;
    }
  }
  return out;
};
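As a sketch of what the same logic might look like when it targets a Uint8Array, as this issue proposes, the variant below keeps the branching of the function above but writes into a preallocated typed array. The name `stringToUtf8Bytes` and the sizing strategy are assumptions for illustration, not code from Nim or the Closure Library.

```javascript
// Sketch: same branching as the quoted function, but writing into a Uint8Array.
function stringToUtf8Bytes(str) {
  // Each UTF-16 code unit expands to at most 3 UTF-8 bytes
  // (a surrogate pair is 2 units -> 4 bytes), so 3 * length is a safe upper bound.
  const buf = new Uint8Array(str.length * 3);
  let p = 0;
  for (let i = 0; i < str.length; i++) {
    let c = str.charCodeAt(i);
    if (c < 128) {
      buf[p++] = c;
    } else if (c < 2048) {
      buf[p++] = (c >> 6) | 192;
      buf[p++] = (c & 63) | 128;
    } else if (
        (c & 0xFC00) === 0xD800 && i + 1 < str.length &&
        (str.charCodeAt(i + 1) & 0xFC00) === 0xDC00) {
      // Surrogate pair: combine into one code point, emit 4 bytes.
      c = 0x10000 + ((c & 0x03FF) << 10) + (str.charCodeAt(++i) & 0x03FF);
      buf[p++] = (c >> 18) | 240;
      buf[p++] = ((c >> 12) & 63) | 128;
      buf[p++] = ((c >> 6) & 63) | 128;
      buf[p++] = (c & 63) | 128;
    } else {
      buf[p++] = (c >> 12) | 224;
      buf[p++] = ((c >> 6) & 63) | 128;
      buf[p++] = (c & 63) | 128;
    }
  }
  return buf.slice(0, p); // copy of the bytes actually written
}

const sampleText = "a\u00E9\u20AC\uD83D\uDE00"; // 1-, 2-, 3- and 4-byte cases
const manual = stringToUtf8Bytes(sampleText);
const native = new TextEncoder().encode(sampleText);
console.log(manual.length === native.length &&
            manual.every((b, i) => b === native[i])); // true
```

The two agree on well-formed input. They differ on lone surrogates: this loop encodes them as 3-byte sequences, whereas TextEncoder replaces them with U+FFFD.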

Edit: see also this other implementation (from Emscripten): https://stackoverflow.com/a/64277403/1426932

timotheecour changed the title from "js string should use ArrayBuffer typed arrays" to "js string should use Uint8Array typed arrays" on Dec 27, 2020
@ringabout

I agree. I think cstring should use Uint8Array.

AmjadHD commented Aug 17, 2022

Will this simplify makeNimstrLit, cstrToNimstr and makeJSStr?
