efficient way to encrypt binary Uint8Array data #291

Closed
rbuckheit opened this issue Sep 2, 2015 · 3 comments
Comments

@rbuckheit

Howdy!

I'm looking for an efficient way to encrypt binary Uint8Array data using forge crypto. We have some rather large binary arrays (100-300kb) and I'm currently trying to convert these into strings efficiently so that we can encrypt them using forge.

The problem we're running into is that converting to strings causes URI errors in certain scenarios. In general my impression is that this happens when two high byte values end up adjacent in the Uint8Array, producing a high UTF16 code such as 0xFFFF which is a non-character value. Here is an example test case which fails:

  it 'should convert a Uint8Array to a string', ->
    uint8  = new Uint8Array([255, 255])
    buffer = Forge.util.createBuffer(uint8)
    expect(-> buffer.toString()).not.toThrow()

  => Expected function not to throw, but it threw URIError: URI error.

My question is whether there is a better way to do this conversion, either by:

  1. directly encrypting a Uint8Array, or
  2. converting to a string using a more efficient representation than base64

Tagging @dlongley as he commented on Uint8Array support in a previous issue here:
#89

Thank you!

Ryan

@rbuckheit changed the title from "efficient way to encrypt binary Uint8Array data?" to "efficient way to encrypt binary Uint8Array data" on Sep 2, 2015
@dlongley
Member

dlongley commented Sep 2, 2015

So, just to be clear on the problem, it sounds like you want to avoid using too much memory at once. Right now, you're essentially having to duplicate what you've got in a large Uint8Array by converting it into a forge buffer prior to encryption. Is that right?

If so, you'll want to just convert your Uint8Array one small slice at a time, put that into a forge buffer, run the encryption process, and then get the result out and convert it back into a Uint8Array or do whatever else you want. You'll just have to write a bit of code to do that:

  1. Get a slice of the array
  2. Convert it to a forge buffer
  3. Pass that buffer to the update call on a forge cipher instance
  4. Use cipher.output.getBytes() and do something with the output (this will clear the underlying forge buffer to make room for the next slice)

If you have questions about what to do with the output of getBytes in order to get it into a particular format, feel free to add them in this issue and we'll try to help out.
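The four steps above can be sketched roughly as follows. This is only an illustration, not forge's own code: `encryptInChunks`, the `chunkSize` default, and the idea of passing the cipher and buffer factory in as arguments are all assumptions made here so the function has no dependencies. The caller is assumed to supply an already-started forge cipher and `forge.util.createBuffer`.

```javascript
// Sketch only. `cipher` is assumed to be a forge cipher on which start()
// has already been called; `createBuffer` is assumed to be
// forge.util.createBuffer. Both are parameters (hypothetical design) so
// that this function itself does not depend on forge.
function encryptInChunks(cipher, createBuffer, input, chunkSize) {
  chunkSize = chunkSize || 64 * 1024; // hypothetical default slice size
  var parts = [];

  for (var offset = 0; offset < input.length; offset += chunkSize) {
    // 1. Get a slice of the array (subarray is a view, not a copy).
    var slice = input.subarray(offset, offset + chunkSize);

    // 2. Convert it to a forge buffer.
    var buf = createBuffer(slice);

    // 3. Pass that buffer to update() on the cipher.
    cipher.update(buf);

    // 4. Drain the output; getBytes() also clears the underlying buffer.
    parts.push(cipher.output.getBytes());
  }

  // Flush whatever the cipher still holds (e.g. padding in a block mode).
  cipher.finish();
  parts.push(cipher.output.getBytes());

  return parts; // array of binary strings, one per drained chunk
}
```

Each entry in the returned array is a binary string (one character per byte), so it can be written out, uploaded, or converted to a Uint8Array piece by piece rather than held as one large duplicate.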

As a side note, I think your toString issue is likely a red herring. Calling buffer.toString in forge 0.6.x means "please interpret the contents of this buffer as a UTF-8 encoded string and convert it to a string of characters". You can't easily perform this conversion via JavaScript's String.fromCharCode (see: Mozilla Developer Network) for strings that contain very rarely used characters (with code points above 0xFFFF). That's why you're getting an exception. However, this sounds like it has nothing to do with your use case because you don't have a 100-300K UTF-8 encoded string of rare characters anyway. Right?

Instead, it sounds like you've got a byte array full of arbitrary binary data (that should not be interpreted as a string) that you want to encrypt.
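To make the distinction concrete, here is a small sketch in plain JavaScript (no forge) of what happens when arbitrary bytes are interpreted as UTF-8. It assumes the decode step uses the common `decodeURIComponent(escape(...))` idiom; whether forge 0.6.x's toString does exactly this is an assumption here, and the helper names are made up for illustration. The byte 0xFF never occurs in valid UTF-8, so `[255, 255]` cannot be decoded.

```javascript
// Decode a binary string (one char per byte) as UTF-8 using a common
// idiom. Assumption: this mirrors what a UTF-8 "toString" does; it is
// not taken from forge's source.
function decodeUtf8BinaryString(binary) {
  return decodeURIComponent(escape(binary));
}

// Hypothetical helper: returns the decoded text, or the name of the
// error that was thrown while decoding.
function describeDecode(bytes) {
  var binary = String.fromCharCode.apply(null, bytes);
  try {
    return decodeUtf8BinaryString(binary);
  } catch (e) {
    return e.name;
  }
}

// Valid UTF-8: 0xC3 0xA9 encodes the single character U+00E9.
var ok = describeDecode(new Uint8Array([0xc3, 0xa9]));

// Not valid UTF-8: decoding throws instead of returning a string.
var bad = describeDecode(new Uint8Array([255, 255]));
```

Here `ok` holds a one-character string and `bad` holds the name of the thrown error, which matches the URIError in the failing test case.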

Forge 0.6.x was written prior to TypedArray support in browsers, so its buffers use a string to internally represent arrays of bytes (this is the same format that node.js uses with its binary string encoding). Note that this has nothing to do with representing characters. Unfortunately, this is a common misunderstanding with forge -- one that we're aiming to entirely avoid in forge 0.7.x (WIP). Until then, if it helps, don't think of forge as using "strings", but rather as using some "forge buffer" thing that holds binary data for you somehow. Unfortunately, this abstraction is too leaky in forge 0.6.x and you sometimes have to deal with it.
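As a rough sketch of that internal representation (an illustration, not forge's code): a "binary string" holds one byte per character, with every char code in the range 0 to 255. The helper names below are hypothetical. The second helper is one way to turn the output of getBytes back into a Uint8Array.

```javascript
// Uint8Array -> binary string (one character per byte, codes 0..255).
// Converts in pieces because passing a very large array to
// Function.prototype.apply can exceed the engine's argument limit.
function bytesToBinaryString(bytes) {
  var PIECE = 0x8000; // hypothetical piece size
  var out = '';
  for (var i = 0; i < bytes.length; i += PIECE) {
    out += String.fromCharCode.apply(null, bytes.subarray(i, i + PIECE));
  }
  return out;
}

// Binary string -> Uint8Array. No character decoding is involved:
// each char code is simply stored as one byte.
function binaryStringToBytes(str) {
  var bytes = new Uint8Array(str.length);
  for (var i = 0; i < str.length; i++) {
    bytes[i] = str.charCodeAt(i) & 0xff;
  }
  return bytes;
}
```

Note that `[255, 255]` round-trips through these helpers without any error, because nothing here tries to interpret the bytes as text.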

Anyway, all you need to do is get your data into a forge buffer so you can encrypt it. Don't try to convert a forge buffer to a string. If you're calling "toString" on a forge buffer somewhere in your code, that's probably the issue -- don't do that.

@rbuckheit
Author

@dlongley Thanks. I did misunderstand the character representation issue and based on your description I think that you are correct in that my toString() issue was a red herring. I will write up the code to process the input in chunks as you describe and get back to you - I think it will solve the issue.

Out of curiosity, is the plan for 0.7.x to support using Uint8Array in forge?

Thank you for your help!

@dlongley
Member

dlongley commented Sep 3, 2015

@rbuckheit,

> Thank you for your help!

Sure!

> Out of curiosity, is the plan for 0.7.x to support using Uint8Array in forge?

Yes, though you may still need to wrap a Uint8Array in a forge buffer for certain operations. Doing so will not result in a data copy, however; the wrapper would just be an API adapter to keep things consistent.
