JavaScript: string to bytes

This guide shows several ways to convert a JavaScript string to bytes. Which one fits depends on your use case and performance needs.

Understanding character encoding

Before converting strings to bytes, you need to understand character encoding. JavaScript strings are sequences of UTF-16 code units. Each Unicode code point is stored as one 16-bit code unit, or as two code units (a surrogate pair) when the code point lies above U+FFFF.
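To make the distinction concrete, here is a small sketch that measures the same string three ways: in UTF-16 code units, in Unicode code points, and in UTF-8 bytes. The sample string and variable names are illustrative only.

```javascript
// Sample string: ASCII "a", "é" (U+00E9), and an emoji outside the BMP (U+1F600).
const sample = "a\u00e9\u{1F600}";

// .length counts UTF-16 code units; the emoji takes two (a surrogate pair).
const codeUnits = sample.length;

// Spreading a string iterates by code point, so the emoji counts once.
const codePoints = [...sample].length;

// UTF-8 uses 1 byte for "a", 2 for "é", and 4 for the emoji.
const utf8Length = new TextEncoder().encode(sample).length;

console.log(codeUnits, codePoints, utf8Length); // 4 3 7
```

The three numbers differ, which is why a byte count can never be inferred from `.length` alone.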

The TextEncoder API

The most straightforward method to convert a string to bytes in modern JavaScript environments is using the TextEncoder API.

const textEncoder = new TextEncoder();
const uint8Array = textEncoder.encode("Your string here");

This method encodes the string into a Uint8Array as UTF-8 bytes.
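As a quick check, the following sketch encodes a string containing a non-ASCII character and decodes it again with TextDecoder, the counterpart API. The sample string is an arbitrary choice for illustration.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder(); // defaults to UTF-8

const original = "h\u00e9llo"; // "héllo"
const encoded = encoder.encode(original);

// "é" (U+00E9) becomes the two bytes 0xC3 0xA9 (195, 169).
console.log(Array.from(encoded)); // [104, 195, 169, 108, 108, 111]

// Decoding the bytes recovers the original string.
const decoded = decoder.decode(encoded);
console.log(decoded === original); // true
```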

Manual encoding

If TextEncoder is not available or you need a custom solution, you can manually encode a string.

UTF-16 to bytes

JavaScript strings are encoded as UTF-16 by default. The function below writes each 16-bit code unit as two bytes, low byte first (little-endian).

function stringToUtf16Bytes(str) {
  const bytes = [];
  for (let i = 0; i < str.length; i++) {
    const codeUnit = str.charCodeAt(i);
    // Low byte first, then high byte (little-endian).
    bytes.push(codeUnit & 0xff, codeUnit >> 8);
  }
  return bytes;
}

const bytes = stringToUtf16Bytes("Your string here");
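The function above emits the low byte of each code unit first, which is little-endian order. If the consumer expects a specific byte order, one option is to write through a DataView, which takes the endianness as an argument. The helper name `stringToUtf16Buffer` and its `littleEndian` parameter are hypothetical, chosen for this sketch.

```javascript
// Hypothetical helper: encode a string as UTF-16 with an explicit byte order.
function stringToUtf16Buffer(str, littleEndian = true) {
  const buffer = new ArrayBuffer(str.length * 2); // 2 bytes per code unit
  const view = new DataView(buffer);
  for (let i = 0; i < str.length; i++) {
    view.setUint16(i * 2, str.charCodeAt(i), littleEndian);
  }
  return new Uint8Array(buffer);
}

const le = stringToUtf16Buffer("A\u20ac");        // "A€", little-endian
const be = stringToUtf16Buffer("A\u20ac", false); // "A€", big-endian
console.log(Array.from(le)); // [65, 0, 172, 32]
console.log(Array.from(be)); // [0, 65, 32, 172]
```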

UTF-8 to bytes manually

Converting to UTF-8 manually requires more work because you need to handle multi-byte characters properly.

function stringToUtf8Bytes(str) {
  const bytes = [];
  for (let i = 0; i < str.length; i++) {
    let codePoint = str.codePointAt(i);
    // A lone surrogate cannot be encoded in UTF-8; substitute U+FFFD,
    // which is also what TextEncoder does.
    if (codePoint >= 0xd800 && codePoint <= 0xdfff) {
      codePoint = 0xfffd;
    }
    if (codePoint < 0x80) {
      // 1 byte: 0xxxxxxx
      bytes.push(codePoint);
    } else if (codePoint < 0x800) {
      // 2 bytes: 110xxxxx 10xxxxxx
      bytes.push(0xc0 | (codePoint >> 6), 0x80 | (codePoint & 0x3f));
    } else if (codePoint < 0x10000) {
      // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      bytes.push(
        0xe0 | (codePoint >> 12),
        0x80 | ((codePoint >> 6) & 0x3f),
        0x80 | (codePoint & 0x3f)
      );
    } else {
      // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
      i++; // skip the low surrogate, since this code point used two code units
      bytes.push(
        0xf0 | (codePoint >> 18),
        0x80 | ((codePoint >> 12) & 0x3f),
        0x80 | ((codePoint >> 6) & 0x3f),
        0x80 | (codePoint & 0x3f)
      );
    }
  }
  return bytes;
}

const utf8Bytes = stringToUtf8Bytes("Your string here");

Handling binary data with ArrayBuffer

Sometimes you need to work with ArrayBuffer directly, particularly when dealing with binary file formats or network protocols.

Converting string to ArrayBuffer

function stringToArrayBuffer(str) {
  const buffer = new ArrayBuffer(str.length * 2); // 2 bytes for each char
  const bufferView = new Uint16Array(buffer);
  for (let i = 0, length = str.length; i < length; i++) {
    bufferView[i] = str.charCodeAt(i);
  }
  return buffer;
}

const arrayBuffer = stringToArrayBuffer("Your string here");

Converting string to ArrayBuffer with UTF-8

function stringToUtf8ArrayBuffer(str) {
  const uint8Array = new TextEncoder().encode(str);
  return uint8Array.buffer;
}

const utf8ArrayBuffer = stringToUtf8ArrayBuffer("Your string here");

Base64 encoding and byte arrays

When transferring data in a text-only format such as JSON, which cannot carry raw binary, you can encode the byte array as Base64.

function toBase64(byteArray) {
  let binaryString = '';
  byteArray.forEach((byte) => {
    binaryString += String.fromCharCode(byte);
  });
  // btoa is a global in browsers and in recent Node.js versions.
  return btoa(binaryString);
}

const base64String = toBase64(new Uint8Array([/* Your byte array here */]));
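The receiving side usually needs the reverse step. The sketch below decodes a Base64 string back into a Uint8Array using `atob`, assuming an environment where `atob` is a global (browsers and recent Node.js versions). The name `fromBase64` is hypothetical and not part of any standard API.

```javascript
// Hypothetical inverse helper: Base64 string -> Uint8Array.
function fromBase64(base64String) {
  // atob returns a "binary string" in which each character holds one byte.
  const binaryString = atob(base64String);
  const bytes = new Uint8Array(binaryString.length);
  for (let i = 0; i < binaryString.length; i++) {
    bytes[i] = binaryString.charCodeAt(i);
  }
  return bytes;
}

const roundTripped = fromBase64("aGk="); // Base64 for the ASCII bytes of "hi"
console.log(Array.from(roundTripped)); // [104, 105]
```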

Handling large or complex strings

For large or complex strings, consider streaming the conversion or handling it in chunks to avoid blocking the main thread.

Streaming with ReadableStream

async function streamStringToBytes(str) {
  const stream = new ReadableStream({
    start(controller) {
      const encoder = new TextEncoder();
      controller.enqueue(encoder.encode(str));
      controller.close();
    },
  });

  const reader = stream.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Handle the value, an instance of Uint8Array
  }
}

// Call the function with an async context
streamStringToBytes("Your large or complex string here");

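Reading from a stream lets the consumer process the bytes chunk by chunk and yield to the event loop between reads, which helps keep the UI responsive. Note that the example above enqueues the whole encoded string as a single chunk. To get the benefit for large inputs, enqueue the string in smaller pieces.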
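One way to produce smaller pieces is to slice the string before encoding. The sketch below is a synchronous generator that yields UTF-8 chunks and takes care not to split a surrogate pair across two slices. The name `encodeInChunks` and the default chunk size are assumptions made for illustration. A consumer could await a timer or an animation frame between iterations to give the event loop room to run.

```javascript
// Hypothetical helper: yields UTF-8 chunks of roughly `chunkSize` code units.
function* encodeInChunks(str, chunkSize = 1024) {
  const encoder = new TextEncoder();
  let start = 0;
  while (start < str.length) {
    let end = Math.min(start + chunkSize, str.length);
    // If the slice would end on a high surrogate, extend it by one code unit
    // so the surrogate pair stays in the same chunk.
    const last = str.charCodeAt(end - 1);
    if (end < str.length && last >= 0xd800 && last <= 0xdbff) {
      end++;
    }
    yield encoder.encode(str.slice(start, end));
    start = end;
  }
}

let total = 0;
let chunkCount = 0;
for (const chunk of encodeInChunks("ab\u{1F600}cd", 3)) {
  total += chunk.length;
  chunkCount++;
}
console.log(chunkCount, total); // 2 8
```

Here the first slice grows from three to four code units so that the emoji is encoded whole, and the total byte count matches encoding the string in one call.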

By employing these methods, you can convert JavaScript strings into byte representations effectively, taking into account both the encoding specifics and the context in which the bytes will be used.
