Document which parts of the getSupplementaryPrivateUseRegExp regex refer to which unicode range
ahuth committed Mar 19, 2020
1 parent 0e3777a commit 6a90379
Showing 1 changed file with 6 additions and 5 deletions.
11 changes: 6 additions & 5 deletions lib/commons/text/unicode.js
@@ -130,10 +130,11 @@ function getPunctuationRegExp() {
  * @returns {RegExp}
  */
 function getSupplementaryPrivateUseRegExp() {
-  /**
-   * Reference: https://www.unicode.org/charts/PDF/UD800.pdf
-   * https://www.unicode.org/charts/PDF/UDC00.pdf
-   * https://www.unicode.org/charts/PDF/UF0000.pdf
-   */
+  // 1. High surrogate area (https://www.unicode.org/charts/PDF/UD800.pdf)
+  // 2. Low surrogate area (https://www.unicode.org/charts/PDF/UDC00.pdf)
+  // 3. Supplementary private use area A (https://www.unicode.org/charts/PDF/UF0000.pdf)
+  //
+  //             1              2                                  3
+  //      ┏━━━━━━┻━━━━━━┓┏━━━━━━┻━━━━━━┓ ┏━━━━━━━━━━━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
   return /[\uDB80-\uDBBF][\uDC00-\uDFFD]|(?:[\uDB80-\uDBBE][\uDC00-\uDFFF]|\uDBBF[\uDC00-\uDFFD])/g;
 }
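
To make the surrogate ranges in the new comment concrete, here is a minimal sketch of how the endpoints of Supplementary Private Use Area A (U+F0000 to U+FFFFD) map onto UTF-16 surrogate pairs, and how the pattern behaves on a string that contains one. The pattern is copied from the diff; `toSurrogatePair`, `hex`, and the variable names are hypothetical helpers introduced only for this illustration and are not part of the commit.

```javascript
// Pattern copied verbatim from the diff above.
const supplementaryPrivateUse = /[\uDB80-\uDBBF][\uDC00-\uDFFD]|(?:[\uDB80-\uDBBE][\uDC00-\uDFFF]|\uDBBF[\uDC00-\uDFFD])/g;

// Hypothetical helper: split a supplementary code point (above U+FFFF)
// into its UTF-16 high and low surrogate code units.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xd800 + (offset >> 10); // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff); // bottom 10 bits
  return [high, low];
}

// Hypothetical helper: format a code unit as upper-case hex.
function hex(n) {
  return n.toString(16).toUpperCase();
}

// First and last code points of Supplementary Private Use Area A.
const firstPair = toSurrogatePair(0xf0000).map(hex); // ['DB80', 'DC00']
const lastPair = toSurrogatePair(0xffffd).map(hex); // ['DBBF', 'DFFD']

// A private use character from area A is removed from surrounding text.
const stripped = ('a' + String.fromCodePoint(0xf0000) + 'b').replace(
  supplementaryPrivateUse,
  ''
); // 'ab'

// U+100000 (area B) encodes with high surrogate DBC0, which lies outside
// the DB80-DBBF range used by the pattern, so it is left in place.
const areaB = String.fromCodePoint(0x100000);
const areaBUntouched = areaB.replace(supplementaryPrivateUse, '') === areaB; // true

console.log(firstPair, lastPair, stripped, areaBUntouched);
```

Because the pattern has no `u` flag, it matches against individual UTF-16 code units, which is why each alternative is written as a high-surrogate class followed by a low-surrogate class.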
