Overview [![Build Status](https://travis-ci.org/lydell/js-tokens.svg?branch=master)](https://travis-ci.org/lydell/js-tokens)
========

A regex that tokenizes JavaScript.

```js
var jsTokens = require("js-tokens").default

var jsString = "var foo=opts.foo;\n..."

jsString.match(jsTokens)
// ["var", " ", "foo", "=", "opts", ".", "foo", ";", "\n", ...]
```


Installation
============

`npm install js-tokens`

```js
import jsTokens from "js-tokens"
// or:
var jsTokens = require("js-tokens").default
```


Usage
=====

### `jsTokens` ###

A regex with the `g` flag that matches JavaScript tokens.

The regex _always_ matches, even on invalid JavaScript and on the empty string.

Each match starts directly where the previous one ended.
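
As a rough illustration of these guarantees, here is a minimal sketch that runs without installing anything. It uses a deliberately tiny stand-in regex (`toyTokens`, a hypothetical name, not the real `jsTokens` pattern); with the real package you would use `jsTokens` in its place.

```js
// Hypothetical stand-in for `jsTokens`: words, whitespace, the empty
// string, or any single character. This is NOT the real js-tokens pattern.
var toyTokens = /\w+|\s+|^$|[\s\S]/g

function tokenize(string) {
  // With the `g` flag, `match` returns every match in order.
  return string.match(toyTokens)
}

var input = "var foo=opts.foo;\n"
var tokens = tokenize(input)
// ["var", " ", "foo", "=", "opts", ".", "foo", ";", "\n"]

// Each match starts where the previous one ended, so joining the matches
// gives back the original input.
var roundTrip = tokens.join("") === input
// true

// The empty string still produces one (empty) match.
var emptyTokens = tokenize("")
// [""]
```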

### `var token = matchToToken(match)` ###

```js
import {matchToToken} from "js-tokens"
// or:
var matchToToken = require("js-tokens").matchToToken
```

Takes a `match` returned by `jsTokens.exec(string)`, and returns a `{type:
String, value: String}` object. The following types are available:

- string
- comment
- regex
- number
- name
- punctuator
- whitespace
- invalid

Multi-line comments and strings also have a `closed` property indicating if the
token was closed or not (see below).
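
To make the idea concrete, the following sketch shows how a match can be turned into a `{type, value, closed}` object by checking which capturing group matched. Everything here (`toyTokens`, `toyMatchToToken`, `toyTokenize` and the group layout) is a simplified assumption for illustration, not the actual js-tokens source, and it only knows a few token types.

```js
// Hypothetical mini-tokenizer with one capturing group per token type:
// 1: double-quoted string, 2: its closing quote (if present),
// 3: multi-line comment,   4: its closing `*/` (if present),
// 5: name, 6: whitespace, 7: anything else (invalid).
var toyTokens = /("(?:[^"\\\n]|\\.)*(")?)|(\/\*(?:[^*]|\*(?!\/))*(\*\/)?)|([A-Za-z_$][\w$]*)|(\s+)|(^$|[\s\S])/g

function toyMatchToToken(match) {
  var token = {type: "invalid", value: match[0], closed: undefined}
  if (match[1]) { token.type = "string"; token.closed = Boolean(match[2]) }
  else if (match[3]) { token.type = "comment"; token.closed = Boolean(match[4]) }
  else if (match[5]) token.type = "name"
  else if (match[6]) token.type = "whitespace"
  return token
}

function toyTokenize(string) {
  return Array.from(string.matchAll(toyTokens), toyMatchToToken)
}

var closed = toyTokenize('x "done" /* ok */')
// types: name, whitespace, string, whitespace, comment (both closed)

var open = toyTokenize('"oops\n/* never closed')
// the string ends at the newline and the comment at end of input;
// both have `closed: false`
```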

Comments and strings both come in several flavors. To distinguish them, check if
the token starts with `//`, `/*`, `'`, `"` or `` ` ``.
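
A small helper along these lines could do that check. `tokenFlavor` and the labels it returns are made up for this sketch; it only relies on the `{type, value}` shape described above.

```js
// Hypothetical helper: classify a `{type, value}` token by how it starts.
function tokenFlavor(token) {
  var value = token.value
  if (token.type === "comment") {
    return value.startsWith("//") ? "single-line comment" : "multi-line comment"
  }
  if (token.type === "string") {
    if (value.startsWith("'")) return "single-quoted string"
    if (value.startsWith('"')) return "double-quoted string"
    if (value.startsWith("`")) return "template string"
  }
  // Other token types have no flavors to tell apart.
  return token.type
}

tokenFlavor({type: "comment", value: "/* note */"})
// "multi-line comment"
tokenFlavor({type: "string", value: "`hi ${name}`"})
// "template string"
```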

Names are ECMAScript IdentifierNames, that is, including both identifiers and
keywords. You may use [is-keyword-js] to tell them apart.

Whitespace includes both line terminators and other whitespace.

[is-keyword-js]: https://github.com/crissdev/is-keyword-js


ECMAScript support
==================

The intention is to always support the latest ECMAScript version whose feature
set has been finalized.

If adding support for a newer version requires changes, a new version with a
major version bump will be released.

Currently, ECMAScript 2018 is supported.


Invalid code handling
=====================

Unterminated strings are still matched as strings. JavaScript strings cannot
contain (unescaped) newlines, so unterminated strings simply end at the end of
the line. Unterminated template strings can contain unescaped newlines, though,
so they go on to the end of input.
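
The behavior can be reproduced with two simplified patterns. These are sketches written for this example (the names `toyString`, `toyTemplate` and `firstMatch` are assumptions, and the template pattern ignores interpolation entirely); the point is only that the closing quote is optional and that `.` stops at a newline.

```js
// Quoted string: group 1 is the opening quote, group 2 the closing quote.
// `.` does not match a newline, so an unterminated string stops there.
var toyString = /(['"])(?:(?!\1|\\).|\\(?:\r\n|[\s\S]))*(\1)?/

// Template string (no interpolation handling): `[^`\\]` does match
// newlines, so an unterminated template runs to the end of the input.
var toyTemplate = /`(?:[^`\\]|\\[\s\S])*(`)?/

function firstMatch(regex, closingGroup, source) {
  var match = regex.exec(source)
  return match && {value: match[0], closed: Boolean(match[closingGroup])}
}

var terminated = firstMatch(toyString, 2, '"abc" + x')
// {value: '"abc"', closed: true}

var unterminated = firstMatch(toyString, 2, '"abc\nvar x = 1')
// {value: '"abc', closed: false}

var openTemplate = firstMatch(toyTemplate, 1, "`abc\ndef")
// {value: "`abc\ndef", closed: false}
```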

Unterminated multi-line comments are also still matched as comments. They
simply go on to the end of the input.

Unterminated regex literals are likely matched as division and whatever is
inside the regex.

Invalid ASCII characters have their own capturing group.

Invalid non-ASCII characters are treated as names, to simplify the matching of
names (except Unicode spaces, which are treated as whitespace). Note: See also
the [ES2018](#es2018) section.

Regex literals may contain invalid regex syntax. They are still matched as
regex literals. Repeated regex flags are also accepted, which keeps the
`jsTokens` regex simple.

Strings may contain invalid escape sequences.


Limitations
===========

Tokenizing JavaScript using regexes (in fact, _one single regex_) won’t be
perfect. But that’s not the point either.

You may compare jsTokens with [esprima] by using `esprima-compare.js`.
See `npm run esprima-compare`!

[esprima]: http://esprima.org/

### Template string interpolation ###

Template strings are matched as single tokens, from the starting `` ` `` to the
ending `` ` ``, including interpolations (whose tokens are not matched
individually).

Matching template string interpolations requires recursive balancing of `{` and
`}`, something that JavaScript regexes cannot do. Only one level of nesting is
supported.
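
The sketch below illustrates why nesting depth has to be hard-coded. The two patterns are hypothetical and stricter than what a tokenizer would use (they are anchored and require the closing brace); they are not taken from the js-tokens source. Each additional level of nesting would need another copy of the inner group.

```js
// Depth 0: no braces allowed inside the `${...}` interpolation.
var flat = /^\$\{[^{}]*\}$/

// Depth 1: plain characters, or one `{...}` pair that itself contains
// no braces.
var oneLevel = /^\$\{(?:[^{}]|\{[^{}]*\})*\}$/

flat.test("${a + b}")
// true
flat.test("${fn({x: 1})}")
// false: the inner `{` is not allowed at depth 0

oneLevel.test("${fn({x: 1})}")
// true
oneLevel.test("${fn({x: {y: 1}})}")
// false: two levels of nesting are beyond this pattern
```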

### Division and regex literals collision ###

Consider this example:

```js
var g = 9.82
var number = bar / 2/g

var regex = / 2/g
```

A human can easily understand that in the `number` line we’re dealing with
division, and in the `regex` line we’re dealing with a regex literal. How come?
Because humans can look at the whole code to put the `/` characters in context.
A JavaScript regex cannot. It only sees forwards. (Well, ES2018 regexes can also
look backwards. See the [ES2018](#es2018) section.)

When the `jsTokens` regex scans through the above, it will see the following
at the end of both the `number` and `regex` rows:

```js
/ 2/g
```

It is then impossible to know if that is a regex literal, or part of an
expression dealing with division.
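
Both readings are valid JavaScript, which a runnable variant of the example makes visible. The value of `bar` is an assumption added here so that the snippet can be executed.

```js
var g = 9.82
var bar = 2 * g  // assumed value, chosen so the division comes out even

// Division: parsed as (bar / 2) / g
var number = bar / 2/g
// 1

// Regex literal: source " 2" with the `g` flag
var regex = / 2/g
// regex.source is " 2" and regex.flags is "g"
```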

Here is a similar case:

```js
foo /= 2/g
foo(/= 2/g)
```

The first line divides the `foo` variable by `2/g`. The second line calls the
`foo` function with the regex literal `/= 2/g`. Again, since `jsTokens` only
sees forwards, it cannot tell the two cases apart.

There are some cases where we _can_ tell division and regex literals apart,
though.

First off, we have the simple cases where there’s only one slash in the line:

```js
var foo = 2/g
foo /= 2
```

Regex literals cannot contain newlines, so the above cases are correctly
identified as division. Things are only problematic when there is more than
one non-comment slash in a single line.

Secondly, not every character is a valid regex flag.

```js
var number = bar / 2/e
```

The above example is also correctly identified as division, because `e` is not a
valid regex flag. I initially wanted to future-proof by allowing `[a-zA-Z]*`
(any letter) as flags, but it is not worth it since it increases the amount of
ambiguous cases. So only the standard `g`, `m`, `i`, `y`, `u` and `s` flags are
allowed. This means that the above example will be identified as division as
long as you don’t rename the `e` variable to some permutation of `gmiyus` 1 to 6
characters long.
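
To check whether a particular variable name falls into that risky set, a helper like the following can be used. `looksLikeFlags` is a hypothetical name; the pattern simply restates the rule above (1 to 6 characters, each one of `g`, `m`, `i`, `y`, `u`, `s`, repeats allowed).

```js
// Hypothetical helper: could `name` be read as a run of regex flags?
function looksLikeFlags(name) {
  return /^[gmiyus]{1,6}$/.test(name)
}

looksLikeFlags("e")
// false: `bar / 2/e` stays division
looksLikeFlags("g")
// true: `bar / 2/g` is ambiguous
looksLikeFlags("mug")
// true: ordinary-looking names can collide too
looksLikeFlags("gmiyusg")
// false: seven characters is too long
```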

Lastly, we can look _forward_ for information.

- If the token following what looks like a regex literal is not valid after a
  regex literal, but is valid in a division expression, then the regex literal
  is treated as division instead. For example, a flagless regex cannot be
  followed by a string, number or name, but all of those three can be the
  denominator of a division.
- Generally, if what looks like a regex literal is followed by an operator, the
  regex literal is treated as division instead. This is because regexes are
  seldom used with operators (such as `+`, `*`, `&&` and `==`), but division
  could likely be part of such an expression.

Please consult the regex source and the test cases for precise information on
when regex or division is matched (should you need to know). In short, you
could sum it up as:

If the end of a statement looks like a regex literal (even if it isn’t), it
will be treated as one. Otherwise it should work as expected (if you write sane
code).

### ES2018 ###

ES2018 added some nice regex improvements to the language.

- [Unicode property escapes] should allow telling names and invalid non-ASCII
  characters apart without blowing up the regex size.
- [Lookbehind assertions] should allow telling division and regex
  literals apart in more cases.
- [Named capture groups] might simplify some things.

These things would be nice to do, but are not critical. They probably have to
wait until the oldest maintained Node.js LTS release supports those features.
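
For a feel of what these features look like, here is a speculative sketch. None of these patterns come from js-tokens; `isName`, `regexLiteral` and `assignment` are made-up names, and the lookbehind rule is a rough heuristic that still gets some code wrong. It needs a JavaScript engine with ES2018 regex support.

```js
// 1. Unicode property escapes: a name check without large character tables.
var isName = /^[\p{ID_Start}$_][\p{ID_Continue}$\u200C\u200D]*$/u

// 2. Lookbehind: only accept a regex literal when the `/` does not follow
// something that ends an expression (a word character, `$`, `)` or `]`).
// Heuristic only; it misjudges `return /x/`, for example.
var regexLiteral = /(?<![\w$)\]]\s*)\/(?![*\/])(?:[^\/\\\n]|\\.)+\/[gmiyus]*/

// 3. Named capture groups: refer to parts of a match by name.
var assignment = /(?<name>[\p{ID_Start}$_][\p{ID_Continue}$]*)\s*=/u

isName.test("föö")
// true
regexLiteral.exec("var number = bar / 2/g")
// null: both slashes follow an operand, so this is division
regexLiteral.exec("var regex = / 2/g")[0]
// "/ 2/g"
"foo = 1".match(assignment).groups.name
// "foo"
```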

[Unicode property escapes]: http://2ality.com/2017/07/regexp-unicode-property-escapes.html
[Lookbehind assertions]: http://2ality.com/2017/05/regexp-lookbehind-assertions.html
[Named capture groups]: http://2ality.com/2017/05/regexp-named-capture-groups.html


License
=======

[MIT](LICENSE).