Word Counter computes its statistics with native JavaScript String methods and a small set of RegExp patterns. The character count uses text.length, which counts UTF-16 code units rather than graphemes, so an astral-plane symbol such as an emoji registers as two characters. The characters-without-spaces figure strips whitespace (anything matched by /\s/) before counting. Word counting lowercases the input, replaces non-word, non-apostrophe, non-hyphen characters with spaces, splits on /\s+/, and filters out empty strings.
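A minimal sketch of those three counts (the function and property names here are illustrative, not the app's actual identifiers):

```javascript
// Character, character-without-spaces, and word counts.
// text.length counts UTF-16 code units, so emoji like "😀" count as 2.
function getCounts(text) {
  const characters = text.length;
  const charactersNoSpaces = text.replace(/\s/g, "").length;
  const words = text
    .toLowerCase()
    .replace(/[^\w'-]/g, " ") // keep word chars, apostrophes, hyphens
    .split(/\s+/)
    .filter((w) => w.length > 0);
  return { characters, charactersNoSpaces, wordCount: words.length, words };
}
```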
The whitespace-split approach is the standard newsroom and word-processor word count and matches Microsoft Word and Google Docs within a few percent. It works for any language that separates words with whitespace — English, French, German, Spanish, Russian, and so on — but it produces incorrect counts for Chinese, Japanese, Korean, and Thai, which write words contiguously without spaces. For accurate CJK word boundaries you would need Intl.Segmenter with the 'word' granularity, which is more expensive and currently not used here for performance reasons.
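For comparison, a segmenter-based count might look like the sketch below (not what the app does; Intl.Segmenter is available in modern browsers and Node 16+, and its CJK segmentation quality depends on the runtime's ICU data):

```javascript
// Word count via Intl.Segmenter with 'word' granularity.
// Unlike whitespace splitting, this can find word boundaries in
// languages written without spaces, such as Japanese or Thai.
function countWordsSegmenter(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  let count = 0;
  for (const seg of segmenter.segment(text)) {
    if (seg.isWordLike) count++; // skip punctuation and whitespace segments
  }
  return count;
}
```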
Sentence detection splits on /[.!?]+/ and filters empty fragments. This is fast but naive: 'Mr. Smith arrived at 3 p.m.' counts as three sentences ("Mr", "Smith arrived at 3 p", "m") because of the abbreviation periods. If your text has many abbreviations, the sentence count and the average words-per-sentence statistic will be skewed. Paragraph detection uses /\n\s*\n/ to split on blank lines, which matches the convention used by Markdown and most word processors.
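Both splits fit in a few lines (again a sketch with made-up names, mirroring the regexes described above):

```javascript
// Naive sentence and paragraph detection.
function getStructure(text) {
  // Split on runs of terminal punctuation; abbreviations inflate this count.
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  // Split on blank lines (one or more newlines separated by optional spaces).
  const paragraphs = text.split(/\n\s*\n/).filter((p) => p.trim().length > 0);
  return { sentenceCount: sentences.length, paragraphCount: paragraphs.length };
}
```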
Keyword density excludes words on a built-in stopword list (the most common articles, conjunctions, prepositions, and pronouns) and considers only words of three or more characters. Frequency is counted in a JavaScript Map keyed by the lowercased word. The top five keywords are selected by sorting the entries by count and slicing — O(n log n) over the unique-word count, which is fast even for tens of thousands of words.
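In outline (the stopword set below is an illustrative subset, not the app's full list):

```javascript
// Illustrative stopword subset; the real list is larger.
const STOPWORDS = new Set([
  "the", "and", "for", "are", "but", "not", "you", "all",
  "with", "his", "her", "was", "this", "that", "have", "from",
]);

// Top-n keywords: Map frequency count over non-stopwords of length >= 3,
// then sort unique entries by count — O(u log u) for u unique words.
function topKeywords(words, n = 5) {
  const freq = new Map();
  for (const w of words) {
    if (w.length >= 3 && !STOPWORDS.has(w)) {
      freq.set(w, (freq.get(w) || 0) + 1);
    }
  }
  return [...freq.entries()].sort((a, b) => b[1] - a[1]).slice(0, n);
}
```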
Reading time uses 200 words per minute (the conservative end of the adult silent-reading range; the academic average is closer to 238 wpm but 200 leaves a buffer for unfamiliar material). Speaking time uses 130 wpm, which matches the typical conversational pace and is slightly slower than a presentation pace of 150 wpm. Both numbers round up so partial minutes show as the next whole minute.
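The arithmetic is a one-liner per figure; Math.ceil provides the round-up behavior described above:

```javascript
const READING_WPM = 200;  // conservative silent-reading pace
const SPEAKING_WPM = 130; // conversational speaking pace

// Whole minutes, rounded up so partial minutes show as the next minute.
function getTimes(wordCount) {
  return {
    readingMinutes: Math.ceil(wordCount / READING_WPM),
    speakingMinutes: Math.ceil(wordCount / SPEAKING_WPM),
  };
}
```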
Lexical diversity is the ratio of unique words to total words, presented as a percentage. A short text trivially has high diversity (every word is likely unique); a long text with a low diversity score signals repetitive vocabulary. This is a rough version of the Type-Token Ratio used in linguistics and is most informative when comparing two texts of similar length rather than as an absolute measure.
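The ratio falls out of a Set over the word list (a sketch; the rounding behavior is my assumption):

```javascript
// Lexical diversity (rough Type-Token Ratio): unique words / total words,
// expressed as a whole-number percentage.
function lexicalDiversity(words) {
  if (words.length === 0) return 0;
  const unique = new Set(words).size;
  return Math.round((unique / words.length) * 100);
}
```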
All statistics recompute via a useMemo hook on every text change, so the panel updates as you type. Each pass is O(n) over the input, which means even multi-thousand-word essays stay interactive. Settings, including the live tab, persist via localStorage. No fetch call is made and no analytics about the input are sent — the page operates entirely client-side.
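Stripped of React, the recompute-on-change pattern that useMemo provides can be sketched as a plain closure that caches the last result and only reruns the O(n) pass when the text actually changes:

```javascript
// Plain-JS analogue of useMemo(() => computeStats(text), [text]):
// recompute only when the dependency (the text) changes.
function makeStatsMemo(computeStats) {
  let lastText = null;
  let lastStats = null;
  return (text) => {
    if (text !== lastText) {
      lastText = text;
      lastStats = computeStats(text); // single O(n) pass over the input
    }
    return lastStats;
  };
}
```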