6.8.2 String representations

Forth commonly represents strings as a cell pair c-addr u on the stack; u is the length of the string in bytes (aka chars), and c-addr is the address of the first byte of the string. Note that a code point may be represented by a sequence of several chars in the string (and a Unicode “abstract character” may consist of several code points). See String words.
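As a minimal sketch of the c-addr u representation, the following defines a word that leaves a string on the stack and then inspects it. The name greeting is hypothetical and chosen only for illustration; the other words are standard Forth.

```forth
\ greeting is a hypothetical example word, not part of Gforth.
: greeting ( -- c-addr u ) s" hello, world" ;

greeting type cr            \ prints: hello, world
greeting nip . cr           \ u, the length in bytes: 12
greeting 7 /string type cr  \ drop the first 7 chars, prints: world
```

Because the length travels on the stack rather than in memory, words such as /string can produce substrings just by adjusting c-addr and u, without copying.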

Another string representation is used by the string library, whose word names contain $. It represents the string on the stack through the address of a cell-sized string handle, which can be located in, e.g., a variable. See $tring words.
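The handle-based representation might be used as in the following sketch. It assumes a recent Gforth in which $!, $+!, $@, $@len and $free are available by default; the variable name name$ is hypothetical.

```forth
\ name$ is a hypothetical example name.
variable name$           \ cell-sized string handle; 0 means empty string

s" hello" name$ $!       \ store a copy of the string in the handle
s" , world" name$ $+!    \ append; the storage grows as needed
name$ $@ type cr         \ fetch c-addr u and print: hello, world
name$ $@len . cr         \ current length in bytes: 12
name$ $free              \ release the storage; the handle is 0 again
```

Only the address of the handle is passed to these words, so the string can be moved or resized behind it without invalidating what the caller holds.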

Counted strings are a legacy string representation; they are represented on the stack by a single c-addr. The char addressed by c-addr contains the character count, n, of the string, and the string occupies the subsequent n char addresses in memory. Because the count is stored in a single char, counted strings are limited to 255 bytes in length. While counted strings may look attractive because they need only one stack item, we recommend avoiding them because of their limitations, especially as input parameters of words. See Counted string words.
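For comparison, here is a small sketch of a counted string and its conversion to the c-addr u form with the standard word count. The name counted-greeting is hypothetical; c" is used inside a colon definition because the standard defines it only for compilation.

```forth
\ counted-greeting is a hypothetical example word.
: counted-greeting ( -- c-addr ) c" hello" ;

counted-greeting c@ . cr         \ the count char: 5
counted-greeting count type cr   \ count gives c-addr+1 u, prints: hello
```

Converting with count at the boundary and working with c-addr u from there on is one way to deal with legacy interfaces that still produce counted strings.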