6.29.1 Definitions in assembly language

Gforth provides ways to implement words in assembly language (using abi-code...end-code), and also ways to define defining words with arbitrary run-time behaviour (like does>), where (unlike does>) the behaviour is not defined in Forth, but in assembly language (with ;code).

However, the machine-independent nature of Gforth poses a few problems: First of all, Gforth runs on several architectures, so it can provide no standard assembler. It does provide assemblers for several of the architectures it runs on, though. Moreover, you can use a system-independent assembler in Gforth, or compile machine code directly with , and c,.

Another problem is that the virtual machine registers of Gforth (the stack pointers and the virtual machine instruction pointer) depend on the installation and engine, as does the set of registers that are free to use. So any code written to run in the context of the Gforth virtual machine is essentially limited to the installation and engine it was developed for (it may run elsewhere, but you cannot rely on that).

Fortunately, you can define abi-code words in Gforth that are portable to any Gforth running on a platform with the same calling convention (ABI). Typically this means portability within the same architecture/OS combination, sometimes even across OS boundaries.

assembler ( ) tools-ext “assembler”

A vocabulary: Replaces the wordlist at the top of the search order with the assembler wordlist.

init-asm ( ) gforth-0.2 “init-asm”

Pushes the assembler wordlist on the search order.

abi-code ( "name" – colon-sys  ) gforth-1.0 “abi-code”

Start a native code definition that is called using the platform’s ABI conventions corresponding to the C-prototype:

Cell *function(Cell *sp, Float **fpp);

The FP stack pointer is passed in by providing a reference to a memory location containing the FP stack pointer and is passed out by storing the changed FP stack pointer there (if necessary).

;abi-code ( ) gforth-1.0 “semicolon-abi-code”

Ends the colon definition, but at run-time also changes the last defined word X (which must be a created word) to call the following native code using the platform’s ABI convention corresponding to the C prototype:

Cell *function(Cell *sp, Float **fpp, Address body);

The FP stack pointer is passed in by providing a reference to a memory location containing the FP stack pointer and is passed out by storing the changed FP stack pointer there (if necessary). The parameter body is the body of X.
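The three-argument prototype can be read as an ordinary C function. The following is only a sketch of what such a routine might look like if written in C rather than assembly: the typedefs for Cell, Float and Address are assumptions made for illustration (Gforth's own headers define the real ones), and push_body_cell is a hypothetical name. It models a constant-like word ( -- n ) that pushes the first cell of X's body, relying on the fact that the data stack grows towards lower addresses.

```c
#include <stdint.h>

typedef intptr_t Cell;   /* assumption: a cell is one machine word */
typedef double   Float;  /* assumption: FP stack items are doubles */
typedef char    *Address;/* assumption: a plain byte address */

/* Hypothetical run-time routine for a created word X: ( -- n )
   pushes the cell stored at the start of X's body. */
Cell *push_body_cell(Cell *sp, Float **fpp, Address body)
{
    (void)fpp;             /* FP stack untouched: nothing to store back */
    sp -= 1;               /* push: the data stack grows downwards */
    *sp = *(Cell *)body;   /* new top of stack is the first body cell */
    return sp;             /* the updated data stack pointer is the result */
}
```

Because the FP stack pointer does not change here, the routine never writes through fpp; that is the "if necessary" case in the description above.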

end-code ( colon-sys –  ) gforth-0.2 “end-code”

End a code definition. Note that you have to assemble the return from the ABI call (for abi-code) or the dispatch to the next VM instruction (for code and ;code) yourself.

code ( "name" – colon-sys  ) tools-ext “code”

Start a native code definition that runs in the context of the Gforth virtual machine (engine). Such a definition is not portable between Gforth installations, so we recommend using abi-code instead of code. You have to end a code definition with a dispatch to the next virtual machine instruction.

;code ( compilation. colon-sys1 – colon-sys2  ) tools-ext “semicolon-code”

The code after ;code becomes the behaviour of the last defined word (which must be a created word). The same caveats apply as for code, so we recommend using ;abi-code instead.

flush-icache ( c-addr u – ) gforth-0.2 “flush-icache”

Make sure that the instruction cache of the processor (if there is one) does not contain stale data for the u bytes starting at c-addr. END-CODE performs a flush-icache automatically. Caveat: flush-icache might not work on your installation; this is usually the case if direct threading is not supported on your machine (take a look at your machine.h) and your machine has a separate instruction cache. In such cases, flush-icache does nothing instead of flushing the instruction cache.

If flush-icache does not work correctly, abi-code words etc. will not work (reliably), either.

The typical usage of these words can be shown most easily by analogy to the equivalent high-level defining words:

: foo                              abi-code foo
   <high-level Forth words>           <assembler>
;                                  end-code
                                
: bar                              : bar
   <high-level Forth words>           <high-level Forth words>
   CREATE                             CREATE
      <high-level Forth words>           <high-level Forth words>
   DOES>                              ;code
      <high-level Forth words>           <assembler>
;                                  end-code

For using abi-code, take a look at the ABI documentation of your platform to see how the parameters are passed (so you know where you get the stack pointers) and how the return value is passed (so you know where the data stack pointer is returned). The ABI documentation also tells you which registers are saved by the caller (caller-saved), so you are free to destroy them in your code, and which registers have to be preserved by the called word (callee-saved), so you have to save them before using them, and restore them afterwards. For some architectures and OSs we give short summaries of the relevant parts of the calling convention in the appropriate sections. More reverse-engineering oriented people can also find out how the stack pointers are passed and returned by typing see abi-call.

Most ABIs pass the parameters through registers, but some (in particular the most common 386 (aka IA-32) calling conventions) pass them on the architectural stack. The common ABIs all pass the return value in a register.

Another thing you need to know for using abi-code is that both the data and the FP stack grow downwards (towards lower addresses) in Gforth, with 1 cells size per cell, and 1 floats size per FP value.
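To make the stack layout concrete, here is a small C sketch of a word that touches both stacks. It is an illustration under stated assumptions, not Gforth source: Cell and Float are assumed typedefs, and cell_to_float is a hypothetical word with the stack effect ( n -- ) ( F: -- r ). Popping moves a pointer up by one item, pushing moves it down by one item, and the changed FP stack pointer is stored back through the reference.

```c
#include <stdint.h>

typedef intptr_t Cell;   /* assumption: a cell is one machine word */
typedef double   Float;  /* assumption: FP stack items are doubles */

/* Hypothetical word ( n -- ) ( F: -- r ): moves a cell to the FP stack. */
Cell *cell_to_float(Cell *sp, Float **fpp)
{
    Float *fp = *fpp;    /* fetch the FP stack pointer via the reference */
    Cell n = sp[0];      /* top of the data stack */
    sp += 1;             /* pop: one cell towards higher addresses */
    fp -= 1;             /* push: one float towards lower addresses */
    fp[0] = (Float)n;    /* new top of the FP stack */
    *fpp = fp;           /* pass the changed FP stack pointer back out */
    return sp;           /* the data stack pointer is the return value */
}
```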

Here’s an example of using abi-code on the 386 architecture:

abi-code my+ ( n1 n2 -- n )
4 sp d) ax mov \ sp into return reg
ax )    cx mov \ tos
4 #     ax add \ update sp (pop)
cx    ax ) add \ sec = sec+tos
ret            \ return from my+
end-code
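For readers more at home in C than in 386 assembly, the same word can be sketched as a C function with the abi-code prototype. This is only a model for comparison: the typedefs are assumptions, my_plus is a hypothetical name, and the comments indicate which assembler line each statement roughly corresponds to.

```c
#include <stdint.h>

typedef intptr_t Cell;   /* assumption: a cell is one machine word */
typedef double   Float;  /* assumption: FP stack items are doubles */

/* C model of my+ ( n1 n2 -- n ). */
Cell *my_plus(Cell *sp, Float **fpp)
{
    (void)fpp;                     /* FP stack is not used */
    Cell tos = sp[0];              /* ax ) cx mov   : fetch n2 */
    sp += 1;                       /* 4 # ax add    : pop one cell */
    /* cx ax ) add : add n2 into n1 in place; unsigned arithmetic
       so that overflow wraps like the machine add does */
    sp[0] = (Cell)((uintptr_t)sp[0] + (uintptr_t)tos);
    return sp;                     /* ret, with sp as the return value */
}
```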

An AMD64 variant of this example can be found in AMD64 (x86_64) Assembler.

Here’s a 386 example that deals with FP values:

abi-code my-f+ ( r1 r2 -- r )
8 sp d) cx mov  \ load address of fp
cx )    dx mov  \ load fp
.fl dx )   fld  \ r2
8 #     dx add  \ update fp
.fl dx )   fadd \ r1+r2
.fl dx )   fstp \ store r
dx    cx ) mov  \ store new fp
4 sp d) ax mov  \ sp into return reg
ret             \ return from my-f+
end-code
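The FP example can likewise be modelled in C. Again this is a sketch under assumptions (Float taken to be an 8-byte double, matching the 8 # dx add above; my_f_plus is a hypothetical name). It shows the part that is easy to overlook in the assembly: the new FP stack pointer is written back through fpp, while the data stack pointer is returned unchanged.

```c
#include <stdint.h>

typedef intptr_t Cell;   /* assumption: a cell is one machine word */
typedef double   Float;  /* assumption: FP stack items are 8-byte doubles */

/* C model of my-f+ ( r1 r2 -- r ) on the FP stack. */
Cell *my_f_plus(Cell *sp, Float **fpp)
{
    Float *fp = *fpp;      /* cx ) dx mov : load fp */
    Float r2 = fp[0];      /* fld         : r2 is the top FP item */
    fp += 1;               /* 8 # dx add  : pop one float */
    fp[0] = fp[0] + r2;    /* fadd, fstp  : r1+r2 replaces r1 */
    *fpp = fp;             /* dx cx ) mov : store new fp */
    return sp;             /* data stack pointer returned unchanged */
}
```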