How To Write A PFE Module

The Current State

Currently, modules have only been built within the pfe source tree. You have to add a few lines to the Makefile.am so that the module is built and installed. This howto will guide you through the process of creating a new module for the pfe source tree, i.e. a pfe extension module.

The Name Of The Module

The first thing you have to do, of course, is to be creative and invent a name. This name will be used on many occasions as a reference symbol and signon identifier. In this example the module is named 'example', which is creative enough here. This name is called a 'wordset-name' since it will be used as one. It can even be queried with ENVIRONMENT, and it is listed in the LOADED wordlist of pfe.

Create the File And Add It To Makefile.am

The filename shall *not* be example.c, since I am compiling pfe for some embedded/kernel targets which only need a '.o' file; just think of a Linux kernel module. Since the intermediate objects are '.o' files and the 'ld -r' product of several intermediate object files is also a '.o' file, the intermediate object files must have a different name than the product '.o' file. If that is not clear, don't think about it too long: just add "-ext" to the file stem, so that here the extension module 'example' is built from the source file 'example-ext.c'.
Have a look at the Makefile.am and its entries for toolbelt-ext.c.
You will instantly see what is to be done: add the
module name to 'pkglib_LTLIBRARIES', following the existing
toolbelt entries. That is it for Makefile.am. Now go ahead
and create the file, i.e. 'example-ext.c'.
I strongly suggest that you include a header comment
right at the start of the file. The autodoc system
of pfe treats it as a special section and includes it
in the documentation file. Explain everything that you
want to point out to anyone who would want to use your
wordset. Also include your name and copyright information.
Remember, it is easiest for you to send me the file, so
that it can be distributed along with PFE, get compiled
on many platforms, and be maintained across internal
changes in PFE. Also, this very source file is stored in
the Tek/MPT Source Repository, and you do not want some
Tekkie to simply add a Tektronix copyright in there: the
files are writable by other Tek developers too, not just me.
Next you need to include some headers from the pfe base
system. These headers are namespace clean, i.e. their symbols
all have a prefix like 'FX_' or, mostly, 'p4_'. For a working
programmer this is inconvenient, and it makes the code
less readable. If you look closer, you will see that
most headers have '#ifdef _P4_SOURCE' sections
(especially in def-types.h) that add shorter names without
the prefix. In general, most sources handwritten by users
will want to have these. It is however not a good idea if the
extension module is derived from some other source, e.g. Tek/MPT
has a SWIG extension to convert C headers to pfe modules.
Anyway, your file will most likely start with the pfe includes.
Make sure to include one of the pfe headers first, so
that the gcc register allocation can work (when --with-regs
is greater than 0). For a single wordset, you also need to
include pfe/def-words.h, but I recommend doing that last,
after all other includes, since there are a lot of
two-character #defines (if you specified _P4_SOURCE).
Now, let's have a look at a simple word, e.g. 2NIP
as implemented in toolbelt-ext.c. Please add a
javadoc-like comment before it, and make the first line
of that comment show the Forth stack notation.
Now everyone knows what that word should do. All
wordset words in PFE should then be declared with
the prototype macro FCode. On most systems,
'FCode(example)' will expand to 'void example_ (void)';
note the underscore at the end that distinguishes
the pfe symbol from other C symbols.
Write the body of the function. Inside an 'FCode'
word you can access the forth stacks and the dictionary
directly via their pointer macros, such as SP for the
parameter stack and IP for the instruction pointer.
Most of the important pointers are declared in def-types.h,
and most of the important macros to access them are
declared in def-macro.h.
The 2NIP implementation is of course a short one. We just
want to nip the third and fourth items on the parameter
stack, and just as you would expect from 'PICK', the
values in the SP-stack are called SP[0] SP[1] SP[2] SP[3],
where SP[0] is of course the top of stack. Here we copy
[0]->[2] and [1]->[3] and then decrease the stack depth
by increasing the stack pointer by 2. Remember that the
parameter stack grows towards lower addresses.
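
The copy-and-adjust steps can be tried out in a small self-contained sketch. It does not use the pfe headers: cell_t, stack_area, sp and the helper functions are assumptions made for this illustration, with a lowercase 'sp' standing in for pfe's SP macro.

```c
#include <assert.h>

typedef long cell_t;              /* assumed cell type, not pfe's own */

#define STACK_CELLS 16
static cell_t stack_area[STACK_CELLS];
/* An empty stack points one past the end; pushing decrements. */
static cell_t *sp = stack_area + STACK_CELLS;

static void push (cell_t v) { assert (sp > stack_area); *--sp = v; }
static cell_t pop (void) { assert (sp < stack_area + STACK_CELLS); return *sp++; }
static int depth (void) { return (int) ((stack_area + STACK_CELLS) - sp); }

/** 2NIP ( w x y z -- y z )
 * drop the third and fourth items, keeping the top two
 */
static void two_nip (void)
{
    assert (depth () >= 4);
    sp[2] = sp[0];     /* top of stack moves to the old third slot */
    sp[3] = sp[1];     /* second item moves to the old fourth slot */
    sp += 2;           /* the stack grows downward, so this shrinks it by two */
}

/* Push 1 2 3 4, run 2NIP, and return top*10 + second. */
int two_nip_demo (void)
{
    sp = stack_area + STACK_CELLS;
    push (1); push (2); push (3); push (4);
    two_nip ();
    assert (depth () == 2);
    cell_t top = pop ();
    cell_t second = pop ();
    return (int) (top * 10 + second);
}
```

Starting from ( 1 2 3 4 ), the demo leaves ( 3 4 ) with 4 on top, so it returns 43.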
You can then declare other such words, and finally you need
to make them known to forth. This is done by assembling all
the words in a wordset table. A wordset table is really two
C structures, where the first lists the entries and the second
gives some more information. They are always written shoulder
to shoulder, one directly after the other.
Note that P4_FXco is a macro from def-words.h that does
all the relevant things. Just give it the forth name as a
C string and the name you used in FCode. The P4_COUNTWORDS
macro takes a string, and its first part (up to the first
space) is used to identify the wordset in ENVIRONMENT
queries. It will also show up in the LOADED wordlist.
The macro (e.g. P4_FXco) defines what the symbol should
look like in forth: P4_FXco is a subroutine code reference,
i.e. a primitive. P4_IXco is the same, but immediate. There
are lots of other macros, just have a look at 'def-words.h'.
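
To make the idea of the two shoulder-to-shoulder structures concrete, here is a minimal sketch in plain C. It is not the layout pfe uses: struct word_entry, struct wordset_info and find_word are invented for this illustration, and the real tables are built with macros like P4_FXco and P4_COUNTWORDS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*code_fn) (void);

/* Hypothetical entry layout; pfe's real table has a different shape. */
struct word_entry {
    const char *name;       /* name as seen from forth */
    code_fn     code;       /* C function, as generated by an FCode-like macro */
    int         immediate;  /* nonzero for immediate words */
};

struct wordset_info {
    const struct word_entry *entries;
    size_t                   count;
    const char              *name;   /* first token identifies the wordset */
};

static int hits = 0;
static void two_nip_ (void) { hits += 1; }
static void example_ (void) { hits += 10; }

/* First structure: the list of entries. */
static const struct word_entry example_words[] = {
    { "2NIP",    two_nip_, 0 },
    { "EXAMPLE", example_, 0 },
};

/* Second structure: count and wordset name, written right after it. */
static const struct wordset_info example_wordset = {
    example_words,
    sizeof example_words / sizeof example_words[0],
    "example - a demo wordset",
};

/* Look up a word by name, roughly as a loader might do. */
static const struct word_entry *
find_word (const struct wordset_info *ws, const char *name)
{
    for (size_t i = 0; i < ws->count; i++)
        if (strcmp (ws->entries[i].name, name) == 0)
            return &ws->entries[i];
    return NULL;
}

/* Length of the identifying part of the name (up to the first space). */
size_t wordset_id_length (void)
{
    return strcspn (example_wordset.name, " ");
}

int wordset_demo (void)
{
    hits = 0;
    const struct word_entry *w = find_word (&example_wordset, "EXAMPLE");
    assert (w != NULL);
    w->code ();
    assert (find_word (&example_wordset, "MISSING") == NULL);
    return hits;
}
```

Keeping the count and the name in a second structure next to the entry list means a loader only needs one pointer to find everything about the wordset.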
Note: the following is outdated since the 31.x generation,
which keeps this LOADLIST table in the referenced
module-dll.c. There is no need to declare it by hand anymore,
just link module-dll.c with your own source code. You still
have the option to declare an explicit loadlist, but there
is no guarantee that this will survive into the next generation
of PFE, where the loadlist code might be removed completely.
The single-level load scheme does not need it anymore, as all kinds
of loader commands are available in wordsets too. Avoid it.
And now you are basically through with it. Just compile,
and when `pfe` is started, type 'LOADM example' to get
access to the words in the 'EXTENSIONS' vocabulary.
The Semant words are one of the nicest features of PFE.
Without much trouble, you get compiling words and state-smart
words, and they will also be nicely decompiled by `SEE` without
any further problem.
Let's have a look now at how a literal-style semant word is put together.
The final COMPILES declaration is the binding link that holds
everything about Semant words together. The first parameter
references the original compiling FCode. The FX_COMPILE
in the compiling FCode will in turn reference this semant
declaration.
The second parameter of COMPILES is of course the execution
that should be COMMA'd into the dictionary. Since pfe is
indirect threaded, you cannot just use FX_COMMA(p4_literal_execution);
instead you compile the address of the pointer to p4_literal_execution
that is held in the static Semant structure. The advantage is
that the decompiler knows the address of this COMPILES structure,
which can therefore carry some hints for the decompiler. SKIPS_CELL
should be obvious: the decompiler shall not interpret the next
token in the colon definition. And the default style is, well,
just nothing. All kinds of indentation for IF and LOOP style words
could be given. See 'def-const.h' for some of them.
The compiling word should now be understandable: if in compiling
mode, compile an execution token (the address of a pointer to a
C function) and the value on the stack into the dictionary at HERE.
The POP will also consume the value from the parameter stack.
The execution is supposed to do the reverse, so PUSH will
put the value on top of the parameter stack, and the value
is retrieved by looking at IP. Remember, IP points to the next
token that the colon inner interpreter will execute when the
current C function returns. Therefore, the value is fetched from
there (i.e. from the cell that IP points to), and IP must then be
advanced past it so that the value is not executed as a token.
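
The interplay of IP and an inline value can be shown with a toy indirect-threaded interpreter. This is a sketch under simplifying assumptions, not pfe code: the union cell_t, the NULL terminator and the function names are invented here, and the real inner interpreter and Semant structures are more involved.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*code_fn) (void);

/* One cell of a colon definition: either an execution token (the
 * address of a pointer to a C function) or an inline value. */
typedef union cell {
    const code_fn *xt;
    long           value;
} cell_t;

static long          stack_area[16];
static long         *sp = stack_area + 16;  /* grows downward */
static const cell_t *ip;                    /* next token to execute */

static void literal_execution (void)
{
    *--sp = ip->value;   /* IP points at the inline value ... */
    ip++;                /* ... so step over it before returning */
}

static void plus_execution (void)
{
    long top = *sp++;
    *sp += top;
}

/* Code fields: in the indirect-threaded model a token is the address
 * of one of these pointers, not the function address itself. */
static const code_fn literal_cfa = literal_execution;
static const code_fn plus_cfa    = plus_execution;

/* Run tokens until a NULL token; a stand-in for the inner interpreter. */
static void run (const cell_t *body)
{
    ip = body;
    while (ip->xt != NULL) {
        const code_fn *xt = (ip++)->xt;  /* fetch token, then advance IP */
        (*xt) ();
    }
}

/* Body roughly equivalent to ": demo 40 2 + ;" */
long literal_demo (void)
{
    const cell_t body[] = {
        { .xt = &literal_cfa }, { .value = 40 },
        { .xt = &literal_cfa }, { .value = 2 },
        { .xt = &plus_cfa },
        { .xt = NULL },
    };
    sp = stack_area + 16;
    run (body);
    assert (sp == stack_area + 15);   /* exactly one result left */
    return *sp;
}
```

If literal_execution forgot to advance ip, the interpreter would try to execute the value 40 as a token, which is exactly what a SKIPS_CELL style hint tells the decompiler to avoid.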
Now that the implementation is done, export the semant word
in the wordset table, and be sure to use 'CS'. All 'CS' words
are of course immediate, and the entry does not reference the
compiling word, but the semant structure.
The real benefit will be obvious when you make a colon definition
with a semant word and, when done, use SEE to see what is in
there. It will produce some very fine output. The SEE
words are of course in debug-ext.c, since decompiling is usually
used during debugging or even single-stepping.
The previous section dealt with the execution semantics of
compiling words which add their execution vector to the
colon definition currently being created. Here we present the
style of creating new HEADER entries in the dictionary and
setting up the runtime code for the new words.
Up to the 31.x generation, this was very simple: one would
simply call a word that creates a header entry (or just skip
that part for noname entries), and the CFA runtime vector was
simply COMMA'd into its place, followed by more COMMAs
to set up the parameter field.
With the current 32.x generation this is no longer quite
recommended, even though you can still use this scheme.
There is a newer style for it that is much more
obvious about what you want to do.
The disadvantage is that it makes a specific assumption about
the setting of a runtime vector of a codefield, and it even
declares the codefield to be just the address of the runtime
C routine. This is true for the default indirect-threaded
model.
In order to widen the range of possible threading models, we
go the same way as for the semant words: we create a runtime
info-block, and the CFA setup is done by referencing this
info-block. In the default indirect-threaded model it will
simply fetch the value that points to the C routine and
COMMA it where the definition is so far. This is the style
that is recommended in the 32.x generation.
This new style with an FX_RUNTIME macro also makes the C source
much more obvious, as the macro name FX_RUNTIME points out what
shall be done at this point: set up a runtime vector for the header
just created before. But there is another need here, which
revolves around the decompiling of words. Up to now, debug-ext
contains a large table of all known runtime-vector values and
associates each with the C code to decompile its parameter area,
including the colon words. Using this new scheme, the module loader
has the chance to see new runtime vectors and register them
dynamically. This is not done up to now, but it will be used in the
33.x generation.
In the 32.x generation, the style of the runtime implementation has
also changed, although the old style is still supported. The
traditional scheme for forth systems is the use of a word pointer,
WP for short, that is either an explicit variable in the inner
interpreter, or can be derived indirectly from IP[-1]. (The inner
interpreter fetches the current value from IP and increments IP.
Then the execution token is executed by jumping indirectly. The value
IP[-1] points to the CFA of the current word, and adding one cell lets
us see the PFA of the word currently executed by the inner interpreter.)
To access the parameter values, one can simply use the WP macros and
address them with a normal C-style array index.
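
The implicit word pointer can be illustrated with a small self-contained model of indirect threading. It is a sketch only: the union cell_t, the WP_PFA macro and the NULL-terminated body are assumptions for this example, not the definitions from the pfe headers.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*code_fn) (void);

/* A dictionary cell holds either a runtime routine or a parameter. */
typedef union cell {
    code_fn code;
    long    value;
} cell_t;

typedef cell_t *xt_t;             /* token = address of a word's code field */

static long   stack_area[16];
static long  *sp = stack_area + 16;
static xt_t  *ip;                 /* points at the next token in the colon body */

/* Implicit word pointer: IP was already incremented, so IP[-1] is the
 * CFA of the running word and one cell further is its parameter field. */
#define WP_PFA (ip[-1] + 1)

static void constant_runtime (void)
{
    *--sp = WP_PFA[0].value;      /* plain C array index into the PFA */
}

/* Two "constants" laid out as code field followed by parameter field. */
static cell_t word_ten[]   = { { .code = constant_runtime }, { .value = 10 } };
static cell_t word_seven[] = { { .code = constant_runtime }, { .value = 7 } };

static void run (xt_t *body)
{
    ip = body;
    while (*ip != NULL) {
        xt_t xt = *ip++;          /* fetch token and advance IP */
        xt->code ();              /* jump indirectly through the code field */
    }
}

long wp_demo (void)
{
    xt_t body[] = { word_ten, word_seven, NULL };
    sp = stack_area + 16;
    run (body);
    assert (sp == stack_area + 14);
    return sp[1] * 100 + sp[0];   /* 10 was pushed first, 7 is on top */
}
```

Both constants share one runtime routine; only the parameter field found through IP[-1] differs, so the demo returns 1007.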
To make it easier to support native-cpu sbr-threading and portable
call-threading, there is a change here, since for either of these
we cannot just fetch IP[-1] to get at the word pointer, nor can the
latest value be fetched from a cpu register in sbr-threading mode
(at least not without some assembler snippets). Instead of assuming
a global word pointer (either explicit or implicit through IP), we
create a local word pointer in the runtime definition, and a new macro
can be used to encapsulate the needed setup code.
The reason for giving this macro a name containing _POP_ is simple
and lies in the call-threading model. Since a colon word will be made
up of pointers to C code (instead of pointers to pointers to C code as
in the indirect-threaded model), there is no easy way to get at the
address of the parameter field, unless one would use direct threading
that jumps directly into a copy of native code in the codefield
of each word, and that native-code snippet would then be required to
set up the word pointer. For call-threading however we jump directly
into the routine generated by the C compiler, so that DATA and CODE are
fully separated in different segments, with the possibility of an
unwritable CODE segment.
To get the parameter field address, we have to add it explicitly
to the colon word: each word that needs to access parameters will
be compiled as two cells in a call-threaded colondef, the first one
being the runtime-vector and the second the parameter-vector. The inner
interpreter will fetch the runtime-vector, increment the IP (instruction
pointer of the current colondef), and jump directly into the C code.
The runtime C code will then have to fetch the parameter-vector and
thereby adjust the IP to point to the next runtime-vector following
the current tuple. This does not need to be done for primitives,
and that is where the name comes from: primitives do not have a
parameter field.
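
The two-cell tuple and the "pop" of the parameter-vector can be sketched as follows. Everything here is a simplified model under stated assumptions: RUNTIME_POP_WP is a hypothetical macro name chosen for this illustration, and the union cell_t and the NULL-terminated body do not come from pfe.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*code_fn) (void);

/* A colon-definition cell in the call-threaded model: a direct
 * pointer to C code, or a pointer to a word's parameter field. */
typedef union cell {
    code_fn code;
    long   *pfa;
} cell_t;

static long    stack_area[16];
static long   *sp = stack_area + 16;
static cell_t *ip;

/* Hypothetical counterpart of a "_POP_" style setup macro: take the
 * parameter-vector from the colon body and step IP over it. */
#define RUNTIME_POP_WP(wp) long *wp = (ip++)->pfa

static void constant_runtime (void)
{
    RUNTIME_POP_WP (wp);           /* local word pointer, no global WP */
    *--sp = wp[0];
}

static void dup_primitive (void)   /* a primitive: one cell, no parameter-vector */
{
    long top = sp[0];
    *--sp = top;
}

static void run (cell_t *body)
{
    ip = body;
    while (ip->code != NULL) {
        code_fn fn = (ip++)->code; /* fetch runtime-vector, advance IP */
        fn ();                     /* jump directly into the C code */
    }
}

static long ten_pfa[] = { 10 };

long call_threaded_demo (void)
{
    cell_t body[] = {
        { .code = constant_runtime }, { .pfa = ten_pfa },  /* two-cell tuple */
        { .code = dup_primitive },                         /* single cell */
        { .code = NULL },
    };
    sp = stack_area + 16;
    run (body);
    assert (sp == stack_area + 14);
    return sp[0] + sp[1];
}
```

Because the runtime routine consumes its own parameter-vector, the inner loop stays the same for primitives and for words with parameters.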
Unlike in traditional indirect-threaded forth, the codefield of
a word in call-threaded mode does not contain a code address. Instead
it points to a code info-block, which could actually be just the
same as the info-record that is also available in the WORDSET table
to export definitions. The executions done in the indirect-threaded
listloader will simply be postponed to compile time.
The call-threading mode in this style is not very concise
with respect to memory consumption: each call-threaded colondef
would get compiled as two cells, one for the colondef runtime
and one parameter-vector pointing to the list of exec-vectors that
make up that definition. Only the primitives compiled from
C source would be one-cell entries. However, this restriction can
be lifted when going from call-threaded colondefs to sbr-threaded
colondefs for the cpu architectures that we know about. Each
runtime-vector would simply be preceded by the cpu-native code
for call-subroutine, and the complete colondef would then be a
native-code primitive in the end that does not need a parameter
vector when compiled. Unlike in direct-threading forth systems, just
two native-code bit patterns must be discovered to make it work:
sbr-call and sbr-return. The rest would be just native-code
optimizations.