dstrings wordset

description

-- Dynamic-Strings words

Copyright (C) 2001, 2002 David N. Williams

This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Library General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

If you take advantage of the option in the LGPL to put a particular version of this library part under the GPL, the author would regard it as polite if you would put any direct modifications under the LGPL as well, and include a copy of this request near the beginning of the modified library source. A "direct modification" is one that enhances or extends the library in line with its original concept, as opposed to developing a distinct application or library which might use it.

This code is based on the ^Forth Motorola 680x0 strings package as of June, 1999.

Please direct any comments to david.n.williams@umich.edu.

FORTH

EMPTY$ ( $: -- empty$ )();
p4:"empty-str";

Push the MSA (measured string address) of a fixed, external representation of the empty string onto the string stack. <ansref>"empty-string"</ansref>

FORTH

\n$ ( $: -- newline$ )();
p4:"newline-str";

Push the MSA of a fixed, external string whose body is the Unix newline character onto the string stack. <ansref>"newline-string"</ansref>

FORTH

DSTRINGS ( .. )();
as:"dstrings";

threadstate variable DSTRINGS

dstrings (no special usage info)

FORTH
S, ( addr len -- addr' len )(); 
 ;

ALLOT room and store the Forth string into data space as an mstring, leaving data space aligned, and leave the new body address and the length. It is assumed that len is unsigned. An error is thrown if len is larger than the system parameter MAX_DATA_STR. <ansref>"s-comma"</ansref>

NOTE: MAX_DATA_STR is returned by

   S" /SCOPY" ENVIRONMENT?

Perhaps this restriction should be removed in favor of a normal data space overflow error.

NOTE: S, is the same as STRING, in Wil Baden's Tool Belt, except it stores a measured string instead of a counted string.
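
A minimal sketch of S, in use, assuming the stack effect given above and a system where S" can be used while interpreting. The sample text is arbitrary.

```forth
\ Copy a transient Forth string into data space as an mstring.
s" Hello, world"   ( -- addr len )   \ transient string from S"
s,                 ( -- addr' len )  \ addr' is the body of the stored copy
type               \ display the stored copy
```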

FORTH

0STRINGS ( -- )();
p4:"zero-strings";

Set all string variables holding bound string values in string space to the empty string, and clear string space, including the string buffer, string stack, and string stack frames. <ansref>"zero-strings"</ansref>

NOTE: If used for under the hood development, this word must be executed only when string space is in a valid state.

FORTH

$GC-OFF ( .. )();
as:"str-gc-minus-off";

ordinary primitive $GC-OFF

an executable word (no special usage info)

or wrapper call around p4_str_gc_off

FORTH

$GC-ON ( .. )();
as:"str-gc-minus-on";

ordinary primitive $GC-ON

an executable word (no special usage info)

or wrapper call around p4_str_gc_on

FORTH

$UNUSED ( .. )();
as:"str-unused";

ordinary primitive $UNUSED

an executable word (no special usage info)

or wrapper call around p4_str_unused

FORTH
COLLECT-$GARBAGE ( -- collected-flag )(); 
 ;

If string space is not marked as containing garbage, return false. If there is garbage, throw an error when garbage collection is disabled. Otherwise remove the garbage and return true. Garbage collection is "transparent", so the user would not normally use this word. <ansref>"collect-string-garbage"</ansref>

FORTH

$GARBAGE? ( -- flag )();
str:"garbage-Q";

Leave true if there is garbage in the current string space. Not normally used, since garbage collection is transparent. <ansref>"string-garbage-question"</ansref>

 
FORTH
MAKE-$SPACE ( size #frames -- addr )(); 
 ;

Allocate and initialize a string space with size bytes available for the string buffer, including the string stack, and with a string frame stack that holds up to #frames frame description entries. The size is rounded up to cell alignment, and the buffer begins and ends with cell alignment. Return addr, the address of the string space. The standard word FREE with addr as input can be used to release the space. <ansref>"make-string-space"</ansref>
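
A short sketch of allocating a string space and releasing it again, assuming only the stack effect above and the standard behavior of FREE. The size and frame count are arbitrary illustration values.

```forth
\ Allocate a string space, then release it with the standard word FREE.
4096 8 make-$space   ( -- addr )  \ 4096-byte buffer, up to 8 frames
free throw           \ FREE leaves an ior; THROW reports a failure
```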

FORTH

(M$:) ( .. )();
as:"paren-m-str-colon";

compiling primitive (M$:)

an executable word (no special usage info)

or wrapper call around p4_marg_execution

FORTH

$" ( [ccc<">] -- $: str )();
p4:"str-quote";

Parse ccc delimited by " (double-quote) and store it in data space as an mstring. If interpreting, leave the MSA on the string stack. If compiling, append run-time semantics to the current definition that leaves the MSA on the string stack. A program should not alter the stored string. An error is thrown if the quoted string length is larger than the system parameter MAX_DATA_STR (see S,). <ansref>"string-quote"</ansref>

NOTE: In contrast to S", the string stored by $" when interpreting is not transient.

The implementation is based on PFE code for S".

FORTH
$CONSTANT ( "name" $: a$ -- )(); 
 ;

Create a definition for "name" with the execution semantics

   "name" execution: ( $: -- a$ )

It is assumed that the input string resides as a measured, unchanging string outside of string space. <ansref>"string-constant"</ansref>

For example:

   $" This is a sample string." $constant sample$
 
FORTH
$VARIABLE ( "name" -- )(); 
 ;
   "name" execution:	( -- dfa )

Create an ordinary Forth variable and initialize it to the address of a fixed, external, measured representation of the empty string, such as that pushed onto the string stack by EMPTY$. <ansref>"string-variable"</ansref>

FORTH

($: ( .. )();
as:"note-str-colon";

immediate primitive ($:

an executable word (no special usage info)

or wrapper call around p4_paren

FORTH
ARGS{ ( arg1'$ ... argN'$ "arg1 ... argN <}>" -- )(); 
 ;
    compilation: ( -- $: arg1$ ... argN$ )

Immediate and compilation-only.

Copy the argument strings to the string buffer, push them onto the string stack with "argN" the most accessible, and make them into the top compile-time string stack frame. Compile the run-time code to make an argument frame out of the N most accessible run-time string stack entries. Inform the system text interpreter that it should compile run-time code for any white-space delimited argument encountered in the text of the definition, that concatenates the corresponding string in the run-time frame. At the semicolon terminating the definition, drop the compile-time argument frame and compile code to drop the run-time argument frame. <ansref>"args-brace"</ansref>

Syntax for defining a string macro GEORGE:

	: george  ($: a$ b$ c$ -- cat$ )
	  args{ arg1 arg2 arg3 }
	  cat" This is arg1:  " arg1 cat" ." ENDCAT ;

The blank following the last argument is required. For a macro with no arguments, ARGS{ } does nothing but add useless overhead and should be omitted. Two of the arguments in this example are ignored and could have been left out. Words intended only as steps in building a macro would omit ENDCAT, which terminates concatenation and leaves the concatenated string on the string stack.

Sample syntax using the string macro GEORGE:

    $" bill"  $" sue"  $" marie"  george $.

The resulting display is:

     This is arg1:  bill.

NOTE: Macro argument labels must be distinct from each other and from any local labels that appear in the same definition, and there is no check for that.

NOTE: At the moment the semantics of ARGS{ is undefined before DOES>.

FORTH

CAT" ( "ccc<quote>" -- )();
p4:"cat-quote";

This word has only compile-time semantics, just like CAT`. It appends run-time semantics to the current definition that concatenates the quoted string according to the specification for CAT. An error is thrown if the quoted string is longer than the system parameter MAX_DATA_STR (see S,). <ansref>"cat-quote"</ansref>

FORTH
CAT` ( "ccc<backtick>" -- )(); 
 ;

This word has only compile-time semantics, just like CAT". It appends run-time semantics to the current definition that concatenates the back-ticked string according to the specification for CAT. An error is thrown if the back-ticked string is longer than the system parameter MAX_DATA_STR (see S,). <ansref>"cat-back-tick"</ansref>

FORTH
$2DROP ( $: a$ b$ -- )(); 
 ;

Drop the two topmost string stack entries, marking them as garbage if appropriate. <ansref>"string-two-drop"</ansref>

FORTH
$2DUP ( $: a$ b$ -- a$ b$ a$ b$ )(); 
 ;

Leave copies of the two topmost string stack entries. The string values are not copied. <ansref>"string-two-dupe"</ansref>

FORTH

$DEPTH ( -- n )();
p4:"str-depth";

Leave the number of items on the string stack. <ansref>"string-depth"</ansref>

FORTH

$DROP ( $: a$ -- )();
p4:"str-drop";

Drop the topmost string stack entry, marking it as garbage if it is initially bound to the top of the string stack. <ansref>"string-drop"</ansref>

FORTH

$DUP ( $: a$ -- a$ a$ )();
p4:"str-dup";

Leave a copy of the topmost string stack entry. The string value is not copied. <ansref>"string-dupe"</ansref>

FORTH

$NIP ( $: a$ b$ -- b$ )();
p4:"str-nip";

Drop the next to top item from the string stack. <ansref>"string-nip"</ansref>

NOTE: Because of essential string space bookkeeping, the system level implementation can be only slightly more efficient than the high-level definition:

     	: $NIP  $SWAP $DROP ;
 
FORTH
$OVER ( $: a$ b$ -- a$ b$ a$ )(); 
 ;

Leave a copy of the next most accessible string stack entry on top of the string stack. The string value is not copied. <ansref>"string-over"</ansref>

FORTH
$PICK ( u $: au$ ... a0$ -- au$ ... a0$ au$ )(); 
 ;

Copy the u-th string stack entry to the top of the string stack. The string value is not copied. Throw an error if the input string stack does not have at least u+1 items. <ansref>"string-pick"</ansref>

FORTH
$SWAP ( $: a$ b$ -- b$ a$ )(); 
 ;

Exchange the two most accessible strings on the string stack. Throw an error if there are fewer than two strings on the stack. Neither string value is copied. <ansref>"string-swap"</ansref>
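
The string stack words above can be tried together as follows. This is a sketch that assumes the string stack is empty on entry and relies only on the stack effects documented in this section.

```forth
\ Assumes the string stack is empty on entry.
$" a" $" b" $" c"    \ $: a$ b$ c$
$swap                \ $: a$ c$ b$
$over                \ $: a$ c$ b$ c$
3 $pick              \ $: a$ c$ b$ c$ a$   (entry 0 is the top)
$depth .             \ displays 5
$drop $2drop $2drop  \ leave the string stack empty again
```

None of these operations copies a string value; only the string stack entries are duplicated or rearranged.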

FORTH
$S> ( $: a$ -- S: a.str )(); 
 ;

Drop a$ from the string stack and leave it as a Forth string a.str, without copying. <ansref>"string-s-from"</ansref>

WARNING: If a$ is a bound string, it may move or disappear at the next garbage collection, making a.str invalid. This can be avoided by sandwiching sections of code where this could occur between $GC-OFF and $GC-ON.

FORTH
$S>-COPY ( $: a$ -- S: a.str )(); 
 ;

Drop a$ from the string stack, copy it into data space as a measured string, and leave it as a Forth string a.str. An error is thrown if the string length is larger than the system parameter MAX_DATA_STR (see S,). <ansref>"string-s-from-copy"</ansref>

FORTH
$S@ ( $: a$ -- a$ S: a.str )(); 
 ;

Leave the string stack unchanged, and leave the string body address and length on the data stack. <ansref>"string-s-fetch"</ansref>

NOTE: In earlier versions this was called $S@S. The trailing "S" is superfluous if it is understood that the only string format that usually appears on the data stack is the Forth string format.

WARNING: If a$ is a bound string, it may move at the next garbage collection, making a.str invalid. This can be avoided by sandwiching sections of code where this could occur between $GC-OFF and $GC-ON.
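
A sketch of the sandwiching described in the warning. The word name $TYPE-TWICE is hypothetical, and the example assumes that $GC-OFF and $GC-ON have the stack effect ( -- ), which this glossary does not state.

```forth
\ Display a string twice through its Forth string representation.
: $type-twice  ( $: a$ -- )
  $gc-off            \ assumed ( -- ): suspend garbage collection
  $s@                ( -- addr len )  \ a$ stays on the string stack
  2dup type type     \ addr and len stay valid while collection is off
  $gc-on             \ assumed ( -- ): allow garbage collection again
  $drop ;            \ now discard a$
```

TYPE itself is unlikely to trigger a collection; the pattern matters when the code between $S@ and the last use of a.str can copy into the string buffer.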

FORTH
$TUCK ( $: a$ b$ -- b$ a$ b$ )();
 ;

Copy the top string stack item just below the second item. The string value is not copied. <ansref>"string-tuck"</ansref>

NOTE: Because of essential string space bookkeeping, the system level implementation can be only slightly more efficient than the high-level definition:

 	: $TUCK  $SWAP $OVER ;
 
FORTH
>$S-COPY ( a.str -- $: a$ )(); 
 ;

Copy the external string value whose body address and count are on the parameter stack into the string buffer and push it onto the string stack. Errors are thrown if the count is larger than MAX_MCOUNT, if there is not enough room in string space, even after garbage collection, or if there is an unterminated string concatenation. The input external string need not exist as a measured string. <ansref>"to-string-s-copy"</ansref>

NOTE: MAX_MCOUNT is the largest size the count field of a measured string can hold, e.g., 255, 64K-1, or 4,096M-1. It is returned by

   S" /DYNAMIC-STRING" ENVIRONMENT?
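
Both size limits can be queried with the standard word ENVIRONMENT?. The following is a sketch; the word name .$LIMITS is hypothetical, and nothing is printed for a query the system does not recognize.

```forth
\ Report the two string-size limits if the system knows them.
: .$limits  ( -- )
  s" /SCOPY" environment?
  if  ." MAX_DATA_STR = " u. cr  then
  s" /DYNAMIC-STRING" environment?
  if  ." MAX_MCOUNT = " u. cr  then ;
```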

WARNING: This word should not be used when the input string is a bound string because the copy operation may generate a garbage collection which invalidates its MSA.

FORTH

>$S ( a.str -- $: a$ )();
p4:"to-str-s";

Push the external Forth string a.str onto the string stack, without copying the string value into the string buffer. It is an unchecked error if the Forth string a.str is not stored as an external measured string. <ansref>"to-string-s"</ansref>

WARNING: If the string value of a.str is actually in the string buffer and not external, the push operation may generate a garbage collection that invalidates its MSA.

FORTH

$! ( $var.dfa $: a$ -- )();
p4:"str-store";

Store the string MSA on the string stack in the variable whose DFA is on the parameter stack. <ansref>"string-store"</ansref>

NOTES: The only situation in which $! copies the string value is when it is a bound string already stored in another variable. In that case, the new copy is the one that is stored in the variable. In particular, external strings are not copied.

If the string value held by the string variable on entry is a bound string that is also referenced deeper on the string stack, its back link is reset to point to the deepest string stack reference. If it is a bound string not deeper on the string stack and not identical to the input string, its back link is set to zero, making it garbage. If it is an external string, its MSA in the variable is simply written over by that popped from the string stack.

FORTH

$. ( $: a$ -- )();
p4:"str-dot";

Display the string on the terminal. If the system implementation of TYPE has its output vectored, $. uses the same vector. <ansref>"string-dot"</ansref>

FORTH

$TYPE ( .. )();
as:"str-type";

ordinary primitive $TYPE

an executable word (no special usage info)

or wrapper call around p4_str_dot

FORTH

$@ ( $var.dfa -- $: a$ )();
p4:"str-fetch";

Leave the MSA of the string held by the string variable. <ansref>"string-fetch"</ansref>
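
A minimal sketch of a string variable round trip, assuming the stack effects given for $VARIABLE, $!, $@, and $. in this section. The name GREETING$ is only an illustration.

```forth
$variable greeting$          \ starts out holding the empty string
$" Hello" greeting$ $!       \ pop the MSA and store it in the variable
greeting$ $@ $.              \ fetch the MSA and display the string
```

Because the string created by $" is external, $! stores its MSA without copying the string value.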

FORTH

CAT ( $: a$ -- )();
p4:"cat";

Append the string body to the end of the string currently being concatenated as the last string in the string buffer, and update its count field. If there is no concatenating string, start one. An error is thrown if the size of the combined string would be larger than MAX_MCOUNT or if there is not enough room in string space even after a garbage collection.

If garbage collection occurs, a$ remains valid even when it is in the string buffer.

 

When there is a concatenating string, concatenation is the only basic string operation that can copy a string into the string buffer. <ansref>"cat"</ansref>

NOTE: It is left to the user to define special concatenating words like:

    : \n-cat  ( -- )  \n$ cat ;
 
FORTH

S-CAT ( a.str -- )();
p4:"s-cat";

Append the Forth string body to the end of the string currently being concatenated as the last string in the string buffer, and update its count field. If there is no concatenating string, start one. An error is thrown if the size of the combined string would be larger than MAX_MCOUNT or if there is not enough room in string space even after a garbage collection.

S-CAT is most commonly used on external strings, not assumed to exist as mstrings. In contrast to CAT, garbage collection could invalidate a.str if it is a dynamic string in the string buffer. S-CAT can be used in that situation if garbage collection is turned off with $GC-OFF.

 

When there is a concatenating string, concatenation is the only basic string operation that can copy a string into the string buffer. <ansref>"s-cat"</ansref>

FORTH
ENDCAT ( -- $: cat$ | empty$ )(); 
 ;

If there is no concatenating string, do nothing but leave the empty string. If there is, leave it as a string bound to the top of the string stack, and terminate concatenation, permitting normal copies into the string buffer. <ansref>"end-cat"</ansref>
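
The concatenation words can be combined as in the following sketch. The word name GREET is hypothetical; the example relies only on the documented behavior of CAT", CAT, and ENDCAT.

```forth
: greet  ( $: name$ -- greeting$ )
  cat" Hello, "    \ start the concatenation with a literal
  cat              \ append name$, popping it from the string stack
  cat" !"          \ append another literal
  endcat ;         \ leave the result on the string stack

$" world" greet $.   \ displays Hello, world!
```

A word meant only as a building block for a longer concatenation would leave out ENDCAT and let its caller terminate the string.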

FORTH

$FRAME ( u -- )();
p4:"str-frame";

Push the description of a string stack frame starting at the top of the string stack and containing u entries onto the string frame stack. Errors are thrown if the frame stack would overflow or if the depth of the string stack above the top frame, if there is one, is less than u. The value u = 0 is allowed. <ansref>"string-frame"</ansref>

NOTE: The current implementation pushes u and the string stack pointer onto the frame stack.

FORTH

DROP-$FRAME ( -- )();
p4:"drop-str-frame";

Drop the topmost string frame from the string frame stack and string stack. Errors are thrown if either stack would underflow or if the string frame does not begin at the top of the string stack. The case where the frame has zero entries on the string stack is handled properly. <ansref>"drop-string-frame"</ansref>

FORTH
FIND-ARG ( s -- i true | false )(); 
 ;

If the Forth string matches an element of the top string frame, leave the index i of that element and true; otherwise leave false. The index of the top frame element is zero. <ansref>"find-arg"</ansref>
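
A sketch of the frame words used by hand, outside ARGS{. The helper .ARG is hypothetical, and the outcomes in the comments assume that FIND-ARG compares the Forth string with the contents of each frame element, as the description above suggests.

```forth
\ Report whether a Forth string names an element of the top frame.
: .arg  ( addr len -- )
  find-arg if  ." index " .  else  ." no match"  then ;

$" first" $" second"   \ $: first$ second$
2 $frame               \ the top two entries become a frame
s" second" .arg        \ should report index 0, since second$ is on top
s" third" .arg         \ should report no match
drop-$frame            \ drop the frame and its two string stack entries
```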

FORTH
(DROP-$FRAME) ( .. )(); 
 ;

compiling primitive (DROP-$FRAME)

an executable word (no special usage info)

or wrapper call around p4_do_drop_str_frame

FORTH

/$SPACE ( .. )();
as:"slash-str-space";

ordinary primitive /$SPACE

an executable word (no special usage info)

or wrapper call around per_str_space

FORTH
/$SPACE-HEADER ( .. )(); 
 ;

ordinary primitive /$SPACE-HEADER

an executable word (no special usage info)

or wrapper call around per_str_space_header

FORTH

$BREAK ( .. )();
as:"str-break";

ordinary primitive $BREAK

an executable word (no special usage info)

or wrapper call around str_break

FORTH

$BUFFER ( .. )();
as:"str-buffer";

ordinary primitive $BUFFER

an executable word (no special usage info)

or wrapper call around str_buffer

FORTH

$SP ( .. )();
as:"str-sp";

ordinary primitive $SP

an executable word (no special usage info)

or wrapper call around str_sp

FORTH

$SP0 ( .. )();
as:"str-sp-zero";

ordinary primitive $SP0

an executable word (no special usage info)

or wrapper call around str_sp0

FORTH

#FRAMES ( .. )();
as:"sharp-frames";

ordinary primitive #FRAMES

an executable word (no special usage info)

or wrapper call around num_frames

FORTH
/FRAME-STACK ( .. )(); 
 ;

ordinary primitive /FRAME-STACK

an executable word (no special usage info)

or wrapper call around per_frame_stack

FORTH

$FBREAK ( .. )();
as:"str-fbreak";

ordinary primitive $FBREAK

an executable word (no special usage info)

or wrapper call around sf_break

FORTH

$FSP ( .. )();
as:"str-fsp";

ordinary primitive $FSP

an executable word (no special usage info)

or wrapper call around sf_sp

FORTH

$FSP0 ( .. )();
as:"str-fsp-zero";

ordinary primitive $FSP0

an executable word (no special usage info)

or wrapper call around sf_sp0

FORTH

0$SPACE ( .. )();
as:"zero-str-space";

ordinary primitive 0$SPACE

an executable word (no special usage info)

or wrapper call around zero_str_space

FORTH
$FRAME-DEPTH ( .. )(); 
 ;

ordinary primitive $FRAME-DEPTH

an executable word (no special usage info)

or wrapper call around frame_depth

ENVIRONMENT
DSTRINGS-EXT ( .. )(); 
 ;
( 20627 )  constant DSTRINGS-EXT

an ordinary constant (no special usage info)

ENVIRONMENT

/SCOPY ( .. )();
as:"slash-scopy";

( MAX_DATA_STR  )  constant /SCOPY

an ordinary constant (no special usage info)

ENVIRONMENT
/DYNAMIC-STRING ( .. )(); 
 ;
( MAX_MCOUNT  )  constant /DYNAMIC-STRING

an ordinary constant (no special usage info)

ENVIRONMENT
DSTRINGS-LOADED ( .. )(); 
 ;

constructor primitive DSTRINGS-LOADED

an executable word (no special usage info)

or wrapper call around dstrings_init