Designers of microprocessors know that a processor can run much faster when every instruction has the same length. In fact, 4tH has its own virtual microprocessor. The compiler is nothing more than an assembler, and the interpreter nothing more than an emulator on top of the real microprocessor.
In order to speed up 4tH, all instructions have the same length. They consist of a token (which is the real instruction) and an argument. The argument is a value that gives meaning to the instruction, e.g. the 'LITERAL' token means that a number is compiled here. The argument is the actual number.
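The layout described above can be sketched in C. This is only an illustration, not the actual 4tH source: the type name 'struct word', the token numbers and the helpers 'make_word' and 'make_plain' are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical token numbers; the real 4tH values are not given here. */
enum token { TOK_NOOP, TOK_LITERAL, TOK_ADD };

/* One fixed-length instruction: a token plus its argument. */
struct word {
    uint8_t token;     /* the real instruction */
    int32_t argument;  /* the value that gives the instruction its meaning */
};

/* Build an instruction that carries an argument, such as LITERAL. */
struct word make_word(enum token t, int32_t arg)
{
    struct word w;
    w.token = (uint8_t)t;
    w.argument = arg;
    return w;
}

/* Build an instruction that needs no argument: it still gets one, set to zero. */
struct word make_plain(enum token t)
{
    return make_word(t, 0);
}
```

With these assumed helpers, 'make_word(TOK_LITERAL, 42)' stands for a compiled number 42, and 'make_plain(TOK_ADD)' for an instruction whose argument is simply zero. Both occupy exactly the same amount of memory, so the interpreter never has to work out how long an instruction is.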
Some instructions don't need an argument, but for speed's sake they have one anyway: it is always zero. Isn't that a lot of overhead? Not really. Half the instructions in an actual program need an argument. Decoding a more elaborate scheme would take more processor time and more programming. So in the end, it would make hardly any difference in size. Only in speed.
A token together with its argument is called a word, and the Code Segment is one large array of words. Each of these words has an address and can be accessed by the word '@''. In fact, '@'' throws the argument on the stack. Where have we seen '@'' before?
Yes, when fetching from an array of constant numbers. These arrays are compiled into the Code Segment. Why isn't 4tH confused by these arrays? Because every element of such an array carries the token 'NOOP', which does absolutely nothing.
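To see why this works, here is a minimal sketch of such an emulator in C. It is not the real 4tH interpreter: the names 'struct instr', 'fetch_argument' and 'run', the token numbers and the stack depth of 16 are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token set; real 4tH has many more tokens. */
enum op { OP_NOOP, OP_LITERAL, OP_ADD };

/* One element of the Code Segment: a token and its argument. */
struct instr {
    enum op token;
    long argument;
};

/* Return the argument stored at a Code Segment address.
   This is roughly the job the text ascribes to '@''. */
long fetch_argument(const struct instr *code, size_t size, size_t address)
{
    assert(address < size);
    return code[address].argument;
}

/* Execute the whole Code Segment and return the top of the stack
   (or 0 if the stack ends up empty). */
long run(const struct instr *code, size_t size)
{
    long stack[16];
    size_t sp = 0;

    for (size_t ip = 0; ip < size; ip++) {
        switch (code[ip].token) {
        case OP_LITERAL:            /* push the argument */
            assert(sp < 16);
            stack[sp++] = code[ip].argument;
            break;
        case OP_ADD:                /* add the two topmost values */
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_NOOP:               /* array data: nothing happens */
            break;
        }
    }
    return sp ? stack[sp - 1] : 0;
}
```

Suppose the Code Segment holds LITERAL 2, then a two-element array stored as NOOP 10 and NOOP 20, then LITERAL 3 and ADD. Calling 'run' on it steps straight over the array and leaves 5 on the stack, while 'fetch_argument' with address 2 still returns 20. The data sits between the instructions, yet it never disturbs execution.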