Why Mike Vanier loves computer science

From his blog: “One other thing: even though I think Forth is way cool, it would not be my first choice for most of my programming projects. There are too many features I consider essential for most of my programming work that are missing from Forth (and are very hard to add to it). These include garbage collection, first-class functions and a type discipline. So Forth is not going to steal the place in my programmer’s heart where Scheme, Ocaml, Haskell or even Python currently live. Nevertheless, I still think it’s one of the most beautiful ideas in the history of programming languages, and it’s well worth studying and admiring.”
And I was just starting to like this guy from reading his article, and then he goes and says that about FORTH on his blog.
Well, FORTH WAS my first choice for most of my programming projects for 4 years in college and 4 years on the job right out of college, even after dabbling with assembly, C, Pascal, LISP, APL, SNOBOL, Smalltalk and a few more.
You can implement garbage collection quite easily in FORTH. In my experience, it is easier than in any other language I have used.
You don’t bother making a “modern” heap and writing garbage collection algorithms to work on it (although I have done that, just for kicks). The better way is to just use the built-in virtual memory capability, BLOCK. You don’t need to collect garbage: you just ignore it when you are finished using it, or you overwrite it with other data. You don’t have to free() it or anything. Quite simple, actually. I used this to great effect when building optimal Huffman code (binary) trees with the highest possible compression ratio for large datasets of text sentences. I made a “virtual memory heap” in the true meaning of “heap”: an array containing objects and indexes of (pointers to) other objects in the array. I used the technique quite often, seeing as memory was not as plentiful then as it is today. As long as you had disk space, you had memory, and you didn’t consume the disk space; you just reused it if and when you felt like it. You don’t allocate blocks like you do file extents or memory allocations. If you need to pack things in, you write a bit of code to pack multiple items into a buffer; otherwise you just use one block per item, or more if necessary. Blocks are only “allocated” while you’re running; it’s not a persistent allocation. Actually, I would just throw in a new 8″ floppy if I needed more “memory” after the program prompted me to.
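To make the “array of objects and indexes” idea concrete, here is a rough sketch in Python rather than FORTH. It is not the original code: the names `build_huffman`, `code_lengths` and `NIL` are invented for illustration, a plain Python list stands in for block storage, and the input is assumed to be a non-empty table of symbol frequencies. The point is only the layout: every node lives in one flat array, children are referred to by index rather than by machine pointer, and nothing is ever freed individually.

```python
import heapq

NIL = -1  # index value meaning "no child" (hypothetical convention)

def build_huffman(freqs):
    """Build a Huffman tree in a flat array.

    Each node is [weight, left_index, right_index, symbol].
    Returns (nodes, root_index). Assumes freqs is non-empty.
    """
    nodes = []
    queue = []
    for symbol, weight in sorted(freqs.items()):
        nodes.append([weight, NIL, NIL, symbol])
        queue.append((weight, len(nodes) - 1))
    heapq.heapify(queue)
    while len(queue) > 1:
        w1, a = heapq.heappop(queue)   # two lightest subtrees
        w2, b = heapq.heappop(queue)
        nodes.append([w1 + w2, a, b, None])
        heapq.heappush(queue, (w1 + w2, len(nodes) - 1))
    return nodes, queue[0][1]

def code_lengths(nodes, root):
    """Walk the array by index and return {symbol: code length in bits}."""
    lengths = {}
    pending = [(root, 0)]
    while pending:
        index, depth = pending.pop()
        _, left, right, symbol = nodes[index]
        if left == NIL:
            lengths[symbol] = max(depth, 1)  # a lone symbol still needs 1 bit
        else:
            pending.append((left, depth + 1))
            pending.append((right, depth + 1))
    return lengths

nodes, root = build_huffman({"a": 5, "b": 2, "c": 1, "d": 1})
lengths = code_lengths(nodes, root)
```

When the tree is no longer needed, the whole array is simply abandoned or overwritten, which is the “ignore it when you are finished” approach described above.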
First-class functions are pretty much built right into FORTH. You “tick” a word to get its address, then apply an offset to get to the execution address (we used to call it the CFA, Code Field Address, in indirect threaded code, which is the only way to go if you ask me). You have now effectively got a “function pointer” that you can pass around on the stack, along with any parameters you would like to carry with it (or maybe I should have said “curry”). When you want to invoke it, you call EXECUTE with the CFA on the top of the stack (TOS). Can’t get much simpler. You can create a FORTH word at runtime, either by invoking the compiler on a text buffer, or by splicing together lists of addresses and sticking a header in front of them. You can then EXECUTE that or return it as a result. You just have to manipulate the CFA on the stack along with any other data parameters. Gee, isn’t that the definition of “first-class function”?
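The tick-then-EXECUTE pattern can be modeled in a few lines. This is a toy in Python, not a FORTH implementation: the class `MiniForth` and its methods `tick`, `execute` and `colon` are hypothetical names, and an integer index into the dictionary list stands in for the CFA. It shows a word being built at runtime from a list of existing words and then passed around as a value on the data stack.

```python
class MiniForth:
    """Toy model: the dictionary is a list, and an execution token (xt)
    is simply an index into it, standing in for the CFA."""

    def __init__(self):
        self.stack = []
        self.entries = []  # list of (name, callable)
        self.primitive("DUP", lambda: self.stack.append(self.stack[-1]))
        self.primitive("*", lambda: self.stack.append(self.stack.pop() * self.stack.pop()))
        self.primitive("EXECUTE", self.execute)

    def primitive(self, name, fn):
        self.entries.append((name, fn))
        return len(self.entries) - 1

    def find(self, name):
        # Search newest definitions first, as a FORTH dictionary does.
        for xt in range(len(self.entries) - 1, -1, -1):
            if self.entries[xt][0] == name:
                return xt
        raise KeyError(name)

    def tick(self, name):
        """Push the word's execution token onto the data stack."""
        self.stack.append(self.find(name))

    def execute(self):
        """Pop an execution token and run the word it names."""
        xt = self.stack.pop()
        self.entries[xt][1]()

    def colon(self, name, body_names):
        """Create a new word at runtime from a list of existing words."""
        xts = [self.find(n) for n in body_names]
        def run():
            for xt in xts:
                self.entries[xt][1]()
        return self.primitive(name, run)

vm = MiniForth()
vm.colon("SQUARE", ["DUP", "*"])  # roughly  : SQUARE DUP * ;
vm.stack.append(7)                # data parameter
vm.tick("SQUARE")                 # "function pointer" on top of it
vm.execute()                      # stack is now [49]
```

Because the token is ordinary stack data, it can be stored, duplicated, or handed to another word exactly like a number.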
Type discipline is easily accomplished by using a “type-stack”, which is a separate stack that parallels the parameter / data stack: each entry on the parameter stack has a corresponding value on the type-stack indicating what type that value is. By checking values on the type-stack before accessing the data on the parameter stack, you can decide if the types are acceptable, and even implement polymorphic words that have different implementations based on the types of the parameters passed. String-stacks can even be added, so you could have the word + check the type-stack and then do any of the following: add two numbers, yielding a number; concatenate two strings, yielding a string; convert a number to a string and concatenate it to another string parameter; or evaluate a string if it is numeric and add it numerically to another numeric parameter. It could even check for single, double, floating or integer or anything else along the way. Whatever type you return, you just have to return the type indicator on the type-stack when you’re finished, and maintain the stacks together. A wider single stack has also been used, with the type indicator next to the value on the same stack, manipulated together with the value.
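A minimal sketch of a type-stack driving a polymorphic +, again in Python as a stand-in for FORTH. The names `TypedStack`, `plus`, `INT` and `STR` are invented here, and the rule for mixed operands (add numerically if the string parses as an integer, otherwise concatenate) is just one possible policy among those listed above.

```python
INT, STR = "int", "str"  # hypothetical type tags

class TypedStack:
    """A data stack and a parallel type-stack that are always kept in step."""

    def __init__(self):
        self.data = []
        self.types = []

    def push(self, value, tag):
        self.data.append(value)
        self.types.append(tag)

    def pop(self):
        return self.data.pop(), self.types.pop()

def as_int(text):
    """Return the integer value of text, or None if it is not numeric."""
    try:
        return int(text)
    except ValueError:
        return None

def plus(stack):
    """Polymorphic +: dispatches on the tags found on the type-stack."""
    b, tb = stack.pop()
    a, ta = stack.pop()
    if ta == INT and tb == INT:
        stack.push(a + b, INT)
    elif ta == STR and tb == STR:
        stack.push(a + b, STR)
    elif {ta, tb} == {INT, STR}:
        text, number = (a, b) if ta == STR else (b, a)
        parsed = as_int(text)
        if parsed is not None:
            stack.push(parsed + number, INT)   # numeric string: add
        else:
            stack.push(str(a) + str(b), STR)   # otherwise: concatenate
    else:
        raise TypeError(f"+ cannot combine {ta} and {tb}")

s = TypedStack()
s.push("10", STR)
s.push(5, INT)
plus(s)
result = s.pop()
```

The “wider single stack” variant mentioned above would store each `(value, tag)` pair as one stack entry instead of keeping two lists.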
When he said FORTH lacked “modularization”, that’s crazy. You have “screens” to create source modules, using --> to link consecutive ones (until actual files were implemented). You effectively had “namespaces” long before I ever heard of namespaces in other languages. They’re called “vocabularies” in FORTH, and you can have as many as you need. Each vocabulary is a “module” of associated words that can be selected by specifying the vocabulary name just before the word you want from that vocabulary. Each one is a separate linked list, and using the vocabulary name moves that linked list to the front of the search order.
It looked very much like prepending std:: to a C++ identifier, and this was long before C++ was popular. Using a vocabulary name followed by DEFINITIONS would let you add new words to that vocabulary. This is like using different namespaces in different source files in C++. It’s not all that different from the partial classes that you now see in C#.
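The search-order behavior can be sketched as follows. This is a simplified Python model built from the description above, not a copy of any particular FORTH system (real implementations differ in how vocabularies chain to one another). The class `Dictionary` and its method names are hypothetical; `select` plays the role of executing a vocabulary name, and `definitions` plays the role of DEFINITIONS.

```python
class Dictionary:
    """Simplified model of FORTH vocabularies and search order."""

    def __init__(self):
        self.vocabs = {"FORTH": {}}
        self.context = ["FORTH"]  # search order, front is searched first
        self.current = "FORTH"    # vocabulary that receives new definitions

    def vocabulary(self, name):
        """Create a new, empty vocabulary."""
        self.vocabs.setdefault(name, {})

    def select(self, name):
        """Executing a vocabulary name: move it to the front of the search order."""
        if name in self.context:
            self.context.remove(name)
        self.context.insert(0, name)

    def definitions(self):
        """DEFINITIONS: new words now go into the vocabulary searched first."""
        self.current = self.context[0]

    def define(self, word, meaning):
        self.vocabs[self.current][word] = meaning

    def find(self, word):
        """Return (vocabulary, meaning) for the first match in search order."""
        for vocab in self.context:
            if word in self.vocabs[vocab]:
                return vocab, self.vocabs[vocab][word]
        raise KeyError(word)

d = Dictionary()
d.define("DRAW", "forth draw")
d.vocabulary("GRAPHICS")
d.select("GRAPHICS")       # roughly  GRAPHICS DEFINITIONS
d.definitions()
d.define("DRAW", "graphics draw")
first = d.find("DRAW")     # found in GRAPHICS, which is at the front
d.select("FORTH")
second = d.find("DRAW")    # FORTH is now searched first
```

Two words with the same name coexist without clashing, and which one you get depends only on the vocabulary named just before it.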
Notwithstanding all that, I still agree strongly with most of what he says in the article, and I actually liked it.
He just struck a nerve when he criticized FORTH on his blog. And it’s FORTH, in all capitals, because the original implementation used a limited character set that lacked lower case, and it’s FORTH instead of the intended FOURTH because the names were limited to 5 characters. The name meant “fourth-generation language” before anyone else co-opted that phrase. FORTH was ahead of its time, and was a true joy to program with. It’s my turn to be a “purist” (re: my post on the gbc site). I know, things are more relaxed now on that.
People (including me) love Unix shell scripts because they are so expressive and powerful.
FORTH is all that times a thousand.
EOR (End Of Rant)

2 thoughts on “Why Mike Vanier loves computer science”

blangenbach says:

2008-08-09 at 01:23


Grant says:

2008-08-09 at 09:05

I bet the part about “and are very hard to add to it” is something that he threw in sort of without thinking. Or alternately, he threw it in without thinking about any FORTH programmers reading the post!
Carry vs Curry! And yes that sounds like a first class function.
Adding polymorphic functions to FORTH sounds…fun, for lack of a better word.
I bet he just meant that he didn’t want to implement them himself; you should ask him!
