Jeff's Scheme Interpreter
Yes! It is now concurrent!
What is Scheme?
Scheme
is a functional language, and thus considerably different from
more mainstream languages such as C or Pascal. It is similar to
LISP in syntax and approach to problem solving.
History of my interpreter
I was first introduced to Scheme through my
CS241 class's course notes. Looking over them at the start of
the year, I found a section on Scheme. The description of the language
excited me - it was immediately apparent that it would be quite
trivial to parse such a language. Furthermore, it seemed an excellent
example of an easy-to-implement language which is nonetheless powerful.
For some fateful reason, logging on to the school system and running
their version of Scheme did not occur to me. Thus, to play with the
language I needed to first write the interpreter. Twenty-four hours
later, scheme (such an imaginative name, eh?) was built. Unfortunately,
it was built to the specs of the examples in the course notes.
Surprisingly, it has required only minor modifications since then! The
most important was that the original version did not support
floating-point numbers.
Incompatibilities with real Scheme
Real
scheme
is a bit different from my implementation:
- Not all predicates are defined.
- (quote foo) not supported (though 'foo, of course, is)
- Cannot create functions with variable number of parameters.
- Body of functions is always unquoted.
- letrec, let, etc. all behave in the same manner, inconsistent with
the standard Scheme forms. I've never had a problem with this,
though, touch wood.
- It's not tail recursive.
Interesting features of my Scheme
So what distinguishes my Scheme interpreter from others? Not
much, and not all of them good...
- I tried to minimize the built-in functions, as I was learning
Scheme, and was also lazy. Thus, the majority of the implemented
functions are defined in
scheme.scm
, thereby minimizing the work required in writing the interpreter.
- It was written in C, not C++. This is actually a good thing, as
it simplified a good portion of the code.
- It has been integrated with my raytracer.
- It has been compiled and run on Wintel, Linux, and Solaris.
Future of my Scheme interpreter
In addition to whatever tweaks will be necessary to keep
my interpreter copacetic with my raytracer, I have
moved it to
uC++
and made a concurrent version of Scheme.
Concurrent Scheme
As mentioned earlier, I have moved my Scheme interpreter into
the uC++ environment, allowing me to create a concurrent version.
As allowing for parallel execution of the interpreter raises many
interesting problems, it required about a 75% rewrite. That was good,
as it allowed me to add some features (of the implementation, not the
result) which I had been missing.
How it works
As Scheme is a functional language, traditional approaches such as
providing locks for data areas are no longer applicable. As
each expression is ideally free of side effects, one could evaluate
all the parameters to a function on separate threads with no concerns
about interference. Unfortunately, such an aggressive maneuver would
drown any gains in task-management overhead. Thus, I decided to
put the onus on the programmer to determine which functions should
be spawned into another thread.
Thus I have introduced a new atom type, the FLIST. This is a normal,
everyday list, but if it is executed, it spawns a new task which
runs the actual code. Meanwhile, the FLIST returns immediately with
a FUTURE, which is a meta-atom. That is, its actual type is decided
by what the result of the FLIST will be. As Scheme is dynamically typed,
there is no way even to know what type the FUTURE will be without actually
opening it.
Futures can be copied and moved like any other data type; it is only
when an internal function is run on them (and then only one which
uses the result, as opposed to define, etc.) that the user will
block until the result is ready. The result is placed in the future
by the spawned task, at which point the spawned task can suicide.
Access to the new list type is simple: one merely uses braces ({})
as opposed to parentheses (()) to define the list. There is also
an additional internal function, flist, which behaves like list but
returns FLISTs. car, cdr, cons, etc. all perform as normal on FLISTs,
returning FLISTs where appropriate.
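The spawn-and-block mechanism described above can be sketched in portable C with POSIX threads (a stand-in for uC++; all the names here - future_t, flist_execute, future_force, fib_demo - are illustrative, not the interpreter's actual identifiers):

```c
/* Sketch of the FLIST/FUTURE mechanism: executing an FLIST spawns a
   task and returns a future at once; forcing the future blocks until
   the spawned task deposits its result. Illustrative names only. */
#include <pthread.h>
#include <stdlib.h>

typedef struct future {
    pthread_mutex_t lock;
    pthread_cond_t  ready;    /* signalled when the result arrives */
    int             resolved; /* has the spawned task finished?    */
    long            value;    /* stands in for an arbitrary atom   */
} future_t;

typedef struct spawn_arg {
    future_t *fut;
    long (*fn)(long);         /* the FLIST body, reduced to a function */
    long arg;
} spawn_arg_t;

/* The spawned task: run the body, deposit the result, wake waiters. */
static void *evaluator(void *p) {
    spawn_arg_t *sa = p;
    long result = sa->fn(sa->arg);        /* may take a long time */
    pthread_mutex_lock(&sa->fut->lock);
    sa->fut->value = result;
    sa->fut->resolved = 1;
    pthread_cond_signal(&sa->fut->ready);
    pthread_mutex_unlock(&sa->fut->lock);
    free(sa);
    return NULL;                          /* the task can now suicide */
}

/* Executing an FLIST: spawn the task, return the future immediately. */
future_t *flist_execute(long (*fn)(long), long arg) {
    future_t *fut = malloc(sizeof *fut);
    pthread_mutex_init(&fut->lock, NULL);
    pthread_cond_init(&fut->ready, NULL);
    fut->resolved = 0;
    spawn_arg_t *sa = malloc(sizeof *sa);
    sa->fut = fut; sa->fn = fn; sa->arg = arg;
    pthread_t tid;
    pthread_create(&tid, NULL, evaluator, sa);
    pthread_detach(tid);
    return fut;
}

/* A function that uses the result blocks here until it is ready. */
long future_force(future_t *fut) {
    pthread_mutex_lock(&fut->lock);
    while (!fut->resolved)
        pthread_cond_wait(&fut->ready, &fut->lock);
    long v = fut->value;
    pthread_mutex_unlock(&fut->lock);
    return v;
}

/* The cycle-sucking fib from the example below, as an FLIST body. */
long fib_demo(long n) {
    return n <= 2 ? 1 : fib_demo(n - 1) + fib_demo(n - 2);
}
```

Note that define never calls future_force, which is why (define a {fib 20}) returns immediately while (+ a 5) blocks.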
Example
Say you have your standard poorly implemented Fibonacci function,
so chosen as it's a good cycle sucker:
-> (expand~ fib)
I read:
( expand~ fib )
Symbol fib resolves to alias of:
Symbol ~36 resolves to function taking n
Body is:
( cond ( ( <= n 2 ) 1 ) ( else ( + ( fib ( - n 1 ) ) ( fib ( - n 2 ) ) ) ) )
Result:
( )
Then, one can define a variable to be, say, the 20th Fibonacci number.
Of course, this takes a few seconds to calculate, so you decide to
add 4 to 7 in the meantime:
-> (define a {fib 20})
I read:
( define a { fib 20 } )
Result:
a
-> (+ 4 7)
I read:
( + 4 7 )
Result:
11
-> (expand~ a)
I read:
( expand~ a )
Symbol a resolves to alias of:
< A FUTURE>
Result:
( )
The command line was returned almost immediately after the define statement.
Thus, the (+ 4 7) was done while the other thread was still processing
the future. The expand~ shows the type of the future, which cannot yet
be known. Finally, you decide you need to use a, so you do so:
-> (+ a 5)
I read:
( + a 5 )
Result:
6770
Upon receiving the (+ a 5), the + operator (on the main thread, which
is responsible for executing command lines) will block on trying to
resolve a's actual value. Thus, there is a ten-second wait (not shown :>)
whilst the Fibonacci thread finishes, at which point the addition
is performed and the result printed.
Interesting Optimizations
Trying to get maximum concurrency out of an environment which
has a global symbol table proved interesting. Some of the
techniques I used are:
- Atomicity of Symbol table updates
- All symbol tables are shared between evaluators (especially
the global table). To allow multiple readers without the hassle
of a generalized readers/writers problem, only writers are mutexed,
and they are required to make only atomic updates; i.e., the symbol
table must be valid (with no missing entries) for the duration
of the update. As one does not know whether someone is currently
accessing a symbol entry, redefinitions leave the old entry at the end
of the list, with a warning. Also, as readers are not mutexed, adaptive
methods (e.g., move-to-front) are unavailable to optimize the global
symbol table.
Thus, a simple 256-entry hash table is used to track the global
symbol table, and a singly linked list for local tables (the cost
of managing the hash table outweighs the loss to search times
at this level, as a local table usually consists of fewer than half
a dozen entries).
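The update discipline above might look like this in plain C (C11 atomics and a pthread mutex stand in for whatever uC++ provides; every name here is illustrative, not taken from the interpreter):

```c
/* Sketch of the global symbol table scheme: a 256-bucket hash table
   where readers never lock and writers serialize through a mutex,
   publishing each new entry with a single atomic pointer store so the
   table is valid at every instant of the update. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 256

typedef struct entry {
    const char   *name;
    long          value;
    struct entry *_Atomic next;
} entry_t;

static entry_t *_Atomic table[BUCKETS];
static pthread_mutex_t write_lock = PTHREAD_MUTEX_INITIALIZER;

static unsigned hash(const char *s) {
    unsigned h = 0;
    while (*s) h = h * 31 + (unsigned char)*s++;
    return h % BUCKETS;
}

/* Readers: no lock, no move-to-front -- just walk the chain. */
long *symtable_lookup(const char *name) {
    for (entry_t *e = atomic_load(&table[hash(name)]); e;
         e = atomic_load(&e->next))
        if (strcmp(e->name, name) == 0)
            return &e->value;
    return NULL;
}

/* Writers: mutexed. A redefinition inserts a fresh entry at the head;
   the stale entry stays further down the chain, since an unlocked
   reader may still be looking at it. */
void symtable_define(const char *name, long value) {
    entry_t *e = malloc(sizeof *e);
    e->name = name;
    e->value = value;
    pthread_mutex_lock(&write_lock);
    unsigned b = hash(name);
    atomic_store(&e->next, atomic_load(&table[b]));
    atomic_store(&table[b], e);   /* the one atomic publish */
    pthread_mutex_unlock(&write_lock);
}
```

Lookup always finds the newest definition first, which is what makes leaving the old entry behind harmless for correctness (if wasteful, hence the garbage-collection item in the to-do list below).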
- Allocation of Atoms
- No small amount of time is spent creating and destroying
atoms in this interpreter. Instead of suffering a large loss
to concurrency from shoving all such operations through a
mutexed method, each evaluator is given its own atom heap. The
first 128 atoms are also allocated on the evaluator's stack to
improve performance further.
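A minimal sketch of such a per-evaluator atom heap in C (illustrative names and a deliberately tiny atom struct; the real interpreter's atoms are of course richer):

```c
/* Per-evaluator atom allocation: the first 128 atoms live inside the
   evaluator itself (so on its stack, when the evaluator is), after
   which atoms come from a private free list. Because each evaluator
   owns its heap outright, allocation needs no mutex at all. */
#include <stdlib.h>

#define STACK_ATOMS 128

typedef struct atom {
    int          type;
    struct atom *next;   /* free-list link while the atom is unused */
} atom_t;

typedef struct evaluator {
    atom_t  stack_atoms[STACK_ATOMS]; /* cheap first allocations */
    int     stack_used;
    atom_t *free_list;                /* recycled heap atoms */
} evaluator_t;

atom_t *atom_alloc(evaluator_t *ev) {
    if (ev->stack_used < STACK_ATOMS)      /* fast path: no malloc */
        return &ev->stack_atoms[ev->stack_used++];
    if (ev->free_list) {                   /* reuse a freed atom */
        atom_t *a = ev->free_list;
        ev->free_list = a->next;
        return a;
    }
    return malloc(sizeof(atom_t));         /* last resort */
}

void atom_free(evaluator_t *ev, atom_t *a) {
    /* Stack atoms die with the evaluator; heap atoms are recycled. */
    if (a >= ev->stack_atoms && a < ev->stack_atoms + STACK_ATOMS)
        return;
    a->next = ev->free_list;
    ev->free_list = a;
}
```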
Code
Here's the code to uscheme, along with descriptions of what the files
are responsible for. Feel free to send any comments to my contact
address way below. (Actually, you'll have to follow the link there
to my root homepage, where an actual contact address can be found.)
If you should feel the strange desire to redistribute
modified versions of this (or the compiled results), you must gain
my consent.
- Makefile
- Standard uC++ makefile, single processor, etc.
- defines.h
- Catch-all include file, also has scheme related definitions.
- atom.C,
atom.h
- Allocation and manipulation routines for atoms. Also has the FUTURE
implementation.
- evaluate.C,
evaluate.h
- Implementation of the tasks which perform all the atom evaluation.
- frontend.C
- The user interface - Command line reader & echoer.
- reaper.C,
reaper.h
- The reaper task, used to allow finished evaluators to suicide.
- symtable.C,
symtable.h
- Implementation of the symbol tables used to look up symbols. Also
the handling of chaining different symbol stacks together without
excessive copying.
- ugeneric.C,
ugeneric.h
- Catch-all utility functions. A few (FreeMem, GetMem) are quite silly
in this form, as they've been stripped down for uC++ compatibility &
thread safety.
Future of concurrent scheme
For now, I am concentrating on ensuring it actually works. As a significant
portion was rewritten, there is potential for minor errors to have been
introduced in seldom-used functions.
Some miscellaneous things:
- Delay shutdown until all futures resolved - causes a crash now.
- Allow redefinition of symbols in global scope with performance
penalty to redefinition, but removal of previous symbol entry.
- Type-safe move-to-front heuristic.
- Garbage collection of unused temporary global variables.