This project is a kind of research project for gaining insight in the original list processing used in Lisp, together with the memory management of it. For this, a small C library should be set up, which manages a relatively small amount of memory (e.g. 16k bytes, but parametrised), to be used for cons'ing lists the Lisp way, and then building upon that to check ideas, to check fragmentation, what would be the influence of adding new data types of different size, and so on.
The evolution of it should serve to guide the implementation and optimisation of a register-based processor on an FPGA platform. If possible, it should also guide the evolution of a Lisp-like system compiler (like C now is).
There are two parts necessary, which should be simplified as much as possible. In order to create lists in the available memory space, it is necessary to describe them in an external language, really called Lisp. At that point, the resemblance ends, because no interpretation of the lists will be done. At the start, it will really be only
description of list →
internal representation of list.
The internal representation will also be used to investigate the usage of tagged data types, which in C translate into the usage of unions.
What would then be the main points of the research project, in order to execute it as a project?
- Define S-expressions and their basic semantics
- This part will define the minimum representation necessary in order to write S-expressions. The basic semantics is about the entities that need to be defined in order to represent meaningful data.
- Define a basic internal representation
- This is a studying phase, where the basic ways that are currently used to represent S-expressions and internally used data structures are be taken stock for further usage in the next phase.
- Implementation of the internal representation
- This part builds from the previous one an implementation.
- Initialization of the environment
- Here the necessary environment and its pointers will be prepared for usage.
- Allocation and deallocation of cells
- This part concerns the algorithm for using the bofl pointer to return a fresh cell, and the eofl pointer to return used cells back to the free list.
- Build a translator for S-expressions
- This part will be about reading an input stream and validating this input as correct S-expressions or a list of S-expressions, and transforming them into data structures.
- Map symbols and strings onto internal list representation
- In this part, the interpretation of the S-expressions and the mapping to their internal representations are taken together in order to build a working implementation of a system that can take either a manually entered expression or a file, and use that to build the representation in reality. This will do the allocation in the simplest way possible.
- Map lists onto internal list representation
- Continuation of the previous part
- Printing expressions
- It should be possible to turn all kinds of expressions back into a readable form.
- Extended diagnostics
- It is difficult to debug the reading and writing of generalized expressions.
- Build an evaluator
- After reading and printing S-expressions has been finished, this is the next part: Running a program as entered by the translator.
The following two parts should be taken into account long term, but are not really necessary to build and test the evaluator.
- Memory management
- This part should build further upon the previous one in order to look at ways that the usage of memory can be optimised, and to see what effects this has on the run-time speed of all the algorithms.
- Extended data types
- The basic data types used will probably map on a very limited base type. The investigation here will add other data types, both with fixed and varying size, in order to see what needs to be changed to the previous implementation on how memory needs to be handled, and what its run-time effects will be.
Results (4 December 2012)
Looking through the above article topic by topic, it seems that the stated goals all have been reached, except the garbage collection. But
- A small C library has been set up, which is used to manage cell memory
- All management of data structures was done through use of
cons
, car
, cdr
, rplaca
, rplacd
- After implementing most of it, the design of a simple 16 bit ISA was started
- Everything was described using Lisp S-expressions
- A union type was used to represent the basic data types in C
- An evaluator was built and extended
After this research, an assembler for a simple 16 bit processor was implemented in Lisp, and using Lisp structures as assembler statements. This makes it possible to use the list manipulation possibility of Common Lisp to easily add new structures, which are immediately written as basic assembler before being translated.
After that, a simulator for the 16 bit processor was implemented, also in Common Lisp.
The next part is effectively starting again the implementation of the system, but now in assembler. Since everything is based upon Lisp structures, the goal is to develop patterns which can be used to describe a Lisp system in terms of Lisp and going down to assembler, by rewriting higher-level structures in lower level structures.