zondag 27 april 2014

Multiplexers in the FPGA

Multiplexers are important components in an FPGA. While discrete logic and microprocessors would use tri-state buffers for connecting driving components to the same bus, this is not used any more in FPGA's. It is necessary to think about the system and to connect components which need to go to the same destination through a multiplexer. This link shows all the possible ways to define a multiplexer in VHDL.

What is further interesting, however, is how multiplexers are implemented on an FPGA. XAPP522, "Multiplexer Design Techniques for Datapath Performance with Minimized Routing Resources" is a nice document from Xilinx which shows how multiplexers are formed on the FPGA after VHDL synthesis and translation. Basically, it comes down to a function of a lookup table. It is, however, not necessary to design the multiplexer down to this level of detail using a LUT6 component, except in special cases.

Using the standard VHDL statements to implement a multiplexer in VHDL, which are generic, leads to the same results. The generated result will also use a LUT6. However, the standard VHDL code can be used in any FPGA, while the one based upon the LUT6 component can only be used in certain Xilinx FPGA's.

To see the real result of the synthesis using Xilinx design tools, "View Technology Schematic" should be chosen, and not "View RTL Schematic". In the technology schematic it can really be seen how a LUT is used on the FPGA, while the RTL schematic is more symbolic (which put me on the wrong foot when designing an ALU).Using "View RTL Schematic" for a multiplexer does not even give expansion possibilities for the multiplexer block.

The simplest way to define a multiplexer in a design is to implement it using CASE in a process statement, or as a WHEN statement.

  WITH SEL SELECT
    ISUM <=
    STD_LOGIC_VECTOR(UNSIGNED(IA) + UNSIGNED(IB)) WHEN "00",
    STD_LOGIC_VECTOR(UNSIGNED(IA) - UNSIGNED(IB)) WHEN "01",
    STD_LOGIC_VECTOR(SIGNED(IA) + SIGNED(IB))     WHEN "10",
    STD_LOGIC_VECTOR(SIGNED(IA) - SIGNED(IB))     WHEN "11";


Doing it this way, all bus-widths are always automatically accounted for. It is not really necessary to define a multiplexer as a generic component.

Symbol

To have a certain standard in drawing the system, IEEE based symbols are used. This gives consistency over all components. Multiplexers are designated with the following symbols:

4 to 1 multiplexer










8 to 1 multiplexer
The following schematic is the technology view generated by ISE for a 1-bit 4 to 1 multiplexer.
Four input bits of the lookup table are used as the data inputs, and two other bits are used as the selector bits.

donderdag 25 juli 2013

Working simulator

Revision 292 of the source code finally contains a working simulator. This simulator contains the following components:
  • A simple CPU which should relatively easy to implement using RISC principles
  • 64 kB memory
  • A simple tty output device, connected to standard output
  • A simple tty input device, connected to standard input
The software architecture of the system is built around two queues.

The first queue is an event queue on which all system operations are put and executed. This is done through callbacks. Every callback must have its own lexical state and return 0. Only the callback which executes the CPU returns 1 or 2. 1 signifies that the CPU has entered HALT state, while 2 just notifies that a CPU step has been executed. This makes it possible to know that the simulator has executed one cycle, which can consist of a whole series of events. This is necessary for single-step functionality.

The second queue is a timer event queue. On this queue a count value and a callback are placed. The callbacks here are ordered ascending. On the event queue a count down event is placed, so that every cycle the timers are decreased. For timers which reach zero, the associated callbacks are placed on the event queue and executed from there.

The system is complete with an assembler. It uses s-expression syntax and is written in Common Lisp. Using pattern matching functionality from Common Lisp makes it possible to easily extend the assembler with higher level functions which are translated to assembler.

In the directory asm/util there is also some Emacs Lisp code which makes it possible to point at a directory where Emacs can find assembly code, which can then be edited, assembled and loaded and executed onto the simulator. This should improve the turnaround time for developing programs.

With this system set up, it now becomes possible to develop the same microprocessor on a Digilent Atlys board. The simulator makes it possible to test programs first before they are loaded onto the hardware.

The simulator has been tested with CLISP and SBCL. On a MacBook Pro it provides the following performance:

Implementation Instructions/s
CLISP, interpreted 4300
CLISP, compiled 110000
SBCL 2000000

Since SBCL always compiles before it executes, there is no need to make the same distinction as for CLISP.

zondag 30 december 2012

This project is a kind of research project for gaining insight in the original list processing used in Lisp, together with the memory management of it. For this, a small C library should be set up, which manages a relatively small amount of memory (e.g. 16k bytes, but parametrised), to be used for cons'ing lists the Lisp way, and then building upon that to check ideas, to check fragmentation, what would be the influence of adding new data types of different size, and so on.

The evolution of it should serve to guide the implementation and optimisation of a register-based processor on an FPGA platform. If possible, it should also guide the evolution of a Lisp-like system compiler (like C now is).

There are two parts necessary, which should be simplified as much as possible. In order to create lists in the available memory space, it is necessary to describe them in an external language, really called Lisp. At that point, the resemblance ends, because no interpretation of the lists will be done. At the start, it will really be only
description of listinternal representation of list.

The internal representation will also be used to investigate the usage of tagged data types, which in C translate into the usage of unions.

What would then be the main points of the research project, in order to execute it as a project?
Define S-expressions and their basic semantics
This part will define the minimum representation necessary in order to write S-expressions. The basic semantics is about the entities that need to be defined in order to represent meaningful data.
Define a basic internal representation
This is a studying phase, where the basic ways that are currently used to represent S-expressions and internally used data structures are be taken stock for further usage in the next phase.
Implementation of the internal representation
This part builds from the previous one an implementation.
Initialization of the environment
Here the necessary environment and its pointers will be prepared for usage.
Allocation and deallocation of cells
This part concerns the algorithm for using the bofl pointer to return a fresh cell, and the eofl pointer to return used cells back to the free list.
Build a translator for S-expressions
This part will be about reading an input stream and validating this input as correct S-expressions or a list of S-expressions, and transforming them into data structures.
Map symbols and strings onto internal list representation
In this part, the interpretation of the S-expressions and the mapping to their internal representations are taken together in order to build a working implementation of a system that can take either a manually entered expression or a file, and use that to build the representation in reality. This will do the allocation in the simplest way possible.
Map lists onto internal list representation
Continuation of the previous part
Printing expressions
It should be possible to turn all kinds of expressions back into a readable form.
Extended diagnostics
It is difficult to debug the reading and writing of generalized expressions.
Build an evaluator
After reading and printing S-expressions has been finished, this is the next part: Running a program as entered by the translator.
The following two parts should be taken into account long term, but are not really necessary to build and test the evaluator.
Memory management
This part should build further upon the previous one in order to look at ways that the usage of memory can be optimised, and to see what effects this has on the run-time speed of all the algorithms.
Extended data types
The basic data types used will probably map on a very limited base type. The investigation here will add other data types, both with fixed and varying size, in order to see what needs to be changed to the previous implementation on how memory needs to be handled, and what its run-time effects will be.

Results (4 December 2012)


Looking through the above article topic by topic, it seems that the stated goals all have been reached, except the garbage collection. But
  • A small C library has been set up, which is used to manage cell memory
  • All management of data structures was done through use of cons, car, cdr, rplaca, rplacd
  • After implementing most of it, the design of a simple 16 bit ISA was started
  • Everything was described using Lisp S-expressions
  • A union type was used to represent the basic data types in C
  • An evaluator was built and extended

After this research, an assembler for a simple 16 bit processor was implemented in Lisp, and using Lisp structures as assembler statements. This makes it possible to use the list manipulation possibility of Common Lisp to easily add new structures, which are immediately written as basic assembler before being translated.
After that, a simulator for the 16 bit processor was implemented, also in Common Lisp.
The next part is effectively starting again the implementation of the system, but now in assembler. Since everything is based upon Lisp structures, the goal is to develop patterns which can be used to describe a Lisp system in terms of Lisp and going down to assembler, by rewriting higher-level structures in lower level structures.

maandag 1 oktober 2012

When I started this blog, I did not know yet that I would do something completely different from 2010 to 2012. In the summer of 2010, I decided to get my MSc degree in Electronics/ICT, what would be the closest that I could get to a degree in software engineering. An additional benefit was that I also got a fairly thorough introduction into VHDL. I used this with this board to also build my master's thesis (there weren't any interesting topics in SE, unfortunately).

Last week I got my degree then officially.

Since a week or two I have been coding a simple environment for encoding S-expressions. It currently consists of the following parts:
  • A system for managing cells. Each cell is either a CONS or part of a string. All cells are consistently 4 bytes in size. There are two functions here: one to get the address of a new cell, and one to return a cell back to the free list.
  • Functions to build a string or a symbol
  • Functions to build lists
  • A test harness with unit tests
One purpose of this part is only to encode S-expressions into a list or tree representation. Processing of this structures will come later. Another purpose was to check if it was possible to implement all this without stop-and-go garbage collection.

The mechanism used to avoid this is that all cells are at the beginning initialised to belong to a free list. This is a singly linked list with a pointer to the beginning and a pointer to the end of it. Allocation of a cell will always return a cell at the beginning of the list, while deallocation will append the passed cell to the end of the list. The pointers at the beginning and the end will be adjusted accordingly. All other functions build on top of this mechanism. In the development, the usage of unit tests is beneficial. Several defects have been detected through those, even in the later stages of development.

This comes of course at a price. One should make sure that no cyclic structures are created. In a stop-and-go collector it is much easier to clean up such structures.

Another price is that strings use double the space, since each string is composed of a cell with a pointer to the next cell of the string and two characters. Optimising this could be done with a separate pool for strings, but that would double the work needed. Now there is only one mechanism which can be used for everything.

The next part to code will then be the S-expression encoder and decoder. It should not only be possible to encode an expression like

(loop
    for i in (quote (1 2 3 4 5 6 7 8 9 10))
    do (print i))
 

But also to print an S-expression based upon the internal list structure.

dinsdag 6 april 2010

Goal of this journal

While having kept another journal for some time, about problems and solutions, it is probably time to start a journal more specific about the implementation of SSIM-2.

At this point in time, I do have the following available :

- Simulation of certain simple system consisiting of :
- a 32-bit ISA
- A keyboard
- An output device
- A simple system front-end
- An assembler
- A basic software architecture consisting of
- A simple raw memory allocation and control layer
- A definition of basic data types
- Some higher level object factory and garbage collection
interface

What I will be starting on is the implementation of this library using assembler and the call conventions which should be generated by default from the compiler in the future.

The problem is a little bit the same as with Linux. Development consists of a binary image which can be loaded into memory and which takes over the complete system. Before it is finished there will probably be some changes along the way in the assembler, because it should be possible to add these predefined functions as parts of the Lisp symbol table (packages or obtable).

In the meantime, I have obtained some more experience with Common Lisp, and one of the things I learnt is that an image can be enhanced by loading the necessary packages and then dumping it. This obviates the problem of always initialising the standard version of Common Lisp used with the necessary packages.

This means that a development image can be set up which contains :

- a 32-bit ISA
- A keyboard
- An output device
- A simple system front-end
- An assembler

and that these can be accessed and used at will, while always returning to the CL REPL when finished. Ah, well, one problem remains, that is raw input and output must be enabled externally, and if this is done, then normal input is not available.

Currently the main purpose would be to read and assemble the system definitions, having the possibility to load them into the simulator, run them and have the possibility that the system itself formats and displays its output.

It would also be wise to have a standard procedure to rebuild the image when the definitions of the above building blocks change.

donderdag 18 februari 2010

Introduction

This is my attempt at a public blog about a project that I basically started a couple of years ago. Publishing a blog, as far as I know, is like writing. I like to read a good written book, and I like to publish my ideas the same way. So this blog is an attempt to become a better writer, but also an attempt to get people interested in my project : designing and building a computer system and use an adapted Lisp as system language.

The things I currently have built are :
  • A 32-bit ISA with 32 registers
  • A simulator for the above ISA
  • A Lisp like assembler
  • Lots of documentation
I am currently busy on a compiler for an s-expression based language, which is to become the system language for the system. This language currently only supports 32-bit variables, no arrays and no structures.

When this compiler can be used to start writing system software which can be run on the simulator, it becomes time to learn FPGA design and buy a board which can accomodate the minimum things needed for implementing a computer system : keyboard, console, memory and storage. Using this, it should be possible to move from the simulator to a real, FPGA based, computer system.

And that describes somewhat the bounds of the project.