Hello HN, here is a project that I have recently finished: a very niche
CPU written in [VHDL][] and verified to work on an [FPGA][]. It is a 16-bit [bit-serial][]
CPU, which means the processor is incredibly slow, taking 102 clock cycles to complete
some instructions. The trade-off is that the CPU is very small, almost free to implement
in terms of floor space on the FPGA: the entire project takes just 73 slices, with the
CPU itself taking 23 slices.
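For readers unfamiliar with the term, the idea behind a bit-serial datapath can be sketched in C as below. This is only an illustration of the general technique, not the VHDL from this project: the names `serial_adder_t`, `serial_adder_step` and `serial_add16` are made up for the example. The ALU is assumed to be a one-bit full adder plus a single carry flip-flop, and a 16-bit addition is done least-significant bit first, one bit per clock cycle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 16 /* word width from the post; everything else is illustrative */

/* Hypothetical 1-bit serial adder: the only state is one carry flip-flop. */
typedef struct {
    unsigned carry;
} serial_adder_t;

/* One clock cycle: add one bit of each operand plus the stored carry. */
static unsigned serial_adder_step(serial_adder_t *s, unsigned a_bit, unsigned b_bit) {
    assert(a_bit <= 1u && b_bit <= 1u);
    unsigned sum = a_bit ^ b_bit ^ s->carry;
    s->carry = (a_bit & b_bit) | (s->carry & (a_bit ^ b_bit));
    return sum;
}

/* Add two 16-bit words, least-significant bit first, one bit per cycle.
 * If cycles is non-NULL it receives the number of cycles spent. */
static uint16_t serial_add16(uint16_t a, uint16_t b, unsigned *cycles) {
    serial_adder_t s = { 0 };
    uint16_t result = 0;
    unsigned n = 0;
    for (unsigned i = 0; i < WORD_BITS; i++, n++) {
        unsigned bit = serial_adder_step(&s, (a >> i) & 1u, (b >> i) & 1u);
        result |= (uint16_t)(bit << i);
    }
    if (cycles != NULL)
        *cycles = n;
    return result;
}
```

With this sketch, `serial_add16(0xFFFF, 1, &cycles)` wraps around to 0 and sets `cycles` to 16, so the addition alone costs one cycle per bit. The 102-cycle figure quoted above covers whole instructions, and the sketch does not attempt to reproduce it.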
Very cool! For those of us less versed in FPGA, do you have an idea of how many transistors would be required to build this CPU? I have long been curious about the minimum number of transistors needed to build the smallest "useful" CPU.
I do not think I could give an accurate conversion between FPGA slices and transistors myself. It is certainly small (23 slices), and processors with similar capabilities ran in the range of thousands of transistors or tubes. You should look into more esoteric processors such as Forth CPUs, other bit-serial CPUs, CPUs built around single instructions (such as SUBLEQ) and new CPUs built out of 7400 series parts.
Thanks! Making the initial (working) version of the CPU was fairly easy. However, I wanted to do more than just write in low-level assembly and blink lights on and off, so keeping the system small whilst being able to support a higher-level language like Forth actually took most of the time.
Speaking of single-bit ALUs, the original Connection Machine design was for 1024 single-bit CPUs in a full crossbar. I don’t believe that architecture was actually built.
CM1 had 65536 single-bit processors. They were not connected in a crossbar; instead, the chips that were the fundamental building block of the supercomputer each contained 16 of those bit-serial processors, and 4096 of those chips were connected in a 12-dimensional hypercube. The total dimension of the hypercube was 16, but the 16 on-chip processors had local routing.
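To make the numbers concrete, here is a small C sketch of hypercube addressing using the figures above (4096 chips, 16 processors per chip). It is an illustration only, not the actual CM-1 router: the split of a 16-bit processor number into a high 12-bit chip address and a low 4-bit on-chip index is an assumption, and the function names are invented for the example. In a hypercube, two nodes are neighbours when their addresses differ in exactly one bit, so the hop count between chips is the Hamming distance of their addresses.

```c
#include <assert.h>
#include <stdint.h>

enum {
    CHIP_DIMS  = 12, /* 2^12 = 4096 chips in the hypercube */
    LOCAL_BITS = 4   /* 2^4  = 16 processors per chip      */
};

/* Assumed layout: the high 12 bits select the chip, the low 4 bits the
 * processor on that chip. */
static uint16_t chip_of(uint16_t processor) {
    return (uint16_t)(processor >> LOCAL_BITS);
}

static uint16_t local_of(uint16_t processor) {
    return (uint16_t)(processor & ((1u << LOCAL_BITS) - 1u));
}

/* Neighbouring chip along one hypercube dimension: flip a single bit. */
static uint16_t chip_neighbor(uint16_t chip, unsigned dim) {
    assert(dim < CHIP_DIMS);
    assert(chip < (1u << CHIP_DIMS));
    return (uint16_t)(chip ^ (1u << dim));
}

/* Minimum number of chip-to-chip hops: the Hamming distance. */
static unsigned chip_hops(uint16_t from, uint16_t to) {
    unsigned diff = (unsigned)(from ^ to);
    unsigned hops = 0;
    while (diff != 0) {
        hops += diff & 1u;
        diff >>= 1;
    }
    return hops;
}
```

Under these assumptions no two chips are more than 12 hops apart, and 12 chip bits plus 4 local bits give the 16-bit processor number, which matches 2^16 = 65536 processors.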
The Connection Machine is what I was going to mention, but you get some of the details wrong. As another commenter has noted, single-bit processors were connected roughly as a 2ⁿ hypercube (hence the commercialised CM-2’s tesseract enclosure) and not by a crossbar.
CM-5 had a dual connection architecture as well as featuring SPARC CPUs on the nodes.
There were a few bit-serial economy designs going into the transistor era -- the latest I'm aware of is the PDP-8/S (S for Serial) from 1967. If you want to build a really cheap machine and transistors are $1 each (closer to $8 current dollars after inflation), this is how you do it. https://spectrum.ieee.org/tech-talk/semiconductors/devices/h...
A historical note. The earliest bit-serial processors (EDVAC, EDSAC) were motivated by their serial memory. They stored data as ultrasonic pulses sent through tubes of mercury. Since you get the bits out of memory serially, it is easiest to do arithmetic serially too.
The Datapoint 2200 desktop computer (whose architecture became the Intel 8008) used a serial processor for similar reasons, except this time the processor was built from TTL chips instead of tubes and the memory was Intel shift-register memory chips. (Shift-register memory could be built more cheaply than RAM, but with the disadvantage that you had to wait for the bit you wanted to circulate through the shift register.)
Not really. I was only inspired by their existence and nothing else. In retrospect it might have been nice to copy the instruction set of something like a PDP-8/S and get software for one of them running on it, but I like having something that is my own.
The cross-compiler and the cross compiled program, a [Forth][] interpreter, are available at:
<https://github.com/howerj/bit-serial/blob/master/bit.fth>
It compiles down to an image that uses just 4802 bytes (out of 16KiB).
A C simulator, for those who would like to try out the Forth interpreter but lack an FPGA, is available at:
<https://github.com/howerj/bit-serial/blob/master/bit.c>
You can type:

    make run

to build the C simulator and run it on a pre-compiled image. Typing 'words' and hitting return shows you a list of all defined functions.
In all likelihood the project will not have that much utility to anyone, but I have wanted to make a bit-serial CPU since completing my previous FPGA project, because the architecture is quite rare nowadays and might be something of a curiosity.
[bit-serial]: https://en.wikipedia.org/wiki/Bit-serial_architecture
[Forth]: https://en.wikipedia.org/wiki/Forth_(programming_language)
[FPGA]: https://en.wikipedia.org/wiki/Field-programmable_gate_array
[VHDL]: https://en.wikipedia.org/wiki/VHDL