This document covers how Squeak/Pharo works at the lower level, that is, the Virtual Machine. If you are looking for documentation about how to program in the Squeak environment, you should go to www.squeak.org. I wrote this guide because I couldn't find many docs explaining this, just a bunch of separate, seemingly abandoned, old and not-detailed-enough articles scattered across Squeak's wiki. One of the best I found is this one about VMMaker. If you think something is missing, please leave a comment so I can fix it and make this guide better.
Here I'll try to show how to compile only the base VM, without any tweaking, because I think it is the most difficult part if you want to make a custom one. I hope that after reading these lines you'll be able to make any modifications you like to the VM while knowing what you're doing. The guide is written for the Unix VM, but the main concepts should be useful on any OS. You may also want to know that the Squeak VM is now exactly the same as the Pharo one. Some time ago Pharo introduced the need for VM support for something called full closures, but that was added to the standard Squeak VM, so now even Squeak has support for them (thanks to David T. Lewis for that info).
How squeak works internally
When you run an instance of Squeak, you are actually running a virtual machine, with the specified image passed as an argument. Squeak is made of a VM in the sense that its executable consists of an interpreter, which has its own instruction set (called its bytecode) to which all Smalltalk methods are compiled. When a Smalltalk method needs to be executed, the virtual machine looks up the method's bytecodes and interprets them one by one. You can see the bytecodes within the image; try exploring
SmallInteger >> #timesRepeat:
This will show you the instance of CompiledMethod that gets executed when you do something like 20 timesRepeat: aBlock. You can even see the bytecodes there. The term Virtual Machine has been given diverse meanings, as you can read here, so when talking about Squeak's virtual machine I prefer to use a term taken from A Tour of the Squeak Object Engine, which is Object Engine (OE), because I think it fits much better and is more appropriate. So I'll use it from now on.
Where is Squeak's source?
The OE you execute each time you run Squeak is programmed half in Smalltalk and half in C. It would be nice to have all of it written in Smalltalk, but for many reasons that's neither possible nor desirable today. That's because part of the code consists of platform-specific support files, so it's easier and better to have them written directly in C. I call this code the handwritten C part (HC). The non-C part of the OE is all written in Smalltalk, or to be more precise, in Slang, which is a subset of Smalltalk that can be easily translated to C. I call this the Slang code (SC).
Before compiling an entire working OE, you need to translate the Slang code to C using VMMaker, or download an already translated copy of it. I call this C code the automatically generated C code (GenC).

So, what does the entire process of building an OE look like in practice? It basically consists of two steps: 1 - Gathering all the C code, and 2 - Compiling everything.
The detailed steps are these:
-
Gathering all the OE C code.
As I said, the code is divided into handwritten C code (HC) and Slang code (SC) that gets translated to C (GenC). The HC part can be downloaded directly from www.squeakvm.org. In some cases, you may not need the Slang code because the package that contains HC also contains a copy of the GenC made from some SC. As the GenC code is platform-specific, it's only available for some platforms (Linux and Windows only, I think).
You can download the HC source as a zip or use the latest version from svn (which also includes GenC for Linux):
$ svn co http://squeakvm.org/svn/squeak/trunk squeak
There are many situations where you want to use your own GenC code, and in those cases you use VMMaker. You won't find the Slang code inside Squeak's default image; it is actually included in the VMMaker package, which you have to download separately. So VMMaker includes two things: everything you need to generate C from Slang, and also Squeak's SC part. Notice that to generate C from Slang you'll need an already working OE and image.
If you are using Squeak, you can download VMMaker from Package Universe Browser, which will download all dependencies for you. In case you use Pharo, I recommend downloading VMMaker from the www.squeaksource.com Monticello repository. The latest version of VMMaker should be downloaded directly from Monticello at www.squeaksource.com, and not from Universe or SqueakMap, because those have versions that are very old now. Another alternative is to use an image already prepared to generate C. You can download them from www.squeakvm.org.
Once you have the VMMaker package, you'll use VMMakerTool to generate the files. To open the VMMakerTool, left click background -> open... -> VMMaker. But before using it, let's see where the HC files are and where the GenC files will be placed. The VM HC sources use this structure:
- /platforms/
  - /Cross/
  - /Mac OS/
  - /RiscOS/
  - /unix/
  - /win32/
with the obvious meaning. Generated files may theoretically be placed anywhere you want, but a tested location (in the case of Unix) is /platforms/unix/my-gen-src or something like that. Actually, the GenC files that come bundled for Unix are placed in /platforms/unix/src.
- /platforms/
  - ...
  - /unix/
    - ...
    - /my-gen-src/
-
Compiling everything
Now you are ready to compile everything. The instructions I write next are more or less what I found in /platforms/unix/README.cmake plus my own experience. To do so, create a folder next to platforms called build, or bld, or whatever.
$ mkdir build
$ cd build
Then, run the configure script:
$ ../platforms/unix/cmake/configure --src=../platforms/unix/my-gen-src
You don't need the --src flag if the GenC code you want to use is in /platforms/unix/src.
To finish, call make, which will generate the binaries.
$ make
In case you were wondering, this won't generate any .image, .sources or .changes files. These are independent of the OE and should be downloaded from squeak's site.
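The compile steps above can be collected into a small shell function. This is only a sketch: build_vm and its two arguments are names I made up for this example, it assumes the folder layout described earlier (a platforms folder with unix/cmake/configure inside it), and it does no more checking than what you see here.

```shell
#!/bin/sh
# Sketch only: wraps the mkdir / configure / make steps described above.
# build_vm and its arguments are hypothetical names, not part of the Squeak sources.

# Usage: build_vm SOURCE_ROOT [GEN_SRC_DIR]
#   SOURCE_ROOT  folder that contains platforms/
#   GEN_SRC_DIR  generated sources, relative to the build folder
#                (leave it out to use the bundled platforms/unix/src)
build_vm() {
    root=$1
    gen_src=${2:-}

    # Stop early if this does not look like a VM source tree.
    if [ ! -x "$root/platforms/unix/cmake/configure" ]; then
        echo "configure script not found under $root/platforms/unix/cmake" >&2
        return 1
    fi

    # The build folder lives next to platforms/, as in the manual steps.
    mkdir -p "$root/build" || return 1

    # Run in a subshell so the caller's working directory is left untouched.
    (
        cd "$root/build" || exit 1
        if [ -n "$gen_src" ]; then
            ../platforms/unix/cmake/configure --src="$gen_src" || exit 1
        else
            ../platforms/unix/cmake/configure || exit 1
        fi
        make
    )
}
```

For example, running build_vm . ../platforms/unix/my-gen-src from the folder that contains platforms would use your own GenC, while build_vm . alone would fall back to the bundled /platforms/unix/src.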
See you soon!