Persistence by Reachability
MagLev stores Ruby objects in a persistent repository using “persistence by reachability”: given a well-known persistent object, the “root object”, every object reachable from the root is also persistent (saved in the repository). The most common form of reachability is for one object to refer to another in an instance variable: ObjectA is reachable from ObjectB if ObjectB has ObjectA as the value of one of its instance variables. There are some other cases as well. An object’s class is reachable from the object (as are any mixed-in modules). Constants of a persistent class or module are also reachable from that class or module.
Not all objects in a MagLev VM are persistent, nor should they be. There are many, many objects that are created, used and thrown away during normal processing (items on the stack, temporary variables in methods or blocks, etc.). These objects will not be persisted to the repository, unless they become reachable from an already persistent object.
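The reachability rule can be sketched in plain Ruby. This is not MagLev's implementation, just a minimal illustration that runs in any Ruby: it starts at a root, follows instance variables and container contents, and collects everything it finds. The names reachable_from, Cage and root (a stand-in for a persistent root object) are made up for this sketch, and it ignores the class and constant cases mentioned above.

```ruby
# Plain-Ruby sketch, NOT MagLev internals: collect every object reachable
# from a root via instance variables, Array elements, and Hash keys/values.
def reachable_from(root)
  seen = {}.compare_by_identity   # track objects by identity, not by ==
  stack = [root]
  until stack.empty?
    obj = stack.pop
    next if seen.key?(obj)
    seen[obj] = true
    # Instance variables are the most common kind of reference.
    obj.instance_variables.each { |name| stack << obj.instance_variable_get(name) }
    # Containers also refer to their contents.
    case obj
    when Array then stack.concat(obj)
    when Hash  then obj.each { |key, value| stack << key << value }
    end
  end
  seen.keys
end

# Hypothetical class that holds a reference in an instance variable.
class Cage
  def initialize(occupant)
    @occupant = occupant
  end
end

root = {}                         # stand-in for a persistent root (assumption)
hat = ["Rabbit 1", "Rabbit 2"]
caged = "Rabbit 3"
under_couch = "Dust Bunny 1"

root[:hat] = hat
root[:cage] = Cage.new(caged)

closure = reachable_from(root)
puts closure.any? { |o| o.equal?(hat[0]) }        # true: root -> array -> string
puts closure.any? { |o| o.equal?(caged) }         # true: root -> Cage -> @occupant
puts closure.any? { |o| o.equal?(under_couch) }   # false: nothing refers to it
```

Only the dust bunny falls outside the closure, because the only thing that refers to it is a local variable.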
An example
To save a new object in the repository, you first need an already persistent object. MagLev provides one for just such purposes: Maglev::PERSISTENT_ROOT (a Ruby Hash). So, if you want to save your rabbits, but not your dust bunnies, you could do:
# example1.rb
# Create some objects
hat = Array.new
hat << "Rabbit 1"
hat << "Rabbit 2"
under_couch = "Dust Bunny 1"
# Connect the rabbits to a persistent object
Maglev::PERSISTENT_ROOT[:hat] = hat
# Save the changes to the repository
Maglev.commit_transaction
puts "Committed hat to the repository: #{Maglev::PERSISTENT_ROOT[:hat].inspect}"
Now we run the code:
$ maglev-ruby example1.rb
Committed hat to the repository: ["Rabbit 1", "Rabbit 2"]
Now, the hat and the two rabbits are saved in the repository and available to all VMs connected to the repository. The dust bunny still exists, but only in the original VM’s temporary object space. No other VM will see the dust bunny.
$ maglev-ruby -e 'p Maglev::PERSISTENT_ROOT[:hat]'
["Rabbit 1", "Rabbit 2"]
A more detailed look
For those of you who prefer pictures, here’s a recap. Before we execute $ maglev-ruby example1.rb, all we have is the repository with its previously committed objects (no VM has been started yet).
The repository contains some pre-built system objects (blue, Smalltalk) that are generally not available to Ruby. It also contains the core Ruby (red) objects: Object, Kernel, etc. The repository is started by maglev start or cd $MAGLEV_HOME ; rake maglev:start.
There are no VMs running in Figure 1, i.e., no place to actually run code or create new objects. A VM fires up only when we execute maglev-ruby. So maglev-ruby example1.rb first creates a new, empty VM.
Next, the VM connects to the repository and starts loading the objects it needs (similar to, but not the same as, class loading in a JVM). After a while, the VM might look like Figure 3.
Figure 3 is a bit misleading, since MagLev does not load every object in the repository, only the ones it needs. I haven’t bothered drawing the millions of other objects that would typically be in the repository. The VM then executes the following lines:
hat = Array.new
hat << "Rabbit 1"
hat << "Rabbit 2"
under_couch = "Dust Bunny 1"
The VM has created four new objects: an array, two rabbits (strings) and a dust bunny (string). These objects live in the VM and can interact with all of the other objects in the VM, but they are not (yet) persistent: they are not reachable from the set of persistent objects (objects on the brownish background). If the VM were to exit at this point, the array, rabbits and dust bunny would disappear.
The next line of code makes some of our objects reachable from a persistent object:
Maglev::PERSISTENT_ROOT[:hat] = hat
The persistent Hash Maglev::PERSISTENT_ROOT now has a reference to the array (as well as to the :hat symbol). Maglev::PERSISTENT_ROOT is a convenient (but not the only) place to connect objects to the persistent graph.
Maglev::PERSISTENT_ROOT directly references the array, and since the array references the two rabbits, they are also reachable from a persistent root and will be saved to the repository when we commit the transaction. The set of persistent objects in the system is the transitive closure of the objects reachable from the persistent root. The dust bunny is not referenced by anything (other than the variable under_couch), so it is not considered reachable from a persistent object.
Even though they are now reachable from a persistent object, the array and strings are still on the blue background. They will only become persistent when the current transaction is committed to the repository.
All persistent changes to the MagLev repository are done in an ACID transaction. This VM is the only VM that has a hat added to Maglev::PERSISTENT_ROOT. I.e., the current state of this VM is isolated (the “I” in “ACID”) from the views of other transactions. If a new VM were to fire up and load objects from the repository, its view of the repository would be the same as in Figure 3. Likewise, this VM does not see any changes to the repository other VMs might have committed (repeatable reads).
By default, MagLev starts each VM in auto-transaction mode. The VM starts a transaction (Maglev.begin_transaction) automatically before it starts running your code, and it starts a new transaction whenever you end the current one with either Maglev.commit_transaction or Maglev.abort_transaction. I.e., in auto-transaction mode, you are always in a transaction.
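Isolation and auto-transaction mode can be imitated with a toy model in plain Ruby. Nothing here is MagLev API: ToyRepository and ToySession are hypothetical names, deep copies via Marshal stand in for each VM's private view, and the sketch assumes that a new transaction begins from the latest committed state.

```ruby
# Toy model in plain Ruby; ToyRepository and ToySession are hypothetical
# names, and nothing here is MagLev's actual implementation.
class ToyRepository
  attr_reader :committed

  def initialize
    @committed = {}
  end

  # Replace the committed state with a deep copy of a session's view.
  def publish(view)
    @committed = Marshal.load(Marshal.dump(view))
  end
end

class ToySession
  attr_reader :root

  def initialize(repository)
    @repository = repository
    begin_transaction              # like auto-transaction mode: start in a transaction
  end

  # Take a private snapshot (deep copy) of the committed state.
  def begin_transaction
    @root = Marshal.load(Marshal.dump(@repository.committed))
  end

  # Publish the private view, then immediately start a new transaction.
  def commit_transaction
    @repository.publish(@root)
    begin_transaction
  end

  # Throw away the private view, then immediately start a new transaction.
  def abort_transaction
    begin_transaction
  end
end

repo = ToyRepository.new
vm_a = ToySession.new(repo)
vm_b = ToySession.new(repo)

vm_a.root[:hat] = ["Rabbit 1", "Rabbit 2"]
p vm_b.root[:hat]          # nil: vm_a has not committed yet
vm_a.commit_transaction
p vm_b.root[:hat]          # nil: vm_b still reads its own snapshot
vm_b.abort_transaction     # ends vm_b's transaction and starts a fresh one
p vm_b.root[:hat]          # ["Rabbit 1", "Rabbit 2"]
```

The second nil is the repeatable read described above: vm_b keeps seeing the state it started with until it ends its own transaction.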
Finally, we commit our changes, which will write the array and rabbits to the repository, leaving the dust bunny…in the dust…
Maglev.commit_transaction
puts "Committed hat to the repository: #{Maglev::PERSISTENT_ROOT[:hat].inspect}"
Maglev.commit_transaction initiates the VM’s ACID commit procedures (a topic for another day). These work out which persistent objects have changed (Maglev::PERSISTENT_ROOT) and which new objects need to be committed (the array and two rabbit strings), and ensure that there are no conflicts with other transactions that were committed during our transaction. Finally, MagLev writes the data to disk.
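The conflict check can be pictured with a very coarse toy model. The article does not describe MagLev's real conflict rules, so this sketch simply assumes one version number for the whole store and rejects any commit whose transaction started from an older version. ToyStore and its methods are hypothetical names.

```ruby
# Toy optimistic commit in plain Ruby; ToyStore is a hypothetical name and
# the single version counter is a simplifying assumption, not MagLev's rule.
class ToyStore
  attr_reader :version, :data

  def initialize
    @version = 0
    @data = {}
  end

  # Accept the changes only if nobody else committed since the caller's
  # transaction began (i.e. the snapshot version is still current).
  def commit(snapshot_version, changes)
    return false unless snapshot_version == @version   # conflict detected
    @data = @data.merge(changes)                        # stands in for "write to disk"
    @version += 1
    true
  end
end

store = ToyStore.new
a_start = store.version    # transaction A begins
b_start = store.version    # transaction B begins from the same state

p store.commit(a_start, { hat: ["Rabbit 1", "Rabbit 2"] })   # true
p store.commit(b_start, { hat: ["Dust Bunny 1"] })           # false: A committed first
p store.data[:hat]                                           # ["Rabbit 1", "Rabbit 2"]
```

In this model the losing transaction would have to start over from the new state and try again; a real system can be far more selective about what counts as a conflict.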
You can see that the array with the rabbits is now written to the repository, but that the dust bunny only exists in the VM’s memory. When the VM exits, the dust bunny will disappear, ending the great Dust Bunny plague of 2010.






So, as I get it, the PERSISTENT_ROOT is nothing more than a Hash that happens to be in the repository? So, if Object or Maglev had a reference to another object, Maglev.commit_transaction would store it in the repository? Like:
module Foo
  Object.send :include, self
  def fancy_method
    "bar"
  end
end
Maglev.commit_transaction
would that store Foo in the repository?
Konstantin Haase
January 27, 2010 at 12:06 am
You have the right idea, but one detail missing. To make a persistent modification to a class or module, you need to enclose the modification in a Maglev.persistent block:
Maglev.persistent do
  module Foo
    Object.send :include, self
    def fancy_method
      "bar"
    end
  end
end
Maglev.commit_transaction
We want to ensure that any modifications to a class or module are intentional. To test that the modification took place, you can fire up a new vm and look at Object’s ancestors:
pbmclain
January 27, 2010 at 8:00 am
Also, can I use different repositories? I would really appreciate a blog post about managing repositories. Running all my experiments in the same repo kinda scares me.
Konstantin Haase
January 27, 2010 at 12:08 am
Yes, you can manage multiple repositories. I’ll try to crank out a post on that soon. In the meantime, here is a brief synopsis:
pbmclain
January 27, 2010 at 8:07 am
Seems that wordpress tried to turn some of the output into mailto: links…
pbmclain
January 27, 2010 at 8:10 am