Maglevity

MagLev, Ruby, …

Persistence by Reachability

MagLev stores Ruby objects in a persistent repository using “persistence by reachability”: given a well-known, persistent object, the “root object”, all objects that are reachable from the root will also be persistent (saved in the repository). The most common form of reachability is for one object to refer to another in an instance variable: ObjectA is reachable from ObjectB if ObjectB has ObjectA as the value of one of its instance variables. There are some other cases as well. An object’s class is reachable from the object (as are any mixed-in modules). Constants of a persistent class or module are also reachable from that class or module.
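To make "reachable" a bit more concrete, here is a minimal sketch in plain Ruby (no MagLev required) that lists what one object refers to directly: its instance variable values, its class, and the modules mixed into that class. The names Rabbit, Fluffy and direct_references are made up for this illustration and are not part of MagLev; the real VM tracks references internally and does not expose a method like this.

```ruby
# Hypothetical example classes, used only for this illustration.
module Fluffy; end

class Rabbit
  include Fluffy

  def initialize(name, friend = nil)
    @name = name
    @friend = friend
  end
end

# One "hop" of reachability as described above: the values of the
# instance variables, the object's class, and the mixed-in modules.
# (Constants of the class are left out to keep the sketch short.)
def direct_references(obj)
  ivar_values = obj.instance_variables.map { |iv| obj.instance_variable_get(iv) }
  ivar_values + [obj.class] + obj.class.included_modules
end

flopsy = Rabbit.new("Flopsy")
mopsy  = Rabbit.new("Mopsy", flopsy)   # mopsy refers to flopsy via @friend

refs = direct_references(mopsy)
puts refs.any? { |r| r.equal?(flopsy) }  # flopsy is reachable from mopsy
puts refs.include?(Rabbit)               # so is mopsy's class
puts refs.include?(Fluffy)               # and the mixed-in module
```

Note that the relationship is one-way: mopsy refers to flopsy, but nothing in flopsy refers back to mopsy.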

Not all objects in a MagLev VM are persistent, nor should they be. There are many, many objects that are created, used and thrown away during normal processing (items on the stack, temporary variables in methods or blocks, etc.). These objects will not be persisted to the repository, unless they become reachable from an already persistent object.

An example

In order to save a new object in the repository, you first need an already persistent object. MagLev provides one for just such purposes: Maglev::PERSISTENT_ROOT (a Ruby Hash). So, if you want to save your rabbits, but not your dust bunnies, you could do:

# example1.rb

# Create some objects
hat = Array.new
hat << "Rabbit 1"
hat << "Rabbit 2"

under_couch = "Dust Bunny 1"

# Connect the rabbits to a persistent object
Maglev::PERSISTENT_ROOT[:hat] = hat

# Save the changes to the repository
Maglev.commit_transaction
puts "Committed hat to the repository: #{Maglev::PERSISTENT_ROOT[:hat].inspect}"

Now we run the code:

$ maglev-ruby example1.rb
Committed hat to the repository: ["Rabbit 1", "Rabbit 2"]

Now, the hat and the two rabbits are saved in the repository and available to all VMs connected to the repository. The dust bunny still exists, but only in the original VM’s temporary object space. No other VM will see the dust bunny.

$ maglev-ruby -e 'p Maglev::PERSISTENT_ROOT[:hat]'
["Rabbit 1", "Rabbit 2"]

A more detailed look

For those of you who prefer pictures, here’s a recap. Before we execute $ maglev-ruby example1.rb, all we have is the repository with its previously committed objects (no VM has been started yet).

Figure 1: The Repository

The repository contains some pre-built system objects (blue, Smalltalk) that are generally not available to Ruby. It also contains the core ruby (red) objects: Object, Kernel, etc. The repository is started by maglev start or cd $MAGLEV_HOME ; rake maglev:start.

There are no VMs running in Figure 1, i.e., no place to actually run objects or create new ones. A VM fires up only when we execute maglev-ruby. So maglev-ruby example1.rb first creates a new, empty VM.

Figure 2: An Empty VM and the Repository

Next, the VM connects to the repository and starts loading the objects it needs (similar to, but not the same as, class loading in a JVM). After a while, the VM might look like Figure 3.

Figure 3: A VM with objects loaded from the repository

Figure 3 is a bit misleading, since MagLev does not load every object in the repository, only the ones it needs. I haven’t bothered drawing the millions of other objects that would typically be in the repository. The VM then executes the following lines:

hat = Array.new
hat << "Rabbit 1"
hat << "Rabbit 2"

under_couch = "Dust Bunny 1"

Figure 4: New, temporary objects in the VM

The VM has created four new objects: an array, two rabbits (strings) and a dust bunny (string). These objects live in the VM and can interact with all of the other objects in the VM, but they are not (yet) persistent: they are not reachable from the set of persistent objects (objects on the brownish background). If the VM were to exit at this point, the array, rabbits and dust bunny would disappear.

The next line of code makes some of our objects reachable from a persistent object:

Maglev::PERSISTENT_ROOT[:hat] = hat

Figure 5: Objects reachable from persistent root, but not committed

The persistent Hash Maglev::PERSISTENT_ROOT now has a reference to the array (as well as to the :hat symbol). Maglev::PERSISTENT_ROOT is a convenient (but not the only) place to connect objects to the persistent graph.

Maglev::PERSISTENT_ROOT directly references the array, and since the array references the two rabbits, they are also reachable from a persistent root and will be saved to the repository when we commit the transaction. The set of persistent objects in the system is the transitive closure of the objects reachable from the persistent root. The dust bunny is not referenced by anything (other than the variable under_couch), so it is not considered reachable from a persistent object.
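If "transitive closure" sounds abstract, the following sketch computes one in plain Ruby. It is only an illustration of the idea, not how MagLev does it: toy_root is a hypothetical stand-in for Maglev::PERSISTENT_ROOT (an ordinary Hash here), and reachable_ids is a made-up helper that follows Hash entries, Array elements and instance variables.

```ruby
require "set"

# Breadth-first walk from `root`, returning the object_ids of everything
# reachable through Hash keys/values, Array elements and instance variables.
def reachable_ids(root)
  seen  = Set.new
  queue = [root]
  until queue.empty?
    obj = queue.shift
    next unless seen.add?(obj.object_id)   # add? returns nil if already seen

    case obj
    when Hash  then queue.concat(obj.keys).concat(obj.values)
    when Array then queue.concat(obj)
    end
    obj.instance_variables.each { |iv| queue << obj.instance_variable_get(iv) }
  end
  seen
end

toy_root = {}                        # stand-in for Maglev::PERSISTENT_ROOT (assumption)
hat = ["Rabbit 1", "Rabbit 2"]
under_couch = "Dust Bunny 1"
toy_root[:hat] = hat

ids = reachable_ids(toy_root)
puts ids.include?(hat.object_id)          # true: referenced by the root
puts ids.include?(hat[0].object_id)       # true: referenced by the array
puts ids.include?(under_couch.object_id)  # false: only a local variable points to it
```

The rabbits are never attached to the root directly; they come along because the array that holds them is.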

Even though they are now reachable from a persistent object, the array and strings are still on the blue background. They will only become persistent when the current transaction is committed to the repository.

All persistent changes to the MagLev repository are done in an ACID transaction. This VM is the only VM that has a hat added to Maglev::PERSISTENT_ROOT. I.e., the current state of this VM is isolated (the “I” in “ACID”) from the views of other transactions. If a new VM were to fire up and load objects from the repository, its view of the repository would be the same as in Figure 3. Likewise, this VM does not see any changes to the repository other VMs might have committed (repeatable reads).

By default, MagLev starts each VM in auto-transaction mode. The VM starts a transaction (Maglev.begin_transaction) automatically before it starts running your code, and it will start a new transaction when you end the current transaction with either Maglev.commit_transaction or Maglev.abort_transaction. I.e., with auto-transaction mode, you are always in a transaction.
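The isolation and auto-transaction behavior described above can be imitated with a small toy model in plain Ruby. To be clear, ToyRepository and ToySession are invented for this sketch and do not exist in MagLev; the model copies the whole committed state with Marshal, does no conflict detection at all, and assumes that ending a transaction (commit or abort) is the point at which a session picks up what others have committed.

```ruby
# Toy model only: these classes are hypothetical and not part of MagLev.
class ToyRepository
  attr_reader :committed

  def initialize
    @committed = {}
  end

  # Replace the shared, committed state (no conflict checks in this toy).
  def publish(state)
    @committed = state
  end
end

class ToySession
  attr_reader :view

  def initialize(repo)
    @repo = repo
    begin_transaction            # like auto-transaction mode: always in one
  end

  # Take a private deep copy of the committed state.
  def begin_transaction
    @view = Marshal.load(Marshal.dump(@repo.committed))
  end

  # Publish the private view, then immediately start a new transaction.
  def commit_transaction
    @repo.publish(Marshal.load(Marshal.dump(@view)))
    begin_transaction
  end

  # Throw away local changes and start a new transaction.
  def abort_transaction
    begin_transaction
  end
end

repo = ToyRepository.new
vm_a = ToySession.new(repo)
vm_b = ToySession.new(repo)

vm_a.view[:hat] = ["Rabbit 1", "Rabbit 2"]
p vm_b.view[:hat]        # nil: vm_a has not committed yet

vm_a.commit_transaction
p vm_b.view[:hat]        # nil: vm_b is still in its earlier transaction

vm_b.abort_transaction   # ends vm_b's transaction and starts a fresh one
p vm_b.view[:hat]        # ["Rabbit 1", "Rabbit 2"]
```

The second nil is the "repeatable reads" part: vm_b keeps seeing the state it started with until it ends its own transaction.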

Finally, we commit our changes, which will write the array and rabbits to the repository, leaving the dust bunny…in the dust…

Maglev.commit_transaction
puts "Committed hat to the repository: #{Maglev::PERSISTENT_ROOT[:hat].inspect}"

Figure 6: All reachable objects committed to repository

Maglev.commit_transaction initiates the VM’s ACID commit procedures (a topic for another day), which calculate which persistent objects have changed (Maglev::PERSISTENT_ROOT) and which new objects need to be committed (the array and two rabbit strings), and ensure that there are no conflicts with other transactions that were committed during our transaction. Finally, MagLev writes the data to disk.

You can see that the array with the rabbits is now written to the repository, but that the dust bunny only exists in the VM’s memory. When the VM exits, the dust bunny will disappear, ending the great Dust Bunny plague of 2010.

Written by maglevdevelopment

January 17, 2010 at 8:11 pm

Posted in MagLev

6 Responses

  1. So, as I get it, PERSISTENT_ROOT is nothing more than a Hash that happens to be in the repository? So, if Object or Maglev had a reference to another object, Maglev.commit_transaction would store it in the repository? Like:

    module Foo
      Object.send :include, self
      def fancy_method
        "bar"
      end
    end

    Maglev.commit_transaction

    Would that store Foo in the repository?

    Konstantin Haase

    January 27, 2010 at 12:06 am

    • You have the right idea, but one detail missing. To make a persistent modification to a class or module, you need to enclose the modification in a Maglev.persistent block:

      Maglev.persistent do
        module Foo
          Object.send :include, self
          def fancy_method
            "bar"
          end
        end
      end
      Maglev.commit_transaction
      

      We want to ensure that any modifications to a class or module are intentional. To test that the modification took place, you can fire up a new vm and look at Object’s ancestors:

      $ maglev-ruby -e 'p Object.ancestors'
      [Object, Foo, Kernel]
      

      pbmclain

      January 27, 2010 at 8:00 am

  2. Also, can I use different repositories? I would really appreciate a blog post about managing repositories. Running all my experiments in the same repo kinda scares me.

    Konstantin Haase

    January 27, 2010 at 12:08 am

    • Yes, you can manage multiple repositories. I’ll try to crank out a post on that soon. In the meantime, here is a brief synopsis:

      $ cd $MAGLEV_HOME
      $ rake -T stone
      (in /Users/pmclain/GemStone/snapshots/MagLev-2010-01-26)
      rake stone:all[task_name]        # Invoke a task on all MagLev servers
      rake stone:create[server_name]   # Create a new MagLev server and repository
      rake stone:destroy[server_name]  # Destroy an existing MagLev server and repository
      rake stone:list                  # List MagLev servers managed by this Rakefile
      
      $ rake stone:list
      (in /Users/pmclain/GemStone/snapshots/MagLev-2010-01-26)
      maglev
      
      
      $ rake stone:create[quux]
      (in /Users/pmclain/GemStone/snapshots/MagLev-2010-01-26)
      Creating server "quux"
      
      $ rake stone:list
      (in /Users/pmclain/GemStone/snapshots/MagLev-2010-01-26)
      maglev
      quux
      
      $ rake -T quux
      (in /Users/pmclain/GemStone/snapshots/MagLev-2010-01-26)
      rake quux:reload            # Destroy the "quux" repository then load a fresh one
      rake quux:reload_prims      # [DEV] Reset the ruby context in "quux" then reload primitives
      rake quux:restart           # Stop then start the "quux" server
      rake quux:restore_snapshot  # Restore the "quux" repository from its previous snapshot
      rake quux:start             # Start the "quux" server
      rake quux:status            # Report status of the "quux" server
      rake quux:stop              # Stop the "quux" server
      rake quux:take_snapshot     # Stop the "quux" server then make a backup copy of its repository
      
      $ rake quux:start
      (in /Users/pmclain/GemStone/snapshots/MagLev-2010-01-26)
      startstone[Info]: Starting Stone repository monitor "quux".
      startstone[Info]: GemStone server 'quux' has been started.
      Parser already running...
      Loading Kernel for quux.  This may take a few seconds...
      
      $ rake
      (in /Users/pmclain/GemStone/snapshots/MagLev-2010-01-26)
      Status  Version    Owner    Pid   Port   Started     Type  Name
      ------ --------- --------- ----- ----- ------------ ------ ----
        OK   3.0.0     pmclain    5612 50377 Jan 26 09:20 Netldi gs64ldi
        OK   3.0.0     pmclain   95141 57924 Jan 26 10:42 Stone  maglev
        OK   3.0.0     pmclain   95142 57916 Jan 26 10:42 cache  maglev@cairo.gemstone.com
        OK   3.0.0     pmclain     249 60190 Jan 27 08:03 Stone  quux
        OK   3.0.0     pmclain     250 60182 Jan 27 08:03 cache  quux@cairo.gemstone.com
      MagLev Parse Server running on port 2001
      
      $ maglev-ruby --stone quux -e 'puts "Hello from quux"'
      Hello from quux
      

      pbmclain

      January 27, 2010 at 8:07 am

  3. Seems that wordpress tried to turn some of the output into mailto: links…

    pbmclain

    January 27, 2010 at 8:10 am

  4. […] MagLev implements Persistence by Reachability, the array that is assigned to PERSISTENT_ROOT[:q], and the contents of that array will also be […]

