Why choose Elephant?

Over the past few years quite a few solutions for object persistency in CL have emerged.

For my current project, I have chosen Elephant, after having looked at other popular alternatives. In this post, I’d like to talk about my reasons for choosing Elephant, taking a comparative approach. I didn’t take a look at the LW and ACL solutions since they are either not free or not portable.

Flexibility

In Elephant, I have flexibility, and in more than one way:

First, you choose your backend, and you don’t do it once and for virtual eternity. Elephant works best with Berkeley DB, but also has a performant (so I’m told) Postmodern backend, usable CL-SQL and SQLite3 backends and an experimental SEXP backend. Switching among those both in the development stage and in the production stage is (at least theoretically) easy. CL-PEREC and Submarine only work with PostgreSQL.

Second, you choose the storage model. I’m going to talk about this in a coming post; for now, just accept the idea that you are in command and are able to choose the model which works best for you. Hey, doesn’t that resemble Common Lisp philosophy? :)

Third, Elephant is not an object-relational mapper. A lot of people might see that as a disadvantage (it only stores key-value pairs at the database level), but this leads to a very flexible backend model and enables easier schema evolution.

Maturity

Elephant is used in production environments (you can read about that in the manual). The only other library being used in this way is, to my knowledge, CL-PEREC.
And Submarine still has its “here be dragons” areas.

Liveliness

An active development and user community is vital to help you with using and hacking the library. Rucksack and CL-Prevalence don’t have that. CL-PEREC does.

Ease of use

Wow, CL-PEREC is the best negative example here. To try it, install about a dozen (no, that’s not hyperbole) packages from Darcs repositories. Then figure out from some test case output how it works. Yuk. Elephant doesn’t work out of the box either, but you only have to adapt a simple configuration file to your environment, and it comes with a good manual.

But ease of use is actually one of the big reasons for choosing object persistency instead of the all-popular SQL. I can just put away my objects instead of defining views, classes, relations in some DSL I don’t really wish to know details about.

Conclusion

Not mentioned here is PLOB!, which seems to me to be an unmaintained project using concepts that Elephant embodies in a clean way and expands upon.

The combination of the above factors was what drove me to use Elephant. Rucksack also seems to be sensible stuff, but it’s not mature yet (in terms of community, documentation, stability and features).

Naturally, there are some deficiencies in Elephant, but they don’t hit where it hurts. The code is a bit crufty in some spots; for example, it uses feature macros to work with the MOP instead of using a library like Closer to MOP (though that’s work in progress). It also doesn’t support advanced (semantic) schema evolution (but neither does any other library), though it copes with slot addition and removal. When you change your storage semantics, you can convert the data manually by mapping over it.
The manual and Trac page of Elephant also state that there’s no query language, but in my experience Elephant already offers enough querying features for most purposes.

Problems being solved by none of the libraries, apart from semantical schema evolution, are function, closure and continuation serialization. But since this is Lisp you can for simple purposes just store forms and COMPILE them as needed, and there’s more than one silver lining on the horizon, see Paul’s Common Cold and David’s SB-HEAPDUMP (both SBCL-specific right now).

What are your experiences with object persistency and data storage in Lisp? I’m curious.

Multiple stores in Elephant

For simplicity, it’s good to have only one store in Elephant.
Often however this doesn’t scale well with reality :)

For example, in the browser game I am developing, we have separate “worlds” with separate sets of players.
The items are to be administrated centrally, though, so the natural solution is having one shared store for them.

The easiest way to do this is having two store controllers. An alternative would be rolling your own solution using separate indexed B trees (I went down that road first), but handling things at the store controller level lets us use all the existing indexed class infrastructure.

Now there’s good and bad news: the good news is that it’s possible to do this. The bad news is that the interface is a bit clumsy. Most functions do not accept a store controller argument but refer directly to the global *store-controller*. Yuk.

So what we need to do is find the set of functions that need to refer to the alternative store controller and let them have a nice interface. For the item example (consider all Elephant symbols to be imported, I just added the package prefix to the functions for clarity):

(defun get-items-by-type (type)
  (let ((*store-controller* *item-sc*))
    (elephant:get-instances-by-class type)))

Try exceptionally hard to draw clear lines between the different store controllers, or you’ll be in imperative hell before you can say “Practical Extraction and Report Language”. If you’re nesting those functions, use dynamic variables (also known as “globals done right”):

(defun first-thing ()
  (declare (special *store-controller*)) ; refer to the dynamic binding
  (get-instances-by-value 'item 'price 5))
 
(defun do-complex-things ()
  (let ((*store-controller* *item-sc*))
    (declare (special *store-controller*)) ; provide a dynamic binding
    (first-thing)))

EDIT: As instructive as this is, Mikael Jansson has pointed out that we don’t need the whole DECLARE hanky-panky since DEFVAR/DEFPARAMETER automatically establish variables as dynamic. Seems to be another case of the “do what I mean” support in Common Lisp.
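To see the point of that EDIT in isolation, here’s a minimal sketch that runs without Elephant. The names are hypothetical stand-ins: *current-store* plays the role of *store-controller*, and the keywords play the role of real store controllers.

```lisp
;; Hypothetical stand-in for ELEPHANT:*STORE-CONTROLLER*.
(defvar *current-store* :main-store)

(defun describe-store ()
  ;; No SPECIAL declaration needed: DEFVAR proclaimed the variable special.
  *current-store*)

(defun call-with-item-store (thunk)
  ;; LET on a special variable creates a dynamic binding. It is visible to
  ;; everything called inside the LET and is undone when the LET exits.
  (let ((*current-store* :item-store))
    (funcall thunk)))

(list (describe-store)                         ; global value
      (call-with-item-store #'describe-store)  ; rebound inside the LET
      (describe-store))                        ; restored afterwards
;; => (:MAIN-STORE :ITEM-STORE :MAIN-STORE)
```

The callee never mentions which store it wants; it just picks up whatever binding is current when it gets called.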

Another thing to take care of is opening and closing alternative stores.

Self-explanatory, but not that easy to find out if you’re just starting out with Elephant:

;;; init
(defvar *item-sc* nil) ; newbies note that DEFVAR only assigns to unbound variables
(when (null *item-sc*)
  (setf *item-sc* (elephant:open-store *item-store*))) ; *item-store* defined elsewhere
 
;;; shutdown
(elephant:close-store *item-sc*)
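The DEFVAR remark in the init code is worth a tiny demonstration of its own, since it’s the reason reloading the file doesn’t throw away an open store. This is just a sketch with made-up variable names; nothing here touches Elephant.

```lisp
;; Hypothetical variables, used only to contrast the two defining forms.
(defvar *kept-on-reload* nil)
(defparameter *reset-on-reload* nil)

;; Pretend the program has been running and has stored something useful.
(setf *kept-on-reload* :open-store
      *reset-on-reload* :open-store)

;; Now "reload the file" by evaluating the definitions again.
(defvar *kept-on-reload* nil)        ; already bound: the value survives
(defparameter *reset-on-reload* nil) ; always assigns: the value is lost

(list *kept-on-reload* *reset-on-reload*)
;; => (:OPEN-STORE NIL)
```

So DEFVAR is the right choice for a handle like *item-sc*, and the explicit NULL check covers the case where the store was closed in between.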

If you’re interested in more Elephant posts, drop me a short comment.

Revision control systems

Every serious programming team, even if it consists only of you, should use a distributed revision control system.
Over the last few years, I have tried the most popular. Here’s a quick comparison of them:

  • Git: Jeez, everyone seems to like this one. While I might agree that it’s powerful stuff, I do not wish to spend weeks figuring out the innards of my RCS. I got things like work to do, you know.
  • SVK: SVK is a hack to make SVN distributed. ‘Nuff said.
  • Monotone: Easier than Git, and does the job. But still too much boilerplate.
  • Bazaar: Does the job, but is a little conservative in its approach.
  • Darcs: Best candidate for me, were it only more performant. Importing the Yahoo! User Interface library takes ages and loads of memory. No user-defined hooks and nested repositories either. Sorry. But still great for small projects.
  • Mercurial (hg): Easy to get started with. Nested repositories. Fine-grained control over user-based hooks.
    Extensions for everything (e.g. cherry picking, selective recording).

Darcs is my personal favorite, but it’s not possible to manage bigger projects with it, and you can hardly automate deployment or testing without hooks. Plus, certain virtualization software has problems with GHC’s memory allocation.

So I’m going to stay with Mercurial. See also this comparison between Darcs and Mercurial.

Automatic table layout

GTK+ has a table layout container that lets you render widgets in a grid. The widgets get assigned to the table cells automatically. I haven’t looked at their rendering algorithm, but this does something similar in a Weblocks widget:

(in-package :weblocks)
 
(export '(table-composite render-widget-body))
 
(defwidget table-composite (composite)
  ((cols :type integer :accessor cols :initarg :cols :initform 0))
  (:documentation "Renders a set of widgets in table layout."))
 
(defmethod render-widget-body ((widget table-composite) &rest args)
  (declare (ignore args))
  ;; COLS must be positive; the default of 0 would divide by zero here
  (let* ((widgets (composite-widgets widget))
         (num-widgets (length widgets))
         (cols (cols widget))
         (rows (ceiling num-widgets cols)))
    (with-html
      (:table :rows rows :cols cols
        (loop for r from 0 below rows
              do (htm
                  (:tr
                    (loop for c from 0 below cols
                          ;; NTH returns NIL past the end, leaving the cell empty
                          do (let ((child (nth (+ c (* r cols)) widgets)))
                               (htm (:td (when child (render-widget child)))))))))))))
 
;;; EXAMPLE CODE:
(make-instance 'table-composite :cols 3
  :widgets (loop for i from 1 to 10
                 collect (write-to-string i)))

Yeah, I know a DOTIMES would do the same as the LOOP here, but I really like LOOP.

This should be easily adaptable for generic HTML rendering.
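Here’s a rough sketch of what such an adaptation could look like, using nothing but FORMAT. The function name render-table is made up for this example, and it does no HTML escaping, so only feed it strings you trust.

```lisp
(defun render-table (items cols &optional (stream *standard-output*))
  "Write ITEMS to STREAM as an HTML table with COLS columns, filled row by row."
  (check-type cols (integer 1))
  (let ((rows (ceiling (length items) cols))
        (cells items))
    (format stream "<table>")
    (loop repeat rows
          do (format stream "<tr>")
             (loop repeat cols
                   ;; POP yields NIL once the items run out, which pads the
                   ;; last row; ~@[...~] prints its body only for non-NIL.
                   do (format stream "<td>~@[~A~]</td>" (pop cells)))
             (format stream "</tr>"))
    (format stream "</table>")))

(with-output-to-string (s)
  (render-table '("1" "2" "3") 2 s))
;; => "<table><tr><td>1</td><td>2</td></tr><tr><td>3</td><td></td></tr></table>"
```

Popping from the list avoids the repeated NTH calls of the widget version, which walk the list from the start for every cell.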

Laying logging foundations

Gary W. King’s logging library log5 is a lispy way to do logging.

It’s simple to set up and should probably be the first dependency you add to any serious project.

Unfortunately, the User Guide’s examples are a bit bland.

So here’s some salt (for simplicity let us also ignore log5’s default categories and outputs):

Preliminaries

For comfort.

(defpackage :our-package
  (:use :common-lisp :log5))
 
(in-package :our-package)

Categories

Define new categories every time you add a semantic part to your project:

(defcategory borg-attack)
(defcategory federation-cabal)
(defcategory ion-flux)
(defcategory all-categories (or borg-attack federation-cabal ion-flux))

…or want to differentiate on the seriousness level of logging messages:

(defcategory debug)
(defcategory error)
(defcategory info)
(defcategory all-levels (or debug info error))

Outputs

Outputs get added to every logging message automatically.
They are evaluated at the time the log message gets sent.

Let’s add one for the time and one for a line break:

(defoutput newline (format nil "~%"))
 
(defoutput time-hms
  (multiple-value-bind (second minute hour day month year)
    (decode-universal-time (get-universal-time))
    (format nil "~D:~2,'0D:~2,'0D" hour minute second)))

And one for the load averages, so we can see whether an error might have occurred due to heavy load (maybe a race condition):

(defoutput load (format nil "[~A]" (load-averages)))

If your project is web-based, you might also want to add the username associated with the current session, like this:

(defoutput username (format nil "[~A]"
  (when (boundp 'hunchentoot:*session*) ; BOUNDP takes a symbol, so quote it
    (hunchentoot:session-value 'username))))

Finally, if you’re running SBCL and don’t mind a bit of a hack, you can also identify the function context:

(defmacro current-function-name-log5 ()
  `(caaddr (sb-debug::backtrace-as-list)))

(defoutput function-name (format nil "[~A]" (current-function-name-log5)))

Senders

Senders decide where logging messages from certain categories go, and what they look like.
Here’s one that will log messages from all of the above categories to the standard output, utilizing all of the outputs we defined:

(start-sender 'debug
  (stream-sender :location *standard-output*)
  :category-spec '(all-levels all-categories)
  :output-spec '(time-hms load username function-name message newline))

Usage

You would log a message like this:

(log-for (borg-attack ion-flux)
  "The ion flux of vessel ~A broke down due to a borg attack"
  (get-current-vessel))

That’s already incredibly useful!

Getting started with CFFI

I really like the Lisp approach of accessing foreign functions; it puts the programmer in charge (as usual) instead of making him wait for some bindings to appear or get updated.

Here’s a little recipe that shows how to get the load averages (the thing uptime shows) in Lisp, which will be useful later when we build a solid logging foundation with Gary’s log5 package.
In case you don’t know, the load average shows an approximation of the number of processes in the system’s task queue, thus serving as indication for machine load.

First, let’s do the initialization work for CFFI, as pointed out in its user guide:

(asdf:oos 'asdf:load-op 'cffi)
 
(defpackage :cffi-user
  (:use :common-lisp :cffi))
 
(in-package :cffi-user)
 
(define-foreign-library libc
  (:unix (:or "libc.so.6" "libc.so.5" "libc.so"))
  (t (:default "libc.so")))
 
(use-foreign-library libc)

Now we actually need to start thinking. How do we get at the numbers?
Let’s find the C function:

% apropos load | egrep -i "average|avg"
getloadavg           (3)  - get system load averages
[...]
% man 3 getloadavg

The man page gives us this prototype:

int getloadavg(double loadavg[], int nelem);

It also tells us that the first parameter will be filled with nelem samples and notes that (at least on my Linux system) the maximum number of samples is three, denoting the load averages of the last 1, 5 and 15 minutes. Let’s say that we want all three.

Now unfortunately the CFFI manual doesn’t say anything about arrays. But we can rewrite the prototype as

int getloadavg(double* loadavg, int nelem);

leading to the following CFFI function spec:

(defcfun "getloadavg" :int (loadavg :pointer) (nelem :int))

Now we are able to use foreign-alloc to allocate a block of memory of the correct size (i.e. 3*sizeof(double)), and mem-aref to access the resulting array.

Combined with matching LOOP and FORMAT programs and the manual garbage collection we get:

(defun load-averages ()
  (let ((loadavg (foreign-alloc :double :count 3)))
    (getloadavg loadavg 3) ; note the imperative style we are forced to use
    (prog1 ; we need to clean up after producing the return value
      (format nil "~{~,2F~^ ~}" (loop for i from 0 to 2
                                    collect (mem-aref loadavg :double i)))
      (foreign-free loadavg))))

If you have questions regarding any part of that last snippet, feel free to ask.

Note that we don’t do any error checking here; getloadavg will return -1 on failure, although I can’t imagine why it would do so.

You can access the full code at http://paste.lisp.org/display/54746.

I hope this post wasn’t overly verbose (read: boring) to you.
It was my intention to make this understandable for beginners.

Including static HTML snippets in Weblocks

Generating your HTML directly in Lisp with CL-WHO (or some other HTML generation toolkit) is effective.
However, sometimes you need to include HTML from external sources, for example when you want HTML writers to provide page elements.

Here’s a widget for Weblocks that will create a widget from static HTML:

(in-package :weblocks)
 
(export '(static-html make-static-html-from-file
          with-widget-header render-widget-body))
 
(defwidget static-html (widget)
  ((html :type string :accessor html :initarg :html :initform ""))
  (:documentation "Represents a piece of static HTML body mark-up."))
 
(defmethod with-widget-header ((widget static-html) body-fn &rest args &key
                                                    prewidget-body-fn postwidget-body-fn &allow-other-keys)
    (apply body-fn widget args))
 
(defmethod render-widget-body ((widget static-html) &rest args)
  (format *weblocks-output-stream* "~A~_" (html widget)))
 
(defun make-static-html-from-file (file)
  "Create a static-html widget representing the mark-up in “file”."
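  (with-open-file (input file :direction :input)
    (let* ((data (make-string (file-length input)))
           ;; FILE-LENGTH may exceed the number of characters (multi-byte
           ;; encodings), so keep only what READ-SEQUENCE actually filled in
           (end (read-sequence data input)))
      (make-instance 'static-html :html (subseq data 0 end)))))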

The Lisp paste is at http://paste.lisp.org/display/54023.

Sending mails in Common Lisp

Just a quickie that might be useful for people new to CL.

Install CL-SMTP if you don’t have it yet:

(asdf-install:install 'cl-smtp)

You’re now able to load it into your Lisp core at any time:

(asdf:oos 'asdf:load-op 'cl-smtp)

And use it, either interactively (HOST is your SMTP server, e.g. localhost):

(cl-smtp:send-email "HOST" "from@foo.com" "recipient@bar.com"
                    "SUBJECT" "BODY")

or non-interactively:

(handler-case
  (prog1 t
    (cl-smtp:send-email "HOST" "from@foo.com" "recipient@bar.com"
                        "SUBJECT" "BODY"))
  (error (msg) (format t "Could not send mail: ~A~%" msg) nil))

We need the prog1 because send-email always returns nil, even on success.
With the above, the whole expression will return true on success, which lets us decide afterwards whether we were able to hand our message to the mail server.
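If you need this pattern in more than one place, it can be wrapped in a small helper. The following is only a sketch; succeeded-p is a name I made up, and it knows nothing about CL-SMTP, so it runs in a bare Lisp image.

```lisp
(defun succeeded-p (thunk)
  "Call THUNK; return T if it returns normally, NIL if it signals an ERROR."
  (handler-case (progn (funcall thunk) t)
    (error (condition)
      (format *error-output* "Operation failed: ~A~%" condition)
      nil)))

;; With CL-SMTP loaded, the mail example would become:
;; (succeeded-p (lambda ()
;;                (cl-smtp:send-email "HOST" "from@foo.com" "recipient@bar.com"
;;                                    "SUBJECT" "BODY")))

(list (succeeded-p (lambda () nil))             ; returns normally, value ignored
      (succeeded-p (lambda () (error "boom")))) ; signals an error
;; => (T NIL)
```

The thunk’s own return value is deliberately thrown away, which is exactly what the PROG1 trick does above.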

cl-i18n 0.4 released

If it continues like this, we will be at 1.0 soon. ;)

Vilson Vieira has added translation resource merging and has refactored the code base in a nice way.

I also put up my darcs repository at http://viridian-project.de/~sky/cl-i18n/ so you can provide patches against it if you want to contribute.

Kalman filtering

Engineering reality often confronts us with systems that can’t be modelled very well with fixed formulas.
Sometimes we don’t even know some of the system equation’s variables; think of signal noise, for example.

Enter the Kalman filtering algorithm. Given a number of observable and non-observable states, a Kalman filter tries to predict the non-observable ones.
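To get a feeling for the predict/correct cycle, here’s a minimal sketch of the one-dimensional case in Lisp, where every matrix shrinks to a plain number. The function name and keyword parameters are my own choice for this illustration; the letters mirror the usual textbook roles (A state transition, B observation factor, Q and R noise variances, E error variance, K gain), and R is assumed to be positive.

```lisp
(defun kalman-step (x e y &key (a 1) (b 1) (q 0) (r 1))
  "One predict/correct cycle of a scalar Kalman filter.
X is the previous state estimate, E its error variance, Y the new measurement.
Returns the new estimate and the new error variance as two values."
  (let* (;; PREDICT: push estimate and uncertainty through the system model
         (x-pred (* a x))
         (e-pred (+ (* a e a) q))
         ;; CORRECT: the gain K weighs the measurement against the prediction
         (k (/ (* e-pred b) (+ (* b e-pred b) r)))
         (x-new (+ x-pred (* k (- y (* b x-pred)))))
         (e-new (* (- 1 (* k b)) e-pred)))
    (values x-new e-new)))

;; Starting at 0 with variance 1 and twice measuring 10:
(kalman-step 0 1 10)    ; => 5, 1/2
(kalman-step 5 1/2 10)  ; => 20/3, 1/3
```

Each step moves the estimate towards the measurement and shrinks the variance, so later measurements pull less strongly than early ones.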

Normally the state vectors have a dimension greater than 1, so you have to fiddle around with matrices. We once implemented the filter in Java; it was astonishingly simple, but the Java idiosyncrasy of making even the most simple code look complicated diminished this simplicity considerably. Here’s the core of it, i.e. the code minus the various “object-oriented” boilerplate methods:

public void filter(State state) throws MatrixException {
 
    setStateTransitionMatrix(delta, state.getAcceleration());
 
    x = lastX;
 
    y.setValue(0, 0, state.getPosition().getX());
    y.setValue(1, 0, state.getPosition().getY());
    y.setValue(2, 0, state.getPosition().getZ());
 
    /* PREDICT */
    x = A.multiply(x);
 
    if ((!E.isZeroMatrix()) && (R != null) && (!R.isZeroMatrix())) {
        E = ((A.multiply(E)).multiply(A.transpose())).add(Q);
        Matrix K = null;
 
        /* CORRECT */
        K = (E.multiply(B.transpose())).multiply(((B.multiply(E))
                .multiply(B.transpose()).add(R)).invert());
 
        if (!K.isZeroMatrix()) {
            x = x.add(K.multiply(y.subtract(B.multiply(x))));
            E = (I.subtract(K.multiply(B))).multiply(E);
        }
 
    }
    lastX = x;
    setState(state);
}

And now you may leave the torture chamber and take a look at an implementation in q. Beautiful!
