A kind of magic

One often has to check file uploads for correctness, for example with respect to size, file type, or file name.

Here’s a sketch for checking the type of image files:

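(defun matches-magic (file magic &optional (offset 0))
  ;; True if the bytes of FILE starting at OFFSET spell out the string MAGIC.
  (with-open-file (s file :element-type '(unsigned-byte 8))
    (file-position s offset)
    (loop for c across magic
          ;; READ-BYTE returns NIL at end of file, so short files simply fail.
          unless (eql (char-code c) (read-byte s nil nil))
            do (return-from matches-magic nil))
    t))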
 
(defun jpeg-p (file) ; won't catch Exif files with JPEG inside
  (matches-magic file "JFIF" 6))
 
(defun png-p (file)
  (matches-magic file "PNG" 1))
 
(defun gif-p (file) ; GIF files start with either "GIF89a" or "GIF87a"
  (or (matches-magic file "GIF89a")
      (matches-magic file "GIF87a")))
 
(defun canonical-image-extension (file)
  (cond
    ((png-p file) "png")
    ((gif-p file) "gif")
    ((jpeg-p file) "jpeg")))
 
(defmacro any-predicate (preds &rest args)
    `(or ,@(loop for p in preds
                 collect `(,p ,@args))))
 
(defun valid-image-p (file)
  (any-predicate (jpeg-p png-p gif-p) file))

ANY-PREDICATE could also be written as a function that takes the predicates as a list. Here’s another quick draft:

(defun any-predicate (preds &rest args) ; largely untested
  ;; SOME stops at the first predicate that returns true.
  (some (lambda (p) (apply p args)) preds))

(defun valid-image-p (file)
  (any-predicate (list #'jpeg-p #'png-p #'gif-p) file))

Of course, you could also pass quoted symbols and call SYMBOL-FUNCTION on them to get rid of the sharp-signs in the call. Whatever suits you.
I like the macro better, though, since it’s clearer and probably more efficient. Update: see below for a comment by Zach Beane on this.

Homework would be writing a simple DSL to jot down file type data:

(define-file-type "jpeg" JFIF 6)
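
One way the homework might go, as a rough sketch rather than a finished design: the macro below records each type in a list, and a lookup function walks that list. The names *FILE-TYPES*, MAGIC-AT-P and FILE-TYPE-OF are made up for this example, and MAGIC-AT-P is only a stand-in for MATCHES-MAGIC so that the snippet runs on its own.

```lisp
;; Hypothetical registry: a list of (extension magic offset) entries.
(defvar *file-types* '())

(defun magic-at-p (file magic offset)
  "Stand-in for MATCHES-MAGIC: true if FILE contains MAGIC at byte OFFSET."
  (let ((buffer (make-array (+ offset (length magic))
                            :element-type '(unsigned-byte 8))))
    (with-open-file (s file :element-type '(unsigned-byte 8))
      ;; A file too short to hold the signature cannot match.
      (and (= (read-sequence buffer s) (length buffer))
           (loop for c across magic
                 for i from offset
                 always (= (char-code c) (aref buffer i)))))))

(defmacro define-file-type (extension magic &optional (offset 0))
  "Register EXTENSION with signature MAGIC at OFFSET.
MAGIC is a string designator and is not evaluated, so a bare symbol works."
  `(progn
     (pushnew (list ,extension ,(string magic) ,offset) *file-types*
              :test #'equal)
     ,extension))

(defun file-type-of (file)
  "Return the extension of the first registered type matching FILE, or NIL."
  (loop for (extension magic offset) in *file-types*
        when (magic-at-p file magic offset)
          return extension))

(define-file-type "jpeg" JFIF 6)
(define-file-type "png" PNG 1)
;; The reader upcases symbols, so mixed-case magic has to be a string.
(define-file-type "gif" "GIF89a")
```

With these definitions, (file-type-of "upload.bin") should give "png" for a PNG upload and NIL for anything that is not registered.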

Alternatives from the outer world would be calling file(1) or parsing magic(4).
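
For the file(1) route, something like the following might do. It is a sketch under a few assumptions: UIOP is available (it ships with ASDF), the installed file(1) understands --brief and --mime-type, and the helper names FILE-MIME-TYPE, *MIME-EXTENSIONS*, MIME-TYPE-EXTENSION and IMAGE-EXTENSION-VIA-FILE are invented here.

```lisp
;; Assumption: ASDF (and with it UIOP) can be loaded via REQUIRE.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require "asdf"))

(defun file-mime-type (path)
  "Ask file(1) for the MIME type of PATH and return it as a string."
  (uiop:run-program (list "file" "--brief" "--mime-type"
                          (uiop:native-namestring path))
                    :output '(:string :stripped t)))

;; Illustrative mapping only; extend it as needed.
(defparameter *mime-extensions*
  '(("image/jpeg" . "jpeg")
    ("image/png" . "png")
    ("image/gif" . "gif")))

(defun mime-type-extension (mime-type)
  "Return the extension registered for MIME-TYPE, or NIL if unknown."
  (cdr (assoc mime-type *mime-extensions* :test #'string=)))

(defun image-extension-via-file (path)
  "Like CANONICAL-IMAGE-EXTENSION, but delegating detection to file(1)."
  (mime-type-extension (file-mime-type path)))
```

The trade-off is an external process per upload in exchange for not maintaining the signatures yourself.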

Comments

  1. Zach Beane
    May 1st, 2008 | 2:14 pm

    You don’t need to use SYMBOL-FUNCTION. A symbol is a valid function designator. You could make the call like this:

    (any-predicate '(jpeg-p png-p gif-p exif-p) file)

    The macro is really quite unnecessary.

    Why did you pass 0 as EOF-ERROR-P in READ-BYTE?

  2. May 1st, 2008 | 2:30 pm

    Thanks for the remark on function designators.

    I’m not sure why I passed a zero for EOF-ERROR-P (I wrote this some weeks ago), but it certainly is a mistake.

  3. foo
    May 1st, 2008 | 5:07 pm

    You also don’t need `(funcall (function ,p) ,@args) .

    `(,p ,@args) is fine.

    Anyway it is better not to use a macro.

  4. foo
    May 1st, 2008 | 5:10 pm

    You also don’t need to coerce the string to a list.

    Use (loop for character across string … )

  5. foo
    May 1st, 2008 | 5:16 pm

    also check out:

    (some (lambda (pred) (funcall pred file)) '(jpeg-p png-p gif-p))

  6. May 2nd, 2008 | 7:53 am

    foo, thanks for your help. I added your simplifications to the code.

  7. foo
    May 2nd, 2008 | 8:34 am

    You did not change it correctly. You need to do (LOOP FOR CHARACTER ACROSS STRING … ). ACROSS, not IN.

    Also, the GENSYM is not needed in the macro.

    Just use `(or ,@(loop for p in preds collect `(,p ,@args))) as the body. The GENSYM would be needed if your macro introduced new symbols into the code, but it does not do that.

  8. May 2nd, 2008 | 10:07 am

    Fixed, thanks.

  9. May 20th, 2008 | 2:09 pm

    Check out the thereis loop clause for your any-predicate macro.

  10. May 29th, 2008 | 1:59 pm

    You could also use libmagic via FFI? I think it will return more information than your own code.
