May 1, 2008
A kind of magic
One often has to check file uploads for correctness, for example with respect to size, file type, or file name.
Here’s a sketch for checking the type of image files:
(defun matches-magic (file magic &optional (offset 0))
  (with-open-file (s file :element-type '(unsigned-byte 8))
    (file-position s offset)
    (loop for c across magic
          unless (eql c (code-char (read-byte s)))
            do (return-from matches-magic))
    t))

(defun jpeg-p (file) ; won't catch Exif files with JPEG inside
  (matches-magic file "JFIF" 6))

(defun png-p (file)
  (matches-magic file "PNG" 1))

(defun gif-p (file)
  (matches-magic file "GIF89a"))

(defun canonical-image-extension (file)
  (cond ((png-p file) "png")
        ((gif-p file) "gif")
        ((jpeg-p file) "jpeg")))

(defmacro any-predicate (preds &rest args)
  `(or ,@(loop for p in preds collect `(,p ,@args))))

(defun valid-image-p (file)
  (any-predicate (jpeg-p png-p gif-p) file))
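The sketch above compares characters and signals an error when a file is shorter than the magic string. A possible variant, shown here only as an illustration, compares raw bytes and treats a short file as a non-match. The names MATCHES-MAGIC-BYTES and the *-SIGNATURE-P predicates are hypothetical and not part of the code above. The byte values are the standard file signatures: FF D8 FF for JPEG (which also covers Exif files), the eight-byte PNG signature, and "GIF87a" or "GIF89a" for GIF.

```lisp
;; Hypothetical byte-based variant of MATCHES-MAGIC.
(defun matches-magic-bytes (file magic &optional (offset 0))
  "Return T if the bytes of FILE starting at OFFSET equal the list MAGIC."
  (with-open-file (s file :element-type '(unsigned-byte 8))
    (let ((buffer (make-array (length magic)
                              :element-type '(unsigned-byte 8))))
      (and (<= (+ offset (length magic)) (file-length s)) ; file long enough?
           (file-position s offset)
           (= (read-sequence buffer s) (length magic))
           (every #'= buffer magic)))))

(defun jpeg-signature-p (file)
  ;; SOI marker FF D8 followed by FF: matches JFIF and Exif files alike.
  (matches-magic-bytes file '(#xFF #xD8 #xFF)))

(defun png-signature-p (file)
  ;; The full eight-byte PNG signature.
  (matches-magic-bytes file '(137 80 78 71 13 10 26 10)))

(defun gif-signature-p (file)
  ;; "GIF87a" or "GIF89a", spelled as ASCII codes.
  (or (matches-magic-bytes file '(71 73 70 56 55 97))
      (matches-magic-bytes file '(71 73 70 56 57 97))))
```

Working on bytes avoids relying on CODE-CHAR for values above 127, such as the first byte of a PNG file.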
ANY-PREDICATE could also be written as a function (with a slightly different form of arguments). Here’s another quick draft:
(defun any-predicate (preds &rest args) ; largely untested
  (some #'identity (mapcar (lambda (x) (apply x args)) preds)))

(defun valid-image-p (file)
  ;; EXIF-P is not defined in this post; supply your own or drop it.
  (any-predicate (list #'jpeg-p #'png-p #'gif-p #'exif-p) file))
Of course, you could also chain SYMBOL-FUNCTION to get rid of the sharp-signs in the call. Whatever suits you.
I like the macro better, though, since it’s clearer and probably more efficient. Update: see below for a comment by Zach Beane on this.
Homework would be writing a simple DSL to jot down file type data:
(define-file-type "jpeg" JFIF 6)
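One way the homework might go, as a minimal sketch: the macro records each type in a table, and a lookup function walks that table. Everything except the DEFINE-FILE-TYPE call syntax is an assumption made for illustration, including the names *FILE-TYPES*, MAGIC-AT-P and GUESS-FILE-TYPE. The magic may be given as a symbol or a string. A symbol is upcased by the default reader, so mixed-case magic such as "GIF89a" has to be written as a string.

```lisp
(defvar *file-types* '()
  "List of (EXTENSION MAGIC OFFSET) entries, in definition order.")

(defmacro define-file-type (extension magic &optional (offset 0))
  "Register EXTENSION with MAGIC (a symbol or string) found at byte OFFSET."
  `(progn
     (setf *file-types*
           (append (remove ,extension *file-types*
                           :key #'first :test #'string=)
                   (list (list ,extension ,(string magic) ,offset))))
     ,extension))

(defun magic-at-p (file magic offset)
  "True if the characters of MAGIC appear in FILE at byte OFFSET."
  (with-open-file (s file :element-type '(unsigned-byte 8))
    (and (<= (+ offset (length magic)) (file-length s)) ; too short: no match
         (file-position s offset)
         (loop for c across magic
               always (eql (char-code c) (read-byte s))))))

(defun guess-file-type (file)
  "Return the extension of the first registered type matching FILE, or NIL."
  (loop for (extension magic offset) in *file-types*
        when (magic-at-p file magic offset)
          return extension))

(define-file-type "jpeg" JFIF 6)
(define-file-type "png" PNG 1)
(define-file-type "gif" "GIF89a")
```

Redefining a type replaces its earlier entry, so the forms can be re-evaluated during development without piling up duplicates.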
Alternatives from the outer world would be calling file(1) or parsing magic(4).
Comments (10)
You don’t need to use SYMBOL-FUNCTION. A symbol is a valid function designator. You could make the call like this:
(any-predicate '(jpeg-p png-p gif-p exif-p) file)
The macro is really quite unnecessary.
Why did you pass 0 as EOF-ERROR-P in READ-BYTE?
Thanks for the remark on function designators.
I’m not sure why I passed a zero for EOF-ERROR-P (I wrote this some weeks ago), but it certainly is a mistake.
You also don’t need `(funcall (function ,p) ,@args).
`(,p ,@args) is fine.
Anyway it is better not to use a macro.
You also don’t need to coerce the string to a list.
Use (loop for character across string … )
also check out:
(some (lambda (pred) (funcall pred file)) '(jpeg-p png-p gif-p))
foo, thanks for your help. I added your simplifications to the code.
You did not change it correctly. You need to do (LOOP FOR CHARACTER ACROSS STRING … ). ACROSS, not IN.
Also, in the macro the GENSYM is not needed.
Just use `(or ,@(loop for p in preds collect `(,p ,@args))) as the body. The GENSYM would be needed if your macro introduced new symbols into the code, but it does not do that.
Fixed, thanks.
Check out the thereis loop clause for your any-predicate macro.
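For illustration, the THEREIS suggestion might look like the sketch below. It is written as a function rather than a macro, and the name ANY-PREDICATE-THEREIS is hypothetical, chosen so it does not clash with the post's ANY-PREDICATE. Unlike the MAPCAR draft in the post, it stops calling predicates as soon as one returns true.

```lisp
(defun any-predicate-thereis (preds &rest args)
  "Return the first true value of applying a predicate in PREDS to ARGS."
  (loop for p in preds
        thereis (apply p args)))

;; Symbols work as function designators, so no sharp-signs are needed:
;; (any-predicate-thereis '(jpeg-p png-p gif-p) file)
```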
You could also use libmagic via FFI. I think it will return more information than your own code.