Content tagged lisp

Oneliners #1
posted on 2017-07-04 22:3011:+01:00

So, let's replicate this blog post in CL. In case it's gone, we want to fetch all pictures from Reddit's /r/pics:

curl -s -H "User-Agent: cli:bash:v0.0.0 (by /u/codesharer)" \ \
  | jq '.data.children[].data.url' \
  | xargs -P 0 -n 1 -I {} bash -c 'curl -s -O {}'

First, we need a HTTP client (drakma being canonical) and a JSON parser (cl-json will suffice here, you might want to use a different one depending on the parsing/serialisation needs).

(asdf:load-systems '#:drakma '#:cl-json)

Next, we define our workflow, fetch JSON content from a URL, get the list of URLs and download each one of them.

We need to download the JSON first, so let's start with that:

(defun scrape-sub (name)
   (format NIL "" name)
   :user-agent "repl:abcl:v0.0.0 (by /u/nyir)"))

If we run, we'll see that the output is just a byte vector:

#(123 34 107 ...)

That's actually fine for the JSON library, but more readably we could either set a flag for Drakma, or convert it manually:

(babel:octets-to-string *)

The second step is parsing it from the JSON format, so extending it will be like the following:

(defun scrape-sub (name)
     (format NIL "" name)
     :user-agent "repl:abcl:v0.0.0 (by /u/nyir)"))))

The output now looks a bit different:

((:KIND . "Listing") (:DATA (:MODHASH . "") ...))

But it's already getting more manageable. Next we want the URL bit of the data. Unfortunately I don't know of a good library that would allow us to specify something as a kind of XPath-style selector. So we'll go ahead and to it manually. The .data.children bit will be something like (cdr (assoc :children (cdr (assoc :data <json>)))), since cl-json returns an association list; children[] means we'll iterate over all children and collect the results; data.url again is the same kind of accessor like (cdr (assoc :url (cdr (assoc :data <json>)))):

(defun scrape-sub (name)
  (let ((json (cl-json:decode-json-from-source
                 (format NIL "" name)
                 :user-agent "repl:abcl:v0.0.0 (by /u/nyir)")))))
    (mapcar (lambda (child)
              (cdr (assoc :url (cdr (assoc :data child)))))
            (cdr (assoc :children (cdr (assoc :data json)))))))

Now the output is just a list of strings:

("" "" ...)

Here's one a addition I'll put in, filtering for image file types. That might still be unreliable of course, but it'll remove a whole bunch of potentially wrong links already. For filtering, MAPCAR isn't suitable, either we could do it in multiple stages, or we'll use something like MAPCAN, or an explicit iteration construct like LOOP/ITERATE. I'll go with MAPCAN here, meaning every element to collect needs to be wrapped in a list:

(defun scrape-sub (name)
  (let ((json (cl-json:decode-json-from-source
                 (format NIL "" name)
                 :user-agent "repl:abcl:v0.0.0 (by /u/nyir)")))))
    (mapcan (lambda (child)
              (let ((url (cdr (assoc :url (cdr (assoc :data child))))))
                (and url
                     (member (pathname-type (pathname (puri:uri-path (puri:parse-uri url))))
                             '("jpg" "png")
                             :test #'string-equal)
                     (list url))))
            (cdr (assoc :children (cdr (assoc :data json)))))))

I'm happy with that and it now filters for two image types.

Last point, actually downloading all scraped results. For this, we just iterate and download them as before:

(defun scrape-sub (name)
  (let* ((agent "repl:abcl:v0.0.0 (by /u/nyir)")
         (json (cl-json:decode-json-from-source
                  (format NIL "" name)
                  :user-agent agent))))
           (mapcan (lambda (child)
                     (let ((url (cdr (assoc :url (cdr (assoc :data child))))))
                       (when url
                         (let ((pathname (pathname (puri:uri-path (puri:parse-uri url)))))
                           (when (member (pathname-type pathname)
                                         '("jpg" "png")
                                         :test #'string-equal)
                             `((,url ,pathname)))))))
                   (cdr (assoc :children (cdr (assoc :data json)))))))
    (mapc (lambda (download)
            (destructuring-bind (url pathname) download
              (with-open-file (stream (merge-pathnames *default-pathname-defaults* pathname)
                                      :direction :output
                                      :element-type '(unsigned-byte 8))
                 (drakma:http-request url :user-agent agent)

And this works.

Now. This is decidedly not a demonstration of how the final result should look like. In fact there a whole lof of things to improve and to consider when you'd put this into a reusable script.

From a maintainability perspective, we'd put each functional part into it's own component, be it a function or method, in order to make them easier to reason about and to test each bit individually.

From a performance part ... oh my, there's so much wrong with it, mostly slurping everything into memory multiple times, while drakma does support streams as results and HTTP Keep-Alive, both would improve things. The JSON parser could in theory also operate on tokens, but that's rarely worth the hassle (the CXML API can be used for that, by converting JSON "events" into a stream of SAX events basically). Lastly creating the output lists isn't necessary, this could all be done inline or with continuation passing style, but that's worse for maintaining a nice split between functions.

From a correctness part, all the URLS might have weird characters in them that don't work well with pathnames and/or the local filesystem. In fact PURI might not be the best choice here either. Also, even if the URLs are different, more than one of them might have the same filename, meaning there should either be some error handling in there, or the URLs should be hashed to be used as filename or some other scheme accomplishing the same thing. Lastly, the download files should be checked for emptiness, wrong content (HTML bits would indicate a failed download too), broken images, etc.

Another nice thing to add would be xattr support for indicating where the file was downloaded from.

Hacking Java fields in ABCL
posted on 2017-06-06 23:41:05

Just as a quick note, JSS and the JAVA package too won't allow you to treat LispObjects objects as JavaObjects for the purposes of JFIELD and JSS:GET-JAVA-FIELD. But if you still want to access internal fields and implementation details (the usual warnings apply!), try the following:

(jss:get-java-field (jss:new 'JavaObject (car (threads:mapcar-threads #'identity))) "javaThread" T)

This was because I was looking for an answer to a question on #abcl, but even then, wrapping the "Lisp" object in a JavaObject manually helps achieve the requested goal of retrieving the Java Thread object.

The T at the end is necessary here because the field is actually package-private - but that's to be expected if you want to access internals without an (official) API.

Byte code verification and ABCL
posted on 2017-06-01 19:51:41

A feature of JVM byte code that I knew about, but didn't concern me too much when I previously hacked on ABCL was interesting to see in a particular bug report, where a function, when compiled, got rejected by the byte code validator:

(compile NIL (lambda (list) (nth (lambda ()) list)))
;; => Compiled function can't be loaded

This is true for any class loader, so COMPILE-FILE with LOAD would show the same behaviour.

The reason here is inlining: In order to give better performance when some information about the types is known, the call to NTH will actually be optimised here. One option to see this is to set a debug option to dump the generated byte code while the compilation runs:

(setf jvm::*compiler-debug* T)

(Part) of the output will look like this:

  0 GETSTATIC      #24 <Field LispObject 42c1_92be_e5b48c1894c6.LFUN1812724>  
  1 ALOAD          (1)  
  2 SWAP             
  3 INVOKEVIRTUAL  #30 <Method LispObject LispObject.NTH(int)> -1 
  4 ARETURN          

Using DISASSEMBLE is not an option precisely because the byte code can't be loaded at all (that's actually a nice idea as another function to dump not only the compiled FASL content, but also disassemble the contained byte code too).

With that option set (and perhaps setting a few BREAK statements in the compiler), it's somewhat easy to debug the compiled byte code and to see that the validator notices that the Java method that implements the NTH function requires an integer (fixnum) argument, but the LispObject (the lambda) doesn't match that signature. This is not a problem if this was a regular call. In fact, with inlining disabled the NTH function will still raise an error for an argument with the wrong type!

Finally the fix is to check for the derived type in the "P2" transformation function for NTH, COMPILE-NTH:

(define-inlined-function compile-nth (form target representation)
  ((check-arg-count form 2))
  (let* ((index-form (second form))
         (list-form (third form))

    ;;; new check here
         (index-type (derive-compiler-type index-form)))
    (unless (fixnum-type-p index-type)
      (compile-function-call form target representation)
      (return-from compile-nth))
    ;;; till here

        ((compile-operand index-form :int)
         (compile-operand list-form nil)
         (maybe-emit-clear-values index-form list-form))
      (emit 'swap)
      (emit-invokevirtual +lisp-object+ "NTH" '(:int) +lisp-object+))
    (fix-boxing representation nil) ; FIXME use derived result type
    (emit-move-from-stack target representation)))

Note the falling back to a "general" function call using COMPILE-FUNCTION-CALL here in case the type is not known in advance, or not a fixnum type (though that could also raise a warning here already).

Again, compiling the above function again looks a bit different in the general case:

  0 GETSTATIC      #29 <Field Symbol 4faf_99d9_62a0381d4d65.SYM1814566>  
  1 GETSTATIC      #33 <Field LispObject 4faf_99d9_62a0381d4d65.LFUN1814565>  
  2 ALOAD          (1)  
  3 INVOKEVIRTUAL  #39 <Method LispObject LispObject.execute(LispObject,LispObject)> -2 
  4 ARETURN          

If instead a fixnum constant is used, (nth 1 list), this simplifies a lot:

  0 ICONST_1         
  1 ALOAD          (1)  
  2 SWAP             
  3 INVOKEVIRTUAL  #24 <Method LispObject LispObject.NTH(int)> -1 
  4 ARETURN          

Compare that to adding a (declare (type fixnum ...)) declaration - not as good as a constant argument, but still directly calling the Java method:

  0 ALOAD          (1)  
  1 INVOKEVIRTUAL  #24 <Method int LispObject.intValue()> 0 
  2 ISTORE         (1)  
  3 ILOAD          (1)  
  4 ALOAD          (2)  
  5 SWAP             
  6 INVOKEVIRTUAL  #28 <Method LispObject LispObject.NTH(int)> -1 
  7 ARETURN          

Note here that the type error (e.g. for again supplying the (lambda ())) will be raised at the caller site already!

Lastly, a good idea would also be to generally add more hints, as e.g. SBCL does, to debug other issues ("failed to inline because ..."), but that's for another day.

Memory reordering in Common Lisp and Kotlin
posted on 2016-11-23 21:43:41

I've been reading about multi-threaded programming out of curiosity over lock-free algorithms.

The first thing I've implemented for that was actually something completely different, namely a short example of memory reordering ported over from a C++ program.

In Common Lisp this relies on SBCL because of the availability of memory barriers and the Linux-specific API to set the thread affinity to a single CPU core (sb-cpu-affinity). Of course, given the appropriate compability libraries this could easily be done on other implementations too.

(in-package #:cl-user)

(defun main (&key single-cpu-p barrierp)
  (let ((begin-semaphore-1 (sb-thread:make-semaphore))
        (begin-semaphore-2 (sb-thread:make-semaphore))
        (end-semaphore (sb-thread:make-semaphore))
        (iterations 0)
        (detected 0)
        (x 0)
        (y 0)
        (r1 0)
        (r2 0)
    (declare (fixnum x y r1 r2))
    (declare (boolean done))
    (declare (optimize (speed 3) (safety 0) (space 0) (debug 0) (compilation-speed 0)))
    (labels ((single-cpu ()
               (when single-cpu-p
                 (with-cpu-affinity-mask (mask :save T)
                   (clear-cpu-affinity-mask mask)
                   (setf (cpu-affinity-p 0 mask) T))))
             (thread-1 ()
                 (when done
                 (sb-thread:wait-on-semaphore begin-semaphore-1)
                 (setf x 1)
                 (when barrierp
                   (sb-thread:barrier (:memory)))
                 (setf r1 y)
                 (sb-thread:signal-semaphore end-semaphore)))
             (thread-2 ()
                 (when done
                 (sb-thread:wait-on-semaphore begin-semaphore-2)
                 (setf y 1)
                 (when barrierp
                   (sb-thread:barrier (:memory)))
                 (setf r2 x)
                 (sb-thread:signal-semaphore end-semaphore))))
      (let ((thread-1 (sb-thread:make-thread #'thread-1))
            (thread-2 (sb-thread:make-thread #'thread-2)))
               (setf x 0 y 0)
               (sb-thread:signal-semaphore begin-semaphore-1)
               (sb-thread:signal-semaphore begin-semaphore-2)
               (sb-thread:wait-on-semaphore end-semaphore)
               (sb-thread:wait-on-semaphore end-semaphore)
               (when (and (eql r1 0) (eql r2 0))
                 (incf detected)
                 (format T "~D reorders detected after ~D iterations, every ~D on average~%" detected iterations (floor iterations detected)))
               (incf iterations))
          (setf done T)
          (sb-thread:signal-semaphore begin-semaphore-1)
          (sb-thread:signal-semaphore begin-semaphore-2)
          (sb-thread:join-thread thread-1)
          (sb-thread:join-thread thread-2))))))

On the JVM, respectively Kotlin, the code looks again remarkably similar. I haven't yet looked at thread affinity, but what's interesting here are two aspects related to the JVM memory model. Using volatile is, in contrast to C, viable, as it generates the necessary memory barriers.

import java.util.concurrent.Semaphore
import kotlin.concurrent.thread

class HelloWorld {
    var beginSempahore1 = Semaphore(0)
    var beginSempahore2 = Semaphore(0)
    var endSemaphore = Semaphore(0)

    var detected = 0
    var iterations = 0

    @Volatile var x = 0
    @Volatile var y = 0
    var r1 = 0
    var r2 = 0

    fun run() {
        thread {
            while (true) {
                x = 1
                r1 = y

        thread {
            while (true) {
                y = 1
                r2 = x

        while (true) {
            iterations += 1
            x = 0
            y = 0
            if (r1 == 0 && r2 == 0) {
                println("$detected reorders detected after $iterations iterations")

fun main(args: Array<String>) {

Also, when using a JVM plugin, the generated assembly code can be dumped during compilation, which allows us to confirm whether those instructions actually have been generated; for the first loop:

  0x00007f9ad4f503c9: mov    (%rcx),%r11d       ;*getfield this$0
                                                ; - HelloWorld$run$1::invoke@11 (line 21)

  0x00007f9ad4f503cc: test   %r11d,%r11d
  0x00007f9ad4f503cf: je     0x00007f9ad4f507a9
  0x00007f9ad4f503d5: movl   $0x1,0x14(%r12,%r11,8)
  0x00007f9ad4f503de: lock addl $0x0,(%rsp)     ;*putfield x
                                                ; - HelloWorld::setX@2 (line 12)
                                                ; - HelloWorld$run$1::invoke@15 (line 21)

  0x00007f9ad4f503e3: mov    (%rcx),%r10d       ;*getfield this$0
                                                ; - HelloWorld$run$1::invoke@19 (line 22)

  0x00007f9ad4f503e6: test   %r10d,%r10d
  0x00007f9ad4f503e9: je     0x00007f9ad4f507cd  ;*invokevirtual getY
                                                ; - HelloWorld$run$1::invoke@26 (line 22)

  0x00007f9ad4f503ef: mov    0x18(%r12,%r10,8),%r11d  ;*getfield y
                                                ; - HelloWorld::getY@1 (line 13)
                                                ; - HelloWorld$run$1::invoke@26 (line 22)

  0x00007f9ad4f503f4: mov    %r11d,0x1c(%r12,%r10,8)  ;*putfield r1
                                                ; - HelloWorld::setR1@2 (line 14)
                                                ; - HelloWorld$run$1::invoke@29 (line 22)

And for the second loop:

  0x00007f9ad4f47269: mov    (%rcx),%r11d       ;*getfield this$0
                                                ; - HelloWorld$run$2::invoke@11 (line 30)

  0x00007f9ad4f4726c: test   %r11d,%r11d
  0x00007f9ad4f4726f: je     0x00007f9ad4f47629
  0x00007f9ad4f47275: movl   $0x1,0x18(%r12,%r11,8)
  0x00007f9ad4f4727e: lock addl $0x0,(%rsp)     ;*putfield y
                                                ; - HelloWorld::setY@2 (line 13)
                                                ; - HelloWorld$run$2::invoke@15 (line 30)

  0x00007f9ad4f47283: mov    (%rcx),%r10d       ;*getfield this$0
                                                ; - HelloWorld$run$2::invoke@19 (line 31)

  0x00007f9ad4f47286: test   %r10d,%r10d
  0x00007f9ad4f47289: je     0x00007f9ad4f4764d  ;*invokevirtual getX
                                                ; - HelloWorld$run$2::invoke@26 (line 31)

  0x00007f9ad4f4728f: mov    0x14(%r12,%r10,8),%r11d  ;*getfield x
                                                ; - HelloWorld::getX@1 (line 12)
                                                ; - HelloWorld$run$2::invoke@26 (line 31)

  0x00007f9ad4f47294: mov    %r11d,0x20(%r12,%r10,8)  ;*putfield r2
                                                ; - HelloWorld::setR2@2 (line 15)
                                                ; - HelloWorld$run$2::invoke@29 (line 31)

And the main loop:

  0x00007fd383813838: lock addl $0x0,(%rsp)     ;*putfield x
                                                ; - HelloWorld::run@58 (line 38)

  0x00007fd38381383d: mov    $0x0,%edx
  0x00007fd383813842: mov    %edx,0x18(%rsi)
  0x00007fd383813845: lock addl $0x0,(%rsp)     ;*putfield y
                                                ; - HelloWorld::run@63 (line 39)

  0x00007fd38381384a: mov    0x24(%rsi),%edi

Note that in contrast to the original and the Common Lisp version there's one additional memory barrier here that's unnecessary.

Asynchronous file I/O
posted on 2016-11-11 01:19:28

Occasionally the topic of asynchronous I/O on local files comes up, though while there are APIs for event-based processing they don't work on regular files, resulting in the use of worker threads to circumvent this restriction.

The inciting incident for me to look into this was the fact that one of my (external, spinning-disk) hard drives has a rather short timeout for spin-down, such that when pausing while browsing through a directory tree would often annoy me when the shell (or any other program) was completely blocked as the motor was being turned on again.

At least on Linux (since version 2.5) there is actually a kernel syscall interface for asynchronous file I/O. Unfortunately it is not being exposed via the libc at all, requiring custom wrappers in order to trigger the proper syscalls. Apart from scheduling asynchronous read and write requests it also supports exposing the corresponding events via an evenfd queue, which then allows us to use epoll and friends.

SBCL being the ultimate breadboard that's not particularly hard though. Using CFFI and IOLib for convenience it's straightforward to port the C examples while writing a minimal amount of C code. The code is of course not very high-level, but can be plugged straight into the IOLib event loop as there's now a file descriptor available to listen on.

Grovelling & wrapping

The groveller can be used quite nicely to prevent us from having to drop down to C completely. Of course we're also using ASDF, so everything's peachy.

Now what's required? CFFI provides additional components for ASDF, namely :CFFI-GROVEL-FILE and :CFFI-WRAPPER-FILE, which make the process seamless and don't require us to write and code related to compiling and loading the C wrapper:

;; -*- mode: lisp; syntax: common-lisp; coding: utf-8-unix; package: cl-user; -*-

(in-package #:cl-user)

(asdf:defsystem #:example
  :defsystem-depends-on (#:cffi-grovel)
  #+asdf-unicode :encoding #+asdf-unicode :utf-8
  :depends-on (#:iterate
  :serial T
  :components ((:module "src"
                ((:file "package")
                 (:file "wrapper")
                 (:cffi-grovel-file "linux-aio-grovel")
                 (:cffi-wrapper-file "linux-aio-wrapper")
                 (:file "linux-aio")))))

The package definition is probably not very interesting at this point:

(in-package #:cl-user)

(defpackage #:example
  (:use #:cl #:iterate #:iolib #:cffi))

I've added IOLib and CFFI, usually also ITERATE for convenience.

Next we grovel a couple of definitions related to the kernel API for the asynchronous requests and for eventfd. This is the linux-aio-grovel file mentioned above:

(in-package #:example)

(include "stdio.h" "unistd.h" "sys/syscall.h" "linux/aio_abi.h" "inttypes.h"
         "signal.h" "sys/eventfd.h")

(ctype aio-context-t "aio_context_t")

(cenum iocb-cmd-t
  ((:pread "IOCB_CMD_PREAD"))
  ((:pwrite "IOCB_CMD_PWRITE"))
  ((:fsync "IOCB_CMD_FSYNC"))
  ((:fdsync "IOCB_CMD_FDSYNC"))
  ((:noop "IOCB_CMD_NOOP"))
  ((:preadv "IOCB_CMD_PREADV"))
  ((:pwritev "IOCB_CMD_PWRITEV")))

(constantenum iocb-flags-t
  ((:resfd "IOCB_FLAG_RESFD")))

(cstruct iocb "struct iocb"
  (aio-data "aio_data" :type :uint64)
  ;; #-little-endian
  ;; (aio-reserved1 "aio_reserved1" :type :uint32)
  (aio-key "aio_key" :type :uint32)
  ;; #+little-endian
  ;; (aio-reserved1 "aio_reserved1" :type :uint32)
  (aio-lio-opcode "aio_lio_opcode" :type iocb-cmd-t)
  (aio-fildes "aio_fildes" :type :uint32)
  (aio-buf "aio_buf" :type :uint64)
  (aio-nbytes "aio_nbytes" :type :uint64)
  (aio-offset "aio_offset" :type :int64)
  ;; (aio-reserved2 "aio_reserved2" :type :uint64)
  (aio-flags "aio_flags" :type :uint32)
  (aio-resfd "aio_resfd" :type :uint32))

(cstruct io-event "struct io_event"
  (data "data" :type :uint64)
  (obj "obj" :type :uint64)
  (res "res" :type :int64)
  (res2 "res" :type :int64))

(cenum eventfd-flags-t
  ((:cloexec "EFD_CLOEXEC"))
  ((:nonblock "EFD_NONBLOCK"))
  ((:semaphore "EFD_SEMAPHORE")))

Note that this not a complete list and a couple of reserved members are commented out as they're primarily used to provide space for further expansion. Fortunately offsets for the rest of the struct aren't affected by leaving out parts in the Lisp-side definition.

The enums are easy enough, even though they both represent flags, so should be or-ed together, which might be necessary to do manually or by finding a way to let CFFI do the coercion from a list of flags perhaps.

In order to have nice syscall wrappers we'd normally use defsyscall from IOLib. Unfortunately we also want to use defwrapper from CFFI-GROVEL. This is an example of bad composability of macros, requiring copy and paste of source code. Of course with enough refactoring or an optional parameter this could be circumvented. This is the wrapper file from the ASDF definition.

;; groan

(define-wrapper-syntax defwrapper/syscall* (name-and-options rettype args &rest c-lines)
  ;; output C code
  (multiple-value-bind (lisp-name foreign-name options)
      (cffi::parse-name-and-options name-and-options)
    (let ((foreign-name-wrap (strcat foreign-name "_cffi_wrap"))
          (fargs (mapcar (lambda (arg)
                           (list (c-type-name (second arg))
                                 (cffi::foreign-name (first arg) nil)))
      (format out "~A ~A" (c-type-name rettype)
      (format out "(~{~{~A ~A~}~^, ~})~%" fargs)
      (format out "{~%~{  ~A~%~}}~%~%" c-lines)
      ;; matching bindings
      (push `(iolib/syscalls:defsyscall (,foreign-name-wrap ,lisp-name ,@options)
                 ,(cffi-type rettype)
               ,@(mapcar (lambda (arg)
                           (list (symbol* (first arg))
                                 (cffi-type (second arg))))

The only change from DEFWRAPPER is the use of IOLIB/SYSCALLS:DEFSYSCALL instead of DEFCFUN, which then performs additional checks with respect to the return value, raising a IOLIB/SYSCALLS:SYSCALL-ERROR that we can then catch rather than having to check the return value ourselves.

Lastly, the actual wrappers. Note that some inline C is used to define the function bodies. This is linux-aio-wrapper from the ASDF definition:

(define "_GNU_SOURCE")

(include "stdio.h" "unistd.h" "sys/syscall.h" "linux/aio_abi.h" "inttypes.h"

(defwrapper/syscall* "io_setup" :int
  ((nr :unsigned-int)
   (ctxp ("aio_context_t*" (:pointer aio-context-t))))
  "return syscall(__NR_io_setup, nr, ctxp);")

(defwrapper/syscall* "io_destroy" :int
  ((ctx aio-context-t))
  "return syscall(__NR_io_destroy, ctx);")

(defwrapper/syscall* "io_submit" :int
  ((ctx aio-context-t)
   (nr :long)
   (iocbpp ("struct iocb**" (:pointer (:pointer (:struct iocb))))))
  "return syscall(__NR_io_submit, ctx, nr, iocbpp);")

(defwrapper/syscall* "io_cancel" :int
  ((ctx aio-context-t)
   (iocbp ("struct iocb*" (:pointer (:struct iocb))))
   (result ("struct io_event*" (:pointer (:struct io-event)))))
  "return syscall(__NR_io_cancel, ctx, iocbp, result);")

(defwrapper/syscall* "io_getevents" :int
  ((ctx aio-context-t)
   (min-nr :long)
   (max-nr :long)
   (events ("struct io_event*" (:pointer (:struct io-event))))
   (timeout ("struct timespec*" (:pointer (:struct iolib/syscalls::timespec)))))
  "return syscall(__NR_io_getevents, ctx, min_nr, max_nr, events, timeout);")


Now that we have all definitions in place, let's translate a moderately complex example of reading from an existing file.

Also EVENTFD is defined here as the C function is already defined in the libc and doesn't have to generated.

(iolib/syscalls:defsyscall eventfd :int
  (initval :unsigned-int)
  (flags eventfd-flags-t))

(defun linux-aio-test (pathname &key (chunk-size 4096))
  (with-foreign-object (context 'aio-context-t)
    (iolib/syscalls:memset context 0 (foreign-type-size 'aio-context-t))
    (let ((eventfd (eventfd 0 :nonblock)))
           (with-open-file (stream pathname :element-type '(unsigned-byte 8))
             (let* ((length (file-length stream))
                    (chunks (ceiling length chunk-size)))
               (with-foreign-object (buffer :uint8 length)
                 (with-event-base (event-base)
                   (io-setup 1 context) ; set up with number of possible operations
                   (with-foreign-object (iocbs '(:struct iocb) chunks)
                     (iolib/syscalls:memset iocbs 0 (* (foreign-type-size '(:struct iocb)) chunks))
                     ;; set up array of operations
                     (dotimes (i chunks)
                       (let ((iocb (mem-aptr iocbs '(:struct iocb) i)))
                         (setf (foreign-slot-value iocb '(:struct iocb) 'aio-lio-opcode) :pread)
                         (setf (foreign-slot-value iocb '(:struct iocb) 'aio-buf) (pointer-address (mem-aptr buffer :uint8 (* i chunk-size))))
                         (setf (foreign-slot-value iocb '(:struct iocb) 'aio-nbytes) (if (eql i (1- chunks)) (- length (* i chunk-size)) chunk-size))
                         (setf (foreign-slot-value iocb '(:struct iocb) 'aio-fildes) (sb-sys:fd-stream-fd stream))
                         (setf (foreign-slot-value iocb '(:struct iocb) 'aio-offset) (* i chunk-size))
                         (setf (foreign-slot-value iocb '(:struct iocb) 'aio-flags) (foreign-enum-value 'iocb-flags-t :resfd))
                         (setf (foreign-slot-value iocb '(:struct iocb) 'aio-resfd) eventfd)))
                     ;; set up array of pointers to operations
                     (with-foreign-object (iocbp '(:pointer (:struct iocb)) chunks)
                       (dotimes (i chunks)
                         (setf (mem-aref iocbp '(:pointer (:struct iocb)) i) (mem-aptr iocbs '(:struct iocb) i)))
                       ;; submit as many operations as possible
                       (let ((submitted (io-submit (mem-ref context 'aio-context-t) chunks iocbp)))
                         ;; keep track of how many operations completed total
                         (let ((total-events-read 0))
                           (flet ((get-events () ; named to be able to RETURN-FROM
                                    (with-foreign-object (events '(:struct io-event) 3)
                                            (with-foreign-object (available-buffer :uint64)
                                              (iolib/syscalls:read eventfd available-buffer 8)
                                              (let ((available (mem-ref available-buffer :uint64)))
                                                (dotimes (i available)
                                                  (let ((events-read (io-getevents (mem-ref context 'aio-context-t) 0 3 events (null-pointer))))
                                                    (when (eql events-read 0)
                                                    (incf total-events-read events-read))
                                                  (when (eql total-events-read chunks)
                                                    (return-from linux-aio-test)))))
                                          ;; in case reading would block
                                          (iolib/syscalls:syscall-error ()
                                            (when (< submitted chunks)
                                              (let ((more-submitted (io-submit (mem-ref context 'aio-context-t) chunks (mem-aptr iocbp '(:pointer (:struct iocb)) submitted))))
                                                (incf submitted more-submitted)))
                                            (return-from get-events)))))))
                              event-base eventfd :read
                              (lambda (fd event exception)
                                (declare (ignore fd event exception))
                             (event-dispatch event-base))))))))))
        (io-destroy (mem-ref context 'aio-context-t))
        (iolib/syscalls:close eventfd)))))

Relatively straightforward. Complexity comes from accurately submitting chunks, reading the number of available events on demand and submitting a new batch of chunks as long as there are some remaining ones.

Insert FORMAT statement as you like. Tuning the values would need to be considered in order to keep memory consumption in check. Finally


We still can't do many local file operations asynchronously. The whole reason to jump through these hoops is of course to integrate potentially blocking operations into an event loop, so some care still needs to be taken to do some work ahead of time or in separate threads as to not block the main part of the program from issuing the I/O requests.

Debugging Java with ABCL
posted on 2016-09-10 00:05:24

So after adding a dispatch mechanism into the DISASSEMBLE function of ABCL I also want to show a neat way of disassembling arbitrary Java classes using one of the provided disassemblers.

First of, make sure that you have the ObjectWeb ASM library in your classpath (everything up from version 3 should work at least, perhaps even lower versions), note that in future releases you might be able to load it via Maven and one of the optional contribs as well, likely via (require '#:asm-all).

Next up, try (disassemble #'list) and confirm that some output is shown.

Now to show bytecode for arbitrary Java classes we've got to jump through some additional hoops - though perhaps at some point this could become a stable API as well:

(system::objectweb-disassemble (#"getResourceAsStream" (java:jclass "java.lang.Object") "/java/util/Random.class"))

I haven't tried this via the JAD disassembler, but it seems likely that a similar approach should work for it too.

Setting up ABCL and LIRE
posted on 2015-11-11 20:34:29


The purpose of this article is to examine how using ABCL with existing libraries (arguably the main point of using ABCL at the moment) actually looks like in practice. Never mind integration with Spring, or other more involved frameworks, this will only touch a single library and won't require us to write from-Java-callable classes.

In the process of refining this I'm hoping to also get ideas about the requirements for building a better DSL for the Java FFI, based on the intended "look" of the code (that is, coding by wishful thinking).


Ensuring the correct package is somewhat optional:

(in-package #:cl-user)

Generally using JSS is a bit nicer than the plain Java FFI. After the contribs are loaded, JSS can be required and used:

(require '#:abcl-contrib)
(require '#:jss)

(use-package '#:jss)

Next, we need access to the right libraries. After building LIRE from source and executing the mvn dist command we end up with a JAR file for LIRE and several dependencies in the lib folder. All of them need to be on the classpath:

  (add-to-classpath "~/src/LIRE/dist/lire.jar")
  (mapc #'add-to-classpath (directory "~/src/LIRE/dist/lib/*.jar")))


Since we're going to read pictures in a couple of places, a helper to load one from a pathname is a good start:

(defun read-image (pathname)
  (#"read" 'javax.imageio.ImageIO (new ' (namestring pathname))))

To note here is the use of NEW from JSS with a symbol for the class name, the conversion of the pathname to a regular string, since the Java side doesn't expect a Lisp object and the #"" reader syntax from JSS to invoke the method read in a bit of a simpler way than using the FFI calls directly.

JSS will automatically "import" Java names, so the same function can simply be the following instead (provided that the names aren't ambiguous):

(defun read-image (pathname)
  (#"read" 'ImageIO (new 'File (namestring pathname))))

The names will be looked up again on every call though, so this option isn't the best performing one.

For comparison, the raw FFI would be a bit more verbose, but explicitely specifies all names:

(defun read-image (pathname)
  (jstatic "read" "javax.imageio.ImageIO" (jnew "" (namestring pathname))))

Though with a combination of JSS and cached lookup it could be nicer, even though the setup is more verbose:

(defvar +image-io+ (jclass "javax.imageio.ImageIO"))
(defvar +file+ (jclass ""))

(defun read-image (pathname)
  (#"read" +image-io+ (jnew +file+ (namestring pathname))))

At this point without other improvements (auto-coercion of pathnames, importing namespaces) it's about as factored as it will be (except moving every single call into its own Lisp wrapper function).

Building an index

To keep it simple building the index will be done from a list of pathnames in a single step while providing the path of the index as a separate parameter:

(defun build-index (index-name pathnames)
  (let ((global-document-builder
          (new 'GlobalDocumentBuilder (find-java-class 'CEDD)))
        (index-writer (#"createIndexWriter"
                       (get-java-field 'LuceneUtils$AnalyzerType "WhitespaceAnalyzer"))))
         (dolist (pathname pathnames)
           (let ((pathname (namestring pathname)))
             (format T "Indexing ~A ..." pathname)
             (let* ((image (read-image pathname))
                    (document (#"createDocument" global-document-builder image pathname)))
               (#"addDocument" index-writer document))
             (format T " done.~%")))
      (#"closeWriter" 'LuceneUtils index-writer))))

Note: This code won't work on current ABCL as is, because the lookup is disabled for for nested classes (those containing the dollar character). Because of this, the AnalyzerType class would have to be looked up as follows:

(jfield "net.semanticmetadata.lire.utils.LuceneUtils$AnalyzerType" "WhitespaceAnalyzer")

All in all nothing fancy, JSS takes care of a lot of typing as the names are all unique enough.

The process is simply creating the document builder and index writer, reading all the files one by one and adding them to the index. There's no error checking at the moment though.

To note here is that looking up the precise kind of a Java name is a bit of a hassle. Of course intuition goes a long way, but again, manually figuring out whether a name is a nested class or static/enum field is annoying enough since it involves either repeated calls to JAPROPOS, or reading more Java documentation.

Apart from that, this is mostly a direct transcription. Unfortunately written this way there's no point in creating a WITH-OPEN-* macro to automatically close the writer, however, looking at the LuceneUtils source this could be accomplished by directly calling close on the writer object instead - a corresponding macro might this then:

(defmacro with-open ((name value) &body body)
  `(let ((,name ,value))
          (progn ,@body)
       (#"close" ,name))))

It would also be nice to have auto conversion using keywords for enum values instead of needing to look up the value manually.

Querying an index

The other way round, looking up related pictures by passing in an example, is done using an image searcher:

(defun query-index (index-name pathname)
  (let* ((image (read-image pathname))
         (index-reader (#"open" 'DirectoryReader
                                (#"open" 'FSDirectory
                                         (#"get" 'Paths index-name (jnew-array "java.lang.String" 0))))))
         (let* ((image-searcher (new 'GenericFastImageSearcher 30 (find-java-class 'CEDD)))
                (hits (#"search" image-searcher image index-reader)))
           (dotimes (i (#"length" hits))
             (let ((found-pathname (#"getValues" (#"document" index-reader (#"documentID" hits i))
                                                 (get-java-field 'builders.DocumentBuilder "FIELD_NAME_IDENTIFIER"))))
               (format T "~F: ~A~%" (#"score" hits i) found-pathname))))
      (#"closeReader" 'LuceneUtils index-reader))))

To note here is that the get call on java.nio.file.Paths took way more time to figure out than should've been necessary: Essentially the method is using a variable number of arguments, but the FFI doesn't help in any way, so the array (of the correct type!) needs to be set up manually, especially if the number of variable arguments is zero. This is not obvious at first and also takes unnecessary writing.

The rest of the code is straightforward again. At least a common wrapper for the length call would be nice, but since the result object doesn't actually implement a collection interface, the point about having better collection iteration is kind of moot here.

A better DSL

Considering how verbose the previous examples were, how would the "ideal" way look like?

There are different ways which are more, or less intertwined with Java semantics. On the one end, we could imagine something akin to "Java in Lisp":

(defun read-image (pathname)
  (ImageIO/read (FileInputStream. pathname)))

Which is almost how it would look like in Clojure. However, this is complicating semantics. While importing would be an extension to the package mechanism (or possibly just a file-wide setting), the Class/field syntax and Class. syntax are non-trivial reader extensions, not from the actual implementation point of view, but from the user point of view. They'd basically disallow a wide range of formerly legal Lisp names.

(defun read-image (pathname)
  (#"read" 'ImageIO (new 'FileInputStream pathname)))

This way is the middle ground that we have now. The one addition here could be that name lookup is done at macro expansion / compilation time, so they are fixed one step before execution, whereas at the moment the JSS reader macro will allow for very late bound name lookup instead.

The similarity with CLOS would be the use of symbols for class names, but the distinction is still there, since there's not much in terms of integrating CLOS and Java OO yet (which might not be desirable anyway?).

Auto-coercion to Java data types also takes place in both cases. Generally this would be appropriate, except for places where we'd really want the Java side to receive a Lisp object. Having a special variable to disable conversion might be enough for these purposes.

If we were to forego the nice properties of JSS by requiring a function form, the following would be another option:

(defun read-image (pathname)
  $(read 'ImageIO (new 'FileInputStream pathname)))

Where $(...) would be special syntax indicating a Java method call. Of course the exact syntax is not very relevant, more importantly static properties could be used to generate a faster, early bound call by examining the supplied arguments as a limited form of type inference.


After introducing the necessary steps to start using ABCL with "native" Java libraries, we transcribed two example programs from the library homepage.

Part of this process was to examine how the interaction between the Common Lisp and Java parts looks like, using the "raw" and the simplified JSS API. In all cases the FFI is clunkier than needs be. Especially the additional Java namespaces are making things longer than necessary. The obvious way of "importing" classes by storing a reference in a Lisp variable is viable, but again isn't automated.

Based on the verbose nature of the Java calls an idea about how a more concise FFI DSL could look like was developed next and discussed. At a future point in time this idea could now be developed fully and integrated (as a contrib) into ABCL.

Image viewers and other systems tools
posted on 2015-06-12 16:20:23

After all the trouble I went through with trying to get an image viewer to work in CL (due to the used GTK+ 3 library being problematic, that is, unmaintained) maybe a different approach is more viable. It would be possible to use one existing program as a front end by calling it via IPC. So e.g. feh, my go-to program for that task, already has configurable keybindings; it should be a smaller problem to remote control it (even with adding some code).

However, as with all these *nix combinators, it feels like a mish-mash of tools intertwined and not-quite the optimal solution.

Consider what happens if you want to add new functionality, i.e. new widgets. In that case composability breaks down since feh is relatively minimal and therefore doesn't have much options in terms of providing different menus, input widgets, etc. Therefore you'd have to find either a different viewer with more scripting capabilities (which is counter to the "one-tool" mantra), or switch to a more integrated approach to have this component as an internal part of your environment.

Obviously now would be the time for either components/CORBA, or a Lisp Machine to hack up other programs.

Or switch to Qt. It seems that the bindings for that framework are more stable than the GTK bindings and additionally they (Qt) just have more people working on the framework.

Since one of the problems with the GTK bindings is the relatively recent upgrade to GTK+ 3, there seems to be a point in using the previous version 2 instead, considering that even GIMP didn't update yet.

ELS 2015
posted on 2015-04-22 21:35:23+01:00

Yesterday the 8th European Lisp Symposium finished. In short it was a great experience (I was there the first time, but hopefully not the last). The variety and quality of talks was great, a good number of people attended both the actual talks as well as both(!) dinners, so there were lots of opportunities to exchange thoughts and quiz people, including on Lisp. Also except for one talk I believe all talks happened, which is also a very good ratio.

For the talks I still have to go through the proceedings a bit for details, but obviously the talk about the Lisp/C++ interoperability with Clasp was (at least for me) long awaited and very well executed. Both the background information on the origins, as well as the technical description on the use of LLVM and the integration of multiple other projects (ECL, SICL, Cleavir) were very interesting and informative.

There were also quite a number of Racket talks, which was surprising to me, but given the source of these projects it makes sense since the GUI is pretty good. VIGRA, although it's a bit unfortunate name, looks pretty nice. The fact that the bindings to a number of languages are available and in the case of the Lisps make the interaction a lot easier is good to see, so it might be a good alternative to OpenCV. It's also encouraging that students enjoy this approach and are as it seems productive with the library.

P2R, the Processing implementation in Racket is similarly interesting as there is a huge community using Processing and making programming CAD applications easier via a known environment is obviously nice and should give users more opportunities in that area.

If I remember correctly the final Racket talk was about constraining application behaviour, which was I guess more of a sketch how application modularity and user-understandable permissions could be both implemented and enforced. I still wonder about the applicability in e.g. a Lisp or regular *nix OS.

The more deeply technical talks regarding the garbage collector (be it in SBCL, or Allegro CL) were both very interesting in that normally I (and I imagine lots of people) don't have (a chance) to get down to that level and therefore learning about some details about those things is appreciated.

Same goes for the first talk by Robert Strandh, Processing List Elements in Reverse Order, which was really great to hear about in the sense that I usually appreciate the :from-end parameter of all the sequence functions and still didn't read the details of the interaction between actual order of iteration vs. the final result of the function. Then again, the question persists if any programs are actually processing really long lists in reverse in production. Somehow the thought that even this case is optimised would make me sleep easier, but then again, the tradeoff of maintainable code vs. performance improvements remains (though I don't think that the presented code was very unreadable).

Escaping the Heap was nice and it'll be great to see an open-sourced library for shared memory and off-heap data structures, be it just for special cases anyway.

Lots of content, so I doubt I'll get to the lightning talks. It'll be just this for now then. Hopefully I have time/opportunity to go to the next ELS or another Lisp conference; I can only recommend going.

A mocking library for Common Lisp
posted on 2015-01-06 14:52:47+01:00

After some time thinking about and rewriting the library in a subtly different approaches, CL-MOCK now looks good to me as a version one.

I've removed all mentions of generic functions for now, as first of all I'm unsure if functionality to dynamically rebind methods is even necessary, and second, because doing that is complicated by the details of that protocol. Which means that specifying which method to override is a bit hairy and I really want a good syntax before I let that stuff loose. So it'll have to wait until I figure out a good way to do that. Since it should be easily added to the existing frontend, it will very probably be done with some overloading of existing functions / macros (e.g. with a :method specifier or so).

I'm hoping to test all of this and possibly investigate the generic function issue on some other library. At the moment my single more complex example is a replacement for the DRAKMA HTTP-REQUEST call, which worked surprisingly well and might even make it into a new test suite. The benefit is obviously the improved reliability of not having to have a running network connection for testing libraries against a (HTTP) server.

Tumblelog design
posted on 2014-11-28 21:30

Yay, found time to give my tumble log a little theme upgrade. And while I'm doing that, I thought I could write down some thoughts on the current design. I might even update it later.

In contrast to this Coleslaw blog I've made a small static renderer to HTML/ATOM for the tumble log, so the interface is mostly console vim to text file with S-expressions. Since at the moment there is only a single file I might just show the syntax for that one:

;; -*- mode: lisp; coding: utf-8-unix; -*-

(video :id 1 :date @2013-05-23T00:00:00+01:00 :tags (music) :url #U"" :title "I Charleston Tel Aviv" :via #U"")
(image :id 3 :date @2014-08-16T01:01:00+01:00 :url #U"" :src #P"data/tumblr_n17lclGKp51qdewwao1_1280.jpg" :alt "Scene from the mange BLAME!")
(text :id 22 :date @2014-08-28T21:22:58.047809+01:00 :tags (emacs lisp) :text #"`(global-set-key (kbd "H-1") ...)`  Hyper Hyper!  In other words, bind `Caps-Lock` to `Hyper` and rejoice!"#)
(link :id 88 :date @2014-11-20T17:27:05.107413+01:00 :url #U"" :title "HTSQL")

Clearly this would be viable to store in a DBMS as well, but this is good enough as they say. Featuring PURI, a few other syntax extensions, (my fork of) CL-WHO (for XML extensions) and CL-MARKDOWN for text rendering. The Markdown rendering could use a few fixes for edge cases, but in general it works pretty well and fast. Tags are also written to separate files (both HTML and ATOM), so readers could actually restrict themselves to a few subsets, even though it's not a by any stretch a full-fledged query system.

Still, by rendering to static files a lot of problems go away and the whole site can be served very efficiently via a small web server (I'm leaving Hunchentoot/teepeedee2 for more involved projects).

Lisp layered on Unix
posted on 2014-11-27 21:21:10

This blog post on LispCast is a pretty good start to get thinking about the intricacies of the interaction between Lisp (Machine) ideas and the current Unix environment. Of course that includes plan9 on the one side and Emacs on the other.

Lisp shell

There is scsh, but it's not really what I'm looking for. Using emacs as login shell (with the eshell package) comes closest to it regarding both with existing commands and integration of Lisp-based ones. However, while pipes work as expected with eshell, data is still passed around as (formatted) text. There doesn't seem to be an easy way to pass around in-memory objects, at least while staying in Emacs itself. That would of course mean to reimplement some (larger?) parts of that system.

This all ties in to the idea that unstructured text isn't the best idea to represent data between processes. Even though Unix pipes are extremely useful, the ecosystem of shell and C conventions means that the obvious way isn't completely correct, meaning that there are edge cases to consider. The best is something as innocent as ls | wc -l, which will break, depending on the shell settings, with some (unlikely) characters in filenames, i.e. newlines.

Common formats

One of the problems is obviously that in order to pass around structured data, i.e. objects, all participants have to understand their format. Passing references won't work without OS support though.

Instead of having unstructured streams, use streams of (data) objects. The distinction here is Plain Old Objects (PODs) instead of objects with an associated behaviour.

Let's take a look at standard Unix command line tools (I'm using GNU Coreutils here) in order to reproduce the behaviour and/or intent behind them:

Output of entire files

The first command here is cat. Although GNU cat includes additional transformations, this command concatenates files. Similar to the description, we can image a CAT to perform a similar operation on streams of objects.

It doesn't make much sense to concatenate a HTML document and an MP3 file (hence you won't do it in most cases anyway). However, since files are unstructured, cat can work on them.

Registering functionality

Although you can call commands individually on files, some of them form an ad-hoc service interface already: The C compiler, along with the toolchain forms one such interface, where you're required to use the same interface if you want to seamlessly replace one part of the toolchain.

Same goes for the Coreutils: As long as you honour the interface, programs can be replaced with different implementations.

Interactive commands

Emacs has a special form interactive to indicate whether a command can be directly called via the command prompt. There is also special handling there to accomodate the use of interactive arguments. This is something that can be generalised to the OS. An example of where this already happens is the .mailcap file and the .desktop files from desktop suites.

Threading and job control

Unfortunately getting proper job control to work will be a bit of a problem in any Lisp implementation, since the best way to implement the concurrent jobs is using threads, which are not particularly suited for handling the multitude of signals and other problems associated with them. Even without job control pipelines implemented in Lisp require shared in-memory communication channels, so something like (object-based) streams, mailboxes, or queues are necessary to move IO between different threads.

Again, Coleslaw
posted on 2014-08-18 23:11:32

Another year, another blog. Well, in this case I'd already setup another Coleslaw instance some time ago, but didn't bother to actually fix some issues. It's still not the best setup, but it's fixable. Removing all mentions of Quicklisp (because in the way it's used it should rather be replaced by ASDF dependencies) and all COMPILE-FILE statements would be a start (because in my setup the git user won't have permissions to write FASL files in that directory).

But the point stands: A fix for both of these issues is not obvious. An additional ASD file for each plugin is a bit wasteful, but probably one of the better options, apart from the need to register plugins. COMPILE-FILE is more complicated. However maybe not using compilation explicitely would be enough to fix this in the short run.

And it badly needs a theme, any theme, at least something different from the standard one.

This blog covers unix, tachikoma, postgresql, lisp, kotlin, java, git, emacs

View content from 2014-08, 2014-11, 2014-12, 2015-01, 2015-02, 2015-04, 2015-06, 2015-08, 2015-11, 2016-08, 2016-09, 2016-10, 2016-11, 2017-06, 2017-07

Unless otherwise credited all material Creative Commons License by Olof-Joachim Frahm