Content from 2015-11
Intro
The purpose of this article is to examine what using ABCL with existing libraries (arguably the main point of using ABCL at the moment) actually looks like in practice. Never mind integration with Spring or other more involved frameworks: this will only touch a single library and won't require us to write classes that are callable from Java.
In the process of refining this I'm hoping to also get ideas about the requirements for building a better DSL for the Java FFI, based on the intended "look" of the code (that is, coding by wishful thinking).
Setup
Ensuring the correct package is somewhat optional:
(in-package #:cl-user)
Generally using JSS is a bit nicer than the plain Java FFI. After the contribs are loaded, JSS can be required and used:
(require '#:abcl-contrib)
(require '#:jss)
(use-package '#:jss)
Next, we need access to the right libraries. After building LIRE from source and executing the mvn dist command we end up with a JAR file for LIRE and several dependencies in the lib folder. All of them need to be on the classpath:
(progn
  (add-to-classpath "~/src/LIRE/dist/lire.jar")
  (mapc #'add-to-classpath (directory "~/src/LIRE/dist/lib/*.jar")))
Prelude
Since we're going to read pictures in a couple of places, a helper to load one from a pathname is a good start:
(defun read-image (pathname)
  (#"read" 'javax.imageio.ImageIO (new 'java.io.File (namestring pathname))))
Three things are worth noting here: the use of NEW from JSS with a symbol for the class name; the conversion of the pathname to a regular string, since the Java side doesn't expect a Lisp object; and the #"" reader syntax from JSS, which invokes the method read in a somewhat simpler way than using the FFI calls directly.
JSS will automatically "import" Java names, so the same function can simply be the following instead (provided that the names aren't ambiguous):
(defun read-image (pathname)
  (#"read" 'ImageIO (new 'File (namestring pathname))))
The names will be looked up again on every call though, so this option isn't the best performing one.
For comparison, the raw FFI is a bit more verbose, but explicitly specifies all names:
(defun read-image (pathname)
  (jstatic "read" "javax.imageio.ImageIO" (jnew "java.io.File" (namestring pathname))))
A combination of JSS and cached class lookups could be nicer to read, although the setup is more verbose:
(defvar +image-io+ (jclass "javax.imageio.ImageIO"))
(defvar +file+ (jclass "java.io.File"))
(defun read-image (pathname)
  (#"read" +image-io+ (jnew +file+ (namestring pathname))))
At this point without other improvements (auto-coercion of pathnames, importing namespaces) it's about as factored as it will be (except moving every single call into its own Lisp wrapper function).
Building an index
To keep it simple, building the index is done in a single step from a list of pathnames, with the path of the index passed as a separate parameter:
(defun build-index (index-name pathnames)
  (let ((global-document-builder
          (new 'GlobalDocumentBuilder (find-java-class 'CEDD)))
        (index-writer (#"createIndexWriter"
                       'LuceneUtils
                       index-name
                       +true+
                       (get-java-field 'LuceneUtils$AnalyzerType "WhitespaceAnalyzer"))))
    (unwind-protect
         (dolist (pathname pathnames)
           (let ((pathname (namestring pathname)))
             (format T "Indexing ~A ..." pathname)
             (let* ((image (read-image pathname))
                    (document (#"createDocument" global-document-builder image pathname)))
               (#"addDocument" index-writer document))
             (format T " done.~%")))
      (#"closeWriter" 'LuceneUtils index-writer))))
Note: This code won't work on current ABCL as is, because the lookup is disabled for nested classes (those containing the dollar character). Because of this, the AnalyzerType class would have to be looked up as follows:
(jfield "net.semanticmetadata.lire.utils.LuceneUtils$AnalyzerType" "WhitespaceAnalyzer")
All in all nothing fancy; JSS saves a lot of typing as the names are all unique enough.
The process is simply creating the document builder and index writer, reading all the files one by one and adding them to the index. There's no error checking at the moment though.
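As a rough idea of what minimal error checking could look like, the following sketch wraps an arbitrary reader function so that a failing file is reported and skipped instead of aborting the whole run. The name CALL-SKIPPING-ERRORS is made up for this example, and it assumes that Java exceptions arrive on the Lisp side as conditions inheriting from ERROR (which, as far as I can tell, is the case for ABCL's JAVA-EXCEPTION).

```lisp
;; Hypothetical helper, not part of ABCL or JSS.
(defun call-skipping-errors (function pathname)
  "Call FUNCTION with PATHNAME; on any ERROR, log it and return NIL."
  (handler-case (funcall function pathname)
    (error (condition)
      ;; Report the problem on the error stream and carry on.
      (format *error-output* "~&Skipping ~A: ~A~%" pathname condition)
      nil)))
```

Inside BUILD-INDEX the image would then be read with (call-skipping-errors #'read-image pathname), and the document only created and added when the result is non-NIL.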
Note that looking up the precise kind of a Java name is a bit of a hassle. Of course intuition goes a long way, but manually figuring out whether a name is a nested class or a static/enum field is annoying, since it involves either repeated calls to JAPROPOS or reading more Java documentation.
Apart from that, this is mostly a direct transcription. Unfortunately, written this way there's no point in creating a WITH-OPEN-* macro to automatically close the writer. However, looking at the LuceneUtils source, this could be accomplished by calling close directly on the writer object instead. A corresponding macro might then look like this:
(defmacro with-open ((name value) &body body)
  `(let ((,name ,value))
     (unwind-protect
          (progn ,@body)
       (#"close" ,name))))
It would also be nice to have auto conversion using keywords for enum values instead of needing to look up the value manually.
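One small building block for such a conversion is turning a keyword into the Java constant name. The sketch below does this in plain Common Lisp; KEYWORD-TO-JAVA-NAME is a made-up name, and the function only covers CamelCase constants like the one used above, not the UPPER_SNAKE_CASE names many Java enums use.

```lisp
;; Hypothetical helper: only handles CamelCase constant names such as
;; WhitespaceAnalyzer, not UPPER_SNAKE_CASE ones.
(defun keyword-to-java-name (keyword)
  "Convert :WHITESPACE-ANALYZER to the string \"WhitespaceAnalyzer\"."
  ;; STRING-CAPITALIZE yields "Whitespace-Analyzer"; dropping the
  ;; hyphens gives the CamelCase name.
  (remove #\- (string-capitalize (symbol-name keyword))))
```

With that, the manual lookup could become (jfield "net.semanticmetadata.lire.utils.LuceneUtils$AnalyzerType" (keyword-to-java-name :whitespace-analyzer)), although the class name still has to be supplied by hand.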
Querying an index
The other way round, looking up related pictures by passing in an example, is done using an image searcher:
(defun query-index (index-name pathname)
  (let* ((image (read-image pathname))
         (index-reader (#"open" 'DirectoryReader
                                (#"open" 'FSDirectory
                                         (#"get" 'Paths index-name (jnew-array "java.lang.String" 0))))))
    (unwind-protect
         (let* ((image-searcher (new 'GenericFastImageSearcher 30 (find-java-class 'CEDD)))
                (hits (#"search" image-searcher image index-reader)))
           (dotimes (i (#"length" hits))
             (let ((found-pathname (#"getValues" (#"document" index-reader (#"documentID" hits i))
                                                 (get-java-field 'builders.DocumentBuilder "FIELD_NAME_IDENTIFIER"))))
               (format T "~F: ~A~%" (#"score" hits i) found-pathname))))
      (#"closeReader" 'LuceneUtils index-reader))))
Note that the get call on java.nio.file.Paths took way more time to figure out than should have been necessary. The method takes a variable number of arguments, but the FFI doesn't help here in any way, so the array (of the correct type!) needs to be set up manually, even if the number of variable arguments is zero. This is not obvious at first and also requires unnecessary typing.
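A tiny wrapper can at least hide the array setup. This is only a sketch: STRING-VARARGS is a made-up name, and it relies on JNEW-ARRAY-FROM-LIST and JARRAY-LENGTH from ABCL's JAVA package, so it will only run on ABCL.

```lisp
;; Hypothetical helper; JNEW-ARRAY-FROM-LIST comes from ABCL's JAVA package.
(defun string-varargs (&rest strings)
  "Return a java.lang.String[] holding STRINGS (possibly none at all)."
  (java:jnew-array-from-list "java.lang.String" strings))
```

The call from QUERY-INDEX would then read (#"get" 'Paths index-name (string-varargs)), which at least states the intent, even though the FFI still doesn't handle variable arguments by itself.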
The rest of the code is straightforward again. At least a common wrapper for the length call would be nice, but since the result object doesn't actually implement a collection interface, the point about having better collection iteration is kind of moot here.
A better DSL
Considering how verbose the previous examples were, what would the "ideal" way look like?
There are different options, which are more or less intertwined with Java semantics. At one end, we could imagine something akin to "Java in Lisp":
(defun read-image (pathname)
  (ImageIO/read (FileInputStream. pathname)))
This is almost what it would look like in Clojure. However, it complicates the semantics. While importing would be an extension to the package mechanism (or possibly just a file-wide setting), the Class/field syntax and the Class. syntax are non-trivial reader extensions, not from the implementation point of view, but from the user's point of view: they'd basically disallow a wide range of formerly legal Lisp names.
(defun read-image (pathname)
  (#"read" 'ImageIO (new 'FileInputStream pathname)))
This way is the middle ground that we have now. The one addition here could be that name lookup is done at macro expansion / compilation time, so that names are fixed one step before execution, whereas at the moment the JSS reader macro allows for very late bound name lookup instead.
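Until something like that exists, a similar effect can be approximated by hand with the plain FFI: resolve the method and constructor once and reuse the references afterwards. The sketch below assumes that JMETHOD and JCONSTRUCTOR references can be passed to JSTATIC and JNEW in place of names, which is my reading of the ABCL manual. The function name READ-IMAGE-EARLY is made up, and the code only runs on ABCL.

```lisp
;; Sketch only: in compiled code the Java lookups happen once, when the
;; function is loaded, instead of on every call.
(defun read-image-early (pathname)
  (java:jstatic (load-time-value
                 (java:jmethod "javax.imageio.ImageIO" "read" "java.io.File"))
                "javax.imageio.ImageIO"
                (java:jnew (load-time-value
                            (java:jconstructor "java.io.File" "java.lang.String"))
                           (namestring pathname))))
```

This is exactly the kind of boilerplate a better DSL would generate automatically, since all the required names are already known when the form is read.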
The similarity with CLOS would be the use of symbols for class names, but the distinction is still there, since there's not much in terms of integrating CLOS and Java OO yet (which might not be desirable anyway?).
Auto-coercion to Java data types also takes place in both cases. Generally this would be appropriate, except for places where we'd really want the Java side to receive a Lisp object. Having a special variable to disable conversion might be enough for these purposes.
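To make the idea of such a switch a bit more concrete, here is a minimal sketch in plain Common Lisp. Both *COERCE-ARGUMENTS* and COERCE-ARGUMENT are made-up names, and only pathnames are converted, purely as an illustration of the mechanism.

```lisp
;; Hypothetical names; nothing like this exists in ABCL or JSS.
(defvar *coerce-arguments* t
  "When true, Lisp objects are converted before being handed to Java.")

(defun coerce-argument (object)
  "Return OBJECT converted for the Java side, unless coercion is disabled."
  (if (and *coerce-arguments* (pathnamep object))
      (namestring object)   ; pathnames become plain strings
      object))              ; everything else passes through unchanged
```

Binding the variable with (let ((*coerce-arguments* nil)) ...) around a call would then hand the Lisp object through untouched for that dynamic extent only.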
If we were to forego the nice properties of JSS by requiring a function form, the following would be another option:
(defun read-image (pathname)
  $(read 'ImageIO (new 'FileInputStream pathname)))
Where $(...) would be special syntax indicating a Java method call.
Of course the exact syntax is not very relevant. More importantly, static properties could be used to generate a faster, early bound call by examining the supplied arguments as a limited form of type inference.
Summary
After introducing the necessary steps to start using ABCL with "native" Java libraries, we transcribed two example programs from the library homepage.
Part of this process was to examine what the interaction between the Common Lisp and Java parts looks like, using the "raw" and the simplified JSS API. In all cases the FFI is clunkier than it needs to be. In particular, the additional Java namespaces make things longer than necessary. The obvious way of "importing" classes by storing a reference in a Lisp variable is viable, but again isn't automated.
Based on the verbose nature of the Java calls, an idea of what a more concise FFI DSL could look like was then developed and discussed. At a future point in time this idea could be developed fully and integrated (as a contrib) into ABCL.