Recent Content

Scala Macros to the Rescue
posted on 2018-12-20 21:38:45+01:00

Did you know Scala has macros? Coming from Common Lisp, I find they serve pretty much the same purpose: doing things that the (plethora of) other language features don't support, and cutting down on the resulting boilerplate code. Even the S-expressions can be had when macro debugging is turned on, though the pretty-printed Scala code is arguably much more useful here.

Why would you realistically use them then? Turns out I had to deal with some auto-generated code dealing with Protobuf messages. The generated classes for any message look something like this (in Java syntax since that's what the generated code is):

public interface ExampleResponseOrBuilder
  extends com.google.protobuf.MessageOrBuilder;

public static final class ExampleResponse
  extends com.google.protobuf.GeneratedMessageV3
  implements ExampleResponseOrBuilder {

  public static Builder newBuilder();

  public static final class Builder
    extends com.google.protobuf.GeneratedMessageV3.Builder<Builder>
    implements ExampleResponseOrBuilder;
}

That is, we have one interface and two classes, one of which conveniently gives you a builder for new objects of that class. It's used like this (back to Scala here):

val builder: ExampleResponse.Builder = ExampleResponse.newBuilder()
builder.mergeFrom(stream)
val result: ExampleResponse = builder.build()

If you try and make a generic builder here, you'll quickly notice that this is rather hard as the generic types don't really express the relationship between ExampleResponse and ExampleResponse.Builder well.

As an aside, you want to have a generic builder parametrised on the return type to be able to write something like this:

val result = build[ExampleResponse](stream)

That way the type never has to be passed through as a value. Even better would be to specify only the result type and have the type parameter for build derived automatically.

These builders look something like this then:

trait ProtobufBuilder[T <: Message] {
  def underlying(): Message.Builder

  def build(string: String)(implicit parser: JsonFormat.Parser): T = {
    val builder = underlying()
    parser.merge(string, builder)
    builder.build().asInstanceOf[T]
  }
}

class ExampleResponseBuilder() extends ProtobufBuilder[ExampleResponse] {
  override def underlying(): ExampleResponse.Builder =
    ExampleResponse.newBuilder()
}

This then allows us to use some implicit magic to pass these through to the decoder (sttp's framework in this case) to correctly decode the incoming data.

But we have to (1) write one class for each type and (2) instantiate it. This is roughly five lines of code per type, depending on the formatting.

Macros to the rescue!

Inspired by the circe derivation API I finally got all the pieces together to create such a macro:

def deriveProtobufBuilder[T <: Message]: ProtobufBuilder[T] = macro deriveProtobufBuilder_impl[T]

def deriveProtobufBuilder_impl[T <: Message: c.WeakTypeTag](
    c: blackbox.Context): c.Expr[ProtobufBuilder[T]] = {
  import c.universe._

  val messageType   = weakTypeOf[T]
  val companionType = messageType.typeSymbol.companion

  c.Expr[ProtobufBuilder[T]](q"""
    new ProtobufBuilder[$messageType] {
      override def underlying(): $companionType.Builder = $companionType.newBuilder()
    }
   """)
}

Used then like this:

private implicit val exampleResponseBuilder: ProtobufBuilder[ExampleResponse] = deriveProtobufBuilder

That's one or two lines and the types are only mentioned once (the variable name can be changed). Unfortunately getting rid of the variable name doesn't seem to be possible.

Easy, wasn't it? Unfortunately all of this is hampered by the rather undocumented APIs; you really have to search for existing code or Stack Overflow questions to figure this out.

One thing that helped immensely was the -Ymacro-debug-lite option, which prints the expanded macro when used in sbt via compile.

Act! Heuristics for determining intent.
posted on 2018-12-09 14:11:18+01:00

I wrote down a little scenario for myself for the next iteration of act:

  • mark "http://www.google.de" with mouse (selection -> goes to x11 buffer-cut)
  • listen on those changes and show current context and choices
  • press+release "act primary" key -> runs primary action immediately
  • (or) press "act" key once -> opens buffer for single character input to select choice
  • buffer captures keyboard, after single character action is taken, esc aborts, "act" again should expand the buffer to full repl
  • mouse -> selection of source? history?
  • context -> focused window? -> lookup for program gives cwd, etc.
  • since that's not foolproof an obvious addition to the protocol would be to store a reference to the source of the clipboard (or the program that will be queried for it)
  • you should always be able to interrogate windows about their origin and capabilities, including their scripting interface
  • pattern matching url -> rule based decisions
  • primary / secondary actions -> rule of three basically, except we err on the side of caution and just bind two actions to two keys
  • special handling for clipboard types that aren't text? allow for example to override a rule based on whether a picture would be available
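
The "pattern matching url -> rule based decisions" and "primary / secondary actions" items could be sketched roughly as follows. This is only an illustration in Python: the rule table, the action names and the function name are all made up here and are not part of act.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One rule: a pattern plus the two actions bound to the two keys."""
    pattern: re.Pattern
    primary: str
    secondary: str


# Hypothetical rule table; the first match wins, so order matters.
RULES = [
    Rule(re.compile(r"^https?://\S+$"), "open-in-browser", "download"),
    Rule(re.compile(r"^/\S*$"), "open-in-editor", "show-in-file-manager"),
]


def choices_for(selection: str):
    """Return (primary, secondary) for the selection, or None if no rule matches."""
    text = selection.strip()
    for rule in RULES:
        if rule.pattern.match(text):
            return rule.primary, rule.secondary
    return None


print(choices_for("http://www.google.de"))
```

Keeping the rules as plain data would make it easy to show the available choices for the current selection before any key is pressed.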

Now, there are several problems with this on a regular X11-based desktop already:

  • How do you identify the program / PID belonging to a particular window?
  • How do you get information about that program, e.g. its current working directory?

In addition, since I'm using tmux inside a terminal emulator there are some more problems:

  • How do you know which tmux session is currently active?
  • How do you know which program is currently "active" in a tmux session?

Basically it's a recursive lookup for the "current state" of what's being displayed to the user. Not only that, but for things like browsers, image editors, video editors, anything document based it's still the same problem at another level, namely, what the current "context" is, like the currently open website, picture, video, scene, what have you.

Coming back to earlier thoughts about automation, there's no way for most of these to be accurately determined at this time. DBus scripting, for example, isn't common enough to "just use it"; there are also several links missing in the scenario described above, and some can likely never be fixed to a sufficient degree to not rely on heuristics.

Nevertheless, with said heuristics it's still possible to get to a state where a productivity gain can be achieved with only a moderate amount of additional logic to step through all the indirections between processes and presentation layers.

Now let me list a few answers to the questions raised above:

  • The PID for a particular window can be derived from an X11 property, together with xdotool this gives us an easy pipeline to get this value: ``.
  • Information about the running process can then be retrieved via the proc filesystem, e.g. readlink /proc/$PID/cwd for the current working directory. Of course this has limited value in a multi-threaded program or any environment that doesn't rely on the standard filesystem interface (but uses its own defaults).
  • I do not have an answer for the currently active tmux session yet, presumably you should somehow be able to get from a PID to a socket and thus to the session?
  • For tmux, the currently active program in a session is a bit more complex, ``, which we'll also have to filter for the active values.
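
The proc filesystem part of this can be sketched in a few lines of Python. The PID is taken as given here (how it is obtained from the window is outside this sketch), the function names are made up, and the code assumes a Linux /proc layout.

```python
import os


def process_cwd(pid: int) -> str:
    """Resolve the cwd symlink that Linux exposes under /proc for a process."""
    return os.readlink(f"/proc/{pid}/cwd")


def process_command(pid: int) -> list:
    """Return the argument vector of a process, split on the NUL separators."""
    with open(f"/proc/{pid}/cmdline", "rb") as handle:
        raw = handle.read()
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


# Try it on ourselves; any PID we are allowed to inspect works the same way.
print(process_cwd(os.getpid()))
print(process_command(os.getpid()))
```

Both calls fail with a permission error for processes of other users, which is one more place where the lookup has to fall back to heuristics.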

For scripting interfaces, many programs have their own little implementation of this, but most problematic here is that you want to go from an X11 window / PID to the scripting interface, not through some workaround by querying for interfaces!

For programs like Emacs and any programmable environment we can likely script something together, but again it's not a very coherent whole by any means.

DBus and PolicyKit from Common Lisp
posted on 2018-12-04 23:56:27+01:00

Intro

Regardless of your position on DBus, sometimes you might need to interact with it. Common Lisp currently has at least two libraries ready for you, one of them is in Quicklisp, https://github.com/death/dbus/.

Setup

Have it loaded and create a package to use it, then change into it.

;; (asdf:load-system '#:dbus)
;; or
;; (ql:quickload "dbus")

(defpackage #:example
    (:use #:cl #:dbus))

(in-package #:example)

I'm going to refer to a (very old) polkit example in Python, reproduced here for reference (it still works in current Python 3 without any changes except the print):

import dbus

bus = dbus.SystemBus()
proxy = bus.get_object('org.freedesktop.PolicyKit1', '/org/freedesktop/PolicyKit1/Authority')
authority = dbus.Interface(proxy, dbus_interface='org.freedesktop.PolicyKit1.Authority')

system_bus_name = bus.get_unique_name()

subject = ('system-bus-name', {'name' : system_bus_name})
action_id = 'org.freedesktop.policykit.exec'
details = {}
flags = 1            # AllowUserInteraction flag
cancellation_id = '' # No cancellation id

result = authority.CheckAuthorization(subject, action_id, details, flags, cancellation_id)

print(result)

So, how does this look in Common Lisp? Mostly the same, except that at least at the moment you have to specify the variant type explicitly! This was also the reason to document the example, it's quite hard to understand what's wrong if there's a mistake, including the socket connection just dying on you and other fun stuff.

(with-open-bus (bus (system-server-addresses))
  (with-introspected-object (authority bus
                                       "/org/freedesktop/PolicyKit1/Authority"
                                       "org.freedesktop.PolicyKit1")
    (let* ((subject `("system-bus-name" (("name" ((:string) ,(bus-name bus))))))
           (action-id "org.freedesktop.policykit.exec")
           (details ())
           (flags 1)
           (cancellation-id "")
           (result
                (authority "org.freedesktop.PolicyKit1.Authority" "CheckAuthorization"
                           subject action-id details flags cancellation-id)))
      (format T "~A~%" result))))

Note the encoding of the dictionary: The type of the whole argument is specified as (sa{sv}), a structure of a string and a dictionary of strings to variants - we're spelling out the variant type here, compared to what's automatically done by the Python library.
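
To make the (sa{sv}) notation a bit more tangible, here is a small self-contained Python sketch that derives such a signature from a nested value. It uses neither of the DBus libraries mentioned above; the Variant marker class, the signature function and the bus name ":1.42" are made up for illustration, and only strings, structures, dictionaries and variants are covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Marks a value that is sent as a D-Bus variant (signature code "v")."""
    value: object


def signature(value) -> str:
    """Derive the D-Bus type signature for a small subset of Python values."""
    if isinstance(value, Variant):
        return "v"
    if isinstance(value, str):
        return "s"
    if isinstance(value, tuple):  # struct: member signatures in parentheses
        return "(" + "".join(signature(member) for member in value) + ")"
    if isinstance(value, dict):  # dictionary: an array of {key value} entries
        if not value:
            raise ValueError("cannot infer the entry type of an empty dictionary")
        key, entry = next(iter(value.items()))
        return "a{" + signature(key) + signature(entry) + "}"
    raise TypeError(f"unsupported type: {type(value).__name__}")


# The subject from the example, with a made-up bus name.
subject = ("system-bus-name", {"name": Variant(":1.42")})
print(signature(subject))  # prints (sa{sv})
```

The variant is the one place where the type can't be read off the signature alone, which is why the Lisp code has to spell out (:string) next to the value.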

Lisp shell semantics
posted on 2018-08-08 01:36:29

Continuing from an earlier post, what might the semantics be that we'd like to have for a more useful shell integrated with a Lisp environment?

Syntax

Of course we can always keep the regular Common Lisp reader; however it's not best suited for interactive shell use. In fact I'd say interactive use period requires heavy use of editing commands to juggle parens. Which is why some people use one of the simplifications of not requiring the outermost layer of parens on the REPL.

So, one direction would be to have better S-expression manipulation on the REPL, the other to have a syntax that's more incremental than S-expressions.

E.g. imagine the scenario that,

  1. as a terminal user,
  2. I'm navigating directories,
  3. listing files,
  4. then looking for how many files there were,
  5. then grepping for a particular pattern,
  6. then opening one of the matches.

In sh terms that's something along the lines of

$ cd foo
$ ls
...
$ ls | wc -l
42
$ ls | grep bar
...
$ vim ...

In reality I'm somewhat sure no one's going as far as doing the last two steps in an iterative fashion, like via ls | grep x | grep y | xargs vim, as most people will have a mouse readily available to select the desired name. There are some terminal widgets which allow the user to select from e.g. one of the input lines in a command line dialog, but again it's not a widespread pattern and still requires manual intervention in such a case.

Also note that instead of reusing the computation, the whole expression keeps being re-evaluated, which also makes this somewhat inefficient. The notion of the "current" thing being worked on ("this", "self") also isn't expressed directly here.

In the new shell I'd like to see part of this explored. We already have the three special variables *, ** and *** (and / etc.!) in Common Lisp - R, IPython and other environments usually generalise this to a numbered history, which arguably we might want to add here as well - so it stands to reason that these special variables make sense for a shell as well.

$ cd foo
$ ls
...
$ * | wc
42
$ grep ** bar
...
$ vim *

(Disregarding the need for a different globbing character ...)
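
As a rough model of what the shell would have to keep around for this, here is a Python sketch of a three-deep result history. The class and method names are invented for the example; a numbered history would simply use a larger depth.

```python
from collections import deque


class ResultHistory:
    """Keeps the most recent results, newest first, like *, ** and ***."""

    def __init__(self, depth: int = 3):
        self._results = deque(maxlen=depth)

    def record(self, value):
        """Store a new result; the oldest one falls off the end."""
        self._results.appendleft(value)
        return value

    def star(self, count: int = 1):
        """star(1) is *, star(2) is **, star(3) is ***; None if not set yet."""
        if count < 1 or count > len(self._results):
            return None
        return self._results[count - 1]


history = ResultHistory()
history.record(["bar.txt", "foo.txt"])   # result of "ls"
history.record(len(history.star()))      # result of "* | wc"
print(history.star(1), history.star(2))  # prints 2 ['bar.txt', 'foo.txt']
```

Note that the values are the results themselves and not their printed form, so the second step reuses the listing instead of running it again.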

There's also the very special exit status special variable in shells that we need to replicate. This will likely be similar to Perl special variables that keep track of one particular thing instead of reusing the * triad of variables for this too.

Pipelines

The expression compiler should be able to convert from a serial to a concurrent form as necessary, that is, by converting to the required form at each pipeline step.

ls | wc -l | vim

Here, ls will be the built-in LIST-DIRECTORY, which is capable of outputting e.g. a stream of entries. wc -l might be compiled to simply LENGTH, which, since it operates on fixed sequences, will force ls to be fully evaluated (disregarding a potential optimisation of just counting the entries instead of producing all filenames). The pipe to vim will then force the wc output to text format, since vim is an unknown native binary.

These would be the default conversions. We might force a particular interpretation by adding explicit (type) conversions, or adding annotations to certain forms to explain that they accept a certain input/output format; for external binaries the annotations necessarily would be more extensive.

For actual pipelines it's extremely important that each step progresses as new data becomes available. Including proper backpressure this will ensure good performance and keep the buffer bloat to a minimum. Again this might be a tunable behaviour too.
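
A minimal Python sketch of these conversions, using generators as a stand-in for the streaming steps. Generators are pull-driven, so an upstream step only produces an entry when the downstream step asks for one, which is a simple form of backpressure. The function names mirror the text but are otherwise invented; a real implementation would also have to deal with concurrency and external processes.

```python
import os
import re


def list_directory(path="."):
    """Yield entry names one at a time instead of building the full list."""
    with os.scandir(path) as entries:
        for entry in entries:
            yield entry.name


def grep(pattern, lines):
    """Pass through matching lines as soon as they arrive."""
    regex = re.compile(pattern)
    return (line for line in lines if regex.search(line))


def length(items):
    """Counting needs the whole input, so this drains the upstream generator."""
    return sum(1 for _ in items)


def to_text(value):
    """Default conversion for a step that is an unknown native binary."""
    return f"{value}\n"


# Roughly "ls | grep py | wc -l", rendered as text for an external program.
print(to_text(length(grep(r"py", list_directory(".")))), end="")
```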

I/O

The shell should be convenient. As such any output and error output should be captured as a standard behaviour, dropping data if the buffers would get too big. This will then allow us to refer to previous output without having to reevaluate any expression / without having to run a program again (which in both cases might be expensive).
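
Capturing output with a bounded buffer might look roughly like this in Python. It is a sketch only: the function name and the line limit are made up, and standard error is merged into standard output to keep the reading loop simple, whereas the text above would keep the two apart.

```python
import subprocess
import sys
from collections import deque


def run_captured(argv, max_lines=1000):
    """Run a command and keep only its last max_lines lines of output."""
    kept = deque(maxlen=max_lines)  # older lines are dropped automatically
    with subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    ) as process:
        for line in process.stdout:
            kept.append(line.rstrip("\n"))
    # Leaving the with block waits for the process, so the status is set.
    return process.returncode, list(kept)


status, lines = run_captured(
    [sys.executable, "-c", "for i in range(5): print(i)"], max_lines=2
)
print(status, lines)  # prints 0 ['3', '4']
```

The returned pair can then be stored in the history, so that later expressions refer to the captured lines instead of running the program again.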

Each datatype and object should be able to be rendered and parsed in more than a single format too. E.g. a directory listing itself might be an object, but will typically be rendered to text/JSON/XML, or to a graphical representation even.

Keyboard protocol #2
posted on 2018-07-25 23:17:22

Now in fact there are still reasons we need key codes that are different to the eventual text representation, e.g. for cursor movement and other special characters like function keys.

Looking at the Xorg source code now, there's a relatively fixed notion of what a keyboard can actually do. I suspect that conceptually a somewhat backwards-compatible extension would be to have a new dedicated kind of device, that is exposed (similarly to a keyboard with an integrated touchpad) separately to the other functionality of the device.

In particular, I'd like to keep the keyboard in "regular" mode as long as the host hasn't signaled that it wants to use the extended functionality via, presumably some part of the USB negotiation. Only at that point would the extension be activated and the keyboard output would be sent via it. The regular keyboard device would then be virtually unplugged.

I suspect that this is better than having two devices, one for key codes and one for text input, especially because we'd not be able to guarantee in which order the two devices would be read. This is less / not a problem between a keyboard and a pointer device of course.

Now given that X11 isn't the interface most applications are written against, how would the text actually arrive at an application? I'd imagine basically extending the whole event handling by one more event, the text event, which wouldn't correspond to any key (thus, it can't be in a pressed or unpressed state). In terms of GTK+ and Qt this might be even easier than for a lower level application since many applications will only want to deal with text input and pre-defined keyboard shortcuts anyway.

Speaking of which, what does "Ctrl-C" actually mean? Of course the mnemonic for "copy" is in there, but also the "control" modifier. How well does this play with text input? Not at all, and I believe modifiers work better logically on the key code level; for text input I imagine other modifiers like "emphasis", or, more specifically, "bold", would make more sense, possibly "URL", or "footnote".

Overall there can of course be modifiers active while text input occurs, it's more a question of whether they (can) be assigned any meaning without falling back to the flawed character equals key press comparison.

What does this gain us? Ideally every application (or more accurately: each toolkit) could now drop logic specifically to translate key codes to text, since all of it would already be handled by the keyboard itself. Keypresses would come in via the same interface and be used for specific, non-text functionality.
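
From the application's point of view the split could look something like the following Python sketch. The event classes, the key code name "BACKSPACE" and the toy text field are all hypothetical; no existing toolkit exposes exactly this interface.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class KeyEvent:
    """A physical key changing state; modifiers belong on this level."""
    code: str
    pressed: bool


@dataclass(frozen=True)
class TextEvent:
    """Text produced by the keyboard; it has no pressed or released state."""
    text: str


Event = Union[KeyEvent, TextEvent]


def apply_events(events, buffer=""):
    """Feed events to a toy text field and return the resulting buffer."""
    for event in events:
        if isinstance(event, TextEvent):
            buffer += event.text  # no key code to text translation needed
        elif event.pressed and event.code == "BACKSPACE":
            buffer = buffer[:-1]
    return buffer


events = [
    TextEvent("あ"),
    TextEvent("b"),
    KeyEvent("BACKSPACE", True),
    KeyEvent("BACKSPACE", False),
    TextEvent("●"),
]
print(apply_events(events))  # prints あ●
```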

Keyboard protocol
posted on 2018-07-19 11:48:07

The keyboard protocol still uses roughly the same approach as at the start of computing: the keyboard is a dumb device, reporting each mechanical button that's pressed. The computer is the intelligent device, translating those into characters.

There are some attempts that have made it into various places, e.g. there's a flag in the USB HID protocol to indicate the physical layout, be it US, or some other one. Except no manufacturer sets it, so no driver uses it.

But, what if we had a keyboard, courtesy of QMK and similar firmwares, that is substantially more intelligent? If the protocol allowed for it we'd be able to have such nice things as sending an "a", an "あ", or a "●" without any remapping to be done! In fact if the keyboard could send Unicode sequences we can do things that aren't possible by remapping, like sending characters from various scripts through a macro key without impacting any keypress since we have an immensely increased value space to work with.

Command Line Completion
posted on 2018-07-17 21:50:04

One part of the lack of cohesion on *nix systems is the hack of command line completion via e.g. Bash or ZSH modules. I can think of roughly three ways in the *nix paradigm to have a more cohesive pattern here:

  1. Give full control to the command line program and invoke it with a special handler to parse and generate appropriate output for command line completion.

  2. Give partial control and generate a script like description of the completion, e.g. as a Bash- or ZSH-specific script.

  3. Which is pretty similar, generate a static description on request.

All of these use the program, the thing that's being invoked, as the central mechanism of how to do completion. No extra files are required to be put anywhere and the description can always be in sync with what the program actually does later on. The downside might very well be that the program is invoked multiple times.
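
The first variant might be sketched like this in Python. The --complete flag, the option table and the function names are invented for the example; the point is only that the completion candidates come from the same data the program uses for its own argument handling.

```python
# Hypothetical option table; the same data would drive the real argument parser.
OPTIONS = {
    "--verbose": [],
    "--format": ["json", "text", "xml"],
    "--output": [],
}


def complete(words):
    """Return candidates for the last word, given the words typed so far."""
    current = words[-1] if words else ""
    previous = words[-2] if len(words) > 1 else None
    if previous in OPTIONS and OPTIONS[previous]:
        candidates = OPTIONS[previous]  # completing the value of an option
    else:
        candidates = sorted(OPTIONS)  # completing an option name
    return [candidate for candidate in candidates if candidate.startswith(current)]


def main(argv):
    """A special first argument switches the program into completion mode."""
    if argv[:1] == ["--complete"]:
        print("\n".join(complete(argv[1:])))
        return 0
    print("running normally with", argv)
    return 0


main(["--complete", "--format", "j"])  # prints json
```

A shell would call the program this way on every tab press, which is exactly the repeated invocation mentioned above as the downside.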

Therefore perhaps also

  4. Invoke the program in a special mode and keep it alive while it handles command line completion. Once finished and the user wants to invoke it, it would simply use the already existing in-memory structures and continue.

Of course I can see the downsides here too, though in the interest of having a more interwoven system I'd argue that the benefit might be worth it overall, in particular for programs whose arguments are a little bit more like a DSL than just simple file arguments. Notably though any shell features would not be available in variant 4, instead the program would have to call back or replicate the required functionality, making it somewhat less appealing?

The Curious Case of the Uninterruptible Sleep
posted on 2018-07-09 16:16:37

Let me tell you about one particular annoying issue I came across when using a GC-ed environment. No, not Python, though I'm almost definitely sure it will have the same issue. No, it's of course Common Lisp, in this case CCL and SBCL in particular, both of which have a stop-the-world GC (as far as I know there are no knobs to change the behaviour outside of what I'll describe below).

Now, do you also remember these things called HDDs? Turns out that one of my external hard drives likes to shut itself down after a while. You can hear that very well. However, since the drive is still mounted, accessing (uncached) parts of the filesystem will trigger a wake event. But getting the platters up to speed again takes a lot of time, so in between what happens?

Exactly, uninterruptible sleep for the process in question. It's one of the few possibilities where that process state can happen and if it was without the specifics of the GC involved it would just block one thread while the rest, in particular the GUI thread, would keep moving.

In this particular instance though, the GUI would work for a moment and then freeze until the drive finally responded with the requested data.

Now why is that?

Turns out the GC asks each (runtime) thread to block for the GC run. This is done via signals. Except one thread won't respond to signals since it's ... sleeping. Uninterruptibly.

Great.

Suggested options include spawning a separate process to wait for I/O (which I'd rather not do, since it'd mean doing that for every single library call that might operate with e.g. files, which is basically a lot). It just seems there's no good way to deal with this except by changing the runtime and dealing with the fact that the GC might not be able to reach all threads.

I looked a bit at the CCL runtime and it seems if we promise not to do anything bad with regards to the GC, we might be able to work around it by setting a new flag on a per-thread level that excludes it from the GC waiting loop. We'd only do this around potentially problematic FFI calls, but also around known internal system calls. When returning from the foreign land we'd also need to "catch up" with the GC, potentially blocking until the current run is done (if there is one). But that's solvable, with a little bit more overhead for FFI calls. I believe that's worth it.

Since I'm all around not familiar with either the CCL or SBCL code bases I suspect that even the generalisation of doing this for all FFI would be tenable, in which case the pesky annotations would cease to be necessary, fixing the issue without having to manually adjust anything on the developer's side.

Lastly, if I knew the right terms I'm sure there are solutions for this in either GHC or Erlang, except that I wouldn't really know what to search for.

Btw. this is all super iffy to debug too, since e.g. gdb will just hang in case the target process is in this specific task state, being based on ptrace and seemingly inheriting it through there.
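
One way to at least see which thread is stuck, without attaching a debugger, is to read the per-thread state from the proc filesystem. The following Python sketch assumes a Linux /proc layout; the function names are made up.

```python
import os


def thread_states(pid: int) -> dict:
    """Map each thread id of a process to its one-letter state from /proc."""
    states = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            with open(f"/proc/{pid}/task/{tid}/status") as handle:
                for line in handle:
                    if line.startswith("State:"):
                        # e.g. "State:\tD (disk sleep)" -> "D"
                        states[int(tid)] = line.split()[1]
                        break
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were looking
    return states


def blocked_threads(pid: int) -> list:
    """Thread ids currently in uninterruptible sleep (state "D")."""
    return [tid for tid, state in thread_states(pid).items() if state == "D"]


print(thread_states(os.getpid()))
```

Polling blocked_threads for the Lisp process while the drive spins up should show the one thread sitting in state D while the others wait for it.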

New Job, New ...
posted on 2018-04-15 18:46:10

Seems a good idea to write down some notes about this. I'm switching jobs starting May. I was hesitant about this, mostly because I wanted to figure out the right direction, but the opportunity presented itself, so why not. I'll see how that works out and not travelling for a while is absolutely necessary for me to get ... back on track, I guess.

I hope to get back into a bit of Open Source work as well, with a bit more time that might be possible. Notably that will be related to cl-cffi-gtk, which I hope to get into better shape with some additional collaborators, so there's a good chance the library will (finally) be switched over on Quicklisp too.

At the same time there are things on the bucket list I hope to start this year. It's already a bit late in the year, but better now than not at all. That includes borrowing a bow and signing up for training.

Lastly I'm not super happy about the lack of reading during the last months, I'm currently still on some Dune-related books and I can't say I enjoy them a lot (big surprise given that people have been slamming them I believe). Music-wise there's fortunately quite a lot of variety going on.

An early Christmas present^Wkeyboard
posted on 2017-12-17 12:25:40

Picture of the half-built keyboard

Picture of the keyboard with soldering iron

Almost at the end of this year I finally had enough components to build a custom keyboard. The case is reused from my existing Poker II, but I'm on the look-out for either a wooden or all-metal case instead of black plastic. Additionally I'm missing one 1.25u and one 1.5u key - with the maximum set of keys you end up needing a few more of those than on even a regular sized keyboard (but I do have an ISO Enter and Shift key too many).

Okay, so from the start. The goal was to have a fully reflashable keyboard. Although Vortex (the makers of the Poker keyboards) have provided some updated firmware, they don't want to share the firmware sources and the one GitHub project I found to reverse engineer it didn't go anywhere in the last few years unfortunately.

Looking through the number of various projects it seemed that a GeekHack GH60 or a similar model would suffice for me. The layout isn't much different from my previous keyboard and it would fit a lot of keys in a good form factor.

I actually went with the (Chinese?) "Satan" variant - comes set up for more LEDs, a feature which I'm not yet using (and might not in the long run at all actually). With that PCB, a black plate for mounting the switches, a stabiliser for the Space key, green Cherry switches and keys I now have something matching my taste quite well. The keys I'd actually like to replace with a nicer option later on, but they're alright for the moment.

All in all thumbs up, would build it again.

Build

Testing the PCB with a bit of wire is a good idea, so check every key ... possibly also the LEDs, but I didn't have any handy for it. As far as I understand modding the switches with SIP sockets can be done even after soldering in the switches, so I didn't do it yet.

Once that's confirmed the rest is also easy: Put the first switches into the plate. Stabiliser goes onto the PCB, then the plate on top. Check the contacts are visible on the other side of the PCB and nothing was bent. Solder the switches to give a bit of stability. Then fill up the rest of the keyboard. I checked the layout with the keys in the most critical areas too because I wasn't sure it would all align well.

Check the solder joints, movement of the switches (my space key was a bit weird, sometimes getting stuck a bit), plug it in and confirm with debugging mode or some online keyboard tester.

Firmware

The default firmware is fine for testing, but of course the objective is to customise the hell out of it.

The best resource for me was this gist that is a bit cumbersome to set up with the different repositories, but works pretty well.

The wiring isn't completely in sync with the regular GH60, therefore a separate revision needs to be selected in the TMK sources. Also, it seems like at least one key is completely missing even with that and the default layouts are all ANSI! To use the remaining three keys the regular KEYMAP macro needs to be used. Note that also a few keys are on a different position in the macro, therefore I changed it slightly such that it more closely resembles the physical layout.

For debugging either cat from the /dev/hidraw devices (might take a moment to find the right one), or perhaps compile the HID Listen program. You really don't want to disable this until you've sorted out all keymap issues. By default the magic keys are left shift + right shift - h will print the short help, while x, the most interesting one for me, will print the keyboard matrix on every raw keypress.

Unless otherwise credited all material Creative Commons License by Olof-Joachim Frahm