Recent Content

Lisp shell semantics
posted on 2018-08-08 01:36:29

Continuing from an earlier post, what might the semantics be that we'd like to have for a more useful shell integrated with a Lisp environment?

Syntax

Of course we can always keep the regular Common Lisp reader; however it's not best suited for interactive shell use. In fact I'd say interactive use, period, requires heavy use of editing commands to juggle parens, which is why some people use one of the simplifications of not requiring the outermost layer of parens on the REPL.

So, one direction would be to have better S-expression manipulation on the REPL, the other to have a syntax that's more incremental than S-expressions.

E.g. imagine the scenario that,

  1. as a terminal user,
  2. I'm navigating directories,
  3. listing files,
  4. then looking for how many files there were,
  5. then grepping for a particular pattern,
  6. then opening one of the matches.

In sh terms that's something along the lines of

$ cd foo
$ ls
...
$ ls | wc -l
42
$ ls | grep bar
...
$ vim ...

In reality I'm fairly sure no one goes as far as doing the last two steps in an iterative fashion, like via ls | grep x | grep y | xargs vim, as most people will have a mouse readily available to select the desired name. There are some terminal widgets which allow the user to select e.g. one of the input lines in a command line dialog, but again it's not a widespread pattern and still requires manual intervention in such a case.

Also note that instead of reusing the previous computation, the whole expression keeps being repeated, which also makes this somewhat inefficient. The notion of the "current" thing being worked on ("this", "self") also isn't expressed directly here.

In the new shell I'd like to see part of this explored. We already have the three special variables *, ** and *** (and / etc.!) in Common Lisp - R, IPython and other environments usually generalise this to a numbered history, which arguably we might want to add here as well - so it stands to reason that these special variables make sense for a shell too.
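
For reference, this is the standard REPL behaviour, where * holds the value of the previous form, ** the one before that, and so on:

CL-USER> (list 1 2 3)
(1 2 3)
CL-USER> (length *)
3
CL-USER> (append ** (list 4))
(1 2 3 4)

Carried over to the shell, the session from above might then become: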

$ cd foo
$ ls
...
$ * | wc -l
42
$ grep ** bar
...
$ vim *

(Disregarding the need for a different globbing character ...)

There's also the exit status variable, which is special in shells and which we need to replicate. This will likely work like the Perl special variables that each track one particular thing, instead of reusing the * triad of variables for this as well.
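
A minimal sketch of that; the variable name here is made up:

(defvar *last-exit-status* 0
  "Exit status of the most recent external command, akin to sh's $? or
Perl's $?; updated by the shell runtime instead of cycling through
*/**/***.")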

Pipelines

The expression compiler should be able to convert from a serial to a concurrent form as necessary, that is, by converting to the required form at each pipeline step.

ls | wc -l | vim

Here, ls will be the built-in LIST-DIRECTORY, which is capable of outputting e.g. a stream of entries. wc -l might be compiled to simply LENGTH, which, since it operates on fixed sequences, will force ls to be fully evaluated (disregarding a potential optimisation of just counting the entries instead of producing all filenames). The pipe to vim will then force the wc output to text format, since vim is an unknown native binary.
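
A sketch of what the compiler might emit for ls | wc -l, with the standard DIRECTORY standing in for the hypothetical LIST-DIRECTORY built-in:

;; `wc -l` on a fixed sequence reduces to LENGTH, which in turn forces
;; the directory listing to be fully evaluated:
(length (directory "*.*"))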

These would be the default conversions. We might force a particular interpretation by adding explicit (type) conversions, or adding annotations to certain forms to explain that they accept a certain input/output format; for external binaries the annotations necessarily would be more extensive.
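
A sketch of what such annotations could look like; DECLARE-COMMAND and the format keywords are invented for illustration:

(defvar *command-annotations* (make-hash-table :test #'equal))

(defun declare-command (name &key input output)
  ;; record which input/output formats an external binary understands,
  ;; so the pipeline compiler knows where to insert conversions
  (setf (gethash name *command-annotations*)
        (list :input input :output output)))

(declare-command "vim" :input '(:text) :output '(:text))
(declare-command "jq"  :input '(:text) :output '(:text :json))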

For actual pipelines it's extremely important that each step progresses as new data becomes available. Together with proper backpressure this will ensure good performance and keep buffer bloat to a minimum. Again, this might be tunable behaviour too.

I/O

The shell should be convenient. As such, any output and error output should be captured as standard behaviour, dropping data if the buffers would get too big. This will then allow us to refer to previous output without having to re-evaluate an expression or run a program again (either of which might be expensive).

Each datatype and object should be able to be rendered and parsed in more than a single format too. E.g. a directory listing itself might be an object, but will typically be rendered to text/JSON/XML, or even to a graphical representation.
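
A sketch of how that might look; DIRECTORY-LISTING and its methods are made up for illustration:

(defclass directory-listing ()
  ((entries :initarg :entries :reader entries)))

(defgeneric render (object format)
  (:documentation "Render OBJECT in the given FORMAT."))

(defmethod render ((listing directory-listing) (format (eql :text)))
  (format NIL "~{~A~%~}" (entries listing)))

(defmethod render ((listing directory-listing) (format (eql :json)))
  (cl-json:encode-json-to-string (entries listing)))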

Keyboard protocol #2
posted on 2018-07-25 23:17:22

Now in fact there are still reasons we need key codes that are different from the eventual text representation, e.g. for cursor movement and other special keys like function keys.

Looking at the Xorg source code now, there's a relatively fixed notion of what a keyboard can actually do. I suspect that conceptually a somewhat backwards-compatible extension would be to have a new dedicated kind of device that is exposed separately from the other functionality of the device (similarly to a keyboard with an integrated touchpad).

In particular, I'd like to keep the keyboard in "regular" mode as long as the host hasn't signaled that it wants to use the extended functionality via, presumably, some part of the USB negotiation. Only at that point would the extension be activated and the keyboard output be sent via it. The regular keyboard device would then be virtually unplugged.

I suspect that this is better than having two devices, one for key codes and one for text input, especially because we'd not be able to guarantee in which order the two devices would be read. This is less of a problem (or not one at all) between a keyboard and a pointer device, of course.

Now given that X11 isn't the interface most applications are written against, how would the text actually arrive at an application? I'd imagine basically extending the whole event handling by one more event, the text event, which wouldn't correspond to any key (thus it can't be in a pressed or unpressed state). In terms of GTK+ and Qt this might be even easier than for a lower-level application, since many applications will only want to deal with text input and pre-defined keyboard shortcuts anyway.
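
To illustrate, a sketch of such an event stream; the structures and the handlers HANDLE-SHORTCUT/INSERT-TEXT are hypothetical, not any toolkit's actual API:

(defstruct key-event code pressed-p modifiers) ; a physical key, as before
(defstruct text-event string)                  ; finished text, no key state

(defun dispatch (event)
  (etypecase event
    (key-event (handle-shortcut (key-event-code event)
                                (key-event-modifiers event)))
    (text-event (insert-text (text-event-string event)))))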

Speaking of which, what does "Ctrl-C" actually mean? Of course the mnemonic for "copy" is in there, but also the "control" modifier. How well does this play with text input? Not at all, and I believe modifiers work better logically on the key code level; for text input I imagine other modifiers like "emphasis", or, more specifically, "bold", would make more sense, possibly "URL", or "footnote".

Overall there can of course be modifiers active while text input occurs; it's more a question of whether they can be assigned any meaning without falling back to the flawed character-equals-key-press comparison.

What does this gain us? Ideally every application (or more accurately: each toolkit) could now drop logic specifically to translate key codes to text, since all of it would already be handled by the keyboard itself. Keypresses would come in via the same interface and be used for specific, non-text functionality.

Keyboard protocol
posted on 2018-07-19 11:48:07

The keyboard protocol is still using the same approach as roughly since the start of computing: The keyboard is a dumb device, reporting each mechanical button that's pressed. The computer is the intelligent device, translating those into characters.

There are some attempts that have made it into various places, e.g. there's a flag in the USB HID protocol to indicate the physical layout, be it US, or some other one. Except no manufacturer sets it, so no driver uses it.

But, what if we had a keyboard, courtesy of QMK and similar firmwares, that is substantially more intelligent? If the protocol allowed for it we'd be able to have such nice things as sending an "a", an "あ", or a "●" without any remapping to be done! In fact if the keyboard could send Unicode sequences we could do things that aren't possible by remapping, like sending characters from various scripts through a macro key without impacting any existing keypress, since we have an immensely increased value space to work with.

Command Line Completion
posted on 2018-07-17 21:50:04

One part of the lack of cohesion on *nix systems is the hack of command line completion via e.g. Bash or ZSH modules. I can think of roughly three ways in the *nix paradigm to have a more cohesive pattern here:

  1. Give full control to the command line program and invoke it with a special handler to parse and generate appropriate output for command line completion.

  2. Give partial control and generate a script like description of the completion, e.g. as a Bash- or ZSH-specific script.

  3. Or, which is pretty similar, have the program generate a static description on request.

All of these use the program, the thing that's being invoked, as the central mechanism of how to do completion. No extra files are required to be put anywhere and the description can always be in sync with what the program actually does later on. The downside might very well be that the program is invoked multiple times.
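
A sketch of variant 1; the --complete flag and the calling convention are made up for illustration. The shell would invoke the program with the word being completed and read candidates, one per line, from standard output:

(defun main (args)
  (cond
    ;; completion mode: `prog --complete <word>` prints candidates
    ((equal (first args) "--complete")
     (let ((word (or (second args) "")))
       (dolist (candidate '("--help" "--verbose" "--output"))
         (when (eql 0 (search word candidate))
           (format T "~A~%" candidate)))))
    ;; normal mode
    (T (format T "running with ~S~%" args))))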

Therefore perhaps also

  4. Invoke the program in a special mode and keep it alive while it handles command line completion. Once finished and the user wants to invoke it, it would simply use the already existing in-memory structures and continue.

Of course I can see the downsides here too, though in the interest of having a more interwoven system I'd argue that the benefit might be worth it overall, in particular for programs whose arguments are a little more like a DSL than just simple file arguments. Notably though, any shell features would not be available in variant 4; instead the program would have to call back into the shell or replicate the required functionality, making it somewhat less appealing?

The Curious Case of the Uninterruptible Sleep
posted on 2018-07-09 16:16:37

Let me tell you about one particularly annoying issue I came across when using a GC-ed environment. No, not Python, though I'm fairly sure it would have the same issue. No, it's of course Common Lisp, in this case CCL and SBCL in particular, both of which have a stop-the-world GC (as far as I know there are no knobs to change the behaviour outside of what I'll describe below).

Now, do you also remember these things called HDDs? Turns out that one of my external hard drives likes to shut itself down after a while. You can hear that very well. However, since the drive is still mounted, accessing (uncached) parts of the filesystem will trigger a wake event. But getting the platters up to speed again takes a lot of time, so in between what happens?

Exactly: uninterruptible sleep for the process in question. It's one of the few ways that process state can come about, and if it weren't for the specifics of the GC involved, it would just block one thread while the rest, in particular the GUI thread, would keep moving.

In this particular instance though, the GUI would work for a moment and then freeze until the drive finally responded with the requested data.

Now why is that?

Turns out the GC asks each (runtime) thread to block for the GC run. This is done via signals. Except one thread won't respond to signals since it's ... sleeping. Uninterruptibly.

Great.

Suggested options include spawning a separate process and waiting for the I/O there (which I'd rather not do, since it'd mean doing that for every single library call that might operate on e.g. files, which is basically a lot of them). It just seems there's no good way to deal with this except by changing the runtime and dealing with the fact that the GC might not be able to reach all threads.

I looked a bit at the CCL runtime and it seems that if we promise not to do anything bad with regards to the GC, we might be able to work around it by setting a new flag on a per-thread level that excludes the thread from the GC waiting loop. We'd only do this around potentially problematic FFI calls, but also around known internal system calls. When returning from the foreign land we'd also need to "catch up" with the GC, potentially blocking until the current run is done (if there is one). But that's solvable, with a little bit more overhead for FFI calls; I believe that's worth it.
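
A minimal sketch of that idea; EXCLUDE-CURRENT-THREAD-FROM-GC, INCLUDE-CURRENT-THREAD-IN-GC and CATCH-UP-WITH-GC are hypothetical runtime primitives, not existing CCL/SBCL APIs:

(defmacro with-gc-excluded (&body body)
  `(progn
     (exclude-current-thread-from-gc)   ; set the per-thread flag
     (unwind-protect
          (progn ,@body)                ; e.g. a blocking FFI or system call
       ;; back from foreign land: rejoin the GC and catch up, blocking
       ;; until a run currently in progress (if any) is finished
       (include-current-thread-in-gc)
       (catch-up-with-gc))))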

Since I'm not all that familiar with either the CCL or SBCL code bases, I merely suspect that even the generalisation of doing this for all FFI calls would be tenable, in which case the pesky annotations would cease to be necessary, fixing the issue without having to manually adjust anything on the developer's side.

Lastly, I'm sure there are solutions for this in GHC or Erlang, except that I don't know the right terms to search for.

Btw. this is all super iffy to debug too, since e.g. gdb will just hang in case the target process is in this specific task state, being based on ptrace and seemingly inheriting the problem through there.

New Job, New ...
posted on 2018-04-15 18:46:10

Seems a good idea to write down some notes about this. I'm switching jobs starting May. I was hesitant about this, mostly because I wanted to figure out the right direction, but the opportunity presented itself, so why not. I'll see how that works out and not travelling for a while is absolutely necessary for me to get ... back on track, I guess.

I hope to get back into a bit of Open Source work as well, with a bit more time that might be possible. Notably that will be related to cl-cffi-gtk, which I hope to get into better shape with some additional collaborators, so there's a good chance the library will (finally) be switched over on Quicklisp too.

At the same time there are things on the bucket list I hope to start this year. It's already a bit late in the year, but better now than not at all. That includes renting a bow and signing up for training.

Lastly I'm not super happy about the lack of reading during the last months; I'm currently still on some Dune-related books and I can't say I enjoy them a lot (no big surprise, given that people have been slamming them, I believe). Music-wise there's fortunately quite a lot of variety going on.

An early Christmas present^Wkeyboard
posted on 2017-12-17 12:25:40

Picture of the half-built keyboard

Picture of the keyboard with soldering iron

Almost at the end of this year I finally had enough components to build a custom keyboard. The case is reused from my existing Poker II, but I'm on the look-out for either a wooden or all-metal case instead of black plastic. Additionally I'm missing one 1.25u and one 1.5u key - with the maximum set of keys you end up needing a few more of those than on even a regular sized keyboard (but I do have an ISO Enter and Shift key too many).

Okay, so from the start. The goal was to have a fully reflashable keyboard. Although Vortex (the makers of the Poker keyboards) have provided some updated firmware, they don't want to share the firmware sources, and the one GitHub project I found that tried to reverse engineer it unfortunately hasn't gone anywhere in the last few years.

Looking through a number of projects it seemed that a GeekHack GH60 or a similar model would suffice for me. The layout isn't much different from my previous keyboard and it would fit a lot of keys in a good form factor.

I actually went with the (Chinese?) "Satan" variant - comes set up for more LEDs, a feature which I'm not yet using (and might not in the long run at all actually). With that PCB, a black plate for mounting the switches, a stabiliser for the Space key, green Cherry switches and keys I now have something matching my taste quite well. The keys I'd actually like to replace with a nicer option later on, but they're alright for the moment.

All in all thumbs up, would build it again.

Build

Testing the PCB with a bit of wire is a good idea, so check every key ... possibly also the LEDs, but I didn't have any handy for it. As far as I understand modding the switches with SIP sockets can be done even after soldering in the switches, so I didn't do it yet.

Once that's confirmed the rest is also easy: put the first switches into the plate. The stabiliser goes onto the PCB, then the plate on top. Check that the contacts are visible on the other side of the PCB and nothing was bent. Solder those switches to give a bit of stability, then fill up the rest of the keyboard. I checked the layout with the keys in the most critical areas too, because I wasn't sure it would all align well.

Check the solder joints, movement of the switches (my space key was a bit weird, sometimes getting stuck a bit), plug it in and confirm with debugging mode or some online keyboard tester.

Firmware

The default firmware is fine for testing, but of course the objective is to customise the hell out of it.

The best resource for me was this gist, which is a bit cumbersome to set up with the different repositories, but works pretty well.

The wiring isn't completely in sync with the regular GH60, therefore a separate revision needs to be selected in the TMK sources. Also, it seems like at least one key is completely missing even with that, and the default layouts are all ANSI! To use the remaining three keys the regular KEYMAP macro needs to be used. Note also that a few keys are in a different position in the macro, therefore I changed it slightly such that it more closely resembles the physical layout.

For debugging, either cat from the /dev/hidraw devices (it might take a moment to find the right one), or perhaps compile the HID Listen program. You really don't want to disable this until you've sorted out all keymap issues. By default the magic keys are left shift + right shift: h will print the short help, while x, the most interesting one for me, will print the keyboard matrix on every raw keypress.

Oneliners #1
posted on 2017-07-04 22:30:11

So, let's replicate this blog post in CL. In case it's gone, we want to fetch all pictures from Reddit's /r/pics:

curl -s -H "User-Agent: cli:bash:v0.0.0 (by /u/codesharer)" \
  https://www.reddit.com/r/pics/.json \
  | jq '.data.children[].data.url' \
  | xargs -P 0 -n 1 -I {} bash -c 'curl -s -O {}'

First, we need an HTTP client (drakma being canonical) and a JSON parser (cl-json will suffice here; you might want to use a different one depending on the parsing/serialisation needs).

(asdf:load-systems '#:drakma '#:cl-json)

Next, we define our workflow: fetch JSON content from a URL, get the list of URLs and download each one of them.

We need to download the JSON first, so let's start with that:

(defun scrape-sub (name)
  (drakma:http-request
   (format NIL "https://www.reddit.com/r/~A/.json" name)
   :user-agent "repl:abcl:v0.0.0 (by /u/nyir)"))

If we run this, we'll see that the output is just a byte vector:

#(123 34 107 ...)

That's actually fine for the JSON library, but to get something more readable we could either set a flag for Drakma, or convert it manually:

(babel:octets-to-string *)
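
The flag in question is Drakma's *TEXT-CONTENT-TYPES*; adding the JSON MIME type to it makes HTTP-REQUEST return a string directly:

;; treat application/json responses as text instead of octets
(push '("application" . "json") drakma:*text-content-types*)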

The second step is parsing the JSON format, so the extended version will look like the following:

(defun scrape-sub (name)
  (cl-json:decode-json-from-source
   (babel:octets-to-string
    (drakma:http-request
     (format NIL "https://www.reddit.com/r/~A/.json" name)
     :user-agent "repl:abcl:v0.0.0 (by /u/nyir)"))))

The output now looks a bit different:

((:KIND . "Listing") (:DATA (:MODHASH . "") ...))

But it's already getting more manageable. Next we want the URL bit of the data. Unfortunately I don't know of a good library that would allow us to specify something like an XPath-style selector, so we'll go ahead and do it manually. The .data.children bit will be something like (cdr (assoc :children (cdr (assoc :data <json>)))), since cl-json returns an association list; children[] means we'll iterate over all children and collect the results; data.url again is the same kind of accessor, i.e. (cdr (assoc :url (cdr (assoc :data <json>)))):

(defun scrape-sub (name)
  (let ((json (cl-json:decode-json-from-source
               (babel:octets-to-string
                (drakma:http-request
                 (format NIL "https://www.reddit.com/r/~A/.json" name)
                 :user-agent "repl:abcl:v0.0.0 (by /u/nyir)")))))
    (mapcar (lambda (child)
              (cdr (assoc :url (cdr (assoc :data child)))))
            (cdr (assoc :children (cdr (assoc :data json)))))))

Now the output is just a list of strings:

("https://www.reddit.com/r/pics/comments/6ewxd6/may_2017_transparency_report/" "https://i.redd.it/luxhqoj95q5z.png" ...)

Here's one addition I'll put in: filtering for image file types. That might still be unreliable of course, but it'll already remove a whole bunch of potentially wrong links. For filtering, MAPCAR isn't suitable; we could either do it in multiple stages, or use something like MAPCAN, or an explicit iteration construct like LOOP/ITERATE. I'll go with MAPCAN here, meaning every element to collect needs to be wrapped in a list:

(defun scrape-sub (name)
  (let ((json (cl-json:decode-json-from-source
               (babel:octets-to-string
                (drakma:http-request
                 (format NIL "https://www.reddit.com/r/~A/.json" name)
                 :user-agent "repl:abcl:v0.0.0 (by /u/nyir)")))))
    (mapcan (lambda (child)
              (let ((url (cdr (assoc :url (cdr (assoc :data child))))))
                (and url
                     (member (pathname-type (pathname (puri:uri-path (puri:parse-uri url))))
                             '("jpg" "png")
                             :test #'string-equal)
                     (list url))))
            (cdr (assoc :children (cdr (assoc :data json)))))))

I'm happy with that and it now filters for two image types.

Last point: actually downloading all scraped results. For this, we just iterate and download them as before:

(defun scrape-sub (name)
  (let* ((agent "repl:abcl:v0.0.0 (by /u/nyir)")
         (json (cl-json:decode-json-from-source
                (babel:octets-to-string
                 (drakma:http-request
                  (format NIL "https://www.reddit.com/r/~A/.json" name)
                  :user-agent agent))))
         (downloads
           (mapcan (lambda (child)
                     (let ((url (cdr (assoc :url (cdr (assoc :data child))))))
                       (when url
                         (let ((pathname (pathname (puri:uri-path (puri:parse-uri url)))))
                           (when (member (pathname-type pathname)
                                         '("jpg" "png")
                                         :test #'string-equal)
                             `((,url ,pathname)))))))
                   (cdr (assoc :children (cdr (assoc :data json)))))))
    (mapc (lambda (download)
            (destructuring-bind (url pathname) download
              (with-open-file (stream (merge-pathnames *default-pathname-defaults* pathname)
                                      :direction :output
                                      :element-type '(unsigned-byte 8))
                (write-sequence
                 (drakma:http-request url :user-agent agent)
                 stream))))
          downloads)))

And this works.

Now. This is decidedly not a demonstration of what the final result should look like. In fact there are a whole lot of things to improve and to consider when you'd put this into a reusable script.

From a maintainability perspective, we'd put each functional part into its own component, be it a function or method, in order to make them easier to reason about and to test each bit individually.

From a performance perspective ... oh my, there's so much wrong with it, mostly slurping everything into memory multiple times, while drakma supports both streams as results and HTTP Keep-Alive; either would improve things. The JSON parser could in theory also operate on tokens, but that's rarely worth the hassle (the CXML API can be used for that, basically by converting JSON "events" into a stream of SAX events). Lastly, creating the output lists isn't necessary; this could all be done inline or in continuation-passing style, but that's worse for maintaining a nice split between functions.
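
For instance, with Drakma's WANT-STREAM the body never needs to be slurped into a vector first. A sketch; the explicit external format is set because the response isn't flagged as text:

(defun scrape-sub-json (name)
  (let ((stream (drakma:http-request
                 (format NIL "https://www.reddit.com/r/~A/.json" name)
                 :user-agent "repl:abcl:v0.0.0 (by /u/nyir)"
                 :want-stream T)))
    (setf (flexi-streams:flexi-stream-external-format stream) :utf-8)
    (unwind-protect
         (cl-json:decode-json stream)   ; parse straight off the socket
      (close stream))))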

From a correctness perspective, all the URLs might have weird characters in them that don't work well with pathnames and/or the local filesystem. In fact PURI might not be the best choice here either. Also, even if the URLs are different, more than one of them might have the same filename, meaning there should either be some error handling in there, or the URLs should be hashed and used as filenames, or some other scheme accomplishing the same thing. Lastly, the downloaded files should be checked for emptiness, wrong content (HTML bits would indicate a failed download), broken images, etc.
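
One way around the filename collisions would be to derive the local name from a digest of the URL, e.g. with the Ironclad library:

(defun url-pathname (url type)
  ;; deterministic, collision-resistant filename from the URL itself
  (make-pathname
   :name (ironclad:byte-array-to-hex-string
          (ironclad:digest-sequence :sha256 (babel:string-to-octets url)))
   :type type))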

Another nice thing to add would be xattr support for indicating where the file was downloaded from.

Unified Communication
posted on 2017-06-18 16:04:25

Lately I've been thinking again about some way to unify (my) usage of different communication channels. Part of that is the additional distraction and lack of ease of use for some of the applications I'm forced to use.

This is partially a feature of my habits, i.e. I'm not a mobile phone user. At all. But "apps" like WhatsApp and WeChat, while thankfully having web clients, still force me to use a comparatively clunky interface.

Then we have Slack, the IRC clone that will eat every other IRC clone there is. Or maybe not. In my case it's the primary business-related communication form, after email.

Good thing I'm not using Facebook, the moloch, but I imagine for a lot of people it's the primary way to use the web and the internet. I haven't researched Facebook integration at all though, so there might be ways of integrating it more or less easily.

The previously mentioned channels were all active ones, but there's also a lot of passive consumption going on. (RSS and blogs, forums,) reddit and Hacker News are all channels that I frequently use. In case of reddit and Hacker News there is of course the active element of posting, commenting and voting, but I rarely do that, so they fall under passive for me too.

So again, why unification? For all of the above, getting notified if new content is available is comparatively a pain. For some of them (chat) the threshold is quite low, so reacting in near real-time is important, while for others it's absolutely the other way round, even though I'm still habitually checking them in case I'm bored (see how that goes?).

Unifying them would allow to aggregate, combine, filter in a general fashion instead of having (or not having in most cases) a distinct way to do that for each channel.

So again, would that solve the problem? I'm doubtful. On the one hand, there's clearly a need to remove friction. On the other hand, the cost of implementing it and the lack of distinctive features for each channel (visual mostly) would also undermine some of the information. Possibly only at the start, it's hard to tell. I can however say that using RSS readers never worked out for me, precisely because the visual appearance is such a strong discriminator to (not) consume content. Though rtv, a console-based reddit client, worked rather well for some highly text-based content.

What other considerations are there? Well, the split between different contexts would be one thing. There's at least the work/life split for me, if not for many others.

Fidelity, as in, most text-based content can be viewed on the console, even if it might look better in a different renderer (browser). Showing pictures/clips is difficult on the console, but there are ugly hacks that can work around that problem if absolutely necessary (I'm personally not a fan).

Amount, as in, blogs are rather infrequent but have lots of text per post, while chat is comparatively very frequent but only has a few words per "post".

Context, again: there are also different groups in the personal context, that is, e.g. family, friends, different hobbies and interests, with each group having a somewhat overlapping set of sources.

So again, what can be solved? Technically, at least getting more sources into a single format is achievable. There are bridges from Slack to IRC, from RSS to IRC, etc. I'm choosing IRC here because it's a form of lowest common denominator, but similarly it could be mapped to email too. While IRC isn't good for long-form content, it can contain links which can then be viewed in other renderers, solving the notification issue. (Well, you still need to pay attention to the IRC client. In my case I'm always online on a VPS, so I still need to pass notifications through from the IRC client to the current local machine.)

What options would a unified architecture give us? E.g. having a single feed for chat, email, blog posts etc. for a group of people (channels). This can again be achieved manually, by tying in bots to post on behalf of a user, though in the architecture of IRC it wouldn't make sense to post some of these things publicly - it's "your" view of the conversation, not the public view. That is, you'd want to view a feed with incoming emails, blog posts (Twitter, what have you) from a person inline.

Now, inertia. Given how XMPP basically failed and how each platform provider is aggressively trying to get people into their walled garden, what chance is there for a standard here?

Apart from that, can this idea be implemented purely client-side? AFAIK yes, there's still friction with the different technologies being integrated, but as a central communication hub this would still make sense.

Building on top of that I have some further (obvious) extensions in mind: the usual spam filters, deduplication, aggregation/search, and basically everything statistics-related, all of which can be applied on top.

Different interfaces would be available to provide a view on the streams, or groups of streams. Traditionally this all hasn't worked out, I feel; with the exception of very, very narrow things like email and text-based chat, there's just a lot of variation going on.

What would this look like? For me, one console window to show everything, with desktop notifications on top. For others, the same in a browser perhaps, or (take a deep breath) a native application instead.

In any case, food for thought. I'm hoping to follow up on this with a more focused design at some point.

Hacking Java fields in ABCL
posted on 2017-06-06 23:41:05

Just as a quick note: neither JSS nor the JAVA package will allow you to treat LispObject objects as JavaObjects for the purposes of JFIELD and JSS:GET-JAVA-FIELD. But if you still want to access internal fields and implementation details (the usual warnings apply!), try the following:

(jss:get-java-field (jss:new 'JavaObject (car (threads:mapcar-threads #'identity))) "javaThread" T)

This came up because I was looking for an answer to a question on #abcl; wrapping the "Lisp" object in a JavaObject manually helps achieve the requested goal of retrieving the Java Thread object.

The T at the end is necessary here because the field is actually package-private - but that's to be expected if you want to access internals without an (official) API.



Unless otherwise credited all material Creative Commons License by Olof-Joachim Frahm