Recent Content

How to use Stylus to stay sane
posted on 2021-03-22 21:15:07+01:00

Stylus (a fork of the older Stylish extension) is a browser extension that applies custom CSS stylesheets to any website you might want to alter. This is a fantastic capability, which allows you to remove unnecessary clutter and otherwise gives control back to you, the user.

Let's walk through this using the Twitter website as an example.

As of 2021 you'll normally see a "Trends" and a "Who to follow?" block on the right side of the desktop version of the page, assuming enough screen space is available. You might want to hide these and thereby focus more on the actual content of the accounts you're following:

Picture of inspector showing the Twitter homepage

Now how do you achieve this goal? For one, you might download a pre-made stylesheet, or install a browser extension, or use a custom script (say, via Greasemonkey).

However, Stylus makes it very low effort to filter out unwanted content yourself. (Later we will see some other tools, like uBlock Origin, which are less powerful in some ways, but can accomplish a subset of what we're doing here with perhaps greater speed.)

Unfortunately, Twitter doesn't like you enough to use unminified and unobfuscated CSS rules, so what you end up with in the inspector is something like this:

Picture of inspector showing the CSS class names of the Twitter homepage

Not to worry though, using the "Copy > CSS Selector" option allows us to get a matching selector that we can then paste into the User Styles editor:

Picture of inspector showing "Copy > CSS Selector" option

While we're at it, let's do the same for the other two elements, but let's leave the search option, just in case:

Picture of inspector showing "Copy > CSS Selector" option

.r-1uhd6vh:nth-child(3), .r-1uhd6vh:nth-child(4), .r-1niwhzg:nth-child(5) {
    display: none;
}

Once saved, the homepage greets us with plenty of empty and unobtrusive space. Much better.

Capturing JVM TLS traffic for SBT
posted on 2020-04-22 21:20:20+02:00

Today I've had to dig deeper into some problem authenticating against an HTTPS API. This client was sending Basic Authentication information following a 3XX redirect, which then would make the second server (well, S3 really) return a 400 Bad Request, since it's refusing to deal with more than one authentication method at the same time.

This is all well and good, but debugging what was actually being sent is a little bit more difficult if curl is not the method of choice.

Instead I found the -Djavax.net.debug=all option for the JVM. This will make it dump a lot of information throughout a connection. Mostly that's already enough to debug the issue, since a hexdump of the HTTP traffic is included. On the other hand it's also pretty verbose.

Another option is the slightly more involved jSSLKeyLog, which requires the use of a JVM parameter to include the Java agent, e.g. for SBT like so:

env JAVA_OPTS="-javaagent:jSSLKeyLog.jar==/jsslkeylog.log" sbt

Two more notes here: Compiling the tool is really easy; once cloned, mvn package results in a ready-to-use JAR file. Also, the log contains more information when two equal signs are used (handy for manual inspection).

This file can then be directly fed into Wireshark ("Edit", "Preferences", "Protocols", "TLS", "(Pre)-Master-Secret log filename") and will then allow the decoding of network traffic captured separately (e.g. via tcpdump -i any -s 0 -w dump.pcap).

Docker and Redis from scratch
posted on 2020-04-19 14:50:06+02:00

Docker is ubiquitous in many projects and therefore it may be useful to dig into more detail about its inner workings. Arguably those aren't too complicated: a smallish program that does the essentials can be built in a few hours.

The CodeCrafters challenges focus on exactly this kind of idea, taking an existing tool and rebuilding it from scratch. Since they're currently in Early Access, I've only had the opportunity to try out the Docker and Redis challenges so far, but I thought maybe a few insights from them would be good to share.

Part of the challenge is to run the entrypoint of a container; using Go it's actually fairly easy to run external programs. Using the os/exec package is straightforward, even redirecting I/O is easy enough by looking at the Cmd structure a bit closer and assigning values to the Stdin, Stdout and Stderr fields. Also the exit status can be easily gotten from the error return value by checking for ExitError (only if it was not successful, that is, non-zero):

if err = cmd.Run(); err != nil {
    if exitError, ok := err.(*exec.ExitError); ok {
        ...
    }
}
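
To make the fragment above concrete, here is a minimal, self-contained sketch. The helper name runAndGetExitCode and the sh command in main are assumptions for illustration only; errors.As is used in place of the direct type assertion so that wrapped errors are handled as well.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
)

// runAndGetExitCode (hypothetical helper) runs a command with the parent's
// standard streams attached and returns its exit code. The returned error is
// non-nil only if the command could not be started or waited for at all.
func runAndGetExitCode(name string, args ...string) (int, error) {
	cmd := exec.Command(name, args...)
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	err := cmd.Run()
	if err == nil {
		return 0, nil // successful run, exit status zero
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// The program ran but exited unsuccessfully.
		return exitErr.ExitCode(), nil
	}
	return -1, err // e.g. the binary was not found
}

func main() {
	// "sh -c 'exit 3'" is only a stand-in for a container entrypoint.
	code, err := runAndGetExitCode("sh", "-c", "exit 3")
	if err != nil {
		fmt.Println("could not run command:", err)
		return
	}
	fmt.Println("exit code:", code)
}
```

Returning the code separately from the error keeps "the program failed" apart from "the program could not be run", which is the distinction the ExitError check is there to make.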

Interestingly enough the SysProcAttr field exposes some functionality that is a bit more difficult to use in, say, C. While using the syscall package is possible, it's mostly easier to assign a few values in that field instead, using the definition of the SysProcAttr structure itself.

Later on there's also the need to parse some JSON - that's again easily done with the standard library, using encoding/json, in particular Unmarshal to a map[string]interface{} (in case we just want to grab a top-level entry in a JSON object), or to a pointer to a custom struct type using structure tags like so:

type Foo struct {
    Bars []Bar `json:"bars"`
}

type Bar struct {
    Baz string `json:"baz"`
}

...

foo := Foo{}
if err := json.Unmarshal(body, &foo); err != nil {
    panic(err)
}

for _, bar := range foo.Bars {
    println(bar.Baz)
}
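
The other variant mentioned above, decoding into a map[string]interface{} to grab a single top-level entry, might look roughly like the following sketch. The helper name topLevelString and the payload are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// topLevelString (hypothetical helper) decodes a JSON object generically and
// returns the string stored under key. It fails if the key is missing or the
// value is not a string.
func topLevelString(body []byte, key string) (string, error) {
	var object map[string]interface{}
	if err := json.Unmarshal(body, &object); err != nil {
		return "", err
	}
	value, ok := object[key].(string)
	if !ok {
		return "", fmt.Errorf("key %q is missing or not a string", key)
	}
	return value, nil
}

func main() {
	// Made-up payload, only for illustration.
	body := []byte(`{"token": "abc123", "expires_in": 300}`)
	token, err := topLevelString(body, "token")
	if err != nil {
		panic(err)
	}
	fmt.Println(token)
}
```

This avoids declaring a struct for a response of which only one field is needed, at the cost of a type assertion on every access.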

The Redis challenge is comparatively more contained and mostly needs just the standard library. The most interesting thing I noticed was that there's now a concurrency-friendly map implementation called sync.Map, so no external synchronization primitive is needed.

What also helped was the redis-cli tool, though I had to find out for myself that it doesn't interpret the specification very strictly. In fact, just about everything returned in the server response will be printed, even when it's not valid according to the spec.
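
As a rough sketch of how sync.Map can back a tiny key-value store without a separate mutex (the store type and its methods are illustrative, not the code from the challenge):

```go
package main

import (
	"fmt"
	"sync"
)

// store is a tiny key-value store backed by sync.Map, which is safe for
// concurrent use without an additional lock.
type store struct {
	data sync.Map
}

func (s *store) set(key, value string) {
	s.data.Store(key, value)
}

func (s *store) get(key string) (string, bool) {
	value, ok := s.data.Load(key)
	if !ok {
		return "", false
	}
	// Only strings are ever stored, so the assertion cannot fail.
	return value.(string), true
}

func main() {
	var s store
	var wg sync.WaitGroup

	// Simulate several client connections writing at the same time.
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			s.set(fmt.Sprintf("key%d", n), fmt.Sprintf("value%d", n))
		}(i)
	}
	wg.Wait()

	value, ok := s.get("key7")
	fmt.Println(value, ok)
}
```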

Overall the biggest challenge here might be to accurately parse the command input and to deal with expiration. I simply chose a lazy approach there, removing expired entries when they are accessed instead of clearing out the map on a timer. This will of course not be memory-friendly long-term, but for implementing a very simple Redis server it's more than enough to pass all tests.
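
A lazy expiration scheme along those lines could be sketched as follows. This is an assumption about the general technique, not the code from the challenge: all names are hypothetical, and a plain map behind a mutex is used instead of sync.Map so that the "check deadline, then delete" step is atomic. The clock is a replaceable function, which keeps the sketch deterministic to test.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry is a stored value plus an optional deadline in milliseconds.
type entry struct {
	value           string
	hasExpiry       bool
	expiresAtMillis int64
}

// expiringStore removes expired keys lazily, i.e. only when they are read.
type expiringStore struct {
	mu        sync.Mutex
	data      map[string]entry
	nowMillis func() int64 // replaceable clock
}

func newExpiringStore() *expiringStore {
	return &expiringStore{
		data: make(map[string]entry),
		nowMillis: func() int64 {
			return time.Now().UnixNano() / int64(time.Millisecond)
		},
	}
}

// set stores value under key; a ttlMillis of zero or less means no expiry.
func (s *expiringStore) set(key, value string, ttlMillis int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e := entry{value: value}
	if ttlMillis > 0 {
		e.hasExpiry = true
		e.expiresAtMillis = s.nowMillis() + ttlMillis
	}
	s.data[key] = e
}

// get returns the value unless it has expired, dropping it in that case.
func (s *expiringStore) get(key string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.data[key]
	if !ok {
		return "", false
	}
	if e.hasExpiry && s.nowMillis() >= e.expiresAtMillis {
		delete(s.data, key)
		return "", false
	}
	return e.value, true
}

func main() {
	s := newExpiringStore()
	s.set("greeting", "hello", 50) // expires after 50 ms

	value, ok := s.get("greeting")
	fmt.Println(value, ok)

	time.Sleep(60 * time.Millisecond)
	value, ok = s.get("greeting")
	fmt.Println(value, ok)
}
```

Keys that are never read again stay in the map forever, which is exactly the memory trade-off of the lazy approach.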

Testing with Scala
posted on 2020-04-14 16:37:37+02:00

After working with Scala for a while now, I thought it would be good to write down a couple of notes on my current testing setup, in particular with regards to which libraries I've settled on and which style of testing I've ended up using.

Tests end up in the same package as the code that's tested. A group of tests is always in a class with the Tests suffix, e.g. FooTests. If it's about a particular class Foo the same applies.

scalatest is used as the testing framework with AnyWordSpec, which means we're using the should / in pattern.

For mocking the only addition is MockitoSugar to make things more Scala-ish.

What does it look like?

package com.example.foo

import org.mockito.MockitoSugar
import org.scalatest.wordspec.AnyWordSpec

class FooTests extends AnyWordSpec with MockitoSugar {
  "Foo" should {
    "do something" in {
      val bar = mock[Bar]
      val foo = new Foo(bar)
      foo.baz(42L)
      verify(bar).qux(42L)
    }
  }
}

Easy enough. There's also some more syntactic sugar for other Mockito features, meaning ArgumentMatchersSugar should also be imported when needed. Similarly, scalatest has a number of additional helpers for particular types like Option or Either, e.g. OptionValues and EitherValues.

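import org.scalatest.{EitherValues, OptionValues}
import org.scalatest.matchers.should.Matchers
import org.scalatest.wordspec.AnyWordSpec

class BarTests extends AnyWordSpec with Matchers with EitherValues with OptionValues {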
  "Bar" should {
    "do something else" in {
      val bar = new Bar
      bar.qux(42L).left.value should be(empty)
      bar.quux().value shouldBe "a value"
    }
  }
}

This can be done to the extreme, but usually it looks easier to me to simply assign highly nested values to a variable and continue with matchers on that variable instead.

Since sbt is often used, the two test dependencies would look like this:

libraryDependencies ++= Seq(
  "org.scalatest" %% "scalatest"               % "3.1.1"  % Test,
  "org.mockito"   %% "mockito-scala-scalatest" % "1.13.0" % Test,
)

Scala Macros to the Rescue
posted on 2018-12-20 21:38:45+01:00

Did you know Scala has macros? Compared to Common Lisp they serve pretty much the same purpose: doing things that the (plethora of) other language features don't support and cutting down on the resulting boilerplate code. Even the S-expressions can be had when macro debugging is turned on, though the pretty-printed Scala code is arguably much more useful here.

Why would you realistically use them then? Turns out I had to deal with some auto-generated code dealing with Protobuf messages. The generated classes for any message look something like this (in Java syntax since that's what the generated code is):

public interface ExampleResponseOrBuilder
  extends com.google.protobuf.MessageOrBuilder;

public static final class ExampleResponse
  extends com.google.protobuf.GeneratedMessageV3
  implements ExampleResponseOrBuilder {

  public static Builder newBuilder();

  public static final class Builder
    extends com.google.protobuf.GeneratedMessageV3.Builder<Builder>
    implements ExampleResponseOrBuilder;
}

That is, we have one interface and two classes, one of which conveniently gives you a builder for new objects of the class. That's used like this (back to Scala here):

val builder: ExampleResponse.Builder = ExampleResponse.newBuilder()
builder.mergeFrom(stream)
val result: ExampleResponse = builder.build()

If you try and make a generic builder here, you'll quickly notice that this is rather hard as the generic types don't really express the relationship between ExampleResponse and ExampleResponse.Builder well.

As an aside, you want to have a generic builder parametrised on the return type to be able to write something like this:

val result = build[ExampleResponse](stream)

This works without ever having to pass the type through as a value. Even better would be to specify only the result type and have the type parameter for build derived automatically.

These builders look something like this then:

trait ProtobufBuilder[T <: Message] {
  def underlying(): Message.Builder

  def build(string: String)(implicit parser: JsonFormat.Parser): T = {
    val builder = underlying()
    parser.merge(string, builder)
    builder.build().asInstanceOf[T]
  }
}

class ExampleResponseBuilder() extends ProtobufBuilder[ExampleResponse] {
  override def underlying(): ExampleResponse.Builder =
    ExampleResponse.newBuilder()
}

This then allows us to use some implicit magic to pass these through to the decoder (sttp's framework in this case) to correctly decode the incoming data.

But we have to 1. write one class for each type, 2. instantiate it. This is roughly five lines of code per type depending on the formatting.

Macros to the rescue!

Inspired by the circe derivation API I finally got all the pieces together to create such a macro:

def deriveProtobufBuilder[T <: Message]: ProtobufBuilder[T] = macro deriveProtobufBuilder_impl[T]

def deriveProtobufBuilder_impl[T <: Message: c.WeakTypeTag](
    c: blackbox.Context): c.Expr[ProtobufBuilder[T]] = {
  import c.universe._

  val messageType   = weakTypeOf[T]
  val companionType = messageType.typeSymbol.companion

  c.Expr[ProtobufBuilder[T]](q"""
    new ProtobufBuilder[$messageType] {
      override def underlying(): $companionType.Builder = $companionType.newBuilder()
    }
   """)
}

Used then like this:

private implicit val exampleResponseBuilder: ProtobufBuilder[ExampleResponse] = deriveProtobufBuilder

That's one or two lines and the types are only mentioned once (the variable name can be changed). Unfortunately getting rid of the variable name doesn't seem to be possible.

Easy, wasn't it? Unfortunately all of this is hampered by the rather undocumented APIs; you really have to search for existing code or Stack Overflow questions to figure this out.

One thing that helped immensely was the -Ymacro-debug-lite option, which prints the expanded macro when used in sbt via compile.

Act! Heuristics for determining intent.
posted on 2018-12-09 14:11:18+01:00

I wrote down a little scenario for myself for the next iteration of act:

  • mark "http://www.google.de" with mouse (selection -> goes to x11 buffer-cut)
  • listen on those changes and show current context and choices
  • press+release "act primary" key -> runs primary action immediately
  • (or) press "act" key once -> opens buffer for single character input to select choice
  • buffer captures keyboard, after single character action is taken, esc aborts, "act" again should expand the buffer to full repl
  • mouse -> selection of source? history?
  • context -> focused window? -> lookup for program gives cwd, etc.
  • since that's not foolproof an obvious addition to the protocol would be to store a reference to the source of the clipboard (or the program that will be queried for it)
  • you should always be able to interrogate windows about their origin and capabilities, including their scripting interface
  • pattern matching url -> rule based decisions
  • primary / secondary actions -> rule of three basically, except we err on the side of caution and just bind two actions to two keys
  • special handling for clipboard types that aren't text? allow for example to override a rule based on whether a picture would be available
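
The "pattern matching -> rule based decisions" and "primary / secondary actions" items from the list above could be sketched like this. The rule table, the patterns and the action names are all hypothetical labels chosen for illustration, not part of act itself.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule maps selections matching pattern to a primary and a secondary action.
// The action names are hypothetical labels, not real commands.
type rule struct {
	pattern   *regexp.Regexp
	primary   string
	secondary string
}

// rules is checked in order; the first match wins.
var rules = []rule{
	{regexp.MustCompile(`^https?://`), "open-in-browser", "copy-as-link"},
	{regexp.MustCompile(`^/\S*$`), "open-in-editor", "open-containing-directory"},
}

// actionsFor returns the actions of the first rule matching the selection.
func actionsFor(selection string) (primary, secondary string, ok bool) {
	for _, r := range rules {
		if r.pattern.MatchString(selection) {
			return r.primary, r.secondary, true
		}
	}
	return "", "", false
}

func main() {
	primary, secondary, ok := actionsFor("http://www.google.de")
	fmt.Println(primary, secondary, ok)
}
```

Pressing and releasing the "act primary" key would then run the first action directly, while the single-character buffer would offer both.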

Now, there are several problems with this on a regular X11-based desktop already:

  • How do you identify the program / PID belonging to a particular window?
  • How do you get information about that program, e.g. its current working directory?

In addition, since I'm using tmux inside a terminal emulator there are some more problems:

  • How do you know which tmux session is currently active?
  • How do you know which program is currently "active" in a tmux session?

Basically it's a recursive lookup for the "current state" of what's being displayed to the user. Not only that, but for things like browsers, image editors, video editors, anything document based it's still the same problem at another level, namely, what the current "context" is, like the currently open website, picture, video, scene, what have you.

Coming back to earlier thoughts about automation, there's no way for most of these to be accurately determined at this time. DBus scripting, for example, isn't common enough to "just use it", there are also several links missing in the scenario described above, and some can likely never be fixed to a sufficient degree to not rely on heuristics.

Nevertheless, with said heuristics it's still possible to get to a state where a productivity plus can be achieved with only a moderate amount of additional logic to step through all the indirections between processes and presentation layers.

Now let me list a few answers to the questions raised above:

  • The PID for a particular window can be derived from an X11 property, together with xdotool this gives us an easy pipeline to get this value: ``.
  • Information about the running process can then be retrieved via the proc filesystem, e.g. readlink /proc/$PID/cwd for the current working directory. Of course this has limited value in a multi-threaded program or any environment that doesn't rely on the standard filesystem interface (but uses its own defaults).
  • I do not have an answer for the currently active tmux session yet, presumably you should somehow be able to get from a PID to a socket and thus to the session?
  • For tmux, the currently active program in a session is a bit more complex, ``, which we'll also have to filter for the active values.
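
The proc filesystem lookup from the second answer can also be done programmatically. The following is a minimal, Linux-only sketch; the function name workingDirectory is made up, and the current process stands in for the PID that would normally come from the window lookup.

```go
package main

import (
	"fmt"
	"os"
)

// workingDirectory (hypothetical helper) resolves the /proc/<pid>/cwd
// symlink, the same thing "readlink /proc/$PID/cwd" does. Linux only.
func workingDirectory(pid int) (string, error) {
	return os.Readlink(fmt.Sprintf("/proc/%d/cwd", pid))
}

func main() {
	// Stand-in PID: our own process instead of the focused window's.
	cwd, err := workingDirectory(os.Getpid())
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(cwd)
}
```

Reading the link for a process owned by another user typically requires elevated privileges, so a real implementation has to treat the error case as normal.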

For scripting interfaces, many programs have their own little implementation of this, but most problematic here is that you want to go from an X11 window / PID to the scripting interface, not through some workaround by querying for interfaces!

For programs like Emacs and any programmable environment we can likely script something together, but again it's not a very coherent whole by any means.

DBus and PolicyKit from Common Lisp
posted on 2018-12-04 23:56:27+01:00

Intro

Regardless of your position on DBus, sometimes you might need to interact with it. Common Lisp currently has at least two libraries ready for you, one of them is in Quicklisp, https://github.com/death/dbus/.

Setup

Have it loaded and create a package to use it, then change into it.

;; (asdf:load-system '#:dbus)
;; or
;; (ql:quickload "dbus")

(defpackage #:example
  (:use #:cl #:dbus))

(in-package #:example)

I'm going to refer to a (very old) polkit example in Python, reproduced here for reference (it still works in current Python 3 without any changes except the print):

import dbus

bus = dbus.SystemBus()
proxy = bus.get_object('org.freedesktop.PolicyKit1', '/org/freedesktop/PolicyKit1/Authority')
authority = dbus.Interface(proxy, dbus_interface='org.freedesktop.PolicyKit1.Authority')

system_bus_name = bus.get_unique_name()

subject = ('system-bus-name', {'name' : system_bus_name})
action_id = 'org.freedesktop.policykit.exec'
details = {}
flags = 1            # AllowUserInteraction flag
cancellation_id = '' # No cancellation id

result = authority.CheckAuthorization(subject, action_id, details, flags, cancellation_id)

print(result)

So, how does this look in Common Lisp? Mostly the same, except that at least at the moment you have to specify the variant type explicitly! This was also the reason to document the example, it's quite hard to understand what's wrong if there's a mistake, including the socket connection just dying on you and other fun stuff.

(with-open-bus (bus (system-server-addresses))
  (with-introspected-object (authority bus
                                       "/org/freedesktop/PolicyKit1/Authority"
                                       "org.freedesktop.PolicyKit1")
    (let* ((subject `("system-bus-name" (("name" ((:string) ,(bus-name bus))))))
           (action-id "org.freedesktop.policykit.exec")
           (details ())
           (flags 1)
           (cancellation-id "")
           (result
                (authority "org.freedesktop.PolicyKit1.Authority" "CheckAuthorization"
                           subject action-id details flags cancellation-id)))
      (format T "~A~%" result))))

Note the encoding of the dictionary: The type of the whole argument is specified as (sa{sv}), a structure of a string and a dictionary of strings to variants - we're spelling out the variant type here, compared to what's automatically done by the Python library.

Lisp shell semantics
posted on 2018-08-08 01:36:29

Continuing from an earlier post, what might the semantics be that we'd like to have for a more useful shell integrated with a Lisp environment?

Syntax

Of course we can always keep the regular Common Lisp reader; however it's not best suited for interactive shell use. In fact I'd say interactive use, period, requires heavy use of editing commands to juggle parens. Which is why some people use one of the simplifications of not requiring the outermost layer of parens on the REPL.

So, one direction would be to have better S-expression manipulation on the REPL, the other to have a syntax that's more incremental than S-expressions.

E.g. imagine the scenario that,

  1. as a terminal user,
  2. I'm navigating directories,
  3. listing files,
  4. then looking for how many files there were,
  5. then grepping for a particular pattern,
  6. then opening one of the matches.

In sh terms that's something along the lines of

$ cd foo
$ ls
...
$ ls | wc -l
42
$ ls | grep bar
...
$ vim ...

In reality I'm somewhat sure no one's going as far as doing the last two steps in an iterative fashion, like via ls | grep x | grep y | xargs vim, as most people will have a mouse readily available to select the desired name. There are some terminal widgets which allow the user to select from e.g. one of the input lines in a command line dialog, but again it's not a widespread pattern and still requires manual intervention in such a case.

Also note that instead of reusing the result of the computation, the whole expression keeps being re-evaluated, which also makes this somewhat inefficient. The notion of the "current" thing being worked on ("this", "self") also isn't expressed directly here.

In the new shell I'd like to see part of this explored. We already have the three special variables *, ** and *** (and / etc.!) in Common Lisp - R, IPython and other environments usually generalise this to a numbered history, which arguably we might want to add here as well - so it stands to reason that these special variables make sense for a shell as well.

$ cd foo
$ ls
...
$ * | wc
42
$ grep ** bar
...
$ vim *

(Disregarding the need for a different globbing character ...)

There's also the very special exit status special variable in shells that we need to replicate. This will likely be similar to Perl special variables that keep track of one particular thing instead of reusing the * triad of variables for this too.

Pipelines

The expression compiler should be able to convert from a serial to a concurrent form as necessary, that is, by converting to the required form at each pipeline step.

ls | wc -l | vim

Here, ls will be the built-in LIST-DIRECTORY, which is capable of outputting e.g. a stream of entries. wc -l might be compiled to simply LENGTH, which, since it operates on fixed sequences, will force ls to be fully evaluated (disregarding a potential optimisation of just counting the entries instead of producing all filenames). The pipe to vim will then force the wc output to text format, since vim is an unknown native binary.

These would be the default conversions. We might force a particular interpretation by adding explicit (type) conversions, or adding annotations to certain forms to explain that they accept a certain input/output format; for external binaries the annotations necessarily would be more extensive.

For actual pipelines it's extremely important that each step progresses as new data becomes available. Together with proper backpressure, this will ensure good performance and keep the buffer bloat to a minimum. Again this might be a tunable behaviour too.
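
To illustrate the idea of incremental steps with backpressure, here is a small sketch using Go channels rather than the Lisp shell itself. The stage names list, grep and count are stand-ins chosen to mirror the shell commands above. Because the channels are unbuffered, a producer blocks until the next stage has taken the previous item, so no stage runs ahead of its consumer.

```go
package main

import (
	"fmt"
	"strings"
)

// list emits entries one at a time. The channel is unbuffered, so the
// producer blocks until the next stage is ready: that is the backpressure.
func list(entries []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, e := range entries {
			out <- e
		}
	}()
	return out
}

// grep forwards only the entries containing substring, as they arrive.
func grep(in <-chan string, substring string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for e := range in {
			if strings.Contains(e, substring) {
				out <- e
			}
		}
	}()
	return out
}

// count consumes the whole stream and returns the number of entries.
func count(in <-chan string) int {
	n := 0
	for range in {
		n++
	}
	return n
}

func main() {
	// Made-up directory listing standing in for the output of ls.
	entries := []string{"bar.txt", "foo.txt", "barbaz", "qux"}
	fmt.Println(count(grep(list(entries), "bar")))
}
```

Making the buffer size of each channel configurable would be one way to get the tunable behaviour mentioned above.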

I/O

The shell should be convenient. As such any output and error output should be captured as a standard behaviour, dropping data if the buffers would get too big. This will then allow us to refer to previous output without having to reevaluate any expression / without having to run a program again (which in both cases might be expensive).
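
One possible shape for such a capture buffer is sketched below, assuming a fixed byte limit and a policy of dropping the oldest data first. The type name, the limit and the echo command are made up for illustration.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"os/exec"
)

// boundedBuffer keeps at most limit bytes of output and drops the oldest
// bytes once the limit is exceeded. It implements io.Writer.
type boundedBuffer struct {
	limit int
	data  []byte
}

func (b *boundedBuffer) Write(p []byte) (int, error) {
	b.data = append(b.data, p...)
	if excess := len(b.data) - b.limit; excess > 0 {
		// Copy the tail so the dropped prefix can be garbage collected.
		b.data = append([]byte(nil), b.data[excess:]...)
	}
	return len(p), nil
}

func (b *boundedBuffer) String() string {
	return string(b.data)
}

func main() {
	capture := &boundedBuffer{limit: 16}

	// Show the output as usual while also keeping a bounded copy of it.
	cmd := exec.Command("echo", "a fairly long line of output")
	cmd.Stdout = io.MultiWriter(os.Stdout, capture)
	if err := cmd.Run(); err != nil {
		fmt.Println("command failed:", err)
		return
	}
	fmt.Printf("captured tail: %q\n", capture.String())
}
```

Keeping the tail rather than the head is a design choice; for a shell history it may be just as reasonable to keep the beginning and mark the output as truncated.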

Each datatype and object should be able to be rendered and parsed in more than a single format too. E.g. a directory listing itself might be an object, but will typically be rendered to text/JSON/XML, or to a graphical representation even.

Keyboard protocol #2
posted on 2018-07-25 23:17:22

Now in fact there are still reasons we need key codes that are different to the eventual text representation, e.g. for cursor movement and other special characters like function keys.

Looking at the Xorg source code now, there's a relatively fixed notion of what a keyboard can actually do. I suspect that conceptually a somewhat backwards-compatible extension would be to have a new dedicated kind of device, that is exposed (similarly to a keyboard with an integrated touchpad) separately to the other functionality of the device.

In particular, I'd like to keep the keyboard in "regular" mode as long as the host hasn't signaled that it wants to use the extended functionality via, presumably some part of the USB negotiation. Only at that point would the extension be activated and the keyboard output would be sent via it. The regular keyboard device would then be virtually unplugged.

I suspect that this is better than having two devices, one for key codes and one for text input, especially because we'd not be able to guarantee in which order the two devices would be read. This is less / not a problem between a keyboard and a pointer device of course.

Now given that X11 isn't the interface most applications are written against, how would the text actually arrive at an application? I'd imagine basically extending the whole event handling by one more event, the text event, which wouldn't correspond to any key (thus, it can't be in a pressed or unpressed state). In terms of GTK+ and Qt this might be even easier than for a lower level application since many applications will only want to deal with text input and pre-defined keyboard shortcuts anyway.

Speaking of which, what does "Ctrl-C" actually mean? Of course the mnemonic for "copy" is in there, but also the "control" modifier. How well does this play with text input? Not at all, and I believe modifiers work better logically on the key code level; for text input I imagine other modifiers like "emphasis", or, more specifically, "bold", would make more sense, possibly "URL", or "footnote".

Overall there can of course be modifiers active while text input occurs, it's more a question of whether they (can) be assigned any meaning without falling back to the flawed character equals key press comparison.

What does this gain us? Ideally every application (or more accurately: each toolkit) could now drop logic specifically to translate key codes to text, since all of it would already be handled by the keyboard itself. Keypresses would come in via the same interface and be used for specific, non-text functionality.

Keyboard protocol
posted on 2018-07-19 11:48:07

The keyboard protocol still uses roughly the same approach as at the start of computing: The keyboard is a dumb device, reporting each mechanical button that's pressed. The computer is the intelligent device, translating those into characters.

There are some attempts at improving on this that have made it into various places, e.g. there's a flag in the USB HID protocol to indicate the physical layout, be it US, or some other one. Except no manufacturer sets it, so no driver uses it.

But, what if we had a keyboard, courtesy of QMK and similar firmwares, that is substantially more intelligent? If the protocol allowed for it we'd be able to have such nice things as sending an "a", an "あ", or a "●" without any remapping needing to be done! In fact if the keyboard could send Unicode sequences we could do things that aren't possible by remapping, like sending characters from various scripts through a macro key without impacting any keypress since we have an immensely increased value space to work with.


Unless otherwise credited all material Creative Commons License by Olof-Joachim Frahm