Recent Content

posted on 2018-07-17 21:50:04

One part of the lack of cohesion on *nix systems is the hack of command line completion via e.g. Bash or ZSH modules. I can think of roughly three ways in the *nix paradigm to have a more cohesive pattern here:

Give full control to the command line program and invoke it with a special handler to parse and generate appropriate output for command line completion.
Give partial control and generate a script like description of the completion, e.g. as a Bash- or ZSH-specific script.
Which is pretty similar, generate a static description on request.

All of these use the program, the thing that's being invoked, as the central mechanism of how to do completion. No extra files are required to be put anywhere and the description can always be in sync with what the program actually does later on. The downside might very well be that the program is invoked multiple times.

Therefore perhaps also

Invoke the program in a special mode and keep it alive while it handles command line completion. Once finished and the user wants to invoke it, it would simply use the already existing in-memory structures and continue.

Of course I can see the downsides here too, though in the interest of having a more interwoven system I'd argue that the benefit might be worth it overall, in particular for programs whose arguments are a little bit more like a DSL than just simple file arguments. Notably though any shell features would not be available in variant 4, instead the program would have to call back or replicate the required functionality, making it somewhat less appealing?

The Curious Case of the Uninterruptible Sleep

posted on 2018-07-09 16:16:37

Let me tell you about one particular annoying issue I came across when using a GC-ed environment. No, not Python, though I'm almost definitely sure it will have the same issue. No, it's of course Common Lisp, in this case CCL and SBCL in particular, both of which have a stop-the-world GC (as far as I know there are no knobs to change the behaviour outside of what I'll describe below).

Now, do you also remember these things called HDDs? Turns out that one of my external hard drives likes to shut itself down after a while. You can hear that very well. However, since the drive is still mounted, accessing (uncached) parts of the filesystem will trigger a wake event. But getting the platters up to speed again takes a lot of time, so in between what happens?

Exactly, uninterruptible sleep for the process in question. It's one of the few possibilities where that process state can happen and if it was without the specifics of the GC involved it would just block one thread while the rest, in particular the GUI thread, would keep moving.

In this particular instance though, the GUI would work for a moment and then freeze until the drive finally responded with the requested data.

Now why is that?

Turns out the GC asks each (runtime) thread to block for the GC run. This is done via signals. Except one thread won't respond to signals since it's ... sleeping. Uninterruptibly.

Great.

Suggested options include spawning a separate process and wait for I/O (which I'd rather not do, since it'd mean doing that for every single library call that might operate with e.g. files, which is basically a lot. It just seems there's no good way to deal with this except in changing the runtime and dealing with the fact that the GC might not be able to reach all threads.

I looked a bit at the CCL runtime and it seems if we promise not to do anything bad with regards to the GC, we might be able to work around it by setting a new flag on a per-thread level that excludes it from the GC waiting loop. We'd only do this around potentially problematic FFI calls, but also around known internal system calls. When returning from the foreign land we'd also need to "catch up" with the GC, potentially blocking until the current run is done (if there is one). But that's solvable, with a little bit more overhead for FFI calls. I believe that's worth it.

Since I'm all around not familiar with either the CCL or SBCL code bases I suspect that even the generalisation of doing this for all FFI would be tenable, in which case the pesky annotations would cease to be necessary, fixing the issue without having to manually adjust anything on the developer's side.

Lastly, if I knew the right terms I'm sure there are solutions for this in either GHC or Erlang, except that I wouldn't really know what to search for.

Btw. this is all super iffy to debug too, since e.g. gdb will just hang in case the target process is in this specific task state, being based on ptrace and seemingly inheriting it through there.

New Job, New ...

posted on 2018-04-15 18:46:10

Seems a good idea to write down some notes about this. I'm switching jobs starting May. I was hesitant about this, mostly because I wanted to figure out the right direction, but the opportunity presented itself, so why not. I'll see how that works out and not travelling for a while is absolutely necessary for me to get ... back on track, I guess.

I hope to get back into a bit of Open Source work as well, with a bit more time that might be possible. Notably that will be related to cl-cffi-gtk, which I hope to get into better shape with some additional collaborators, so there's a good chance the library will (finally) be switched over on Quicklisp too.

At the same time there's things on the bucket list I hope to start this year. It's already a bit late during the year, but better now than not at all. That includes lending a bow and signing up for training.

Lastly I'm not super happy about the lack of reading during the last months, I'm currently still on some Dune-related books and I can't say I enjoy them a lot (big surprise given that people have been slamming them I believe). Music-wise there's fortunately quite a lot of variety going on.

An early Christmas present^Wkeyboard

posted on 2017-12-17 12:25:40

Picture of the half-built keyboard

Picture of the keyboard with soldering iron

Almost at the end of this year I finally had enough components to build a custom keyboard. The case is reused from my existing Poker II, but I'm on the look-out for either a wooden or all-metal case instead of black plastic. Additionally I'm missing one 1.25u and one 1.5u key - with the maximum set of keys you end up needing a few more of those than on even a regular sized keyboard (but I do have an ISO Enter and Shift key too many).

Okay, so from the start. The goal was to have a fully reflashable keyboard. Although Vortex (the makers of the Poker keyboards) have provided some updated firmware, they don't want to share the firmware sources and the one GitHub project I found to reverse engineer it didn't go anywhere in the last few years unfortunately.

Looking through the number of various projects it seemed that a GeekHack GH60 or a similar model would suffice for me. The layout isn't much different from my previous keyboard and it would fit a lot of keys in a good form factor.

I actually went with the (Chinese?) "Satan" variant - comes set up for more LEDs, a feature which I'm not yet using (and might not in the long run at all actually). With that PCB, a black plate for mounting the switches, a stabiliser for the Space key, green Cherry switches and keys I now have something matching my taste quite well. The keys I'd actually like to replace with a nicer option later on, but they're alright for the moment.

All in all thumbs up, would build it again.

Build

Testing the PCB with a bit of wire is a good idea, so check every key ... possibly also the LEDs, but I didn't have any handy for it. As far as I understand modding the switches with SIP sockets can be done even after soldering in the switches, so I didn't do it yet.

Once that's confirmed the rest is also easy: Put in the first switches into the plate. Stabiliser goes onto the PCB, then the plate on top. Check the contacts are visible on the other side of the PCB and nothing was bent. Solder the switches to give a bit of stability. Then fill up the rest of the keyboard. I checked the layout with the keys in the most critical areas too because I wasn't sure it would all align well.

Check the solder joints, movement of the switches (my space key was a bit weird, sometimes getting stuck a bit), plug it in and confirm with debugging mode or some online keyboard tester.

Firmware

The default firmware is fine for testing, but of course the objective is to customise the hell out of it.

The best resource for me was this gist that is a bit cumbersome to set up with the different repositories, but works pretty well.

The wiring isn't completely in sync with the regular GH60, therefore a separate revision needs to be selected in the TMK sources. Also, it seems like at least one key is completely missing even with that and the default layouts are all ANSI! To use the remaining three keys the regular KEYMAP macro needs to be used. Note that also a few keys are on a different position in the macro, therefore I changed it slightly such that it more closely resembles the physical layout.

For debugging either cat from the /dev/hidraw devices (might take a moment to find the right one), or perhaps compile the HID Listen program. You really don't want to disable this until you've sorted out all keymap issues. By default the magic keys are left shift + right shift - h will print the short help, while x, the most interesting one for me, will print the keyboard matrix on every raw keypress.

Oneliners #1

posted on 2017-07-04 22:3011:+01:00

So, let's replicate this blog post in CL. In case it's gone, we want to fetch all pictures from Reddit's /r/pics:

curl -s -H "User-Agent: cli:bash:v0.0.0 (by /u/codesharer)" \
  https://www.reddit.com/r/pics/.json \
  | jq '.data.children[].data.url' \
  | xargs -P 0 -n 1 -I {} bash -c 'curl -s -O {}'

First, we need a HTTP client (drakma being canonical) and a JSON parser (cl-json will suffice here, you might want to use a different one depending on the parsing/serialisation needs).

(asdf:load-systems '#:drakma '#:cl-json)

Next, we define our workflow, fetch JSON content from a URL, get the list of URLs and download each one of them.

We need to download the JSON first, so let's start with that:

(defun scrape-sub (name)
  (drakma:http-request
   (format NIL "https://www.reddit.com/r/~A/.json" name)
   :user-agent "repl:abcl:v0.0.0 (by /u/nyir)"))

If we run, we'll see that the output is just a byte vector:

#(123 34 107 ...)

That's actually fine for the JSON library, but more readably we could either set a flag for Drakma, or convert it manually:

(babel:octets-to-string *)

The second step is parsing it from the JSON format, so extending it will be like the following:

(defun scrape-sub (name)
  (cl-json:decode-json-from-source
   (babel:octets-to-string
    (drakma:http-request
     (format NIL "https://www.reddit.com/r/~A/.json" name)
     :user-agent "repl:abcl:v0.0.0 (by /u/nyir)"))))

The output now looks a bit different:

((:KIND . "Listing") (:DATA (:MODHASH . "") ...))

But it's already getting more manageable. Next we want the URL bit of the data. Unfortunately I don't know of a good library that would allow us to specify something as a kind of XPath-style selector. So we'll go ahead and to it manually. The .data.children bit will be something like (cdr (assoc :children (cdr (assoc :data <json>)))), since cl-json returns an association list; children[] means we'll iterate over all children and collect the results; data.url again is the same kind of accessor like (cdr (assoc :url (cdr (assoc :data <json>)))):

(defun scrape-sub (name)
  (let ((json (cl-json:decode-json-from-source
               (babel:octets-to-string
                (drakma:http-request
                 (format NIL "https://www.reddit.com/r/~A/.json" name)
                 :user-agent "repl:abcl:v0.0.0 (by /u/nyir)")))))
    (mapcar (lambda (child)
              (cdr (assoc :url (cdr (assoc :data child)))))
            (cdr (assoc :children (cdr (assoc :data json)))))))

Now the output is just a list of strings:

("https://www.reddit.com/r/pics/comments/6ewxd6/may_2017_transparency_report/" "https://i.redd.it/luxhqoj95q5z.png" ...)

Here's one a addition I'll put in, filtering for image file types. That might still be unreliable of course, but it'll remove a whole bunch of potentially wrong links already. For filtering, MAPCAR isn't suitable, either we could do it in multiple stages, or we'll use something like MAPCAN, or an explicit iteration construct like LOOP/ITERATE. I'll go with MAPCAN here, meaning every element to collect needs to be wrapped in a list:

(defun scrape-sub (name)
  (let ((json (cl-json:decode-json-from-source
               (babel:octets-to-string
                (drakma:http-request
                 (format NIL "https://www.reddit.com/r/~A/.json" name)
                 :user-agent "repl:abcl:v0.0.0 (by /u/nyir)")))))
    (mapcan (lambda (child)
              (let ((url (cdr (assoc :url (cdr (assoc :data child))))))
                (and url
                     (member (pathname-type (pathname (puri:uri-path (puri:parse-uri url))))
                             '("jpg" "png")
                             :test #'string-equal)
                     (list url))))
            (cdr (assoc :children (cdr (assoc :data json)))))))

I'm happy with that and it now filters for two image types.

Last point, actually downloading all scraped results. For this, we just iterate and download them as before:

(defun scrape-sub (name)
  (let* ((agent "repl:abcl:v0.0.0 (by /u/nyir)")
         (json (cl-json:decode-json-from-source
                (babel:octets-to-string
                 (drakma:http-request
                  (format NIL "https://www.reddit.com/r/~A/.json" name)
                  :user-agent agent))))
         (downloads
           (mapcan (lambda (child)
                     (let ((url (cdr (assoc :url (cdr (assoc :data child))))))
                       (when url
                         (let ((pathname (pathname (puri:uri-path (puri:parse-uri url)))))
                           (when (member (pathname-type pathname)
                                         '("jpg" "png")
                                         :test #'string-equal)
                             `((,url ,pathname)))))))
                   (cdr (assoc :children (cdr (assoc :data json)))))))
    (mapc (lambda (download)
            (destructuring-bind (url pathname) download
              (with-open-file (stream (merge-pathnames *default-pathname-defaults* pathname)
                                      :direction :output
                                      :element-type '(unsigned-byte 8))
                (write-sequence
                 (drakma:http-request url :user-agent agent)
                 stream))))
          downloads)))

And this works.

Now. This is decidedly not a demonstration of how the final result should look like. In fact there a whole lof of things to improve and to consider when you'd put this into a reusable script.

From a maintainability perspective, we'd put each functional part into it's own component, be it a function or method, in order to make them easier to reason about and to test each bit individually.

From a performance part ... oh my, there's so much wrong with it, mostly slurping everything into memory multiple times, while drakma does support streams as results and HTTP Keep-Alive, both would improve things. The JSON parser could in theory also operate on tokens, but that's rarely worth the hassle (the CXML API can be used for that, by converting JSON "events" into a stream of SAX events basically). Lastly creating the output lists isn't necessary, this could all be done inline or with continuation passing style, but that's worse for maintaining a nice split between functions.

From a correctness part, all the URLS might have weird characters in them that don't work well with pathnames and/or the local filesystem. In fact PURI might not be the best choice here either. Also, even if the URLs are different, more than one of them might have the same filename, meaning there should either be some error handling in there, or the URLs should be hashed to be used as filename or some other scheme accomplishing the same thing. Lastly, the download files should be checked for emptiness, wrong content (HTML bits would indicate a failed download too), broken images, etc.

Another nice thing to add would be xattr support for indicating where the file was downloaded from.

Unified Communication

posted on 2017-06-18 16:04:25

Lately I've been thinking again about some way to unify (my) usage of different communication channels. Part of that is the additional distraction and lack of ease of use for some of the applications I'm forced to use.

This is partially a feature of my habits, i.e. I'm not a mobile phone user. At all. But "apps" like WhatsApp, WeChat, while thankfully having web clients, still force me to use a comparatively clunky interface.

Then we have Slack, the IRC clone that will eat every other IRC clone there is. Or maybe not. In my case it's the primary business-related communication form, after email.

Good that I'm not using Facebook, the moloch, but I imagine for a lot of people it's the primary way to use the web and internet. I haven't researched Facebook integration at all though, so there might be ways of integrating it more or less easily.

The previously mentioned channels were all active ones, but there's also a lot of passive consumption going on. (RSS and blogs, forums,) reddit, Hacker News are all channels that I frequently use. In case of reddit and Hacker News of course there's is the active element of posting, commenting, voting, but I rarely do that, so they fall under passive for me too.

So again, why unification? For all the above, getting notified (if new content is available) is a pain, comparatively. Both in the sense that for some of them (chat) the threshold is quite low, so reacting in near real-time is important, while for others it's absolutely the other way round, even though I'm still habitually checking them in case I'm bored (see how that goes?).

Unifying them would allow to aggregate, combine, filter in a general fashion instead of having (or not having in most cases) a distinct way to do that for each channel.

So again, would that solve the problem? I'm doubtful. On the one hand, there's clearly a need to remove friction. On the other hand, the cost of implementing it, the lack of distinctive features for each channel (visual mostly) would also undermine some of information. Possibly only at the start, it's hard to tell. I can however say that using RSS readers for me never worked out, precisely because the visual appearance is such a strong discriminator to (not) consume content. Though rtv, a console-based reddit client, worked rather well for some highly text-based content.

What other considerations are there? Well, the split between different contexts would be one thing. There's at least the work/life split for me, if not for many others.

Fidelity, as in, most text-based content can be viewed on the console, even if it might look better in a different renderer (browser). Showing pictures/clips is difficult on the console, but there are ugly hacks that can work around that problem if absolutely necessary (I'm personally not a fan).

Amount, blogs are rather infrequent, but have lots of text per each post, chat is very high frequent comparatively, but only has a few words per "post".

Context, again, there's also different groups in the personal context, that is, e.g. family, friends, different hobbies and interests, with each group having a somewhat overlapping set of sources.

So again, what can be solved? Technically, at least getting more sources into a single format is achievable. There are bridges from Slack to IRC, from RSS to IRC, etc. I'm choosing IRC here because it's a form of lowest common denominator, but similarly it could be mapped to email too. While IRC isn't good for long-form content, it can contain links which can then be viewed in other renderers, solving the notification issue. (Well, you still need to pay attention to the IRC client. In my case I'm always online on a VPS, so I need still to pass through notifications from the IRC client to the current local machine.)

What options would a unified architecture give us? E.g. having a single feed for chat, email, blog posts etc. for a group of people (channels). This can again be achieved manually, by tying in bots to post on behalf of a user, though in the architecture of IRC it wouldn't make sense to post some of these things publically - it's "your" view of the conversation, not the public view. That is, you'd want to view a feed with incoming emails, blog posts (Twitter, what have you) from a person inline.

Now, inertia. Given how XMPP basically failed and how each platform provider is aggressively trying to get people into their walled garden, what chance is there for a standard here?

Apart from that, can this idea be implemented purely client-side? AFAIK yes, there's still friction with the different technologies being integrated, but as a central communication hub this would still make sense.

Building on top I have some further (obvious) extensions in mind, the usual spam filters, deduplication, aggregation/search, also everything statistics basically, that can be applied on top.

Different interfaces would be available to have a view on the streams, or groups of streams. Traditionally this all hasn't worked out I feel, with the exception of very, very narrow things like email and text-based chat there's just a lot of variation going on.

How would this look like? For me, one console window to show everything, with desktop notifications on top. For others, the same in a browser perhaps, or (take a deep breath) a native application instead.

In any case, food for thought. I'm hoping to follow up on this with a more focused design at some point.

Hacking Java fields in ABCL

posted on 2017-06-06 23:41:05

Just as a quick note, JSS and the JAVA package too won't allow you to treat LispObjects objects as JavaObjects for the purposes of JFIELD and JSS:GET-JAVA-FIELD. But if you still want to access internal fields and implementation details (the usual warnings apply!), try the following:

(jss:get-java-field (jss:new 'JavaObject (car (threads:mapcar-threads #'identity))) "javaThread" T)

This was because I was looking for an answer to a question on #abcl, but even then, wrapping the "Lisp" object in a JavaObject manually helps achieve the requested goal of retrieving the Java Thread object.

The T at the end is necessary here because the field is actually package-private - but that's to be expected if you want to access internals without an (official) API.

Byte code verification and ABCL

posted on 2017-06-01 19:51:41

A feature of JVM byte code that I knew about, but didn't concern me too much when I previously hacked on ABCL was interesting to see in a particular bug report, where a function, when compiled, got rejected by the byte code validator:

(compile NIL (lambda (list) (nth (lambda ()) list)))
;; => Compiled function can't be loaded

This is true for any class loader, so COMPILE-FILE with LOAD would show the same behaviour.

The reason here is inlining: In order to give better performance when some information about the types is known, the call to NTH will actually be optimised here. One option to see this is to set a debug option to dump the generated byte code while the compilation runs:

(setf jvm::*compiler-debug* T)

(Part) of the output will look like this:

  0 GETSTATIC      #24 <Field LispObject 42c1_92be_e5b48c1894c6.LFUN1812724>  
  1 ALOAD          (1)  
  2 SWAP             
  3 INVOKEVIRTUAL  #30 <Method LispObject LispObject.NTH(int)> -1 
  4 ARETURN

Using DISASSEMBLE is not an option precisely because the byte code can't be loaded at all (that's actually a nice idea as another function to dump not only the compiled FASL content, but also disassemble the contained byte code too).

With that option set (and perhaps setting a few BREAK statements in the compiler), it's somewhat easy to debug the compiled byte code and to see that the validator notices that the Java method that implements the NTH function requires an integer (fixnum) argument, but the LispObject (the lambda) doesn't match that signature. This is not a problem if this was a regular call. In fact, with inlining disabled the NTH function will still raise an error for an argument with the wrong type!

Finally the fix is to check for the derived type in the "P2" transformation function for NTH, COMPILE-NTH:

(define-inlined-function compile-nth (form target representation)
  ((check-arg-count form 2))
  (let* ((index-form (second form))
         (list-form (third form))

    ;;; new check here
         (index-type (derive-compiler-type index-form)))
    (unless (fixnum-type-p index-type)
      (compile-function-call form target representation)
      (return-from compile-nth))
    ;;; till here

    (with-operand-accumulation
        ((compile-operand index-form :int)
         (compile-operand list-form nil)
         (maybe-emit-clear-values index-form list-form))
      (emit 'swap)
      (emit-invokevirtual +lisp-object+ "NTH" '(:int) +lisp-object+))
    (fix-boxing representation nil) ; FIXME use derived result type
    (emit-move-from-stack target representation)))

Note the falling back to a "general" function call using COMPILE-FUNCTION-CALL here in case the type is not known in advance, or not a fixnum type (though that could also raise a warning here already).

Again, compiling the above function again looks a bit different in the general case:

  0 GETSTATIC      #29 <Field Symbol 4faf_99d9_62a0381d4d65.SYM1814566>  
  1 GETSTATIC      #33 <Field LispObject 4faf_99d9_62a0381d4d65.LFUN1814565>  
  2 ALOAD          (1)  
  3 INVOKEVIRTUAL  #39 <Method LispObject LispObject.execute(LispObject,LispObject)> -2 
  4 ARETURN

If instead a fixnum constant is used, (nth 1 list), this simplifies a lot:

  0 ICONST_1         
  1 ALOAD          (1)  
  2 SWAP             
  3 INVOKEVIRTUAL  #24 <Method LispObject LispObject.NTH(int)> -1 
  4 ARETURN

Compare that to adding a (declare (type fixnum ...)) declaration - not as good as a constant argument, but still directly calling the Java method:

  0 ALOAD          (1)  
  1 INVOKEVIRTUAL  #24 <Method int LispObject.intValue()> 0 
  2 ISTORE         (1)  
  3 ILOAD          (1)  
  4 ALOAD          (2)  
  5 SWAP             
  6 INVOKEVIRTUAL  #28 <Method LispObject LispObject.NTH(int)> -1 
  7 ARETURN

Note here that the type error (e.g. for again supplying the (lambda ())) will be raised at the caller site already!

Lastly, a good idea would also be to generally add more hints, as e.g. SBCL does, to debug other issues ("failed to inline because ..."), but that's for another day.

Memory reordering in Common Lisp and Kotlin

posted on 2016-11-23 21:43:41

I've been reading about multi-threaded programming out of curiosity over lock-free algorithms.

The first thing I've implemented for that was actually something completely different, namely a short example of memory reordering ported over from a C++ program.

In Common Lisp this relies on SBCL because of the availability of memory barriers and the Linux-specific API to set the thread affinity to a single CPU core (sb-cpu-affinity). Of course, given the appropriate compability libraries this could easily be done on other implementations too.

(in-package #:cl-user)

(defun main (&key single-cpu-p barrierp)
  (let ((begin-semaphore-1 (sb-thread:make-semaphore))
        (begin-semaphore-2 (sb-thread:make-semaphore))
        (end-semaphore (sb-thread:make-semaphore))
        (iterations 0)
        (detected 0)
        (x 0)
        (y 0)
        (r1 0)
        (r2 0)
        done)
    (declare (fixnum x y r1 r2))
    (declare (boolean done))
    (declare (optimize (speed 3) (safety 0) (space 0) (debug 0) (compilation-speed 0)))
    (labels ((single-cpu ()
               (when single-cpu-p
                 sb-cpu-affinity::
                 (with-cpu-affinity-mask (mask :save T)
                   (clear-cpu-affinity-mask mask)
                   (setf (cpu-affinity-p 0 mask) T))))
             (thread-1 ()
               (single-cpu)
               (loop
                 (when done
                   (return))
                 (sb-thread:wait-on-semaphore begin-semaphore-1)
                 (setf x 1)
                 (when barrierp
                   (sb-thread:barrier (:memory)))
                 (setf r1 y)
                 (sb-thread:signal-semaphore end-semaphore)))
             (thread-2 ()
               (single-cpu)
               (loop
                 (when done
                   (return))
                 (sb-thread:wait-on-semaphore begin-semaphore-2)
                 (setf y 1)
                 (when barrierp
                   (sb-thread:barrier (:memory)))
                 (setf r2 x)
                 (sb-thread:signal-semaphore end-semaphore))))
      (let ((thread-1 (sb-thread:make-thread #'thread-1))
            (thread-2 (sb-thread:make-thread #'thread-2)))
        (unwind-protect
             (loop
               (setf x 0 y 0)
               (sb-thread:signal-semaphore begin-semaphore-1)
               (sb-thread:signal-semaphore begin-semaphore-2)
               (sb-thread:wait-on-semaphore end-semaphore)
               (sb-thread:wait-on-semaphore end-semaphore)
               (when (and (eql r1 0) (eql r2 0))
                 (incf detected)
                 (format T "~D reorders detected after ~D iterations, every ~D on average~%" detected iterations (floor iterations detected)))
               (incf iterations))
          (setf done T)
          (sb-thread:signal-semaphore begin-semaphore-1)
          (sb-thread:signal-semaphore begin-semaphore-2)
          (sb-thread:join-thread thread-1)
          (sb-thread:join-thread thread-2))))))

On the JVM, respectively Kotlin, the code looks again remarkably similar. I haven't yet looked at thread affinity, but what's interesting here are two aspects related to the JVM memory model. Using volatile is, in contrast to C, viable, as it generates the necessary memory barriers.

import java.util.concurrent.Semaphore
import kotlin.concurrent.thread

class HelloWorld {
    var beginSempahore1 = Semaphore(0)
    var beginSempahore2 = Semaphore(0)
    var endSemaphore = Semaphore(0)

    var detected = 0
    var iterations = 0

    @Volatile var x = 0
    @Volatile var y = 0
    var r1 = 0
    var r2 = 0

    fun run() {
        thread {
            while (true) {
                beginSempahore1.acquire()
                x = 1
                r1 = y
                endSemaphore.release()
            }
        }

        thread {
            while (true) {
                beginSempahore2.acquire()
                y = 1
                r2 = x
                endSemaphore.release()
            }
        }

        while (true) {
            iterations += 1
            x = 0
            y = 0
            beginSempahore1.release()
            beginSempahore2.release()
            endSemaphore.acquire()
            endSemaphore.acquire()
            if (r1 == 0 && r2 == 0) {
                detected++
                println("$detected reorders detected after $iterations iterations")
            }
        }
    }
}

fun main(args: Array<String>) {
    HelloWorld().run()
}

Also, when using a JVM plugin, the generated assembly code can be dumped during compilation, which allows us to confirm whether those instructions actually have been generated; for the first loop:

  0x00007f9ad4f503c9: mov    (%rcx),%r11d       ;*getfield this$0
                                                ; - HelloWorld$run$1::invoke@11 (line 21)

  0x00007f9ad4f503cc: test   %r11d,%r11d
  0x00007f9ad4f503cf: je     0x00007f9ad4f507a9
  0x00007f9ad4f503d5: movl   $0x1,0x14(%r12,%r11,8)
  0x00007f9ad4f503de: lock addl $0x0,(%rsp)     ;*putfield x
                                                ; - HelloWorld::setX@2 (line 12)
                                                ; - HelloWorld$run$1::invoke@15 (line 21)

  0x00007f9ad4f503e3: mov    (%rcx),%r10d       ;*getfield this$0
                                                ; - HelloWorld$run$1::invoke@19 (line 22)

  0x00007f9ad4f503e6: test   %r10d,%r10d
  0x00007f9ad4f503e9: je     0x00007f9ad4f507cd  ;*invokevirtual getY
                                                ; - HelloWorld$run$1::invoke@26 (line 22)

  0x00007f9ad4f503ef: mov    0x18(%r12,%r10,8),%r11d  ;*getfield y
                                                ; - HelloWorld::getY@1 (line 13)
                                                ; - HelloWorld$run$1::invoke@26 (line 22)

  0x00007f9ad4f503f4: mov    %r11d,0x1c(%r12,%r10,8)  ;*putfield r1
                                                ; - HelloWorld::setR1@2 (line 14)
                                                ; - HelloWorld$run$1::invoke@29 (line 22)

And for the second loop:

  0x00007f9ad4f47269: mov    (%rcx),%r11d       ;*getfield this$0
                                                ; - HelloWorld$run$2::invoke@11 (line 30)

  0x00007f9ad4f4726c: test   %r11d,%r11d
  0x00007f9ad4f4726f: je     0x00007f9ad4f47629
  0x00007f9ad4f47275: movl   $0x1,0x18(%r12,%r11,8)
  0x00007f9ad4f4727e: lock addl $0x0,(%rsp)     ;*putfield y
                                                ; - HelloWorld::setY@2 (line 13)
                                                ; - HelloWorld$run$2::invoke@15 (line 30)

  0x00007f9ad4f47283: mov    (%rcx),%r10d       ;*getfield this$0
                                                ; - HelloWorld$run$2::invoke@19 (line 31)

  0x00007f9ad4f47286: test   %r10d,%r10d
  0x00007f9ad4f47289: je     0x00007f9ad4f4764d  ;*invokevirtual getX
                                                ; - HelloWorld$run$2::invoke@26 (line 31)

  0x00007f9ad4f4728f: mov    0x14(%r12,%r10,8),%r11d  ;*getfield x
                                                ; - HelloWorld::getX@1 (line 12)
                                                ; - HelloWorld$run$2::invoke@26 (line 31)

  0x00007f9ad4f47294: mov    %r11d,0x20(%r12,%r10,8)  ;*putfield r2
                                                ; - HelloWorld::setR2@2 (line 15)
                                                ; - HelloWorld$run$2::invoke@29 (line 31)

And the main loop:

  0x00007fd383813838: lock addl $0x0,(%rsp)     ;*putfield x
                                                ; - HelloWorld::run@58 (line 38)

  0x00007fd38381383d: mov    $0x0,%edx
  0x00007fd383813842: mov    %edx,0x18(%rsi)
  0x00007fd383813845: lock addl $0x0,(%rsp)     ;*putfield y
                                                ; - HelloWorld::run@63 (line 39)

  0x00007fd38381384a: mov    0x24(%rsi),%edi

Note that in contrast to the original and the Common Lisp version there's one additional memory barrier here that's unnecessary.

EverythingDB

posted on 2016-11-23 21:40:04

How do you track everything you know? Computers and mobile phones in turn have become external memory, however data is still fragmented, stored in different formats and usable in different programs.

While there are partial solutions to this issue, none seem to be all-encompassing enough.

For example, PIMs come close for communication and contacts, but it's hard to extend and customise them. You should be able to associate custom tags and key-value pairs with "documents"/entities. This goes straight into ontologies since the question how to model many kinds of relations will be a problem to deal with.

Lastly tracking all data only makes sense if there are reasonable search facilities and if exchange with other databases can be done with security in mind, especially if even metadata is sensitive.

Previous Next

This blog covers work, unix, tachikoma, scala, sbt, redis, postgresql, no-ads, lisp, kotlin, jvm, java, hardware, go, git, emacs, docker

View content from 2014-08, 2014-11, 2014-12, 2015-01, 2015-02, 2015-04, 2015-06, 2015-08, 2015-11, 2016-08, 2016-09, 2016-10, 2016-11, 2017-06, 2017-07, 2017-12, 2018-04, 2018-07, 2018-08, 2018-12, 2020-04, 2021-03