Developers Planet

January 16, 2019

Marcin Juszkiewicz

Commodore: The Final Years book

About a year ago a friend convinced me to buy the book “Commodore: The Amiga Years” by Brian Bagnall. It described what Commodore looked like right after Jack Tramiel left: buying the Amiga company, the release of the Amiga 1000, and then the A500 and A2000 era.

In April I backed another project by Brian Bagnall: the “Commodore: The Final Years” book. This one describes the 1987 – 1994 period, from the A500/A2000 releases to the company's end.

Many stories have been written and told about Commodore. Some people say that Mehdi Ali was the main reason it collapsed; there are other theories too. Brian Bagnall's book gives a better explanation than any story I have read before.

So why did C= end?

For me this is the wrong question. I would rather ask “How did C= manage to survive so long?”…

It looked like a mix of terrible management and good engineers: all those people working on whatever they wanted to design, then being moved between projects, with a complete lack of deadlines (at least sensible ones).

All those crazy machines

For example the Commodore 64D… The idea was to add a 1581 disk drive into the C64 case. The side effect? C64 users buying 1581 disk drives (because software was released on 3.5″ floppies) and C64D owners buying 1541 drives to get access to older releases on 5.25″ floppies.

Or all the work on the Commodore 65… The wet dream of some of my retro friends: an 8-bit machine with superior capabilities (compared to the C64/C128/C+4) but several years too late. A huge number of work hours was spent on designing the 4502 CPU, the VIC-III and other chips, prototype boards, an operating system, a new BASIC…

Then the Amiga 300 (released as the A600), which had all expansion possibilities removed, only to deny GVP a chance to earn money on extensions. I used that model. It was terrible, but it allowed me to have a hard drive, so I bought it instead of an A500+.

And all that work done on the AAA chipset, which was far beyond everything else when they started, but quickly became not-so-magical once the PC market got SVGA cards, PCI slots…

Company related stories

The book is not only about how many computer models were on designers’ tables. There are many stories related to the company and the people working there: how Commodore interacted with communities, how people worked in the 80s/90s, Internet/UUCP use in those years, hobbies of Commodore employees and a lot more.

Impact on technology market

Some readers may remember the CDTV model. It was an Amiga 500(+) with a CD-ROM drive, all put in a HiFi-like case. But not so many know about the CDTV-CR. It was a cost-reduced version based on the A600, with a much cheaper CD-ROM drive based on a cheap ‘discman’-like mechanism, with the electronics done from scratch. According to the book, its creation lowered the prices of CD-ROM drives for the whole industry.


For me this book (and the previous one) is a must-read for any Amiga fan: lots of interesting details about released (and unreleased) models and accessories, some notes about software, and many stories from Commodore employees and people who cooperated with the C= company through the years.

My Amiga story

I was an Amiga user in 1995 – 1999. First an Amiga 600 with a 425MB hard drive and 2MB of RAM, then an A1200 which I moved into a PC tower case. At the end it had a 68040 CPU at 40MHz, 64MB of RAM and a 17GB hard drive (connected to a FastATA controller). AmigaOS was nice and I learnt a lot, but the hardware became slow, so I sold it all in 1999 and moved to a PC running Debian.

by Marcin Juszkiewicz at January 16, 2019 18:53

January 15, 2019

Naresh Bhat

usermod/groupmod tools - Rename username and usergroup in Ubuntu

A laptop often comes with Ubuntu preinstalled, with a default username and user group already created. This blog explains how you can rename that default username and group to your own.

Unix-like operating systems decouple the user name from the user identity, so you may safely change the name without affecting the ID. All permissions, files, etc. are tied to your identity (UID), not your username.
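A quick way to see this in action (the file path is just an example):

```shell
# Files store the numeric owner (uid/gid), not the user name:
touch /tmp/uid-demo
ls -ln /tmp/uid-demo   # -n shows the raw uid/gid numbers
ls -l  /tmp/uid-demo   # names are looked up from /etc/passwd at display time
rm /tmp/uid-demo
```

The `-n` output is what the filesystem actually records, which is why a rename is safe.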

To manage every aspect of the user database, you use the usermod tool. To change the username (it is probably best to do this while not logged in as that user):

STEP 1: Reboot your laptop with 1 as a kernel command-line parameter.
With 1 as a parameter, the laptop will boot into rescue mode. You can also boot your laptop into single-user mode.

STEP 2: Change your root password with the command "passwd".
This is just to be secure in future, because one could easily get into your laptop using the default user password.

STEP 3: Rename oldUsername with newUsername

# usermod -l newUsername oldUsername

This, however, doesn't rename the home folder.

STEP 4: Rename to newHomeDir
To change home-folder, use

# usermod -d /home/newHomeDir -m newUsername

after you have changed the username.

STEP 5: Rename the groupName

# groupmod --new-name NEW_GROUP_NAME OLD_GROUP_NAME

Now you can reboot your laptop into multi-user mode and log in as the new user at the prompt.
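Once rebooted, you can verify the rename (a sketch; 'newUsername' is the example name used in the steps above, substitute your own):

```shell
# Verify the rename took effect; 'newUsername' is the example name from above.
NEW=newUsername
if getent passwd "$NEW" >/dev/null; then
    id "$NEW"                            # uid/gid unchanged, new names shown
    getent passwd "$NEW" | cut -d: -f6   # confirms the new home directory
else
    echo "user $NEW not found"
fi
```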

by Naresh Bhat at January 15, 2019 17:01

January 13, 2019

Leif Lindholm

Building TianoCore with Visual Studio (on ARM64)


EDK2/TianoCore has a very complex build system. Part of that is to let developers build with vastly different toolchains (GCC, CLANG, Visual Studio, ICC, XCODE). But it also provides different profiles for different versions of these toolchains.

(As a side note, this is what leads to the frequently repeated misconception that EDK2 cannot be built with GCC later than version 5. The reality is that GCC behaviour and command line options have remained stable enough since version 5 that we haven't needed to add new profiles, and the GCC5 profile works fine for 6-8.)

From the start, the ARM/AARCH64 ports were developed using ARM's commercial toolchain and GCC, whereas on the Ia32/X64 side most of the development has tended to happen with Visual Studio (GCC mainly being used for Ovmf). This means that a developer moving from x86 to ARM has not only had to get used to a new architecture, but has also had to deal with a new toolchain.

Installing the tools

Visual Studio

Visual Studio 2017 has included ARM/AARCH64 support since release 15.4. Not publicly announced, and not complete - but sufficient to build firmware, and UEFI applications and drivers. And with release 15.9, the support is now public and complete. Which makes for a good time to ensure we can provide a familiar development environment for those already using Visual Studio.

So I set out to make myself a development environment in which I could build all current architectures in the same environment - and in Visual Studio. And since I have my new ARM64 laptop, I'll make sure to get it working there.

There is no native Visual Studio for arm64, but the (32-bit) x86 version runs just fine.

Search for it in the Microsoft Store, or go straight to the download page. The Community Edition is sufficient, and is free (as in beer) for individuals or open source development.

I'm not going to go through downloading and starting the installer and how to press the Next button, but a few things are worth mentioning.

First, you don't need to install everything in order to get the basic toolchain functionality. I opted for the "Linux development with C++" toolset and ended up with what I needed. Screenshot of VS2017 installer toolset selection.

Second, make sure the components "Visual C++ compilers and libraries for ARM", "Visual C++ compilers and libraries for ARM64" and "Python 2 32-bit" are selected. Screenshot of VS2017 installer component selection.


For building EDK2 for Ia32/X64, you may also need nasm. Currently, there is no arm64 build of nasm for Windows, but again the 32-bit x86 variant does the job. (It also won't currently build with Visual Studio, so that's not a way to get a native one.)


Acpica-tools (including iasl for building ACPI tables) comes in a .zip file (32-bit x86). Rather ungracefully, the Visual Studio build profile simply assumes the binaries from this archive have been extracted and placed in C:\ASL, so do that.


If you don't want to rely completely on the Visual Studio git integration, the 32-bit x86 variant of git available from here works fine.


Open the Visual Studio Developer Command Prompt directly (don't worry about the GUI). Then, from your edk2 directory, run:

C:\git\edk2>set PYTHON_HOME=C:\Python27
C:\git\edk2>set NASM_PREFIX=C:\Program Files (x86)\NASM\
C:\git\edk2>edksetup.bat rebuild 

to build the native BaseTools and set up the build environment. This will complete with a warning that !!! WARNING !!! No CYGWIN_HOME set, gcc build may not be used !!!, which is fine, because we're not using GCC.

After this, the build command works as usual - VS2017 is the toolchain profile we want. So, to build OvmfPkg for X64:

C:\git\edk2>build -a X64 -t VS2017 -p OvmfPkg\OvmfPkgX64.dsc

Or to build HelloWorld for AARCH64:

C:\git\edk2>build -a AARCH64 -t VS2017 -p MdeModulePkg\MdeModulePkg.dsc -m MdeModulePkg\Application\HelloWorld\HelloWorld.inf

What's missing?

Thanks to Pete Batard, support for building UEFI applications and drivers for AARCH64 was already available upstream. So for Option ROM drivers or UEFI command line utilities, you should be good to go.

However, since we've really only used GCC/CLANG for the port up till now, we're lacking assembler files using a compatible syntax. In addition to this, when trying to build whole platform support, there are several issues with (ARM-specific) C source files that have never before been compiled with Visual Studio.

I started ploughing through this at the end of last year - a hacked-up version leaving many asm implementations empty (just so I could get through and identify all of the C issues) is available in one of my working branches. Of course, this appears to have suffered some bitrot (and a change in behaviour with VS 15.9), so I will get back to it over the next few weeks. And as always, if you're impatient - patches welcome!

by Leif Lindholm at January 13, 2019 00:00

January 10, 2019

Marcin Juszkiewicz

Thunderbird sucks

The Mutt website says: “All mail clients suck. This one sucks less.” Mutt-NG then adds: “Mutt sucks less – but it still sucks!” And this is true. Thunderbird definitely sucks. But I still use it.

I have three email accounts: a private one (own server, Dovecot + Postfix) and two work ones on Gmail. And a 240GB SSD for the /home partition, so from time to time I have to clean something up. This time it was ~/.thunderbird/ taking about 48GB…

But 48GB is still better than the 100GB it took in the past. Still far too much. So let's check why it takes so much.
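A quick way to get per-folder numbers is a small shell loop over the profile (a sketch; the path below assumes the default Maildir layout for IMAP accounts, adjust to taste):

```shell
# Report on-disk size and file count for each mail folder in the profile.
for d in ~/.thunderbird/*/ImapMail/*/; do
    [ -d "$d" ] || continue            # skip if the glob matched nothing
    size=$(du -sh "$d" | cut -f1)
    files=$(find "$d" -type f | wc -l)
    printf '%8s  %7d files  %s\n' "$size" "$files" "$d"
done
```

Comparing those numbers against what Thunderbird reports per folder makes the duplication obvious.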

The biggest folder had 57,900 files (1.8GB total). Thunderbird said it had 1801 messages, 48.9MB in total. Each message was stored over THIRTY times. Maildir format, Gmail account.

Then I went to another folder: 823MB in 77,345 files on disk. Thunderbird said 3107 messages, 30.7MB in total. Maildir format again, same Gmail account.

OK, let's check the Dovecot one. Turns out that this account is mbox based. The biggest folder was 2.1GB on disk, 526MB according to Thunderbird.

For each of them, I went with the “repair folder” button, which amounts to “drop whatever is on disk and fetch it again from the server”.

And then I went through the folders, and disk usage dropped all the way to 14GB. But now it has started to fetch everything again, so time will tell how it ends.

Bug reported. I do not care much whether it gets solved or ignored.


Evolution was worse than Thunderbird when I tried to use it one or two years ago. I do not remember the details now.

KMail is not usable due to its lack of OAuth support for SMTP. Code has been written for it, but it is not available in Fedora yet.

by Marcin Juszkiewicz at January 10, 2019 10:45

Leif Lindholm

A long time coming

For a very long time now, I have put effort into dogfooding. Back when I first started working at ARM in 2005, all available ARM platforms you might even consider using for normal computing were ridiculously expensive. But finally, in 2008, something changed.


The BeagleBoard was the first fundamental change in how embedded development boards were marketed and sold. It was open hardware. It was backed by open source software. And it was cheap. It was released into a market where it was "simply common sense" that you couldn't turn a profit on a sub-$1000 development board, and it sold for < $200.

It wasn't brilliant - early revisions had serious issues with the USB host port, so a non-standard cable was needed to force the OTG port into host mode, and then you had to attach networking, keyboard, mouse and any other peripherals you wanted (without reaching for a soldering iron) through a hub connected to that single port. But you could run a normal graphical desktop environment on it!

This opened the door for a bunch of follow-ons, including the Raspberry Pi, but there was really nothing game-changing for a number of years until...


When Google launched the Chromebook product line, they were initially all x86-based.

Samsung Series 3

But eventually, Samsung released the Series 3, and apart from the risk of setting your crotch on fire, the Crouton project made it quite easy to convert it into something resembling a Linux laptop.

The underlying business model of course meant that it was intentionally short on local storage, and cost-cutting meant it was short on RAM even for running a web browser a couple of years down the line - but it was an actual thing I could bring instead of an x86 laptop when going to conferences, both for hacking and for giving presentations.

I remain fairly convinced mine held the only armhf->ia64 cross compilation toolchain the world has ever seen, at least used in anger (for compile testing changes to the Linux EFI subsystem).

Samsung Chromebook 2

A couple of years later, Samsung followed up with the Chromebook 2, offering a model with 4 cores, a larger (and better) screen, and twice the amount of RAM. So I got one of those, but frankly, the shortage of local storage combined with the unreliability of uSD or USB storage across suspend/resume meant I eventually stopped using it for local builds.

Samsung Chromebook Plus

Well, Samsung eventually decided to give up on selling Chromebooks (and possibly even laptops) in Europe, so I had to import one from across the pond. But this one was 64-bit! And the screen was a serious step up from the previous ones, and the chassis was metal instead of plastic. Apart from that, it wasn't that much of an upgrade - but since my work was pretty much exclusively on 64-bit, it was still a useful thing to move to.

Marvell/SolidRun MacchiatoBIN

The MacchiatoBIN also deserves a mention. It remains the only platform I would recommend to a hobbyist without a list the length of my ARM of caveats. That doesn't mean there isn't such a list, just that it's shorter, and the issues easier to live with. This actually works pretty OK as a primary desktop system, and I used it for that for several months.

Biggest things it got right compared to competition

  • Mini-ITX form factor - fits in any regular PC case.
  • Onboard SATA.
  • Onboard PCIe (one open-ended x4 slot).
  • USB3.
  • On-board connector for front panel USB2 (which is weird, but there are adapters).
  • Unbrickable - can load firmware from uSD.

Biggest issues are

  • Very restrictive on which DIMMs are supported.
  • EDK2 port not yet fully upstream.
  • FTDI serial console flaky (when debugging early system firmware).
  • Non-ATX-like handling of power: turns on as soon as the cable is inserted, no soft power off.

Windows on ARM

Then, finally, devices running Windows (not Windows RT) trickled onto the market at the end of Q1 last year (2018). There was allegedly a contender from Asus, but that never materialised as available for me to buy either here in the UK, in the US or in Taiwan - until a couple of weeks ago.

HP Envy X2

So the first one I got to have a look at was the HP Envy X2 - really a tablet that comes with a keyboard built into its screen protector. I had some brief time with one during Linaro Connect in Hong Kong 2018, but then Linaro got me one to have a closer look, shortly before the subsequent Connect in Vancouver.

While it tries to encourage you to use cloud storage, it actually came with 128GB of onboard storage. This was really useful, because it let me get started figuring out how to build EDK2 under Visual Studio (posts to follow on this). It ended up being quite usable on long haul flights (and related time in airports).

But, this first wave of devices were based on the Qualcomm Snapdragon 835, which was slightly lacking in horsepower - something that got even worse once Spectre/Meltdown mitigations were rolled out.

And it still only had 4GB of RAM. The same as the phone I bought early 2017, and the same as the Chromebook I bought in 2014!

New laptop

So why this retrospective post?

Well, Tuesday this week I noticed that the first of the Snapdragon 850 laptops was finally available to buy in the UK.

Lenovo Yoga C630

Yeah, I may have ordered one of these. I may in fact be typing this post on it. It may also now be out of stock.

The Lenovo Yoga C630 is an octa-core system built like a proper laptop. Solid (and very sleekly stealth looking) metal chassis. The keyboard has very short travel, which some people might hate, but I like it better than the Chromebook and Macbook ones.

Picture of Lenovo Yoga C630

The screen seems OK, and the machine feels a lot snappier than the Envy X2 did. But even more importantly, it comes with 8GB of RAM. The 128GB of onboard storage (the only variant available to buy, although they claim a 256GB one also exists) sits on a UFS interface rather than eMMC like the Chromebooks. This makes a substantial difference to performance.

The Yoga ships with a Windows 10 Home licence. Upgrading that to Windows 10 Pro would set you back another £120 and push the total cost over £1000. If those extra features had been important to me, that may well have turned this device too expensive. They weren't for me, so I'm sticking with Home.

State of Windows on ARM(64)

Well, this is very much at a "first impressions" sort of level but...

Windows in S mode

All of these ARM-based laptops ship in S mode. What this means is basically that you can only install programs from the Microsoft Store. Clearly not very useful for me, but just like the default locked-down-ness of the Chromebooks - it really makes sense for what the majority of computer users need, and it does improve device security.

I'm totally OK with this, because it is optional. But it's also worth noting that, unlike on Chromebooks, there is no way to switch back into S mode once you've made the jump.


What makes these laptops potential replacements for existing Windows users is that they provide dynamic binary translation for existing x86 applications. Worth noting is that only 32-bit applications are supported, but that means most of your standard applications will just work (albeit more sluggishly than when running natively).

Windows Subsystem for Linux

WSL is available in the default installation. You only need to enable it before going to the Microsoft Store (search for "WSL") to install your (mainstream) distribution(s) of choice.

Picture of Ubuntu, openSUSE, SLES, Debian and Kali in the Microsoft Store

Excellent, so that means I can do work both with Visual Studio and in a proper Linux environment simultaneously? No :( Not yet. As I said, these devices only made it into the hands of real users less than a year ago, so fixes for issues that were picked up by people using them in anger haven't made it into the stable releases yet. One such issue is currently blocking me from doing my day job on the Yoga.

So I guess the way forward for me is to sign up as a Windows Insider and jump on the "slow track", to get early access to new features (but not quite drink from the firehose).

Edit: signed up as a Windows Insider, now running Version 1809, and this problem has gone away!

Browser support

When I got the Envy X2, I pretty much had the choice of native Edge or emulated Chrome/Firefox. But in a case of excellent timing, there are now native nightly builds of Firefox for arm64. Although it comes with the disclaimer "even nightlier than our normal Nightlies", I have not so far come across any issues.


With WSL you can certainly use your regular Linux ssh command, but if coming from a Windows environment already, it may be useful to know there are already snapshot builds of PuTTY available for both native 32-bit and 64-bit ARM.

Who are you and what have you done with Leif?

I'm me!

And I'm certainly going to look into being able to run Linux directly on this platform.

The nonsense that was "UEFI Secure Boot must not be possible to disable on ARM devices" does not apply to this class of devices, so that is not a blocker preventing this work. And once we have it working, we want to boot Linux with Secure Boot enabled.

But for now I'm going to do some dogfooding on Windows, and try to help find bugs and document my progress.

by Leif Lindholm at January 10, 2019 00:00

January 07, 2019

Steve McIntyre

Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

I've posted this analysis to Debian mailing lists already, but I'm thinking it's also useful as a blog post too. I've also fixed a few typos and added a few more details that people have suggested.

This has taken a while in coming, for which I apologise. There's a lot of work involved in rebuilding the whole Debian archive, and many days spent analysing the results. You learn quite a lot, too! :-)

I promised way back before DebConf 18 last August that I'd publish the results of the rebuilds that I'd just started. Here they are, after a few false starts. I've been rebuilding the archive specifically to check if we would have any problems building our 32-bit Arm ports (armel and armhf) using 64-bit arm64 hardware. I might have found other issues too, but that was my goal.

The logs for all my builds are online at

for reference. See in particular

for automated analysis of the build logs that I've used as the basis for the stats below.

Executive summary

As far as I can see we're basically fine to use arm64 hosts for building armel and armhf, so long as those hosts include hardware support for the 32-bit A32 instruction set. As I've mentioned before, that's not a given on all arm64 machines, but there are sufficient machine types available that I think we should be fine. There are a couple of things we need to do in terms of setup - see Machine configuration below.


I (naively) just attempted to rebuild all the source packages in unstable main, at first using pbuilder to control the build process and then later using sbuild instead. I didn't think to check on the stated architectures listed for the source packages, which was a mistake - I would do it differently if redoing this test. That will have contributed quite a large number of failures in the stats below, but I believe I have accounted for them in my analysis.

I built lots of packages, using a range of machines in a small build farm at home:
  • Macchiatobin
  • Seattle
  • Synquacer
  • Multiple Mustangs

using my local mirror for improved performance when fetching build-deps etc. I started off with a fixed list of packages that were in unstable when I started each rebuild, for the sake of simplicity. That's one reason why I have two different numbers of source packages attempted for each arch below. If packages failed due to no longer being available, I simply re-queued using the latest version in unstable at that point.

I then developed a script to scan the logs of failed builds, picking up on patterns that matched obvious causes. Once that was done, I worked through all the failures to (a) verify those patterns, and (b) identify any other failures. I've classified many of the failures to make sense of the results. I've also scanned the Debian BTS for existing bugs matching my failed builds (and linked to them), or filed new bugs where I could not find matches.
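The core of such a classification pass can be sketched roughly like this (a minimal sketch: the directory layout and signatures here are illustrative, not the actual script):

```shell
# Count failed-build logs matching a few known failure signatures.
for pat in 'Alignment problem' 'Segmentation fault' 'Illegal instruction'; do
    count=0
    for log in logs/failed/*.log; do
        [ -f "$log" ] || continue      # skip if the glob matched nothing
        grep -q "$pat" "$log" && count=$((count + 1))
    done
    printf '%4d log(s) showing %s\n' "$count" "$pat"
done
```

The real analysis linked each matching log to a BTS bug as well, but the counting stage is this simple.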

I did not investigate fully every build failure. For example, where a package has never been built before on armel or armhf and failed here I simply noted that fact. Many of those are probably real bugs, but beyond the scope of my testing.

For reference, all my scripts and config are in git at

armel results

Total source packages attempted 28457
Successfully built 25827
Failed 2630

Almost half of the failed builds were simply due to the lack of a single desired build dependency (nodejs:armel, 1289). There was a smattering of other notable causes:

  • 100 log(s) showing build failures (java/javadoc)
    Java build failures seem particularly opaque (to me!), and in many cases I couldn't ascertain if it was a real build problem or just maven being flaky. :-(
  • 15 log(s) showing Go 32-bit integer overflow
    Quite a number of Go packages blindly assume 64-bit sizes. That's probably fair, but seems unfortunate.
  • 8 log(s) showing Sbuild build timeout
    I was using quite a generous timeout (12h) with sbuild, but still a very small number of packages failed. I'd earlier abandoned pbuilder for sbuild as I could not get it to behave sensibly with timeouts.

The stats that matter are the arch-specific failures for armel:
  • 13 log(s) showing Alignment problem
  • 5 log(s) showing Segmentation fault
  • 1 log showing Illegal instruction

and the new bugs I filed:
  • 3 bugs for arch misdetection
  • 8 bugs for alignment problems
  • 4 bugs for arch-specific test failures
  • 3 bugs for arch-specific misc failures

Considering the number of package builds here, I think these numbers are basically "lost in the noise". I have found so few issues that we should just go ahead. The vast majority of the failures I found were either already known in the BTS (260), unrelated to what I was looking for, or both.

See below for more details about build host configuration for armel builds.

armhf results

Total source packages attempted 28056
Successfully built 26772
Failed 1284

FTAOD: I attempted fewer package builds for armhf as we simply had a smaller number of packages when I started that rebuild. A few weeks later, it seems we had a few hundred more source packages for the armel rebuild.

The armhf rebuild showed broadly the same percentage of failures, if you take into account the nodejs difference - it exists in the armhf archive, so many hundreds more packages could build using it.

In a similar vein for notable failures:

  • 89 log(s) showing build failures (java/javadoc)
    Similar problems, I guess...
  • 15 log(s) showing Go 32-bit integer overflow
    That's the same as for armel, I'm assuming (without checking!) that they're the same packages.
  • 4 log(s) showing Sbuild build timeout
    Only 4 timeouts compared to the 8 for armel. Maybe a sign that armhf will be slightly quicker in build time, so less likely to hit a timeout? Total guesswork on small-number stats! :-)

Arch-specific failures found for armhf:

  • 11 log(s) showing Alignment problem
  • 4 log(s) showing Segmentation fault
  • 1 log(s) showing Illegal instruction

and the new bugs I filed:

  • 1 bug for arch misdetection
  • 8 bugs for alignment problems
  • 10 bugs for arch-specific test failures
  • 3 bugs for arch-specific misc failures

Again, these small numbers tell me that we're fine. I linked to 139 existing bugs in the BTS here.

Machine configuration

To be able to support 32-bit builds on arm64 hardware, there are a few specific hardware support issues to consider.


Our 32-bit Arm kernels are configured to fix up userspace alignment faults, which hides lazy programming at the cost of a (sometimes massive) slowdown in performance when this fixup is triggered. The arm64 kernel cannot be configured to do this - if a userspace program triggers an alignment exception, it will simply be handed a SIGBUS by the kernel. This was one of the main things I was looking for in my rebuild, common to both armel and armhf. In the end, I only found a very small number of problems.

Given that, I think we should immediately turn off the alignment fixups on our existing 32-bit Arm buildd machines. Let's flush out any more problems early, and I don't expect to see many.
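The fixup behaviour is runtime-switchable through /proc/cpu/alignment (a sketch; this needs root, the file only exists on 32-bit Arm kernels, and the guard makes it a no-op elsewhere):

```shell
# /proc/cpu/alignment controls 32-bit Arm alignment fault handling:
# 1 = warn, 2 = fix up, 4 = deliver SIGBUS (values can be combined).
if [ -w /proc/cpu/alignment ]; then
    cat /proc/cpu/alignment       # current mode plus fault counters
    echo 4 > /proc/cpu/alignment  # SIGBUS on faults, matching arm64 behaviour
fi
```

Setting 4 makes a 32-bit kernel behave like arm64 does, so misaligned-access bugs show up the same way on both.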

To give credit here: Ubuntu have been using arm64 machines for building 32-bit Arm packages for a while now, and have already been filing bugs with patches which will have helped reduce this problem. Thanks!

Deprecated / retired instructions

In theory(!), alignment is all we should need to worry about for armhf builds, but our armel software baseline needs two additional pieces of configuration to make things work, enabling emulation for

  • SWP (low-level locking primitive, deprecated since ARMv6 AFAIK)
  • CP15 barriers (low-level barrier primitives, deprecated since ARMv7)

Again, there is quite a performance cost to enabling emulation support for these instructions but it is at least possible!
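On arm64 kernels these emulations are controlled through the legacy-instruction knobs under /proc/sys/abi (a sketch; needs root, arm64 only, and guarded so it does nothing on other machines):

```shell
# 0 = undefined (SIGILL), 1 = emulate, 2 = use hardware where supported.
for knob in /proc/sys/abi/swp /proc/sys/abi/cp15_barrier; do
    if [ -w "$knob" ]; then
        echo 1 > "$knob"   # enable (slow) emulation for armel userspace
    fi
done
```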

In my initial testing for rebuilding armhf only, I did not enable either of these emulations. I was then finding lots of "Illegal Instruction" crashes due to CP15 barrier usage in armhf Haskell and Mono programs. This suggests that maybe(?) the baseline architecture in these toolchains is incorrectly set to target ARMv6 rather than ARMv7. That should be fixed and all those packages rebuilt at some point.


  • Peter Green pointed out that ghc in Debian armhf is definitely configured for ARMv7, so maybe there is a deeper problem.
  • Edmund Grimley Evans suggests that the Haskell problem is coming from how it drives LLVM, linking to #864847 that he filed in 2017.

Bug highlights

There are a few things I found that I'd like to highlight:

  • In the glibc build, we found an arm64 kernel bug (#904385) which has since been fixed upstream thanks to Will Deacon at Arm. I've backported the fix for the 4.9-stable kernel branch, so the fix will be in our Stretch kernels soon.
  • There's something really weird happening with Vim (#917859). It FTBFS for me with an odd test failure for both armel-on-arm64 and armhf-on-arm64 using sbuild, but in a porter box chroot or directly on my hardware using debuild it works just fine. Confusing!
  • I've filed quite a number of bugs over the last few weeks. Many are generic new FTBFS reports for old packages that haven't been rebuilt in a while, and some of them look un-maintained. However, quite a few of my bugs are arch-specific ones in better-maintained packages and several have already had responses from maintainers or have already been fixed. Yay!
  • Yesterday, I filed a slew of identical-looking reports for packages using MPI and all failing tests. It seems that we have a real problem hitting openmpi-based packages across the archive at the moment (#918157 in libpmix2). I'm going to verify that on my systems shortly.

Other things to think about

Building in VMs

So far in Debian, we've tended to run our build machines using chroots on raw hardware. We have a few builders (x86, arm64) configured as VMs on larger hosts, but as far as I can see that's the exception so far. I know that OpenSUSE and Fedora are both building in VMs, and for our Arm ports, now that we have more powerful arm64 hosts available, it's probably the way we should go here.

In testing using "linux32" chroots on native hardware, I was explicitly looking to find problems in native architecture support. In the case of alignment problems, they could be readily "fixed up / hidden" (delete as appropriate!) by building using 32-bit guest kernels with fixups enabled. If I'd found lots of those, that would be a safer way to proceed than instantly filing lots of release-critical FTBFS bugs. However, given the small number of problems found I'm not convinced it's worth worrying about.
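For reference, a "linux32" build boils down to running sbuild under a 32-bit personality; an illustrative invocation (package name and options are examples, and the guard means it only runs where sbuild is installed) would be:

```shell
# On an arm64 host with A32 support, run an armhf build under a 32-bit
# personality, so uname -m inside the build reports a 32-bit machine type:
if command -v sbuild >/dev/null 2>&1; then
    linux32 sbuild --arch=armhf --dist=unstable hello
fi
```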

Utilisation of hardware

Another related issue is in how we choose to slice up build machines. Many packages will build very well in parallel, and that's great if you have something like the Synquacer with many small/slow cores. However, not all our packages work so well and I found that many are still resolutely chugging through long build/test processes in single threads. I experimented a little with my config during the rebuilds and what seemed to work best for throughput was kicking off one build per 4 cores on the machines I was using. That seems to match up with what the Fedora folks are doing (thanks to hrw for the link!).
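The sizing arithmetic above is trivial; a sketch of deriving the builder count on a given machine:

```shell
# One concurrent build per four cores, with a floor of one builder:
cores=$(nproc)
builders=$(( cores / 4 ))
[ "$builders" -ge 1 ] || builders=1
echo "running $builders concurrent build(s) on $cores core(s)"
```

On a 24-core Synquacer that gives six concurrent builds, each still free to use parallel make internally.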

Migrating build hardware

As I mentioned earlier, to build armel and armhf sanely on arm64 hardware, we need to be using arm64 machines that include native support for the 32-bit A32 instruction set. While we have lots of those around at the moment, some newer/bigger arm64 server platforms that I've seen announced do not include it. (See an older mail from me for more details.) We'll need to be careful about this going forwards and keep using (at least) some machines with A32. Maybe we'll migrate arm64-only builds onto newer/bigger A64-only machines and keep the older machines for armel/armhf if that becomes a problem?

At least for the foreseeable future, I'm not worried about losing A32 support. Arm keeps on designing and licensing ARMv8 cores that include it...


I've spent a lot of time looking at existing FTBFS bugs over the last weeks, to compare results against what I've been seeing in my build logs. Much kudos to people who have been finding and filing those bugs ahead of me, in particular Adrian Bunk and Matthias Klose who have filed many such bugs. Also thanks to Helmut Grohne for his script to pull down a summary of FTBFS bugs from UDD - that saved many hours of effort!


Please let me know if you think you've found a problem in what I've done, or how I've analysed the results here. I still have my machines set up for easy rebuilds, so reproducing things and testing fixes is quite easy - just ask!

January 01, 2019 12:57

November 30, 2018

Tom Gall

Kernel Testing News 11/30/2018

Nov 26th saw the release of 4.4.165, 4.9.141, 4.14.84 and 4.19.4

For these LTS kernel versions, results were reported upstream; no regressions were found.

2018-11-26: Rafael Tinoco – bug 4043 – Asked Greg to backport a fix for v4.4, Sasha forwarded to the mm list.

For Android Kernels, regressions were detected.


  • 4.14.84 + HiKey boot regression – observed with Android 9.0 and AOSP
  • 4.4.165 regressions:
    • VtsKernelSyscallExistence#testSyscall_name_to_handle_at – Unknown error: test case requested but not executed.
    • VtsKernelSyscallExistence#testSyscall_open_by_handle_at – Unknown error: test case requested but not executed.
    • VtsKernelSyscallExistence#testSyscall_uselib – Unknown error: test case requested but not executed.

No other regressions: 4.4.165 and 4.9.141 on Android 9.

X15: 4.14.84 + O-MR1 – Baselining activity has been particularly effective over the past two weeks, dropping the number of errors from 65 failing tests to 16 as of today. That’s really good progress towards setting a clean baseline.

Bug 4033: Sumit has been looking at the failing CtsBluetoothTestCases android.bluetooth.cts.BluetoothLeScanTest#testBasicBleScan and android.bluetooth.cts.BluetoothLeScanTest#testScanFilter failures.

These tests both pass across all kernels with 8.1. However, they fail with both 9.0 and AOSP. Looking at historical AOSP results, it appears that the failures there started in approximately the September timeframe.

Lastly, there were successful test builds and test boots to UI with 4.4.165 and 4.9.141 (with Android 9) using the newly released clang-r346389 compiler.

by tgallfoo at November 30, 2018 22:52

November 28, 2018

Marcin Juszkiewicz

Where and when was the mistake made?

I am tired of useless discussions. Tired of the “we are talking about servers and desktops, not toys” exchange which happens sooner or later in EVERY arm64 discussion. It was that way in the “Qt: GL or GLES on arm64” thread on the debian-arm ML, and recently on #debian-boot when I tried to find out how to get the graphical installer working on arm64.

A mistake was probably made at some point. Maybe aarch64 should have started with A72 cores, GICv3 and multicore server chips, with the mobile market getting fast v7 cores at the same time. To make a clean split.

On arm64, Fedora has had a graphical installer for the last few releases. It took a while to debug X11 and the kernel to find out why it required a config file when it should not. We wrote some patches (better than the ones in the linked post) and got them merged. I can take a Mustang, put in a graphics card and install the operating system using keyboard, mouse and monitor. Just like on boring computers.

Debian? Same machine, same config — you need to grab a serial cable and a second computer. Because it is an Arm system, so it is supposed to be one of those small toy boards people give to kids to play with, right?

Sure, I could sit and discuss it with people, but that does not work. You always get someone with “I use an R/Pi Zero as a desktop” (or some other insane setup) and then the thread dies as every normal person leaves.

So sorry, but I do not plan to spend any time on improving operating system installers. Never mind which distribution it would be.

by Marcin Juszkiewicz at November 28, 2018 15:19

November 27, 2018

Marcin Juszkiewicz

AArch64 on AWS

I woke up today, looked at the news stream on my phone and bang! Amazon announced Arm systems being available in AWS. Nice!

Red Hat Enterprise Linux 7.6 is one of the operating systems available from day one. It boots, runs and does all the boring things you expect from an operating system. It is nice to see new systems run RHEL out of the box.

So, what to do with such an EC2 instance? I know that some people plan to move their x86-64 based cloud infrastructure to aarch64. Several projects will add them into their pool of AWS instances to have another architecture available in their CI systems. Lots of people will run one just to check how it differs from their daily x86-64 systems.

As those are not bare metal systems, you are not able to run OpenStack or play with virtualization, but if you are using containers (Kubernetes on Arm, anyone?) then it can probably be something to play with.
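As a sketch of the CI angle, a job can pick an architecture-specific container image at runtime so the same script covers both x86-64 and the new arm64 instances (the image names here are made up for illustration):

```shell
# Map the machine architecture to a container image tag so one
# CI script works on both x86-64 and arm64 instances.
pick_image() {
    case "$1" in
        x86_64)  echo "myapp:amd64" ;;
        aarch64) echo "myapp:arm64" ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

pick_image aarch64   # prints myapp:arm64
```

In a real job you would call it as `pick_image "$(uname -m)"` and feed the result to `docker run`.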

by Marcin Juszkiewicz at November 27, 2018 12:04

November 24, 2018

Gema Gomez

Idee Creativmarkt

I was in Berlin for an event last week and I stumbled upon a magic place 3 minutes’ walk from my hotel. This place was Idee Creativmarkt, a crafts shop similar to, but very different from, Hobbycraft in the UK; it felt posher, with a lot of high quality yarns on display. It was also different because the shop was organised in a more relaxed and creative way, with lots of example projects to inspire visitors to be creative and playful with colors and textures. I didn’t buy anything, though, because I am on a mission to reduce my stash for the foreseeable future, and I have decided to only buy new yarn when absolutely necessary (i.e. I have started a project and I need more of a particular color to be able to finish it) or when a new project requires some new type of fibre that I don’t own in significant quantities.

Idee entry

Their building was decorated with what I thought was a very clever design of their logo in lighting. Apologies for the bad picture but I hope it conveys the idea of what it looks like:

Idee facade

Of all the projects they had as samples, this is the one that captured my imagination the most, I seem to be enthralled by variegated yarns nowadays:

Idee inspiration

I would definitely recommend any crafters spending a couple of days in Berlin to stop at Idee for inspiration, you won’t be disappointed. I shall go back to my started variegated shawl soon!

by Gema Gomez at November 24, 2018 00:00

November 20, 2018

Marcin Juszkiewicz

Red Hat Platform Enablement meeting week

Last week I was in Vancouver, Canada again, at the time when the Linux Plumbers conference took place. But that was not the main reason: I went there to meet people from the Platform Enablement team at Red Hat.

Linux Plumbers

The idea was simple — gather everyone in one place at the same time and let them talk. The conference was selected to give people something else to do at the same time. And we were visible — of the 473 attendees, about 60 were from Red Hat.

Red Hat team before going for team dinner

I was talking with most of the RH people to find out who they are, what they are working on etc. It ended in a lot of interesting discussions. Also many talks with non-RH people. The ‘so you are IBM now’ phrase came up just a few times.

There were funny moments too. Like one when Dave Airlie responded with “ah, you are the ‘arm64 + radeon guy'” ;D


As there was no breakfast option in ‘The Burrard’ hotel, I went for a walk to find some. Davie Street is full of bars, diners and restaurants (though most of them open at 11:00). Interesting graffiti, cannabis stores (as it is now legal in Canada) and lots of LGBT rainbows everywhere.


Due to one of my flights being cancelled I had to choose: a weekend in Vancouver, a weekend in Toronto, or rebooking the whole trip. So I decided to go to Toronto and meet a friend there.

On Saturday I met Karol and we had a long walk. It was good not to discuss ARM or OpenStack — we went for visual effects instead, as this is Karol’s area of expertise. Maya, Houdini, Renderman, Mr. X, ILM, Pixar and other names came up. I was told “they work on Houdini in that building” and later “here Maya is developed” ;D

So I asked about photorealistic movies — are they possible now? Turns out that yes, they are. But they are too expensive to make.

During the weekend I walked over 20 kilometers through the city. Some random photos below:

It was a great week. Despite sleep deprivation ;D

by Marcin Juszkiewicz at November 20, 2018 11:54

November 01, 2018

Mark Brown

Linux Audio Miniconf 2018 report

The audio miniconference was held on the 21st in the offices of Cirrus Logic in Edinburgh with 15 attendees from across the industry including userspace and kernel developers, with people from several OS vendors and a range of silicon companies.


We started off with a discussion of community governance led by Takashi Iwai. We decided that for the ALSA-hosted projects we’ll follow the kernel and adopt the modified version of the Contributor Covenant that they have adopted; Sound Open Firmware already has a code of conduct. We also talked a bit about moving to use some modern hosted git with web based reviews. While this is not feasible for the kernel components, we decided to look at doing this for the userspace components; Takashi will start a discussion on alsa-devel. Speaking of the lists, Vinod Koul also volunteered to work with the Linux Foundation admin team to get them integrated with

Liam Girdwood presenting virtualization (photo: Arun Raghavan)


Liam Girdwood then kicked off the first technical discussion of the day, covering virtualization. Intel have a new hypervisor called ACRN which they are using as part of a solution to expose individual PCMs from their DSPs to virtual clients, and they have a virtio specification for control. There were a number of concerns about the current solution being rather specific to both the hardware and the use case they are looking at; we need to review whether this can work on architectures that aren’t cache coherent, or on systems where, rather than exposing a DSP, the host system is using a sound server.

We then moved on to AVB, several vendors have hardware implementations already but it seems clear that these have been built by teams who are not familiar with audio hardware, hopefully this will improve in future but for now there are some regrettable real time requirements. Sakamoto-san suggested looking at FireWire which has some similar things going on with timestamps being associated with the audio stream.

For SoundWire, basic functionality for x86 systems is now 90% there – we still need support for multiple CPU DAIs in the ASoC core (which is in review on the lists) and the Intel DSP drivers need to plumb in the code to instantiate the drivers.

We also covered testing. There may be some progress here this year, as Intel have a new hypervisor called ACRN and some out of tree QEMU models for some of their newer systems, both of which will help with the perennial problem that we need hardware for a lot of the testing we want to do.

We also reviewed the status of some other recurring issues, including PCM granularity and timestamping: for PCM granularity Takashi Iwai will make some proposals on the list, and for timestamping Intel will make sure that the rest of their driver changes for this are upstreamed. For dimen we agreed that Sakamoto-san’s work is pretty much done and we just need some comments in the header, and that his control refactoring was a good idea. There was discussion of user defined card elements; there were no concerns with raising the number of user defined elements that can be created, but some fixes are needed for cleanup of user defined card elements when applications close. The compressed audio userspace is also getting some development, with the focus on making things easier to test, integrating with ffmpeg to give something that’s easier for users to work with.

Charles Keepax covered his work on rate domains (which we decided should really be much more generic than just covering sample rates), he’d posted some patches on the list earlier in the week and gave a short presentation about his plans which sparked quite a bit of discussion. His ideas are very much in line with what we’ve discussed before in this area but there’s still some debate as to how we configure the domains – the userspace interface is of course still there but how we determine which settings to use once we pass through something that can do conversions is less clear. The two main options are that the converters can expose configuration to userspace or that we can set constraints on other widgets in the card graph and then configure converters automatically when joining domains. No firm conclusion was reached, and since substantial implementation will be required it is not yet clear what will prove most sensible in practical systems.


Sakamoto-san also introduced some discussion of new language bindings. He has been working on a new library designed for use with GObject introspection which people were very interested in, especially with the discussion of testing – having something like this would simplify a lot of the boilerplate that is involved in using the C API and allow people to work in a wider variety of languages without needing to define specific bindings or use the individual language’s C adaptations. People also mentioned the Rust bindings that David Henningsson had been working on, they were particularly interesting for the ChromeOS team as they have been adopting Rust in their userspace.

We talked a bit about higher level userspace software too. PulseAudio development has been relatively quiet recently; Arun talked briefly about his work on native compressed audio support and we discussed whether PulseAudio would be able to take advantage of the new timestamping features added by Pierre-Louis Bossart. There’s also the new PipeWire sound server stack, originally written for video but now also with some audio support. The goal is to address architectural limitations in the existing JACK and PulseAudio stacks, offering the ability to achieve low latencies in a stack which is more usable for general purpose applications than JACK is.


Discussions of DSP related issues were dominated by Sound Open Firmware which is continuing to progress well and now has some adoption outside Intel. Liam gave an overview of the status there and polled interest from the DSP vendors who were present. We talked about how to manage additions to the topology ABI for new Sound Open Firmware features including support for loading and unloading pieces of the DSP topology separately when dynamically adding to the DSP graph at runtime, making things far more flexible. The issues around downloading coefficient data were also covered, the discussion converged on the idea of adding something to hwdep and extending alsa-lib and tinyalsa to make this appear integrated with the standard control API. This isn’t ideal but it seems unlikely that anything will be. Techniques for handling long sequences of RPC calls to DSPs efficiently were also discussed, the conclusion was that the simplest thing was just to send commands asynchronously and then roll everything back if there are any errors.


Thanks again to all the attendees for their time and contributions and to Cirrus Logic for their generosity in hosting this in their Edinburgh office. It was really exciting to see all the active development that’s going on these days, it’ll be great to see some of that bear fruit over the next year!

Group photo

by broonie at November 01, 2018 17:12

October 28, 2018

Marcin Juszkiewicz

20?8 is the year of acquiring?

In 2007 I started working for OpenedHand. They were acquired by Intel in 2008. Today I am working for Red Hat (for over 5 years now). And now it is 2018 and it is being acquired by IBM.

I came back home in the evening with plans for some cider and an episode of some TV series (probably “Ozark”). But when I landed on the couch and took a look at my phone, it showed a set of notifications. Telegram, Facebook, Messenger…

And all of them were about one thing: Red Hat being acquired by IBM. At first the sources were Bloomberg and CNBC, so at that stage I thought “ok, it can be a rumour” and my answer was “cannot comment”.

But then I checked my Red Hat mailbox. And there were links to more serious places: the IBM newsroom and the Red Hat announcement.

Looks like tomorrow will be an interesting day. Full of reading mail.

by Marcin Juszkiewicz at October 28, 2018 20:27

October 17, 2018

Marcin Juszkiewicz

OpenStack Superuser Award nomination for my Linaro team

During the last few years the Linaro Enterprise Group (recently renamed to the Linaro Datacenter and Cloud Group) has been working on getting OpenStack working on AArch64 at the same level as it works on the x86-64 architecture. And I am proud to be a member of that group ;D

We started our adventure with Liberty, migrated to Mitaka and then Newton. And we stayed there for a while with the Developer Cloud to make sure that all those projects which rely on it could use VM instances for their work.

In the meantime we were contributing to several OpenStack projects to get everything working properly. The main one was Kolla, as we needed a good way of deploying a cloud, but also Nova, Disk Image Builder and others.

It took us Pike and Queens to get to the point where we could create a new setup of the Developer Cloud. In a clean way, using containers generated by one of the OpenStack projects. No more in-house solutions.

Our team has always consisted of people from several countries and companies, because this is how Linaro works — there are Linaro employees, there are assigned engineers from member companies etc. We cooperated with our kernel people, packagers, developers from several open source projects (libvirt, RDO, CentOS, Debian) and more.

Some people were running tests, some were doing image builds and package builds. Others were keeping us focused and making sure it got delivered as we planned.

We attended several OpenStack related events (PTG, Summit etc.) to tell people what AArch64 support looks like in all those projects. We gave several talks about how OpenStack works for us.

Was it a lot of work? Stackalytics graphs show that it took a while. And it was worth it.

Now we have been nominated for the OpenStack Superuser Award. It is an achievement which would not have been possible without all the people working on it during the last few years.

So, go, read about nominees and vote for us!

by Marcin Juszkiewicz at October 17, 2018 10:36

October 14, 2018

Marcin Juszkiewicz

QML – Quality Matters Last?

In 2004 I was a newbie in the embedded Linux area. The decision to buy a Sharp Zaurus instead of an HP iPaq got me into the Qt/e world rather than the GTK one. I was also a KDE user rather than a GNOME2 one, so I can say that I liked Qt already.

All those sizes in pixels, paddings and margins I saw in GTK code made me feel sick each time I had to edit the UI of some application. No idea why developers went that way…

In the Qt world all you had to do was launch Qt Designer, put some UI elements into a window, apply some layout elements and build your app. No need to deal with padding/margin settings etc. because the library did that for you.

In the meantime Qt developers added QML as a new way to do UIs for Qt applications. I ignored its existence until now…

A few days ago Michał Schulz did nice work on improving my Modland player. He also moved its UI from the old Qt Designer one to QML.

Modland player with QML based UI

For now the UI is hardcoded to 800×480. I have tried to make it scalable but I have a feeling that QML is against me.

Look at the Authors/Modules part. It is a simple layout, right?

  • label
  • listView
  • label
  • listView

In a Qt Designer UI I would select those four elements, put them into a GridLayout and it would scale properly. So I tried that with QML. The labels survived; the listViews got 0×0 size.

And the only ‘design tool’ to edit QML is Qt Creator. Which gets fugly and unstable once you try to play with QML designs…

So I looked at the files describing the UI. And you know what I found there? Old GTK nightmares… Positioning elements with pixels, sizes in pixels. Pixels! Not some magical “dp units”. There is no way to say “make this element 10em tall” like you can with CSS.

And it is not only the Modland player UI. Same with the QML examples…

WTH happened with Qt developers? Or was the “QML is only for embedded devices, do not use it on the desktop” phrase removed from the documentation by mistake?

by Marcin Juszkiewicz at October 14, 2018 12:01

October 10, 2018

Neil Williams

Code Quality & Formatting for Python

I've recently added two packages (and their dependencies) to Debian and thought I'd cover a bit more about why.


black, the uncompromising Python code formatter, has arrived in Debian unstable and testing.

black is being adopted by the LAVA Software Community Project in a gradual way and the new CI will be checking that files which have been formatted by black stay formatted by black in merge requests.

There are endless ways to format Python code and pycodestyle and pylint are often too noisy to use without long lists of ignored errors and warnings. Black takes the stress out of maintaining a large Python codebase as long as a few simple steps are taken:

  • Changes due to black are not functional changes. A merge request applying black to a source code file must not include functional changes. Just the change done by black. This makes code review manageable.
  • Changes made by black are recorded and once made, CI is used to ensure that there are no regressions.
  • Black is only run on files which are not currently being changed in existing merge requests. This is a simple sanity provision, rebasing functional changes after running black is not fun.

Consistent formatting goes a long way to helping humans spot problematic code.

See or apt-get install python-black-doc for a version which doesn't "call home".


So much for code formatting, that's nice and all but what can matter more is an overview of the complexity of the codebase.

We're experimenting with running radon as part of our CI to get a CodeClimate report which GitLab should be able to understand.

(Take a bow - Vince gave me the idea by mentioning his use of Cyclomatic Complexity.)

What we're hoping to achieve here is a failed CI test if the complexity of critical elements increases and a positive indication if the code complexity of areas which are currently known to be complex can be reduced without losing functionality.

Initially, just having the data is a bonus. The first try at CodeClimate support took the best part of an hour to scan our code repository. radon took 3 seconds.

See or apt-get install python-radon-doc for a version which doesn't "call home".

(It would be really nice for upstreams to understand that putting badges in their sphinx documentation templates makes things harder to distribute fairly. Fine, have a nice web UI for your own page but remove the badges from the pages in the released tarballs, e.g. with a sphinx build time option.)

One more mention - bandit

I had nothing to do with introducing this to Debian but I am very grateful that it exists in Debian. bandit is proving to be very useful in our CI, providing SAST reports in GitLab. As with many tools of its kind, it is noisy at first. However, with a few judicious changes and the use of the # nosec comment to rule out scanning of things like unit tests which deliberately try to be insecure, we have substantially reduced the number of reports produced by bandit.

Having the tools available is so important to actually fixing problems before the software gets released.

by Neil Williams at October 10, 2018 14:26

September 19, 2018

Mark Brown

2018 Linux Audio Miniconference

As in previous years we’re trying to organize an audio miniconference so we can get together and talk through issues, especially design decisions, face to face. This year’s event will be held on Sunday October 21st in Edinburgh, the day before ELC Europe starts there. Cirrus Logic have generously offered to host this in their Edinburgh office:

7B Nightingale Way

As with previous years let’s pull together an agenda through a mailing list discussion on alsa-devel – if you’ve got any topics you’d like to discuss please join the discussion there.

There’s no cost for the miniconference but if you’re planning to attend please sign up using the document here.

by broonie at September 19, 2018 18:36

September 09, 2018

Bin Chen

eBook: Understand Container

The index page of understand container had very good page views after being created. So I wondered whether anyone would be interested in a more polished, extended, and easier to read version.

So I started a book called "understand container". Let me know if you would be interested in this work by subscribing here and I'll send you the first draft version, which will include all 8 articles. The free subscription will end on 31st October, 2018.

* Remember to click "Share email with author (optional)", so that I can send the book to your email directly. 

by Bin Chen ( at September 09, 2018 10:52


August 31, 2018

Bin Chen

Understand Container - Index Page

This is an index page to a series of 8 articles on container implementation.


This page has had very good page views since being created. So I wondered whether anyone would be interested in a more polished, extended, and easier to read version.

So I started a book called "understand container". Let me know if you would be interested in this work by subscribing here and I'll send you the first draft version, which will include all 8 articles. The free subscription will end on 31st October, 2018.

* Remember to click "Share email with author (optional)", so that I can send the book to your email directly. 


by Bin Chen ( at August 31, 2018 11:17

Steve McIntyre

And lo, we sacrificed to the gods of BBQ once more

As is becoming something of a tradition by now, Jo and I hosted another OMGWTFBBQ at our place last weekend. People came from far and wide to enjoy themselves. Considering the summer heatwave we've had this year, we were a little unlucky with the weather. But with the power of gazebo technology we kept (mostly!) dry... :-)

I was too busy cooking and drinking etc. to take any photos myself, so here are some I sto^Wborrowed from my friends!

We continued to celebrate Debian getting old:
the cake is not a lie!
Photo from Jonathan McDowell

We had much beer from the nice folks at Milton Brewery:
is 3 firkins enough?
Photo from Rob Kendrick

Much meat was prepared and cooked:
very professional!
Photo from Stefano Rivera

And we had a lot of bread too!
Photo from Rob Kendrick

Finally, many thanks to a number of awesome companies for again sponsoring the important refreshments for the weekend. It's hungry/thirsty work celebrating like this!

August 31, 2018 02:24

August 16, 2018

Steve McIntyre

25 years...

We had a small gathering in the Haymakers pub tonight to celebrate 25 years since Ian Murdock started the Debian project.

people in the pub!

We had 3 DPLs, a few other DDs and a few more users and community members! Good to natter with people and share some history. :-) The Raspberry Pi people even chipped in for some drinks. Cheers! The celebrations will continue at the big BBQ at my place next weekend.

August 16, 2018 21:42

August 12, 2018

Steve McIntyre

DebConf in Taiwan!

DebConf 18 logo

So I'm slowly recovering from my yearly dose of full-on Debian! :-) DebConf is always fun, and this year in Hsinchu was no different. After so many years in the project, and so many DebConfs (13, I think!) it has become unmissable for me. It's more like a family gathering than a work meeting. In amongst the great talks and the fun hacking sessions, I love catching up with people. Whether it's Bdale telling me about his fun on-track exploits or Stuart sharing stories of life in an Australian university, it's awesome to meet up with good friends every year, old and new.

DC18 venue

For once, I even managed to find time to work on items from my own TODO list during DebCamp and DebConf. Of course, I also got totally distracted helping people hacking on other things too! In no particular order, stuff I did included:

  • Working with Holger and Wolfgang to get debian-edu netinst/USB images building using normal debian-cd infrastructure;
  • Debugging build issues with our buster OpenStack images, fixing them and also pushing some fixes to Thomas for build-openstack-debian-image;
  • Reviewing secure boot patches for Debian's GRUB packages;
  • As an AM, helping two DD candidates working their way through NM;
  • Monitoring and tweaking an archive rebuild I'm doing, testing building all of our packages for armhf using arm64 machines;
  • Releasing new upstream and Debian versions of abcde, the CD ripping and encoding package;
  • Helping to debug UEFI boot problems with Helen and Enrico;
  • Hacking on MoinMoin, the wiki engine we use for;
  • Engaging in lots of discussions about varying things: Arm ports, UEFI Secure Boot, Cloud images and more

I was involved in a lot of sessions this year, as normal. Lots of useful discussion about Ignoring Negativity in Debian, and of course lots of updates from various of the teams I'm working in: Arm porters, web team, Secure Boot. And even an impromptu debian-cd workshop.

Taipei 101 - datrip venue

I loved my time at the first DebConf in Asia (yay!), and I was yet again amazed at how well the DebConf volunteers made this big event work. I loved the genius idea of having a bar in the noisy hacklab, meaning that lubricated hacking continued into the evenings too. And (of course!) just about all of the conference was captured on video by our intrepid video team. That gives me a chance to catch up on the sessions I couldn't make it to, which is priceless.

So, despite all the stuff I got done in the 2 weeks my TODO list has still grown. But I'm continuing to work on stuff, energised again. See you in Curitiba next year!

August 12, 2018 15:11

July 26, 2018

Senthil Kumaran

lava-server official docker images

The Linaro Automated Validation Architecture (a.k.a. LAVA) project has released official docker images for lava-server-only containers, following the recent release of lava-dispatcher-only docker images. This blog post explains how to use these lava-server docker images in order to run LAVA instances via docker.

Before getting into the details of running these images, let us see how they are organized and what packages are available via them.

The lava-server only docker images will be officially supported by the LAVA project team and there will be regular releases of these images whenever there are updates or new releases. As of this writing there are two images released - production and staging. These docker images are based on Debian Stretch operating system, which is the recommended operating system for installing LAVA.

lava-server production docker images

The production docker image of lava-server is based on the official production-repo of the LAVA project. The production-repo holds the latest stable packages released by the LAVA team for each of the LAVA components. The production docker image is available at the following link:

Whenever there is a production release from the LAVA project, a corresponding image is created with a matching tag name. The latest tag as of this writing is 2018.7-1. To see what these production docker images are built with, have a look at the Dockerfile.

lava-server staging docker images

The staging docker image of lava-server is based on the official staging-repo of the LAVA project. The staging-repo holds the latest packages built every day by the LAVA team for each of the LAVA components, and is thus a source of bleeding-edge unreleased software. The staging docker image, which is rebuilt daily, is available at the following link:

Whenever a successful daily build of the staging packages is available, a docker image is made available with the tag name 'latest'. Hence, at any point in time there is only one tag, i.e. latest, in the staging docker image location. To see what these staging docker images are built with, have a look at the Dockerfile.

Having seen the details about the lava-server only docker images, let us now see how to run these docker images to create a LAVA server instance.

running production lava-server docker image

$ sudo docker run -p 8080:80 --privileged --name lava-2018.7-1 linaro/lava-server-production-stretch-amd64:2018.7-1
Starting postgresql...
Starting PostgreSQL 9.6 database server: main.
Starting lava-coordinator...
Starting lava-coordinator : lava-coordinato.
Starting apache2 server...
Starting Apache httpd web server: apache2AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using Set the 'ServerName' directive globally to suppress this message
Creating admin account
Superuser created successfully.
Set initial password for admin account as: changeit
spawn lava-server manage changepassword admin
Changing password for user 'admin'
Password (again):
Password changed successfully for user 'admin'
Starting lava-server-gunicorn...

Once the container is started, visit the instance at the url http://localhost:8080/ from the host machine, or directly via the container's IP address.

running staging lava-server docker image

$ sudo docker run -p 8080:80 --privileged --name lava-latest linaro/lava-server-staging-stretch-amd64:latest
Starting postgresql...
Starting PostgreSQL 9.6 database server: main.
Starting lava-coordinator...
Starting lava-coordinator : lava-coordinato.
Starting apache2 server...
Starting Apache httpd web server: apache2AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using Set the 'ServerName' directive globally to suppress this message
Creating admin account
Superuser created successfully.
Set initial password for admin account as: changeit
spawn lava-server manage changepassword admin
Changing password for user 'admin'
Password (again):
Password changed successfully for user 'admin'
Starting lava-server-gunicorn...

Thus we have our lava-server docker image up and running in a container. To log in to this instance, use the default user 'admin' and the password 'changeit'. The admin user has administration privileges, so ensure you change the password to keep your instance secure.

The images also accept and execute commands, which is handy for tackling more advanced use-cases you may want to build with these lava-server based docker images.

by stylesen at July 26, 2018 08:43

July 15, 2018

Bin Chen

Understand Kubernetes 5: Controller

Controllers in k8s assume the same role and responsibility as the Controller in the classic Model-View-Controller architecture (whereas the Model is the set of API objects stored in etcd). What's unique about a controller in k8s is that it constantly reconciles the system's desired state with its current state; reconciliation is not a one-time task.
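The reconcile idea can be sketched as a toy Go program (purely illustrative; `reconcile` and its action strings are invented for this sketch and are not the real k8s code):

```go
package main

import "fmt"

// reconcile compares the desired replica count with the current one and
// returns the actions needed to converge them - a toy stand-in for what a
// real controller does against the API server, run in an endless loop.
func reconcile(desired, current int) []string {
	var actions []string
	for current < desired {
		actions = append(actions, "create pod")
		current++
	}
	for current > desired {
		actions = append(actions, "delete pod")
		current--
	}
	return actions
}

func main() {
	// A real controller repeats this forever; one pass suffices here.
	fmt.Println(reconcile(3, 1)) // two creations needed to converge
}
```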

Replicaset Controller

To make things real, we'll look at the source code of the ReplicaSet Controller and see what exactly a controller is, who it interacts with, and how.
The core logic of the ReplicaSet Controller is quite simple, as shown below:
func (rsc *ReplicaSetController) manageReplicas(filteredPods []*v1.Pod, rs *apps.ReplicaSet) error {
    diff := len(filteredPods) - int(*(rs.Spec.Replicas))
    if diff < 0 {
        createPods( ) // fewer Pods than desired: create the missing ones
    } else if diff > 0 {
        deletePods( ) // more Pods than desired: delete the surplus
    }
To create the Pod, it uses a KubeClient, which talks to the API server:
func (r RealPodControl) createPods( ) {
    newPod, _ := r.KubeClient.CoreV1().Pods(namespace).Create(pod)
Tracing further into the Create() function, it uses a nice builder pattern to set up an HTTP request:
func (c *pods) Create(pod *v1.Pod) (result *v1.Pod, err error) {
    result = &v1.Pod{}
    err = c.client.Post().
Upon calling Do(), it issues an HTTP POST request and gets the result:
func (r *Request) Do() Result {
    var result Result
    err := r.request(func(req *http.Request, resp *http.Response) {
        result = r.transformResponse(resp, req)
    })
    ...
    return result
}
That only covers one direction of the communication: from the controller to the API server.

How about the other direction?


A controller subscribes itself to the apiserver for the events it cares about.
A controller typically cares about two types of information: controller-specific information and the core information regarding Pods.
In k8s, the components used to notify about events are called Informers. FWIW, an informer is just the observer pattern.
In the case of the ReplicaSetController: when a ReplicaSet request is submitted, the API server notifies the ReplicaSetController through appsinformers.ReplicaSetInformer; when a Pod gets created, the API server notifies the ReplicaSetController using coreinformers.PodInformer.
See how a ReplicaSetController is initiated:
func startReplicaSetController(ctx ControllerContext) (bool, error) {
    go replicaset.NewReplicaSetController(
        ctx.InformerFactory.Apps().V1().ReplicaSets(), // appsinformers.ReplicaSetInformer
        ctx.InformerFactory.Core().V1().Pods(),        // coreinformers.PodInformer
    ).Run(int(ctx.ComponentConfig.ReplicaSetController.ConcurrentRSSyncs), ctx.Stop)
    return true, nil
}
And here is how the ReplicaSetController handles those events:
// ReplicaSet events:
AddFunc:    rsc.enqueueReplicaSet,
UpdateFunc: rsc.updateRS,
DeleteFunc: rsc.enqueueReplicaSet,

// Pod events:
AddFunc:    rsc.addPod,
UpdateFunc: rsc.updatePod,
DeleteFunc: rsc.deletePod,
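Under the hood this is the observer pattern; a toy Go sketch of it (the `Informer` type below is invented for illustration and is not the real client-go SharedInformer API):

```go
package main

import "fmt"

// Informer is a toy observer: controllers subscribe handlers, and the
// informer fans every event out to all of its subscribers.
type Informer struct {
	handlers []func(event string)
}

// Subscribe registers a handler for future events.
func (i *Informer) Subscribe(h func(event string)) {
	i.handlers = append(i.handlers, h)
}

// Notify delivers one event to every subscribed handler.
func (i *Informer) Notify(event string) {
	for _, h := range i.handlers {
		h(event)
	}
}

func main() {
	podInformer := &Informer{}
	// A ReplicaSet-like controller subscribing to Pod events.
	podInformer.Subscribe(func(e string) { fmt.Println("controller saw:", e) })
	podInformer.Notify("pod created")
}
```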
Ok, this covers the direction from the API server to the controller.

But we are still missing one thing.

Workqueue, and worker

After being notified of the relevant events, a controller pushes them onto an event queue; meanwhile, a poor worker sits in an endless loop, pulling events off the queue and processing them.
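That queue-plus-worker arrangement can be sketched with a plain Go channel (a toy; the real k8s workqueue adds deduplication, rate limiting and retries):

```go
package main

import (
	"fmt"
	"sync"
)

// processAll drains the queue with a single worker goroutine and returns
// the events in the order they were processed.
func processAll(queue chan string) []string {
	var processed []string
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // the "poor worker" looping over the queue
		defer wg.Done()
		for event := range queue {
			processed = append(processed, event)
		}
	}()
	wg.Wait()
	return processed
}

func main() {
	queue := make(chan string, 3)
	queue <- "pod added"
	queue <- "pod updated"
	close(queue) // a real controller never closes its queue; this toy must
	fmt.Println(processAll(queue))
}
```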

Cached & Shared Informer

We know that etcd provides the API to list and watch particular resources, and each resource in k8s has its dedicated location. With that, we have everything needed to implement an informer for a controller. However, there are two aspects we can optimize. First, instead of relaying everything to etcd, we can cache the information/events in the apiserver for better performance; second, since different controllers care about the same set of information, it makes sense for those controllers to share an informer.
With that in mind, this is how a ReplicaSetInformer is currently created: it is obtained from a shared informer factory, as seen in the startReplicaSetController snippet above.

Controller Manager

kube-controller-manager is a daemon that bundles together all the built-in controllers of k8s. It provides a central place to register, initiate, and start the controllers.


We went through what a controller is and how it interacts with the API server to do its job.

by Bin Chen ( at July 15, 2018 06:12

July 07, 2018

Bin Chen

Understand kubernetes 4 : Scheduler

The best-known job of a container orchestrator is to "assign Pods to Nodes", so-called scheduling. If all Pods and Nodes were the same, this would be a trivial problem to solve - a round-robin policy would do the job. In practice, however, Pods have different resource requirements and, less obviously, the Nodes may have different capabilities - think machines purchased 5 years ago versus brand new ones.

An Analogy: Rent a house

Say you want to rent a house, and you tell the agent that any house with 2 bedrooms and 2 bathrooms is fine. However, you don't want a house with a swimming pool, since you would rather go to the beach and don't want to pay for something you won't use.
That actually covers the main concepts/job for the k8s scheduler.
  • You/Tenant: have some requirements (rooms)
  • Agent: k8s scheduler
  • Houses(owned by Landlords): The nodes.
You tell the agent your must-have, definite no-no, and nice-to-have requirements.
The agent's job is to find you a house that matches your requirements and anti-requirements.
The owner can also reject an application based on his preferences (say, no pets).

Requirements for Pod scheduler

Let's look at some practical requirements when placing a Pod on a Node.
1. Run Pods on a specific type of Node: e.g. run this Pod on Ubuntu 17.10 only.
2. Run Pods of different services on the same Node: e.g. place the webserver and memcache on the same Node.
3. Spread Pods of a service across different Nodes: e.g. place the webservers on Nodes in different zones for fault tolerance.
4. Make the best use of resources: e.g. run as many jobs as possible, but be able to preempt the low-priority ones.
In the k8s world:
1 and 2 can be resolved using Affinity;
3 can be resolved using Anti-Affinity;
4 can be resolved using Taints and Tolerations, plus Priority and Preemption.
Before talking about those scheduling policies, we first need a way to identify the Nodes. Without identification, the scheduler can do nothing better than allocate based only on the capacity information of each Node.

Label the Nodes

Nothing fancy. Nodes are labeled.
You can add any label you want but there are predefined common labels, including
  • hostname
  • os/arch/instance-type
  • zone/region
The first can be used to identify a single node, the second a type of node, and the last is for geolocation-related fault tolerance or scalability.
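Label-based selection itself boils down to map matching; a toy Go sketch (the `matches` helper and the zone values are invented for illustration; `kubernetes.io/os` is one of the predefined labels mentioned above):

```go
package main

import "fmt"

// matches reports whether a node's labels satisfy every key/value pair in
// the selector - the essence of label-based node selection.
func matches(nodeLabels, selector map[string]string) bool {
	for k, v := range selector {
		if nodeLabels[k] != v {
			return false
		}
	}
	return true
}

func main() {
	node := map[string]string{"kubernetes.io/os": "linux", "zone": "us-east-1a"}
	fmt.Println(matches(node, map[string]string{"zone": "us-east-1a"})) // true
	fmt.Println(matches(node, map[string]string{"zone": "eu-west-1b"})) // false
}
```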


Affinity

There are two types of Affinity: Node Affinity and Pod Affinity. The first indicates an affinity to a type of Node and can be used to achieve requirement 1; the latter indicates an affinity to Nodes already running a certain type of Pod, and can be used to achieve requirement 2.
The affinity can be soft or hard, meaning nice-to-have and must-have respectively.
Reversing the logic of Affinity gives Anti-Affinity, meaning a Pod doesn't want to be on Nodes with a certain feature. Requirement 3 can be implemented as "a Pod doesn't want to be on a Node already running the same Pod (using a Pod label)".
Side note: you might know that in Linux a process can set its CPU affinity, that is, which CPU core it prefers to run on. That resembles the problem of placing a Pod on a specific (type of) Node, as does the cpuset controller in cgroups.

Taint and Toleration

The landlord tells the agent that he only wants to rent the house to a programmer (for whatever reason). So unless a renter identifies himself as a programmer, the agent won't submit his application to the landlord.
Similarly, a Node can declare a special requirement (called a Taint) and use it to repel Pods. Unless a Pod can tolerate the taint, it will not be placed on that Node.
I found the concept of Taints and Tolerations a little bit twisted, since a Taint sounds like a bad thing, an unreasonable requirement/restriction that the Pod has to tolerate. It is more like a landlord requiring half a year's rent upfront, so that only those who can tolerate this are able to apply.
One thing to remember is that a Taint is an attribute of the Node, giving the Node a voice for its preferences; whereas Affinity is how a Pod expresses its preference for Nodes.

Priority and Preemption

Maximising resource utilization is important, and it is easily overlooked because most people don't have experience managing thousands of servers. As pointed out in section 5 of the Borg paper, which k8s is inspired by:
One of Borg’s primary goals is to make efficient use of
Google’s fleet of machines, which represents a significant
financial investment: increasing utilization by a few percentage
points can save millions of dollars.
How to increase utilization? That could mean many things, such as: scheduling jobs fast, optimizing Pod allocation so that more jobs can be accommodated, and, last but not least, being able to preempt low-priority jobs with high-priority ones.
The last one just makes sense for a machine: doing something is always better than sitting idle, and when a more important job comes along, the low-priority job gets preempted.
The flip side of possible preemption is that we have to spend a minute thinking about its effect on the Pod/service that may be evicted. Does it matter? How does it gracefully terminate itself?

Make it real

To make things more real, take a look at this sample toy scheduler, which binds a Pod to the cheapest Node, as long as that Node can "fit" the resource requirements of the Pod.
Here are a few takeaways:
  1. You can roll your own scheduler.
  2. You can have more than one scheduler in the system. Each scheduler looks after a particular set/type of Pods and schedules them. (It doesn't make sense to have multiple schedulers trying to schedule the same set of Pods - there would be racing.)
  3. A scheduler always talks to the API server as a client. It asks the API server for unscheduled Pods, schedules them using a defined policy, and posts the scheduling results (i.e. Pod/Node bindings) to the API server.
(Sequence diagram: the scheduler asks the API server for unscheduled Pods and for Node info/status/capacity, schedules according to a predefined policy, posts the binding result, and the API server posts the binding OK events.)
You can find default scheduler here.
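The toy scheduler's fit-then-pick-cheapest decision can be sketched like this (the `Node` type, field names and prices below are invented for illustration, not the actual code of that project):

```go
package main

import "fmt"

// Node is a toy view of a node: what it costs and how much CPU is free.
type Node struct {
	Name    string
	Price   float64 // hourly cost of the node
	FreeCPU int     // millicores available
}

// pickCheapest returns the name of the cheapest node that can fit the
// requested CPU, or "" if no node fits.
func pickCheapest(nodes []Node, requestCPU int) string {
	best := ""
	bestPrice := 0.0
	for _, n := range nodes {
		if n.FreeCPU < requestCPU {
			continue // this node cannot fit the Pod
		}
		if best == "" || n.Price < bestPrice {
			best, bestPrice = n.Name, n.Price
		}
	}
	return best
}

func main() {
	nodes := []Node{
		{Name: "big", Price: 0.50, FreeCPU: 4000},
		{Name: "small", Price: 0.10, FreeCPU: 500},
	}
	// "small" is cheaper but cannot fit a 1000m request, so "big" wins.
	fmt.Println(pickCheapest(nodes, 1000))
}
```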


We went over the requirements for a Pod scheduler and the ways to achieve them in k8s.

by Bin Chen ( at July 07, 2018 04:47

June 30, 2018

Bin Chen

Understand Kubernetes 3 : etcd

In the last article, we said there was a state store in the master node; in practice, it is implemented using etcd. etcd is an open source distributed key-value store (from CoreOS) built on the Raft consensus algorithm; you can find a good introduction to etcd here. k8s uses etcd to store all the cluster information, and it is the only stateful component in the whole of k8s (not counting the stateful components of the application itself).
Notably, it stores the following information:
  • Resource object/spec submitted by the user
  • The scheduler results from master node
  • Current status of work nodes and Pods

etcd is critical

The stability and responsiveness of etcd are critical to the stability and performance of the whole cluster. Here is an excellent blog post from OpenAI sharing that their etcd setup, hindered by 1) high disk latency due to the cloud backend and 2) high network I/O load incurred by the monitoring system, was one of the biggest issues they encountered when scaling to 2500 nodes.
For a production system, we set up a separate etcd cluster and connect the k8s master to it. The master stores requests in etcd, the controllers/schedulers update the results, and the worker nodes watch the relevant state changes through the master and act accordingly, e.g. starting a container locally.
It looks like this diagram:

usage of etcd in k8s

etcd is set up separately, but it has to be set up first so that the node IPs (and TLS info) of the etcd cluster can be passed to the apiserver running on the master nodes. Using that information (etcd-servers and etcd-tls), the apiserver creates an etcd client (or multiple clients) to talk to etcd. That is the entire connection between etcd and k8s.

All the components in the api-server use storage.Interface to communicate with the storage. etcd is the only backend implementation at the moment, and it supports two versions of etcd: v2 and v3, the latter being the default.
type Interface interface { // storage.Interface
    Create(key string, obj runtime.Object) ...
The k8s master - to be specific, the apiserver component - acts as a client of etcd, using the etcd client to implement the storage.Interface API with a little extra logic to fit the k8s model.
Let's see two APIs, Create and Watch.
For Create, the value part of the k/v pair is a runtime object, e.g. a Deployment spec; a few more steps (encode, transform) are needed before finally committing it to etcd.
  • Create
Create(key string, obj runtime.Object)
obj -> encoder -> transformer -> clientv3.OpPut(key, v string)
Besides the normal create/get/delete, there is one operation that is very important for a distributed k/v store: watch, which allows you to block waiting on something and be notified when it changes. As a use case, someone can watch a specific location for new Pod creation/deletion and then take the corresponding action.
The kubelet doesn't watch the storage directly; instead, it watches it through the API server.
  • Watch
func (wc *watchChan) startWatching(watchClosedCh chan struct{}) {
    wch := wc.watcher.client.Watch(wc.ctx, wc.key, opts...)
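To make the watch semantics concrete, here is a toy in-memory key/value store with watch support (purely illustrative; real etcd is a distributed, persistent store and looks nothing like this sketch):

```go
package main

import "fmt"

// Store is a toy key/value store whose Watch returns a channel that
// receives every subsequent Put for the watched key.
type Store struct {
	data     map[string]string
	watchers map[string][]chan string
}

func NewStore() *Store {
	return &Store{
		data:     map[string]string{},
		watchers: map[string][]chan string{},
	}
}

// Watch registers interest in a key and returns the notification channel.
func (s *Store) Watch(key string) <-chan string {
	ch := make(chan string, 8) // buffered so Put never blocks in this toy
	s.watchers[key] = append(s.watchers[key], ch)
	return ch
}

// Put stores the value and notifies everyone watching this key.
func (s *Store) Put(key, value string) {
	s.data[key] = value
	for _, ch := range s.watchers[key] {
		ch <- value
	}
}

func main() {
	s := NewStore()
	events := s.Watch("/pods/web")
	s.Put("/pods/web", "created")
	fmt.Println(<-events) // prints "created"
}
```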

pluggable backend storage

In theory, you should be able to replace etcd with other k/v stores, such as Consul and Zookeeper.
There was a PR to add Consul as a backend, but it was closed (after three years) as "not ready to do this in the near future". Why create a pluggable container runtime but not a pluggable storage backend, which would seem to make sense as well? One possible technical reason is that k8s and etcd are already loosely coupled, so it isn't worth the effort to create another layer to make it pluggable.


etcd is the component storing all the state of a k8s cluster. Its availability and performance are vital to the whole of k8s. The apiserver is the only component that talks to etcd, using etcd clients; requests submitted to the apiserver are encoded and transformed before being committed to etcd. Anyone can watch a particular state change, but not directly against etcd - the watch goes through the apiserver.

by Bin Chen ( at June 30, 2018 03:59

June 29, 2018

Neil Williams

Automation & Risk

First of two posts reproducing some existing content for a wider audience due to delays in removing viewing restrictions on the originals. The first is a bit long... Those familiar with LAVA may choose to skip forward to Core elements of automation support.

A summary of this document was presented by Steve McIntyre at Linaro Connect 2018 in Hong Kong. A video of that presentation and the slides created from this document are available online:

Although the content is based on several years of experience with LAVA, the core elements are likely to be transferable to many other validation, CI and QA tasks.

I recognise that this document may be useful to others, so this blog post is under CC BY-SA 3.0: See also

Automation & Risk


Linaro created the LAVA (Linaro Automated Validation Architecture) project in 2010 to automate testing of software using real hardware. Over the seven years of automation in Linaro so far, LAVA has also spread into other labs across the world. Millions of test jobs have been run, across over one hundred different types of devices, ARM, x86 and emulated. Varied primary boot methods have been used alone or in combination, including U-Boot, UEFI, Fastboot, IoT, PXE. The Linaro lab itself has supported over 150 devices, covering more than 40 different device types. Major developments within LAVA include MultiNode and VLAN support. As a result of this data, the LAVA team have identified a series of automated testing failures which can be traced to decisions made during hardware design or firmware development. The hardest part of the development of LAVA has always been integrating new device types, arising from issues with hardware design and firmware implementations. There are a range of issues with automating new hardware and the experience of the LAVA lab and software teams has highlighted areas where decisions at the hardware design stage have delayed deployment of automation or made the task of triage of automation failures much harder than necessary.

This document is a summary of our experience with full background and examples. The aim is to provide background information about why common failures occur, and recommendations on how to design hardware and firmware to reduce problems in the future. We describe some device design features as hard requirements to enable successful automation, and some which are guaranteed to block automation. Specific examples are used, naming particular devices and companies and linking to specific stories. For a generic summary of the data, see Automation and hardware design.

What is LAVA?

LAVA is a continuous integration system for deploying operating systems onto physical and virtual hardware for running tests. Tests can be simple boot testing, bootloader testing and system level testing, although extra hardware may be required for some system tests. Results are tracked over time and data can be exported for further analysis.

LAVA is a collection of participating components in an evolving architecture. LAVA aims to make systematic, automatic and manual quality control more approachable for projects of all sizes.

LAVA is designed for validation during development - testing whether the code that engineers are producing “works”, in whatever sense that means. Depending on context, this could be many things, for example:

  • testing whether changes in the Linux kernel compile and boot
  • testing whether the code produced by gcc is smaller or faster
  • testing whether a kernel scheduler change reduces power consumption for a certain workload etc.

LAVA is good for automated validation. LAVA tests the Linux kernel on a range of supported boards every day. LAVA tests proposed android changes in gerrit before they are landed, and does the same for other projects like gcc. Linaro runs a central validation lab in Cambridge, containing racks full of computers supplied by Linaro members and the necessary infrastructure to control them (servers, serial console servers, network switches etc.)

LAVA is good for providing developers with the ability to run customised test on a variety of different types of hardware, some of which may be difficult to obtain or integrate. Although LAVA has support for emulation (based on QEMU), LAVA is best at providing test support for real hardware devices.

LAVA is principally aimed at testing changes made by developers across multiple hardware platforms to aid portability and encourage multi-platform development. Systems which are already platform independent or which have been optimised for production may not necessarily be able to be tested in LAVA or may provide no overall gain.

What is LAVA not?

LAVA is designed for Continuous Integration not management of a board farm.

LAVA is not a set of tests - it is infrastructure to enable users to run their own tests. LAVA concentrates on providing a range of deployment methods and a range of boot methods. Once the login is complete, the test consists of whatever scripts the test writer chooses to execute in that environment.

LAVA is not a test lab - it is the software that can used in a test lab to control test devices.

LAVA is not a complete CI system - it is software that can form part of a CI loop. LAVA supports data extraction to make it easier to produce a frontend which is directly relevant to particular groups of developers.

LAVA is not a build farm - other tools need to be used to prepare binaries which can be passed to the device using LAVA.

LAVA is not a production test environment for hardware - LAVA is focused on developers and may require changes to the device or the software to enable automation. These changes are often unsuitable for production units. LAVA also expects that most devices will remain available for repeated testing rather than testing the software with a changing set of hardware.

The history of automated bootloader testing

Many attempts have been made to automate bootloader testing and the rest of this document covers the issues in detail. However, it is useful to cover some of the history in this introduction, particularly as that relates to ideas like SDMux - the SD card multiplexer which should allow automated testing of bootloaders like U-Boot on devices where the bootloader is deployed to an SD card. The problem of SDMux details the requirements to provide access to SD card filesystems to and from the dispatcher and the device. Requirements include: ethernet, no reliance on USB, removable media, cable connections, unique serial numbers, introspection and interrogation, avoiding feature creep, scalable design, power control, maintained software and mounting holes. Despite many offers of hardware, no suitable hardware has been found and testing of U-Boot on SD cards is not currently possible in automation. The identification of the requirements for a supportable SDMux unit is closely related to these device requirements.

Core elements of automation support


The ability to deploy exactly the same software to the same board(s) and run exactly the same tests many times in a row, getting exactly the same results each time.

For automation to work, all device functions which need to be used in automation must always produce the same results on each device of a specific device type, irrespective of any previous operations on that device, given the same starting hardware configuration.

There is no way to automate a device which behaves unpredictably.


The ability to run a wide range of test jobs, stressing different parts of the overall deployment, with a variety of tests and always getting a Complete test job. There must be no infrastructure failures and there should be limited variability in the time taken to run the test jobs to avoid the need for excessive Timeouts.

The same hardware configuration and infrastructure must always behave in precisely the same way. The same commands and operations to the device must always generate the same behaviour.


The device must support deployment of files and booting of the device without any need for a human to monitor or interact with the process. The need to press buttons is undesirable but can be managed in some cases by using relays. However, every extra layer of complexity reduces the overall reliability of the automation process and the need for buttons should be limited or eliminated wherever possible. If a device uses LEDs to indicate the success or failure of operations, such LEDs must only be indicative. The device must support full control of that process using only commands and operations which do not rely on observation.


All methods used to automate a device must have minimal footprint in terms of load on the workers, complexity of scripting support and infrastructure requirements. This is a complex area and can trivially impact on both reliability and reproducibility as well as making it much more difficult to debug problems which do arise. Admins must also consider the complexity of combining multiple different devices which each require multiple layers of support.

Remote power control

Devices MUST support automated resets either by the removal of all power supplied to the DUT or a full reboot or other reset which clears all previous state of the DUT.

Every boot must reliably start, without interaction, directly from the first application of power without the limitation of needing to press buttons or requiring other interaction. Relays and other arrangements can be used at the cost of increasing the overall complexity of the solution, so should be avoided wherever possible.

Networking support

Ethernet - all devices using ethernet interfaces in LAVA must have a unique MAC address on each interface. The MAC address must be persistent across reboots. No assumptions should be made about fixed IP addresses, address ranges or pre-defined routes. If more than one interface is available, the boot process must be configurable to always use the same interface every time the device is booted. WiFi is not currently supported as a method of deploying files to devices.

Serial console support

LAVA expects to automate devices by interacting with the serial port immediately after power is applied to the device. The bootloader must interact with the serial port. If a serial port is not available on the device, suitable additional hardware must be provided before integration can begin. All messages about the boot process must be visible using the serial port and the serial port should remain usable for the duration of all test jobs on the device.


Persistence

Devices supporting primary SSH connections have persistent deployments and this has implications, some positive, some negative - depending on your use case.

  • Fixed OS - the operating system (OS) you get is the OS of the device and this must not be changed or upgraded.
  • Package interference - if another user installs a conflicting package, your test can fail.
  • Process interference - another process could restart (or crash) a daemon upon which your test relies, so your test will fail.
  • Contention - another job could obtain a lock on a constrained resource, e.g. dpkg or apt, causing your test to fail.
  • Reusable scripts - scripts and utilities your test leaves behind can be reused (or can interfere) with subsequent tests.
  • Lack of reproducibility - an artifact from a previous test can make it impossible to rely on the results of a subsequent test, leading to wasted effort with false positives and false negatives.
  • Maintenance - using persistent filesystems in a test action results in the overlay files being left in that filesystem. Depending on the size of the test definition repositories, this could result in an inevitable increase in used storage becoming a problem on the machine hosting the persistent location. Changes made by the test action can also require intermittent maintenance of the persistent location.

Only use persistent deployments when essential and always take great care to avoid interfering with other tests. Users who deliberately or frequently interfere with other tests can have their submit privilege revoked.

The dangers of simplistic testing

Connect and test

Seems simple enough - it doesn’t seem as if you need to deploy a new kernel or rootfs every time, no need to power off or reboot between tests. Just connect and run stuff. After all, you already have a way to manually deploy stuff to the board. The biggest problem with this method is Persistence as above - LAVA keeps the LAVA components separated from each other but tests frequently need to install support which will persist after the test, write files which can interfere with other tests or break the manual deployment in unexpected ways when things go wrong. The second problem within this fallacy is simply the power drain of leaving the devices constantly powered on. In manual testing, you would apply power at the start of your day and power off at the end. In automated testing, these devices would be on all day, every day, because test jobs could be submitted at any time.

ssh instead of serial

This is an over-simplification which will lead to new and unusual bugs and is only a short step on from connect & test with many of the same problems. A core strength of LAVA is demonstrating differences between types of devices by controlling the boot process. By the time the system has booted to the point where sshd is running, many of those differences have been swallowed up in the boot process.

Test everything at the same time

Issues here include:

Breaking the basic scientific method of testing one thing at a time

The single system contains multiple components, like the kernel and the rootfs and the bootloader. Each one of those components can fail in ways which can only be picked up when some later component produces a completely misleading and unexpected error message.


Simply deploying the entire system for every single test job wastes inordinate amounts of time when you do finally identify that the problem is a configuration setting in the bootloader or a missing module for the kernel.


The larger the deployment, the more complex the boot and the tests become. Many LAVA devices are prototypes and development boards, not production servers. These devices will fail in unpredictable places from time to time. Testing a kernel build multiple times is much more likely to give you consistent averages for duration, performance and other measurements than if the kernel is only tested as part of a complete system.

Automated recovery - deploying an entire system can go wrong; whether from an interrupted copy or a broken build, the consequences can mean that the device simply does not boot any longer.

Every component involved in your test must allow for automated recovery

This means that the boot process must support being interrupted before that component starts to load. With a suitably configured bootloader, it is straightforward to test kernel builds with fully automated recovery on most devices. Deploying a new build of the bootloader itself is much more problematic. Few devices have the necessary management interfaces with support for secondary console access or additional network interfaces which respond very early in boot. It is possible to chainload some bootloaders, allowing the known working bootloader to be preserved.

I already have builds

This may be true, however, automation puts extra demands on what those builds are capable of supporting. When testing manually, there are any number of times when a human will decide that something needs to be entered, tweaked, modified, removed or ignored which the automated system needs to be able to understand. Examples include /etc/resolv.conf and customised tools.

Automation can do everything

It is not possible to automate every test method. Some kinds of tests and some kinds of devices involve critical elements that do not work well with automation. These are not problems in LAVA; they are design limitations of the kind of test and of the device itself. Your preferred test plan may be infeasible to automate, and some level of compromise will be required.

Users are all admins too

This will come back to bite! However, there are other ways in which this can occur even after administrators have restricted users to limited access. Test jobs (including hacking sessions) have full access to the device as root. Users, therefore, can modify the device during a test job and it depends on the device hardware support and device configuration as to what may happen next. Some devices store bootloader configuration in files which are accessible from userspace after boot. Some devices lack a management interface that can intervene when a device fails to boot. Put these two together and admins can face a situation where a test job has corrupted, overridden or modified the bootloader configuration such that the device no longer boots without intervention. Some operating systems require a debug setting to be enabled before the device will be visible to the automation (e.g. the Android Debug Bridge). It is trivial for a user to mistakenly deploy a default or production system which does not have this modification.


LAVA is aimed at kernel and system development and testing across a wide variety of hardware platforms. By the time the test has got to the level of automating a GUI, there have been multiple layers of abstraction between the hardware, the kernel, the core system and the components being tested. Following the core principle of testing one element at a time, this means that such tests quickly become platform-independent. This reduces the usefulness of the LAVA systems, moving the test into scope for other CI systems which consider all devices as equivalent slaves. The overhead of LAVA can become an unnecessary burden.

CI needs a timely response - it takes time for a LAVA device to be re-deployed with a system which has already been tested. In order to test a component of the system which is independent of the hardware, kernel or core system a lot of time has been consumed before the “test” itself actually begins. LAVA can support testing pre-deployed systems but this severely restricts the usefulness of such devices for actual kernel or hardware testing.

Automation may need to rely on insecure access. Production builds (hardware and software) take steps to prevent systems being released with known login identities or keys, backdoors and other security holes. Automation relies on at least one of these access methods being exposed, typically a way to access the device as the root or admin user. User identities for login must be declared in the submission and be the same across multiple devices of the same type. These access methods must also be exposed consistently and without requiring any manual intervention or confirmation. For example, mobile devices must be deployed with systems which enable debug access which all production builds will need to block.

Automation relies on remote power control - battery powered devices can be a significant problem in this area. On the one hand, testing can be expected to involve tests of battery performance, low power conditions and recharge support. However, testing will also involve broken builds and failed deployments where the only recourse is to hard reset the device by killing power. With a battery in the loop, this becomes very complex, sometimes involving complex electrical bodges to the hardware to allow the battery to be switched out of the circuit. These changes can themselves change the performance of the battery control circuitry. For example, some devices fail to maintain charge in the battery when held in particular states artificially, so the battery gradually discharges despite being connected to mains power. Devices which have no battery can still be a challenge as some are able to draw power over the serial circuitry or USB attachments, again interfering with the ability of the automation to recover the device from being “bricked”, i.e. unresponsive to the control methods used by the automation and requiring manual admin intervention.

Automation relies on unique identification - all devices in an automation lab must be uniquely identifiable at all times, in all modes and all active power states. Too many components and devices within labs fail to allow for the problems of scale. Details like serial numbers, MAC addresses, IP addresses and bootloader timeouts must be configurable and persistent once configured.

LAVA is not a complete CI solution - even including the hardware support available from some LAVA instances, there are a lot more tools required outside of LAVA before a CI loop will actually work. The triggers from your development workflow to the build farm (which is not LAVA), the submission to LAVA from that build farm are completely separate and outside the scope of this documentation. LAVA can help with the extraction of the results into information for the developers but LAVA output is generic and most teams will benefit from some “frontend” which extracts the data from LAVA and generates relevant output for particular development teams.

Features of CI


How often is the loop to be triggered?

Set up some test builds and test jobs and run through a variety of use cases to get an idea of how long it takes to get from the commit hook to the results being available to what will become your frontend.

Investigate where the hardware involved in each stage can be improved and analyse what kind of hardware upgrades may be useful.

Reassess the entire loop design and look at splitting the testing if the loop cannot be optimised to the time limits required by the team. The loop exists to serve the team but the expectations of the team may need to be managed compared to the cost of hardware upgrades or finite time limits.


How many branches, variants, configurations and tests are actually needed?

Scale has a direct impact on the affordability and feasibility of the final loop and frontend. Ensure that the build infrastructure can handle the total number of variants, not just at build time but for storage. Developers will need access to the files which demonstrate a particular bug or regression.

Scale also provides benefits of being able to ignore anomalies.

Identify how many test devices, LAVA instances and Jenkins slaves are needed. (As a hint, start small and design the frontend so that more can be added later.)


The development of a custom interface is not a small task

Capturing the requirements for the interface may involve lengthy discussions across the development team. Where there are irreconcilable differences, a second frontend may become necessary, potentially pulling the same data and presenting it in a radically different manner.

Include discussions on how or whether to push notifications to the development team. Take time to consider the frequency of notification messages and how to limit the content to only the essential data.

Bisect support can flow naturally from the design of the loop if the loop is carefully designed. Bisect requires that a simple boolean test can be generated, built and executed across a set of commits. If the frontend implements only a single test (for example, does the kernel boot?) then it can be easy to identify how to provide bisect support. Tests which produce hundreds of results need to be slimmed down to a single pass/fail criterion for the bisect to work.
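As a rough sketch of that reduction, the function below collapses a hypothetical list of per-test results from a LAVA frontend into the single boolean that a bisect needs. The result format and the "boot" test name are illustrative assumptions, not LAVA output:

```python
def bisect_verdict(results):
    """Collapse many per-test results into one boolean for a bisect.

    `results` is a hypothetical list of ("test-name", "pass"/"fail")
    tuples pulled from a results frontend; the single criterion here
    is "did the kernel boot", i.e. the boot test passed.
    """
    return any(name == "boot" and outcome == "pass"
               for name, outcome in results)

# A hypothetical result set from one test job: the kernel booted,
# even though a later test suite failed, so the bisect verdict is True.
results = [("boot", "pass"), ("ltp-syscalls", "fail")]
verdict = bisect_verdict(results)
# A `git bisect run` wrapper would map this to exit status 0 (good)
# or 1 (bad).
```

The key design point is that everything except the one criterion under bisection is deliberately ignored, however rich the full result set is.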


This may take the longest of all elements of the final loop

Just what results do the developers actually want and can those results be delivered? There may be requirements to aggregate results across many LAVA instances, with comparisons based on metadata from the original build as well as the LAVA test.

What level of detail is relevant?

Different results for different members of the team or different teams?

Is the data to be summarised and if so, how?


A frontend has the potential to become complex and need long term maintenance and development

Device requirements

At the hardware design stage, there are considerations for the final software relating to how the final hardware is to be tested.


All units of all devices must uniquely identify to the host machine as distinct from all other devices which may be connected at the same time. This particularly covers serial connections but also any storage devices which are exported, network devices and any other method of connectivity.

Example - the WaRP7 integration has been delayed because the USB mass storage does not export a filesystem with a unique identifier, so when two devices are connected, there is no way to distinguish which filesystem relates to which device.

All unique identifiers must be isolated from the software to be deployed onto the device. The automation framework will rely on these identifiers to distinguish one device from up to a dozen identical devices on the same machine. There must be no method of updating or modifying these identifiers using normal deployment / flashing tools. It must not be possible for test software to corrupt the identifiers which are fundamental to how the device is identified amongst the others on the same machine.

All unique identifiers must be stable across multiple reboots and test jobs. Randomly generated identifiers are never suitable.

If the device uses a single FTDI chip which offers a single UART device, then the unique serial number of that UART will typically be a permanent part of the chip. However, a similar FTDI chip which provides two or more UARTs over the same cable would not have serial numbers programmed into the chip but would require a separate piece of flash or other storage into which those serial numbers can be programmed. If that storage is not designed into the hardware, the device will not be capable of providing the required uniqueness.

Example - the WaRP7 exports two UARTs over a single cable but fails to give unique identifiers to either connection, so connecting a second device disconnects the first device when the new tty device replaces the existing one.
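On the host side, this uniqueness requirement is commonly consumed via udev rules keyed on the programmed serial number, giving each connection a stable name regardless of enumeration order. A minimal sketch, assuming an FTDI chip with serial "LAVA001"; the vendor/product IDs shown are the common FTDI FT232 values and the symlink name is a placeholder:

```
# /etc/udev/rules.d/99-lava-serial.rules (illustrative)
SUBSYSTEM=="tty", ATTRS{idVendor}=="0403", ATTRS{idProduct}=="6001", \
    ATTRS{serial}=="LAVA001", SYMLINK+="serial/warp7-01"
```

A rule like this only works when the hardware actually exposes a unique, persistent serial - which is exactly the design requirement described above.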

If the device uses one or more physical ethernet connector(s) then the MAC address for each interface must not be generated randomly at boot. Each MAC address needs to be:

  • persistent - each reboot must always use the same MAC address for each interface.
  • unique - every device of this type must use a unique MAC address for each interface.

If the device uses fastboot, then the fastboot serial number must be unique so that the device can be uniquely identified and added to the correct container. Additionally, the fastboot serial number must not be modifiable except by the admins.

Example - the initial HiKey 960 integration was delayed because the firmware changed the fastboot serial number to a random value every time the device was rebooted.
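A lab can catch this class of problem early by checking the serial numbers reported by the devices for duplicates. A minimal sketch, assuming the usual two-column `fastboot devices` output format of "serial<TAB>fastboot" per line:

```python
def duplicated_serials(fastboot_output):
    """Return serial numbers appearing more than once in the output
    of `fastboot devices` (assumed "<serial>\tfastboot" lines)."""
    serials = [line.split()[0]
               for line in fastboot_output.splitlines() if line.strip()]
    return sorted({s for s in serials if serials.count(s) > 1})

# Two devices sharing a serial (or one device with a randomised
# serial that happens to collide) would show up here and must be
# fixed in firmware before the device can be automated.
clashes = duplicated_serials("0123456789ABCDEF\tfastboot\n"
                             "0123456789ABCDEF\tfastboot\n")
```

The same check applies to any identifier the automation relies on - MAC addresses, USB serials, tty devices - across every unit of every device type in the lab.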


Automation requires more than one device to be deployed - the current minimum is five devices. One device is permanently assigned to the staging environment to ensure that future code changes retain the correct support. In the early stages, this device will be assigned to one of the developers to integrate the device into LAVA. The devices will be deployed onto machines which have many other devices already running test jobs. The new device must not interfere with those devices and this makes some of the device requirements stricter than may be expected.

  • The aim of automation is to create a homogeneous test platform using heterogeneous devices and scalable infrastructure.

  • Do not complicate things.

  • Avoid extra customised hardware

    Relays, hardware modifications and mezzanine boards all increase complexity

    Examples - the X15 needed two relay connections; the 96boards initially needed a mezzanine board whose design was rushed, causing months of serial disconnection issues.

  • More complexity raises failure risk nonlinearly

    Example - The lack of onboard serial meant that the 96boards devices could not be tested in isolation from the problematic mezzanine board. Numerous 96boards devices were deemed to be broken when the real fault lay with intermittent failures in the mezzanine. Removing and reconnecting a mezzanine had a high risk of damaging the mezzanine or the device. Once 96boards devices moved to direct connection of FTDI cables into the connector formerly used by the mezzanine, serial disconnection problems disappeared. The more custom hardware has to be designed / connected to a device to support automation, the more difficult it is to debug issues within that infrastructure.

  • Avoid unreliable protocols and connections

    Example - WiFi is not a reliable deployment method, especially inside a large lab with lots of competing signals and devices.

  • This document is not demanding enterprise or server grade support in devices.

    However, automation cannot scale with unreliable components.

    Example - HiKey 6220 and the serial mezzanine board caused massively complex problems when scaled up in LKFT.

  • Server support typically includes automation requirements as a subset:

    RAS, performance, efficiency, scalability, reliability, connectivity and uniqueness

  • Automation racks have similar requirements to data centres.

  • Things need to work reliably at scale

Scale issues also affect the infrastructure which supports the devices as well as the required reliability of the instance as a whole. It can be difficult to scale up from initial development to automation at scale. Numerous tools and utilities prove to be uncooperative, unreliable or poorly isolated from other processes. One result can be that the requirements of automation look more like the expectations of server-type hardware than of mobile hardware. The reality at scale is that server-type hardware has already had fixes implemented for scalability issues whereas many mobile devices only get tested as standalone units.

Connectivity and deployment methods

  • All test software is presumed broken until proven otherwise
  • All infrastructure and device integration support must be proven to be stable before tests can be reliable
  • All devices must provide at least one method of replacing the current software with the test software, at a level lower than the software being tested.

The simplest method to automate is TFTP over physical ethernet, e.g. U-Boot or UEFI PXE. This also puts the least load on the device and automation hardware when delivering large images.
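For illustration, driving a TFTP deployment from a U-Boot console over serial typically looks something like the transcript below. The load addresses, file names and server IP are placeholders, not LAVA defaults:

```
=> setenv serverip 10.0.0.1
=> tftp 0x82000000 zImage
=> tftp 0x88000000 board.dtb
=> bootz 0x82000000 - 0x88000000
```

Everything here is plain text over serial with predictable prompts, which is precisely what makes this method easy to automate and to interrupt for recovery.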

Manually writing software to SD is not suitable for automation. This tends to rule out many proposed methods for testing modified builds or configurations of firmware in automation.

See for more information on how the requirements of automation affect the hardware design requirements to provide access to SD card filesystems to and from the dispatcher and the device.

Some deployment methods require tools which must be constrained within an LXC. These include but are not limited to:

  • fastboot - due to a common need to have different versions installed for different hardware devices

    Example - Every fastboot device suffers from this problem - any running fastboot process will inspect the entire list of USB devices and attempt to connect to each one, locking out any other fastboot process which may be running at the time, which sees no devices at all.

  • IoT deployment - some deployment tools require patches for specific devices or use tools which are too complex for use on the dispatcher.

    Example - the TI CC3220 IoT device needs a patched build of OpenOCD, the WaRP7 needs a custom flashing tool compiled from a github repository.

Wherever possible, existing deployment methods and common tools are strongly encouraged. New tools are not likely to be as reliable as the existing tools.

Deployments must not make permanent changes to the boot sequence or configuration.

Testing of OS installers may require modifying the installer to not install an updated bootloader or modify bootloader configuration. The automation needs to control whether the next reboot boots the newly deployed system or starts the next test job, for example when a test job has been cancelled, the device needs to be immediately ready to run a different test job.


Automation requires driving the device over serial instead of via a touchscreen or other human interface device. This changes the way that the test is executed and can require the use of specialised software on the device to translate text based commands into graphical inputs.

It is possible to test video output in automation but it is not currently possible to drive automation through video input. This includes BIOS-type firmware interaction. UEFI can be used to automatically execute a bootloader like Grub which does support automation over serial. UEFI implementations which use graphical menus cannot be supported interactively.


The objective is to have automation support which runs test jobs reliably. Reproducible failures are easy to fix but intermittent faults easily consume months of engineering time and need to be designed out wherever possible. Reliable testing means only 3 or 4 test job failures per week due to hardware or infrastructure bugs across an entire test lab (or instance). This can involve thousands of test jobs across multiple devices. Some instances may have dozens of identical devices but they still must not exceed the same failure rate.

All devices need to reach the minimum standard of reliability, or they are not fit for automation. Some of these criteria might seem rigid, but they are not exclusive to servers or enterprise devices. To be useful, mobile and IoT devices need to meet the same standards, even though the software involved and the deployment methods might be different. The reason is that the Continuous Integration strategy remains the same for all devices. The problem is the same, regardless of underlying considerations.

A developer makes a change; that change triggers a build; that build triggers a test; that test reports back to the developer whether that change worked or had unexpected side effects.

  • False positives and false negatives are expensive in terms of wasted engineering time.
  • False positives can arise when not enough of the software is fully tested, or if the testing is not rigorous enough to spot all problems.
  • False negatives arise when the test itself is unreliable, either because of the test software or the test hardware.

This becomes more noticeable when considering automated bisections which are very powerful in tracking the causes of potential bugs before the product gets released. Every test job must give a reliable result or the bisection will not reliably identify the correct change.
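A back-of-the-envelope calculation shows why per-job reliability dominates here. If each job reports correctly with probability r and failures are independent (an assumption for illustration), an n-step bisect only identifies the right commit when all n jobs report correctly:

```python
def bisect_confidence(per_job_reliability, steps):
    """Probability that every job in an n-step bisect reports
    correctly, assuming independent job failures."""
    return per_job_reliability ** steps

# Bisecting ~1000 commits takes about 10 steps (log2(1000)), so
# even 99%-reliable jobs leave roughly a 10% chance that the
# bisect lands on the wrong commit.
confidence = bisect_confidence(0.99, 10)  # ~0.904
```

This is why a lab-wide budget of only a handful of infrastructure failures per week, across thousands of jobs, is not gold-plating but a prerequisite for automated bisection.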

Automation and Risk

Linaro kernel functional test framework (LKFT)

We have seen with LKFT that complexity has a non-linear relationship with the reliability of any automation process. This section aims to set out some guidelines and recommendations on just what is acceptable in the tools needed to automate testing on a device. These guidelines are based on our joint lab and software team experiences with a wide variety of hardware and software.

Adding or modifying any tool has a risk of automation failure

Risk increases non-linearly with complexity. Some of this risk can be mitigated by testing the modified code and the complete system.

Dependencies installed count as code in terms of the risks of automation failure

This is a key lesson learnt from our experiences with LAVA V1. We added a remote worker method, which was necessary at the time to improve scalability, but it massively increased the risk of automation failure simply due to the extra complexity that came with the chosen design. These failures did not just show up in the test jobs which actively used the extra features and tools; they caused problems for all jobs running on the system.

The ability in LAVA V2 to use containers for isolation is a key feature

For the majority of use cases, the small extension of the runtime of the test to set up and use a container is negligible. The extra reliability is more than worth the extra cost.

Persistent containers are themselves a risk to automation

Just as with any persistent change to the system.

Pre-installing dependencies in a persistent container does not necessarily lower the overall risk of failure. It merely substitutes one element of risk for another.

All code changes need to be tested

In unit tests and in functional tests. There is a dividing line where if something is installed as a dependency of LAVA, then when that something goes wrong, LAVA engineers will be pressured into fixing the code of that dependency whether or not we have any particular experience of that language, codebase or use case. Moving that code into a container moves that burden but also makes triage of that problem much easier by allowing debug builds / options to be substituted easily.

Complexity also increases the difficulty of debugging, again in a nonlinear fashion

A LAVA dependency needs a higher bar in terms of ease of triage.

Complexity cannot be easily measured

Although there are factors which contribute.


Large programs which appear as a single monolith are harder to debug than the UNIX model of one utility joined with other utilities to perform a wider task. (This applies to LAVA itself as much as any one dependency - again, a lesson from V1.)

Feature creep

Continually adding features beyond the original scope makes complex programs worse. A smaller codebase will tend to be simpler to triage than a large codebase, even if that codebase is not monolithic.

Targeted utilities are less risky than large environments

A program which supports protocol after protocol after protocol will be more difficult to maintain than 3 separate programs for each protocol. This only gets worse when the use case for that program only requires the use of one of the many protocols supported by the program. The fact that the other protocols are supported increases the complexity of the program beyond what the use case actually merits.

Metrics in this area are impossible

The risks are nonlinear, the failures are typically intermittent. Even obtaining or applying metrics takes up huge amounts of engineering time.

Mismatches in expectations

The use case of automation rarely matches up with the more widely tested use case of the upstream developers. We aren't testing the code flows typically tested by the upstream developers, so we find different bugs, raising the level of risk. Generally, the simpler it is to deploy a device in automation, the closer the test flow will be to the developer flow.

Most programs are written for the single developer model

Some very widely used programs are written to scale, but this is difficult to determine without experience of trying to run them at scale.

Some programs do require special consideration

QEMU would fail most of the guidelines above, but there are mitigating factors:

  • Programs which can be easily restricted to well understood use cases lower the risk of failure. Not all use cases of the same program need to be covered.
  • Programs which have excellent community and especially in-house support also lower the risk of failure. (Having QEMU experts in Linaro is a massive boost for having QEMU as a dispatcher dependency.)

Unfamiliar languages increase the difficulty of triage

This may affect dependencies in unexpected ways. A program which has lots of bindings into a range of other languages becomes entangled in transitions and bugs in those other languages. This commonly delays the availability of the latest version which may have a critical fix for one use case but which fails to function at all in what may seem to be an unrelated manner.

The dependency chain of the program itself increases the risk of failure in precisely the same manner as the program

In terms of maintenance, this can include the build dependencies of the program as those affect delivery / availability of LAVA in distributions like Debian.

Adding code to only one dispatcher amongst many increases the risk of failure on the instance as a whole

By having an untested element which is at variance to the rest of the system.

Conditional dependencies increase the risk

Optional components can be supported but only increase the testing burden by extending the matrix of installations.

Presence of the code in Debian main can reduce the risk of failure

This does not outweigh other considerations - there are plenty of packages in Debian (some complex, some not) which would be an unacceptable risk as a dependency of the dispatcher, fastboot for one. A small python utility from github can be a substantially lower risk than a larger program from Debian which has unused functionality.

Sometimes, "complex" simply means "buggy" or "badly designed"

fastboot is not actually a complex piece of code but we have learnt that it does not currently scale. This is a result of the disparity between the development model and the automation use case. Disparities like that actually equate to complexity, in terms of triage and maintenance. If fastboot was more complex at the codebase level, it may actually become a lower risk than currently.

Linaro as a whole does have a clear objective of harmonising the ecosystem

Adding yet another variant of existing support is at odds with the overall objective of the company. Many of the tools required in automation have no direct effect on the distinguishing factors for consumers. Adding another one "just because" is not a good reason to increase the risk of automation failure. Just as with standards.

Having the code on the dispatcher impedes development of that code

Bug fixes will take longer to be applied because the fix needs to go through a distribution or other packaging process managed by the lab admins. Applying a targeted fix inside an LXC is useful for proving that the fix works.

Not all programs can work in an LXC

LAVA also provides ways to test using those programs by deploying the code onto a test device. e.g. the V2 support for fastmodels involves only deploying the fastmodel inside a LAVA Test Shell on a test device, e.g. x86 or mustang or Juno.

Speed of running a test job in LAVA is important for CI

The goal of speed must give way to the requirement for reliability of automation

Resubmitting a test job due to a reliability failure is more harmful to the CI process than letting tests take longer to execute without such failures. Test jobs which run quickly are easier to parallelize by adding more test hardware.
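The trade-off can be quantified with a simple model. If a job takes t minutes and fails for infrastructure reasons with probability p, and failed jobs are resubmitted until one succeeds, the expected total time follows the geometric distribution, t / (1 - p):

```python
def expected_duration(job_minutes, infra_failure_rate):
    """Expected wall-clock time when a job is resubmitted until it
    passes, assuming independent infrastructure failures."""
    return job_minutes / (1.0 - infra_failure_rate)

# A 30-minute job failing 20% of the time costs 37.5 minutes on
# average; a 35-minute job with a 1% failure rate costs ~35.4 - the
# slower but reliable setup wins, before even counting the triage
# time that each spurious failure consumes.
fast_flaky = expected_duration(30, 0.20)
slow_reliable = expected_duration(35, 0.01)
```

The model understates the real cost, since each spurious failure also generates a false signal that someone has to investigate.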

Modifying software on the device

Not all parts of the software stack can be replaced automatically, typically the firmware and/or bootloader will need to be considered carefully. The boot sequence will have important effects on what kind of testing can be done automatically. Automation relies on being able to predict the behaviour of the device, interrupt that default behaviour and then execute the test. For most devices, everything which executes on the device prior to the first point at which the boot sequence can be interrupted can be considered as part of the primary boot software. None of these elements can be safely replaced or modified in automation.

The objective is to deploy the device such that as much of the software stack can be replaced as possible whilst preserving the predictable behaviour of all devices of this type so that the next test job always gets a working, clean device in a known state.

Primary boot software

For many devices, this is the bootloader, e.g. U-Boot, UEFI or fastboot.

Some devices include support for a Baseboard Management Controller (BMC) which allows the bootloader and other firmware to be updated even if the device is bricked. The BMC software itself must then be considered the primary boot software; it cannot be safely replaced.

All testing of the primary boot software will need to be done by developers using local devices. SDMux was an idea which only fitted one specific set of hardware, the problem of testing the primary boot software is a hydra. Adding customised hardware to try to sidestep the primary boot software always increases the complexity and failure rates of the devices.

It is possible to divide the pool of devices into some which only ever use known versions of the primary boot software controlled by admins and other devices which support modifying the primary boot software. However, this causes extra work when processing the results, submitting the test jobs and administering the devices.

A secondary problem here is that it is increasingly common for the methods of updating this software to be esoteric, hacky, restricted and even proprietary.

  • Click-through licences to obtain the tools

  • Greedy tools which hog everything in /dev/bus/usb

  • NIH tools which are almost the same as existing tools but add vendor-specific "functionality"

  • GUI tools

  • Changing jumpers or DIP switches,

    Often in inaccessible locations which require removal of other ancillary hardware

  • Random, untrusted, compiled vendor software running as root

  • The need to press and hold buttons and watch for changes in LED status.

We've seen all of these - in various combinations - just in 2017, as methods of getting devices into a mode where the primary boot software can be updated.

Copyright 2018 Neil Williams

Available under CC BY-SA 3.0:

by Neil Williams at June 06, 2018 14:19

June 18, 2018

Senthil Kumaran

lava-dispatcher docker images - part 1

Introduction, Details and Preparation

The Linaro Automated Validation Architecture, a.k.a. the LAVA project, has released official docker images for lava-dispatcher only containers. This blog post series explains how to use these images in order to run independent LAVA workers along with the devices attached to them. The blog post series is split into three parts as follows:

  1. lava-dispatcher docker images - part 1 - Introduction, Details and Preparation
  2. lava-dispatcher docker images - part 2 - Docker based LAVA Worker running pure LXC job
  3. lava-dispatcher docker images - part 3 - Docker based LAVA Worker running Nexus 4 job with and without LXC Protocol

Before getting into the details of running these images, let us see how these images are organized and what are the packages available via these images.

The lava-dispatcher only docker images will be officially supported by the LAVA project team and there will be regular releases of these images whenever there are updates or new releases. As of this writing there are two images released: production and staging. These docker images are based on the Debian Stretch operating system, which is the recommended operating system for installing LAVA.

lava-dispatcher production docker images

The production docker image of lava-dispatcher is based on the official production-repo of the LAVA project. The production-repo holds the latest stable packages released by the LAVA team for each of the LAVA components. The production docker image is available at the following link:

Whenever there is a production release from the LAVA project, a corresponding image is created with the release's tag name. The latest tag as of this writing is 2018.5-3. To see what these production docker images are built with, have a look at the Dockerfile in the image's source repository.

lava-dispatcher staging docker images

The staging docker image of lava-dispatcher is based on the official staging-repo of the LAVA project. The staging-repo holds the latest packages built every day by the LAVA team for each of the LAVA components, which is also a source of bleeding edge unreleased software. The staging docker image, which is built daily, is available at the following link:

Whenever a successful daily build of the staging packages is available, a docker image will be made available with the tag name 'latest'. Hence, at any point in time there will be only one tag, i.e., latest, in the staging docker image location. To see what these staging docker images are built with, have a look at the Dockerfile in the image's source repository.


Unlike regular installations of LAVA workers, installations via the above docker images use a package called lava-lxc-mocker instead of the lxc Debian package. lava-lxc-mocker is a pseudo implementation of lxc which mocks the lxc commands without actually running them on the machine, while providing exactly the same output as the original lxc commands. This package exists to provide a (pseudo) alternative to lxc and to avoid the overhead of running nested containers, which simplifies things without losing the ability to run, unmodified, LAVA job definitions that have the LXC protocol defined.
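The mocker idea can be sketched in a few lines of Python (purely illustrative: the function names and output strings below are invented for this sketch and are not the real lava-lxc-mocker implementation):

```python
# Minimal sketch of the "mocker" idea behind lava-lxc-mocker (hypothetical
# names): reproduce the observable output of an lxc command without actually
# creating any container on the machine.

def mock_lxc_create(name):
    # A real lxc-create would build a container rootfs; the mocker only
    # mimics the output so LXC-protocol job definitions run unmodified.
    return "Created container %s" % name

def mock_lxc_info(name):
    # Pretend the container exists and is running.
    return "Name: %s\nState: RUNNING" % name

print(mock_lxc_create("lxc-docker-slave-01"))
print(mock_lxc_info("lxc-docker-slave-01"))
```

Because no real container is started, nested-container overhead disappears while the job definition sees the output it expects.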

Having seen the details about the lava-dispatcher only docker images, let us now look at three different use cases where jobs are run within a docker container, with and without using the LXC protocol, on an attached device such as a Nexus 4 phone.

In demonstrating all these use cases we will use the lava-dispatcher only staging docker images. We will use an encrypted LAVA instance as the LAVA master to which the docker based LAVA worker will connect; it accepts connections only from authenticated LAVA workers. Read more about how to configure encrypted communication between a LAVA master and a LAVA worker in the LAVA documentation. The following is a preparation step to connect the docker based LAVA slave to the encrypted LAVA master instance.

Creating slave certificate

We will name the docker based LAVA worker 'docker-slave'. Let us create a slave certificate which will be shared with the LAVA master. On an existing LAVA worker, issue the following command to create a slave certificate:

stylesen@hanshu:~$ sudo /usr/share/lava-dispatcher/ \
Creating the certificate in /etc/lava-dispatcher/certificates.d
 - docker-slave-1.key
 - docker-slave-1.key_secret

We can see the certificates were created successfully in /etc/lava-dispatcher/certificates.d. As explained in the documentation, copy the public component of the above slave certificate to the master instance, as shown below:

stylesen@hanshu:~$ scp /etc/lava-dispatcher/certificates.d/docker-slave-1.key \

docker-slave-1.key                            100%  364     1.4KB/s   00:00   

Then log in to the master to do the actual copy as follows (since we need sudo rights to copy directly, this is done in two steps):

stylesen@hanshu:~$ ssh
stylesen@codehelp:~$ sudo mv /tmp/docker-slave-1.key /etc/lava-dispatcher/certificates.d/
[sudo] password for stylesen:
stylesen@codehelp:~$ sudo ls -alh /etc/lava-dispatcher/certificates.d/docker-slave-1.key
-rw-r--r-- 1 stylesen stylesen 364 Jun 18 00:05 /etc/lava-dispatcher/certificates.d/docker-slave-1.key

Now, we have the slave certificate copied to appropriate location on the LAVA master. For convenience, on the host machine from where we start the docker based LAVA worker, copy the slave certificates to a specific directory as shown below:

stylesen@hanshu:~$ mkdir docker-slave-files
stylesen@hanshu:~$ cd docker-slave-files/
stylesen@hanshu:~/docker-slave-files$ cp /etc/lava-dispatcher/certificates.d/docker-slave-1.key* .

Similarly, copy the master certificate's public component to the above folder, in order to enable communication.

stylesen@hanshu:~/docker-slave-files$ scp \ .

master.key                                    100%  364     1.4KB/s   00:00   
stylesen@hanshu:~/docker-slave-files$ ls -alh
total 20K
drwxr-xr-x  2 stylesen stylesen 4.0K Jun 18 05:48 .
drwxr-xr-x 17 stylesen stylesen 4.0K Jun 18 05:45 ..
-rw-r--r--  1 stylesen stylesen  364 Jun 18 05:45 docker-slave-1.key
-rw-r--r--  1 stylesen stylesen  313 Jun 18 05:45 docker-slave-1.key_secret
-rw-r--r--  1 stylesen stylesen  364 Jun 18 05:48 master.key

We are all set with the required files to start and run our docker based LAVA workers.

... Continue Reading Part 2

by stylesen at June 06, 2018 02:30

lava-dispatcher docker images - part 2

This is part 2 of the three part blog post series on lava-dispatcher only docker images. If you haven't read part 1 already, then read it first.

Docker based LAVA Worker running pure LXC job

This is the first use case in which we will look at starting a docker based LAVA worker and running a job that requests a LXC device type. The following command is used to start a docker based LAVA worker,

stylesen@hanshu:~$ sudo docker run \
-v /home/stylesen/docker-slave-files:/fileshare \
-v /var/run/docker.sock:/var/run/docker.sock -itd \
-e HOSTNAME='docker-slave-1' -e MASTER='tcp://' \
-e SOCKET_ADDR='tcp://' -e LOG_LEVEL='DEBUG' \
-e ENCRYPT=1 -e MASTER_CERT='/fileshare/master.key' \
-e SLAVE_CERT='/fileshare/docker-slave-1.key_secret' -p 2222:22 \
--name ld-latest linaro/lava-dispatcher-staging-stretch-amd64:latest

Unable to find image 'linaro/lava-dispatcher-staging-stretch-amd64:latest' locally
latest: Pulling from linaro/lava-dispatcher-staging-stretch-amd64
cc1a78bfd46b: Pull complete
5ddb65a5b8b4: Pull complete
41d8dcd3278b: Pull complete
071cc3e7e971: Pull complete
39bedb7bda2f: Pull complete
Digest: sha256:1bc7c7b2bee09beda4a6bd31a2953ae80847c706e8500495f6d0667f38fe0c9c
Status: Downloaded newer image for linaro/lava-dispatcher-staging-stretch-amd64:latest

Let's have a closer look at the 'docker run' command above and see what options are used:

'-v /home/stylesen/docker-slave-files:/fileshare' - mounts the directory /home/stylesen/docker-slave-files from the host machine inside the docker container at the location /fileshare. This location is used to exchange files between the host and the container.

'-v /var/run/docker.sock:/var/run/docker.sock' - similarly, the docker socket file is exposed within the container. This is optional and may be required for advanced job runs and use cases.

For options such as '-itd', '-p' and '--name', refer to the docker run documentation to know what these options do.

'-e' - This option sets environment variables inside the docker container being run. The following environment variables are set in the above command line; they are consumed by the entry point script inside the container, which starts the lava-slave daemon based on their values.

  1. HOSTNAME - Name of the slave
  2. MASTER - Main master socket
  3. SOCKET_ADDR - Log socket
  4. LOG_LEVEL - Log level, defaults to INFO
  5. ENCRYPT - Encrypt messages
  6. MASTER_CERT - Master certificate file
  7. SLAVE_CERT - Slave certificate file
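As a rough sketch of what the container's entry point does with these variables (illustrative Python only: the lava-slave option names and the master addresses below are assumptions, not the actual script shipped in the image):

```python
# Hypothetical sketch: how an entry point script might turn the docker '-e'
# environment variables into a lava-slave command line. Option names are
# assumed for illustration; consult the real image's entry point script.
def build_slave_command(env):
    cmd = ["lava-slave",
           "--hostname", env["HOSTNAME"],
           "--master", env["MASTER"],
           "--socket-addr", env["SOCKET_ADDR"],
           "--log-level", env.get("LOG_LEVEL", "INFO")]  # default INFO
    if env.get("ENCRYPT") == "1":
        # Encryption needs both certificate files passed through /fileshare.
        cmd += ["--encrypt",
                "--master-cert", env["MASTER_CERT"],
                "--slave-cert", env["SLAVE_CERT"]]
    return cmd

# Placeholder master address; substitute your own instance.
example = {
    "HOSTNAME": "docker-slave-1",
    "MASTER": "tcp://master.example.org:5556",
    "SOCKET_ADDR": "tcp://master.example.org:5555",
    "ENCRYPT": "1",
    "MASTER_CERT": "/fileshare/master.key",
    "SLAVE_CERT": "/fileshare/docker-slave-1.key_secret",
}
print(" ".join(build_slave_command(example)))
```

This also shows why the certificates were copied into the /fileshare mount earlier: the daemon inside the container needs to reach them by path.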

We can see the docker based LAVA worker is started and running,

stylesen@hanshu:~$ sudo docker ps -a
CONTAINER ID        IMAGE                                               \
  COMMAND             CREATED              STATUS              PORTS    \

522f07964981        linaro/lava-dispatcher-staging-stretch-amd64:latest \
  "/"    About a minute ago   Up 58 seconds       \>22/tcp   ld-latest


If everything goes fine, we can see the LAVA master receiving ping messages from the above LAVA worker as seen below on the LAVA master logs:

stylesen@codehelp:~$ sudo tail -f /var/log/lava-server/lava-master.log
2018-06-18 00:24:30,878    INFO docker-slave-1 => HELLO
2018-06-18 00:24:30,878 WARNING New dispatcher <docker-slave-1>
2018-06-18 00:24:34,069   DEBUG lava-logs => PING(20)
2018-06-18 00:24:36,138   DEBUG docker-slave-1 => PING(20)

The worker will also get listed in the web UI. The docker based LAVA worker docker-slave-1 is up and running. Let us add an LXC device to this worker, on which we will run our LXC protocol based job. The name of the LXC device we will add to docker-slave-1 is 'lxc-docker-slave-01'. Create a jinja2 template file for lxc-docker-slave-01 and copy it to /etc/lava-server/dispatcher-config/devices/ on the LAVA master instance:

stylesen@codehelp:~$ cat \

{% extends 'lxc.jinja2' %}
{% set exclusive = 'True' %}
stylesen@codehelp:~$ ls -alh \

-rw-r--r-- 1 lavaserver lavaserver 56 Jun 18 00:36 \


In order to add the above device lxc-docker-slave-01 to the LAVA master database and associate it with our docker based LAVA worker docker-slave-1, login to the LAVA master instance and issue the following command:

stylesen@codehelp:~$ sudo lava-server manage devices add \
--device-type lxc --worker docker-slave-1 lxc-docker-slave-01


The device will now be listed as part of the worker docker-slave-1 and can be seen in the web UI.

The LXC job we will submit to the above device is lxc.yaml, a normal LXC job requesting a LXC device type and running a basic smoke test on a Debian based LXC device.

stylesen@harshu:/tmp$ lavacli -i lava.codehelp jobs submit lxc.yaml 

NOTE: lavacli is the official command line tool for interacting with LAVA instances. Read more about lavacli in its documentation.

Thus job 2486 has been submitted successfully to the LAVA instance, and it ran successfully. This job used lava-lxc-mocker instead of lxc, as can be seen from the job log.

Read part 1...                                                                                                                     ... Continue Reading part 3

Read all parts of this blog post series from below links:

  1. lava-dispatcher docker images - part 1 - Introduction, Details and Preparation
  2. lava-dispatcher docker images - part 2 - Docker based LAVA Worker running pure LXC job
  3. lava-dispatcher docker images - part 3 - Docker based LAVA Worker running Nexus 4 job with and without LXC Protocol

by stylesen at June 06, 2018 02:30

lava-dispatcher docker images - part 3

This is part 3 of the three part blog post series on lava-dispatcher only docker images. If you haven't read part 2 already, then read it first.

Docker based LAVA Worker running Nexus 4 job with LXC protocol

This is the second use case, in which we will look at starting a docker based LAVA worker and running a job that requests a Nexus 4 device type with the LXC protocol. The following command is used to start a docker based LAVA worker:

stylesen@hanshu:~$ sudo docker run \
-v /home/stylesen/docker-slave-files:/fileshare \
-v /var/run/docker.sock:/var/run/docker.sock -v /dev:/dev -itd --privileged \
-e HOSTNAME='docker-slave-1' -e MASTER='tcp://' \
-e SOCKET_ADDR='tcp://' -e LOG_LEVEL='DEBUG' \
-e ENCRYPT=1 -e MASTER_CERT='/fileshare/master.key' \
-e SLAVE_CERT='/fileshare/docker-slave-1.key_secret' -p 2222:22 \
--name ld-latest linaro/lava-dispatcher-staging-stretch-amd64:latest


There is not much difference between the above command and the one used in use case one, except for a couple of new options.

'-v /dev:/dev' - mounts the host machine's /dev directory inside the docker container at the location /dev. This is required when we deal with actual (physical) devices, in order to access them from within the docker container.

'--privileged' - this option is required to allow seamless passthrough and device access from within the container.

Once we have the docker based LAVA worker up and running with the new options in place, we can add a new nexus4 device to it. The name of the nexus4 device we will add to docker-slave-1 is 'nexus4-docker-slave-01'. Create a jinja2 template file for nexus4-docker-slave-01 and copy it to /etc/lava-server/dispatcher-config/devices/ on the LAVA master instance,

stylesen@codehelp:~$ sudo cat \

{% extends 'nexus4.jinja2' %}
{% set adb_serial_number = '04f228d1d9c76f39' %}
{% set fastboot_serial_number = '04f228d1d9c76f39' %}
{% set device_info = [{'board_id': '04f228d1d9c76f39'}] %}
{% set fastboot_options = ['-u'] %}
{% set flash_cmds_order = ['update', 'ptable', 'partition', 'cache', 'userdata', 'system', 'vendor'] %}

{% set exclusive = 'True' %}
stylesen@codehelp:~$ sudo ls -alh \

-rw-r--r-- 1 lavaserver lavaserver 361 Jun 18 01:32 \


In order to add the above device nexus4-docker-slave-01 to the LAVA master database and associate it with our docker based LAVA worker docker-slave-1, login to the LAVA master instance and issue the following command:

stylesen@codehelp:~$ sudo lava-server manage devices add \
--device-type nexus4 --worker docker-slave-1 nexus4-docker-slave-01


The device will now be listed as part of the worker docker-slave-1 and can be seen in the web UI.

The job definition we will submit to the above device is nexus4.yaml, a normal job requesting a Nexus 4 device type and running a simple test on the device using the LXC protocol.

stylesen@harshu:/tmp$ lavacli -i lava.codehelp jobs submit nexus4.yaml 

Thus job 2491 has been submitted successfully to the LAVA instance, and it ran successfully.

Docker based LAVA Worker running Nexus 4 job without LXC protocol

This is the third use case, in which we will look at starting a docker based LAVA worker and running a job that requests a Nexus 4 device type without the LXC protocol. The following command, exactly the same as in use case two, is used to start a docker based LAVA worker:

stylesen@hanshu:~$ sudo docker run \
-v /home/stylesen/docker-slave-files:/fileshare \
-v /var/run/docker.sock:/var/run/docker.sock -v /dev:/dev -itd --privileged \
-e HOSTNAME='docker-slave-1' -e MASTER='tcp://' \
-e SOCKET_ADDR='tcp://' -e LOG_LEVEL='DEBUG' \
-e ENCRYPT=1 -e MASTER_CERT='/fileshare/master.key' \
-e SLAVE_CERT='/fileshare/docker-slave-1.key_secret' -p 2222:22 \
--name ld-latest linaro/lava-dispatcher-staging-stretch-amd64:latest


We will use the same device added for use case two, i.e., 'nexus4-docker-slave-01', to execute this job.

The job we will submit to the above device is nexus4-minus-lxc.yaml, a normal job requesting a Nexus 4 device type and running a simple test on the device without calling any LXC protocol.

stylesen@harshu:/tmp$ lavacli -i lava.codehelp jobs submit nexus4-minus-lxc.yaml 

Thus job 2492 has been submitted successfully to the LAVA instance, and it ran successfully.

Hope this blog post series helps you get started with the lava-dispatcher only docker images and with running your own docker based LAVA workers. If you have any doubts, questions or comments, feel free to email the LAVA team at lava-users [@] lists [dot] linaro [dot] org

Read part 2 ...

Read all parts of this blog post series from below links:

  1. lava-dispatcher docker images - part 1 - Introduction, Details and Preparation
  2. lava-dispatcher docker images - part 2 - Docker based LAVA Worker running pure LXC job
  3. lava-dispatcher docker images - part 3 - Docker based LAVA Worker running Nexus 4 job with and without LXC Protocol

by stylesen at June 06, 2018 02:30

June 17, 2018

Bin Chen

Understand Kubernetes 2: Operation Model

In the last article, we focused on the components in the work nodes. In this one, we'll switch our focus to the user and the components in the master node.

Operation Model

From the user's perspective, the model is quite simple: the user declares a State he wants the system to be in, and then it is k8s's job to achieve it.
The user sends Resources and Operations to k8s using the REST API, which is served by the API server inside the master node; the request is put into a state store (implemented using etcd). Depending on the type of resource, different Controllers are delegated to do the job.
The exact Operations available depend on the Resource type, but in most cases it means CRUD. For the create operation, there is a Specification defining the attributes of the resource to be created.
Here are some examples:
  • create a Pod, according to a Pod spec.
  • create a Deployment called mypetstore, according to a Deployment spec.
  • update the mypetstore deployment with a new container image.
Each Resource (also called an Object) has three pieces of information: Spec, Status and Metadata, and these are saved in the state store.
  • Spec is specified by the user for resource creation and update; it is the desired state of the resource.
  • Status is updated by the k8s system and queried by the user; it is the actual state of the resource.
  • Metadata is partly specified by the user and can be updated by the k8s system; it holds the labels of the resource.
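This three-way split can be sketched as follows (illustrative Python only, not the real k8s API machinery; the field values are made up):

```python
# Sketch of the three pieces of information every k8s resource carries:
# the user writes the spec, the system fills in the status, and metadata
# (labels etc.) can be touched by both sides.
class Resource:
    def __init__(self, kind, spec, metadata=None):
        self.kind = kind
        self.spec = spec                 # desired state, set by the user
        self.status = {}                 # actual state, set by the system
        self.metadata = metadata or {}   # labels, owned by both sides

deploy = Resource("Deployment",
                  spec={"replicas": 3, "image": "nginx:1.7.9"},
                  metadata={"name": "mypetstore", "app": "nginx"})
deploy.status["readyReplicas"] = 0   # updated by the system as pods start
```

The kind field is what lets the right Controller pick the object up, exactly as described for the Deployment spec later in the article.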
The class diagram: a Resource carries a Spec (created by the user), a Status (updated by the k8s system) and Metadata (which may be updated by both); a K8sUser defines the Spec and provides the Metadata, while a Controller performs the CRUD operations (Create, Update, Delete, GetStatus, plus customized operations) that control the Resources.

Sequence Diagram:

Let's see what really happens when you type kubectl create -f deployment/spec.yaml:
The sequence: kubectl turns the command into a REST call (a POST to the deployments endpoint) to the API Server; the API Server saves the spec in the StateStore and acknowledges asynchronously; a Controller, unblocked by the new state, does the work to achieve it and writes new information back (e.g. pod and node bindings); the WorkNodes are in turn unblocked by the new state and do their part to achieve it.


A k8s cluster is managed and accessed through a predefined API. kubectl is a client of that API; it converts the shell command to a REST call, as shown in the above sequence diagram.
You can build your own tools using those APIs to add functionality that is currently not available. Since the API is versioned and stable, your tools will stay portable.
Portability and extensibility are the most important benefits k8s brings. In other words, k8s is awesome not only because it does awesome things itself but because it enables you and others to build awesome things on top of it.


A Controller's job is to make sure the actual state of an object matches its desired state.
The idea of matching the actual state to the desired state is the driving philosophy of k8s's design. It doesn't sound particularly novel, given that most declarative tools follow the same idea; for example, both Terraform and Ansible are declarative. Where k8s differs is that it keeps monitoring the system status and makes sure the desired state is always maintained. That means all the goodness of availability and scalability is built into k8s.
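The reconciliation idea can be sketched as a toy loop (illustrative only, not the real controller code):

```python
# Toy reconciliation: compute the action needed to move the actual replica
# count toward the desired count -- the driving idea behind k8s controllers.
def reconcile(desired, actual):
    """Return (action, count) to converge actual toward desired."""
    if actual < desired:
        return ("start", desired - actual)
    if actual > desired:
        return ("stop", actual - desired)
    return ("noop", 0)

# A controller keeps watching: even if a pod dies later, the next
# iteration notices the drift and repairs it.
state = {"desired": 3, "actual": 1}
action, count = reconcile(state["desired"], state["actual"])
print(action, count)   # starts 2 replicas to reach the desired state
```

The crucial point is that this runs continuously, not once: the loop is what turns a one-shot declarative tool into a self-healing system.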
The desired state is defined using a Spec, and that is what the user interacts with. It is k8s's job to do whatever you requested.
The most common specs are:
  • Deployments for stateless persistent apps (e.g. http servers)
  • StatefulSets for stateful persistent apps (e.g. databases)
  • Jobs for run-to-completion apps (e.g. batch jobs).
Let's take a close look at the Deployments Spec.

Deployment Spec

Below is a deployment spec that can be used to create a deployment of the nginx server with 3 replicas, each of which uses nginx:1.7.9 as the container image, with the application listening on port 80.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
This should be simple to understand. Compared with a simple Pod/Container specification, it has an extra replicas field. The kind is set to Deployment so that the right Controller will pick it up.
Lots of specs have a nested PodSpec, as shown below, since at the end of the day, k8s is a Pod/Container management system.
The diagram: the DeploymentController in the k8s master creates, updates and monitors Deployments in the cluster of work nodes. A Deployment carries a DeploymentSpec (replicas, selector, strategy, template) and a DeploymentStatus; the template is a PodTemplateSpec embedding a PodSpec (containers, volumes), mirroring the Pod's own Kind/Metadata/Spec/Status structure.
For a complete reference of the field available for deployment spec, you can check here.


In this article, we looked at the components of the master node and the overall operation model of k8s: drive and maintain the actual state of the system so that it matches the desired state specified by the user through the various object specifications. In particular, we took a close look at the most used spec, the Deployment spec.

by Bin Chen ( at June 06, 2018 01:34

June 10, 2018

Ard Biesheuvel

UEFI driver pitfalls and PC-isms

Even though Intel created UEFI (still known by its TLA EFI at the time) for Itanium initially, x86 is by far the dominant architecture when it comes to UEFI deployments in the field, and even though the spec itself is remarkably portable to architectures such as ARM, there are a lot of x86 UEFI drivers out there that cut corners when it comes to spec compliance. There are a couple of reasons for this:

  • the x86 architecture is not as heterogeneous as other architectures, and while the form factor may vary, most implementations are essentially PCs;
  • the way the PC platform organizes its memory and especially its DMA happens to result in a configuration that is rather forgiving when it comes to UEFI spec violations.

UEFI drivers provided by third parties are mostly intended for plugin PCI cards, and are distributed as binary option ROM images. There are very few open source UEFI drivers available (apart from the EHCI/UHCI/xHCI class drivers and some drivers for niche hardware available in Tianocore), and even if they were widely available, you would still need to get them into the flash ROM of your particular card, which is not a practice hardware vendors are eager to support.
This means the gap between theory and practice is larger than we would like, and this becomes apparent when trying to run such code on platforms that deviate significantly from a PC.

The theory

As an example, here is some code from the EDK2 EHCI (USB2) host controller driver.

  Status = PciIo->AllocateBuffer (PciIo, AllocateAnyPages,
                     EfiBootServicesData, Pages, &BufHost, 0);
  if (EFI_ERROR (Status)) {
    return NULL;    // allocation failed; error path elided in the original
  }

  Bytes = EFI_PAGES_TO_SIZE (Pages);
  Status = PciIo->Map (PciIo, EfiPciIoOperationBusMasterCommonBuffer,
                     BufHost, &Bytes, &MappedAddr, &Mapping);
  if (EFI_ERROR (Status) || (Bytes != EFI_PAGES_TO_SIZE (Pages))) {
    goto FREE_BUFFER;
  }

  Block->BufHost  = BufHost;
  Block->Buf      = (UINT8 *) ((UINTN) MappedAddr);
  Block->Mapping  = Mapping;

This is a fairly straight-forward way of using UEFI’s PCI DMA API, but there are a couple of things to note here:

  • PciIo->Map () may be called with the EfiPciIoOperationBusMasterCommonBuffer mapping type only if the memory was allocated using PciIo->AllocateBuffer ();
  • the physical address returned by PciIo->Map () in MappedAddr may deviate from both the virtual and physical addresses as seen by the CPU (note that UEFI maps VA to PA 1:1);
  • the size of the actual mapping may deviate from the requested size.

However, none of this matters on a PC, since its PCI is cache coherent and 1:1 mapped. So the following code will work just as well:

  Status = gBS->AllocatePages (AllocateAnyPages, EfiBootServicesData,
                  Pages, &BufHost);
  if (EFI_ERROR (Status)) {
    return NULL;    // allocation failed; error path elided in the original
  }

  Block->BufHost  = BufHost;
  Block->Buf      = BufHost;

So let’s look at a couple of ways a non-PC platform can deviate from a PC when it comes to the layout of its physical address space.

DRAM starts at address 0x0

On a PC, DRAM starts at address 0x0, and most of the 32-bit addressable physical region is used for memory. Not only does this mean that inadvertent NULL pointer dereferences from UEFI code may go entirely unnoticed (one example of this is the NVidia GT218 driver), it also means that PCI devices that only support 32-bit DMA (or need a little kick to support more than that) will always be able to work. In fact, most UEFI implementations for x86 explicitly limit PCI DMA to 4 GB, and most UEFI PCI drivers don’t bother to set the mandatory EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE attribute for >32 bit DMA capable hardware either.

On ARM systems, the amount of available 32-bit addressable RAM may be much smaller, or it may even be absent entirely. In the latter case, hardware that is only 32-bit DMA capable can only work if an IOMMU is present and wired into the PCI root bridge driver by the platform, or if DRAM is not mapped 1:1 in the PCI address space. But in general, it should be expected that ARM platforms use at least 40 bits of address space for DMA, and that drivers for 64-bit DMA capable peripherals enable this capability in the hardware.

PCI DMA is cache coherent

Although not that common, it is possible and permitted by the UEFI spec for PCI DMA to be non cache coherent. This is completely transparent to the driver, provided that it uses the APIs correctly. For instance, PciIo->AllocateBuffer () will return an uncached buffer in this case, and the Map () and Unmap () methods will perform cache maintenance under the hood to keep the CPU’s and the device’s view of memory in sync. Obviously, this use case breaks spectacularly if you cut corners like in the second example above.

PCI memory is mapped 1:1 with the CPU

On a PC, the two sides of the PCI host bridge are mapped 1:1. As illustrated in the example above, this means you can essentially ignore the device or bus address returned from the PciIo->Map () call, and just program the CPU physical address into the DMA registers/rings/etc. However, non-PC systems may have much more extravagant PCI topologies, and so a compliant driver should use the appropriate APIs to obtain these addresses. Note that this is not limited to inbound memory accesses (DMA) but also applies to outbound accesses, and so a driver should not interpret BAR fields from the PCI config space directly, given that the CPU side mapping of that BAR may be at a different address altogether.

PC has strongly ordered memory

Whatever. UEFI is uniprocessor anyway, and I don’t remember seeing any examples where this mattered.

Using encrypted memory for DMA

Interestingly, and luckily for us in the ARM world, there are other reasons why hardware vendors are forced to clean up their drivers: memory encryption. This case is actually rather similar to the non cache coherent DMA case, in the sense that the allocate, map and unmap actions all involve some extra work performed by the platform under the hood. Common DMA buffers are allocated from unencrypted memory, and mapping or unmapping involve decryption or encryption in place depending on the direction of the transfer (or bounce buffering if encryption in place is not possible, in which case the device address will deviate from the host address like in the non-1:1 mapped PCI case above). Cutting corners here means that attempted DMA transfers will produce corrupt data, usually a strong motivator to get your code fixed.


The bottom line is really that the UEFI APIs appear to be able to handle anything you throw at them when it comes to unconventional platform topologies, but this only works if you use them correctly, and having been tested on a PC doesn’t actually prove all that much in this regard.

by ardbiesheuvel at June 06, 2018 17:45

Bin Chen

Understand Kubernetes 1: Container Orchestration

By far, we know the benefits of the container and how the container is implemented using Linux primitives.
If we only need to run one or two containers, we should be satisfied; that's all we need. But if we want to run dozens or thousands of containers to build a stable and scalable web service that is able to serve millions of transactions per second, we have more problems to solve. To name a few:
  • scheduling: Which host to put a container?
  • update: How to update the container image and ensure zero downtime?
  • self-healing: How to detect and restart a container when it is down?
  • scaling: How to add more containers when more processing capacity is needed?
None of these issues is new; only the subject has changed to containers, rather than physical servers (in the old days) or, more recently, virtual machines. The functionalities described above are usually referred to as Container Orchestration.
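To make the scheduling question above concrete, here is a toy scheduler (purely illustrative; a real scheduler weighs many more factors than free CPU):

```python
# Toy scheduler: place a container on the node with the most free capacity
# that can still fit its resource request. Node names and capacities below
# are made-up examples.
def schedule(nodes, request):
    fitting = {name: free for name, free in nodes.items() if free >= request}
    if not fitting:
        return None                        # stays pending until capacity appears
    return max(fitting, key=fitting.get)   # pick the least-loaded node

nodes = {"node-a": 2.0, "node-b": 3.5, "node-c": 0.5}  # free CPUs per node
print(schedule(nodes, request=1.0))   # node-b has the most headroom
```

Self-healing and scaling then reduce to re-running this placement whenever a container dies or the desired count grows.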


kubernetes, abbreviated as k8s, is one of many container orchestration solutions. But, as of mid-2018, many would agree the competition is over: k8s is the de facto standard. I think this is good news, freeing you from the hassle of picking from many options and worrying about investing in the wrong one. K8s is completely open source, with a variety of contributors from big companies to individual contributors.
k8s has a very good documentation, mostly here and here.
In this article, we'll take a different perspective. Instead of starting with how to use the tools, we'll start with the very object the k8s platform is trying to manage - the container. We'll try to see what extra things k8s can do compared with a single-machine container runtime such as runc or docker, and how k8s integrates with those container runtimes.
However, we can't do that without an understanding of the high-level architecture of k8s.

At the highest level, k8s is a master and slave architecture, with a master node controlling multiple slave or work nodes. The master and slave nodes together are called a k8s cluster. The user talks to the cluster using the API, which is served by the master. We intentionally left the master node diagram empty, with a focus on how things are connected on the work node.
The master talks to the work nodes through the kubelet, which primarily runs and stops Pods through the CRI, which is connected to a container runtime. The kubelet also monitors Pods for liveness and pulls debug information and logs.
We'll go over the components in a little more detail below.


There are two types of nodes: master nodes and slave nodes. A node can be either a physical machine or a virtual machine.
You can jam the whole k8s cluster into a single machine, such as using minikube.


Each work node has a kubelet; it is the agent that enables the master node to talk to the slaves.
The responsibilities of the kubelet include:
  • Creating/running the Pod
  • Probe Pods
  • Monitor Nodes/Pod
  • etc.
We can go nowhere without first introducing Pod.


In k8s, the smallest scheduling or deployment unit is the Pod, not the container. But there shouldn't be any cognitive overhead if you already know containers well. The benefit of the Pod is to add another wrapper on top of the container, guaranteeing that closely coupled containers end up scheduled on the same host, so that they can share a volume or network that would otherwise be difficult or inefficient to implement if they were on different hosts.
A pod is a group of one or more containers, with shared storage and network, and a specification for how to run the containers. A pod’s contents are always co-located and co-scheduled and run in a shared context, such as namespaces and cgroups.
For details, see here.
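As a minimal sketch of that idea (the names, images and commands here are hypothetical, not from this article), a two-container Pod sharing an emptyDir volume could look like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo        # hypothetical name
spec:
  volumes:
  - name: shared-data
    emptyDir: {}                  # scratch volume that lives as long as the Pod
  containers:
  - name: writer
    image: busybox                # hypothetical image
    command: ["sh", "-c", "echo hello > /data/msg && sleep 3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  - name: reader
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
```

Because both containers belong to one Pod, they are co-scheduled on the same node, so sharing the volume (and the Pod's network namespace) is trivial.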

Configuring, Scheduling and Running Pods

You configure a Pod using a yaml file, called a spec. As you can imagine, the Pod spec includes a configuration for each container, covering the image and the runtime configuration.
With this spec, k8s will pull the image and run the container, just as you would do with a simple docker command. Nothing particularly innovative here.
What the description above misses is that in the spec we also describe the resource requirements for the containers/Pod, and k8s uses that information, along with the current cluster status, to find a suitable host for the Pod. This is called Pod scheduling. The functionality and effectiveness of the scheduler may be overlooked; the Borg paper mentions that a better scheduler could save millions of dollars at Google scale.
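To illustrate (a sketch with made-up names and values, not a definitive spec), the resource requirements the scheduler consumes are declared per container:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.15      # hypothetical image
    resources:
      requests:            # what the scheduler uses to pick a node
        cpu: 250m          # 0.25 of a CPU core
        memory: 64Mi
      limits:              # hard caps enforced by the runtime
        cpu: 500m
        memory: 128Mi
```

The scheduler places the Pod only on a node with enough unreserved capacity to satisfy the requests; the limits are enforced at runtime.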
In the spec, we can also specify the Liveness and Readiness Probes.

Probe Pods

The kubelet uses liveness probes to know when to restart a container, and readiness probes to know when a container is ready to start accepting traffic. The first is the foundation of self-healing, the second of load balancing.
Without k8s, you would have to do all this on your own. Time and $$ saved.
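As a sketch of how such probes might be declared in a container spec (the paths and port are hypothetical, assuming the container serves HTTP health endpoints):

```yaml
# fragment of a container entry in a Pod spec
livenessProbe:             # kubelet restarts the container if this fails
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:            # traffic is routed here only while this passes
  httpGet:
    path: /ready
    port: 8080
  periodSeconds: 5
```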

Container Runtime: CRI

k8s isn't bound to a particular container runtime; instead, it defines an interface, the CRI, for image management and the container runtime. Anything that implements the interface can be plugged into k8s or, to be more accurate, into the kubelet.
There are multiple implementations of the CRI. Docker has cri-containerd, which plugs containerd/docker into the kubelet. cri-o is another implementation, which wraps runc for the container runtime service and wraps a bunch of other libraries for the image service. Both use CNI for network setup.
Assuming a Pod/container is assigned to a particular node, the kubelet on that node operates as follows (kubelet → CRI client → CRI server → image service / runtime service, e.g. runc):

1. The kubelet, through its CRI client, sends a create request to the CRI server (over gRPC).
2. The image service pulls the image from a registry.
3. The image is unpacked to create the rootfs.
4. A runtime config (config.json) is created from the Pod spec.
5. The runtime service (runc) runs the container.


We went through why we need a container orchestration system, and then the high-level architecture of k8s, with a focus on the components of the worker node and their integration with the container runtime.

by Bin Chen ( at June 06, 2018 07:04

June 04, 2018

Tom Gall

Developing Android apps on ChromeOS

At Google I/O 2018 I was lucky enough to attend the “What’s new on ChromeOS” session, at the end of which they handed out not only groovy socks but also 75% off a Pixelbook.

During the session however Google had all sorts of things to say about enabling both Linux and Android development on ChromeOS. Now these are two things the world has needed and wanted for some time.

The Chromebook offer is for the midrange i5 based Chromebook. I received mine on Friday so I’ve had a few days with it.


Setting up for Android development, meaning having Android Studio running as well as being able to run/debug Android apps on the Chromebook, wasn’t too hard.

Instructions are here but they are wrong in a few spots.

First, do turn on the developer (unstable) channel and turn on Linux. BUT in order to debug Android apps via Android Studio, you then need to turn on developer mode on your Pixelbook (or similar device). You can’t debug Android apps over USB (yet), so I really view this as an essential step.

Developer mode of course wipes the device so yeah, takes a bit longer to get to the end goal. You’ll live. I’ll link to the ‘snarky’ guide because there’s reasons not to enable developer mode if you don’t know what you’re doing. Remember you JUST need to enable developer mode, nothing else from this guide.

With developer mode on, again turn on Linux mode, and now follow the rest of the guide. When you get to the point where you need to Mount Linux Files, you first need to enable the ssh server in the Debian environment:

> sudo rm sshd_not_to_be_run

> sudo service ssh restart

Ok now go back to the guide.

Then when you get to the Android Studio part, make sure you download the current preview, 3.2. If you don’t you’ll end up in a world of frustration where your new shiny Pixelbook will be at great risk to you throwing it across the room.

That done, you’ll find app development is pretty darn smooth. I’ve pounded out a couple of simple apps this weekend and everything ‘just worked’. I’ll note that your very first compile will take a while. This is down to some gradle files being downloaded in the background. In real-world terms, my first “hello world” app took about 3 minutes to build. After that, more like a second or two.


by tgallfoo at June 06, 2018 14:03

June 01, 2018

Alex Bennée

dired-rsync 0.4 released

I started hacking on this a while back but I’ve finally done the house-keeping tasks required to make it a proper grown up package.

dired-rsync is a simple command which you can use to trigger an rsync copy from within dired. This is especially useful when you want to copy across large files from a remote server without locking up Emacs/Tramp. The rsync just runs as an inferior process in the background.

Today was mainly a process of cleaning up the CI and fixing any issues with it. I’d still like to add some proper tests but the whole thing is interactive and that seems to be tricky for Emacs to test. Anyway I’ve now tagged 0.4 so it will be available from MELPA Stable once it rebuilds. You can of course grab the building edge from MELPA any time 😉

by Alex at June 06, 2018 17:12

May 29, 2018

Leif Lindholm

Running UniFi Controller on arm64 (or ppc64el)

Sometime last year I decided to switch my home wireless infrastructure over to Ubiquiti UniFi. These aren't just standalone access points; they rely on controller software - either run on Someone Else's Computer (just no), or their UniFi Controller on a machine of your choice. Since the controller is written in Java, it will run pretty much anywhere that can also run its other dependencies. They even provide their own Debian/Ubuntu repository, and a pretty howto on setting it up.

UniFi on armhf

I initially actually ran this on armhf/Stretch, and still have a post in draft state on how to achieve this (since one of the prerequisites is MongoDB, no longer supported on armhf), but probably won't bother publishing it since it is a bit of a dead end.

(Short short version: grab the 2.6.10 sources from Ubuntu Xenial and fix the most awfully broken bits of code until it actually compiles. This includes the parts of the testsuite that try to verify undefined behaviour of the programming languages used. ?!?)

But since I now have always-on arm64 machines in my home network, I decided it was time to move to the architecture that has been my main development target for the past 8 years...

UniFi on arm64

Unsurprisingly, this hit a snag; while the package itself is completely architecture-independent, the Debian repository format is not. With the instructions from the howto, apt expects to find $ARCHIVE_ROOT/dists/$DISTRIBUTION/ubiquiti/binary-$arch/Packages.gz to tell it which packages are available in the repo and what their dependencies are. Which works fine when there is a populated entry for $arch. There is for (at least) i386, amd64 and armhf - but not for arm64 or ppc64el.

The $ARCHIVE_ROOT to use is specified in the above-linked howto. I'm not sure why it does not specify https (which also works), but I will use the actually documented variant below.


The package itself is fully architecture independent. So what we can do instead is grab the Packages.gz for armhf and have a peek:

$ wget
$ zcat Packages.gz
Package: unifi
Version: 5.7.23-10670
Architecture: all
Depends: binutils, coreutils, jsvc (>=1.0.8) , mongodb-server (>=2.4.10) | mongodb-10gen (>=2.4.14) | mongodb-org-server (>=2.6.0), java8-runtime-headless, adduser, libcap2
Conflicts: unifi-controller
Provides: unifi-controller
Replaces: unifi-controller
Installed-Size: 113416
Maintainer: UniFi developers <>
Priority: optional
Section: java
Filename: pool/ubiquiti/u/unifi/unifi_5.7.23-10670_all.deb
Size: 64571866
SHA256: e7b60814c27d85c13e54fc3041da721cc38ad21bb0a932bdfe810c2ad3855392
SHA1: 49f16c3d0c6334cb2369cd2ac03ef3f0d0dfe9e8
MD5sum: 478b56465bf652993e9870912713fab2
Description: Ubiquiti UniFi server
 Ubiquiti UniFi server is a centralized management system for UniFi suite of devices.
 After the UniFi server is installed, the UniFi controller can be accessed on any
 web browser. The UniFi controller allows the operator to instantly provision thousands
 of UniFi devices, map out network topology, quickly manage system traffic, and further
 provision individual UniFi devices.

Download the package

The Filename: field tells us the current unifi packages can be found at pool/ubiquiti/u/unifi/unifi_5.7.23-10670_all.deb - relative to the $ARCHIVE_ROOT, not the binary-$arch - so we can download it with

$ wget

Verify the integrity of the package by running

$ sha256sum unifi_5.7.23-10670_all.deb

and comparing the output with the value from the SHA256: field.
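As a self-contained illustration of that check (using a stand-in file, since the real .deb isn't downloaded here), `sha256sum -c` can do the comparison for you:

```shell
# Create a stand-in file; with the real download you would use
# unifi_5.7.23-10670_all.deb and the hash from the SHA256: field instead.
printf 'not the real package\n' > unifi_demo.deb
expected=$(sha256sum unifi_demo.deb | awk '{print $1}')
# sha256sum -c reads "<hash>  <filename>" lines and verifies each file
echo "$expected  unifi_demo.deb" | sha256sum -c -
```

A non-zero exit status (and a "FAILED" line) means the download is corrupt and should be fetched again.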

Install dependencies and UniFi

The Depends: field tells us we need

  • binutils
  • coreutils

(both of these are likely to be installed already, unless, like me, you had just accidentally tried to install a broken home-built toolchain package on the host instead of in a chroot ... oops!)

  • jsvc
  • mongodb-server
  • java8-runtime-headless
  • adduser (also likely to already be installed)
  • libcap2

Resolving this is straightforward enough, with perhaps the single exception of java8-runtime-headless, which is a virtual package. But if you try to install that, apt will let you know, and point out which available packages provide it. So, as a one-liner:

$ sudo apt-get install jsvc mongodb-server openjdk-8-jre-headless libcap2

Then we're ready to:

$ sudo dpkg -i unifi_5.7.23-10670_all.deb


Nothing architecture-specific about this: go to https://$HOST:8443 to set up. In my case, I just imported my downloaded backup from the armhf server and had everything back up and running quickly without manual intervention.

Final notes

Of course, this will leave you without automatic updates, so you'll need to go periodically have a look at one of the actually enabled architectures for version changes and manually install updates.

And if you have an account on the Ubiquiti forum, consider upvoting my proposal to add the missing architectures to the repository.

by Leif Lindholm at May 05, 2018 11:24

May 20, 2018

Naresh Bhat

A dream come true: Himalayan Odyssey - 2016 (Day-6 to 10)

Day-6: Leh

Leh, a high-desert city in the Himalayas, is the capital of the Leh region in northern India’s Jammu and Kashmir state. Originally a stop for trading caravans, Leh is now known for its Buddhist sites and nearby trekking areas. Massive 17th-century Leh Palace, modeled on the Dalai Lama’s former home (Tibet’s Potala Palace), overlooks the old town’s bazaar and maze like lanes.

Leh city

Apricot seller 

Vegetable seller

Leh is at an altitude of 3,524 metres (11,562 ft), and is connected via National Highway 1 to Srinagar in the southwest and to Manali in the south via the Leh-Manali Highway. In 2010, Leh was heavily damaged by the sudden floods caused by a cloud burst.

Dry fruits shop

Indian spices seller
Leh was an important stopover on trade routes along the Indus Valley between Tibet to the east, Kashmir to the west and also between India and China for centuries. The main goods carried were salt, grain, pashm or cashmere wool, charas or cannabis resin from the Tarim Basin, indigo, silk yarn and Banaras brocade.

Day-7: Leh To Hunder

This was the day we had all been eagerly waiting for: riding to the Hunder (Nubra) valley via the highest motorable road, the "Khardung La" pass. The pass sits at an elevation of 5,602 metres (18,379 ft) in the Ladakh region and is 39.7 km from Leh, which is at an altitude of 3,524 metres (11,562 ft). You can imagine the steep uphill climb; the journey from Leh to Khardung La was a painful 3-hour ride up a winding road. Khardung La is the highest motorable pass in the world.

Khardungla top

Highest motorable pass ..Yuppie..reached..:)
Best known as the gateway to the Nubra and Shyok valleys in the Ladakh region of Jammu and Kashmir, the Khardung La Pass, commonly pronounced as Khardzong La, is a very important strategic pass into the Siachen glacier.

The pristine air, the scenic beauty one sees all around and the feeling that you are on top of the world has made Khardung La a very popular tourist attraction in the past few years.

 The first 24 km, as far as the South Pullu check point, are paved. From there to the North Pullu check point about 15 km beyond the pass the roadway is primarily loose rock, dirt, and occasional rivulets of snow melt.

Nubra valley is a beautiful place where you can see sand dunes, water and green apricot trees. We stayed at Hunder in a tent. After reaching the valley we had hot snacks and went for double-humped camel rides.
Nubra river

Sand dunes @Nubra valley

Nubra is a mix of everything in summer: water, trees, sand dunes, rocks and mountains. But it is completely frozen for 6 months.
We had a campfire and party night.

Party all night..:)
The Siachen glacier water was flowing just beside our tent. The villagers use the flowing water directly. We were just 80kms away from Siachen glacier.

Tents just beside glacier water flow

You can directly drink glacier water

Karnataka state boys outside Royal Camp..Ready to ride out

Day-8: Hunder To Leh

Hundar is a village in the Leh district of Jammu and Kashmir, India. It is located in the Nubra tehsil, on the bank of Shyok River. The Hunder Monastery is located here. Hundar was once the capital of former Nubra kingdom.

Indian Army check post
You can see the Nubra river flowing in the background in the picture below.

Nubra valley view
The Nubra was the last destination of our journey. Now it was time to start the return journey, headed back to Leh via the Khardung La pass. When I was halfway up Khardung La it started snowing. Hands almost frozen, roads slippery; could not have asked for more 😊. It was a struggle riding up to Khardung La pass; because of the low oxygen the throttle response was very poor.

It was fun to ride highest motorable pass in rain and snow
I finally reached Khardung La pass, the highest motorable road. The snowfall had only increased. Sipping lemon tea gave a good feeling like never before. We took a couple of pictures and started descending. A headache was already hitting back due to high-altitude sickness. At a couple of places we even faced landslides. When snow settles on the mountains, landslides start on their own because of the weight of the snow.

It started raining heavily when we reached south pullu check point.  We took a break and had a lunch. After the rain stopped, we continued our journey and reached Hotel Namgyal Palace in Leh.

Day-9: Leh To Debring (Tso Kar)

Today we rode back towards Debring, which is near Moreplanes. We stayed in a camp near a salt lake called Tso Kar. We were also about to touch the world's second highest pass, called "Tanglang La". High altitude, sub-zero temperatures and cold wind are pretty common, and one needs to gather all one's physical and mental strength to withstand them and ride along.

We had our first break and regroup point at a place called Rumtse, a small village even by Ladakh standards. Rumtse is the first human settlement on the way from Lahaul to Ladakh after the Taglang pass. It is located 70 km east of Leh and is the starting point for the trek to Tso Moriri. Rumtse lies in the Rupshu Valley, sandwiched between Tibet, Zanskar and Ladakh.
Tea break
The Tanglang La pass is located in the Zanskar range, at the northernmost tip of India, and is famed as the second highest mountain pass in the Leh-Ladakh region. It sits at an altitude of around 17,000 ft, on the Manali-Leh highway. At such an altitude, the Tanglang La pass is like the gateway to Leh.

The pass provides for a scenic view as it sways away from the main highway. Ample vegetation on both sides further cools the already chilled air and at times, the sharp bends provide just the adrenaline push adventurists crave.

Second highest motorable pass

Second highest pass
 After reaching Moreplanes we had a group photo session.

Ready for group photo

60+ riders lined up for group photo at Moreplanes
Next we continued riding towards the Tso Kar camp site. There were no roads; it is a very flat area full of dust and small stones. Approximately 15 km later we reached the camp. We had evening snacks and tea, and rested at Tso Kar for the night.

Tsokar camp site
It was a nightmare because of the sub-zero temperature and cold, windy weather. In the early morning we could not bear to touch the cold water to brush or bathe, and there was no hot water available, since we were camped in the middle of nowhere. You can see nothing but flat ground for miles and miles.

Day-10: Debring (Tso Kar) To Keylong

The distance between Tso Kar and Keylong is around 236 km, but the time taken to cover it is around 7+ hours. The road conditions are very bad, so we just needed to focus on the road and try to cover more distance with fewer breaks. I stopped only at Moreplanes and took some pics.

A view from Moreplanes

Dusty and tested thoroughly..:)
We reached the hotel at Keylong by 5 PM. The weather was very chilly and the location beautiful. I visited the local city market and purchased items like a winter cap and gloves. The local market is very small and the roads are narrow.

Motorcycles lined up outside Keylong hotel for check-up

Waiting for my turn
There was a fantastic view from our room balcony. We also completed a round of motorcycle check-ups, because the next day's ride would be very challenging, with more water crossings...:)

To be continued.....:)

by Naresh ( at May 05, 2018 15:56

May 11, 2018

Leif Lindholm

Turn the page

On a long and lonesome highway... Err, nevermind.

Anyway, after nearly 12 and a half years, Friday 11 May 2018 will be my last day at ARM. I'm not going very far - after a short break I will be joining Linaro as a full time employee on 21 May.

I will keep my roles in LEG and TianoCore.

I joined ARM back in December 2005 to work in Support (cough, sorry, "applications engineering") for the Embedded Software series of products - which mainly meant TrustZone Software and a little bit of the software components required to make use of the Jazelle DBX (Direct Bytecode eXecution or Dogs BolloX, depending on context) extensions.

As is traditional, the job quickly turned into something quite different, and I spent the next few years supporting development boards and writing and delivering ARM software training. Both with a particular focus on multicore, following the release of the ARM11MPCore and Cortex-A9. I also spent a while in the compilation tools support team. It's impossible to overstate what an amazing time this was for learning. New things. All the time. Solving real problems for real people.

Then followed a short period (9 months) in the TechPubs group, where I worked on standalone documentation to help fill the gaps between the architecture specification and what a programmer is trying to find out. But at this point I had somewhat recovered from my startup years and was itching to get back to development.

I found a role advertised looking for someone to work on multicore software enablement. This sounded like fun, and I ended up getting the job. That was the last time I changed roles in ARM, but (as is traditional) the role itself kept changing. After a period including SWP emulation, Open MPI and Android, I ended up first being and then leading the original ARM server software project. Meanwhile Linaro was created, and after identifying that the IP paranoia overhead of running the server software project in-house was prohibitive, I first started working unofficially with the not-yet-announced Linaro Enterprise Group from around Q2 2012, and then became a full-time assignee into LEG from 1 January 2013.

I will look back at my time at ARM with fondness, and am making this move because I believe it will actually enable me to be more useful to the ARM ecosystem.

So long, and thanks for all the chips.

by Leif Lindholm at May 05, 2018 10:49

May 03, 2018

Neil Williams

Upgrading the home server rack

My original home server rack is being upgraded to use more ARM machines as the infrastructure of the lab itself. I've also moved house, so there is more room for stuff and kit. This has allowed space for a genuine machine room. I will be using that to host test devices which do not need manual intervention despite repeated testing. (I'll also have the more noisy / brightly illuminated devices in the machine room.) The more complex devices will sit on shelves in the office upstairs. (The work to put the office upstairs was a major undertaking involving my friends Steve and Andy - embedding ethernet cables into the walls of four rooms in the new house. Once that was done, the existing ethernet cable into the kitchen could be fixed (Steve) and then connected to my new Ubiquiti AP, a present from Steve and Andy.)

Before I moved house, I found that the wall-mounted 9U communications rack was too confined once there were a few devices in use. A lot of test devices now need many cables to each device. (Power, ethernet, serial, second serial and USB OTG, and then add a relay board with its own power and cables onto the DUT....)

Devices like beaglebone-black, cubietruck and other U-Boot devices will go downstairs, albeit in a larger Dell 24U rack purchased from Vince who has moved to a larger rack in his garage. Vince also had a gigabit 16 port switch available which will replace the Netgear GS108 8-port Gigabit Ethernet Unmanaged Switch downstairs.

I am currently still using the same microserver to run various other services around the house (firewall, file server etc.): HP 704941-421 ProLiant Micro Server

I've now repurposed a reconditioned Dell Compact Form Factor desktop box to be my main desktop machine in my office. This was formerly my main development dispatcher and the desktop box was chosen explicitly to get more USB host controllers on the motherboard than is typically available with an x86 server. There have been concerns that this could be causing bottlenecks when running multiple test jobs which all try to transfer several hundred megabytes of files over USB-OTG at the same time.

I've now added a SynQuacer Edge ARM64 Server to run a LAVA dispatcher in the office, controlling several of the more complex devices to test in LAVA - Hikey 620, HiKey 960 and Dragonboard 410c via a Cambrionix PP15s to provide switchable USB support to enable USB network dongles attached to the USB OTG port which is also used for file deployment during test jobs. There have been no signs of USB bottlenecks at this stage.

This arm64 machine then supports running test jobs on the development server used by the LAVA software team as azrael.codehelp. It runs headless from the supplied desktop tower case. I needed to use a PCIe network card from TPlink to get the device operating but this limitation should be fixed with new firmware. (I haven't had time to upgrade the firmware on that machine yet, still got the rest of the office to kit out and the rack to build.) The development server itself is an ARM64 virtual machine, provided by the Linaro developer cloud and is used with a range of other machines to test the LAVA codebase, doing functional testing.

The new dispatcher is working fine, I've not had any issues with running test jobs on some of the most complex devices used in LAVA. I haven't needed to extend the RAM from the initial 4G and the 24 cores are sufficient for the work I've done using the machine so far.

The rack was moved into place yesterday (thanks to Vince & Steve) but the patch panel which Andy carefully wired up is not yet installed and there are cables everywhere, so a photo will have to wait. The plan now is to purchase new UPS batteries and put each of the rack, the office and the ISP modem onto dedicated UPS. The objective is not to keep the lab running in the event of a complete power cut lasting hours, just to survive brown-outs and power cuts lasting a minute or two, e.g. when I finally get around to labelling up the RCD downstairs. (The new house was extended a few years before I bought it and the organisation of the circuits is a little unexpected in some parts of the house.)

Once the UPS batteries are in, the microserver, a PDU, the network switch and patch panel, as well as the test devices, will go into the rack in the machine room. I've recently arranged to add a second SynQuacer server into the rack - this time fitted into a 1U server case. (Definite advantage of the new full depth rack over the previous half-depth comms box.) I expect this second SynQuacer to have a range of test devices to complement our existing development staging instance which runs the nightly builds which are available for both amd64 and arm64.

I'll post again once I've got the rest of the rack built and the second SynQuacer installed. The hardest work, by far, has been fitting out the house for the cabling. Setting up the machines, installing and running LAVA has been trivial in comparison. Thanks to Martin Stadler for the two SynQuacer machines and the rest of the team in Linaro Enterprise Group (LEG) for getting this ARM64 hardware into useful roles to support wider development. With the support from Debian for building the arm64 packages, the new machine simply sits on the network and does "TheRightThing" without fuss or intervention. I can concentrate on the test devices and get on with things. The fact that the majority of my infrastructure now runs on ARM64 servers is completely invisible to my development work.

by Neil Williams at May 05, 2018 07:05


April 25, 2018

Peter Maydell

Debian on QEMU’s Raspberry Pi 3 model

For the QEMU 2.12 release we added support for a model of the Raspberry Pi 3 board (thanks to everybody involved in developing and upstreaming that code). The model is sufficient to boot a Debian image, so I wanted to write up how to do that.

Things to know before you start

Before I start, some warnings about the current state of the QEMU emulation of this board:

  • We don’t emulate the boot rom, so QEMU will not automatically boot from an SD card image. You need to manually extract the kernel, initrd and device tree blob from the SD image first. I’ll talk about how to do that below.
  • We don’t have an emulation of the BCM2835 USB controller. This means that there is no networking support, because on the raspi devices the ethernet hangs off the USB controller.
  • Our raspi3 model will only boot AArch64 (64-bit) kernels. If you want to boot a 32-bit kernel you should use the “raspi2” board model.
  • The QEMU model is missing models of some devices, and others are guesswork due to a lack of documentation of the hardware; so although the kernel I tested here will boot, it’s quite possible that other kernels may fail.

You’ll need the following things on your host system:

  • QEMU version 2.12 or better
  • libguestfs (on Debian and Ubuntu, install the libguestfs-tools package)

Getting the image

I’m using the unofficial preview images described on the Debian wiki.

$ wget
$ xz -d 2018-01-08-raspberry-pi-3-buster-PREVIEW.img.xz

Extracting the guest boot partition contents

I use libguestfs to extract files from the guest SD card image. There are other ways to do this but I think libguestfs is the easiest to use. First, check that libguestfs is working on your system:

$ virt-filesystems -a 2018-01-08-raspberry-pi-3-buster-PREVIEW.img

If this doesn’t work, then you should sort that out first. A couple of common reasons I’ve seen:

  • if you’re on Ubuntu then your kernels in /boot are installed not-world-readable; you can fix this with sudo chmod 644 /boot/vmlinuz*
  • if you’re running Virtualbox on the same host it will interfere with libguestfs’s attempt to run KVM; you can fix that by exiting Virtualbox

Now you can ask libguestfs to extract the contents of the boot partition:

$ mkdir bootpart
$ guestfish --ro -a 2018-01-08-raspberry-pi-3-buster-PREVIEW.img -m /dev/sda1

Then at the guestfish prompt type:

copy-out / bootpart/

This should have copied various files into the bootpart/ subdirectory.

Run the guest image

You should now be able to run the guest image:

$ qemu-system-aarch64 \
  -kernel bootpart/vmlinuz-4.14.0-3-arm64 \
  -initrd bootpart/initrd.img-4.14.0-3-arm64 \
  -dtb bootpart/bcm2837-rpi-3-b.dtb \
  -M raspi3 -m 1024 \
  -serial stdio \
  -append "rw earlycon=pl011,0x3f201000 console=ttyAMA0 loglevel=8 root=/dev/mmcblk0p2 net.ifnames=0 rootwait memtest=1" \
  -drive file=2018-01-08-raspberry-pi-3-buster-PREVIEW.img,format=raw,if=sd

and have it boot to a login prompt (the root password for this Debian image is “raspberry”).

There will be several WARNING logs and backtraces printed by the kernel as it starts; these will have a backtrace like this:

[  145.157957] [] uart_get_baud_rate+0xe4/0x188
[  145.158349] [] pl011_set_termios+0x60/0x348
[  145.158733] [] uart_change_speed.isra.3+0x50/0x130
[  145.159147] [] uart_set_termios+0x7c/0x180
[  145.159570] [] tty_set_termios+0x168/0x200
[  145.159976] [] set_termios+0x2b0/0x338
[  145.160647] [] tty_mode_ioctl+0x358/0x590
[  145.161127] [] n_tty_ioctl_helper+0x54/0x168
[  145.161521] [] n_tty_ioctl+0xd4/0x1a0
[  145.161883] [] tty_ioctl+0x150/0xac0
[  145.162255] [] do_vfs_ioctl+0xc4/0x768
[  145.162620] [] SyS_ioctl+0x8c/0xa8

These are ugly but harmless. (The underlying cause is that QEMU doesn’t implement the undocumented ‘cprman’ clock control hardware, and so Linux thinks that the UART is running at a zero baud rate and complains.)

by pm215 at April 04, 2018 08:07

April 07, 2018

Alex Bennée

Working with dired

I’ve been making a lot more use of dired recently. One use case is copying files from my remote server to my home machine. Doing this directly from dired, even with the power of tramp, is a little too time consuming and potentially locks up your session for large files. While browsing reddit r/emacs I found a reference to this post that spurred me to look at spawning rsync from dired some more.

Unfortunately the solution is currently sitting in a pull-request to what looks like an orphaned package. I also ran into some other problems with the handling of where rsync needs to be run from so rather than unpicking some unfamiliar code I decided to re-implement everything in my own package.

I’ve still got some debugging to do to get it to cleanly handle multiple sessions as well as a more detailed mode-line status. Once I’m happy I’ll tag a 0.1 and get it submitted to MELPA.

While getting more familiar with dired I also came up with this little helper:

(defun my-dired-frame (directory)
  "Open up a dired frame which closes on exit."
  (interactive)
  (switch-to-buffer (dired directory))
  (local-set-key
   (kbd "C-x C-c")
   (lambda ()
     (interactive)
     (save-buffers-kill-terminal 't))))

Which is paired with a simple alias in my shell setup:

alias dired="emacsclient -a '' -t -e '(my-dired-frame default-directory)'"

This works really nicely for popping up a dired frame in your terminal window and cleaning itself up when you exit.

by Alex at April 04, 2018 10:12

April 02, 2018

Gema Gomez

What to make next?

One of the most complicated parts of the fiber crafts, and a part that normally takes me at least a couple of weeks, is the planning phase. As soon as you are done with a project, you try to figure out what you want to do next. The first step is to decide what I feel inspired to make:

  • Quick project
  • Long and intricate project
  • Use existing yarn project
  • Use existing pattern project
  • Learn a new skill only project
  • Garment or accessory project
  • Something I have done before or something new
  • Who will be the owner? Is it for me? Someone in my family? Friends? A special occasion?

In my case, it depends on the time of the year, the plans I have for the coming months, whether I have stumbled upon something super cool that I could make for someone and how much spare time I have over the coming months.

The first thing I decided is that I want to use this gorgeous variegated yarn I bought a few months back:


I only have one skein; it is 100% merino, Unic from Bergere. The weight of it is DK, but it comes as 4-ply untangled fibre, so it will be like working with 4 strands of fingering yarn at once. I have 660 m of material (200 g).

With this amount of yarn I cannot really make an adult size garment, but I could make a rather gorgeous complement, either cowl, infinity scarf or a shawl. I could also make a garment for a child or a baby. The changing color of the fibre also makes for a nice color effect if I were to find the right pattern for it.


Knitting or crochet?

Either one would work for me this time around.

What are you making? For whom?

Something easy and quick that showcases the yarn's color. Probably a cowl/shawl/infinity scarf for myself. Not in the mood for learning a new skill, so a pattern with some known techniques will have to do.

Which patterns are worth considering? Are there any nice examples out there of projects made with this yarn?

I looked at the patterns showcased by the manufacturer of the yarn, but none of them were really my cup of tea. I kept searching until I found a book of shawls with patterns specific to variegated yarn like this one: The Shawl Project: Book Four, by The Crochet Project. I bought the book yesterday and am trying to decide which one to make.

Now the only question left is to figure out which of the projects in the book I like best and get crocheting. Will post a picture of the project when it is finished!

by Gema Gomez at April 04, 2018 23:00

April 01, 2018

Gema Gomez

Olca Cowl

As part of my yarn shopping spree in San Francisco last October, I bought some Berroco Mykonos (66% linen, 26% nylon, 8% cotton), color hera (8570). I decided to make a crocheted Olca Cowl with it, it required 2 x 50g hanks (260 m):

Olca cowl finished

The pattern was followed verbatim, I used a 3.75mm (F) hook as per pattern description:

hook and yarn

This was a quick and fun pattern to work; I managed to finish it in about a month of spare time. I recommend it for any advanced crochet beginner. Once the first three rows are worked, the rest is mechanical and quick to grow.

by Gema Gomez at April 04, 2018 23:00

March 30, 2018

Naresh Bhat

Benchmarking BigData


The purpose of this blog is to explain the different types of benchmark tools available for BigData components. We gave a talk on BigData benchmarking at Linaro Connect Las Vegas in 2016. This is my effort to collect all of that information in one place, with more detail.

We have to remember that all the BigData components/benchmarks were developed with the x86 architecture in mind:
  • So, in the first place, we should make sure that all the relevant benchmark tools compile and run on AArch64.
  • Then we should go ahead and try to optimize them for AArch64.
Different types of benchmarks and standards
  • Micro benchmarks: To evaluate specific lower-level, system operations
    • E.g. HiBench, HDFS DFSIO, AMP Lab Big Data Benchmark, CALDA, Hadoop Workload Examples (sort, grep, wordcount and Terasort, Gridmix, Pigmix)
  • Functional/Component benchmarks: Specific to low level function
    • E.g. Basic SQL: Individual SQL operations like select, project, join, Order-by..
  • Application level
    • Bigbench
    • Spark bench
The tables below summarize the different benchmark efforts.

Benchmark Efforts - Microbenchmarks
  • HDFS DFSIO: generate, read, write, append, and remove data for MapReduce jobs. Metrics: Execution Time, Throughput
  • HiBench: Sort, WordCount, TeraSort, PageRank, K-means, Bayes classification, Index. Software stacks: Hadoop and Hive. Metrics: Execution Time, Throughput, resource utilization
  • AMPLab benchmark: part of CALDA workloads (scan, aggregate and join) and PageRank. Software stacks: Hive, Tez. Metric: Execution Time
  • CALDA: load, scan, select, aggregate and join data, count URL links. Software stacks: Hadoop, Hive. Metric: Execution Time

Benchmark Efforts - TPC
  • TPCx-HS: HSGen, HSDataCheck, HSSort and HSValidate. Metrics: Performance, price and energy
  • TPC-H: datawarehousing operations. Software stacks: Hive, Pig. Metrics: Execution Time, Throughput
  • TPC-DS: decision support benchmark; data loading, queries and maintenance. Software stacks: Hive, Pig. Metrics: Execution Time, Throughput

Benchmark Efforts - Synthetic
  • SWIM: synthetic user-generated MapReduce jobs of reading, writing, shuffling and sorting. Metrics: multiple metrics
  • GridMix: synthetic and basic operations to stress test the job scheduler and compression/decompression. Metrics: Memory, Execution Time, Throughput
  • PigMix: 17 Pig-specific queries. Software stacks: Hadoop, Pig. Metric: Execution Time
  • MRBench: MapReduce benchmark complementary to TeraSort; datawarehouse operations with 22 TPC-H queries. Metric: Execution Time
  • NNBench: load testing the namenode and HDFS I/O with small payloads
  • Spark-Bench: CPU, memory, shuffle and IO intensive workloads; Machine Learning, Streaming, Graph Computation and SQL workloads. Metrics: Execution Time, Data process rate
  • BigBench: interactive-based queries based on synthetic data. Software stacks: Hadoop, Spark. Metric: Execution Time

Benchmark Efforts - BigDataBench
  • Workloads:
    1. Micro benchmarks (sort, grep, WordCount)
    2. Search engine workloads (index, PageRank)
    3. Social network workloads (connected components (CC), K-means and BFS)
    4. E-commerce site workloads (relational database queries (select, aggregate and join), collaborative filtering (CF) and Naive Bayes)
    5. Multimedia analytics workloads (Speech Recognition, Ray Tracing, Image Segmentation, Face Detection)
    6. Bioinformatics workloads
  • Software stacks: Hadoop, DBMSs, NoSQL systems, Hive, Impala, Hbase, MPI, Libc, and other real-time analytics systems
  • Metrics: Memory, CPU (MIPS, MPKI - misses per kilo instructions)

Let's go through each of the benchmarks in detail.

Hadoop benchmark and test tool:

The hadoop source comes with a number of benchmarks. TestDFSIO, nnbench and mrbench are in the hadoop-*test*.jar file, and TeraGen, TeraSort and TeraValidate are in the hadoop-*examples*.jar file in the hadoop source code.

You can check it using the command

       $ cd /usr/local/hadoop
       $ bin/hadoop jar hadoop-*test*.jar
       $ bin/hadoop jar hadoop-*examples*.jar

While running the benchmarks you might want to use the time command, which measures the elapsed time. This saves you the hassle of navigating to the hadoop JobTracker interface. The relevant metric is the real value in the first row.

      $ time hadoop jar hadoop-*examples*.jar ...
      real    9m15.510s
      user    0m7.075s
      sys     0m0.584s

TeraGen, TeraSort and TeraValidate

This is the most well known Hadoop benchmark. TeraSort sorts the data as fast as possible, exercising both the HDFS and MapReduce layers of a hadoop cluster. The TeraSort benchmark consists of 3 steps: generate input via TeraGen, run TeraSort on the input data, and validate the sorted output via TeraValidate. We have a wiki page which explains this test suite; you can refer to the Hadoop Build Install And Run Guide.


TestDFSIO

It is part of the hadoop-mapreduce-client-jobclient.jar file. It stress tests I/O performance (throughput and latency) on a clustered setup. This test will shake out the hardware, OS and Hadoop setup on your cluster machines (NameNode/DataNode). The tests are run as a MapReduce job using a 1:1 mapping (1 map per file), and are helpful for discovering performance bottlenecks in your network. The benchmark write test is followed by a read test: use the switch -write for write tests and -read for read tests. The results are stored by default in TestDFSIO_results.log; use the switch -resFile to choose a different file name.

MR(Map Reduce) Benchmark for MR

The test loops a small job a number of times, checking whether small job runs are responsive and running efficiently on your cluster. It puts the focus on the MapReduce layer, as its impact on the HDFS layer is very limited. The multiple parallel MRBench issue is resolved, hence you can run it from different boxes.

Test command to run 50 small test jobs
      $ hadoop jar hadoop-*test*.jar mrbench -numRuns 50

Exemplary output; the AvgTime of 31414 ms means the job finished in about 31 seconds:
      DataLines       Maps    Reduces AvgTime (milliseconds)
      1               2       1       31414

NN (Name Node) Benchmark for HDFS

This test is useful for load testing the NameNode hardware and configuration. The benchmark generates a lot of HDFS-related requests with normally very small payloads, putting high HDFS management stress on the NameNode. The test can be run simultaneously from several machines, e.g. from a set of DataNode boxes, in order to hit the NameNode from multiple locations at the same time.

The TPC is a non-profit, vendor-neutral organization with a reputation for providing the most credible performance results to the industry; it plays the role of "consumer reports" for the computing industry. It provides a solid foundation for complete system-level performance, a methodology for calculating total-system-price and price-performance, and a methodology for measuring the energy efficiency of complete systems.

TPC Benchmark 
  • TPCx-HS
We have a collaboration page, TPCxHS. The X: Express, H: Hadoop, S: Sort. The TPCx-HS kit contains the TPCx-HS specification documentation, the TPCx-HS User's Guide documentation, scripts to run the benchmark and Java code to execute the benchmark load. A valid run consists of 5 separate phases run sequentially with overlap in their execution. The benchmark test consists of 2 runs (the runs with the lower and higher TPCx-HS Performance Metric); no configuration or tuning changes or reboots are allowed between the two runs.

The TPC Express Benchmark Standard is easy to implement, run and publish, and less expensive. The test sponsor is required to use the TPCx-HS kit as it is provided. The vendor may choose an independent audit or a peer audit, for which a 60-day review/challenge window applies (as per TPC policy). This is approved by a super majority of the TPC General Council. All publications must follow the TPC Fair Use Policy.
  • TPC-H
    • TPC-H benchmark focuses on ad-hoc queries
The TPC Benchmark™H (TPC-H) is a decision support benchmark. It consists of a suite of business oriented ad-hoc queries and concurrent data modifications. The queries and the data populating the database have been chosen to have broad industry-wide relevance. This benchmark illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity, and give answers to critical business questions. The performance metric reported by TPC-H is called the TPC-H Composite Query-per-Hour Performance Metric (QphH@Size), and reflects multiple aspects of the capability of the system to process queries. These aspects include the selected database size against which the queries are executed, the query processing power when queries are submitted by a single stream, and the query throughput when queries are submitted by multiple concurrent users. The TPC-H Price/Performance metric is expressed as $/QphH@Size.
  • TPC-DS
    • This is the standard benchmark for decision support
The TPC Benchmark DS (TPC-DS) is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. The benchmark provides a representative evaluation of performance as a general purpose decision support system. A benchmark result measures query response time in single user mode, query throughput in multi user mode and data maintenance performance for a given hardware, operating system, and data processing system configuration under a controlled, complex, multi-user decision support workload. The purpose of TPC benchmarks is to provide relevant, objective performance data to industry users. TPC-DS Version 2 enables emerging technologies, such as Big Data systems, to execute the benchmark.
  • TPC-C
    • TPC-C is an On-Line Transaction Processing Benchmark

Approved in July of 1992, TPC Benchmark C is an on-line transaction processing (OLTP) benchmark. TPC-C is more complex than previous OLTP benchmarks such as TPC-A because of its multiple transaction types, more complex database and overall execution structure. TPC-C involves a mix of five concurrent transactions of different types and complexity either executed on-line or queued for deferred execution. The database is comprised of nine types of tables with a wide range of record and population sizes. TPC-C is measured in transactions per minute (tpmC). While the benchmark portrays the activity of a wholesale supplier, TPC-C is not limited to the activity of any particular business segment, but, rather represents any industry that must manage, sell, or distribute a product or service.
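The Price/Performance metrics mentioned above (like $/QphH@Size for TPC-H) are simple arithmetic: total system price divided by the performance metric. A toy calculation, with made-up numbers that do not come from any published result:

```python
# Hypothetical figures, for illustration only (not a published result).
qphh_at_size = 100_000        # QphH@Size: composite queries per hour
total_system_price = 450_000  # total system price in USD

price_performance = total_system_price / qphh_at_size
print(f"{price_performance:.2f} $/QphH@Size")  # 4.50 $/QphH@Size
```

A lower value is better: the same price buys more query throughput.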

TPC vs SPEC models

Here is our comparison between TPC Vs SPEC model benchmark

TPC model                                      SPEC model
Specification based                            Kit based
Performance, price, energy in one benchmark    Performance and energy in separate benchmarks
End-to-end                                     Server centric
Multiple tests (ACID, Load)                    Single test
Independent review                             Summary disclosure
Full disclosure                                SPEC research group ICPE
TPC Technology Conference                      SPEC Research Group, ICPE (International Conference on Performance Engineering)

BigBench is a joint effort with partners in industry and academia to create a comprehensive and standardized BigData benchmark. One reference reading about BigBench is "Toward An Industry Standard Benchmark for BigData Analytics". BigBench builds upon and borrows elements from existing benchmarking efforts (such as TPCx-HS, GridMix, PigMix, HiBench, Big Data Benchmark, YCSB and TPC-DS). BigBench is a specification-based benchmark with an open-source reference implementation kit; as a specification-based benchmark, it is technology-agnostic and provides the necessary formalism and flexibility to support multiple implementations. It is focused on execution time calculation and consists of around 30 queries/workloads (10 of them from TPC). The drawback is that it is a structured-data-intensive benchmark.

Spark Bench for Apache Spark

We are able to build it on ARM64. The setup completed for a single node, but the run scripts are failing: when the spark bench examples are run, a KILL signal is observed which terminates all workers. This is still under investigation, as there are no useful logs to debug; the lack of proper error descriptions and documentation is a challenge. A ticket is already filed on the spark bench git, which is unresolved.

Hive TestBench

It is based on the TPC-H and TPC-DS benchmarks. You can experiment with Apache Hive at any data scale. The benchmark contains a data generator and a set of queries, and is very useful for testing basic Hive performance on large data sets. We have a wiki page for Hive TestBench.

GridMix

This is a stripped-down version of common MapReduce jobs (sorting text data and SequenceFiles). It is a tool for benchmarking Hadoop clusters: a trace-based benchmark for MapReduce that evaluates MapReduce and HDFS performance.

It submits a mix of synthetic jobs, modeling a profile mined from production loads. The benchmark attempts to model the resource profiles of production jobs to identify bottlenecks.

Basic command line usage:

 $ hadoop gridmix [-generate <size>] [-users <users-list>] <iopath> <trace>
                <iopath> - Destination directory
                <trace>  - Path to a job trace

Con - It is challenging to explore the performance impact of combining or separating workloads, e.g., through consolidating from many clusters.

PigMix

PigMix is a set of queries used to test Pig component performance. There are queries that test latency (how long does it take to run this query?) and queries that test scalability (how many fields or records can Pig handle before it fails?).

Usage: Run the below commands from pig home

ant -Dharness.hadoop.home=$HADOOP_HOME pigmix-deploy (generate test dataset)
ant -Dharness.hadoop.home=$HADOOP_HOME pigmix (run the PigMix benchmark)

The documentation can be found on the Apache Pig site.

SWIM

This benchmark enables rigorous performance measurement of MapReduce systems. It contains suites of workloads of thousands of jobs, with complex data, arrival, and computation patterns, which informs highly targeted, workload-specific optimizations. This tool is highly recommended for MapReduce operators.

This is the Big Data Benchmark from AMPLab, UC Berkeley. It provides quantitative and qualitative comparisons of five systems:
  • Redshift – a hosted MPP database offered by Amazon based on the ParAccel data warehouse
  • Hive – a Hadoop-based data warehousing system
  • Shark – a Hive-compatible SQL engine which runs on top of the Spark computing framework
  • Impala – a Hive-compatible* SQL engine with its own MPP-like execution engine
  • Stinger/Tez – Tez is a next generation Hadoop execution engine currently in development
This benchmark measures response time on a handful of relational queries: scans, aggregations, joins, and UDFs, across different data sizes.

BigDataBench

This is a specification-based benchmark with two key components: a data model specification and a workload/query specification. It is a comprehensive end-to-end big data benchmark suite; see the GitHub page for BigDataBench.

BigDataBench is a benchmark suite for scale-out workloads, different from SPEC CPU (sequential workloads), and PARSEC (multithreaded workloads). Currently, it simulates five typical and important big data applications: search engine, social network, e-commerce, multimedia data analytics, and bioinformatics.

Currently, BigDataBench includes 15 real-world data sets, and 34 big data workloads.

HiBench

This benchmark test suite is for Hadoop. It contains 4 different categories of tests, 10 workloads and 3 types. This is a good benchmark, with metrics: Time (sec) and Throughput (Bytes/Sec).



Terasort, TestDFSIO, NNBench, MRBench 

GridMix3, PigMix, HiBench, TPCx-HS, SWIM, AMPLab, BigBench 

Industry Standard benchmarks

TPC - Transaction Processing Performance Council 
SPEC - The Standard Performance Evaluation Corporation 
CLDS - Center for Largescale Data System Research 

by Naresh at March 03, 2018 09:30

March 26, 2018

Alex Bennée

Solving the HKG18 puzzle with org-mode

One of the traditions I like about Linaro’s Connect event is the conference puzzle. Usually set by Dave Piggot, they provide a challenge to your jet-lagged brain. Full disclosure: I did not complete the puzzle in time. In fact, when Dave explained it I realised the answer had been staring me in the face. However I thought a successful walk-through would make for a more entertaining read 😉

First the Puzzle:

Take the clues below and solve them. Once solved, figure out what the hex numbers mean and then you should be able to associate each of the clue solutions with their respective hex numbers.

Clue Hex Number
Lava Ale Code 1114DBA
Be Google Roe 114F6BE
Natural Gin 114F72A
Pope Charger 121EE50
Dolt And Hunk 12264BC
Monk Hops Net 122D9D9
Is Enriched Tin 123C1EF
Bran Hearing Kin 1245D6E
Enter Slim Beer 127B78E
Herbal Cabbages 1282FDD
Jan Venom Hon Nun 12853C5
A Cherry Skull 1287B3C
Each Noun Lands 1298F0B
Wave Zone Kits 12A024C
Avid Null Sorts 12A5190
Handcars All Trim 12C76DC


It looks like all the clues are anagrams. I was lazy and just used the first online anagram solver that Google pointed me at. However we can automate this by combining org-mode with Python and the excellent Beautiful Soup library.

from bs4 import BeautifulSoup
import requests
import re

# ask internet to solve the puzzle
url="" % (anagram.replace(" ", "%20"))
page=requests.get(url)
soup=BeautifulSoup(page.text, "html.parser")

# fish out the answers
answers=soup.find("ul", class_="answers")
for li in answers.find_all("li"):
    result = li.text
    # filter out non computer related or poor results
    if result in ["Elmer Berstein", "Tim-Berners Lee", "Babbage Charles", "Calude Shannon"]:
        continue
    # keep the first proper name (lowercase letter followed by uppercase)
    if re.search("[a-z] [A-Z]", result):
        break

return result

So with :var anagram=clues[2,0] we get

Ada Lovelace

I admit the “if result in []” is a bit of hack.
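As a sanity check that needs no web service at all, a solved clue can be verified locally by comparing sorted letters (a small sketch of my own, not part of the original solution):

```python
# Verify that a clue and a candidate answer use exactly the same letters,
# ignoring spaces, hyphens and case.
def is_anagram(clue, candidate):
    normalise = lambda s: sorted(s.replace(" ", "").replace("-", "").lower())
    return normalise(clue) == normalise(candidate)

print(is_anagram("Lava Ale Code", "Ada Lovelace"))       # True
print(is_anagram("Enter Slim Beer", "Tim Berners-Lee"))  # True
```

This catches bad web results like the ones filtered above, since a wrong spelling is usually no longer a true anagram.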

Hex Numbers

The hex numbers could be anything, but let's first start by converting them to decimal.

Hex      Number
1114DBA 17911226
114F6BE 18151102
114F72A 18151210
121EE50 19000912
12264BC 19031228
122D9D9 19061209
123C1EF 19120623
1245D6E 19160430
127B78E 19380110
1282FDD 19410909
12853C5 19420101
1287B3C 19430204
1298F0B 19500811
12A024C 19530316
12A5190 19550608
12C76DC 19691228

The #+TBLFM: is $1='(identity remote(clues,@@#$2))::$2='(string-to-number $1 16)

This is where I went down a blind alley. The fact that they all had the top bit set made me think that Dave was giving a hint to the purpose of the hex numbers in the way many cryptic crosswords do (I know he is a fan of these). However the more obvious answer is that everyone in the list was born in the last millennium.
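The hex-to-decimal step is trivial to double-check outside org-mode; each value reads as a YYYYMMDD date:

```python
# Check the conversion for the first and last rows of the table above,
# then split one value into its year/month/day digits.
assert int("1114DBA", 16) == 17911226
assert int("12C76DC", 16) == 19691228

n = int("1114DBA", 16)
year, month, day = n // 10000, (n // 100) % 100, n % 100
print(year, month, day)  # 1791 12 26
```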

Looking up Birth Dates

Now I could go through all the names by hand and look up their birth dates but as we are automating things perhaps we can use computers for what they are good at. Unfortunately there isn’t a simple web-api for looking up this stuff. However there is a project called DBpedia which takes Wikipedia’s data and attempts to make it semantically useful. We can query this using a semantic query language called SparQL. If only I could call it from Emacs…

PREFIX dbr: <>
PREFIX dbo: <>
PREFIX dbp: <>

select ?birthDate where {
  { dbr:$name dbo:birthDate ?birthDate }
  UNION
  { dbr:$name dbp:birthDate ?birthDate }
}

So calling with :var name="Ada_Lovelace" we get


Of course it shouldn’t be a surprise this exists. And in what I hope is a growing trend sparql-mode supports org-mode out of the box. The $name in the snippet is expanded from the passed in variables to the function. This makes it a general purpose lookup function we can use for all our names.

There are a couple of wrinkles. We need to format the name we are looking up with underscores to make a valid URL. Also the output spits out a header and possible multiple birth dates. We can solve this with a little wrapper function. It also introduces some rate limiting so we don’t smash DBpedia’s public SPARQL endpoint.

;; rate limit
(sleep-for 1)
;; do the query
(let* ((str (s-replace-all '((" " . "_") ("Von" . "von")) name))
       (ret (eval (car (read-from-string
                        (format "(org-sbe get-dob (name $\"%s\"))" str))))))
  (string-to-number (replace-regexp-in-string "-" "" (car (cdr (s-lines ret))))))

Calling with :var name="Ada Lovelace" we get

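The two fix-ups the wrapper performs (underscores in the DBpedia resource name, and flattening the ISO date into a comparable number) can be sketched in plain Python; the helper names here are my own, for illustration:

```python
def dbpedia_name(name):
    # DBpedia resource names use underscores, and "Von" is lower-cased
    # in resource names like John_von_Neumann
    return name.replace("Von", "von").replace(" ", "_")

def flatten_date(iso_date):
    # "1815-12-10" -> 18151210, comparable with the hex-decoded numbers
    return int(iso_date.replace("-", ""))

print(dbpedia_name("John Von Neumann"))  # John_von_Neumann
print(flatten_date("1815-12-10"))        # 18151210
```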

Full Solution

So now we know what we are doing, we need to solve all the puzzles and look up the data. Fortunately org-mode’s tables are fully functional spreadsheets, except they are not limited to simple transformations: each formula can be a fully realised bit of elisp, calling other source blocks as needed.

Clue Solution DOB
Herbal Cabbages Charles Babbage 17911226
Be Google Roe George Boole 18151102
Lava Ale Code Ada Lovelace 18151210
A Cherry Skull Haskell Curry 19000912
Jan Venom Hon Nun John Von Neumann 19031228
Pope Charger Grace Hopper 19061209
Natural Gin Alan Turing 19120623
Each Noun Lands Claude Shannon 19160430
Dolt And Hunk Donald Knuth 19380110
Is Enriched Tin Dennis Ritchie 19410909
Bran Hearing Kin Brian Kernighan 19420101
Monk Hops Net Ken Thompson 19430204
Wave Zone Kits Steve Wozniak 19500811
Handcars All Trim Richard Stallman 19530316
Enter Slim Beer Tim Berners-Lee 19550608
Avid Null Sorts Linus Torvalds 19691228

The #+TBLFM: is $1='(identity remote(clues,@@#$1))::$2='(org-sbe solve-anagram (anagram $$1))::$3='(org-sbe frob-dob (name $$2))

The hex numbers are helpfully sorted so as long as we sort the clues table by the looked up date of birth using M-x org-table-sort-lines we are good to go.

You can find the full blog post in raw form here.

by Alex at March 03, 2018 10:19

March 22, 2018

Naresh Bhat

A dream come true: Himalayan Odyssey - 2016 (Day-0 to 5)


THE HIMALAYAS, as almost everyone knows, are the highest mountains in the world, with 30 peaks over 24,000 feet. The adventure of a lifetime doesn't get much bigger or higher than riding through and chasing the mountains of the Himalayas.

Royal Enfield (RE) motorcycles have been manufactured and sold in India since 1907. These motorcycles are best suited to Indian road conditions and have been used by the Indian Army since the Second World War.

There is a saying: "FOUR WHEELS MOVE THE BODY, BUT TWO WHEELS MOVE THE SOUL". I have been a motorcycle enthusiast since my childhood days, and always dreamt of owning an RE motorcycle once I got a job. Right now I own two variants of RE motorcycles: the “Royal Enfield Thunderbird Twinspark” (TBTS) and the “Squadron Blue Classic Dispatch”, which is a Limited Edition.

Thunder Bird Twin Spark 350cc 2011 model

Squadron Blue Dispatch 500cc  2015 model

The TBTS is 350cc, good for cruising on long stretches of highway. The Dispatch has a 500cc EFI engine which responds quickly to the throttle, hence I decided to take the classic 500cc motorcycle for the Himalayan Odyssey (HO).

In India, Royal Enfield conducts different motorcycling tours, e.g. the HO, Tour of Tibet, Tour of Nepal, Tour of Rajasthan, etc. Out of all these tours, the HO is considered the toughest. The reason is very simple: riding in the Himalayan mountains is not easy, considering the road conditions, unpredictable weather, high altitudes, etc. The Himalayan mountain roads are completely shut down for 6 months of the year; the Indian Army clears the snow, then opens and maintains the roads for the other 6 months. Every year the army announces the opening and closing dates.

RE has been conducting the HO for the past 12 years. I took part in HO-2016, the 13th HO: “18 DAYS LIKE NO OTHER IN RIDING”. It was conducted between 6th and 23rd July 2016. Our group had 70 men and 14 women from all over the world. The men's and women's odyssey routes were different, but they met at Ladakh, then took separate routes again and met for the last-day celebration party in Chandigarh. The men's group route map is below.

HO Preparation:

It takes a lot of effort to convince your family and to make suitable arrangements at the office. I had been planning the HO ride for the last 5 years by accumulating leave, and trying to stay as physically fit as possible by exercising regularly. After registering, you are required to go through a physical fitness test and submit the documents. The physical fitness test includes a 5 km run and 50 push-ups in 45 min. You also need to submit a physical fitness certificate from a local doctor to RE. Documents to be submitted include medical test reports for blood, urine and a treadmill test (TMT), a self-written medical history, a medical check-up fitness certificate from a doctor and an indemnity bond.

The HO team includes a doctor, a backup van, mechanics, media people, 3-4 lead riders from RE, etc. All the information is communicated to you after registration.

The HO ride starts from Delhi and ends at Chandigarh. I am located in Bangalore and hence had to plan to reach Delhi by July 7th with my motorcycle. I knew I would need 3 days to reach Delhi from Bangalore by road. Since I had a very limited amount of time, I planned to ship my motorcycle in a container and fly to Delhi. Transporting my motorcycle cost INR Rs. 5,780.00 one way; the cost of transporting the motorcycle was actually more than my air tickets 😅. The round-trip flight tickets cost INR Rs. 7,000.00. Once you register for the HO trip they will include you in closed Facebook and WhatsApp groups, where it is very easy to discuss all your questions.

Ready to ship
I used VRL Logistics (Vijayanand Road Lines) to ship my bike from Bangalore to Delhi. Many of you may ask: why not just rent a motorcycle in Delhi? Simply because by riding my own motorcycle in the mountains I understand it better, and the personal attachment to the motorcycle is stronger. That's the reason RE suggests taking your own motorcycle on any of its rides.

locked in a container
Luggage types and split-up:

When we start our ride, our overall luggage will be split into two. 

1. The luggage that we carry on the motorcycle, which we call “satellite luggage”.
    A duffel bag is a good choice. You can fasten it to your motorcycle using bungee cords or luggage straps. Remember to waterproof this bag well, as it is exposed to the elements in whatever terrain you ride. Packing this bag is crucial: distribute the weight evenly, and if there is some space left in the bag use compression straps to ensure stuff does not move around inside it. Tie the bag down after checking its placement thoroughly, and do so only on the centre stand. We will end up doing this even at camp sites, where finding a flat piece of land can be tricky; use stones to ensure that your motorcycle is as upright as possible when you're fastening your luggage. It is very tempting to use saddle bags for satellite luggage, but this will leave you with more empty space. Avoid starting the trip with saddlebags on your bike and then shifting them to the luggage vehicle.

What my satellite luggage will definitely have
1. A change of clothes- a pair of denims/cargos, a T-shirt and a casual jacket
2. A hat
3. A pair of running shoes
4. Winter gloves - depending on where we're on the Odyssey
5. Toiletries - I'll have my lip balm/guard and sunscreen
6. GoPro, some mounts, batteries and a power bank
7. a Beanie or a woolen buff
8. a Torch
9. Spare cables and a tube

2. The luggage that is carried in the luggage vehicle.
     This is minus the riding gear that you bring, as that will be worn by you for the duration of the entire ride. This luggage is restricted to one piece per rider with a max limit of 15 kilos. Why 15 kilos? After you have removed all the gear and your satellite luggage, we have found that this is a comfortable cut-off. It is also a comfortable weight for you to carry to your room and to load/unload from the luggage vehicle every day. This luggage needs to be loaded and unloaded every day, and in case of rain the bags can get soiled and wet, so it is best to use some level of waterproofing to safeguard what's in them. A waterproof cover, or waterproofing from the inside, will do the job.


Everybody needs to reach Delhi two days before the HO trip starts; they will book the accommodation for you. On the very first day I just checked in and collected my motorcycle in Delhi.

The next day's schedule was as below.

Flag Off day and complete Itinerary:

The 13th edition of the Royal Enfield Himalayan Odyssey flagged off from New Delhi on 9th July 2016. It is a call to all those who love to ride on tough and testing terrain and have the passion to ride with RE. The 2016 edition saw 75 riders on one of the most spectacular motorcycle journeys in the world.

Here is our detailed itinerary

Day-1: Delhi To Chandigarh

The first day started as below

  • 5 AM - HO luggage loading
  • 6:30 AM - breakfast
  • 7:15 AM - HO start towards India Gate
Let this begin!

Group photo @INDIA Gate
The first day's ride always starts from India Gate, Delhi.  We took a group photo, performed some Buddhist rituals and prayed for a safe ride. The briefing covered the regroup points, road conditions and some common mistakes committed by riders.

There were 12 of us from Karnataka state, and we grouped together to take some group photos.

Riders joined from Karnataka State

The flag-off was done by the RE sales director.  Just after the flag-off there was some news channel coverage:  Auto Today  and NDTV

Chandigarh, the capital of the northern Indian states of Punjab and Haryana, was designed by the Swiss-French modernist architect Le Corbusier.  The city is part of neither state: it is a union territory governed directly by the Union Government, which administers all such territories in the country.

In the afternoon we reached Chandigarh and checked into the hotel. Chandigarh is a very well planned and beautiful city with lots of trees and parks, so we did a quick tour of a couple of places in the city.

Day-2: Chandigarh To Manali

In HO every day is a learning day.  You grow much closer to your motorcycle each day; in other words, you come to understand its handling better.  The day starts with luggage loading, breakfast, briefing and ride-out, and the same timings are followed each day.


The briefing lasts about 10-15 minutes and is very important for a rider, because it covers the kind of road you are going to ride that day along with important riding tips.

We reached the Manali Highland hotel by 5 PM and visited the local market to purchase items required for the ride.  This is the last city on the onward journey to Leh; after Manali, the real ride starts, with less tarmac and more rough roads, and every shop you see is in a tent until you reach Leh.  I also met a couple of cyclists who were riding their bicycles up to Leh.

Cyclists @Manali hotel
Manali is a high-altitude Himalayan resort town in India’s northern Himachal Pradesh state. It has a reputation as a backpacking center and honeymoon destination. Set on the Beas River, it’s a gateway for skiing in the Solang Valley and trekking in Parvati Valley. It's also a jumping-off point for paragliding, rafting and mountaineering in the Pir Panjal mountains, home to 4,000m-high Rohtang Pass.

Day-3: Manali To Keylong (Jispa)

The road from Manali to Rohtang pass is a single road.  Although it had tarmac, it was not in good condition.  We took a break at the Rohtang pass base camp.
Base camp
We started slowly climbing the pass.  I could feel the thin air and the altitude change, and my motorcycle was responding slowly to the throttle: the machine also needs oxygen for combustion.  The weather on Rohtang pass changes every 15 minutes.  The last leg of the climb was very foggy and I could hardly see the road.

Rohtang Pass roads
After a couple of kilometres it was very sunny and bright.  We were warned not to stay more than 10 minutes in the high-altitude region.
Top of Rohtang Pass
We just took a couple of photos and started descending the Rohtang pass. The good thing is that after crossing Rohtang, the road is completely empty and traffic-free; you only see the occasional Indian Army truck or goods carrier. But suddenly the road becomes rough and dusty. After travelling a few kilometres on these rough roads my motorcycle started behaving in a weird way: the headlight, horn and indicators stopped working, so I stopped to check the problem. Fortunately, I spotted one of the RE HO trip coordinators there.  He did a basic check and identified a blown fuse, and in a couple of minutes he replaced it with a spare that is readily available in the side panel box.  I continued my ride till the lunch break.

Lunch time..:)

Dusty roads on the way to Tandi
In some places the roads were under construction.  Since wet mud had been laid with stones, it was very difficult to handle a motorcycle weighing around 200 kg.
Road construction
Finally we reached the Tandi fuel pump and filled the tank to the brim, since there is no filling station for the next 365 km.
Tandi gas station
The rough and dusty road continued.  In some places the dust settled on the road was nearly 10-15 cm deep.

We continued to ride and reached the Jispa camp, with the river flowing just behind our tents.  It is truly heaven on earth, a very beautiful village.
Jispa camp

Our Tent
We had snacks and hung out.  From the evening onwards it was too cold because of the wind and the cold river just behind our tents.  I felt I should have taken a room instead of a tent, but that was purely our own mistake: since we arrived early, we grabbed a tent to stay in.

Day-4: Keylong (Jispa) To Sarchu

Jispa is a village in Lahaul, in the Indian state of Himachal Pradesh. Jispa is located 20 km north of Keylong and 7 km south of Darcha, along the Manali-Leh Highway and the Bhaga river. There are approximately 20 villages between Jispa and Keylong.

In the briefing we were given instructions on how to do water crossings.  Every water crossing has small pebbles and very chilly water, and one should make sure the motorcycle's tyres do not get stuck in these stone beds.
Ready to leave Jispa valley
The distance between Jispa and Sarchu is quite short, but the roadless terrain makes for difficult riding. We finished the morning briefing and started riding.
Briefing @Jispa
We crossed a couple of water streams before reaching Sarchu. The technique to cross a stream is very simple: grip the motorcycle's tank tightly with your knees, keep your upper body loose, focus and look ahead along the water-covered road, and keep the throttle on.
Riding beside Bhaga river

Water crossing
Valley view
Lunch break
We took a break for lunch.  I had some noodles; you will not get any food in these tents other than omelettes, noodles and plain rice.

We reached Sarchu quite early, around 3-4 PM, but within 15-20 minutes of arriving the headaches started. Almost everyone had mountain sickness.  Acute Mountain Sickness (AMS) is the mildest form and it's very common; the symptoms can feel like a hangover – dizziness, headache, muscle aches, nausea. The camp doctors checked the heartbeats of everyone affected.

We were unable to eat anything and could not sleep or rest; even walking 100 metres left us breathless.  It was a horrible day that I will never forget.

Sarchu camp
We again had tented accommodation, with only solar-charged lights and no army hospitals nearby.  After the sun goes down there is a sudden drop in temperature. It felt like the situation was life-threatening.

Sarchu is a major halt point with tented accommodation in the Himalayas on the Leh-Manali Highway, on the boundary between Himachal Pradesh and Ladakh (Jammu and Kashmir) in India. It is situated between Baralacha La to the south and Lachulung La to the north, at an altitude of 4,290 m (14,070 ft).

Day-5: Sarchu To Leh

I was very eager to leave Sarchu; between the high altitude and the very cold weather I had not slept well.  The RE crew brought petrol in the backup van and all of us queued up to top up.  The stay in the Sarchu tents was the most uncomfortable of the trip, but it is true that once you acclimatize to Sarchu's altitude, you are better prepared to travel further.
I shifted my satellite luggage to the backup van; as experience shows, it is very uncomfortable to ride with saddle bags on the motorcycle. After Sarchu the roads open up, with no traffic for several kilometres. I was riding alone and stopped to take pictures.  When I reached the bottom of the Gata Loops, a couple of my friends joined me.

GATA Loops begin
Gata Loops is a name unknown to everyone except the few who have travelled the Manali-Leh highway, or are planning to do so. It is a series of twenty-one hairpin bends that takes you to the top of the third high-altitude pass on this highway, Nakeela, at a height of 15,547 ft.
More (Mo-ray) plains
I have covered hundreds of mountain miles but had never seen a plateau. When I came upon the More (pronounced ‘mo-ray’) Plains, they were much bigger than the plateaus I had visualized from school geography books.
They seem endless. Well, 50 km of flatland at an elevation of 15,000 feet deserves that epithet! And they are flat, mile after mile, till they run into the surrounding mountains.  Camp here for the evening and you’ll see the most stunning of sunsets. The area is surprisingly busy: there are always workers building or repairing the roads.

We continued the ride towards Leh after taking a few pictures at the More plains.  We passed through Pang, Meroo and Debring, and at Rumtse we had a lunch break.  The Indus river flows parallel to the road, with the steep cliffs of the mountains on the other side.  I remember that after Debring each mountain was a different colour.  By evening we reached Leh and checked into the hotel "Namgyal Palace".

by Naresh at March 03, 2018 10:16

March 11, 2018

Gema Gomez

Azufral Capelet

A few months ago I bought some Berroco Mykonos yarn in San Francisco. I also bought a pattern for it, the Azufral pattern, written by Donna Yacino. Now, after a few months with not a lot of spare time to work on it, I have managed to finish the capelet:


The pattern was followed verbatim, adjusting for gauge and the measurements of the desired garment. The needles used were Knit Pro Symfonie Cubic Square Needles - 30cm (Pair) - 4.00mm, single pointed.

The yarn is Berroco Mykonos (66% linen, 26% nylon, 8% cotton), color aura (8544), handwash in lukewarm water only and lay flat to dry. I hardly ever go for yarn that is not machine washable, but this one was so shiny and nice to the touch that I could not help it.

The fabric looks as follows once finished:


by Gema Gomez at March 03, 2018 00:00

March 01, 2018

Gema Gomez

OpenStack Queens on ARM64

We are in Dublin this week, at the OpenStack PTG. We happen to be here on a week that has red weather warnings all over Europe, so most of us are stuck in Dublin for longer than we expected.

Queens has been released!

During Pike/Queens my team at Linaro (Software Defined Infrastructure) have been enabling different parts of OpenStack on ARM64 and making sure the OpenStack code is multiarch when necessary (note that I use the terms AArch64 and ARM64 interchangeably).

There seems to be some confusion about the nature of the servers we are using, here is a picture of one of our racks:


Queens is the first release that we feel confident will run out of the box on ARM64, a milestone of collaboration not only by the Linaro member companies but also by the OpenStack community at large. OpenStack projects have been welcoming and inclusive of the diversity, helping us ramp up: either giving direction and reviewing our code or fixing issues themselves.

We will be deploying Queens with Kolla on the Linaro Developer Cloud (ARM64 servers) and documenting the experience for new Kolla users, including brownfield upgrades.

The Linaro Developer Cloud is a collaborative effort of the Linaro Enterprise Group to ensure ARM64 building and testing capabilities are available to different upstream projects, including OpenStack.

This cycle we added resources from one of our clouds to the openstack-infra project so the community can start testing multiarch changes regularly. The bring-up of the ARM64 cloud in infra is in progress; there are currently only 8 executors available, which we’ll be using for experimental jobs for the time being. The long-term goal of this effort is to be able to run ARM64 jobs on the gates by default for all projects.

What next? Next steps include running experimental gate jobs for Kolla and any other project that volunteers, ironing out any leftover issues, making sure devstack runs smoothly, incrementally making sure we have a stable platform to run tests on and inviting all OpenStack projects to take part if they are interested. If you want to discuss any specifics or have questions, either use the Kolla mailing list or reach out to hrw or gema on freenode.

by Gema Gomez at March 03, 2018 00:00

February 21, 2018

Alex Bennée

Workbooks for Benchmarking

While working on a major re-factor of QEMU’s softfloat code I’ve been doing a lot of benchmarking. It can be quite tedious work as you need to be careful you’ve run the correct steps on the correct binaries and keeping notes is important. It is a task that cries out for scripting but that in itself can be a compromise as you end up stitching a pipeline of commands together in something like perl. You may script it all in a language designed for this sort of thing like R but then find your final upload step is a pain to implement.

One solution to this is to use a literate programming workbook like this. Literate programming is a style where you interleave your code with natural prose describing the steps you go through. This is different from simply having well commented code in a source tree. For one thing you do not have to leap around a large code base as everything you need is in the file you are reading, from top to bottom. There are many solutions out there including various python based examples. Of course being a happy Emacs user I use one of its stand-out features, org-mode, which comes with multi-language org-babel support. This allows me to document my benchmarking while scripting up the steps in a variety of “languages” depending on my needs at the time. Let’s take a look at the first section:

1 Binaries To Test

Here we have several tables of binaries to test. We refer to the
current benchmarking set from the next stage, Run Benchmark.

For a final test we might compare the system QEMU with a reference
build as well as our current build.

| Binary                                               | title            |
|------------------------------------------------------+------------------|
| /usr/bin/qemu-aarch64                                | system-2.5.log   |
| ~/lsrc/qemu/qemu-builddirs/                          | master.log       |
| ~/lsrc/qemu/qemu.git/aarch64-linux-user/qemu-aarch64 | softfloat-v4.log |

Well that is certainly fairly explanatory. These are named org-mode tables which can be referred to in other code snippets and passed in as variables. So the next job is to run the benchmark itself:

2 Run Benchmark

This runs the benchmark against each binary we have selected above.

    import subprocess

    # "files" and "tests" are passed in as org-babel variables:
    # "files" is the table of binaries above, "tests" the benchmark count.
    runs = []
    for qemu, logname in files:
        cmd = "taskset -c 0 %s ./vector-benchmark -n %s | tee %s" \
              % (qemu, tests, logname)
        subprocess.call(cmd, shell=True)
        runs.append(logname)

    return runs

So why use python as the test runner? Well truth is whenever I end up munging arrays in shell script I forget the syntax and end up jumping through all sorts of hoops. Easier just to have some simple python. I use python again later to read the data back into an org-table so I can pass it to the next step, graphing:

set title "Vector Benchmark Results (lower is better)"
set style data histograms
set style fill solid 1.0 border lt -1

set xtics rotate by 90 right
set yrange [:]
set xlabel noenhanced
set ylabel "nsecs/Kop" noenhanced
set xtics noenhanced
set ytics noenhanced
set boxwidth 1
set xtics format ""
set xtics scale 0
set grid ytics
set term pngcairo size 1200,500

plot for [i=2:5] data using i:xtic(1) title columnhead

This is a GNU Plot script which takes the data and plots an image from it. org-mode takes care of the details of marshalling the table data into GNU Plot so all this script is really concerned with is setting styles and titles. The language is capable of some fairly advanced stuff but I could always pre-process the data with something else if I needed to.

Finally I need to upload my graph to an image hosting service to share with my colleagues. This could be done with an elaborate curl command but I have another trick at my disposal thanks to the excellent restclient-mode. This mode is actually designed for interactive debugging of REST APIs but it is also easy to use from an org-mode source block. So the whole thing looks like a HTTP session:

:client_id = feedbeef

# Upload images to imgur
POST https://api.imgur.com/3/image
Authorization: Client-ID :client_id
Content-type: image/png

< benchmark.png

Finally, because the above dumps all the headers when run (which is very handy for debugging), in most cases I actually only want the URL. I can extract it simply enough in elisp:

#+name: post-to-imgur
#+begin_src emacs-lisp :var json-string=upload-to-imgur()
  (when (string-match
         (rx "link" (one-or-more (any "\":" whitespace))
             (group (one-or-more (not (any "\"")))))
         json-string)
    (match-string 1 json-string))
#+end_src
The :var line calls the restclient-mode function automatically and passes it the result which it can then extract the final URL from.

And there you have it, my entire benchmarking workflow document in a single file which I can read through tweaking each step as I go. This isn’t the first time I’ve done this sort of thing. As I use org-mode extensively as a logbook to keep track of my upstream work I’ve slowly grown a series of scripts for common tasks. For example every patch series and pull request I post is done via org. I keep the whole thing in a git repository so each time I finish a sequence I can commit the results into the repository as a permanent record of what steps I ran.

If you want even more inspiration I suggest you look at John Kitchin’s scimax work. As a publishing scientist he makes extensive use of org-mode when writing his papers. He is able to combine the main prose with the code that plots the graphs and tables in a single source document, from which his camera-ready documents are generated. Should he ever need to reproduce any work, his exact steps are all there in the source document. Yet another example of why org-mode is awesome 😉

by Alex at February 02, 2018 20:34

February 13, 2018

Riku Voipio

Making sense of /proc/cpuinfo on ARM

Ever stared at output of /proc/cpuinfo and wondered what the CPU is?

processor : 7
BogoMIPS : 2.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part : 0xd03
CPU revision : 3
Or maybe like:

$ cat /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 2 (v7l)
BogoMIPS : 50.00
Features : half thumb fastmult vfp edsp thumbee vfpv3 tls idiva idivt vfpd32 lpae
CPU implementer : 0x56
CPU architecture: 7
CPU variant : 0x2
CPU part : 0x584
CPU revision : 2
The bits "CPU implementer" and "CPU part" could be mapped to human-understandable strings, but the kernel developers are heavily against the idea. Therefore, on to the next idea: parse in userspace. It turns out there is a common tool that almost everyone has installed which already does similar stuff: lscpu(1) from util-linux. So I proposed a patch to do ID mapping on arm/arm64 to util-linux, and it was accepted! Using lscpu from util-linux 2.32 (hopefully to be released soon), the above two systems look like:

Architecture: aarch64
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 2
NUMA node(s): 1
Vendor ID: ARM
Model: 3
Model name: Cortex-A53
Stepping: r0p3
CPU max MHz: 1200.0000
CPU min MHz: 208.0000
BogoMIPS: 2.40
L1d cache: unknown size
L1i cache: unknown size
L2 cache: unknown size
NUMA node0 CPU(s): 0-7
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

$ lscpu
Architecture: armv7l
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
Vendor ID: Marvell
Model: 2
Model name: PJ4B-MP
Stepping: 0x2
CPU max MHz: 1333.0000
CPU min MHz: 666.5000
BogoMIPS: 50.00
Flags: half thumb fastmult vfp edsp thumbee vfpv3 tls idiva idivt vfpd32 lpae
As we can see, lscpu is quite versatile and can show more information than just what is available in cpuinfo.
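The ID-mapping idea can be sketched in a few lines of Python. This is a simplified illustration, not lscpu's actual implementation: the lookup tables below cover only the two example systems shown above (taken straight from their cpuinfo/lscpu output), whereas lscpu ships a far larger database.

```python
# Minimal sketch of the implementer/part ID mapping lscpu performs for ARM.
# The tables cover only the two example systems from this post.
IMPLEMENTERS = {0x41: "ARM", 0x56: "Marvell"}
PARTS = {
    (0x41, 0xd03): "Cortex-A53",
    (0x56, 0x584): "PJ4B-MP",
}

def decode_cpuinfo(text):
    """Pull "CPU implementer"/"CPU part" out of /proc/cpuinfo text and name them."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    imp = int(fields["CPU implementer"], 16)
    part = int(fields["CPU part"], 16)
    vendor = IMPLEMENTERS.get(imp, hex(imp))
    model = PARTS.get((imp, part), hex(part))
    return vendor, model

sample = """\
CPU implementer : 0x41
CPU architecture: 8
CPU part : 0xd03
"""
print(decode_cpuinfo(sample))  # → ('ARM', 'Cortex-A53')
```

Feeding it the second system's IDs (0x56, 0x584) yields Marvell's PJ4B-MP the same way.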

by Riku Voipio at February 02, 2018 14:33

February 11, 2018

Siddhesh Poyarekar

Optimizing toolchains for modern microprocessors

About 2.5 years ago I left Red Hat to join Linaro in a move that surprised even me for the first few months. I still work on the GNU toolchain with a glibc focus, but my focus changed considerably. I am no longer looking at the toolchain in its entirety (although I do that on my own time whenever I can, either as glibc release manager or reviewer); my focus is making glibc routines faster for one specific server microprocessor; no prizes for guessing which processor that is. I have read architecture manuals in the past to understand specific behaviours but this is the first time that I have had to pore through the entire manual and optimization guides and try to eke out the last cycle of performance from a chip.

This post is an attempt to document my learnings and make a high-level guide of the various things my team and I looked at to improve performance of the toolchain. Note that my team is continuing to work on this chip (and I continue to learn new techniques, I may write about it later) so this ‘guide’ is more of a personal journey. I may add more follow ups or modify this post to reflect any changes in my understanding of this vast topic.

All of my examples use ARM64 assembly since that’s what I’ve been working on and translating the examples to something x86 would have discouraged me enough to not write this at all.

What am I optimizing for?

CPUs today are complicated beasts. Vulnerabilities like Spectre allude to how complicated CPU behaviour can get but in reality it can get a lot more complicated and there’s never really a universal solution to get the best out of them. Due to this, it is important to figure out what the end goal for the optimization is. For string functions for example, there are a number of different factors in play and there is no single set of behaviours that trumps over all others. For compilers in general, the number of such combinations is even higher. The solution often is to try and ensure that there is a balance and there are no exponentially worse behaviours.

The first line of defence for this is to ensure that the algorithm used for the routine does not exhibit exponential behaviour. I wrote about algorithmic changes I did to the multiple precision fallback implementation in glibc years ago elsewhere so I’m not going to repeat that. I will however state that the first line of attack to improve any function must be algorithmic. Thankfully, barring strcmp, string routines in glibc had a fairly sound algorithmic base. strcmp fell back to a byte comparison when inputs were not mutually aligned, which is now fixed.

Large strings vs small

This is one question that gets asked very often in the context of string functions and different developers have different opinions on it, some differences even leading to flamewars in the past. One popular approach to ‘solving’ this is to quote usage of string functions in a popular benchmark and use that as a measuring stick. For a benchmark like CPU2006 or CPU2017, it means that you optimize for smaller strings because the number of calls to smaller strings is very high in those benchmarks. There are a few issues to that approach:

  • These benchmarks use glibc routines for a very small fraction of time, so you’re not going to win a lot of performance in the benchmark by improving small string performance
  • Small string operations are affected much more by other factors, i.e. things like cache locality, branch predictor behaviour, prefetcher behaviour, etc. So while it might be fun to tweak behaviour exactly the way a CPU likes it, it may not end up resulting in the kind of gains you’re looking for
  • A 10K string (in theory) takes at least 10 times more cycles than a 1K string, often more. So effectively, there is 10x more incentive to look at improving performance of larger strings than smaller ones.
  • There are CPU features specifically catered for larger sequential string operations and utilizing those microarchitecture quirks will guarantee much better gains
  • There are a significant number of use cases outside of these benchmarks that use glibc far more than the SPEC benchmarks. There’s no established set of benchmarks that represent them though.

I won’t conclude with a final answer for this because there is none. This is also why I had to revisit this question for every single routine I targeted, sometimes even before I decide to target it.
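The small-vs-large tension in the bullets above can be felt with a crude timing sketch. This is only a Python-level proxy (it times Python object copies, not glibc routines), but it illustrates the same effect: a 10x larger copy rarely costs a full 10x more at these sizes, because fixed per-call overhead dominates short operations.

```python
# Crude illustration of the small-vs-large tradeoff: time copying a
# 1 KiB payload vs a 10 KiB payload. Fixed per-call overhead (allocation,
# call dispatch) dominates the small case, so the ratio is usually < 10x.
import timeit

small = bytes(1024)        # 1 KiB payload
large = bytes(10 * 1024)   # 10 KiB payload

# bytearray() always makes a fresh copy of its argument.
t_small = timeit.timeit(lambda: bytearray(small), number=50_000)
t_large = timeit.timeit(lambda: bytearray(large), number=50_000)

print("1K copy : %.4fs" % t_small)
print("10K copy: %.4fs" % t_large)
print("ratio   : %.1fx" % (t_large / t_small))
```

Exact numbers vary by machine and interpreter; the point is only that the ratio does not scale linearly with size at the small end.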

Cached or not?

This is another question that comes up for string routines and the answer is actually a spectrum - a string could be cached, not cached or partially cached. What’s the safe assumption then?

There is a bit more consensus on the answer to this question. It is generally considered safe to consider that shorter string accesses are cached and then focus on code scheduling and layout for its target code. If the string is not cached, the cost of getting it into cache far outweighs the savings through scheduling and hence it is pointless looking at that case. For larger strings, assuming that they’re cached does not make sense due to their size. As a result, the focus for such situations should be on ensuring that cache utilization is optimal. That is, make sure that the code aids all of the CPU units that populate caches, either through a hardware prefetcher or through judiciously placed software prefetch instructions or by avoiding caching altogether, thus avoiding evicting other hot data. Code scheduling, alignment, etc. is still important because more often than not you’ll have a hot loop that does the loads, compares, stores, etc. and once your stream is primed, you need to ensure that the loop is not suboptimal and runs without stalls.

My branch is more important than yours

Branch predictor units in CPUs are quite complicated and the compiler does not try to model them. Instead, it tries to do the simpler and more effective thing; make sure that the more probable branch target is reachable through sequential fetching. This is another aspect of the large-strings-vs-small question for string functions, and more often than not, smaller sizes are assumed to be more probable for hand-written assembly because it seems to be that way in practice, and also because the cost of a mispredict hits the smaller size harder than it does the larger one.

Don’t waste any part of a pig CPU

CPUs today are complicated beasts. Yes I know I started the previous section with this exact same line; they’re complicated enough to bear repeating that. However, there is a bit of relief in the fact that the first principles of their design haven’t changed much. The components of the CPU are all things we heard about in our CS class and the problem then reduces to understanding specific quirks of the processor core. At a very high level, there are three types of quirks you look for:

  1. Something the core does exceedingly well
  2. Something the core does very badly
  3. Something the core does very well or badly under specific conditions

Typically this is made easy by CPU vendors when they provide documentation that specifies a lot of this information. Then there are cases where you discover these behaviours through profiling. Oh yes, before I forget:

Learn how to use perf or a similar tool and read its output; it will save your life.

For example, the falkor core does something interesting with respect to loads and addressing modes. Typically, a load instruction would take a specific number of cycles to fetch from L1, more if memory is not cached, but that’s not relevant here. If you issue a load instruction with a pre/post-incrementing addressing mode, the microarchitecture issues two micro-instructions; one load and another that updates the base address. So:

   ldr  x1, [x2, 16]!

effectively is:

  ldr   x1, [x2, 16]
  add   x2, x2, 16

and that increases the net cost of the load. While it saves us an instruction, this addressing mode isn’t always preferred in unrolled loops, since you could avoid the base address increment after every load and do it once at the end of the iteration. With falkor however, this operation is very fast and in most cases this addressing mode is preferred for loads. The reason for this is the way its hardware prefetcher works.

Hardware Prefetcher

A hardware prefetcher is a CPU unit that speculatively loads the memory location after the location requested, in an attempt to speed things up. This forms a memory stream, and the larger the string, the more it gains from prefetching. This however also means that in the case of multiple prefetcher units in a core, one must ensure that the same prefetcher unit is hit so that the unit gets trained properly, i.e. knows what the next block to fetch is. The way a prefetcher typically knows is if it sees a consistent stride in memory accesses, i.e. it sees loads of X, X+16, X+32, etc. in a sequence.

On falkor the addressing mode plays an important role in determining which hardware prefetcher unit is hit by a load; effectively, a pre/post-incrementing load ensures that the loads hit the same prefetcher. That, combined with a feature called register renaming, ensures that it is much quicker to just fetch into the same virtual register and pre/post-increment the base address than to second-guess the CPU and try to outsmart it. The memcpy and memmove routines use this quirk extensively; the falkor routines even have detailed comments explaining the basis of this behaviour.
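The stride-training idea can be illustrated with a toy model. This is purely illustrative (real prefetchers, falkor's included, are far more sophisticated), but it captures the key point: only after seeing a consistent stride between successive load addresses does the unit start predicting the next address in the stream.

```python
# Toy model of a stride-based hardware prefetcher: after observing a
# consistent stride between successive load addresses a few times, it
# starts predicting (prefetching) the next address in the stream.
class ToyPrefetcher:
    def __init__(self, train_threshold=2):
        self.last_addr = None
        self.stride = None
        self.hits = 0                       # consecutive stride matches seen
        self.train_threshold = train_threshold

    def observe(self, addr):
        """Feed a load address; return a predicted next address once trained."""
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if stride == self.stride:
                self.hits += 1              # stream confirmed again
            else:
                self.stride = stride        # retrain on the new stride
                self.hits = 0
        self.last_addr = addr
        if self.stride and self.hits >= self.train_threshold:
            return addr + self.stride       # prefetch target
        return None

pf = ToyPrefetcher()
for addr in (0x1000, 0x1010, 0x1020, 0x1030):   # loads of X, X+16, X+32, ...
    predicted = pf.observe(addr)
print(hex(predicted))  # → 0x1040
```

This also shows why a loop whose loads keep a constant stride (as the pre/post-incrementing addressing mode guarantees) keeps the stream trained, while irregular addresses would reset it.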

Doing something so badly that it is easier to win

A colleague once said that the best targets for toolchain optimizations are CPUs that do things badly. There always is this one behaviour or set of behaviours that CPU designers decided to sacrifice to benefit other behaviours. On falkor for example, calling the MRS instruction for some registers is painfully slow whereas it is close to single cycle latency for most other processors. Simply avoiding such slow paths in itself could result in tremendous performance wins; this was evident with the memset function for falkor, which became twice as fast for medium sized strings.

Another example for this is in the compiler and not glibc, where the fact that using a ‘str’ instruction on 128-bit registers with register addressing mode is very slow on falkor. Simply avoiding that instruction altogether results in pretty good gains.

CPU Pipeline

Both gcc and llvm allow you to specify a model of the CPU pipeline, i.e.

  1. The number of each type of unit the CPU has. That is, the number of load/store units, number of integer math units, number of FP units, etc.
  2. The latency for each type of instruction
  3. The number of micro-operations each instruction splits into
  4. The number of instructions the CPU can fetch/dispatch in a single cycle

and so on. This information is then used to sequence instructions in a function that it optimizes for. This may also help the compiler choose between instructions based on how long those take. For example, it may be cheaper to just declare a literal in the code and load from it than to construct a constant using mov/movk. Similarly, it could be cheaper to use csel to select a value to load to a register than to branch to a different piece of code that loads the register or vice versa.

Optimal instruction sequencing can often result in significant gains. For example, interspersing load and store instructions with unrelated arithmetic instructions could result in both those instructions executing in parallel, thus saving time. On the contrary, sequencing multiple load instructions back to back could result in other units being underutilized and all instructions being serialized on to the load unit. The pipeline model allows the compiler to make an optimal decision in this regard.

Vector unit - to use or not to use, that is the question

The vector unit is this temptress that promises to double your execution rate, but it doesn’t come without cost. The most important cost is that of moving data between general purpose and vector registers and quite often this may end up eating into your gains. The cost of the vector instructions themselves may be high, or a CPU might have multiple integer units and just one SIMD unit, because of which code may get a better schedule when executed on the integer units as opposed to via the vector unit.

I saw an example of the opposite in powerpc years ago, when I noticed that many of the integer operations in the multiple precision math code were also implemented in FP. This was because the original authors were from IBM and had noticed a significant performance gain from doing so on powerpc (possibly POWER7 or earlier, given the timelines) because the CPU had 4 FP units!

Final Thoughts

This is really just the tip of the iceberg when it comes to performance optimization in toolchains and utilizing CPU quirks. There are more behaviours that could be exploited (such as aliasing behaviour in branch prediction or core topology) but the cost benefit of doing that is questionable.

Despite how much fun it is to hand-write assembly for such routines, the best approach is always to write simple enough code (yes, clever tricks might actually defeat compiler optimization passes!) that the compiler can optimize for you. If there are missed optimizations, improve compiler support for them. For glibc and aarch64, there is also the case of impending multiarch explosion. Due to the presence of multiple vendors, having a perfectly tuned routine for each vendor may pose code maintenance problems and also secondary issues with performance, like code layout in a binary and instruction cache utilization. There are some random ideas floating about for that already, like making separate text sections for vendor-specific code, but that’s something we would like to avoid doing if we can.

by Siddhesh at February 02, 2018 19:37

February 06, 2018

Alex Bennée


I’ve just returned from a weekend in Brussels for my first ever FOSDEM – the Free and Open Source Developers, European Meeting. It’s been on my list of conferences to go to for some time and thanks to getting my talk accepted, my employer financed the cost of travel and hotels. Thanks to the support of the Université libre de Bruxelles (ULB) the event itself is free and run entirely by volunteers. As you can expect from the name they also have a strong commitment to free and open source software.

The first thing that struck me about the conference is how wide ranging it was. There were talks on everything from the internals of debugging tools to developing public policy. When I first loaded up their excellent companion app (naturally via the F-Droid repository) I was somewhat overwhelmed by the choice. As it is a free conference there is no limit on the numbers who can attend, which means you are not always guaranteed to be able to get into every talk. In fact during the event I walked past many long queues for the more popular talks. In the end I ended up just bookmarking all the talks I was interested in and deciding which one to go to depending on how I felt at the time. Fortunately FOSDEM have a strong archiving policy and video most of their talks, so I’ll be spending the next few weeks catching up on the ones I missed.

There now follows a non-exhaustive list of the most interesting ones I was able to see live:

Dashamir’s talk on EasyGPG dealt with the opinionated decisions it makes to try and make the use of GnuPG more intuitive to those not versed in the full gory details of public key cryptography. Although I use GPG mainly for signing GIT pull requests, I really should make better use of it overall. The split-key solution to backups was particularly interesting. I suspect I’ll need a little convincing before I put part of my key in the cloud, but I’ll certainly check out his scripts.

Liam’s A Circuit Less Travelled was an entertaining tour of some of the technologies and ideas from early computer history that got abandoned on the wayside. These ideas were often re-invented later in inferior form as engineers realised the error of their ways once technology advanced. The latter half of the talk turns into a bit of a LISP love-fest, but as an Emacs user with an ever-growing config file that is fine by me 😉

Following on in the history vein was Steven Goodwin’s talk on Digital Archaeology, which was a salutary reminder of the amount of recent history that is getting lost as computing’s breakneck pace has discarded old physical formats in favour of newer, equally short-lived formats. It reminded me I should really do something about the 3 boxes of floppy disks I have under my desk. I also need to schedule a visit to the Computer History Museum with my children, seeing as it is more or less on my doorstep.

There was a tongue-in-cheek preview that described the EDSAC talk as recreating “an ancient computer without any of the things that made it interesting”. This was a little unkind. Although the project re-implemented the computation parts in a tiny little FPGA, the core idea was to introduce potential students to the physicality of the early computers. After an introduction to the hoary architecture of the original EDSAC and the Wheeler Jump, Mary introduced the hardware they re-imagined for the project. The first was an optical reader developed to read in paper tapes, although this time ones printed on thermal receipt paper. This included an in-depth review of the problems of smoothing out analogue inputs to get reliable signals from their optical sensors, which mirrors the problems the rebuild is facing with the nature of the valves used in EDSAC. It is a shame they couldn’t come up with some way to involve a valve, but I guess high-tension supplies and school kids don’t mix well. However they did come up with a way of re-creating the original acoustic mercury delay lines, but this time with a tube of air and some 3D printed parabolic ends.

The big geek event was the much anticipated announcement of RISC-V hardware during the RISC-V enablement talk. It seemed to be an open secret the announcement was coming but it still garnered hearty applause when it finally came. I should point out I’m indirectly employed by companies with an interest in a competing architecture but it is still good to see other stuff out there. The board is fairly open but there are still some peripheral IPs which were closed which shows just how tricky getting to fully-free hardware is going to be. As I understand the RISC-V’s licensing model the ISA is open (unlike for example an ARM Architecture License) but individual companies can still have closed implementations which they license to be manufactured which is how I assume SiFive funds development. The actual CPU implementation is still very much a black box you have to take on trust.

Finally, my talk is already online for those who are interested in what I’m currently working on. The slides have been slightly cropped in the video, but if you follow the link to the HTML version you can read along on your machine.

I have to say FOSDEM’s setup is pretty impressive. Although there was a volunteer in each room to deal with fire safety and replace microphones, all the recording is fully automated. There are rather fancy hand-crafted wooden boxes in each room which take the feed from your laptop and mux it with the camera. I got the email from the automated system asking me to review a preview of my talk about half an hour after I gave it. It took a little longer for the final product to get encoded and online, but it’s certainly the nicest system I’ve come across so far.

All in all I can heartily recommend FOSDEM for anyone with an interest in FLOSS. It’s a packed schedule and there is going to be something for everyone there. Big thanks to all the volunteers and organisers, and I hope I can make it next year 😉

by Alex at February 02, 2018 09:36

January 23, 2018

Ard Biesheuvel

Per-task stack canaries for arm64

Due to the way the stack of a thread (or task in kernelspeak) is shared between control flow data (frame pointer, return address, caller saved registers) and temporary buffers, overflowing such buffers can completely subvert the control flow of a program, and the stack is therefore a primary target for attacks. Such attacks are referred to as Return Oriented Programming (ROP), and typically consist of a specially crafted array of forged stack frames, where each return from a function is directed at another piece of code (called a gadget) that is already present in the program. By piecing together gadgets like this, powerful attacks can be mounted, especially in a big program such as the kernel where the supply of gadgets is endless.

One way to mitigate such attacks is the use of stack canaries, which are known values that are placed inside each stack frame when entering a function, and checked again when leaving the function. This forces the attacker to craft his buffer overflow attack in a way that puts the correct stack canary value inside each stack frame. That by itself is rather trivial, but it does require the attacker to discover the value first.
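In C terms, the emitted prologue/epilogue behave roughly like this hand-rolled sketch; all names here are illustrative stand-ins, not the real __stack_chk_guard machinery:

```c
/* Sketch of what the compiler's stack protector does around a
 * function with an overflowable buffer. A real epilogue calls
 * __stack_chk_fail on mismatch instead of returning a flag. */
#include <string.h>

static unsigned long guard_value = 0xdeadbeefcafe1234UL;

int copy_is_intact(const char *src, unsigned long len)
{
    unsigned long canary = guard_value;   /* prologue: place the canary   */
    char buf[16];

    if (len > sizeof(buf))                /* a well-behaved caller...     */
        len = sizeof(buf);
    memcpy(buf, src, len);                /* ...never clobbers the canary */

    return canary == guard_value;         /* epilogue: verify the canary  */
}
```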

GCC support

GCC implements support for stack canaries, which can be enabled using the various -fstack-protector[-xxx] command line switches. When enabled, each function prologue stores the value of the global variable __stack_chk_guard inside the stack frame, and each epilogue reads the value back, compares it, and branches to the function __stack_chk_fail if the comparison fails.

This works fine for user programs, with the caveat that all threads will use the same value for the stack canary. However, each program will pick a random value at program start, and so this is not a severe limitation. Similarly, for uniprocessor (UP) kernels, where only a single task will be active at the same time, we can simply update the value of the __stack_chk_guard variable when switching from one task to the next, and so each task can have its own unique value.
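A minimal sketch of the UP scheme described above, with a stand-in task_struct and guard variable (not the kernel's):

```c
/* UP-kernel sketch: with only one runnable task at a time, the global
 * guard can simply be rewritten on every context switch, so each task
 * effectively gets its own canary value. */
struct task_struct {
    unsigned long stack_canary;
};

unsigned long stack_chk_guard_sim;   /* stand-in for __stack_chk_guard */

void switch_canary(struct task_struct *next)
{
    stack_chk_guard_sim = next->stack_canary;
}
```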

However, on SMP kernels, this model breaks down. Each CPU will be running a different task, and so any combination of tasks could be active at the same time. Since each will refer to __stack_chk_guard directly, its value cannot be changed until all tasks have exited, which only occurs at a reboot. Given that servers don’t usually reboot that often, leaking the global stack canary value can seriously compromise security of a running system, as the attacker only has to discover it once.

x86: per-CPU variables

To work around this issue, Linux/x86 implements support for stack canaries using the existing Thread-local Storage (TLS) support in GCC, which replaces the reference to __stack_chk_guard with a reference to a fixed offset in the TLS block. This means each CPU has its own copy, which is set to the stack canary value of that CPU’s current task when it switches to it. When the task migrates, it just takes its stack canary value along, and so all tasks can use a unique value. Problem solved.
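A userland analogue of the same idea uses __thread, the compiler mechanism behind TLS; the function names below are invented for illustration:

```c
/* Each thread gets its own copy of a __thread variable, the same
 * property the x86 kernel exploits (per-CPU rather than per-thread)
 * for its stack canary. */
#include <pthread.h>
#include <stdint.h>

static __thread unsigned long my_guard;

static void *worker(void *arg)
{
    my_guard = (uintptr_t)arg;     /* write this thread's private copy */
    return (void *)my_guard;       /* read it back from the same slot  */
}

unsigned long run_in_new_thread(unsigned long value)
{
    pthread_t t;
    void *ret = NULL;

    pthread_create(&t, NULL, worker, (void *)(uintptr_t)value);
    pthread_join(t, &ret);
    return (unsigned long)(uintptr_t)ret;
}
```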

On arm64, we are not that lucky, unfortunately. GCC only supports the global stack canary value, although discussions are underway to decide how this is best implemented for multitask/thread environments, i.e., in a way that works for userland as well as for the kernel.

Per-CPU variables and preemption

Loading the per-CPU version of __stack_chk_guard could look something like this on arm64:

adrp    x0, __stack_chk_guard
add     x0, x0, :lo12:__stack_chk_guard
mrs     x1, tpidr_el1
ldr     x0, [x0, x1]

There are two problems with this code:

  • the arm64 Linux kernel implements support for Virtualization Host Extensions (VHE), and uses code patching to replace all references to tpidr_el1 with tpidr_el2 on VHE capable systems,
  • the access is not atomic: if this code is preempted after reading the value of tpidr_el1 but before loading the stack canary value, and is subsequently migrated to another CPU, it will load the wrong value.

In kernel code, we can deal with this easily: every emitted reference to tpidr_el1 is tagged so we can patch it at boot, and on preemptible kernels we put the code in a non-preemptible block to make it atomic. However, this is impossible to do in GCC generated code without putting elaborate knowledge of the kernel’s per-CPU variable implementation into the compiler, and doing so would severely limit our future ability to make any changes to it.

One way to mitigate this would be to reserve a general purpose register for the per-CPU offset, and ensure that it is used as the offset register in the ldr instruction. This addresses both problems: we use the same register regardless of VHE, and the single ldr instruction is atomic by definition.

However, as it turns out, we can do much better than this. We don’t need per-CPU variables if we can load the task’s stack canary value directly, and each CPU already keeps a pointer to the task_struct of the current task in system register sp_el0. So if we replace the above with

movz    x0, :abs_g0:__stack_chk_guard_offset
mrs     x1, sp_el0
ldr     x0, [x0, x1]

we dodge both issues, since all of the values involved are per-task values which do not change when migrating to another CPU. Note that the same sequence could be used in userland for TLS if you swap out sp_el0 for tpidr_el0 (and use the appropriate relocation type), so adding support for this to GCC (with a command line configurable value for the system register) would be a flexible solution to this problem.

Proof of concept implementation

I implemented support for the above, using a GCC plugin to replace the default sequence

adrp    x0, __stack_chk_guard
add     x0, x0, :lo12:__stack_chk_guard
ldr     x0, [x0]

with
mrs     x0, sp_el0
add     x0, x0, :lo12:__stack_chk_guard_offset
ldr     x0, [x0]

This limits __stack_chk_guard_offset to 4 KB, but this is not an issue in practice unless struct randomization is enabled. Another caveat is that it only works with GCC’s small code model (the one that uses adrp instructions) since the plugin works by looking for those instructions and replacing them.

Code can be found here.

by ardbiesheuvel at January 01, 2018 11:12

Leif Lindholm

Fun and games with gnu-efi

gnu-efi is a set of scripts, libraries, header files and code examples to make it possible to write applications and drivers for the UEFI environment directly from your POSIX world. It supports i386, Ia64, X64, ARM and AArch64 targets ... but it would be dishonest to say it is beginner friendly in its current state. So let's do something about that.

Rough Edges

gnu-efi comes packaged for most Linux distributions, so you can simply run

$ sudo apt-get install gnu-efi

or

$ sudo dnf install gnu-efi gnu-efi-devel

to install it. However, there is a bunch of Makefile boilerplate that is not covered by said packaging, meaning that getting from "hey, let's check this thing out" to "hello, world" involves a fair bit of tedious makefile hacking.

... serrated?

Also, the whole packaging story here is a bit ... special. It means installing headers and libraries into /usr/lib and /usr/include solely for inclusion in images to be executed by the UEFI firmware during Boot Services, before the operating system is running. And don't get me started on multi-arch support.


Like most other programming languages, Make supports including other source files into the current context. The gnu-efi codebase makes use of this, but not in a way that's useful to a packaging system.

Now, at least GNU Make looks in /usr/include and /usr/local/include as well as the current working directory and any directories specified on the command line with -I. This means we can stuff most of the boilerplate into makefile fragments and include them where we need them.
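As a sketch of what that enables (file and variable names invented): a fragment dropped into /usr/local/include can be included from any project Makefile with no path at all.

```make
# --- /usr/local/include/uefi-app.mk (hypothetical fragment) ---
FORMAT := --target efi-app-$(ARCH)

# --- project Makefile ---
# GNU Make finds the fragment via its built-in include search path:
include uefi-app.mk
```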

Hello World

So, let's start with the (almost) most trivial application imaginable:

#include <efi/efi.h>
#include <efi/efilib.h>

EFI_STATUS
efi_main (EFI_HANDLE image_handle,
          EFI_SYSTEM_TABLE *systab)
{
    InitializeLib(image_handle, systab);

    Print(L"Hello, world!\n");

    return EFI_SUCCESS;
}
Save that as hello.c.

Reducing the boiler-plate

Now grab Make.defaults and Make.rules from the gnu-efi source directory and stick them in a subdirectory called efi/.

Then download the makefile fragment I prepared earlier, and include it in your Makefile:


ifeq ($(HAVE_EFI_OBJCOPY), y)
FORMAT := --target efi-app-$(ARCH)      # Boot time application
#FORMAT := --target efi-bsdrv-$(ARCH)   # Boot services driver
#FORMAT := --target efi-rtdrv-$(ARCH)   # Runtime driver
else
#SUBSYSTEM=$(EFI_SUBSYSTEM_BSDRIVER)    # Boot services driver
endif

all: hello.efi

clean:
	rm -f *.o *.so *.efi *~
The hello.efi dependency for the all target invokes implicit rules (defined in Make.rules) to generate hello.efi from an intermediate shared object, which in turn is generated by an implicit rule from hello.o, which comes from hello.c.

NOTE: there are two bits of boiler-plate that still need addressing.

First of all, in the makefile fragment, GNUEFI_LIBDIR needs to be manually adjusted to fit the layout implemented by your distribution. Template entries for Debian and Fedora are provided.

Secondly, there is one bit of boiler-plate we cannot easily get rid of - we need to inform the toolchain whether the desired output is an application, a boot-time driver or a runtime driver. Templates for this are included in the Makefile snippet above - but note that different options must currently be set for toolchains where objcopy supports efi- targets directly and ones where it does not.

Building and running

Once the build environment has been set up, build the project as you would any regular codebase.

$ make
gcc -I/usr/include/efi -I/usr/include/efi/x86_64 -I/usr/include/protocol -mno-red-zone -fpic  -g -O2 -Wall -Wextra -Werror -fshort-wchar -fno-strict-aliasing -fno-merge-constants -ffreestanding -fno-stack-protector -fno-stack-check -DCONFIG_x86_64 -DGNU_EFI_USE_MS_ABI -maccumulate-outgoing-args --std=c11 -c hello.c -o hello.o
ld -nostdlib --warn-common --no-undefined --fatal-warnings --build-id=sha1 -shared -Bsymbolic /usr/lib/crt0-efi-x86_64.o -L /usr/lib64 -L /usr/lib /usr/lib/gcc/x86_64-linux-gnu/6/libgcc.a -T /usr/lib/ hello.o -o -lefi -lgnuefi
objcopy -j .text -j .sdata -j .data -j .dynamic -j .dynsym -j .rel \
        -j .rela -j .rel.* -j .rela.* -j .rel* -j .rela* \
        -j .reloc --target efi-app-x86_64 hello.efi
rm hello.o

Then get the resulting application (hello.efi) over to a filesystem accessible from UEFI and run it.

UEFI Interactive Shell v2.2
UEFI v2.60 (EDK II, 0x00010000)
Mapping table
FS0: Alias(s):HD1a1:;BLK3:
BLK2: Alias(s):
BLK4: Alias(s):
BLK0: Alias(s):
BLK1: Alias(s):
Press ESC in 5 seconds to skip startup.nsh or any other key to continue.
Shell> fs0:
FS0:\> hello
Hello, world!

Wohoo, it worked! (I hope.)


gnu-efi provides a way to easily develop drivers and applications for UEFI inside your POSIX environment, but it comes with some unnecessarily rough edges. Hopefully this post makes it easier for you to get started with developing real applications and drivers using gnu-efi quickly.

Clearly, we should be working towards getting this sort of thing included in upstream and installed with distribution packages.

by Leif Lindholm at January 01, 2018 00:00

January 17, 2018

Alex Bennée

Edit with Emacs v1.15 released

After a bit of a hiatus there was enough of a flurry of patches to make it worth pushing out a new release. I’m in a little bit of a quandary with what to do with this package now. It’s obviously a useful extension for a good number of people but I notice the slowly growing number of issues which I’m not making much progress on. It’s hard to find time to debug and fix things when its main state is Works For Me. There is also competition from the Atomic Chrome extension (and its related Emacs extension). It’s an excellent package and has the advantage of a Chrome extension that is more actively developed and uses a bi-directional web-socket to communicate with the edit server. It’s been a feature I’ve wanted to add to Edit with Emacs for a while but my re-factoring efforts are slowed down by the fact that Javascript is not a language I’m fluent in, and finding a long enough period of spare time is hard with a family. I guess this is a roundabout way of saying that realistically this package is in maintenance mode and you shouldn’t expect to see any new development for the time being. I’ll of course try my best to address reproducible bugs and process pull requests in a timely manner. That said, please enjoy v1.15:


* Now builds for Firefox using WebExtension hooks
* Use chrome.notifications instead of webkitNotifications
* Use … with style instead of inline for edit button
* fake “input” event to stop active page components overwriting text area


* avoid calling make-frame-on-display for TTY setups (#103/#132/#133)
* restore edit-server-default-major-mode if auto-mode lookup fails
* delete window when done editing with no new frame

Get the latest from the Chrome Webstore.

by Alex at January 01, 2018 16:47

December 30, 2017

Gema Gomez

Add new ball for knitting

I knit less than I crochet, and this means that I forget all the basic things from time to time. Up until now, I had never had to join a new ball of yarn to a project, because my projects were small and used just one skein.

After some research, I have found this video quite clear on how to add a new ball of yarn safely:


  1. In the middle of a row, insert the needle as if getting ready to knit a stitch normally.
  2. Instead of using the old yarn end, create a loop with the new one, and finish the stitch with it.
  3. Loop the old end of yarn over the top of the two new ones; this prevents a hole from forming.
  4. Holding both strands of the new ball of yarn, do three or four more regular stitches to secure everything.
  5. Drop the short end from the new ball and just pick up the long strand and continue as normal.

Note: be careful on the way back not to work increases on the stitches that have been knitted with two strands; work them together. If the loose ends loosen up whilst you are working, give them little tugs, then weave them in.

by Gema Gomez at December 12, 2017 00:00

December 29, 2017

Gema Gomez

Autumn Knitting and Stitching Show 2017

This year, once again, I took a day off during October and headed to Alexandra Palace in London to enjoy a day looking at knitting/sewing supplies and ideas. This year’s Autumn Knitting and Stitching Show was as interesting as always. I started the day doing some fabric shopping (everything was so colorful):


Then, inevitably, I admired all the art that was on display at the show. This time I was quite surprised by two scenes made of yarn, a railway station and a church. Here is proof that it can be knitted and it can look gorgeous:

railway station church

Awesome day out, as always with the Knitting and Stitching Show, cannot wait to see what things are there next year!

by Gema Gomez at December 12, 2017 00:00

October 14, 2017

Gema Gomez


A couple of weeks ago I was in San Francisco for work. This was not my first time in San Francisco, so I didn’t really have a very packed agenda. Since it was Sunday, I went out with a couple of colleagues; we stopped at Presidio for a picnic and ate some amazing food from the lovely food trucks there (Off the grid). Afterwards we headed to what would be an amazing visit to a yarn shop abroad, Imagiknit:

Imagiknit shop

I had never heard of it before one of my friends at work mentioned it a couple of weeks prior to our trip. The shop was a delight: spacious, welcoming, with a nice atmosphere. A lot of different brands of yarn, and lots of ideas hanging near each of them.

Inside the shop

It took us a while to do our shopping; there was a lot of wall space to cover and we wanted to make sure to get enough yarn to have something to remember this little corner of the world by. The shop keepers were knowledgeable and helpful; they got me some of the colors I needed that were not on display. They were kind, gave me advice on some of the patterns I was interested in, and found the books I was looking for. They did not only have yarn; they had plenty of accessories and books to choose from too.

Inside the shop

And this is what my shopping looked like when I arrived at the hotel:


ImagiKnit has become a must-go place for me whenever I am next in San Francisco. Totally worth a couple of hours if you are ever visiting the city and are into knitting or crochet.

by Gema Gomez at October 10, 2017 23:00