Developers Planet

October 03, 2019

Siddhesh Poyarekar

gcc under the hood

My background in computers is a bit hacky for a compiler engineer. I never had the theoretical computer science base that the average compiler geek does (yes I have a Masters in Computer Applications, but it’s from India and yes I’m going to leave that hanging without explanation) and I have pretty much winged it all these years. What follows is a bunch of thoughts (high five to those who get that reference!) from my winging it for almost a decade with the GNU toolchain. If you’re a visual learner then I would recommend watching my talk video at Linaro Connect 2019 instead of reading this. This is an imprecise transcript of my talk, with less silly quips and in a more neutral accent.

Hell Oh World!

It all started as a lightning talk at SHD Belgaum where I did a 5 minute demonstration of how a program goes from source code to a binary program. I got many people asking me to talk about this in more detail and it eventually became a full hour workshop at FOSSASIA in 2017. Those posts remained the basis for the first part of my talk at Connect; in fact they’re a bit more detailed in their treatment of purely taking code from source to binary.

Go read those posts, I’ll wait.

Under the hood

Welcome back! So now you know almost exactly how code goes from source to binary and how it gets executed on your computer. Congratulations, you now have a headstart into hacking on the dynamic loader bits in glibc! Now that we’ve explored the woods around the swamp, let’s step into the swamp. Let’s take a closer look at what gcc does with our source code.


The first thing a compiler does is to read the source code and make it into a data structure form that is easy for the computer to manipulate. Since the computer deals best with binary data, it converts the text form of the source code language into a tree-like structure. There is plenty of literature available on how that is implemented; in fact most compiler texts end up putting too much focus on this aspect of the compiler and then end up rushing through the real fun stuff that the compiler does with the program you wrote.

The data structure that the compiler translates your source code into is called an Intermediate Representation (IR), and gcc has three of them!

This part of the compiler that parses the source code into IR is known as the front end. gcc has many such front ends, one for each language it implements; they are all cluttered into the gcc/ directory. The C family of languages has a common subset of code in the gcc/c-family directory and then there are specializations like the C++ front end, implemented in the gcc/cp directory. There is a directory for fortran, another for java, yet another for go and so on. The translation of code from source to IR varies from frontend to frontend because of language differences, but they all have one thing in common; their output is an IR called GENERIC and it attempts to be language-independent.

Optimisation passes

Once gcc translates the source code into its IR form, it runs the IR through a number of passes (about 200 of them, more or less depending on the -O flag you pass) to try and come up with the most optimal machine code representation of the program. gcc builds source code a file at a time, which is called a Translation Unit (TU). A lot of its optimisation passes operate at the function level, i.e. individual functions are seen as independent units and are optimised separately. Then there are Inter-procedural analysis (IPA) passes that look at interactions of these functions and finally there is Link Time Optimisation(LTO) that attempts to analyse source code across translation units to potentially get even better results.

Optimisation passes can be architecture-independent or architecture-dependent.

Architecture independent passes do not care too much about the underlying machine beyond some basic details like word size, whether the CPU has floating point support, vector support, etc. These passes have some configuration hooks that allow their behaviour to be modified according to the target CPU architecture but the high level behaviour is architecture-agnostic. Architecture-independent passes are the holy grail for optimisation because they usually don’t get old; optimisations that work today will continue to work regardless of CPU architecture evolution.

Architecture-dependent passes need more information about the architecture and as a result their results may change as architectures evolve. Register allocation and instruction scheduling for example are very interesting and complex problems that architecture-specific passes handle. The instruction scheduling problem was a lot more critical back in the day when CPUs could only execute code sequentially. With out-of-order execution, the scheduling problem has not become less critical, but it has definitely changed in nature. Similarly the register allocation problem can be complicated by various factors such as the number of logical registers, how they share physical register files, how costly moving between different types of registers is, and so on. Architecture-dependent passes have to take into consideration all of these factors.

The final pass in architecture-dependent passes does the work of emitting assembly code for the target CPU. The architecture-independent passes constitute what is known as the middle-end of the compiler and the architecture-dependent passes form the compiler backend.

Each pass is a complex work of art, mathematics and logic and may have one or more highly cited research papers as their basis. No single gcc engineer would claim to understand all of these passes; there are many who have spent most of their careers on a small subset of passes, such is their complexity. But then, this is also a testament to how well we can work together as humans to create something that is significantly more complex than what our individual minds can grasp. What I mean to say with all this is, it’s OK to not know all of the passes, let alone know them well; you’re definitely not the only one.

Optimisation passes are all listed in gcc/passes.def and that is the sequence in which they are executed. A pass is defined as a class with a gate and execute function that determine whether to run and what to do respectively. Here’s what a single pass definition looks like:

/* Pass data to help the pass manager classify, prepare and cleanup.  */
const pass_data pass_data_my_pass =
{
  GIMPLE_PASS, /* type */
  "my_pass", /* name */
  OPTGROUP_LOOP, /* optinfo_flags */
  TV_TREE_LOOP, /* tv_id */
  PROP_cfg, /* properties_required */
  0, /* properties_provided */
  0, /* properties_destroyed */
  0, /* todo_flags_start */
  0, /* todo_flags_finish */
};

/* This is a GIMPLE pass.  I know you don't know what GIMPLE is yet ;) */
class pass_my_pass : public gimple_opt_pass
{
public:
  pass_my_pass (gcc::context *ctxt)
    : gimple_opt_pass (pass_data_my_pass, ctxt)
  {}

  /* opt_pass methods: */
  virtual bool gate (function *) { /* Decide whether to run.  */ }

  virtual unsigned int execute (function *fn);
};

unsigned int
pass_my_pass::execute (function *)
{
  /* Magic!  */
}
We will not go into the anatomy of a pass yet. That is perhaps a topic for a follow-up post.

GENERIC

GENERIC is the first IR that gets generated from parsing the source code. It is a tree structure that has been present since the earliest gcc versions. Its core data structure is a tree_node, which is a hierarchy of structs, with tree_base as the Abraham. OK, if you haven’t been following gcc development, this can come as a surprise: a significant portion of gcc is now in c++!

It’s OK, take a minute to mourn/celebrate that.

The tree_node can mean a lot of things (it is a union), but the most important one is the tree_typed node. There are various types of tree_typed, like tree_int_cst, tree_identifier, tree_string, etc. that describe the elements of source code that they house. The base structs, i.e. tree_typed and, even further up, tree_base, have flags carrying metadata about the elements that help in optimisation decisions. You’ll almost never use GENERIC in the context of code traversal in optimisation passes, but the nodes are still very important to know about because GIMPLE continues to use them to store operand information. Take a peek at gcc/tree-core.h and gcc/tree.def for a closer look at all of the types of nodes you could have in GENERIC.

What is GIMPLE you ask? Well, that’s our next IR and probably the most important one from the context of optimisations.

OK so now you have enough background to go look at the guts of the GENERIC tree IR in gcc. Here’s the gcc internals documentation that will help you navigate all of the convenience macros to analyze the tree nodes.

GIMPLE

The GIMPLE IR is a workhorse of gcc. Passes that operate on GIMPLE are architecture-independent.

The core structure in GIMPLE is a struct gimple that holds all of the metadata for a single gimple statement and is also a node in the list of gimple statements. There are various structures named gimple_statement_with_ops_* that have the actual operand information based on its type. Once again like with GENERIC, it is a hierarchy of structs. Note that the operands are all of the tree type so we haven’t got rid of all of GENERIC. gcc/gimple.h is where all of these structures are defined and gcc/gimple.def is where all of the different types of gimple statements are defined.

Where did my control flow go?

But how is it that a simple list of gimple statements is sufficient to traverse a program that has conditions and loops? Here is where things get interesting. Every sequence of GIMPLE statements is housed in a Basic Block (BB) and the compiler, when translating from GENERIC to GIMPLE (also known as lowering, since GIMPLE is structurally simpler than GENERIC), generates a Control Flow Graph (CFG) that describes the flow of the function from one BB to another. The only control flow idea one then needs in GIMPLE to traverse code from within the gimple statement context is a jump from one block to another and that is fulfilled by the GIMPLE_GOTO statement type. The CFG with its basic blocks and edges connecting those blocks, takes care of everything else. Routines to generate and manipulate the CFG are defined in gcc/cfg.h and gcc/cfg.c but beware, you don’t modify the CFG directly. Since the CFG is tightly linked with GIMPLE (and RTL, yes that’s our third and final IR), it provides hooks to manipulate the graph and update GIMPLE if necessary.

The last interesting detail about the CFG is that it has a special construct for loops, because they’re typically the most interesting subjects for optimisation: you can splice them, unroll them, distribute them and more to produce some fantastic performance results. gcc/cfgloop.h is where you’ll find all of the routines you need to traverse and manipulate loops.
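
As an illustration of what the CFG gives the compiler, here is a toy Python sketch (my own illustration, not gcc’s API) that finds a loop in a small CFG by detecting a back edge with a depth-first walk, which is essentially how natural loops are discovered:

```python
# Toy CFG for a function with one loop:
#   BB0 (entry) -> BB1 (loop header) -> BB2 (body) -> BB1 (back edge)
#   BB2 -> BB3 (exit)
cfg = {0: [1], 1: [2], 2: [1, 3], 3: []}

def back_edges(cfg, entry=0):
    """Find back edges (tail -> header) with a DFS; an edge into a block
    that is still on the DFS stack closes a loop."""
    edges, on_stack, seen = set(), set(), set()

    def dfs(bb):
        seen.add(bb)
        on_stack.add(bb)
        for succ in cfg[bb]:
            if succ in on_stack:        # edge back into an active block: a loop
                edges.add((bb, succ))
            elif succ not in seen:
                dfs(succ)
        on_stack.remove(bb)

    dfs(entry)
    return edges

assert back_edges(cfg) == {(2, 1)}      # the BB2 -> BB1 edge closes the loop
```

gcc keeps much richer loop metadata than this, of course, but the back edge is the seed from which a loop structure is built.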

The final important detail with regard to GIMPLE is the Static Single Assignment (SSA) form. Typical source code would have variables that get declared, assigned to, manipulated and then stored back into memory. Essentially, it is multiple operations on a single variable, sometimes destroying its previous contents as we reuse variables for different things that are logically related in the context of the high level program. This reuse however makes it complicated for a compiler to understand what’s going on and hence it ends up missing a host of optimisation opportunities.

To make things easier for optimisation passes, these variables are broken up into versions of themselves that have a specific start and end point in their lifecycle. This is called the Static Single Assignment form of a variable, where each version of the variable has a single starting point, viz. its definition. So if you have code like this:

    x = 10;
    x += 20;

it becomes:

    x_1 = 10;
    x_2 = x_1 + 20;

where x_1 and x_2 are versions of x. If you have versions of variables in conditions, things get interesting and the compiler deals with it with a mysterious concept called PHI nodes. So this code:

    if (n > 10)
      x = 10;
    else
      x = 20;
    return x;

becomes:

    if (n > 10)
      x_1 = 10;
    else
      x_2 = 20;
    # x_3 = PHI<x_1, x_2>;
    return x_3;

So the PHI node is a conditional selector of the earlier two versions of the variable and depending on the results of the optimisation passes, you could eliminate versions of the variables altogether or use CPU registers more efficiently to store those variable versions.
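
To see why the single-definition property is so convenient, here is a toy Python sketch (my own illustration, not gcc’s GIMPLE/SSA machinery) that evaluates the example above in SSA form, with the PHI node selecting a version based on the path taken:

```python
# Toy SSA form of:  if (n > 10) x = 10; else x = 20; return x;
# Each name is defined exactly once, so a pass can answer "what is x_1?"
# with a single table lookup instead of tracking every reassignment.
ssa = [
    ("x_1", ("const", 10)),            # "then" arm
    ("x_2", ("const", 20)),            # "else" arm
    ("x_3", ("phi", "x_1", "x_2")),    # merge point selects a version
]

def eval_ssa(program, cond):
    env = {}
    for name, op in program:
        if op[0] == "const":
            env[name] = op[1]
        else:                          # phi: pick the version for the path taken
            env[name] = env[op[1] if cond else op[2]]
    return env["x_3"]

assert eval_ssa(ssa, True) == 10       # the n > 10 path
assert eval_ssa(ssa, False) == 20      # the other path
```

A constant-propagation pass gets the same benefit: every use points at exactly one definition, so facts about a version never need to be invalidated by a later store.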

There you go, now you have everything you need to get started on hacking GIMPLE. I know this part is a bit heavy but guess what, this is where you can seriously start thinking about hacking on gcc! When you jump in, you’ll need the more detailed information in the gcc internals manual on GIMPLE, CFG and GIMPLE optimisations.

RTL

We are yet another step closer to generating assembly code for our assembler and linker to build into the final program. Recall that GIMPLE is largely architecture independent, so it works on high level ideas of statements, expressions and types and their relationships. RTL is much more primitive in comparison and is designed to mimic sequential architecture instructions. Its main purpose is to do architecture-specific work, such as register allocation and scheduling instructions in addition to more optimisation (because you can never get enough of that!) passes that make use of architecture information.

Internally, you will encounter two forms of RTL: one is the rtx struct that is used for most transformations and passes in the compiler; the other is a textual form, written as Lisp-like S-expressions, that maps machine instructions to RTL. The latter is used to write machine descriptions, where you can specify machine instructions for all of the common operations such as add, sub, multiply and so on. For example, this is what the description of a jump looks like for aarch64:

(define_insn "jump"
  [(set (pc) (label_ref (match_operand 0 "" "")))]
  ""
  "b\\t%l0"
  [(set_attr "type" "branch")]
)

Machine description files live in the gcc/config directory in their respective architecture subdirectory and have the .md suffix. The main aarch64 machine description file, for example, is under gcc/config/aarch64/. These files are parsed by a tool in gcc to generate C code with rtx structures for each of those S-expressions.

In general, the gcc/config directory contains source code that handles compilation for that specific architecture. In addition to the machine descriptions, these directories also have C code that enhances the RTL passes to exploit as much of the architecture information as they possibly can. This is where some of the detailed architecture-specific analysis of the RTL instructions goes. For example, combining loads into load pairs for aarch64 is an important task and it is done with a combination of machine description and some code to peek into and rearrange neighbouring RTL instructions.

But then you’re wondering, why are there multiple description files? Other than just cleaner layout (put constraint information in a separate file, type information in another, etc.) it is because there are multiple evolutions of an architecture. The ‘i386’ architecture is a mess of evolutions that span word sizes and capabilities. The aarch64 architecture has within it many microarchitectures developed by various Arm licensee vendors like xgene, thunderxt88, falkor and also those developed by Arm such as the cortex-a57, cortex-a72, ares, etc. All of these have different behaviours and performance characteristics despite sharing an instruction set. For example on some microarchitecture one may prefer to emit the csel instruction instead of cmp, b.cond and multiple mov instructions to reduce code size (and hence improve performance) but on some other architecture, the csel instruction may have been designed really badly and hence it may well be cheaper to execute the 4+ instructions instead of the one csel. These are behaviour quirks that you select when you use the -mtune flag to optimise for a specific CPU. A lot of this information is also available in the C code of the architecture in the form of structures called cost tables. These are relative costs of various operations that help the RTL passes and some GIMPLE passes determine the best target code and optimisation behaviour accordingly for the CPU. Here’s an example of register move costs for the Qualcomm Centriq processor:

static const struct cpu_regmove_cost qdf24xx_regmove_cost =
{
  2, /* GP2GP  */
  /* Avoid the use of int<->fp moves for spilling.  */
  6, /* GP2FP  */
  6, /* FP2GP  */
  4 /* FP2FP  */
};
This tells us that in general, moving between general purpose registers is cheapest, moving between floating point registers is slightly more expensive and moving between general purpose and floating point registers is the most expensive.
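
To make the idea concrete, here is a toy Python sketch (my own illustration, not gcc code) of how a pass might consult such a table when weighing a chain of register moves; the numbers are the qdf24xx costs quoted above:

```python
# Relative costs from the qdf24xx_regmove_cost table quoted above.
cost = {("GP", "GP"): 2, ("GP", "FP"): 6, ("FP", "GP"): 6, ("FP", "FP"): 4}

def move_cost(path):
    """Total cost of moving a value along a chain of register classes."""
    return sum(cost[(a, b)] for a, b in zip(path, path[1:]))

# Parking a general purpose value in an FP register and bringing it back
# costs two cross-file moves, which is exactly what the comment in the
# table warns the allocator against.
assert move_cost(["GP", "FP", "GP"]) == 12
assert move_cost(["GP", "GP"]) == 2
```

With those numbers in hand, a pass can compare the cross-file round trip against the cost of a plain stack spill and pick the cheaper option for this particular CPU.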

The other important detail in such machine description files is the pipeline description. This is a description of how the pipeline for a specific CPU microarchitecture is designed along with latencies for instructions in that pipeline. This information is used by the instruction scheduler pass to determine the best schedule of instructions for a CPU.

Now where do I start?

This is a lot of information to page in at once and if you’re like me, you’d want something more concrete to get started with understanding what gcc is doing. gcc has you covered there because it has flags that allow you to study the IR generated for the compiler at every stage of compilation. By every stage, I mean every pass! The -fdump-* set of flags allow you to dump the IR to study what gcc did in each pass. Particularly, -fdump-tree-* flags dump GIMPLE IR (in a C-like format so that it is not too complicated to read) into a file for each pass and the -fdump-rtl-* flags do the same for RTL. The GIMPLE IR can be dumped in its raw form as well (e.g. -fdump-tree-all-raw), which makes it much simpler to correlate with the code that is manipulating the GIMPLE in an optimisation pass.

The easiest way to get into gcc development (and compiler development in general) in my experience is the back end. Try tweaking the various cost tables to see what effect it has on code generation. Modify the instructions generated by the RTL descriptions and use that to look closer at one pass that interests you. Once you’re comfortable with making changes to gcc, rebuilding and checking its outputs, you can then try writing a pass, which is a slightly more involved process. Maybe I’ll write about it some day.

Resources

  • The GCC internals manual is the canonical place to read (and fix up) documentation for GCC internals. It is wonderfully detailed and hopelessly outdated at the same time. Bringing it up to date is a task by itself and the project continues to look for volunteers to do that.
  • David Malcolm has a more functional newbie guide for fledgling gcc hackers who might struggle with debugging gcc and getting involved in the gcc development process.
  • The GCC Resource Center workshop on GCC is where I cut my teeth on gcc internals. I don’t think their workshop is active anymore but they have presentations and other literature there that is still very relevant.
  • I wrote about micro-optimisations in the past and those ideas are great to try on gcc.

by Siddhesh at October 10, 2019 21:12

September 16, 2019

Siddhesh Poyarekar

Wrestling with the register allocator: LuaJIT edition

For some time now, I kept running into one specific piece of code in luajit repeatedly for various reasons and last month I came across a fascinating register allocator pitfall that I had never encountered before. But then as is the norm after fixing such bugs, I concluded that it’s too trivial to write about and I was just stupid to not have found it sooner; all bugs are trivial once they’re fixed.

After getting over my imposter syndrome (yeah I know, took a while) I finally have the courage to write about it, so like a famous cook+musician says, enough jibberjabber…

Looping through a hash table

One of the key data structures that luajit uses to implement metatables is a hash table. Due to its extensive use, the lookup loop into such hash tables is optimised in the JIT using an architecture-specific asm_href function. Here is what a snippet of the arm64 version of the function looks like. A key thing to note is that the assembly code is generated backwards, i.e. the last instruction is emitted first.

  /* Key not found in chain: jump to exit (if merged) or load niltv. */
  l_end = emit_label(as);
  as->invmcp = NULL;
  if (merge == IR_NE)
    asm_guardcc(as, CC_AL);
  else if (destused)
    emit_loada(as, dest, niltvg(J2G(as->J)));

  /* Follow hash chain until the end. */
  l_loop = --as->mcp;
  emit_n(as, A64I_CMPx^A64I_K12^0, dest);
  emit_lso(as, A64I_LDRx, dest, dest, offsetof(Node, next));
  l_next = emit_label(as);

  /* Type and value comparison. */
  if (merge == IR_EQ)
    asm_guardcc(as, CC_EQ);
  else
    emit_cond_branch(as, CC_EQ, l_end);

  if (irt_isnum(kt)) {
    if (isk) {
      /* Assumes -0.0 is already canonicalized to +0.0. */
      if (k)
        emit_n(as, A64I_CMPx^k, tmp);
      else
        emit_nm(as, A64I_CMPx, key, tmp);
      emit_lso(as, A64I_LDRx, tmp, dest, offsetof(Node, key.u64));
    } else {
      Reg ftmp = ra_scratch(as, rset_exclude(RSET_FPR, key));
      emit_nm(as, A64I_FCMPd, key, ftmp);
      emit_dn(as, A64I_FMOV_D_R, (ftmp & 31), (tmp & 31));
      emit_cond_branch(as, CC_LO, l_next);
      emit_nm(as, A64I_CMPx | A64F_SH(A64SH_LSR, 32), tisnum, tmp);
      emit_lso(as, A64I_LDRx, tmp, dest, offsetof(Node, key.n));
    }
  } else if (irt_isaddr(kt)) {
    Reg scr;
    if (isk) {
      int64_t kk = ((int64_t)irt_toitype(irkey->t) << 47) | irkey[1].tv.u64;
      scr = ra_allock(as, kk, allow);
      emit_nm(as, A64I_CMPx, scr, tmp);
      emit_lso(as, A64I_LDRx, tmp, dest, offsetof(Node, key.u64));
    } else {
      scr = ra_scratch(as, allow);
      emit_nm(as, A64I_CMPx, tmp, scr);
      emit_lso(as, A64I_LDRx, scr, dest, offsetof(Node, key.u64));
    }
    rset_clear(allow, scr);
  } else {
    Reg type, scr;
    lua_assert(irt_ispri(kt) && !irt_isnil(kt));
    type = ra_allock(as, ~((int64_t)~irt_toitype(ir->t) << 47), allow);
    scr = ra_scratch(as, rset_clear(allow, type));
    rset_clear(allow, scr);
    emit_nm(as, A64I_CMPw, scr, type);
    emit_lso(as, A64I_LDRx, scr, dest, offsetof(Node, key));
  }

  *l_loop = A64I_BCC | A64F_S19(as->mcp - l_loop) | CC_NE;

Here, the emit_* functions emit assembly instructions and the ra_* functions allocate registers. In the normal case everything is fine and the table lookup code is concise and effective. When there is register pressure however, things get interesting.

As an example, here is what a typical type lookup would look like:

0x100	ldr x1, [x16, #52]
0x104	cmp x1, x2
0x108	beq -> exit
0x10c	ldr x16, [x16, #16]
0x110	cmp x16, #0
0x114	bne 0x100

Here, x16 is the table that the loop traverses. x1 is a key, which if it matches, results in an exit to the interpreter. Otherwise the loop moves ahead until the end of the table. The comparison is done with a constant stored in x2.

The value of x2 is loaded later (i.e. earlier in the code, we are emitting code backwards, remember?) whenever that register is needed for reuse, through a process called restoration or spilling. In the restore case, it is loaded into the register as a constant or expressed in terms of another constant (look up constant rematerialisation) and in the case of a spill, the register is restored from a slot in the stack. If there is no register pressure, all of this restoration happens at the head of the trace, which is why if you study a typical trace you will notice a lot of constant loads at the top of the trace.

Like the Spill that ruined your keyboard the other day…

Things get interesting when the allocation of x2 in the loop above results in a restore. Looking at the code a bit closer:

    type = ra_allock(as, ~((int64_t)~irt_toitype(ir->t) << 47), allow);
    scr = ra_scratch(as, rset_clear(allow, type));
    rset_clear(allow, scr);
    emit_nm(as, A64I_CMPw, scr, type);
    emit_lso(as, A64I_LDRx, scr, dest, offsetof(Node, key));

The x2 here is type, which is a constant. If a register is not available, we have to make one available by either rematerializing or by restoring the register, which would result in something like this:

0x100   ldr x1, [x16, #52]
0x104   cmp x1, x2
0x108	mov x2, #42
0x10c   beq -> exit
0x110   ldr x16, [x16, #16]
0x114   cmp x16, #0
0x118   bne 0x100

This ends up breaking the loop because the allocator restore/spill logic assumes that the code is linear and the restore will affect only code that follows it, i.e. code that got generated earlier. To fix this, all of the register allocations should be done before the loop code is generated.
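
A toy simulation (my own, not LuaJIT code) of the schedule above makes the breakage obvious: if the restore of x2 lands after the compare inside the loop body, the first iteration compares against a stale register:

```python
def run(per_iter_order, iters=2):
    """Execute a toy loop body; 'mov' restores x2 to 42 and 'cmp' records
    the value of x2 it saw.  x2 starts out stale (None), as on loop entry."""
    x2, seen = None, []
    for _ in range(iters):
        for insn in per_iter_order:
            if insn == "mov":
                x2 = 42
            else:                      # cmp
                seen.append(x2)
    return seen

# Restore emitted inside the loop after the compare that needs it: the
# first iteration compares garbage, exactly the broken schedule above.
assert run(["cmp", "mov"]) == [None, 42]
# Restore placed ahead of the compare (as the fix arranges by allocating
# before the loop body is generated): every iteration sees a defined x2.
assert run(["mov", "cmp"]) == [42, 42]
```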

Making things right

The result of this analysis was this fix in my LuaJIT fork that allocates registers for operands that will be used in the loop before generating the body of the loop. That is, if the registers have to spill, they will do so after the loop (we are generating code in reverse order) and leave the loop compact. The fix is also in the luajit2 repository in the OpenResty project. The work was sponsored by OpenResty as this wonderfully vague bug could only be produced by some very complex scripts that are part of the OpenResty product.

by Siddhesh at September 09, 2019 00:20

September 12, 2019

Ard Biesheuvel

GHASH for high-end ARM cores

After years of using Cortex-A57 or A53 based systems as both my development machines and my optimization targets, I was recently given a ThunderX2 workstation, and after having moved my development environment to it, I have started looking into whether the existing accelerated crypto drivers in the Linux kernel perform optimally on this micro-architecture.

Marvell’s ThunderX2 core (née Broadcom Vulcan) is a server class CPU with a deep pipeline. This typically means that it takes longer for the result of an instruction to become available to subsequent instructions, and instead of stalling the pipeline waiting for that result, we should try to do some other useful work in the meantime.

So for the simpler AES modes such as CBC and CTR, I have increased the interleave factor from 4x to 5x, resulting in a speedup of around 11% on ThunderX2. This result fueled my suspicion that the GCM code, which was using an interleave factor of only 2x, could benefit even more from this approach, and since I already added an implementation of GHASH (one of the constituent parts of GCM) that uses aggregation to operate on 4 blocks of input at a time, it should simply be a matter of combining this code with a 4-way interleaved implementation of CTR.

GHASH aggregation

GHASH is not a general purpose cryptographic hash function, but one that was specifically designed for (and should only be used in) AES in GCM mode. It is based on multiplication in GF(2^128), and basically comes down to the following

X[0] = (H * I[0]) mod M
X[1] = (H * (X[0] ^ I[1])) mod M
...
X[n] = (H * (X[n-1] ^ I[n])) mod M

where I[] is an input array of 16 byte blocks, and the output of the hash function is the last value in the array X[]. H is a quantity that is derived from the AES key by encrypting a single AES block consisting of NUL bytes, and M is GHASH’s characteristic polynomial x^128 + x^7 + x^2 + x + 1 over GF(2^128).

Each line involves a multiplication in GF(2^128), which means that it consists of a multiplication step and a modulo division step, and it is possible to amortize the cost of the latter step over multiple multiplications, increasing performance. For example, we can process two blocks at a time like so

X[n] = (H * (((H * (X[n-2] ^ I[n-1])) mod M) ^ I[n])) mod M
     = ((H^2 * (X[n-2] ^ I[n-1])) ^ (H * I[n])) mod M

This is called GHASH aggregation, and was implemented for the core GHASH transform here. Note that this requires powers of H to be precomputed, and so we cannot apply this transformation an arbitrary number of times.
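
To see the aggregation identity concretely, here is a small Python sketch (my own toy, not the kernel code) that implements multiplication in GF(2^128) with the polynomial above, using a plain integer-as-polynomial representation (real GCM uses a bit-reflected convention), and checks that the aggregated two-block form matches two sequential GHASH steps:

```python
M = (1 << 128) | 0x87        # x^128 + x^7 + x^2 + x + 1

def gf_mul(a, b):
    """Carry-less multiply of two polynomials over GF(2), reduced mod M."""
    r = 0
    for i in range(b.bit_length()):
        if (b >> i) & 1:
            r ^= a << i
    for i in range(r.bit_length() - 1, 127, -1):   # reduce degree below 128
        if (r >> i) & 1:
            r ^= M << (i - 128)
    return r

# Arbitrary wide test values so the reduction step actually triggers.
H  = (0xfeedface << 96) | 0x1234
I0 = (0xdeadbeef << 96) | 0x5678
I1 = (0xcafef00d << 96) | 0x9abc

step_by_step = gf_mul(H, gf_mul(H, I0) ^ I1)            # X[1] via two steps
aggregated   = gf_mul(gf_mul(H, H), I0) ^ gf_mul(H, I1) # H^2 form
assert step_by_step == aggregated
```

The identity holds because multiplication distributes over XOR in GF(2^128); the performance win comes from the real code deferring the reduction across the aggregated terms rather than reducing after every block.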

Another thing to note is that this algorithm combines the highest power of H with the first input block, which means that if we want to reuse the same code sequence to process fewer than n blocks, we need to populate the input blocks from right to left, and jump to the appropriate point in the routine so that only the lower powers of H are used in the computation.

Handling the tail

GCM is an AEAD (Authenticated Encryption with Associated Data) algorithm which can operate on inputs of any size. Since GCM is mainly used in the networking space, we should expect to deal mostly with the typical size of a network packet, which is ~1500 bytes. This means that our 4-way aggregated code operating on 4 blocks or 64 bytes at a time will always have to deal with a tail block of less than 64 bytes in size.

Since out-of-order cores with deep pipelines perform poorly when running code with lots of branches and tight loops that load small quantities at a time, we should try to handle the tail with as few branches as possible, and use large, potentially overlapping loads to get the data into SIMD registers.

This is what the sequence below implements. It unconditionally performs four 16-byte loads, and manipulates the input address and the post-increment values so that we end up with I[] populated right to left with between 1 and 63 bytes of remaining input. (x0 contains the source address, and x2 the size of the remaining input minus 64)

mov     x15, #16
ands    x19, x0, #0xf
csel    x19, x19, x15, ne
adr_l   x17, .Lpermute_table + 16

sub     x11, x15, x19
add     x12, x17, x11
ld1     {T1.16b}, [x12]
sub     x11, x2, x11

cmp     x0, #-16
csel    x14, x15, xzr, gt
cmp     x0, #-32
csel    x15, x15, xzr, gt
cmp     x0, #-48
csel    x16, x19, xzr, gt
csel    x2, x2, x11, gt

ld1     {I0.16b}, [x2], x14
ld1     {I1.16b}, [x2], x15
ld1     {I2.16b}, [x2], x16
ld1     {I3.16b}, [x2]
tbl     I3.16b, {I3.16b}, T1.16b

Since we always load at least 16 bytes, it is up to the caller to present the input in a way that permits doing so if the entire input is less than 16 bytes in size (e.g. by copying the input to the end of a 16 byte buffer).
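
To make the layout concrete, here is a toy Python model (my own; the real code achieves this with overlapping loads and a tbl permute, and the unused lanes hold whatever the loads bring in rather than zeroes) of populating I[] right to left from a 1-63 byte tail:

```python
def tail_blocks(tail):
    """Lay out a 1..63 byte tail as four 16-byte blocks populated from the
    right, zero-filled on the left, mirroring the right-to-left scheme."""
    assert 1 <= len(tail) <= 63
    padded = bytes(64 - len(tail)) + tail
    return [padded[i:i + 16] for i in range(0, 64, 16)]

blocks = tail_blocks(b"A" * 20)            # 20 bytes left: one full + one partial
assert blocks[3] == b"A" * 16              # I3 gets the final full 16 bytes
assert blocks[2] == bytes(12) + b"A" * 4   # I2 holds the 4-byte remainder
assert blocks[0] == blocks[1] == bytes(16) # unused blocks stay empty
```

Populating from the right is what lets the routine jump into the middle of the aggregated GHASH sequence so that only the lower powers of H touch the real data.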

CTR encryption

While the 4-way GHASH code requires taking another code path through the 4-way routine when it is invoked with less than 4 blocks of input, the CTR code can be modified more easily to support this, by incrementing the overall counter by the remaining number of blocks, and then subtracting 4-n for each keystream block n. This naturally populates the keystream blocks from right to left as well, and instead of using a different encryption routine depending on the remaining input size, we can just unconditionally generate a 64 byte keystream and discard the parts that we don’t need.
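
Here is a toy Python sketch (my own illustration, assuming the counter scheme described above) of the counters a 4-way step would use when m blocks remain:

```python
def four_way_counters(ctr, m):
    """Counters used by one 4-way CTR step with m (1..4) blocks remaining:
    bump the counter by m up front, then keystream block k takes
    final - (4 - k).  Live blocks land right-aligned; the leftover ones
    are generated and simply discarded."""
    final = ctr + m
    return [final - (4 - k) for k in range(4)], final

ctrs, final = four_way_counters(100, 2)
assert ctrs[2:] == [100, 101]          # the two live blocks, populated right to left
assert final == 102                    # counter is ready for the next call
ctrs4, _ = four_way_counters(100, 4)
assert ctrs4 == [100, 101, 102, 103]   # a full batch degenerates to plain CTR
```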

Generating the tag

The final step in performing GCM is generating the authentication tag, which involves one final round of GHASH on a block of input containing the input size, and a final pass of CTR encryption.

While the existing code does not handle the tail or the tag directly, and jumps back to C code instead to copy the tail into a stack buffer and process it via the split GHASH and CTR routines, we have now incorporated this into the core GCM transform. This means we can go one step further, and fold the tag generation in as well.

ld1     {INP3.16b}, [x10]             // load lengths[]
mov     w9, #1                        // one block of input
bl      pmull_gcm_ghash_4x

mov     w11, #(0x1 << 24)             // BE '1U'
ld1     {KS0.16b}, [x5]               // load upper counter
mov     KS0.s[3], w11                 // set lower counter

enc_block KS0, x7, x6, x12

ext     XL.16b, XL.16b, XL.16b, #8
rev64   XL.16b, XL.16b
eor     XL.16b, XL.16b, KS0.16b
st1     {XL.16b}, [x10]               // store tag


When combining all the changes above, we end up with a routine that can consume the entire encryption input buffer including the tail and the input block for the tag, and process it in a way that is up to 25% faster than the old code for 1500 byte blocks. Note that 4 KiB blocks are only 11-16% faster, and on other micro-architectures, performance regresses slightly for large inputs. This is not a use case worth obsessing about, though. More performance numbers below, the full implementation can be found here.

         AES-128            AES-192            AES-256
#bytes   512   1500  4k     512   1500  4k     512   1500  4k
TX2      35%   23%   11%    34%   20%   9%     38%   25%   16%
EMAG     11%   6%    3%     12%   4%    2%     11%   4%    2%
A72      8%    5%    -4%    9%    4%    -5%    7%    4%    -5%
A53      11%   6%    -1%    10%   8%    -1%    10%   8%    -2%

by ardbiesheuvel at September 09, 2019 15:48

September 03, 2019

Marcin Juszkiewicz

My first 8K intro

I have been on the demoscene since 1997, when I attended the “Intel Outside 4” party in Włocławek, Poland. But I had never released anything. Until the Xenium 2019 party, where I presented my first 8K intro. Written for the Atari 2600 game console. In some kind of BASIC language…

The idea

The idea for it came about a year ago, during the Riverwash demoscene party. Most PC 64KB intros start with some kind of progress bar while the code generates textures, instruments and other stuff requiring calculations. I joked that it would be great to make something similar for the Atari 2600 VCS.

The fun part is that the Atari 2600 lacks any usable memory: it has 128 bytes (bytes, not kilobytes) of RAM. The only storage is the cartridge, with 4KB of ROM space (expandable to 32KB by bankswitching). So there is no point in generating anything at any phase other than development.

batari Basic

During July I took a look at the available options and found ‘batari Basic’. It is a BASIC-like language developed in 2005-2007 by Fred X. Quimby.

My main source of help was the forum on Atari Age. Detailed language info came from a useful page called ‘Random Terrain’. There is also ‘Visual BB’, an IDE with some tools that speed up development.


Thanks to the Xenium organizers I have my intro recorded as a video:

thumbnail for LChMOMF8GPw video

As you can see, it has two parts. The first one is a playfield with one line changing every 30 frames (the first version was for NTSC). The result is a simple progress bar.

The second part uses the “titlescreen kernel” to display dino graphics. I used the image provided by the Xenium organizers (rescaled to 96x91 pixels), with rainbow colours to get some colour on screen.

At party

At the start of the “oldschool intro” competition I told a friend sitting next to me that I hoped for at least four entries. He asked “why?”, so I pointed at the screen, saying “because of it”:

My first 8K intro announcement screen

There were six entries, so I thought I was safe and would place fifth at worst.


Imagine my surprise during the announcement of the voting results the next day. My intro took 3rd place!

Award for 3rd place

Amazing for a production without any effects and without any audio. And written in a BASIC-like language, without using any knowledge of 6502 assembly.

Source code

If someone wants to see how simple it was, the source code is in the “my first 8k” git repo on GitHub. Enjoy!

What next?

During Xenium I got a lot of technical information about VCS programming. And there were questions about my next production.

Nothing to promise here. And not for the Atari VCS: it is a sick platform to program, due to the lack of any framebuffer memory, so ‘racing the beam’ is the norm…

I have some ideas for the Atari XL/XE, though.

by Marcin Juszkiewicz at September 03, 2019 10:53

August 28, 2019

Steve McIntyre

If you can't stand the heat, get out of the kitchen...

Wow, we had a hot weekend in Cambridge. About 40 people turned up to our place in Cambridge for this year's OMGWTFBBQ. Last year we were huddling under the gazebos for shelter from torrential rain; this year we again had all the gazebos up, but this time to hide from the sun instead. We saw temperatures well into the 30s, which is silly for Cambridge at the end of August.

I think it's fair to say that everybody enjoyed themselves despite the ludicrous heat levels. We had folks from all over the UK, and Lars and Soile travelled all the way from Helsinki in Finland to help him celebrate his birthday!


We had a selection of beers again from the nice folks at Milton Brewery:
is 3 firkins enough?

Lars made pancakes, Paul made bread, and people brought lots of nice food and drink with them too.

Many thanks to a number of awesome friendly companies for again sponsoring the important refreshments for the weekend. It's hungry/thirsty work celebrating like this!

August 28, 2019 20:17

August 27, 2019

Mark Brown

Linux Audio Miniconference 2019

As in previous years we’re going to have an audio miniconference so we can get together and talk through issues, especially design decisions, face to face. This year’s event will be held on Thursday October 31st in Lyon, France, the day after ELC-E. It will be held at the Lyon Convention Center (the ELC-E venue), generously sponsored by Intel.

As with previous years let’s pull together an agenda through a mailing list discussion – this announcement has been posted to alsa-devel as well, the most convenient thing would be to follow up to it. Of course if we can sort things out more quickly via the mailing list that’s even better!

If you’re planning to attend please fill out the form here.

This event will be covered by the same code of conduct as ELC-E.

Thanks again to Intel for supporting this event.

by broonie at August 27, 2019 20:28

August 10, 2019

Steve McIntyre

DebConf in Brazil again!

Highvoltage and me

I was lucky enough to meet up with my extended Debian family again this year. We went back to Brazil for the first time since 2004, this time in Curitiba. And this time I didn't lose anybody's clothes! :-)

Rhonda modelling our diversity T!

I had a very busy time, as usual - lots of sessions to take part in, and lots of conversations with people from all over. As part of the Community Team (ex-AH Team), I had a lot of things to catch up on too, and a sprint report to send. Despite all that, I even managed to do some technical things too!

I ran sessions about UEFI Secure Boot, the Arm ports and the Community Team. I was meant to be running a session for the web team too, but the dreaded DebConf 'flu took me out for a day. It's traditional - bring hundreds of people together from all over the world, mix them up with too much alcohol and not enough sleep and many people get ill... :-( Once I'm back from vacation, I'll be doing my usual task of sending session summaries to the Debian mailing lists to describe what happened in my sessions.

Maddog showed a group of us round the micro-brewery at Hop'n'Roll which was extra fun. I'm sure I wasn't the only experienced guy there, but it's always nice to listen to geeky people talking about their passion.

Small group at Hop'n'Roll

Of course, I couldn't get to all the sessions I wanted to - there are just too many things going on in DebConf week, and sessions clash at the best of times. So I have a load of videos on my laptop to watch while I'm away. Heartfelt thanks to our always-awesome video team for their efforts to make that possible. And I know that I had at least one follower at home watching the live streams too!

Pepper watching the Arm BoF

August 10, 2019 18:33

July 25, 2019

Marcin Juszkiewicz

Kolla ‘stein’ released

Last Monday we finally released Kolla and Kolla Ansible 8.0.0 ‘stein’. It took us longer than we planned, but now it is done and ready for users.


What got changed? Many things — details can be found in Kolla release notes and Kolla Ansible release notes.

My work in this cycle was more reviews, less code. And a lot of planning for how to handle the Python 3 migration for Debian/Ubuntu based images. At some point we decided that this work would be postponed to the ‘train’ cycle. You can read more about it in my previous post: Moving Kolla images to Python 3.

Why so late

Usually we release Kolla ‘two weeks’ after the official OpenStack release. This allows us to switch to the final release code of other projects, do some testing etc. This time it took far longer :(

Due to several issues (some core developers got more occupied with work, distributions changed dependencies in packages) it took us longer than ‘two weeks’ after the official OpenStack ‘stein’ release. We added more tests for CI, handled the partial Python 3 migration in the Ubuntu Cloud Archive and more. Several fixes were made in ‘train’ and backported to ‘stein’.

Now it is your turn — build, deploy, test, report ;D

by Marcin Juszkiewicz at July 25, 2019 10:53

July 01, 2019

Naresh Bhat

CentOS: Create and share your own YUM repository

This blog explains how to create your own yum repository with the createrepo tool and distribute specialized packages within an organization. As an example, I use a repository of kernel RPMs.

Creating your own yum repository is very simple, and very straightforward. In order to do it, you need the createrepo tool, which can be found in the createrepo package, so to install it, execute as root:
# yum install createrepo
Once the package is installed, you can begin creating your repository. You will also need some RPM packages to create the repository with. Decide where you want to store your repository; let's say /var/ftp/pub will be the base directory.
Depending on how particular you want to get, you can dump everything to a single repository or keep things organized. 
# mkdir -p /var/ftp/pub/repo/CentOS/7/{SRPMS,aarch64}
Now copy your aarch64 packages to /var/ftp/pub/repo/CentOS/7/aarch64, and the SRPMS you have (if wanted) to /var/ftp/pub/repo/CentOS/7/SRPMS. To easily automate the creation of the repository metadata, create a shell script called create-my-repo and place it somewhere in your PATH:
#!/bin/bash

destdir=/var/ftp/pub/repo/CentOS/7

for arch in aarch64; do
    pushd ${destdir}/${arch} >/dev/null 2>&1
    createrepo .
    popd >/dev/null 2>&1
done

Make the script executable; whenever you run it, it will call the createrepo tool on one directory: /var/ftp/pub/repo/CentOS/7/aarch64. Once this is done, your repository is ready for use. If /var/ftp is the available FTP root, the corresponding FTP URL will be the download URL for the aarch64 packages. To make the repository available to other client systems, create a yum repository configuration file called /etc/yum.repos.d/local.repo with the following contents:

$ sudo vim /etc/yum.repos.d/local.repo
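The contents of local.repo did not survive the import of this post; a typical file for the layout above might look like the following (the hostname in baseurl is a placeholder for your own FTP server):

```ini
[local-aarch64]
name=Local CentOS 7 aarch64 packages
# placeholder hostname; point baseurl at your FTP server
baseurl=ftp://ftp.example.com/pub/repo/CentOS/7/aarch64/
enabled=1
gpgcheck=0
```
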

Save and exit.

by Naresh at July 01, 2019 06:55

May 30, 2019

Gema Gomez

KubeCon Barcelona 2019

I was at KubeCon Barcelona 2019 a couple of weeks ago and had a lovely week full of great meetings and new technology finds. I must say Kubernetes has come a long way over the past couple of years and there is a thriving community around it. Here is a list of videos that I encourage people to have a look at (video list). Regarding keynotes, I’d probably highlight the Don’t stop believin’ one from Bryan Liles at VMware. He is not only a great speaker but he also summarized where the community is and where it is heading beautifully:

keynote pic

The conference was at a great venue, Fira Gran Via Barcelona, but we spent a lot of time walking from presentation to presentation (it was huge). The sponsor booths were busy and lots of demos were going on at any given time. I stopped at many booths and discussed plenty of topics with plenty of interesting people, most of it around how to manage and monitor services, what the best way to manage scaling is, and what the right cloud to deploy to is depending on requirements. Arm64 also featured in a lot of conversations.

I even found time to enter a couple of raffles and came home with an Echo Show from the Mesosphere booth, thank you guys! I must say I loved the Tetris machine you had at the booth too :D

echo show picture

by Gema Gomez at May 30, 2019 23:00

May 25, 2019

Marcin Juszkiewicz

Upgraded my desktop a bit

I built my current desktop 7.5 years ago. Since then I have not needed any big hardware changes.


The machine (called ‘puchatek’, Winnie the Pooh in Polish) has had several upgrades in the meantime:

  • memory got maxed out at 32GB
  • a 60GB Corsair SSD for / was installed 8 years ago
  • a 250GB Samsung Evo for /home was added 3 years ago
  • the graphics card went from a Radeon HD5450 via a Radeon R7 240 to an Nvidia GTX 1050 Ti


The size of the system drive became an issue when I needed to build hundreds of container images. All that Kolla stuff…

One solution was replacing the system drive with a bigger one. So I tried to use a PCI Express to M.2 adapter card and realised that the x8 slot had stopped working.

New mainboard

It was time to replace the motherboard. And it is impossible to find a brand new one with an 1155 socket, so I went through used ones and found a nice replacement: the Asrock Z68 Extreme4.

What’s nice about it? The PLX chip, a PCI Express switch. With it, the mainboard can have x16/x8+x8 slots, an x4 slot, some x1 slots and several onboard components despite only 24 PCIe lanes being available (16 from the CPU, 8 from the chipset).

This way I can have the graphics card in the x16 slot (it runs at x8 anyway) and an NVMe drive in the x4 slot. If I decide to go for SLI (two graphics cards) or 10GbE, I have a slot for it.

PCI tree

The PCI tree looks a bit different now:

  • cpu

    • x16 slot with graphics card
    • x8 slot (empty)
  • chipset

    • x4 slot with nvme
    • x1 link to Marvell SATA controller (disabled in firmware setup)
    • x1 link to Etron USB 3.0 host controller
    • x1 link to Etron USB 3.0 host controller
    • x1 link to PLX switch
  • PLX switch

    • x1 slot with Renesas USB 3.0 host controller
    • x1 link to FireWire controller (disabled in firmware setup)
    • x1 link to Broadcom 1GbE controller
    • x1 slot (empty, covered by graphics card)
    • x1 link to PCIe to pci bridge

Hacking firmware

The latest available firmware was from 2012 and lacked any support for NVMe boot. Thanks to other hackers that was not an issue: I only had to follow a “how to add NVME booting into BIOS” instruction. After flashing the modified firmware I could boot directly from the NVMe drive.

Final result

The system boots from NVMe now. 256GB of fast storage is available for / and my container images. And there is a spare PCIe x8 slot for future upgrades.

by Marcin Juszkiewicz at May 25, 2019 20:53

May 23, 2019

Marcin Juszkiewicz

Not enough bandwidth for new device state

USB. The protocol which replaced random keyboard connectors, PS/2, ADB, the gameport, serial and parallel ports (and many more). Sometimes expanded to “USB Sucks Badly“.

Five years ago buying a USB 3.0 hub was a task, as it was not such a popular thing. Nowadays I have three SuperSpeed ones: one in the LG monitor (webcam, phone, Bluetooth dongle), one on the desk (watch charging/adb, Yubikey, card reader, pendrives etc.) and that 7-port one from five years ago, with random dongles in it.

Yesterday I replaced the Gigabyte P67X-UD3-B3 mainboard with an AsRock Z68 Extreme4. This gave me an extra PCIe x4 slot where I can plug in NVMe storage. I built the system, booted into Fedora and started using it.

At some moment I had to log in to one of my systems with two-factor authentication. I use a Yubikey for it. I pressed the button and nothing came out…

Then I realized why the previous configuration had that extra USB 3.0 controller:

usb 3-2.2: new full-speed USB device number 12 using xhci_hcd
usb 3-2.2: New USB device found, idVendor=1050, idProduct=0110, bcdDevice= 3.33
usb 3-2.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 3-2.2: Product: Yubikey NEO OTP
usb 3-2.2: Manufacturer: Yubico
usb 3-2.2: Not enough bandwidth for new device state.
usb 3-2.2: can't set config #1, error -28

Yay, USB!

I plugged in the Renesas uPD720201 PCIe USB 3.0 host controller, moved all the hubs to it, and it works just fine:

usb 7-3.2: new full-speed USB device number 11 using xhci_hcd
usb 7-3.2: New USB device found, idVendor=1050, idProduct=0110, bcdDevice= 3.33
usb 7-3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 7-3.2: Product: Yubikey NEO OTP
usb 7-3.2: Manufacturer: Yubico
input: Yubico Yubikey NEO OTP as /devices/pci0000:00/0000:00:1c.7/0000:05:00.0/0000:06:01.0/0000:07:00.0/usb7/7-3/7-3.2/7-3.2:1.0/0003:1050:0110.0008/input/input30
hid-generic 0003:1050:0110.0008: input,hidraw6: USB HID v1.10 Keyboard [Yubico Yubikey NEO OTP] on usb-0000:07:00.0-3.2/input0

Then I checked which USB 3.0 host controllers are on the mainboard:

11:29 (0s) hrw@puchatek:~$ lspci |grep -i usb
00:1a.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 (rev 05)
00:1d.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 (rev 05)
03:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller (rev 01)
04:00.0 USB controller: Etron Technology, Inc. EJ168 USB 3.0 Host Controller (rev 01)
07:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03)

Argh, Etron EJ168 again…

So if your mainboard has an Etron EJ168 then consider adding a Renesas card. It works just fine, no matter how slow or fast your USB devices are.

by Marcin Juszkiewicz at May 23, 2019 09:20

May 21, 2019

Naresh Bhat

How to setup git and configure send-email

The following steps explain how to set up git in an Ubuntu bash shell and configure your mail ID.

Install git and git-core:

sudo apt-get install git git-core

You can also clone, compile and install the latest version of git:

git clone

make configure

./configure --prefix=/usr

make -j8

sudo make install

You need to install the following dependency packages:

sudo apt-get update

sudo apt-get upgrade

sudo apt-get install perl-IO-Socket-SSL libio-socket-ssl-perl libcrypt-ssleay-perl libnet-ssleay-perl libio-socket-ssl-perl perl-mime-tools perl-authen-sasl libauthen-sasl-perl libmime-tools-perl

The .gitconfig file will look like this:

$ cat .gitconfig

[user]
        name = [Username]
        email = [Your Mail ID]

[sendemail]
        smtpencryption = tls
        smtpserver =
        smtpuser = [Your mail ID]
        smtpserverport = 587
        confirm = auto
        smtppass = [You need to generate a password in your google account for this application]

Now you are all set to send mail. You can do a simple test by sending a mail to yourself using git send-email, e.g.

git send-email <patch-name>.patch --to=<mailing-list ID>

by Naresh at May 21, 2019 05:47

May 20, 2019

Neil Williams

New directions

It's been a difficult time, the last few months, but I've finally got some short updates.

First, in two short weeks I will be gainfully employed again, at UltraSoC as Senior Software Tester, developing test framework solutions for SoC debugging, including on RISC-V. Despite vast numbers of discussions with a long list of recruitment agencies, success came from a face to face encounter at a local Job Fair. Many thanks to Cambridge Network for hosting the event.

Second, I've finally accepted that my old codehelp site was too old to retain, and I'm simply redirecting its index page to this blog. The site hasn't kept up with new technology and the CSS handles modern screen resolutions particularly badly. I don't expect that many people were finding the PHP and XML content useful, let alone the now redundant WML content. In time, I'll add redirects to the other pages.

Third, my job hunting has shown that the centralisation of decentralised version control is still a thing. As far as recruitment is concerned, if the code isn't visible on GitHub, it doesn't exist. (It's not the recruitment agencies asking for GitHub links, it is the company HR departments themselves.) So I had to add a bunch of projects to GitHub and there's a link now in the blog.

Time to pick up some Debian work again, well after I pay a visit or two to the Cambridge Beer Festival 2019, of course.

by Neil Williams at May 20, 2019 14:15

Naresh Bhat

How to update from Ubuntu 16.04 to 18.04 on Windows 10

On Windows 10 the default Ubuntu version is "Ubuntu 16.04.03 LTS". If you want to update to the latest available release, you can follow the steps below.

Check the distribution version in the bash shell:

lsb_release -d

To update to Ubuntu 18.04 on Windows 10, first update and upgrade all the packages.

sudo apt-get update 
sudo apt-get upgrade -y

If you face the error message below, then run a dist-upgrade.

Checking for a new Ubuntu release
Please install all available updates for your release before upgrading.
sudo apt-get dist-upgrade

And now you are all set to do the release upgrade.

sudo do-release-upgrade

You will be asked to intervene twice during the upgrade process. The first time, Ubuntu will ask you if you want to install the latest version of the sshd_config file. You can keep the current version, or you can get the latest one.

Next, a little further during the upgrade process, Ubuntu will ask you if you want to remove obsolete packages. You can choose to remove them, or keep them. When the upgrade finishes, a restart will be required. If Ubuntu is unable to restart your system, go ahead and manually restart it.

Open the Ubuntu bash shell and run the following command again to make sure the upgrade was successful.
$ lsb_release -d
Description:    Ubuntu 18.04.2 LTS

by Naresh at May 20, 2019 09:47

May 10, 2019

Marcin Juszkiewicz

Nine years of Linaro

Nine years ago, at 11:00, a group of developers gathered in a small room. I was one of them, and I did not know anyone from the group before entering the room.

The meeting took place at the Dolce La Hulpe Hotel and Resort, in a village close to Brussels, Belgium, on the first day of UDS-M.

This was the first meeting of NewCo developers: the organization now known as Linaro.

I do not remember exactly who was at that meeting, so I will not provide a list. We introduced ourselves, learned who was whose boss and what we would do from then on. I just said “I’ll do what my boss (pointing to Steve Langasek) orders me to”, as I was tired after an overnight train trip.

For me this was the real beginning of Linaro. Not June 2010, when it was announced to the world at Computex. Nor 26th April 2010, the day I started working for Canonical as a NewCo engineer.

by Marcin Juszkiewicz at May 10, 2019 09:00

April 24, 2019

Marcin Juszkiewicz

Good bye WordPress

I have had some kind of personal website since I started using the Internet in 1996. First it was a set of hand-edited Lynx bookmarks, then came experiments with wikis. Finally, in 2005, I started using WordPress. And it was in use for those 14 years. Until now…

WordPress is a nice platform but I got tired of it. More and more plugins and themes became demo versions of commercial products. Also, the amount of JavaScript and CSS added to the website made it harder and harder to maintain. At some point I told myself that it was enough. And started looking for alternatives.


Here came Pelican, a static site generator written in Python. I made a few attempts to switch to it and finally found some time and sorted out all the issues.

Someone may ask: why Pelican? Why not Jekyll, Gatsby, Hexo, Wintersmith or another one? For me the reason is simple: it is Python, a language I already know. So in case of need I can read the source code and know how to change it (I already sent one change and it got merged).


The good side is that the import from WordPress went well. But as I used Markdown, most posts required changes: the implementation in Pelican differs from the old Markdown Extra + SmartyPants setup I had on my blog.


Then came images. I copied whatever I had on the previous website and removed all thumbnails. Then I decided to go with 700px wide images and not to link to the original photos. Boring work, as almost every image in every post needed changes. Some entries got their pictures removed (most of the time due to low resolution).

This also showed how my blog has changed over all those years. More than 10 years ago, adding a 300x300px picture to a blog post was normal. Now such graphics got either removed or replaced with 700px wide ones.

Some posts had galleries inserted instead of pictures. This took a bit more time, as I was grabbing filenames from the database to replace each gallery with a set of photos. And I removed some of them along the way.

Look and feel

When I was collecting ideas for a new platform I had a few requirements:

  • static generator
  • no JavaScript
  • minimal CSS
  • similar look to WordPress version

Pelican solved the first point. Handling the rest was harder.

I took a look at the existing Pelican themes and tried several. Finally I decided to make my own, similar to the WordPress “Spacious” one.

As a base I used the “Simple” theme: a typical template with header, content, sidebar and footer. Elements are placed in a CSS grid for most screens, and once the screen width goes under 70em the layout switches to flexbox. This allowed for simple responsive web design. All in ~2.5KB of CSS (plus some code for webfonts).
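As an illustration of that grid-to-flex switch (a sketch written for this post, not the actual theme CSS; the area names are made up):

```css
/* header, content, sidebar and footer laid out on a CSS grid */
body {
  display: grid;
  grid-template-areas:
    "header  header"
    "content sidebar"
    "footer  footer";
}

/* below 70em, fall back to a single flex column */
@media (max-width: 70em) {
  body {
    display: flex;
    flex-direction: column;
  }
}
```
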


One of the big changes (compared to WordPress) is the way archive posts are presented. You can go into the archives to see the list of all my blog posts, as before. But if you go to the list of posts for a tag (like AArch64, Red Hat, Zaurus) then instead of paginated posts you get the list presented in archive form.

This should make old content easier to find.


As you may notice, there is no way to comment on posts anymore. The number of comments got lower every year, so I decided not to bother with them on the new website. I could add Disqus, for example, but is it worth it for just a few entries per year?

To Do

There are still some things I need to take care of. The page about my fridge magnet collection is missing, and some entries may get formatting changes or small content edits. But no big edits of old posts, as they show how awful my English was in the past (not that it has got any better).

by Marcin Juszkiewicz at April 24, 2019 10:59

April 16, 2019

Marcin Juszkiewicz

The end of “Mali question”?

For several Linaro Connect events we had sessions about the state of graphics drivers on ARM platforms. I attended most of them and got a reputation as the person asking problematic questions.

But the situation has changed, with the Panfrost project happening. It is a FOSS driver for the Arm Mali Midgard graphics chipset (with Bifrost support on the way). It went from “wow, a triangle” to “we can play some games or run a desktop” in quite a short time.

At the BKK19 Linaro Connect we had a “State of opensource drivers for mobile GPU chips” session: Freedreno, Etnaviv, vc4, v3d, Panfrost, Lima etc. What they target, what has already been achieved, what the plans are. Great progress across the whole ARM world. And several questions from the audience, with interesting answers as well.

thumbnail for VTgDP3yNXI0 video

Mali, then. Grant Likely from Arm said that they are watching how Panfrost is going. From the company's perspective, both Midgard and Bifrost chips are a “done, in the field” product which will not get changes. Still, engineering support goes entirely into their binary drivers, as this is what their customers are using. The situation may change if those customers start asking for open drivers.

I do not use any Arm hardware with a Mali GPU anymore. But I hope that at the next Linaro Connect, instead of asking the famous “Mali question”, we will instead discuss how it runs on our devices.

by Marcin Juszkiewicz at April 16, 2019 08:44

April 09, 2019

Tom Gall

Linux Kernel Testing by Linaro – March 6th Edition

Linaro runs a battery of tests on the Open Embedded and Android operating systems using a variety of hardware and kernel versions in order to detect kernel regressions. These regressions are reported to member companies and the various upstream communities like linux-stable.

This report is a summary of our activity this week.

Due to some breakage that was witnessed, there was a desire this month to add KVM into the kernel testing mix. KV-165 Investigate adding KVM testing to LKFT

Testing on Open Embedded

  • KV-126 Final testing for upgrade to sumo happening in staging. Looking to upgrade this week.
  • Db410c boot issue showed up, investigating.
  • Bug Status – 62 open bugs
    • 2019-03-06: bpf_test_tcpbpf_user confirmed fixed on mainline thanks to Anders’ patch re: bug 3938
    • 2019-03-05: Bug 4305 – LTP: Juno: 64K page: causing Unable to handle kernel NULL pointer dereference reported to ARM and LDCG
    • 2019-03-04: Anders Roxell reported a use after free with KASAN in next
    • 2019-03-01: Naresh Kamboju reported a kernel warning triggered by bpf test_sock
  • RC Log
    • 2019-03-04
      • 4.9.162, 4.14.105, 4.19.27, 4.20.14
        • Reported no regressions in <24h

Testing on Android

    • Discussion
      • Pixel 3 is starting to boot :
      • DB845 is following along as well
      • Working with John/YongQin and hikey-linaro kernel branches that then get used by LKFT. Part of what we’ve been bitten by in failures is due to changes in Android Common post-P.
    • Android 9 / P LTS-premerge – 4.4, 4.9, 4.14, 4.19
      • 4.19.26 / HiKey – no regressions
      • 4.14.104 / HiKey –
        • cts-lkft/arm64-v8a.CtsOsTestCases/android.os.cts.StrictModeTest.testNetwork
        • cts-lkft/arm64-v8a.CtsOsTestCases/android.os.cts.StrictModeTest.testUntaggedSocketsHttp
        • cts-lkft/arm64-v8a.CtsOsTestCases/android.os.cts.StrictModeTest.testUntaggedSocketsRaw
        • cts-lkft/armeabi-v7a.CtsOsTestCases/android.os.cts.StrictModeTest.testNetwork
        • cts-lkft/armeabi-v7a.CtsOsTestCases/android.os.cts.StrictModeTest.testUntaggedSocketsHttp
        • cts-lkft/armeabi-v7a.CtsOsTestCases/android.os.cts.StrictModeTest.testUntaggedSocketsRaw
        • cts-lkft/armeabi-v7a.CtsWebkitTestCases/android.webkit.cts.WebViewSslTest.testProceedClientCertRequestKeyWithAndroidKeystoreKey
      • 4.9.161 / HiKey
      • 4.4.176 / HiKey
        • No regressions
        • cts-lkft/arm64-v8a.CtsUsbTests /
          • Seen before but I believe this test is interesting in the context of adb disconnect issues
    • Android 9 / P –  4.4, 4.9, 4.14, 4.19 + HiKey
      • 4.19.23 / HiKey – no new data
      • 4.14.101 / HiKey – no new data
      • 4.9.158 / HiKey – no new data
      • 4.4.174 / HiKey – no new data
    • AOSP-master-tracking –  4.9, 4.14 4.19 / HiKey & 4.14 / X15
      • hi6220-hikey_4.19.23:
      •  cts-lkft-arm64-v8a/arm64-v8a.CtsBluetoothTestCases:
        • android.bluetooth.cts.HearingAidProfileTest.test_getConnectedDevices
        • android.bluetooth.cts.HearingAidProfileTest.test_getDevicesMatchingConnectionStates
      • hi6220-hikey_4.14.101:
      •  cts-lkft-arm64-v8a/arm64-v8a.CtsBluetoothTestCases:
        • android.bluetooth.cts.HearingAidProfileTest.test_getDevicesMatchingConnectionStates
      •  cts-lkft-armeabi-v7a/armeabi-v7a.CtsBluetoothTestCases:
        • android.bluetooth.cts.HearingAidProfileTest.test_getConnectedDevices
        • android.bluetooth.cts.HearingAidProfileTest.test_getConnectionStateChangedIntent
      •  cts-lkft-arm64-v8a/arm64-v8a.CtsLibcoreTestCases:
      •  cts-lkft-armeabi-v7a/armeabi-v7a.CtsBluetoothTestCases:
        • android.bluetooth.cts.HearingAidProfileTest.test_getConnectionStateChangedIntent
      • hi6220-hikey_4.9.158:
      •  cts-lkft-armeabi-v7a/armeabi-v7a.CtsBluetoothTestCases:
        • android.bluetooth.cts.HearingAidProfileTest.test_getDevicesMatchingConnectionStates
      • x15_4.14.101:
      •  cts-lkft-armeabi-v7a/armeabi-v7a.CtsWebkitTestCases:
        • android.webkit.cts.WebSettingsTest.testAccessJavaScriptEnabled
        • android.webkit.cts.WebSettingsTest.testAccessLayoutAlgorithm
    • Android 8.1 – 4.4 + HiKey, 4.14 and X15
      • 4.14.103 / X15 – no regressions
      • 4.4.x / HiKey – no new data
    • Bug Activity
      • 22 – stable WtW

by tgallfoo at April 09, 2019 21:59

Marcin Juszkiewicz

How I hacked the Linaro Connect BKK19 puzzle

One of the Linaro Connect traditions is a puzzle to solve, created by Dave Pigott. And the recent BKK19 event was no different. There was a puzzle announcement on the first day, right before the first keynote. But no one could be the first to answer at that time…


As usual before Connect, I looked at a map and marked several locations in Bangkok as places to visit. Then I took a look at the official BKK19 application: installed it on my phone and started it.

The first screen had a few paragraphs of text: some information about the event and the schedule. But there was also a paragraph with information about the puzzle. WITH a link to it!

I clicked and got redirected to the Google Forms website, with information that the form is not available for users outside of the organization. As I do not have work accounts on my phone, I checked the redirection link and loaded it on my desktop. And landed in the puzzle.

Puzzle form

It was a bit different from the version provided during Linaro Connect. There was a graphic with seven columns of text (the official one had eight). Under it was a graphic with chess pieces:

H1 G3 F1 H2 G4 E3 D1
A4 C3 B1 A3 C4 B6 A8
B5 A7 C8 D6 E8 F6 D5
G8 H6 F5 G7 H5 F4 E2
C6 A5 B7 C5 A6 B8 D7
F3 G1 H3 G5 H7 F8 E6
F7 H8 G6 H4 G2 E1 D3
A2 B4 C2 A1 B3 D2 E4

Graphical hint

Dave later said that the knight was not present in it, but I had not noticed that. There was a plan to add that graphic to the official puzzle if no one provided the proper answer by Wednesday.

From the text I noticed that it was chess related. I took a sheet of paper, drew an 8x8 grid on it and started following each row with different markings each time. Hm… nothing came to mind. I noted the missing entries.

Help me Google, you are my only hope

Then I started googling “knight chess puzzle” and got “knight’s tour” links on the first page. I started reading what it is about, then restarted tracking the knight’s moves from the puzzle, adding the missing ones. It turned out that this was it.

Let me mail Dave

I submitted “knight’s tour problem” as the answer and wrote to Dave:

I see that they are online already.

Are the ones there official ones or testing one?

It turned out that I had found the testing version, which did not even collect emails when someone provided an answer. But one answer had been sent, so Dave marked it as my submission. And the link got removed from the BKK19 application.

At pool bar

I arrived in Bangkok on Saturday, met Dave at the pool bar and we had a chat about the puzzle. It was fun to see how surprised people around us were that the first answer had already been provided. I asked Dave not to give me the proper answer, nor to tell me whether mine was right, so that I would not spoil it for other people.

During Connect a few attendees asked me about the puzzle and how it went. I kept away from spoiling them.

And the winner is…

Then Friday came with the closing remarks session. Only a few people had provided the two-word answer (“knight’s tour”) and a few the three-word one (“closed knight’s tour”). My name was in the “special mention” section.

Puzzle winners slide

It turned out that there is a separate award for hacking a puzzle. It was the 3rd time that had happened. I got a waterproof action camera (EZVIV S1C model); I will find some use for it sooner or later ;D

The whole puzzle was fun. Thanks go to Dave for creating it and for providing me with a copy of both the results slide and the graphical hint so I could use them in this blog post.

by Marcin Juszkiewicz at April 04, 2019 17:14

March 28, 2019

Siddhesh Poyarekar

A JIT in Time...

It’s been a different 3 months. For over 6 years I had been working almost exclusively on the GNU toolchain with a focus on glibc, and I now had the chance to work on a completely different set of projects, something I had done a lot of during my Red Hat technical support days but not since. I was to look into PyPy, OpenJDK and LuaJIT, three very different projects with very different development styles, communities and technologies. The comparison of these projects among themselves and with the GNU projects is an interesting point but not the purpose of this post, maybe some other day. In this post I want to talk about the project I spent the most time on (~1.5 months) and found to be technically the most intriguing: LuaJIT.

A Just In Time Introduction

For those new to the concept, JIT compilation techniques are pretty old and there is a very interesting paper called A Brief History of Just-in-Time that does what the title states. The basic concept is quite straightforward: code written in a high level language (in the case of LuaJIT, Lua) is interpreted as usual while keeping track of which parts of the code get hit often. If a part of the code is seen to be executed repeatedly, all or part of that code is compiled to binary and mapped in, with entry branches and exit branches back into the interpreter; the latter are also known as exit guards. There are a number of tradeoffs in designing a JIT and the paper I’ve linked above gives enough of an introduction to appreciate the complexity of the problem being solved.

The key difference from ahead-of-time compilers is that the time required to compile is often as much a performance factor as the quality of the generated code. Due to this, one needs to be careful about the amount of processing one can do on the code to optimise it. So while gcc or llvm may end up giving higher quality code, the ~200 passes involved in building a translation unit (TU) may well end up eating all the performance gains that compiling just in time would have given.

LuaJIT: Peeking under the hood

The LuaJIT project was started and is mostly written by Mike Pall, which is apparently a pseudonym for a very private and very smart hacker. I assume that he is male given that Mike is a common male name. The source code repository is a bit odd. There is a github repository that is supposed to be official but isn’t; it is a mirror created by CloudFlare along with Mike with the aim of broadening the developer community base. That ride hasn’t been the smoothest and I’ve talked about it in more detail below. The latest code, with support for other architectures such as arm64 and ppc, is in the v2.1 branch, which has only had beta releases come off it, the last one in 2017. There are tests in a separate repository called LuaJIT-test-cleanup which has a big fat warning that it is not the official testsuite, although if you look around, it is pretty much the only testsuite worth using for luajit.

Wait, there’s also bench_lua, which has some benchmarks and a pretty nice driver for the benchmarks, something that the LuaJIT-test-cleanup benchmarks lack.

LuaJIT uses trace compilation, which is pretty simple in concept but has some very interesting side-effects. The idea, specifically as luajit implements it, roughly follows this logic:

  • Interpret the program and profile it while it is running. Typical candidates for profiling are loops, for the obvious reason that they will likely execute repeatedly.
  • If a loop is hit repeatedly, i.e. it crosses a threshold number of iterations, the JIT compiler is invoked on its next iteration.
  • The JIT compiler first traces execution of the program and generates an IR for the trace of the program.
  • The IR then goes through some optimisation passes and finally code is generated for the desired CPU backend.

This keeps on repeating as the interpreter encounters more hotspots. The interesting bit here is that the only bit that gets compiled is the code that gets executed during the trace. So if you have a branch like so:

    if cond > threshold then
        i = i + 1
    else
        i = i - 1
    end

and the else block is executed during the trace, only that bit is compiled and not the if block. The compiled code then has branches (known as exit guards) to jump back into the interpreter if the condition is true. This produces an interesting optimisation opportunity that can be exploited during tracing itself. If cond > threshold is found to be always false, because both are constants or for some other reason, the condition can be completely eliminated, which saves compilation time as well as execution time.

Another interesting side effect of tracing that is not seen in typical compilers is that function calls effectively get inlined. Again, that becomes a very cheap way to achieve something that would otherwise have been done in a separate pass in traditional compilers.

In addition to very fast tracing and compilation, all of luajit is quite compact. Its IR is based on a linear array and hence allows very fast traversal. It’s easy to visualize using the jit.* debug modules and the -jdump flag to dump the IR during execution. The luajit wiki has some pretty detailed documentation on its internals.

The coding style of the project is a bit too compact for my taste since I personally prefer writing for readability. There are a lot of constructs throughout the code that need a fair amount of squinting to understand, such as assignments inside for loop headers and inside conditions. OK, all of you pointing at the macro and makefile soups in glibc and laughing, please be quiet ;)

There’s also the infamous (at least in luajit circles) 47-bit address space limitation for garbage collected objects in luajit because luajit uses the top bits for metadata. This is known to have correctness issues with Lua userdata objects and also performance issues because luajit repeatedly tries allocations until it finds a suitable address in the 47-bit space. It doesn’t hurt x86 much (because of MAP_32BIT) but arm64 feels it and I imagine so do other architectures.

My LuaJIT involvement

My full time involvement with luajit was brief and will likely end soon (my personal involvement may still continue) so in this short period I wanted to tick off as many short but significant work items as I could. My github fork is here.

Sameera Deshpande started the initial work and then helped me ramp up later on. We got a couple of CI instances up and running to begin with, one for the official repository and another for my github fork so that I could review my changes regularly. If you’re interested in adding a node for your architecture to the CI projects, please feel free to reach out to me; Linaro will happily add the node to the CI matrix.

Register Allocation improvements

The register allocator in luajit is kept simple to keep the compilation overhead low. Registers are allocated sequentially based on their categories (caller saved, callee saved, etc.) and it uses tricks such as constant rematerialization to reduce register pressure. Rematerialization is also very basic in its implementation; whenever constants need to be allocated to registers, it is preferred that they reuse existing constants (assuming their live ranges are compatible), either directly or as a constant computation. This is quite valuable because there is a fair amount of constant usage in the JITted code; exit guard addresses are coded in as constants, for example, and so are floating point numbers, in addition to the usual integers. The register modes are not specified during allocation and are defined by the instructions generated in the assembly phase.

There was a bug in the luajit register allocator due to which registers used for constant rematerialization were being clobbered, resulting in corruption. A fix was proposed, but the author of the fix was not sure if it was correct. I posted an alternative patch and then realized and explained why my patch was overkill and his approach was optimal. I added additional cleanups on top of that to finish it up.

While working on this problem, I noticed that the arm64 backend was not using XZR often enough and I posted a patch to fix that. I started benchmarking the improvement (the codegen was obviously better; it was saving registers for stores of zeroes, for example) and quickly realized that both bench_lua and the LuaJIT-test-cleanup benchmarks were quite raw and couldn’t be relied upon for consistent results.

So I digressed.

Benchmark improvements and LuaJIT-test-cleanup cleanup

bench_lua was my favourite of the two benchmark projects to hack on because it was evident that reviews were very hard to come by in the luajit project. Also, bench_lua had a benchmark driver that produced repeatable results, but it still had some cleanup issues, including the fact that it did not have a license! The author was very responsive on the license question though and quickly put one in. I fixed some timing issues in the driver and, while doing so, realized that it might be better if I used this driver on the more extensive set of benchmarks in LuaJIT-test-cleanup. So that’s what I did.

I integrated the bench_lua driver into luajit-test-cleanup and added Makefile targets so that one could easily do make check and make bench to run the tests and benchmarks. Now I had something I could work with but it was still in a different repo and it was getting quite cumbersome to work with them.

So I integrated LuaJIT-test-cleanup into LuaJIT. Now I had a LuaJIT repository that IMO was complete and could handle the standard make/make check workflow. At the same time, it was modular enough that it could be merged into the upstream LuaJIT with relative ease. I posted all of these patches as PRs and watched as nothing happened. The LuaJIT-test-cleanup project had not seen a PR review since about 2016 and the LuaJIT project had seen occasional comments and patches from Mike in the past couple of years, but not much else.

Fusing and combining optimisations

Instruction fusion is an architecture dependent feature in luajit and each backend implements its own during the IR to assembly conversion phase, where the IR is traversed from the bottom up and assembly instructions are generated sequentially. Luajit does some trivial reordering in its IR optimisation passes but during assembly it does not peek ahead to actively look for instruction fusion opportunities; it only tries to fuse neighbouring instructions. As a result, while there are implementations for instructions like load and store pair in arm64, they are useful in only the most trivial of tests. Likewise for fmadd/fmsub; a single intervening load is sufficient to prevent the optimisation.

In addition to this, it is often seen that optimisations like loop unrolling and vectorisation bring in even more opportunities for combining of loads and stores. Luajit does some loop peeling but that’s about it.

Sameera did some analysis on ways to introduce more aggressive unrolling and possibly some amount of vectorisation but we did not have enough time to implement it. She did have enough time to implement some instruction fusing and using fnmadd and fnmsub for arm64. She also looked at load combining opportunities but realized that luajit would need more powerful instruction reordering, similar to the load grouping in the gcc scheduler that makes load pair generation much easier. So that project was also not small enough for us to complete in the limited time.

Casting floats to unsigned integers

The C standard defines casting of floating point types to unsigned integer types only for the range (-1.0, UTYPE_MAX), where UTYPE_MAX is the maximum value of the unsigned version of TYPE. Casts to signed types work just fine as long as the number is in the range of that type. The waters get a bit murky with dynamic types and type narrowing when the default internal representation for all numbers is double, which was the situation in luajit. The fix for this was pretty straightforward in theory: add an additional cast from float to signed int and then to unsigned int for floating point values less than zero, and stick to a direct cast to unsigned int for positive numbers. I have implemented this for the interpreter and for arm64 in my fork.

Project state and the road ahead

LuaJIT is a very interesting project and I learned some very interesting concepts from it in the last month or so. It has a pretty active user community that sings the praises of the project and seems to advocate it in a number of areas. However, the project's development itself is in a bit of a crisis.

Around 2015 Mike Pall said he wanted to step back from the project and wanted more people to get involved in the development. With that intent, Cloudflare created the github organisation and repository to allow for better collaboration. Based on the conversation threads I read, things seemed to go fine when the community stepped in to create the LuaJIT-test-cleanup repository based on some initial tests Mike had written, and built it up into a set of 500+ tests. However, in about a year that excitement faded because nobody was made a maintainer alongside Mike to carry forward the work, which meant that the LuaJIT project itself would only get sporadic fixes whenever Mike had some free time. Minor patches were accepted but bigger pieces of code went unreviewed and presumably the developers also lost interest.

Fast forward four years into 2019 and we are still in the same situation, probably worse. LuaJIT-test-cleanup has not had a patch review since 2016. LuaJIT has had comments a couple of times each quarter and bug fixes with similar frequency, but not much else. The mailing list also has similar traffic: I announced all of the work I did above and did not get any responses. There are forks of LuaJIT all over the place in projects such as OpenResty and RaptorJIT, and those projects seem happy to let things run that way. Lua language support is in a bit of a limbo, with it being mostly 5.1 compliant with some 5.2 bits thrown in. Overall, it’s a great chunk of code that’s about to vanish into oblivion.

Then there is the very tricky question of copyright. The copyright notices all over the code say that Mike Pall has ownership. However, the code clearly has a number of contributions from others and there is no copyright assignment in place. While it’s likely not an issue from a licensing standpoint (IANAL, etc.), it is definitely something that needs to be addressed if the project is somehow resurrected, at the very least to give more prominent credit to contributors.

I’ve posted PRs for my work and tried to engage but I don’t have much hope given past history. I intend to spend at least some of my free time tinkering with this code since it’s just a very interesting project and there’s a lot that can be done. I am trawling the PRs and issue lists to look for patches that can be incorporated in my tree so if anyone is interested in contributing patches, you’re most welcome. I will continue to ensure that my tree applies on top of the official repository because I do not want to give up hope of the project coming back to life.

by Siddhesh at March 03, 2019 20:29

March 23, 2019

Riku Voipio

On the #uploadfilter problem

Copyright holders in Europe are pushing hard to mandate upload filters for the internet. We have been here before — when they outlawed circumventing DRM. Both have roots in the same problem. The copyright holders look at computers and see bad things happening to their revenue. They come to IT companies and say "FIX IT". The IT industry comes back and says "We can't... making data impossible to copy is like trying to make water not wet!". But we fail at convincing copyright holders that a perfect DRM or upload filter is not possible. Then copyright holders go to lawmakers and ask them, in turn, to fix it.

We need to turn the tables around. If they want something impossible, it should be up to them to implement it.

It is simply unfair to require each online provider to implement an AI to detect copyright infringement, manage a database of copyrighted content and pay the costs of running it all... and then get slapped with a lawsuit anyway, since copyrighted content will still slip through.

The burden of implementing the #uploadfilter should be on the copyright holder organizations. Implement it as a SaaS. YouTube and other web platforms call your API and pay $0.01 each time pirated content is detected. On the other side, to ensure the correctness of the filter, copyright holders have to pay any lost revenue, court costs and so on for each false positive.

Filtering uploads is still problematic. But now it's the copyright holders' problem. Instead of people blaming web companies for poor filters, it is the copyright holders who have to answer to the public why their filters reject content that doesn't belong to them.

by Riku Voipio at March 03, 2019 16:07

March 20, 2019

Naresh Bhat

Apache Drill on ARM64


What is Drill?
Apache Drill is a distributed MPP query layer that supports SQL and alternative query languages against NoSQL and Hadoop data storage systems. It was inspired in part by Google's Dremel.  Apache Drill is an Apache Foundation project.

Query any non-relational datastore
With the exponential growth of data in recent years, and the shift towards rapid application development, new data is increasingly being stored in non-relational datastores including Hadoop, NoSQL and cloud storage. Apache Drill enables analysts, business users, data scientists and developers to explore and analyze this data without sacrificing the flexibility and agility offered by these datastores.  

Drill supports a variety of NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. A single query can join data from multiple datastores. For example, you can join a user profile collection in MongoDB with a directory of event logs in Hadoop.

Drill's datastore-aware optimizer automatically restructures a query plan to leverage the datastore's internal processing capabilities. In addition, Drill supports data locality, so it's a good idea to co-locate Drill and the datastore on the same nodes.

Apache Drill includes a distributed execution environment, purpose built for large-scale data processing. It doesn’t use a general purpose execution engine like MapReduce, Tez or Spark. As a result, Drill is flexible (schema-free JSON model) and performant. Drill’s optimizer leverages rule- and cost-based techniques, as well as data locality and operator push-down, which is the capability to push down query fragments into the back-end data sources. 


Apache Drill is built to achieve high throughput and low latency. It provides the following capabilities.
  • Distributed query optimization and execution: Drill is designed to scale from a single node (your laptop) to large clusters with thousands of servers.
  • Columnar execution: Drill is the world's only columnar execution engine that supports complex data and schema-free data. It uses a shredded, in-memory, columnar data representation.
  • Runtime compilation and code generation: Drill is the world's only query engine that compiles and re-compiles queries at runtime. This allows Drill to achieve high performance without knowing the structure of the data in advance. Drill leverages multiple compilers as well as ASM-based bytecode rewriting to optimize the code.
  • Vectorization: Drill takes advantage of the latest SIMD instructions available in modern processors.
  • Optimistic/pipelined execution: Drill is able to stream data in memory between operators. Drill minimizes the use of disks unless needed to complete the query.
Drill is the only columnar query engine that supports complex data. It features an in-memory shredded columnar representation for complex data which allows Drill to achieve columnar speed with the flexibility of an internal JSON document model.

Runtime compilation enables faster execution than interpreted execution. Drill generates highly efficient custom code for every single query.

Top 10 Reasons to use Apache Drill

1. Get started in minutes
It takes just a few minutes to get started with Drill. Untar the Drill software on your Linux, Mac, or Windows laptop and run a query on a local file. No need to set up any infrastructure or to define schemas. Just point to the data, such as data in a file, directory, HBase table, and drill.

2. Schema-free JSON model
Drill is the world's first and only distributed SQL engine that doesn't require schemas. It shares the same schema-free JSON model as MongoDB and Elasticsearch. No need to define and maintain schemas or transform data (ETL). Drill automatically understands the structure of the data.

3. Query complex, semi-structured data in-situ
Using Drill's schema-free JSON model, you can query complex, semi-structured data in situ. No need to flatten or transform the data prior to or during query execution. Drill also provides intuitive extensions to SQL to work with nested data. 

4. Real SQL -- not "SQL-like"
Drill supports the standard SQL:2003 syntax. No need to learn a new "SQL-like" language or struggle with a semi-functional BI tool. Drill supports many data types including DATE, INTERVAL, TIMESTAMP, and VARCHAR, as well as complex query constructs such as correlated sub-queries and joins in WHERE clauses. 

5. Leverage standard BI tools
Drill works with standard BI tools. You can use your existing tools, such as Tableau, MicroStrategy, QlikView and Excel.

6. Interactive queries on Hive tables
Apache Drill lets you leverage your investments in Hive. You can run interactive queries with Drill on your Hive tables and access all Hive input/output formats (including custom SerDes). You can join tables associated with different Hive metastores, and you can join a Hive table with an HBase table or a directory of log files. 

7. Access multiple data sources
Drill is extensible. You can connect Drill out-of-the-box to file systems (local or distributed, such as S3 and HDFS), HBase and Hive. You can implement a storage plugin to make Drill work with any other data source. Drill can combine data from multiple data sources on the fly in a single query, with no centralized metadata definitions. 

8. User-Defined Functions (UDFs) for Drill and Hive
Drill exposes a simple, high-performance Java API to build custom user-defined functions (UDFs) for adding your own business logic to Drill. Drill also supports Hive UDFs. If you have already built UDFs in Hive, you can reuse them with Drill with no modifications.

9. High performance
Drill is designed from the ground up for high throughput and low latency. It doesn't use a general purpose execution engine like MapReduce, Tez or Spark. As a result, Drill is flexible (schema-free JSON model) and performant. Drill's optimizer leverages rule- and cost-based techniques, as well as data locality and operator push-down, which is the capability to push down query fragments into the back-end data sources. Drill also provides a columnar and vectorized execution engine, resulting in higher memory and CPU efficiency.

10. Scales from a single laptop to a 1000-node cluster
Drill is available as a simple download you can run on your laptop. When you're ready to analyze larger datasets, deploy Drill on your Hadoop cluster (up to 1000 commodity servers). Drill leverages the aggregate memory in the cluster to execute queries using an optimistic pipelined model, and automatically spills to disk when the working set doesn't fit in memory.

The flow of a Drill query

  • The Drill client issues a query. A Drill client is a JDBC, ODBC, command line interface or a REST API. Any Drillbit in the cluster can accept queries from the clients. There is no master-slave concept.
  • The Drillbit then parses the query, optimizes it, and generates a distributed query plan that is optimized for fast and efficient execution.
  • The Drillbit that accepts the query becomes the driving Drillbit node for the request. It gets a list of available Drillbit nodes in the cluster from ZooKeeper. The driving Drillbit determines the appropriate nodes to execute various query plan fragments to maximize data locality.
  • The Drillbit schedules the execution of query fragments on individual nodes according to the execution plan.
  • The individual nodes finish their execution and return data to the driving Drillbit.
  • The driving Drillbit streams results back to the client.

Goals on ARM64

  • Create .deb and rpm packages for Apache Drill for AArch64.
  • Install Drill packages along with the dependency.
  • Do basic workload testing

Dependencies:

  • OpenJDK8
  • Zookeeper
  • git
  • maven@v3.3.9

Efforts from Linaro BigData team
  • Implement and upstream DEB/RPM support on Apache Drill
  • Document the following installation steps in collaborate page.
    • Define prerequisites
      • Install HDFS aarch64 bits from debian repo
      • Install YARN aarch64 bits from debian repo
      • Install zookeeper aarch64 bits from debian repo
    • Check YARN and zookeeper versions
    • Setup HDFS in distributed mode
    • Setup YARN in distributed mode
    • Update Hosts files
    • Configure HDFS, YARN and Zookeeper with nodes information.
    • Point Drill to zookeeper quorum
  • Configure Drill to run in YARN distributed mode. This might cause issues if Drill is installed prior to YARN; if so, uninstall Drill and redo.
  • Check if drill is running on YARN 
  • Configure drill dfs (hdfs) storage plugin
  • Start drill daemon in each node
  • Start drill bit in distributed mode 
  • Test basic data import
  • Double check and Re-configure zookeeper
  • Update settings
  • Download and import github data as json files into HDFS
  • Build drill query
  • Check if the data shows up in drill
  • Configure drill memory and check for optimization
  • Check on caching in drill (Optimistic/pipelined execution)
  • Research on Integrating Zeppelin/Jupyter if possible for drill query
Build/Setup and Run Apache Drill

git clone

cd drill
mvn clean package -DskipTests

Test drill-embedded

You can launch drill embedded as below and query a sample file or a JSON file.  You only need to provide the absolute path while doing the query.

linaro@debian:~$ drill-embedded
Apache Drill 1.15.0-SNAPSHOT
"Drill must go on."
0: jdbc:drill:zk=local>
0: jdbc:drill:zk=local> SELECT * FROM dfs.`/home/linaro/Apache-components-build/drill/distribution/target/apache-drill-1.15.0-SNAPSHOT/apache-drill-1.15.0-SNAPSHOT/sample-data/region.parquet`;
0  AFRICA       lar deposits. blithe
1  AMERICA      hs use ironic, even
2  ASIA         ges. thinly even pin
3  EUROPE       ly final courts cajo
4  MIDDLE EAST  uickly special accou
5 rows selected (1.025 seconds)
0: jdbc:drill:zk=local>

 0: jdbc:drill:zk=local> !quit
Closing: org.apache.drill.jdbc.impl.DrillConnectionImpl

Setup and test drill in clustered mode

  • Edit drill-override.conf to provide zookeeper location
  • Start the drillbit using bin/ start
  • Repeat on other nodes
  • Connect with sqlline by using bin/sqlline -u "jdbc:drill:zk=[zk_host:port]"
  • Run a query (below).
Now we will go through these one by one in detail.

Install OpenJDK

    $ sudo apt-get install openjdk-8-jdk

Make sure you have the right OpenJDK version

    $ java -version

It should display a 1.8.0 release, for example 1.8.0_111


    $ export JAVA_HOME=`readlink -f /usr/bin/java | sed "s:jre/bin/java::"`

Building Apache Zookeeper

Some distributions like Ubuntu/Debian come with a recent zookeeper, so you can just install it with "sudo apt-get install zookeeper".  If your distribution does not come with zookeeper, download and unzip the latest Zookeeper package from the official Apache archive on all machines that will be used for the zookeeper quorum, as shown below:

    $ wget
    $ tar -xzvf zookeeper-3.4.12.tar.gz

Edit the /etc/hosts file on all the nodes and add the IP address and hostname of each node. If the hostnames are not right, fix them in the /etc/hosts file.


Create zookeeper user

You can create a new user, or you can configure zookeeper for any existing user.  Just use any other existing user name instead of zookeeper, e.g. ubuntu, centos or debian, etc.

    $ sudo adduser zookeeper

Configure zookeeper user or any already existing user

To make an ensemble with a leader-follower architecture, we need an odd number of zookeeper servers, i.e. 1, 3, 5, 7, etc.

Now, create the directory zookeeper under /var/lib, which will serve as the Zookeeper data directory, and create another zookeeper directory under /var/log where all the Zookeeper logs will be captured. The ownership of both directories needs to be changed to zookeeper.

    $ sudo mkdir /var/lib/zookeeper
    $ cd /var/lib
    $ sudo chown zookeeper:zookeeper zookeeper/
    $ sudo mkdir /var/log/zookeeper
    $ cd /var/log
    $ sudo chown zookeeper:zookeeper zookeeper/

Note: While running zookeeper, if you get a message like the one below, you may need to check/change the permissions of the files under /var/lib/zookeeper and /var/log/zookeeper.

Since I logged in as linaro to run zookeeper, I changed the ownership to the linaro user.

    linaro@node1:~/drill-setup/zookeeper-3.4.12$ ./bin/zkServer.sh start
    ZooKeeper JMX enabled by default
    Using config: /home/linaro/drill-setup/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Starting zookeeper ... ./bin/zkServer.sh: line 149: /var/lib/zookeeper/zookeeper_server.pid: Permission denied

Edit .bashrc for the zookeeper user and set the following ZooKeeper environment variable:

    $ export ZOO_LOG_DIR=/var/log/zookeeper

Source the .bashrc in current login session:

    $ source ~/.bashrc

Create the server id for the ensemble. Each ZooKeeper server must have a unique number in its myid file, with a value between 1 and 255.

In Node1

    $ sudo sh -c "echo '1' > /var/lib/zookeeper/myid"

In Node2

    $ sudo sh -c "echo '2' > /var/lib/zookeeper/myid"

In Node3

    $ sudo sh -c "echo '3' > /var/lib/zookeeper/myid"
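If the hostnames follow the node<N> pattern used in this guide, the myid value can also be derived from the hostname instead of being typed on each node. A minimal sketch; the node2 hostname here is hard-coded purely for illustration:

```shell
# Derive the myid value from a hostname of the form node<N>.
# On a real node you would use: hostname="$(hostname -s)"
hostname="node2"
myid="${hostname#node}"
echo "$myid"
# The value would then be written with:
#   sudo sh -c "echo $myid > /var/lib/zookeeper/myid"
```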

Now, go to the conf folder under the ZooKeeper home directory (the location where the archive was extracted).

    $ cd /home/zookeeper/zookeeper-3.4.12/conf/

By default, a sample conf file named zoo_sample.cfg will be present in the conf directory. Make a copy of it named zoo.cfg as shown below, and edit the new zoo.cfg on all the nodes as described.

    $ cp zoo_sample.cfg zoo.cfg

Edit zoo.cfg and apply the ensemble settings:

    $ vi zoo.cfg
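The post does not reproduce the actual zoo.cfg contents; a minimal three-node ensemble configuration typically looks like the sketch below. The node1..node3 hostnames and the data directory follow this guide, while the timing values are ZooKeeper's stock defaults:

```ini
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
```

The server.N numbers must match the myid values created above; 2888 and 3888 are the conventional quorum and leader-election ports.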


Next, update the logging configuration in the conf directory so that logs go both to the console and to a rolling file:

    $ vi log4j.properties

    log4j.rootLogger=INFO, CONSOLE, ROLLINGFILE

After the configuration has been done in the zoo.cfg file on all three nodes, start ZooKeeper on each node one by one, using the following command:

    $ /home/zookeeper/zookeeper-3.4.12/bin/zkServer.sh start

Zookeeper Service Start on all the Nodes.

    ZooKeeper JMX enabled by default
    Using config: /home/ubuntu/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED

The log file, zookeeper.log, will be created in /var/log/zookeeper; tail it to check for any errors.

    $ tail -f /var/log/zookeeper/zookeeper.log

Verify the Zookeeper Cluster and Ensemble

In a ZooKeeper ensemble of three servers, one will be in leader mode and the other two in follower mode; with just one server, it runs standalone. You can check the status by running the following command on each node:

    $ /home/zookeeper/zookeeper-3.4.12/bin/zkServer.sh status

With three nodes:


    ZooKeeper JMX enabled by default
    Using config: /home/zookeeper/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Mode: leader


    ZooKeeper JMX enabled by default
    Using config: /home/zookeeper/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Mode: follower


    ZooKeeper JMX enabled by default
    Using config: /home/zookeeper/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Mode: follower


With a single node:

    ZooKeeper JMX enabled by default
    Using config: /home/zookeeper/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Mode: standalone

    $ echo stat | nc node1 2181

Lists brief details for the server and connected clients.

    $ echo mntr | nc node1 2181

Lists ZooKeeper variables for cluster health monitoring.

    $ echo srvr | nc localhost 2181

Lists full details for the ZooKeeper server.
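The mntr output is line-oriented key/value text, so it is easy to script a health check around it. A sketch follows; the mntr output here is stubbed with representative keys, and on a live node you would pipe `echo mntr | nc node1 2181` instead:

```shell
# Stubbed mntr output; real output comes from: echo mntr | nc node1 2181
mntr_output="zk_version 3.4.12
zk_server_state follower
zk_znode_count 4"

# Extract the server state (leader / follower / standalone) from the output.
state=$(printf '%s\n' "$mntr_output" | awk '$1 == "zk_server_state" { print $2 }')
echo "$state"
```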
If you need to inspect the znodes, you can connect using the command below on any of the ZooKeeper nodes:

    $ /home/zookeeper/zookeeper-3.4.12/bin/zkCli.sh -server `hostname -f`:2181

This connects to the ZooKeeper node, from where you can list the znode contents.

Install Pre-requisites for Build

    $ sudo apt-get install git

Setup environment

Add environment variables to the profile file:

    # setup environments
    export LANG="en_US.UTF-8"
    export PATH=${HOME}/gradle/bin:$PATH
    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-arm64
    export JAVA_TOOL_OPTIONS="-Dfile.encoding=UTF8"

    $ source ~/.bashrc

Hooking up upstream Maven 3.6.0 (for Debian Jessie only)

    $ wget
    $ tar xvf apache-maven-3.6.0-bin.tar.gz
    $ cd apache-maven-3.6.0/bin
    $ export PATH=$PWD:$PATH
    $ mvn --version # should list the version as 3.6.0

Clone and Build Apache Drill

    $ git clone
    $ cd drill
    $ git branch v1.15.0 origin/1.15.0
    $ git checkout v1.15.0

To build .deb package 

    $ mvn clean -X package -Pdeb -DskipTests

To build .rpm package 

    $ mvn clean -X package -Prpm -DskipTests

After a successful build, change into the distribution directory:

    $ cd distribution/target/apache-drill-1.15.0/apache-drill-1.15.0

Next, edit the /etc/hosts file on each machine and make sure the loopback entries are commented out and replaced with your host's <IP-address>, e.g.:

    # 127.0.0.1 localhost
    # 127.0.1.1 ubuntu
    <IP-address> ubuntu
    <IP-address> localhost

This is required because in distributed mode Drill cannot bind to the loopback IP.

Next you need to edit conf/drill-override.conf and set the ZooKeeper cluster id and connection string, e.g.:

    drill.exec: { cluster-id: "1", zk.connect: "<IP-address>:2181" }

Now you can run the drillbit and watch the log. To explore more drillbit options you can refer to the drill-override-example.conf file.

    apache-drill-1.15.0$ ./bin/drillbit.sh help
    Usage: drillbit.sh [--config|--site <site-dir>] (start|stop|status|restart|run|graceful_stop) [args]

In one terminal, follow the logs with the tail command:

    apache-drill-1.15.0$ tail -f log/drillbit.log
    apache-drill-1.15.0$ ./bin/drillbit.sh start
    apache-drill-1.15.0$ ./bin/drillbit.sh status

    drillbit is running.

    apache-drill-1.15.0$ ./bin/drillbit.sh graceful_stop
    Stopping drillbit

You can either stop or do a graceful stop. Repeat the same steps on the other machines (nodes).
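Since the status subcommand prints a one-line summary, a start-up wait loop is easy to script. A hypothetical sketch; check_status stubs out the real status command so the loop logic stays self-contained:

```shell
# Stub standing in for the real drillbit status command's output.
check_status() { echo "drillbit is running."; }

up=""
for i in 1 2 3; do
  # Poll until the status output reports a running drillbit.
  if check_status | grep -q "is running"; then
    up="yes"
    break
  fi
  sleep 1
done
echo "drillbit up: $up"
```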

I was able to run Drill, access http://IP-Address:8047 and run a sample query in distributed mode. To run in distributed mode, simply repeat the same setup on multiple machines (nodes).

If you are using CentOS 7, be a little careful: connection errors may be caused by firewall rules. I used the set of commands below to stop the firewall and open the ZooKeeper port.

    $ sudo systemctl stop firewalld

    $ sudo firewall-cmd --zone=public --add-port=2181/udp --add-port=2181/tcp --permanent

    $ sudo firewall-cmd --reload

    $ ./bin/zkServer.sh restart
    ZooKeeper JMX enabled by default
    Using config: /home/centos/zookeeper-3.4.12/bin/../conf/zoo.cfg
    ZooKeeper JMX enabled by default
    Using config: /home/centos/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Stopping zookeeper ... STOPPED
    ZooKeeper JMX enabled by default
    Using config: /home/centos/zookeeper-3.4.12/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED


Official web page:

by Naresh ( at March 03, 2019 10:02

March 15, 2019

Naresh Bhat

Learning from Himalayan Odyssey 2016

Learning from the dream ride event, HO 2016

The Himalayan Odyssey is considered the toughest ride because you come across every type of road across the mighty Himalayan terrain. Water crossings are common; in some places cold water flows over the road for nearly a kilometre, and the water level rises as the sun comes up. We rode our motorcycles on everything from tarmac to stone roads, mud and slush, and sand dunes, in snow, rain and extreme heat and cold. By the last leg of the journey you gain real confidence in handling your motorcycle, because by then you are much closer to it, having followed the tips below.

Riding Tip #1 Focus on road 

You need to focus on the road and look ahead while you are riding. Remember that your vehicle automatically takes the path you are looking at. Maintain a comfortable distance from the rider in front of you and keep a steady speed.

The road from Chandigarh to Manali winds through the mountains, and Manali is crowded with tourists, so you can expect more vehicles on these roads. Hence you need to be very careful while overtaking.

Riding Tip #2 Overtaking and corners

Never overtake without clear visibility ahead. Before overtaking, bring your vehicle to a comfortable speed. While overtaking in corners, look as far ahead as possible and make sure there is enough distance between the vehicle you are overtaking and the one in front of it.

Riding Tip #3 Be on gas in corners

Never overtake on a blind corner; follow the vehicle in front of you and wait for your turn. More importantly, remember that your vehicle is most stable on the gas (throttle), so when overtaking in corners stay on the throttle at a comfortable speed. When you start overtaking, apply the throttle gradually until you reach a safe position past the vehicle.

Riding Tip #4 Hold gas tank with thigh

Hold the gas tank with your thighs; there should not be any gap between your thighs and the tank. Grip the tank so tightly that the paint should have worn off by the time you complete your ride. This keeps your lower body completely attached to your vehicle.

Riding Tip #5 Free your upper body

After following tip #4, your lower body is attached to the motorcycle. Along with that, keep your upper body loose when sitting on the motorcycle, which means never holding the handlebars tightly. If you do, by the end of the day you will certainly have joint or body pain. Hence always keep your grip on the handlebars as relaxed as possible.

Riding Tip #6 Find your own path

When riding through slush, mud or water crossings, stay focused and find your own path; do not simply follow the line of the rider in front of you.

Riding Tip #7 Eat less

On long rides, do not eat too much or you will feel sleepy. When you ride a motorcycle for long hours your body needs more fluids, so consume plenty of juice, electrolytes or energy drinks depending on the weather conditions. While riding in the mountains it is also good to carry dry fruits and chocolates as backup.

Riding Tip #8 Take break

It is always good to take a break after covering some distance. In the HO there are regroup points where you can take a break after finishing your food. We usually took a 30-minute break after riding every 150+ km.

Riding Tip #9 Change riding position

On long rides, especially on highways, you need to change your riding posture from time to time; never ride a motorcycle for long hours in one fixed posture. You can stretch your legs and change your posture after every 3+ km of riding.

Riding Tip #10 Group riding

Never rush during a group ride. Always keep an eye on the rider ahead of you and the one following, and give signals as much as possible. Don't use high beam or focused lights on group rides; keep your parking lights on so that the rider in front can identify you easily. Always keep a proper braking distance between yourself and the riders ahead and behind.


*  The complete 18-day trip videos are available on the twininsane films webpage: go to the REEL tab and click on the Himalayan Odyssey 2016 videos.

* The direct link for HO 2016 videos on youtube -

by Naresh ( at March 03, 2019 07:39

March 12, 2019

Steve McIntyre

Debian BSP in Cambridge, 08 - 10 March 2019

Lots of snacks, lots of discussion, lots of bugs fixed! YA BSP at my place.


March 03, 2019 02:08

March 10, 2019

Gema Gomez

Sweet Heart Baby Blanket

I have finished my most recent baby blanket. When my best friend from high school told me she was having a baby, I couldn’t have been happier for her. Then I started thinking what project would be most suitable for this baby, something that the mum and dad would also love. After two months of looking at patterns I think I found something really cute for them! This blanket took 5 months to make (in spare time here and there), each row took me between 15 and 30 mins, depending on how many yarn changes were required and whether I needed to replace any of them with a new yarn ball. It has 200 rows. This is the end result:


The pattern used for the blanket is from Elena Balyuk, Sweet Heart Baby Blanket. I followed the chart that comes with the pattern; the stitches were simple enough that no big explanations were required for me. It is cumbersome only in terms of the changes of color throughout (I would advise becoming familiar with changing colors neatly and hiding away ends in your crochet before attempting to undertake this project… there were a lot of ends to tidy up!). It is otherwise simple and easy to make, lots of fun and a very rewarding project.

The yarn used was Caron Simply Soft. The colors used are white, bone and soft blue (plus a bit of black for eyes and nose). Hook size 6mm (J).

The blanket itself, after finished and washed is not really square, due to some of the rows having more or less tension depending on how many yarn changes there are and how I was feeling at the moment. I have kept the same number of stitches throughout and it looks gorgeous with this shiny and soft Caron yarn. I am super happy with the end result and I hope it’ll give my friend’s baby the comfort he deserves to grow confident and fearless, at least during the winter months :)

Adding a pic for the yarn label:

yarn label

I did a swatch at the beginning with all the colors to make sure washing the final piece would be ok. Black didn't really taint anything, so I was happy with that. I have washed this blanket at 30 C, 1200 rpm spin speed. Even though the label says it can be dried, I wouldn't recommend this, as it comes out of the washing machine almost dry; just hanging it over any rail and letting it dry for a couple of hours does the trick!

by Gema Gomez at March 03, 2019 00:00

February 28, 2019

Tom Gall

Linux Kernel Testing by Linaro, Feb 28th Edition

Linaro runs a battery of tests on the Open Embedded and Android operating systems using a variety of hardware and kernel versions in order to detect kernel regressions. These regressions are reported to member companies and the various upstream communities like linux-stable.

This report is a summary of our activity this week.

Testing on Open Embedded

  • KV-126 Final testing for upgrade to sumo happening in staging. Looking to upgrade this week.
  • KV-17 Bisection automation work picking back up
    • Design based on performing an OE build identical to production for a particular board, and then submitting lava job to lkft to determine good/bad
    • Bisection was used last week to find the db410c fix; worked well, needs a lot of cleaning up to become generalized and easy to use.
  • KV-36 It looks like we will be able to add x15s to kernelci without any changes to lkft. Will be pursuing with kernelci.
  • KV-171 “Add LTP tests that android runs to OE/LKFT” finished.
    • Dio tests implemented
    • Commands tests implemented
    • Every test implemented except for ftrace_regression02. We would like to run as many tracing tests as we can, so we split that work out into a new ticket: KV-197 Investigate LTP tracing test cases for LKFT test plan improvement
  • KV-195 Test perf in LKFT based on request from Guenter
    • We used to have a simple perf test; Naresh is going to port it into our environment. User-space tools need to be added to build; kernel config already fine. March/April timeframe.
  • KV-194 Test 64k page size in LKFT request from ARM

Bug Status – 60 open bugs

Linux-Stable LTS RC tested this week

  • 2019-02-25
    • 4.9.161, 4.14.104, 4.19.26, 4.20.13
      • Reported no regressions in <24h
  • 2019-02-21
    • 4.4.176, 4.9.160, 4.14.103, 4.19.25, 4.20.12
      • Reported no regressions in <24h

Testing on Android

  • Discussion
    • Tom followed up with John about HiKey kernel configs. John is optimizing for running on AOSP-master, while LKFT is focused on O-MR1, P, and AOSP-master. The recent CONFIG_QTAGUID change increased failures on P and O, given there are tests that look for the feature. For kernels on the dessert releases we'll need to freeze the config.
  • Android 9 / P LTS-premerge – 4.4, 4.9, 4.14, 4.19
    • 4.19.26 / HiKey – No regressions
    • 4.19.25 / HiKey – No regressions
    • 4.14.104 / HiKey – No regressions
    • 4.14.103 / HiKey – No regressions
    • 4.9.161 – Run in progress
    • 4.9.160 / HiKey – No regressions
    • 4.4.176 / HiKey – No regressions
    • Addendum



    • 4.9.160 / HiKey – failures observed on LKFT but not elsewhere; we have been attempting to reproduce them, but no failures have since been observed.
      • cts-lkft/armeabi-v7a.CtsLibcoreTestCases/
      • cts-lkft/armeabi-v7a.CtsLibcoreTestCases/
      • cts-lkft/armeabi-v7a.CtsLibcoreTestCases/
      • cts-lkft/armeabi-v7a.CtsLibcoreTestCases/
      • cts-lkft/armeabi-v7a.CtsLibcoreTestCases/
      • cts-lkft/armeabi-v7a.CtsLibcoreTestCases/
      • cts-lkft/armeabi-v7a.CtsLibcoreTestCases/
  • Android 9 / P –  4.4, 4.9, 4.14, 4.19 + HiKey
  • AOSP-master-tracking –  4.9, 4.14 4.19 / HiKey & 4.14 / X15
    • We suffered several job failures due to a hub controller issue. The lab has a fix.
      • Example jobs : /usr/local/lab-scripts/cbrxd_hub_control –usb_port 7 –mode sync -i DQ007ADJ failed
    • Regressions found on hi6220-hikey_4.14:
      • cts-lkft-arm64-v8a/arm64-v8a.CtsHardwareTestCases/android.hardware.input.cts.tests.AsusGamepadTestCase.testAllKeys
      • cts-lkft-arm64-v8a/arm64-v8a.CtsHardwareTestCases/android.hardware.input.cts.tests.AsusGamepadTestCase.testAllMotions
      • cts-lkft-arm64-v8a/arm64-v8a.CtsHardwareTestCases/android.hardware.input.cts.tests.SonyDualshock4TestCase.testAllKeys
      • cts-lkft-armeabi-v7a/armeabi-v7a.CtsHardwareTestCases/android.hardware.input.cts.tests.AsusGamepadTestCase.testAllKeys
      • cts-lkft-armeabi-v7a/armeabi-v7a.CtsHardwareTestCases/android.hardware.input.cts.tests.AsusGamepadTestCase.testAllMotions
      • cts-lkft-armeabi-v7a/armeabi-v7a.CtsHardwareTestCases/android.hardware.input.cts.tests.SonyDualshock4TestCase.testAllKeys
    • Regressions found on hi6220-hikey_4.19:
      • cts-lkft-arm64-v8a/arm64-v8a.CtsHardwareTestCases/android.hardware.input.cts.tests.AsusGamepadTestCase.testAllKeys
      • cts-lkft-arm64-v8a/arm64-v8a.CtsHardwareTestCases/android.hardware.input.cts.tests.AsusGamepadTestCase.testAllMotions
      • cts-lkft-armeabi-v7a/armeabi-v7a.CtsHardwareTestCases/android.hardware.input.cts.tests.AsusGamepadTestCase.testAllKeys
      • cts-lkft-armeabi-v7a/armeabi-v7a.CtsHardwareTestCases/android.hardware.input.cts.tests.AsusGamepadTestCase.testAllMotions
    • Regressions found on hi6220-hikey_4.9:
      • cts-lkft-arm64-v8a/arm64-v8a.CtsGraphicsTestCases/
      • cts-lkft-arm64-v8a/arm64-v8a.CtsHardwareTestCases/android.hardware.input.cts.tests.AsusGamepadTestCase.testAllKeys
      • cts-lkft-arm64-v8a/arm64-v8a.CtsHardwareTestCases/android.hardware.input.cts.tests.AsusGamepadTestCase.testAllMotions
      • cts-lkft-arm64-v8a/arm64-v8a.CtsHardwareTestCases/android.hardware.input.cts.tests.SonyDualshock4TestCase.testAllKeys
  • Android 8.1 – 4.4 + HiKey, 4.14 and X15
    • 4.14.101 / X15
      • 45 failures – 21 are QTAGUID related –CONFIG_NETFILTER_XT_MATCH_QTAGUID needs to be turned on
    • 4.4.174 / HiKey
      • No regressions!
  • Bugs:
    • 22 – Stable WtW

by tgallfoo at February 02, 2019 20:40

Linux Kernel Testing Results by Linaro, Feb 21st Edition


Testing on Open Embedded Linux

Testing on Android

  • Discussion :
    • New combo 4.19 + X15 + P – will be added – anticipate ~ 2 weeks
    • Aosp-master-tracking – kernel version, patchlevel, sublevel will be added qa-reports
    • YongQin – updated LKFT to use latest VTS (R5)
  • Android 9 / P LTS-premerge – 4.4, 4.9, 4.14, 4.19
    • 4.19.23  – no regressions
    • 4.14.101 – no regressions
    • 4.9.158 – no regressions
    • 4.4.174 – no new data
  • Android 9 / P –  4.4, 4.9, 4.14, 4.19 + HiKey
    • 4.19.19 – no regressions
    • 4.14.97 – no regressions
    • 4.9.154 – no regressions
    • 4.4.170 – no new data
  • AOSP-master-tracking –  4.9, 4.14 4.19 / HiKey & 4.14 / X15
  • Android 8.1 – 4.4 + HiKey, 4.14 and X15
    • 4.14.101 / X15
      • 21 regressions – QTAGUID – will work with TI team to adjust their config
    • 4.4.170 / HiKey – no regressions
  • Bugs
    • 22 – Stable WtW
    • 4267, 4268, 4269, 4270 
      • Tom
      • Mykhalio (might have time)
    • 4072 – YongQin
    • 3713 – Sumit


by tgallfoo at February 02, 2019 19:58

February 26, 2019

Riku Voipio

Linus Torvalds is wrong - PC no longer defines a platform

Hey, I can do these clickbait headlines too! Recently it has gotten the media's attention that Linus is dismissive of ARM servers. The argument is roughly "Developers use X86 PCs, cross-platform development is painful, and therefore devs will use X86 servers, unless they get ARM PCs to play with".

This ignores the reality that the majority of developers do cross-platform development every day. They develop on Macs and Windows PCs and deploy on Linux servers or mobile phones. The two biggest Linux success stories, cloud and Android, are built on cross-platform development. Yes, cross-platform development sucks. But it's just one of the many things that suck in software development.

More importantly, the ship of the "local dev environment" has long since sailed. Using Linus's other great innovation, git, developers push their code to a Microsoft server, which triggers a Rube Goldberg machine of software build, container assembly, unit tests, deployment to a test environment and so on - all in cloud servers.

Yes, the ability to easily buy a cheap whitebox PC from CompUSA was an important factor in making X86 dominate the server space. But people get cheap servers from the cloud now, and even that is going out of fashion. Services like AWS Lambda abstract the whole server away, and the instruction set becomes irrelevant. Which CPU and architecture will be used to run these "serverless" services is not going to depend on developers having Arm Linux desktop PCs.

Of course there are still plenty of people like me who use Linux Desktop and run things locally. But in the big picture things are just going one way. The way where it gets easier to test things in your git-based CI loop rather than in local development setup.

But like Linus, I still want to see a powerful PC-like Arm NUC or laptop. One that could run a mainline Linux kernel and offer a PC-like desktop experience. Not because ARM depends on it to succeed in the server space (what it needs is out of scope for this blogpost) - but because PCs are useful in their own right.

by Riku Voipio ( at February 02, 2019 20:25

February 15, 2019

Tom Gall

Linux Kernel Testing Results by Linaro – Feb 14th Edition

The information provided is a wrapup of Linaro’s kernel testing efforts. In particular we are searching for kernel regressions. The report is divided into two parts.

The first involves testing of the Linux kernel using Open Embedded as the user space. Long Term Support (LTS) kernels as well as current stable, mainline and next are used.

The second part of this report involves testing Linux kernels using Android as the user space. LTS kernels are being tested; however, these LTS kernels also have the out-of-tree Android Common patches applied.

Generally, but not always, new kernel versions are available every week. In the case of testing on Open Embedded, we especially want to report on RC versions of the LTS kernels within the 48-hour testing window before they are released.

Testing on Open Embedded Linux

Automated reports for kselftest results on -next sending to lkft-triage now

  • Report formatting improved, finished

New work:

  • KV-191 UEFI validation in kernelci
  • Ard Biesheuvel requested support for testing UEFI boot mode in kernelci under QEMU. This is looking straightforward.

Bug Status — 59 open bugs

RC Log


  • 4.9.156, 4.14.99, 4.19.21 — Reported no regressions in <24h
  • 4.20.8 — Reported no regressions in <48h


  • 4.4.174 — Reported no regressions in <24h

Testing on Android

Discussion :

Android 9 / P LTS-premerge — 4.4, 4.9, 4.14, 4.19

  • 4.19.20 / HiKey — no regressions
  • 4.14.98 / HiKey — no regressions
  • 4.9.155 / HiKey — no regressions
  • 4.4.174 / HiKey — failed to boot. With the fix applied, VTS is clean (no regressions) but the CTS network regressions were still there. The fix, as it turns out, was the need to be on the latest clang release, clang-r349610.
  • 4.4.173 / HiKey — 97 cts failures were observed (networking)

Android 9 / P — 4.4, 4.9, 4.14, 4.19 + HiKey

  • 4.4.170 / HiKey — no new data
  • 4.9.154 / HiKey — no regressions
  • 4.14.97 / HiKey — no data received
  • 4.19.19 / HiKey — no data received

AOSP-master-tracking — 4.9, 4.14 4.19 / HiKey & 4.14 / X15

  • A regression was introduced where boot to UI was not successful. Through the course of this week that problem was diagnosed and fixed.
  • Cts CtsLibcoreTestcases — new failure java.lang.IllegalStateException: No SecureRandom implementation was observed. This has introduced approximately ~2600 failures per kernel/board combination.
  • Network tests failed with “Network unreachable” (x15 & HiKey)

Android 8.1–4.4 + HiKey, 4.14 and X15

  • 4.14.94 / X15 — no new data this week
  • 4.4.170 / HiKey — no new data


  • No bug discussion this week as YongQin is just back
  • 22 bugs — no change WtW

Plan for the week

  • Examine aosp-master to better understand whether we are looking at new tests/AOSP changes, infrastructure issues, etc.

by tgallfoo at February 02, 2019 02:23

February 07, 2019

Tom Gall

Linux Kernel Testing Results by Linaro Feb 7th Edition

This report is broken up into two parts, OpenEmbedded and Android. These are the two operating systems we are using to run a battery of tests in order to find regressions in the kernel as new patches are added.

The general list of tests that Linaro runs can be found at

When we look for regressions, tests generally fall into 3 categories of results: Pass, Fail (a regression, exactly what we want to find) and Known Failure, where there is a past history with the testcase. Often known failures are flaky testcases that need to be fixed.


  • Automated reports for kselftest results on -next sending to lkft-triage now

The following tests were fixed and removed from our list of known issues

  • open11
  • fcntl36
  • pselect01
  • pselect01_64
  • inotify08
  • bind03

Bug Status — 61 open bugs

RC Log

  • 4.4.173, 4.9.155, 4.14.98, 4.19.20, 4.20.7
  • LTP/fanotify09 confirmed fixed on 4.14.98 due to backport requested in 4.14.97
  • Reported no regressions in <24h


New prebuilt of clang r349610

  • build tested on 4.4.172, 4.9.153, 4.14.96, 4.19.18
  • Boot tested : 4.19.18, 4.14.96, 4.9.153, 4.4.172
  • VTS tested: 4.19.18, 4.14.96, 4.9.153, 4.4.172
  • 4.4.172 — has 1 VtsKernelProcFileApi regression
  • 4.14.96 has 12 kselftest regressions
  • CTS tested: 4.14.96, 4.9.153, 4.4.172 — No regressions

New LTP 20190115 release

  • New VTS 9.0_r5 with latest LPT 20190115 created for testing
  • Initial 4.19.19, 4.14.97, 4.9.154. run has been made, 4.4.172 is in progress at press time

Android 9 / P LTS-premerge — 4.4, 4.9, 4.14, 4.19

  • 4.19.19–1 regression
  • testUsbSerialReadOnDeviceMatches warrants looking into; lsusb -v fails?
  • 4.19.20 — in progress
  • 4.14.97 — no regressions
  • 4.14.98 — in progress
  • 4.9.154 — no regressions
  • 4.9.155 — in progress
  • 4.4.173 — in progress

Android 9 / P — 4.4, 4.9, 4.14, 4.19 + HiKey

  • 4.19.16 — current, no new data
  • 4.14.94 — current, no new data
  • 4.9.150 — current, no new data
  • 4.4.170 — current, no new data

AOSP-master-tracking — 4.9, 4.14 4.19 / HiKey & 4.14 / X15


  • 22 — Steady WtW

by tgallfoo at February 02, 2019 21:32

January 30, 2019

Tom Gall

Linux Kernel Testing by Linaro Jan 30th 2019 Edition

This week the Linux-based testing that uses OpenEmbedded upgraded to a newer version of LTP. The Android-based testing will make a similar upgrade in another week or two.

Two sets of LTS releases were made during the course of the week this report covers. No regressions were observed on Linux nor with Android.


  • LTP “mm” tests added to LKFT (75 tests/board)
  • Upgraded kselftest that is run against all stable kernels to 4.20.

Bug Status — 65 open bugs

RC Log


  • 4.9.154, 4.14.97, 4.19.19, 4.20.6
  • LTP upgraded to 20190115 for all branches
  • Reported no regressions in <48h


  • 4.4.172, 4.9.153, 4.14.96, 4.19.18, 4.20.5
  • kselftest upgraded to 4.20 for all LTS branches
  • Reported no regressions in <24h



  • 4.19 has a fairly high number of failures as part of its baseline. Started to look into improving the baseline.
  • 40 VTS failures due to QTAGUID on 4.19, moving to known failures.
  • New LTP testcases, consistent failures

vts-test/arm64-v8a.VtsKernelLtp/ fail

vts-test/arm64-v8a.VtsKernelLtp/ fail

vts-test/arm64-v8a.VtsKernelLtp/VtsKernelLtp.syscalls.io_setup01_64bit fail

vts-test/arm64-v8a.VtsKernelLtp/VtsKernelLtp.syscalls.io_submit01_64bit fail

vts-test/arm64-v8a.VtsKernelLtp/VtsKernelLtp.syscalls.select04_64bit fail

vts-test/armeabi-v7a.VtsKernelLtp/ fail

vts-test/armeabi-v7a.VtsKernelLtp/ fail

vts-test/armeabi-v7a.VtsKernelLtp/VtsKernelLtp.syscalls.io_setup01_32bit fail

vts-test/armeabi-v7a.VtsKernelLtp/VtsKernelLtp.syscalls.io_submit01_32bit fail

Android 9 / P LTS-premerge — 4.4, 4.9, 4.14, 4.19

  • 4.19.18 — no regressions
  • 4.19.17 — no regressions
  • 4.14.96 — no regressions
  • 4.14.95 — no regressions
  • 4.9.153 — no regressions
  • 4.9.152 — no regressions
  • 4.4.172 — no regressions
  • 4.4.171 — no regressions

Android 9 / P — 4.4, 4.9, 4.14, 4.19 + HiKey

  • 4.19.16 — current, rerun completed for missing CTS from last week otherwise no new data
  • 4.14.94 — current, no new data
  • 4.9.150 — current, no new data
  • 4.4.170 — current, no new data

AOSP-master-tracking — 4.9, 4.14 4.19 / HiKey & 4.14 / X15

Android 8.1–4.4 + HiKey, 4.14 and X15

  • 4.14.94 / X15 — seems PVR driver is not loading (outdated). Need to test locally, build and upload the driver.
  • 4.4.170 / HiKey — current, no new data


by tgallfoo at January 01, 2019 22:05

January 23, 2019

Tom Gall

Linux Kernel Testing Results by Linaro — Jan 23rd 2019 Edition

One of the things that we do at Linaro is testing Linux Kernels to look for kernel regressions. Ideally we want a world where those that make use of Long Term Support Kernels (LTS) can depend on the stream of fixes that are being provided.

Mobile phone companies, Linux Distros, embedded Linux deployments, etc all generally like the idea of installing one major version of Linux (e.g. 4.9) and sticking with it for the lifetime of their product.

This and the following stories tell how the week-to-week testing of Linux kernels is going, and what we've found (or better, not found) as the kernel versions tick by.

We test using two host user spaces, open embedded and Android.

Open Embedded


4.9.152, 4.14.95, 4.20.4

  • Reported crashes in v4.20.3–15-g5592f5bf010b which were intentional ‘canaries’ (the canary successfully died)
  • Reported no regressions in <24h


  • Reported no regressions in <48h

Bug Status — 57 open bugs


Android 9 / P — 4.4, 4.9, 4.14, 4.19 on HiKey

  • 4.14.94 — no regressions
  • 4.19.16 — Note USB OTG regression and potential eMMC issue documented in the bugs section
  • 4.4.170 — no regressions
  • 4.9.150 — no regressions

Android 8.1–4.4 on HiKey, Android 8.1, 4.14 on X15

  • 4.14.94 / X15 no regressions
  • 4.4.170 / HiKey no regressions

Android 9/P + automerged latest version of LTS 4.4, 4.9, 4.14, 4.19 + HiKey + Latest LTP

  • This new combination is a work in progress to pull in the latest LTP from AOSP-master, as well as using the combination of Android Common + HiKey Linaro (auto-merged). It triggers automatically when Android Common is updated right after a new LTS release is merged. This combo thus gives everyone visibility into test results almost immediately after a new LTS is available.
  • We have initial data but are not sharing them as part of this report yet.

AOSP-master tracking with 4.4, 4.9, 4.14, 4.19 on HiKey

  • These builds are being reworked / repackaged so we’ll have data to report next week.


by tgallfoo at January 01, 2019 17:55

January 15, 2019

Naresh Bhat

usermod/groupmod tools - Rename username and usergroup in Ubuntu

Laptops often come with Ubuntu preinstalled, with a default username and group already created. This blog explains how you can rename that default username and group to your own username and group.

Unix-like operating systems decouple the user name from the user identity, so you may safely change the name without affecting the ID. All permissions, files, etc are tied to your identity (uid), not your username.
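Since the uid is what actually matters, you can cross-check the name/uid mapping from any shell; a small sketch (no root needed):

```shell
# The numeric uid is the real identity; the username is only a label
# resolved through the user database. Read both and cross-check them:
uid=$(id -u)
name=$(id -un)
# resolve the username back from the uid via the passwd database
resolved=$(getent passwd "$uid" | cut -d: -f1)
echo "uid=$uid name=$name resolved=$resolved"
```

After a successful rename, the uid stays the same and both lookups report the new name.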

To manage every aspect of the user database, you use the usermod tool.  To change the username (it is probably best to do this while not logged in as that user):

STEP 1: Reboot your laptop with 1 appended to the kernel command line.
This boots the laptop into rescue (single-user) mode, so nobody is logged in as the user you are about to rename.

STEP 2: Change your root password with the command "passwd"
This is just to be secure in future, because otherwise anyone who knows the default user password could easily take over your laptop.

STEP 3: Rename oldUsername with newUsername

# usermod -l newUsername oldUsername

This, however, doesn't rename the home folder.

STEP 4: Rename to newHomeDir
To change home-folder, use

# usermod -d /home/newHomeDir -m newUsername

after you changed the username.

STEP 5: Rename the groupName

# groupmod --new-name NEW_GROUP_NAME OLD_GROUP_NAME

Now you can reboot your laptop into multi-user mode and log in with your new username at the prompt.
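Taken together, steps 3 to 5 boil down to a short root-shell sequence; a sketch using the hypothetical names `olduser` and `alice`:

```shell
# Run from rescue/single-user mode as root, never while logged in as
# the user being renamed. 'olduser' and 'alice' are placeholder names.
usermod  -l alice olduser                # STEP 3: rename the user
usermod  -d /home/alice -m alice         # STEP 4: rename and move the home dir
groupmod --new-name alice olduser        # STEP 5: rename the group
id alice                                 # sanity check: uid/gid are unchanged
```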

by Naresh ( at January 01, 2019 17:01

January 13, 2019

Leif Lindholm

Building TianoCore with Visual Studio (on ARM64)


EDK2/TianoCore has a very complex build system. Part of that complexity is there to let developers build with vastly different toolchains (GCC, CLANG, Visual Studio, ICC, XCODE). But it also provides different profiles for different versions of these toolchains.

(As a side note, this is what leads to the frequently repeated misconception that EDK2 cannot be built with GCC later than version 5. The reality is that GCC behaviour and command line options have remained stable enough since version 5 that we haven't needed to add new profiles, and the GCC5 profile works fine for 6-8.)

From the start, the ARM/AARCH64 ports were developed using ARM's commercial toolchain and GCC. Whereas on the Ia32/X64 side, most of the development has tended to happen with Visual Studio (GCC mainly being used for Ovmf). This means that for a developer moving from x86 to ARM, they have not only had to get used to a new architecture, but they've also had to deal with a new toolchain.

Installing the tools

Visual Studio

Visual Studio 2017 has included ARM/AARCH64 support since release 15.4. Not publicly announced, and not complete - but sufficient to build firmware, and UEFI applications and drivers. And with release 15.9, the support is now public and complete. Which makes for a good time to ensure we can provide a familiar development environment for those already using Visual Studio.

So I set out to make myself a development environment in which I could build all current architectures in the same environment - and in Visual Studio. And since I have my new ARM64 laptop, I'll make sure to get it working there.

There is no native Visual Studio for arm64, but the (32-bit) x86 version runs just fine.

Search for it in the Microsoft Store, or go straight to the download page. The Community Edition is sufficient, and is free (as in beer) for individuals or open source development.

I'm not going to go through downloading and starting the installer and how to press the Next button, but a few things are worth mentioning.

First, you don't need to install everything in order to get the basic toolchain functionality. I opted for the "Linux development with C++" toolset and ended up with what I needed. Screenshot of VS2017 installer toolset selection.

Second, make sure the components "Visual C++ compilers and libraries for ARM", "Visual C++ compilers and libraries for ARM64" and "Python 2 32-bit" are selected. Screenshot of VS2017 installer component selection.


For building EDK2 for Ia32/X64, you may also need nasm. Currently, there is no arm64 build of nasm for Windows, but again the 32-bit x86 variant does the job. (It also won't currently build with Visual Studio, so that's not a way to get a native one.)


Acpica-tools (including iasl for building ACPI tables) comes in a .zip file (32-bit x86). Rather ungracefully, the Visual Studio build profile simply assumes the binaries from this archive have been extracted and placed in C:\ASL, so do that.


If you don't want to rely completely on the Visual Studio git integration, the 32-bit x86 variant available from here works fine.


Open the Visual Studio Developer Command Prompt directly (don't worry about the GUI). Then, from your edk2 directory, run:

C:\git\edk2>set PYTHON_HOME=C:\Python27
C:\git\edk2>set NASM_PREFIX=C:\Program Files (x86)\NASM\
C:\git\edk2>edksetup.bat rebuild 

to build the native BaseTools and set up the build environment. This will complete with a warning that !!! WARNING !!! No CYGWIN_HOME set, gcc build may not be used !!!, which is fine, because we're not using GCC.

After this, the build command works as usual - VS2017 is the toolchain profile we want. So, to build OvmfPkg for X64:

C:\git\edk2>build -a X64 -t VS2017 -p OvmfPkg\OvmfPkgX64.dsc

Or to build HelloWorld for AARCH64:

C:\git\edk2>build -a AARCH64 -t VS2017 -p MdeModulePkg\MdeModulePkg.dsc -m MdeModulePkg\Application\HelloWorld\HelloWorld.inf

What's missing?

Thanks to Pete Batard, support for building UEFI applications and drivers for AARCH64 was already available upstream. So for Option ROM drivers or UEFI command line utilities, you should be good to go.

However, since we've really only used GCC/CLANG for the port up till now, we're lacking assembler files using a compatible syntax. In addition to this, when trying to build whole platform support, there are several issues with (ARM-specific) C source files that have never before been compiled with Visual Studio.

I started ploughing through this end of last year - a hacked up version leaving many asm implementations empty (just so I could get through and identify all of the C issues) is available in one of my working branches. Of course, this appears to have suffered some bitrot (and change in behaviour with VS 15.9), so I will get back to that over the next few weeks. And as always, if you're impatient - patches welcome!

by Leif Lindholm at January 01, 2019 00:00

January 10, 2019

Leif Lindholm

A long time coming

For a very long time now, I have put effort into dogfooding. Back when I first started working at ARM in 2005, all available ARM platforms you might even consider using for normal computing were ridiculously expensive. But finally, in 2008, something changed.


The BeagleBoard was the first fundamental change in how embedded development boards were marketed and sold. It was open hardware. It was backed by open source software. And it was cheap. It was released into a market where it was "simply common sense" that you couldn't turn a profit on a sub-$1000 development board, and it sold for < $200.

It wasn't brilliant - early revisions had serious issues with the USB host port, so a non-standard cable was needed in order to force the OTG port into host mode, and then networking, keyboard, mouse and any other peripherals you wanted to attach without a soldering iron all had to hang off a hub connected to that single port. But you could run a normal graphical desktop environment on it!

This opened the door for a bunch of follow-ons, including the Raspberry Pi, but there was really nothing game changing for a bunch of years until...


When Google launched the Chromebook product line, they were initially all x86-based.

Samsung Series 3

But eventually, Samsung released the Series 3, and apart from the risk of setting your crotch on fire, the Crouton project made it quite easy to convert this to a Linux laptop-ish.

The underlying business model of course meant that it was intentionally short on local storage, and costcutting meant it was short on RAM even for running a web browser a couple of years down the line - but it was an actual thing I could bring instead of an x86 laptop when going to conferences. Both for hacking and for giving presentations.

I remain fairly convinced mine held the only armhf->ia64 cross compilation toolchain the world has ever seen, at least used in anger (for compile testing changes to the Linux EFI subsystem).

Samsung Chromebook 2

A couple of years later, Samsung followed up with the Chromebook 2, offering a model with 4 cores, a larger (and better) screen, and twice the amount of RAM. So I got one of those, but frankly, the shortage of local storage combined with the unreliability of uSD or USB storage across suspend/resume meant I eventually stopped using it for local builds.

Samsung Chromebook Plus

Well, Samsung eventually decided to give up on selling Chromebooks (and possibly even laptops) in Europe, so I had to import one from across the pond. But this one was 64-bit! And the screen was a serious step up from the previous ones, and the chassis was metal instead of plastic. Apart from that, it wasn't that much of an upgrade - but since my work was pretty much exclusively on 64-bit, it was still a useful thing to move to.

Marvell/SolidRun MacchiatoBIN

The MacchiatoBIN also deserves a mention. It remains the only platform I would recommend to a hobbyist without a list the length of my ARM of caveats. That doesn't mean there isn't such a list, just that it's shorter, and the issues easier to live with. This actually works pretty OK as a primary desktop system, and I used it for that for several months.

Biggest things it got right compared to competition

  • Mini-ITX form factor - fits in any regular PC case.
  • Onboard SATA.
  • Onboard PCIe (one open-ended x4 slot).
  • USB3.
  • On-board connector for front panel USB2 (which is weird, but there are adapters).
  • Unbrickable - can load firmware from uSD.

Biggest issues are

  • Very restrictive on which DIMMs are supported.
  • EDK2 port not yet fully upstream.
  • FTDI serial console flaky (when debugging early system firmware).
  • Non-ATX-like handling of power. Turns on as soon as cable inserted. No soft power-off.

Windows on ARM

Then, finally, devices running Windows (not Windows RT) trickled onto the market at the end of Q1 last year (2018). There was allegedly a contender from Asus, but that never materialised as available for me to buy either here in the UK, in the US or in Taiwan - until a couple of weeks ago.

HP Envy X2

So the first one I got to have a look at was the HP Envy X2 - really a tablet that comes with a keyboard built into its screen protector. I had some brief time with one during Linaro Connect in Hong Kong 2018, but then Linaro got me one to have a closer look, shortly before the subsequent Connect in Vancouver.

While it tries to encourage you to use cloud storage, it actually came with 128GB of onboard storage. This was really useful, because it let me get started figuring out how to build EDK2 under Visual Studio (posts to follow on this). It ended up being quite usable on long haul flights (and related time in airports).

But, this first wave of devices were based on the Qualcomm Snapdragon 835, which was slightly lacking in horsepower - something that got even worse once Spectre/Meltdown mitigations were rolled out.

And it still only had 4GB of RAM. The same as the phone I bought early 2017, and the same as the Chromebook I bought in 2014!

New laptop

So why this retrospective post?

Well, Tuesday this week I noticed that the first of the Snapdragon 850 laptops was finally available to buy in the UK.

Lenovo Yoga C630

Yeah, I may have ordered one of these. I may in fact be typing this post on it. It may also now be out of stock.

The Lenovo Yoga C630 is an octa-core system built like a proper laptop. Solid (and very sleekly stealth looking) metal chassis. The keyboard has very short travel, which some people might hate, but I like it better than the Chromebook and Macbook ones.

Picture of Lenovo Yoga C630

Screen seems OK, and the machine feels a lot snappier than the Envy X2 did. But even more importantly, it comes with 8GB of RAM. The 128GB (only variant available to buy, although they claim a 256GB one also exists) of onboard storage sits on a UFS interface rather than eMMC like the Chromebooks. This makes a substantial difference for performance.

The Yoga ships with a Windows 10 Home licence. Upgrading that to Windows 10 Pro would set you back another £120 and push the total cost over £1000. If those extra features had been important to me, that may well have turned this device too expensive. They weren't for me, so I'm sticking with Home.

State of Windows on ARM(64)

Well, this is very much at a "first impressions" sort of level but...

Windows in S mode

All of these ARM-based laptops ship in S mode. What this means is basically that you can only install programs from the Microsoft Store. Clearly not very useful for me, but just like the default locked-down-ness of the Chromebooks - it really makes sense for what the majority of computer users need, and it does improve device security.

I'm totally OK with this, because it is optional. But it's also worth noting that unlike Chromebooks there is no way to switch back into S mode once you've made the jump.


What makes these laptops potential replacements for existing Windows users is that they provide dynamic binary translation for existing x86 applications. Worth noting is that only 32-bit applications are supported for this, but it does mean most of your standard applications will just work (albeit more sluggishly than when running natively).

Windows Subsystem for Linux

WSL is available with the default installation. You only need to enable it before going to the Microsoft Store (search for "WSL") to install your (mainstream) distribution(s) of choice.

Picture of Ubuntu, openSUSE, SLES, Debian and Kali in the Microsoft Store

Excellent, that means I can do work both with Visual Studio and in a proper Linux environment simultaneously? No :( Not yet. As I said, these devices only made it into the hands of real users less than a year ago, so fixes for issues that were picked up by people using them in anger haven't made it into the stable releases yet. This one is currently blocking me from doing my day job on the Yoga; the release version of WSL fails to emulate the userspace variant of cache maintenance, so pretty much any JIT will die from a SIGILL (illegal instruction).

So I guess the way forward for me is to sign up as a Windows Insider and jump on the "slow track", to get early access to new features (but not quite drink from the firehose).

Edit: signed up as a Windows Insider, now running Version 1809, and this problem has gone away!

Browser support

When I got the Envy X2, I pretty much had the choices of native Edge or emulated Chrome/Firefox. But in a case of excellent timing, there are now native nightly builds of Firefox for arm64. Although it comes with the disclaimer "even nightlier than our normal Nightlies", I have not so far come across any issues.


With WSL you can certainly use your regular Linux ssh command, but if coming from a Windows environment already, it may be useful to know there are already snapshot builds of PuTTY available for both native 32-bit and 64-bit ARM.

Who are you and what have you done with Leif?

I'm me!

And I'm certainly going to look into being able to run Linux directly on this platform.

The nonsense that was "UEFI Secure Boot must not be possible to disable on ARM devices" does not apply to this class of devices, so that is not a blocker preventing this work. And once we have it working, we want to boot Linux with Secure Boot enabled.

But for now I'm going to do some dogfooding on Windows, and try to help find bugs and document my progress.

by Leif Lindholm at January 01, 2019 00:00

January 07, 2019

Steve McIntyre

Rebuilding the entire Debian archive twice on arm64 hardware for fun and profit

I've posted this analysis to Debian mailing lists already, but I'm thinking it's also useful as a blog post too. I've also fixed a few typos and added a few more details that people have suggested.

This has taken a while in coming, for which I apologise. There's a lot of work involved in rebuilding the whole Debian archive, and many days spent analysing the results. You learn quite a lot, too! :-)

I promised way back before DebConf 18 last August that I'd publish the results of the rebuilds that I'd just started. Here they are, after a few false starts. I've been rebuilding the archive specifically to check if we would have any problems building our 32-bit Arm ports (armel and armhf) using 64-bit arm64 hardware. I might have found other issues too, but that was my goal.

The logs for all my builds are online at

for reference. See in particular

for automated analysis of the build logs that I've used as the basis for the stats below.

Executive summary

As far as I can see we're basically fine to use arm64 hosts for building armel and armhf, so long as those hosts include hardware support for the 32-bit A32 instruction set. As I've mentioned before, that's not a given on all arm64 machines, but there are sufficient machine types available that I think we should be fine. There are a couple of things we need to do in terms of setup - see Machine configuration below.


I (naively) just attempted to rebuild all the source packages in unstable main, at first using pbuilder to control the build process and then later using sbuild instead. I didn't think to check on the stated architectures listed for the source packages, which was a mistake - I would do it differently if redoing this test. That will have contributed quite a large number of failures in the stats below, but I believe I have accounted for them in my analysis.

I built lots of packages, using a range of machines in a small build farm at home:
  • Macchiatobin
  • Seattle
  • Synquacer
  • Multiple Mustangs

using my local mirror for improved performance when fetching build-deps etc. I started off with a fixed list of packages that were in unstable when I started each rebuild, for the sake of simplicity. That's one reason why I have two different numbers of source packages attempted for each arch below. If packages failed due to no longer being available, I simply re-queued using the latest version in unstable at that point.

I then developed a script to scan the logs of failed builds to pick up on patterns that matched with obvious causes. Once that was done, I worked through all the failures to (a) verify those patterns, and (b) identify any other failures. I've classified many of the failures to make sense of the results. I've also scanned the Debian BTS for existing bugs matching my failed builds (and linked to them), or filed new bugs where I could not find matches.
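That pattern-scanning step can be sketched in a few lines of shell (the sample logs here are fabricated for illustration, and the patterns are only examples; the real patterns and scripts live in the git repository mentioned below):

```shell
# Classify build-failure logs by grepping for known failure signatures
# and tallying how many logs match each one.
logdir=$(mktemp -d)
printf 'building...\nSegmentation fault\n' > "$logdir/pkg1.log"
printf 'all tests passed\n'                > "$logdir/pkg2.log"
printf 'ghc: Illegal instruction\n'        > "$logdir/pkg3.log"
segv=$(grep -rl 'Segmentation fault' "$logdir" | wc -l)
sigill=$(grep -rl 'Illegal instruction' "$logdir" | wc -l)
echo "Segmentation fault: $segv"
echo "Illegal instruction: $sigill"
rm -rf "$logdir"
```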

I did not investigate fully every build failure. For example, where a package has never been built before on armel or armhf and failed here I simply noted that fact. Many of those are probably real bugs, but beyond the scope of my testing.

For reference, all my scripts and config are in git at

armel results

Total source packages attempted 28457
Successfully built 25827
Failed 2630

Almost half of the failed builds were simply due to the lack of a single desired build dependency (nodejs:armel, 1289). There were a smattering of other notable causes:

  • 100 log(s) showing build failures (java/javadoc)
    Java build failures seem particularly opaque (to me!), and in many cases I couldn't ascertain if it was a real build problem or just maven being flaky. :-(
  • 15 log(s) showing Go 32-bit integer overflow
    Quite a number of go packages are blindly assuming sizes for 64-bit hosts. That's probably fair, but seems unfortunate.
  • 8 log(s) showing Sbuild build timeout
    I was using quite a generous timeout (12h) with sbuild, but still a very small number of packages failed. I'd earlier abandoned pbuilder for sbuild as I could not get it to behave sensibly with timeouts.
The stats that matter are the arch-specific failures for armel:
  • 13 log(s) showing Alignment problem
  • 5 log(s) showing Segmentation fault
  • 1 log showing Illegal instruction
and the new bugs I filed:
  • 3 bugs for arch misdetection
  • 8 bugs for alignment problems
  • 4 bugs for arch-specific test failures
  • 3 bugs for arch-specific misc failures

Considering the number of package builds here, I think these numbers are basically "lost in the noise". I have found so few issues that we should just go ahead. The vast majority of the failures I found were either already known in the BTS (260), unrelated to what I was looking for, or both.

See below for more details about build host configuration for armel builds.

armhf results

Total source packages attempted 28056
Successfully built 26772
Failed 1284

FTAOD: I attempted fewer package builds for armhf as we simply had a smaller number of packages when I started that rebuild. A few weeks later, it seems we had a few hundred more source packages for the armel rebuild.

The armhf rebuild showed broadly the same percentage of failures, if you take into account the nodejs difference - it exists in the armhf archive, so many hundreds more packages could build using it.

In a similar vein for notable failures:

  • 89 log(s) showing build failures (java/javadoc)
    Similar problems, I guess...
  • 15 log(s) showing Go 32-bit integer overflow
    That's the same as for armel, I'm assuming (without checking!) that they're the same packages.
  • 4 log(s) showing Sbuild build timeout
    Only 4 timeouts compared to the 8 for armel. Maybe a sign that armhf will be slightly quicker in build time, so less likely to hit a timeout? Total guesswork on small-number stats! :-)

Arch-specific failures found for armhf:

  • 11 log(s) showing Alignment problem
  • 4 log(s) showing Segmentation fault
  • 1 log(s) showing Illegal instruction

and the new bugs I filed:

  • 1 bug for arch misdetection
  • 8 bugs for alignment problems
  • 10 bugs for arch-specific test failures
  • 3 bugs for arch-specific misc failures

Again, these small numbers tell me that we're fine. I linked to 139 existing bugs in the BTS here.

Machine configuration

To be able to support 32-bit builds on arm64 hardware, there are a few specific hardware support issues to consider.


Our 32-bit Arm kernels are configured to fix up userspace alignment faults, which hides lazy programming at the cost of a (sometimes massive) slowdown in performance when this fixup is triggered. The arm64 kernel cannot be configured to do this - if a userspace program triggers an alignment exception, it will simply be handed a SIGBUS by the kernel. This was one of the main things I was looking for in my rebuild, common to both armel and armhf. In the end, I only found a very small number of problems.

Given that, I think we should immediately turn off the alignment fixups on our existing 32-bit Arm buildd machines. Let's flush out any more problems early, and I don't expect to see many.
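On 32-bit Arm kernels, the fixup policy is runtime-tunable through procfs (see Documentation/arm/mem_alignment in the kernel tree); a sketch, needing root on a 32-bit Arm kernel:

```shell
# /proc/cpu/alignment takes a bitmask: 1 = warn, 2 = fix up, 4 = send SIGBUS.
cat /proc/cpu/alignment          # show current policy and fault counters
echo 4 > /proc/cpu/alignment     # turn fixups off: fault with SIGBUS instead
```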

To give credit here: Ubuntu have been using arm64 machines for building 32-bit Arm packages for a while now, and have already been filing bugs with patches which will have helped reduce this problem. Thanks!

Deprecated / retired instructions

In theory(!), alignment is all we should need to worry about for armhf builds, but our armel software baseline needs two additional pieces of configuration to make things work, enabling emulation for

  • SWP (low-level locking primitive, deprecated since ARMv6 AFAIK)
  • CP15 barriers (low-level barrier primitives, deprecated since ARMv7)

Again, there is quite a performance cost to enabling emulation support for these instructions but it is at least possible!
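On arm64, these emulations are switched on via sysctls (see Documentation/arm64/legacy_instructions in the kernel tree); a sketch, needing root on an arm64 kernel:

```shell
# 0 = undefined (SIGILL), 1 = emulate, 2 = use hardware support where available
sysctl abi.swp=1            # emulate the deprecated SWP instruction
sysctl abi.cp15_barrier=1   # emulate the deprecated CP15 barrier instructions
```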

In my initial testing for rebuilding armhf only, I did not enable either of these emulations. I was then finding lots of "Illegal Instruction" crashes due to CP15 barrier usage in armhf Haskell and Mono programs. This suggests that maybe(?) the baseline architecture in these toolchains is incorrectly set to target ARMv6 rather than ARMv7. That should be fixed and all those packages rebuilt at some point.


  • Peter Green pointed out that ghc in Debian armhf is definitely configured for ARMv7, so maybe there is a deeper problem.
  • Edmund Grimley Evans suggests that the Haskell problem is coming from how it drives LLVM, linking to #864847 that he filed in 2017.

Bug highlights

There are a few things I found that I'd like to highlight:

  • In the glibc build, we found an arm64 kernel bug (#904385) which has since been fixed upstream thanks to Will Deacon at Arm. I've backported the fix for the 4.9-stable kernel branch, so the fix will be in our Stretch kernels soon.
  • There's something really weird happening with Vim (#917859). It FTBFS for me with an odd test failure for both armel-on-arm64 and armhf-on-arm64 using sbuild, but in a porter box chroot or directly on my hardware using debuild it works just fine. Confusing!
  • I've filed quite a number of bugs over the last few weeks. Many are generic new FTBFS reports for old packages that haven't been rebuilt in a while, and some of them look un-maintained. However, quite a few of my bugs are arch-specific ones in better-maintained packages and several have already had responses from maintainers or have already been fixed. Yay!
  • Yesterday, I filed a slew of identical-looking reports for packages using MPI and all failing tests. It seems that we have a real problem hitting openmpi-based packages across the archive at the moment (#918157 in libpmix2). I'm going to verify that on my systems shortly.

Other things to think about

Building in VMs

So far in Debian, we've tended to run our build machines using chroots on raw hardware. We have a few builders (x86, arm64) configured as VMs on larger hosts, but as far as I can see that's the exception so far. I know that OpenSUSE and Fedora are both building using VMs, and for our Arm ports, now that we have more powerful arm64 hosts available, it's probably the way we should go here too.

In testing using "linux32" chroots on native hardware, I was explicitly looking to find problems in native architecture support. In the case of alignment problems, they could be readily "fixed up / hidden" (delete as appropriate!) by building using 32-bit guest kernels with fixups enabled. If I'd found lots of those, that would be a safer way to proceed than instantly filing lots of release-critical FTBFS bugs. However, given the small number of problems found I'm not convinced it's worth worrying about.

Utilisation of hardware

Another related issue is in how we choose to slice up build machines. Many packages will build very well in parallel, and that's great if you have something like the Synquacer with many small/slow cores. However, not all our packages work so well and I found that many are still resolutely chugging through long build/test processes in single threads. I experimented a little with my config during the rebuilds and what seemed to work best for throughput was kicking off one build per 4 cores on the machines I was using. That seems to match up with what the Fedora folks are doing (thanks to hrw for the link!).
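That "one build per 4 cores" rule of thumb is easy to derive per host; a minimal sketch:

```shell
# Derive the number of concurrent build slots from the core count,
# at one build per 4 cores, with a floor of one build per host.
cores=$(nproc)
builds=$(( cores / 4 ))
[ "$builds" -lt 1 ] && builds=1
echo "schedule $builds concurrent builds on $cores cores"
```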

Migrating build hardware

As I mentioned earlier, to build armel and armhf sanely on arm64 hardware, we need to be using arm64 machines that include native support for the 32-bit A32 instruction set. While we have lots of those around at the moment, some newer/bigger arm64 server platforms that I've seen announced do not include it. (See an older mail from me for more details.) We'll need to be careful about this going forwards and keep using (at least) some machines with A32. Maybe we'll migrate arm64-only builds onto newer/bigger A64-only machines and keep the older machines for armel/armhf if that becomes a problem?

At least for the foreseeable future, I'm not worried about losing A32 support. Arm keeps on designing and licensing ARMv8 cores that include it...


I've spent a lot of time looking at existing FTBFS bugs over the last weeks, to compare results against what I've been seeing in my build logs. Much kudos to people who have been finding and filing those bugs ahead of me, in particular Adrian Bunk and Matthias Klose who have filed many such bugs. Also thanks to Helmut Grohne for his script to pull down a summary of FTBFS bugs from UDD - that saved many hours of effort!


Please let me know if you think you've found a problem in what I've done, or how I've analysed the results here. I still have my machines set up for easy rebuilds, so reproducing things and testing fixes is quite easy - just ask!

January 01, 2019 12:57

November 30, 2018

Tom Gall

Kernel Testing News 11/30/2018

Nov 26th saw the release of 4.4.165, 4.9.141, 4.14.84 and 4.19.4

For these LTS kernel versions, results were reported upstream, no regressions were found.

2018-11-26: Rafael Tinoco – bug 4043 – Asked Greg to backport a fix for v4.4, Sasha forwarded to the mm list.

For Android Kernels, regressions were detected.


  • 4.14.84 + HiKey boot regression – observed with 9.0 and AOSP
  • 4.4.165 Regression:
    VtsKernelSyscallExistence#testSyscall_name_to_handle_at – Unknown
    error: test case requested but not executed.
    VtsKernelSyscallExistence#testSyscall_open_by_handle_at – Unknown
    error: test case requested but not executed.
    VtsKernelSyscallExistence#testSyscall_uselib – Unknown error: test
    case requested but not executed.

No other regressions were found for 4.4.165 and 4.9.141 on Android 9.

X15: 4.14.84 + O-MR1 – Baselining activity has been particularly effective over the past two weeks, dropping the number of errors from 65 failing tests to 16 as of today. That’s really good progress towards setting a clean baseline.

Bug 4033: Sumit has been looking at the failing CtsBluetoothTestCases android.bluetooth.cts.BluetoothLeScanTest#testBasicBleScan and android.bluetooth.cts.BluetoothLeScanTest#testScanFilter failures.

These tests both pass across all kernels with 8.1. However, they fail with both 9.0 and AOSP. Looking at historical AOSP results, it appears that the failures there started around September.

Lastly, we had successful test builds and test boots to UI with 4.4.165 and 4.9.141 (with Android 9) using the newly released clang-r346389 compiler.

by tgallfoo at November 11, 2018 22:52

November 24, 2018

Gema Gomez

Idee der creativmarkt

I was in Berlin for an event last week and I stumbled upon a magic place 3 minutes' walk from my hotel. This place was Idee Creativmarkt, a crafts shop similar to, but very different from, Hobbycraft in the UK; it felt posher, with a lot of high quality yarns on display. It was also different because the shop was organised in a more relaxed and creative way, with lots of example projects to inspire visitors to be creative and playful with colors and textures. I didn't buy anything, though, because I am on a mission to reduce my stash for the foreseeable future, and I have decided to only buy new yarn when absolutely necessary (i.e. I have started a project and I need more of a particular color to be able to finish it) or when a new project requires some new type of fibre that I don't own in significant quantities.

Idee entry

Their building was decorated with what I thought was a very clever design of their logo in lighting. Apologies for the bad picture but I hope it conveys the idea of what it looks like:

Idee facade

Of all the projects they had as samples, this is the one that captured my imagination the most, I seem to be enthralled by variegated yarns nowadays:

Idee inspiration

I would definitely recommend any crafters spending a couple of days in Berlin to stop at Idee for inspiration, you won’t be disappointed. I shall go back to my started variegated shawl soon!

by Gema Gomez at November 11, 2018 00:00

November 01, 2018

Mark Brown

Linux Audio Miniconf 2018 report

The audio miniconference was held on the 21st in the offices of Cirrus Logic in Edinburgh with 15 attendees from across the industry including userspace and kernel developers, with people from several OS vendors and a range of silicon companies.


We started off with a discussion of community governance led by Takashi Iwai. We decided that for the ALSA-hosted projects we’ll follow the kernel and adopt the modified version of the contributor covenant that they have adopted; Sound Open Firmware already has a code of conduct. We also talked a bit about moving to use some modern hosted git with web-based reviews. While this is not feasible for the kernel components, we decided to look at doing this for the userspace components; Takashi will start a discussion on alsa-devel. Speaking of the lists, Vinod Koul also volunteered to work with the Linux Foundation admin team to get them integrated with

Liam Girdwood presenting virtualization (photo: Arun Raghavan)


Liam Girdwood then kicked off the first technical discussion of the day, covering virtualization. Intel have a new hypervisor called ACRN, which they are using as part of a solution to expose individual PCMs from their DSPs to virtual clients, and they have a virtio specification for control. There were a number of concerns about the current solution being rather specific to both the hardware and the use case they are looking at; we need to review that this can work on architectures that aren’t cache coherent, or on systems where, rather than exposing a DSP, the host system is using a sound server.

We then moved on to AVB, several vendors have hardware implementations already but it seems clear that these have been built by teams who are not familiar with audio hardware, hopefully this will improve in future but for now there are some regrettable real time requirements. Sakamoto-san suggested looking at FireWire which has some similar things going on with timestamps being associated with the audio stream.

For SoundWire, basic functionality for x86 systems is now 90% there – we still need support for multiple CPU DAIs in the ASoC core (which is in review on the lists) and the Intel DSP drivers need to plumb in the code to instantiate the drivers.

We also covered testing. There may be some progress here this year, as Intel have a new hypervisor called ACRN and some out-of-tree QEMU models for some of their newer systems, both of which will help with the perennial problem that we need hardware for a lot of the testing we want to do. We also reviewed the status of some other recurring issues, including PCM granularity and timestamping: for PCM granularity Takashi Iwai will make some proposals on the list, and for timestamping Intel will make sure that the rest of their driver changes for this are upstreamed. For dimen we agreed that Sakamoto-san’s work is pretty much done and we just need some comments in the header, and that his control refactoring was a good idea. There was discussion of user defined card elements; there were no concerns with raising the number of user defined elements that can be created, but some fixes are needed for cleanup of user defined card elements when applications close. The compressed audio userspace is also getting some development, with the focus on making things easier to test, integrating with ffmpeg to give something that’s easier for users to work with.

Charles Keepax covered his work on rate domains (which we decided should really be much more generic than just covering sample rates), he’d posted some patches on the list earlier in the week and gave a short presentation about his plans which sparked quite a bit of discussion. His ideas are very much in line with what we’ve discussed before in this area but there’s still some debate as to how we configure the domains – the userspace interface is of course still there but how we determine which settings to use once we pass through something that can do conversions is less clear. The two main options are that the converters can expose configuration to userspace or that we can set constraints on other widgets in the card graph and then configure converters automatically when joining domains. No firm conclusion was reached, and since substantial implementation will be required it is not yet clear what will prove most sensible in practical systems.


Sakamoto-san also introduced some discussion of new language bindings. He has been working on a new library designed for use with GObject introspection which people were very interested in, especially with the discussion of testing – having something like this would simplify a lot of the boilerplate that is involved in using the C API and allow people to work in a wider variety of languages without needing to define specific bindings or use the individual language’s C adaptations. People also mentioned the Rust bindings that David Henningsson had been working on, they were particularly interesting for the ChromeOS team as they have been adopting Rust in their userspace.

We talked a bit about higher level userspace software too. PulseAudio development has been relatively quiet recently, Arun talked briefly about his work on native compressed audio support and we discussed if PulseAudio would be able to take advantage of the new timestamping features added by Pierre-Louis Bossart. There’s also the new PipeWire sound server stack, this is a new stack which was originally written for video but now also has some audio support. The goal is to address architectural limitations in the existing JACK and PulseAudio stacks, offering the ability to achieve low latencies in a stack which is more usable for general purpose applications than JACK is.


Discussions of DSP related issues were dominated by Sound Open Firmware which is continuing to progress well and now has some adoption outside Intel. Liam gave an overview of the status there and polled interest from the DSP vendors who were present. We talked about how to manage additions to the topology ABI for new Sound Open Firmware features including support for loading and unloading pieces of the DSP topology separately when dynamically adding to the DSP graph at runtime, making things far more flexible. The issues around downloading coefficient data were also covered, the discussion converged on the idea of adding something to hwdep and extending alsa-lib and tinyalsa to make this appear integrated with the standard control API. This isn’t ideal but it seems unlikely that anything will be. Techniques for handling long sequences of RPC calls to DSPs efficiently were also discussed, the conclusion was that the simplest thing was just to send commands asynchronously and then roll everything back if there are any errors.


Thanks again to all the attendees for their time and contributions and to Cirrus Logic for their generosity in hosting this in their Edinburgh office. It was really exciting to see all the active development that’s going on these days, it’ll be great to see some of that bear fruit over the next year!

Group photo

by broonie at November 11, 2018 17:12

October 10, 2018

Neil Williams

Code Quality & Formatting for Python

I've recently added two packages (and their dependencies) to Debian and thought I'd cover a bit more about why.


black, the uncompromising Python code formatter, has arrived in Debian unstable and testing.

black is being adopted by the LAVA Software Community Project in a gradual way and the new CI will be checking that files which have been formatted by black stay formatted by black in merge requests.

There are endless ways to format Python code and pycodestyle and pylint are often too noisy to use without long lists of ignored errors and warnings. Black takes the stress out of maintaining a large Python codebase as long as a few simple steps are taken:

  • Changes due to black are not functional changes. A merge request applying black to a source code file must not include functional changes. Just the change done by black. This makes code review manageable.
  • Changes made by black are recorded and once made, CI is used to ensure that there are no regressions.
  • Black is only run on files which are not currently being changed in existing merge requests. This is a simple sanity provision; rebasing functional changes after running black is not fun.

Consistent formatting goes a long way to helping humans spot problematic code.

See or apt-get install python-black-doc for a version which doesn't "call home".


So much for code formatting, that's nice and all but what can matter more is an overview of the complexity of the codebase.

We're experimenting with running radon as part of our CI to get a CodeClimate report which GitLab should be able to understand.

(Take a bow - Vince gave me the idea by mentioning his use of Cyclomatic Complexity.)

What we're hoping to achieve here is a failed CI test if the complexity of critical elements increases and a positive indication if the code complexity of areas which are currently known to be complex can be reduced without losing functionality.

Initially, just having the data is a bonus. The first try at CodeClimate support took the best part of an hour to scan our code repository. radon took 3 seconds.
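For a rough feel of what a cyclomatic complexity scanner counts, here is a minimal sketch using only the stdlib ast module. This is an illustration of the metric, not radon's actual implementation, which handles many more node types:

```python
import ast

# Rough approximation: complexity = 1 + number of branch points.
# radon's real metric also counts comprehensions, assert, etc.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source: str) -> int:
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))

SAMPLE = """
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "prime-ish"
"""
print(cyclomatic_complexity(SAMPLE))  # 4: one base path, plus two ifs and a for
```

radon's cc command reports a graded version of a number like this per function, which is what ends up feeding the CodeClimate report.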

See or apt-get install python-radon-doc for a version which doesn't "call home".

(It would be really nice for upstreams to understand that putting badges in their sphinx documentation templates makes things harder to distribute fairly. Fine, have a nice web UI for your own page but remove the badges from the pages in the released tarballs, e.g. with a sphinx build time option.)

One more mention - bandit

I had nothing to do with introducing this to Debian but I am very grateful that it exists in Debian. bandit is proving to be very useful in our CI, providing SAST reports in GitLab. As with many tools of its kind, it is noisy at first. However, with a few judicious changes and the use of the # nosec comment to rule out scanning of things like unit tests which deliberately try to be insecure, we have substantially reduced the number of reports produced by bandit.
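As an illustration of the # nosec workflow (the function and command here are made up, not taken from the LAVA codebase), bandit flags subprocess calls with shell=True (check B602), and an inline comment silences the finding once it has been reviewed:

```python
import subprocess

# bandit would flag shell=True here (B602). The command is a fixed string,
# so after review the finding is suppressed inline with "# nosec".
def run_fixed_command():
    return subprocess.run("echo ok", shell=True,  # nosec
                          capture_output=True, text=True)

result = run_fixed_command()
print(result.returncode, result.stdout.strip())  # 0 ok
```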

Having the tools available is so important to actually fixing problems before the software gets released.

by Neil Williams at October 10, 2018 14:26

September 19, 2018

Mark Brown

2018 Linux Audio Miniconference

As in previous years we’re trying to organize an audio miniconference so we can get together and talk through issues, especially design decisons, face to face. This year’s event will be held on Sunday October 21st in Edinburgh, the day before ELC Europe starts there. Cirrus Logic have generously offered to host this in their Edinburgh office:

7B Nightingale Way

As with previous years let’s pull together an agenda through a mailing list discussion on alsa-devel – if you’ve got any topics you’d like to discuss please join the discussion there.

There’s no cost for the miniconference but if you’re planning to attend please sign up using the document here.

by broonie at September 09, 2018 18:36

September 09, 2018

Bin Chen

eBook: Understand Container

The index page of understand container got a very good number of page views after being created, so I started wondering whether anyone would be interested in a more polished, extended, and easier-to-read version.

So I started a book called "understand container". Let me know if you are interested in the work by subscribing here and I'll send you the first draft version, which will include all the 8 articles here. The free subscription will end on 31st Oct 2018.

* Remember to click "Share email with author (optional)", so that I can send the book to your email directly. 

by Unknown ( at September 09, 2018 10:52


August 31, 2018

Bin Chen

Understand Container - Index Page

This is an index page to a series of 8 articles on container implementation.


This page got a very good number of page views after being created, so I started wondering whether anyone would be interested in a more polished, extended, and easier-to-read version.

So I started a book called "understand container". Let me know if you are interested in the work by subscribing here and I'll send you the first draft version, which will include all the 8 articles here. The free subscription will end on 31st Oct 2018.

* Remember to click "Share email with author (optional)", so that I can send the book to your email directly. 


by Unknown ( at August 08, 2018 11:17

Steve McIntyre

And lo, we sacrificed to the gods of BBQ once more

As is becoming something of a tradition by now, Jo and I hosted another OMGWTFBBQ at our place last weekend. People came from far and wide to enjoy themselves. Considering the summer heatwave we've had this year, we were a little unlucky with the weather. But with the power of gazebo technology we kept (mostly!) dry... :-)

I was too busy cooking and drinking etc. to take any photos myself, so here are some I sto^Wborrowed from my friends!

We continued to celebrate Debian getting old:
the cake is not a lie!
Photo from Jonathan McDowell

We had much beer from the nice folks at Milton Brewery:
is 3 firkins enough?
Photo from Rob Kendrick

Much meat was prepared and cooked:
very professional!
Photo from Stefano Rivera

And we had a lot of bread too!
Photo from Rob Kendrick

Finally, many thanks to a number of awesome companies for again sponsoring the important refreshments for the weekend. It's hungry/thirsty work celebrating like this!

August 08, 2018 02:24

August 16, 2018

Steve McIntyre

25 years...

We had a small gathering in the Haymakers pub tonight to celebrate 25 years since Ian Murdock started the Debian project.

people in the pub!

We had 3 DPLs, a few other DDs and a few more users and community members! Good to natter with people and share some history. :-) The Raspberry Pi people even chipped in for some drinks. Cheers! The celebrations will continue at the big BBQ at my place next weekend.

August 08, 2018 21:42

August 15, 2018

Naresh Bhat

Apache Ambari on ARM64


In this blog post we explain Ambari and its uses, and the status of Ambari on ARM64.

Apache Ambari is an open source administration tool deployed on top of a Hadoop cluster, responsible for keeping track of running applications and their status. In other words, Apache Ambari is an open source web-based management tool that provisions, manages and monitors the health of Hadoop clusters.

Apache Ambari is currently one of the leading projects running under the Apache Software Foundation. The reason is that Ambari eliminates the need for the manual tasks used to watch over Hadoop operations. It gives a simple, secure platform for provisioning, managing and monitoring Hortonworks Data Platform (HDP) deployments.

How Apache Ambari came into existence 

The genesis of Apache Ambari traces back to the emergence of Hadoop, when its distributed and scalable computing took the world by storm. More and more technologies were incorporated into the existing infrastructure. As Hadoop matured, it became difficult to maintain clusters with multiple nodes and applications simultaneously. That is when this technology came into the picture, to make distributed computing easier.
  • It's a completely open source management platform for provisioning, managing, monitoring and securing Apache Hadoop clusters. 
  • Apache Ambari takes the guesswork out of operating Hadoop.
  • Apache Ambari, as part of the Hortonworks Data Platform, allows enterprises to plan, install and securely configure HDP making it easier to provide ongoing cluster maintenance and management, no matter the size of the cluster.
Ambari ARM64 porting efforts

We (the BigData team) at Linaro are trying to port Ambari to ARM64 and upstream all the patches, but as you know the upstream process takes its own time to complete. As a first step, we are trying to compile and generate the RPM/DEB packages on a standard distribution. The compilation steps for Ambari can be found here - Ambari Compilation. You need a couple of patches to get it compiled on ARM64. The patches basically do the following:
  • Replace the hard-coded x86_64 value of the needarch variable.
  • Use the npm/Node version shipped in your distribution.
  • Use a phantomjs version that has AArch64 support.
  • Use the ember-handlebars-brunch version which has ARM64 support.
  • Replace hardcoded amd64 values for the deb.architecture variable.
  • Add RPM/JDEB support patches for a couple of missing packages.
Ambari does not have any AArch64 bits; it has x86 bits hard-coded everywhere. We have fixed this from v2.5, used Bigtop to make the build, then wrote mpack (management pack) specs and used Bigtop as an mpack. We have created a collaborate page to build and install Ambari v2.6.1. If you want to build a different version, you may need to tweak the patches slightly depending on the version you want to build. These patches can be found in our Linaro git repositories.

Build/Install/Run Apache Ambari

In this section we explain, as an example, an Apache Ambari v2.6.1 build from scratch on both CentOS and Debian/Ubuntu machines. The versions are very important; the build method for the latest version is almost the same, with some minor changes. To build the latest version of Ambari, use my git repository, which is usually kept up to date with forward-ported patches.


Setup Environment

  • Debian 9.0 64bit for AArch64, or CentOS-7.4 64bit for AArch64
  • jdk8u-server-release-1804




maven@v3.5.3, nodejs@v4.2.6, npm@2.14.12, brunch@1.7.10, phantomjs@2.1.1, python>=2.6, python-dev, rpm, yum, g++

Build Steps

Install Pre-requisites

For Debian 9.0:

sudo apt install git python python-dev rpm yum build-essential libfreetype6 libfreetype6-dev fontconfig fontconfig-config libfontconfig1-dev libssl-dev openssl findbugs -y

For CentOS7

sudo yum groupinstall "Development Tools"
sudo yum install git python python-devel openssl-devel openssl openssl-libs freetype freetype-devel fontconfig-devel fontconfig gcc gcc-c++ make build autoconf automake cppunit-devel cmake bzip2 rpm-build

Setup maven

To setup maven 3.5.3

tar xvf apache-maven-3.5.3-bin.tar.gz
cd apache-maven-3.5.3/bin
export PATH=$PWD:$PATH

Make sure the version of Maven is 3.5.3 when the following command is issued.
mvn --version

Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z)
Maven home: /home/centos/maven/apache-maven-3.5.3
Java version: 1.8.0-release, vendor: Oracle Corporation
Java home: /home/centos/jdk8u/jdk8u-server-release-1804/jre
Default locale: en_IN, platform encoding: UTF-8
OS name: "linux", version: "4.12.0-1.1.aarch64", arch: "aarch64", family: "unix"

Setup python tools

  • For python 2.6, download
    sudo sh setuptools-0.6c11-py2.6.egg

  • For python 2.7, download
    sudo sh setuptools-0.6c11-py2.7.egg
    Python 2.6 didn't work for me, so I just created a symlink from python2.6 to python2.7:
    $ sudo ln -s /usr/bin/python2.7 /usr/bin/python2.6

Setup nodejs/npm

Different versions of nodejs and npm ship with Ubuntu/Debian.

On Ubuntu/Debian, nodejs/npm can be installed by:

sudo apt-get install -y nodejs npm
cd /usr/bin && sudo ln -s nodejs node
sudo npm install -g brunch@1.7.10
Note that if you are using Debian 9 stretch, please follow the steps below:
sudo apt-get install curl
curl -sL | sudo -E bash -
sudo apt-get install -y nodejs
cd /usr/bin && sudo ln -s nodejs node
sudo npm install -g brunch@1.7.10

On CentOS 7, nodejs/npm need to be built from source.
git clone
cd node
git checkout -b 4.2.6 v4.2.6
./configure --prefix=/usr && make -j8
sudo make install
sudo npm install -g brunch@1.7.10

The versions of the built binaries are: nodejs@v4.2.6, npm@2.14.12.
Once they are installed, pom.xml in ambari-admin needs to be changed to reflect these versions. The target nodejs/npm versions are defined in the "configuration" field of "frontend-maven-plugin".

Build PhantomJS

The following steps build phantomjs v2.1.1 with AArch64 support. Note that you have to install all the dependency packages before you proceed further. Refer to the collaborate page for PhantomJS.

git clone
cd phantomjs
git checkout -b v2.1.1 2.1.1
./ -c -j $(getconf _NPROCESSORS_ONLN)

When the build is finished, create a tar file for deployment:

cd deploy

You can test phantomjs build by issuing:

./bin/phantomjs test/run-tests.js
Install phantomjs-2.1.1-linux-aarch64.tar.bz2 to the system and add phantomjs to $PATH.

Check if phantomjs is properly installed by doing:

$ phantomjs --version

Replace frontend-maven-plugin

Ambari uses frontend-maven-plugin@v0.0.16, which doesn't support AArch64. Do the following to rebuild this plugin for AArch64.

git clone
cd frontend-maven-plugin
git checkout -b 0.0.16 frontend-plugins-0.0.16
git apply frontend-maven.patch
mvn clean -DskipTests install

Replace leveldbjni

leveldbjni is used in Ambari-metrics. Only x86/x86_64 versions are provided in the Maven repo, so an AArch64 version of leveldbjni needs to be built and installed.

tar -xf snappy-1.0.5.tar.gz
cd snappy-1.0.5
./configure --disable-shared --with-pic --host aarch64-unknown-linux --build aarch64-unknown-linux
make -j4
cd ..
git clone git://
git clone git://
export SNAPPY_HOME=`cd snappy-1.0.5; pwd`
export LEVELDB_HOME=`cd leveldb; pwd`
export LEVELDBJNI_HOME=`cd leveldbjni; pwd`
cd leveldb
git apply ../leveldbjni/leveldb.patch
wget -O port/atomic_pointer.h
make libleveldb.a
git checkout -b 1.8 leveldbjni-1.8
mvn clean install -P all -P linux64 -DskipTests=true

Build Ambari

To build Ambari, a certain version number should be provided. This version number is 5 digits, not the "4 digits" mentioned on Ambari's wiki page. The last digit may vary, but the first 3 digits should be the same as the Ambari source/release version; in our case this is 2.6.1. Patches are provided to make Ambari build on AArch64. Apply all the patches before starting the build. You can directly clone and build my Ambari git repository -

git clone
cd ambari
git checkout release-2.6.1
Download and apply following patches
git am 0001-ambari-build-aarch64-2.6.1.patch
git am 0002-ambari-metrics-grafana-Add-jdeb-support.patch
git am 0003-ambari-funtest-Add-jdeb-support.patch
git am 0004-ambari-logsearch-Add-jdeb-support.patch
git am 0005-ambari-Add-jdeb-arm64-support.patch
mvn versions:set -DnewVersion=
pushd ambari-metrics
mvn versions:set -DnewVersion=
On CentOS 7.4, to generate RPMs, you can issue the command below.
mvn -B clean install package rpm:rpm -DskipTests -Dpython.ver="python >= 2.6" -Preplaceurl -Drat.ignoreErrors=true
On Debian 9, to generate Debian packages, you can issue the command below.
mvn -B clean install jdeb:jdeb -DnewVersion= -DskipTests -Dpython.ver="python >= 2.6" -Drat.ignoreErrors=true
Ambari Server will create the following packages
  • RPM will be created under AMBARI_DIR/ambari-server/target/rpm/ambari-server/RPMS/aarch64.
Ambari Agent will create the following packages
  • RPM will be created under AMBARI_DIR/ambari-agent/target/rpm/ambari-agent/RPMS/aarch64.
Ambari Metrics will create the following packages
  • RPM will be created under AMBARI_DIR/ambari-metrics/ambari-metrics-timelineservice/target/rpm/ambari-metrics-collector/RPMS/noarch.

Run Ambari

Ambari Server

First, install the pre-requisites:

sudo yum install postgresql
sudo yum install postgresql-server

Then install the Ambari Server RPM.

sudo yum install ambari-server/target/rpm/ambari-server/RPMS/aarch64/ambari-server-*.rpm

Initialize Ambari Server:

sudo ambari-server setup

Start up Ambari Server:

sudo ambari-server start

To access Ambari, go to: http://{ambari-server-hostname}:8080

The initial username/password is admin/admin.

Ambari Agent

Install the Ambari Agent RPM.

sudo yum install ambari-agent/target/rpm/ambari-agent/RPMS/aarch64/ambari-agent-

Then edit the location of Ambari Server in /etc/ambari-agent/conf/ambari-agent.ini by editing the hostname line.

Start Ambari Agent:

sudo ambari-agent start

Patches submitted

Jira issues on Ambari

Git repository to track upstream activity

The Ambari version 2.6.1 patches can be downloaded from


Ambari Demo on ARM64 at Linaro connect

At Budapest Linaro connect ( BUD17 ) we did a Apache Ambari demo on ARM64 -

Upcoming activities

- Upstream Ambari ARM64 needarch support patches
- upstream
- Management pack (mpack) implementations for BigTop project

by Naresh ( at August 08, 2018 12:44

August 12, 2018

Steve McIntyre

DebConf in Taiwan!

DebConf 18 logo

So I'm slowly recovering from my yearly dose of full-on Debian! :-) DebConf is always fun, and this year in Hsinchu was no different. After so many years in the project, and so many DebConfs (13, I think!) it has become unmissable for me. It's more like a family gathering than a work meeting. In amongst the great talks and the fun hacking sessions, I love catching up with people. Whether it's Bdale telling me about his fun on-track exploits or Stuart sharing stories of life in an Australian university, it's awesome to meet up with good friends every year, old and new.

DC18 venue

For once, I even managed to find time to work on items from my own TODO list during DebCamp and DebConf. Of course, I also got totally distracted helping people hacking on other things too! In no particular order, stuff I did included:

  • Working with Holger and Wolfgang to get debian-edu netinst/USB images building using normal debian-cd infrastructure;
  • Debugging build issues with our buster OpenStack images, fixing them and also pushing some fixes to Thomas for build-openstack-debian-image;
  • Reviewing secure boot patches for Debian's GRUB packages;
  • As an AM, helping two DD candidates working their way through NM;
  • Monitoring and tweaking an archive rebuild I'm doing, testing building all of our packages for armhf using arm64 machines;
  • Releasing new upstream and Debian versions of abcde, the CD ripping and encoding package;
  • Helping to debug UEFI boot problems with Helen and Enrico;
  • Hacking on MoinMoin, the wiki engine we use for;
  • Engaging in lots of discussions about varying things: Arm ports, UEFI Secure Boot, Cloud images and more

I was involved in a lot of sessions this year, as normal. Lots of useful discussion about Ignoring Negativity in Debian, and of course lots of updates from various of the teams I'm working in: Arm porters, web team, Secure Boot. And even an impromptu debian-cd workshop.

Taipei 101 - datrip venue

I loved my time at the first DebConf in Asia (yay!), and I was yet again amazed at how well the DebConf volunteers made this big event work. I loved the genius idea of having a bar in the noisy hacklab, meaning that lubricated hacking continued into the evenings too. And (of course!) just about all of the conference was captured on video by our intrepid video team. That gives me a chance to catch up on the sessions I couldn't make it to, which is priceless.

So, despite all the stuff I got done in the 2 weeks my TODO list has still grown. But I'm continuing to work on stuff, energised again. See you in Curitiba next year!

August 08, 2018 15:11

July 26, 2018

Senthil Kumaran

lava-server official docker images

The Linaro Automated Validation Architecture (a.k.a. LAVA) project has released official docker images for lava-server only containers, following the recent release of lava-dispatcher only docker images. This blog post explains how to use these lava-server docker images in order to run LAVA instances via docker.

Before getting into the details of running these images, let us see how they are organized and what packages are available via them.

The lava-server only docker images will be officially supported by the LAVA project team, and there will be regular releases of these images whenever there are updates or new releases. As of this writing, two images have been released - production and staging. These docker images are based on the Debian Stretch operating system, which is the recommended operating system for installing LAVA.

lava-server production docker images

The production docker image of lava-server is based on the official production-repo of the LAVA project. The production-repo holds the latest stable packages released by the LAVA team for each of the LAVA components. The production docker image will be available in the following link:

Whenever there is a production release from the LAVA project there will be a corresponding image created with the tag name in. The latest tag as of this writing is 2018.7-1. In order to know what these production docker images are built with, have a look at the Dockerfile in

lava-server staging docker images

The staging docker image of lava-server is based on the official staging-repo of LAVA project. The staging-repo holds the latest packages built everyday by LAVA team for each of the LAVA components, which is also a source for bleeding edge unreleased software. The staging docker image will be available in the following link, which is built daily:

Whenever there is a successful daily build of staging packages available, a docker image will be made available in with the tag name 'latest'. Hence, at any point in time there will be only one tag, i.e. latest, in the staging docker image location. In order to know what these staging docker images are built with, have a look at the Dockerfile in

Having seen the details about the lava-server only docker images, let us now see how to run these docker images to create a LAVA server instance.

running production lava-server docker image

$ sudo docker run -p 8080:80 --privileged --name lava-2018.7-1 linaro/lava-server-production-stretch-amd64:2018.7-1
Starting postgresql...
Starting PostgreSQL 9.6 database server: main.
Starting lava-coordinator...
Starting lava-coordinator : lava-coordinato.
Starting apache2 server...
Starting Apache httpd web server: apache2AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using Set the 'ServerName' directive globally to suppress this message
Creating admin account
Superuser created successfully.
Set initial password for admin account as: changeit
spawn lava-server manage changepassword admin
Changing password for user 'admin'
Password (again):
Password changed successfully for user 'admin'
Starting lava-server-gunicorn...

Once the docker image is started visit the instance using the url http://localhost:8080/ or from the host machine. The IP address is obtained from the output above.

running staging lava-server docker image

$ sudo docker run -p 8080:80 --privileged --name lava-latest linaro/lava-server-staging-stretch-amd64:latest
Starting postgresql...
Starting PostgreSQL 9.6 database server: main.
Starting lava-coordinator...
Starting lava-coordinator : lava-coordinato.
Starting apache2 server...
Starting Apache httpd web server: apache2AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using Set the 'ServerName' directive globally to suppress this message
Creating admin account
Superuser created successfully.
Set initial password for admin account as: changeit
spawn lava-server manage changepassword admin
Changing password for user 'admin'
Password (again):
Password changed successfully for user 'admin'
Starting lava-server-gunicorn...

Thus we have our lava-server docker image up and running in a docker container. To log in to this instance, use the default user 'admin' and the password 'changeit'. The admin user has administration privileges, so ensure you change the password to keep your instance secure.

Have a look at the helper which accepts and executes commands; it will be handy for tackling advanced use-cases that you may want to envision using these lava-server based docker images.

by stylesen at July 07, 2018 08:43

July 15, 2018

Bin Chen

Understand Kubernetes 5: Controller

Controllers in k8s assume the same role and responsibility as the Controller in the classic Model-View-Controller architecture (whereas the Model is the set of API objects stored in etcd). What is unique about a controller in k8s is that it constantly reconciles the current state of the system to the desired state; it is not a one-time task.
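That reconcile loop can be illustrated with a toy sketch (plain Go, not the actual k8s code; `reconcile`, `desired` and `current` are made-up names):

```go
package main

import "fmt"

// A toy reconcile loop: drive the current state toward the desired state,
// one step at a time. Real controllers do this forever, reacting to events.
func reconcile(desired, current int) int {
	diff := current - desired
	switch {
	case diff < 0:
		current++ // too few replicas: create one
	case diff > 0:
		current-- // too many replicas: delete one
	}
	return current
}

func main() {
	current := 1
	for current != 3 {
		current = reconcile(3, current)
		fmt.Println("current replicas:", current)
	}
}
```

The point is the shape, not the arithmetic: the controller never assumes its last action succeeded; it keeps comparing current to desired and correcting.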

Replicaset Controller

To make things real, we'll look at the source code of the ReplicaSetController and see what exactly a controller is, whom it interacts with, and how.
The core logic of the ReplicaSetController is quite simple, as shown below:
func (rsc *ReplicaSetController) manageReplicas(filteredPods []*v1.Pod, rs *apps.ReplicaSet) error {
diff := len(filteredPods) - int(*(rs.Spec.Replicas))
if (diff < 0) {
createPods( ) // fewer Pods than desired: create more
} else if (diff > 0) {
deletePods( ) // more Pods than desired: delete the surplus
To create the Pod, it uses a KubeClient which talks to the API server.
func (r RealPodControl) createPods( )
newPod, _ := r.KubeClient.CoreV1().Pods(namespace).Create(pod)
Tracing further into the Create() function, we see it uses a nice builder pattern to set up an HTTP request:
func (c *pods) Create(pod *v1.Pod) (result *v1.Pod, err error) {
result = &v1.Pod{}
err = c.client.Post().
Upon calling Do(), it will issue an HTTP POST request and get the result:
func (r *Request) Do() Result {    
var result Result
err := r.request(func(req *http.Request, resp *http.Response) {
result = r.transformResponse(resp, req)
return result
That only covers one direction of the communication: from the controller to the API server.

How about the other direction?


A controller subscribes itself to the API server for the events it cares about.
A controller typically cares about two types of information: controller-specific information and the core information regarding Pods.
In k8s, the components used to deliver these notifications are called Informers. FWIW, an Informer is just the Observer pattern.
In the case of the ReplicaSetController, when a ReplicaSet request is submitted, the API server notifies the ReplicaSetController through appsinformers.ReplicaSetInformer. When a Pod gets created, the API server notifies the ReplicaSetController through coreinformers.PodInformer.
See how a ReplicaSetController is initialized:
func startReplicaSetController(ctx ControllerContext) (bool, error) {
go replicaset.NewReplicaSetController(
ctx.InformerFactory.Apps().V1().ReplicaSets(), // appsinformers.ReplicaSetInformer
ctx.InformerFactory.Core().V1().Pods(), // coreinformers.PodInformer
).Run(int(ctx.ComponentConfig.ReplicaSetController.ConcurrentRSSyncs), ctx.Stop)
return true, nil
And here is how the ReplicaSetController handles those events:
AddFunc: rsc.enqueueReplicaSet,
UpdateFunc: rsc.updateRS,
DeleteFunc: rsc.enqueueReplicaSet,

AddFunc: rsc.addPod,
UpdateFunc: rsc.updatePod,
DeleteFunc: rsc.deletePod,
Ok, this covers the direction from the API server to the controller.
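The Informer plumbing is, at its heart, the Observer pattern. A minimal sketch of the idea (toy code, not the real SharedInformer API):

```go
package main

import "fmt"

// Toy informer: an observer pattern. Handlers register callbacks and the
// informer fans each event out to all of them. The real k8s SharedInformer
// adds caching, resync and threading on top of this idea.
type Event struct {
	Kind string // "add", "update", "delete"
	Name string
}

type Informer struct {
	handlers []func(Event)
}

func (i *Informer) AddEventHandler(h func(Event)) {
	i.handlers = append(i.handlers, h)
}

func (i *Informer) Notify(e Event) {
	for _, h := range i.handlers {
		h(e)
	}
}

func main() {
	inf := &Informer{}
	inf.AddEventHandler(func(e Event) {
		fmt.Printf("controller saw %s of %s\n", e.Kind, e.Name)
	})
	inf.Notify(Event{Kind: "add", Name: "pod-1"})
}
```

The AddFunc/UpdateFunc/DeleteFunc trio shown above plays the role of the registered handler here.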

But we are still missing one thing.

Workqueue and worker

After being notified of a relevant event, a controller pushes the event onto a work queue; meanwhile, a (poor) worker sits in a dead loop, checking the queue and processing the events.

Cached & Shared Informer

We know that etcd provides APIs to list and watch particular resources, and each resource in k8s has its dedicated location. With that, we have everything needed to implement an informer for a controller. However, there are two aspects we can optimize. First, instead of relaying everything to etcd, we can cache the information/events in the apiserver for better performance; second, since different controllers care about the same set of information, it makes sense for those controllers to share an informer.
With that in mind, here is how a ReplicaSetInformer is currently created.

Controller Manager

kube-controller-manager is a daemon that bundles together all the built-in controllers for k8s. It provides a central place to register, initialize, and start the controllers.


We went through what a controller is and how it interacts with the API server to do its job.

by Unknown at July 07, 2018 06:12

July 07, 2018

Bin Chen

Understand kubernetes 4 : Scheduler

The best-known job of a container orchestrator is to "Assign Pods to Nodes", so-called scheduling. If all the Pods and Nodes were the same, it would be a trivial problem to solve: a round-robin policy would do the job. In practice, however, Pods have different resource requirements, and, less obviously, the nodes may have different capabilities - think machines purchased 5 years ago versus brand new ones.

An Analogy: Rent a house

Say you want to rent a house, and you tell the agent that any house with 2 bedrooms and 2 bathrooms is fine. However, you don't want a house with a swimming pool, since you would rather go to the beach and don't want to pay for something you won't use.
That analogy covers the main concepts/jobs of the k8s scheduler.
  • You/Tenant: have some requirements (rooms)
  • Agent: k8s scheduler
  • Houses(owned by Landlords): The nodes.
You tell the Agent your must-have, definite no-no, and nice-to-have requirements.
The Agent's job is to find you a house that matches your requirements and anti-requirements.
The owner can also reject an application based on his preferences (say, no pets).

Requirements for Pod scheduler

Let's see some practical requirements when placing a Pod on a Node.
1 Run Pods on a specific type of Node: e.g. run this Pod on Ubuntu 17.10 only.
2 Run Pods of different services on the same Node: e.g. place the webserver and memcache on the same Node.
3 Spread Pods of a service across different Nodes: e.g. place the webservers on nodes in different zones for fault tolerance.
4 Make the best use of resources: e.g. run as many jobs as possible, but be able to preempt the low-priority ones.
In k8s world,
1, 2 can be resolved using Affinity
3 can be resolved using Anti-Affinity
4 can be resolved using Taint and Toleration and Priority and Preemption
Before talking about those scheduler policies, we first need a way to identify the Nodes. Without identification, the scheduler can do nothing more than allocate using only the capacity information of the nodes.

Label the Nodes

Nothing fancy. Nodes are labeled.
You can add any label you want but there are predefined common labels, including
  • hostname
  • os/arch/instance-type
  • zone/region
The first can be used to identify a single node, the second a type of node, and the last is for geolocation-related fault tolerance or scalability.


Affinity

There are two types of Affinity: Node Affinity and Pod Affinity. The first indicates an affinity to a type of Node and can be used to achieve the 1st requirement; the latter indicates an affinity to Nodes already running a certain type of Pod and can be used to achieve the 2nd requirement.
The affinity can be soft or hard, meaning nice-to-have and must-have respectively.
Reverse the logic of Affinity and it becomes Anti-Affinity, meaning a Pod doesn't want to be on a Node with a certain feature. Requirement 3 can be implemented as "a Pod doesn't want to be on a Node already running the same Pod (identified using a Pod label)".
Side note: you might know that in Linux a process can set its CPU affinity, that is, which CPU core it prefers to run on. That resembles the problem of placing a Pod on a specific (type of) Node, as does the cpuset controller in cgroups.
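Hard and soft affinity can be modelled as label matching plus scoring. A toy sketch (made-up function names, not the k8s scheduler's actual predicates and priorities):

```go
package main

import "fmt"

// Toy node-affinity check: hard requirements must all match the node's
// labels; soft preferences merely add to a score used to rank candidates.
func matches(nodeLabels, required map[string]string) bool {
	for k, v := range required {
		if nodeLabels[k] != v {
			return false // one failed hard requirement rules the node out
		}
	}
	return true
}

func score(nodeLabels, preferred map[string]string) int {
	s := 0
	for k, v := range preferred {
		if nodeLabels[k] == v {
			s++ // each satisfied preference makes the node more attractive
		}
	}
	return s
}

func main() {
	node := map[string]string{"os": "ubuntu-17.10", "zone": "ap-1"}
	hard := map[string]string{"os": "ubuntu-17.10"}
	soft := map[string]string{"zone": "eu-1"}
	fmt.Println(matches(node, hard), score(node, soft))
}
```

Anti-affinity is the same machinery with the test inverted: a match counts against the node instead of for it.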

Taint and Toleration

The landlord tells the Agent that he only wants to rent the house to a programmer (for whatever reason). So unless a renter identifies himself as a programmer, the agent won't submit his application to the landlord.
Similarly, a node can advertise some special requirement (called a Taint) and use it to repel a set of Pods. Unless a Pod can tolerate the taint, it will not be placed on that Node.
I found the concept of Taints and Tolerations a little bit twisted, since Taint sounds like a bad thing, an unreasonable requirement/restriction that a Pod has to tolerate. It is more like a landlord requiring half a year's rent up front: only those who will tolerate this are able to apply.
One thing to remember is that a Taint is an attribute of a Node and gives the Node a voice for its preferences, unlike Affinity, which is how a Pod shows its preference for Nodes.
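The repel-unless-tolerated rule is easy to sketch (toy code; real taints have keys, values and effects such as NoSchedule):

```go
package main

import "fmt"

// Toy taint/toleration check: a node is a candidate for a Pod only if the
// Pod tolerates every taint on the node.
func tolerated(nodeTaints []string, podTolerations map[string]bool) bool {
	for _, t := range nodeTaints {
		if !podTolerations[t] {
			return false // an untolerated taint repels the Pod
		}
	}
	return true
}

func main() {
	taints := []string{"dedicated=gpu"}
	fmt.Println(tolerated(taints, map[string]bool{"dedicated=gpu": true})) // tolerates
	fmt.Println(tolerated(taints, nil))                                    // repelled
}
```

Note the direction: the node states the taint, and the Pod must opt in with a matching toleration, the opposite of affinity, where the Pod states the preference.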

Priority and Preemption

Maximising resource utilization is important, though it can be overlooked, as most people don't have the experience of managing thousands of servers. As pointed out in section 5 of the Borg paper, from which k8s takes its inspiration:
One of Borg’s primary goals is to make efficient use of Google’s fleet of machines, which represents a significant financial investment: increasing utilization by a few percentage points can save millions of dollars.
How to increase utilization? That could mean many things, such as: scheduling jobs fast, optimizing the Pod allocation so that more jobs can be accommodated, and, last but not least, being able to preempt low-priority jobs with high-priority ones.
The last one just makes sense for a machine: doing something is always better than sitting idle, but when a more important job comes along, the running one gets preempted.
An implication of the possibility of being preempted is that we have to spend a minute thinking about the effect on the Pod/service that may be evicted. Does it matter? How does it gracefully terminate itself?

Make it real

To make things more real, take a look at this sample toy scheduler, which will bind a Pod to the cheapest Node, as long as that Node can "fit" the resource requirements of the Pod.
Here are a few takeaways:
  1. You can roll your own scheduler.
  2. You can have more than one scheduler in the system. Each scheduler looks after a particular set/type of Pods and schedules them. (It doesn't make sense to have multiple schedulers trying to schedule the same set of Pods - there would be racing.)
  3. The scheduler always talks to the API server as a client. It asks the API server for unscheduled Pods, schedules them using a defined policy, and posts the scheduling results (i.e. Pod/Node bindings) to the API server.
[Sequence: scheduler → api server: get me unscheduled Pods; get me Node info/status/capacity; scheduler: schedule according to a predefined policy; scheduler → api server: post binding result; api server: post binding OK events]
You can find default scheduler here.
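The toy scheduler's policy - bind to the cheapest Node that fits - boils down to a filter step and a ranking step. A sketch under those assumptions (the `Node` fields here are invented for illustration):

```go
package main

import "fmt"

// Toy version of a "cheapest fitting node" policy: filter the nodes that
// have enough free capacity, then pick the one with the lowest price.
type Node struct {
	Name    string
	FreeCPU int // free capacity in millicores
	Price   int // cost per hour, arbitrary units
}

func cheapestFit(nodes []Node, cpuRequest int) (string, bool) {
	best := -1
	for i, n := range nodes {
		if n.FreeCPU < cpuRequest {
			continue // filter step: the Pod does not fit
		}
		if best == -1 || n.Price < nodes[best].Price {
			best = i // ranking step: keep the cheapest candidate
		}
	}
	if best == -1 {
		return "", false // no node fits; the Pod stays pending
	}
	return nodes[best].Name, true
}

func main() {
	nodes := []Node{
		{"node-a", 500, 10},
		{"node-b", 2000, 3},
		{"node-c", 2000, 7},
	}
	name, ok := cheapestFit(nodes, 1000)
	fmt.Println(name, ok)
}
```

The default k8s scheduler has the same two-phase shape, with many predicates (fit) and priorities (rank) instead of one of each.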


We went over the requirements of a Pod scheduler and the ways to achieve those requirements in k8s.

by Unknown at July 07, 2018 04:47

June 30, 2018

Bin Chen

Understand Kubernetes 3 : etcd

In the last article, we said there was a state store in the master node; in practice, it is implemented using etcd. etcd is an open source distributed key-value store (from CoreOS) using the Raft consensus algorithm. You can find a good introduction to etcd here. k8s uses etcd to store all the cluster information, and it is the only stateful component in the whole of k8s (not counting the stateful components of the application itself).
Notably, it stores the following information:
  • Resource object/spec submitted by the user
  • The scheduling results from the master node
  • Current status of work nodes and Pods

etcd is critical

The stability and responsiveness of etcd are critical to the stability and performance of the whole cluster. Here is an excellent blog from OpenAI sharing that their etcd system, hindered by 1) high disk latency due to the cloud backend and 2) high network IO load incurred by the monitoring system, was one of the biggest issues they encountered when scaling to 2500 nodes.
For a production system, we will set up a separate etcd cluster and connect the k8s master to it. The master will store the requests in etcd, the controllers/schedulers will update the results, and the work nodes will watch the relevant state changes through the master and take action accordingly, e.g. start a container on themselves.
It looks like this diagram:

usage of etcd in k8s

etcd is set up separately, but it has to be set up first so that the node IPs (and TLS info) of the etcd cluster can be passed to the apiserver running on the master nodes. Using that information (etcd-servers and etcd-tls), the apiserver creates an etcd client (or multiple clients) talking to etcd. That is the whole connection between etcd and k8s.

All the components in the api-server use storage.Interface to communicate with the storage. etcd is the only backend implementation at the moment, and it supports two versions of etcd, v2 and v3, the latter being the default.
// storage.Interface (simplified)
type Interface interface {
Create(key string, obj runtime.Object)
The k8s master - to be specific, its apiserver component - acts as a client of etcd, using the etcd client to implement the storage.Interface API, with a little more on top that fits the k8s model.
Let's see two APIs, Create and Watch.
For Create, the value part of the k/v pair is a runtime object, e.g. a Deployment spec; a few more steps (encode, transform) are needed before finally committing it to etcd.
  • Create
Create(key string, obj runtime.Object)
obj -> encoder -> transformer -> clientv3.OpPut(key, v string)
Besides the normal create/get/delete, there is one operation that is very important for a distributed k/v store: watch, which allows you to block waiting on something and be notified when it changes. As a use case, someone can watch a specific location for new Pod creation/deletion and then take the corresponding action.
Kubelet doesn't watch the storage directly; instead, it watches it through the API server.
  • Watch
func (wc *watchChan) startWatching(watchClosedCh chan struct{}) {
wch := wc.watcher.client.Watch(wc.ctx, wc.key, opts...)
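The shape of a watch API - block on a key, get notified on change - can be mimicked with channels (a toy in-memory store, not etcd's actual client API):

```go
package main

import "fmt"

// Toy watch: a store that lets clients block on a channel and be notified
// when a key changes, mimicking the shape of etcd's Watch API.
type Store struct {
	data     map[string]string
	watchers map[string][]chan string
}

func NewStore() *Store {
	return &Store{
		data:     map[string]string{},
		watchers: map[string][]chan string{},
	}
}

// Watch returns a channel that receives every future value of key.
func (s *Store) Watch(key string) <-chan string {
	ch := make(chan string, 1)
	s.watchers[key] = append(s.watchers[key], ch)
	return ch
}

// Put stores the value and notifies every watcher of that key.
func (s *Store) Put(key, value string) {
	s.data[key] = value
	for _, ch := range s.watchers[key] {
		ch <- value
	}
}

func main() {
	s := NewStore()
	ch := s.Watch("/pods/default/pod-1")
	s.Put("/pods/default/pod-1", "Running")
	fmt.Println("observed:", <-ch)
}
```

The real etcd watch additionally carries revisions so a client can resume from where it left off rather than re-listing everything.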

pluggable backend storage

In theory, you should be able to replace etcd with other k/v stores, such as Consul and Zookeeper.
There was a PR to add Consul as a backend, but it was closed (after three years) as "not ready to do this in the near future". Why create a pluggable container runtime but not a pluggable storage backend, which seems to make sense as well? One possible technical reason is that k8s and etcd are already loosely coupled, so it isn't worth the effort to create another layer to make it pluggable.


etcd is the component storing all the state for a k8s cluster. Its availability and performance are vital to the whole of k8s. The apiserver is the only component that talks to etcd, using etcd clients; requests submitted to the apiserver are encoded and transformed before being committed to etcd. Anyone can watch a particular state change, but not directly against etcd; it goes through the apiserver.

by Unknown at June 06, 2018 03:59

June 29, 2018

Neil Williams

Automation & Risk

First of two posts reproducing some existing content for a wider audience due to delays in removing viewing restrictions on the originals. The first is a bit long... Those familiar with LAVA may choose to skip forward to Core elements of automation support.

A summary of this document was presented by Steve McIntyre at Linaro Connect 2018 in Hong Kong. A video of that presentation and the slides created from this document are available online:

Although the content is based on several years of experience with LAVA, the core elements are likely to be transferable to many other validation, CI and QA tasks.

I recognise that this document may be useful to others, so this blog post is under CC BY-SA 3.0: See also

Automation & Risk


Linaro created the LAVA (Linaro Automated Validation Architecture) project in 2010 to automate testing of software using real hardware. Over the seven years of automation in Linaro so far, LAVA has also spread into other labs across the world. Millions of test jobs have been run, across over one hundred different types of devices, ARM, x86 and emulated. Varied primary boot methods have been used alone or in combination, including U-Boot, UEFI, Fastboot, IoT, PXE. The Linaro lab itself has supported over 150 devices, covering more than 40 different device types. Major developments within LAVA include MultiNode and VLAN support. As a result of this data, the LAVA team have identified a series of automated testing failures which can be traced to decisions made during hardware design or firmware development. The hardest part of the development of LAVA has always been integrating new device types, arising from issues with hardware design and firmware implementations. There are a range of issues with automating new hardware and the experience of the LAVA lab and software teams has highlighted areas where decisions at the hardware design stage have delayed deployment of automation or made the task of triage of automation failures much harder than necessary.

This document is a summary of our experience with full background and examples. The aim is to provide background information about why common failures occur, and recommendations on how to design hardware and firmware to reduce problems in the future. We describe some device design features as hard requirements to enable successful automation, and some which are guaranteed to block automation. Specific examples are used, naming particular devices and companies and linking to specific stories. For a generic summary of the data, see Automation and hardware design.

What is LAVA?

LAVA is a continuous integration system for deploying operating systems onto physical and virtual hardware for running tests. Tests can be simple boot testing, bootloader testing and system level testing, although extra hardware may be required for some system tests. Results are tracked over time and data can be exported for further analysis.

LAVA is a collection of participating components in an evolving architecture. LAVA aims to make systematic, automatic and manual quality control more approachable for projects of all sizes.

LAVA is designed for validation during development - testing whether the code that engineers are producing “works”, in whatever sense that means. Depending on context, this could be many things, for example:

  • testing whether changes in the Linux kernel compile and boot
  • testing whether the code produced by gcc is smaller or faster
  • testing whether a kernel scheduler change reduces power consumption for a certain workload etc.

LAVA is good for automated validation. LAVA tests the Linux kernel on a range of supported boards every day. LAVA tests proposed Android changes in gerrit before they are landed, and does the same for other projects like gcc. Linaro runs a central validation lab in Cambridge, containing racks full of computers supplied by Linaro members and the necessary infrastructure to control them (servers, serial console servers, network switches etc.)

LAVA is good for providing developers with the ability to run customised test on a variety of different types of hardware, some of which may be difficult to obtain or integrate. Although LAVA has support for emulation (based on QEMU), LAVA is best at providing test support for real hardware devices.

LAVA is principally aimed at testing changes made by developers across multiple hardware platforms to aid portability and encourage multi-platform development. Systems which are already platform independent or which have been optimised for production may not necessarily be able to be tested in LAVA or may provide no overall gain.

What is LAVA not?

LAVA is designed for Continuous Integration not management of a board farm.

LAVA is not a set of tests - it is infrastructure to enable users to run their own tests. LAVA concentrates on providing a range of deployment methods and a range of boot methods. Once the login is complete, the test consists of whatever scripts the test writer chooses to execute in that environment.

LAVA is not a test lab - it is the software that can be used in a test lab to control test devices.

LAVA is not a complete CI system - it is software that can form part of a CI loop. LAVA supports data extraction to make it easier to produce a frontend which is directly relevant to particular groups of developers.

LAVA is not a build farm - other tools need to be used to prepare binaries which can be passed to the device using LAVA.

LAVA is not a production test environment for hardware - LAVA is focused on developers and may require changes to the device or the software to enable automation. These changes are often unsuitable for production units. LAVA also expects that most devices will remain available for repeated testing rather than testing the software with a changing set of hardware.

The history of automated bootloader testing

Many attempts have been made to automate bootloader testing and the rest of this document covers the issues in detail. However, it is useful to cover some of the history in this introduction, particularly as that relates to ideas like SDMux - the SD card multiplexer which should allow automated testing of bootloaders like U-Boot on devices where the bootloader is deployed to an SD card. The problem of SDMux details the requirements to provide access to SD card filesystems to and from the dispatcher and the device. Requirements include: ethernet, no reliance on USB, removable media, cable connections, unique serial numbers, introspection and interrogation, avoiding feature creep, scalable design, power control, maintained software and mounting holes. Despite many offers of hardware, no suitable hardware has been found and testing of U-Boot on SD cards is not currently possible in automation. The identification of the requirements for a supportable SDMux unit is closely related to these device requirements.

Core elements of automation support


The ability to deploy exactly the same software to the same board(s) and run exactly the same tests many times in a row, getting exactly the same results each time.

For automation to work, all device functions which need to be used in automation must always produce the same results on each device of a specific device type, irrespective of any previous operations on that device, given the same starting hardware configuration.

There is no way to automate a device which behaves unpredictably.


The ability to run a wide range of test jobs, stressing different parts of the overall deployment, with a variety of tests and always getting a Complete test job. There must be no infrastructure failures and there should be limited variability in the time taken to run the test jobs to avoid the need for excessive Timeouts.

The same hardware configuration and infrastructure must always behave in precisely the same way. The same commands and operations to the device must always generate the same behaviour.


The device must support deployment of files and booting of the device without any need for a human to monitor or interact with the process. The need to press buttons is undesirable but can be managed in some cases by using relays. However, every extra layer of complexity reduces the overall reliability of the automation process and the need for buttons should be limited or eliminated wherever possible. If a device uses LEDs to indicate the success or failure of operations, such LEDs must only be indicative. The device must support full control of that process using only commands and operations which do not rely on observation.


All methods used to automate a device must have minimal footprint in terms of load on the workers, complexity of scripting support and infrastructure requirements. This is a complex area and can trivially impact on both reliability and reproducibility as well as making it much more difficult to debug problems which do arise. Admins must also consider the complexity of combining multiple different devices which each require multiple layers of support.

Remote power control

Devices MUST support automated resets either by the removal of all power supplied to the DUT or a full reboot or other reset which clears all previous state of the DUT.

Every boot must reliably start, without interaction, directly from the first application of power without the limitation of needing to press buttons or requiring other interaction. Relays and other arrangements can be used at the cost of increasing the overall complexity of the solution, so should be avoided wherever possible.

Networking support

Ethernet - all devices using ethernet interfaces in LAVA must have a unique MAC address on each interface. The MAC address must be persistent across reboots. No assumptions should be made about fixed IP addresses, address ranges or pre-defined routes. If more than one interface is available, the boot process must be configurable to always use the same interface every time the device is booted. WiFi is not currently supported as a method of deploying files to devices.

Serial console support

LAVA expects to automate devices by interacting with the serial port immediately after power is applied to the device. The bootloader must interact with the serial port. If a serial port is not available on the device, suitable additional hardware must be provided before integration can begin. All messages about the boot process must be visible using the serial port and the serial port should remain usable for the duration of all test jobs on the device.


Devices supporting primary SSH connections have persistent deployments and this has implications, some positive, some negative - depending on your use case.

  • Fixed OS - the operating system (OS) you get is the OS of the device and this must not be changed or upgraded.
  • Package interference - if another user installs a conflicting package, your test can fail.
  • Process interference - another process could restart (or crash) a daemon upon which your test relies, so your test will fail.
  • Contention - another job could obtain a lock on a constrained resource, e.g. dpkg or apt, causing your test to fail.
  • Reusable scripts - scripts and utilities your test leaves behind can be reused by (or can interfere with) subsequent tests.
  • Lack of reproducibility - an artifact from a previous test can make it impossible to rely on the results of a subsequent test, leading to wasted effort with false positives and false negatives.
  • Maintenance - using persistent filesystems in a test action results in the overlay files being left in that filesystem. Depending on the size of the test definition repositories, this could result in an inevitable increase in used storage becoming a problem on the machine hosting the persistent location. Changes made by the test action can also require intermittent maintenance of the persistent location.

Only use persistent deployments when essential and always take great care to avoid interfering with other tests. Users who deliberately or frequently interfere with other tests can have their submit privilege revoked.

The dangers of simplistic testing

Connect and test

Seems simple enough - it doesn’t seem as if you need to deploy a new kernel or rootfs every time, no need to power off or reboot between tests. Just connect and run stuff. After all, you already have a way to manually deploy stuff to the board. The biggest problem with this method is Persistence as above - LAVA keeps the LAVA components separated from each other but tests frequently need to install support which will persist after the test, write files which can interfere with other tests or break the manual deployment in unexpected ways when things go wrong. The second problem within this fallacy is simply the power drain of leaving the devices constantly powered on. In manual testing, you would apply power at the start of your day and power off at the end. In automated testing, these devices would be on all day, every day, because test jobs could be submitted at any time.

ssh instead of serial

This is an over-simplification which will lead to new and unusual bugs and is only a short step on from connect & test with many of the same problems. A core strength of LAVA is demonstrating differences between types of devices by controlling the boot process. By the time the system has booted to the point where sshd is running, many of those differences have been swallowed up in the boot process.

Test everything at the same time

Issues here include:

Breaking the basic scientific method of test one thing at a time

The single system contains multiple components, like the kernel and the rootfs and the bootloader. Each one of those components can fail in ways which can only be picked up when some later component produces a completely misleading and unexpected error message.


Simply deploying the entire system for every single test job wastes inordinate amounts of time when you do finally identify that the problem is a configuration setting in the bootloader or a missing module for the kernel.


The larger the deployment, the more complex the boot and the tests become. Many LAVA devices are prototypes and development boards, not production servers. These devices will fail in unpredictable places from time to time. Testing a kernel build multiple times is much more likely to give you consistent averages for duration, performance and other measurements than if the kernel is only tested as part of a complete system.

Automated recovery - deploying an entire system can go wrong; whether an interrupted copy or a broken build, the consequences can mean that the device simply does not boot any longer.

Every component involved in your test must allow for automated recovery

This means that the boot process must support being interrupted before that component starts to load. With a suitably configured bootloader, it is straightforward to test kernel builds with fully automated recovery on most devices. Deploying a new build of the bootloader itself is much more problematic. Few devices have the necessary management interfaces with support for secondary console access or additional network interfaces which respond very early in boot. It is possible to chainload some bootloaders, allowing the known working bootloader to be preserved.

I already have builds

This may be true, however, automation puts extra demands on what those builds are capable of supporting. When testing manually, there are any number of times when a human will decide that something needs to be entered, tweaked, modified, removed or ignored which the automated system needs to be able to understand. Examples include /etc/resolv.conf and customised tools.

Automation can do everything

It is not possible to automate every test method. Some kinds of tests and some kinds of devices involve critical elements that do not work well with automation. These are not problems in LAVA; these are design limitations of the kind of test and the device itself. Your preferred test plan may be infeasible to automate and some level of compromise will be required.

Users are all admins too

This will come back to bite! However, there are other ways in which this can occur even after administrators have restricted users to limited access. Test jobs (including hacking sessions) have full access to the device as root. Users, therefore, can modify the device during a test job and it depends on the device hardware support and device configuration as to what may happen next. Some devices store bootloader configuration in files which are accessible from userspace after boot. Some devices lack a management interface that can intervene when a device fails to boot. Put these two together and admins can face a situation where a test job has corrupted, overridden or modified the bootloader configuration such that the device no longer boots without intervention. Some operating systems require a debug setting to be enabled before the device will be visible to the automation (e.g. the Android Debug Bridge). It is trivial for a user to mistakenly deploy a default or production system which does not have this modification.


LAVA is aimed at kernel and system development and testing across a wide variety of hardware platforms. By the time the test has got to the level of automating a GUI, there have been multiple layers of abstraction between the hardware, the kernel, the core system and the components being tested. Following the core principle of testing one element at a time, this means that such tests quickly become platform-independent. This reduces the usefulness of the LAVA systems, moving the test into scope for other CI systems which consider all devices as equivalent slaves. The overhead of LAVA can become an unnecessary burden.

CI needs a timely response - it takes time for a LAVA device to be re-deployed with a system which has already been tested. In order to test a component of the system which is independent of the hardware, kernel or core system a lot of time has been consumed before the “test” itself actually begins. LAVA can support testing pre-deployed systems but this severely restricts the usefulness of such devices for actual kernel or hardware testing.

Automation may need to rely on insecure access. Production builds (hardware and software) take steps to prevent systems being released with known login identities or keys, backdoors and other security holes. Automation relies on at least one of these access methods being exposed, typically a way to access the device as the root or admin user. User identities for login must be declared in the submission and be the same across multiple devices of the same type. These access methods must also be exposed consistently and without requiring any manual intervention or confirmation. For example, mobile devices must be deployed with systems which enable debug access which all production builds will need to block.

Automation relies on remote power control - battery powered devices can be a significant problem in this area. On the one hand, testing can be expected to involve tests of battery performance, low power conditions and recharge support. However, testing will also involve broken builds and failed deployments where the only recourse is to hard reset the device by killing power. With a battery in the loop, this becomes very complex, sometimes involving complex electrical bodges to the hardware to allow the battery to be switched out of the circuit. These changes can themselves change the performance of the battery control circuitry. For example, some devices fail to maintain charge in the battery when held in particular states artificially, so the battery gradually discharges despite being connected to mains power. Devices which have no battery can still be a challenge as some are able to draw power over the serial circuitry or USB attachments, again interfering with the ability of the automation to recover the device from being "bricked", i.e. unresponsive to the control methods used by the automation and requiring manual admin intervention.

Automation relies on unique identification - all devices in an automation lab must be uniquely identifiable at all times, in all modes and all active power states. Too many components and devices within labs fail to allow for the problems of scale. Details like serial numbers, MAC addresses, IP addresses and bootloader timeouts must be configurable and persistent once configured.

LAVA is not a complete CI solution - even including the hardware support available from some LAVA instances, there are a lot more tools required outside of LAVA before a CI loop will actually work. The triggers from your development workflow to the build farm (which is not LAVA) and the submission to LAVA from that build farm are completely separate and outside the scope of this documentation. LAVA can help with the extraction of the results into information for the developers but LAVA output is generic and most teams will benefit from some "frontend" which extracts the data from LAVA and generates relevant output for particular development teams.

Features of CI


How often is the loop to be triggered?

Set up some test builds and test jobs and run through a variety of use cases to get an idea of how long it takes to get from the commit hook to the results being available to what will become your frontend.

Investigate where the hardware involved in each stage can be improved and analyse what kind of hardware upgrades may be useful.

Reassess the entire loop design and look at splitting the testing if the loop cannot be optimised to the time limits required by the team. The loop exists to serve the team but the expectations of the team may need to be managed compared to the cost of hardware upgrades or finite time limits.


How many branches, variants, configurations and tests are actually needed?

Scale has a direct impact on the affordability and feasibility of the final loop and frontend. Ensure that the build infrastructure can handle the total number of variants, not just at build time but for storage. Developers will need access to the files which demonstrate a particular bug or regression.

Scale also provides benefits of being able to ignore anomalies.

Identify how many test devices, LAVA instances and Jenkins slaves are needed. (As a hint, start small and design the frontend so that more can be added later.)


The development of a custom interface is not a small task

Capturing the requirements for the interface may involve lengthy discussions across the development team. Where there are irreconcilable differences, a second frontend may become necessary, potentially pulling the same data and presenting it in a radically different manner.

Include discussions on how or whether to push notifications to the development team. Take time to consider the frequency of notification messages and how to limit the content to only the essential data.

Bisect support can flow naturally from a carefully designed loop. Bisect requires that a simple boolean test can be generated, built and executed across a set of commits. If the frontend implements only a single test (for example, does the kernel boot?) then it can be easy to provide bisect support. Tests which produce hundreds of results need to be slimmed down to a single pass/fail criterion for the bisect to work.
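
The reduction described here can be sketched in a few lines. This is a minimal illustration, not LAVA code: the result names are invented, and a real wrapper would fetch its results from the frontend.

```python
def bisect_verdict(results):
    """Collapse many per-test results into the single good/bad answer
    an automated bisect needs: the commit is 'good' only if every
    test passed."""
    return all(results.values())

# A wrapper script for `git bisect run` would end with:
#   sys.exit(0 if bisect_verdict(results) else 1)
# since git bisect run treats exit 0 as "good" and 1-124 as "bad".
example = {"kernel-boots": True, "network-up": False}
verdict = bisect_verdict(example)  # False: this commit is "bad"
```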


This may take the longest of all elements of the final loop

Just what results do the developers actually want and can those results be delivered? There may be requirements to aggregate results across many LAVA instances, with comparisons based on metadata from the original build as well as the LAVA test.

What level of detail is relevant?

Different results for different members of the team or different teams?

Is the data to be summarised and if so, how?


A frontend has the potential to become complex and need long term maintenance and development

Device requirements

At the hardware design stage, there are considerations for the final software relating to how the final hardware is to be tested.


All units of all devices must identify themselves uniquely to the host machine, as distinct from all other devices which may be connected at the same time. This particularly covers serial connections but also any storage devices which are exported, network devices and any other method of connectivity.

Example - the WaRP7 integration has been delayed because the USB mass storage does not export a filesystem with a unique identifier, so when two devices are connected, there is no way to distinguish which filesystem relates to which device.

All unique identifiers must be isolated from the software to be deployed onto the device. The automation framework will rely on these identifiers to distinguish one device from up to a dozen identical devices on the same machine. There must be no method of updating or modifying these identifiers using normal deployment / flashing tools. It must not be possible for test software to corrupt the identifiers which are fundamental to how the device is identified amongst the others on the same machine.

All unique identifiers must be stable across multiple reboots and test jobs. Randomly generated identifiers are never suitable.

If the device uses a single FTDI chip which offers a single UART device, then the unique serial number of that UART will typically be a permanent part of the chip. However, a similar FTDI chip which provides two or more UARTs over the same cable would not have serial numbers programmed into the chip but would require a separate piece of flash or other storage into which those serial numbers can be programmed. If that storage is not designed into the hardware, the device will not be capable of providing the required uniqueness.

Example - the WaRP7 exports two UARTs over a single cable but fails to give unique identifiers to either connection, so connecting a second device disconnects the first device when the new tty device replaces the existing one.

If the device uses one or more physical ethernet connector(s) then the MAC address for each interface must not be generated randomly at boot. Each MAC address needs to be:

  • persistent - each reboot must always use the same MAC address for each interface.
  • unique - every device of this type must use a unique MAC address for each interface.
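
These two rules lend themselves to a simple intake check. The sketch below is illustrative only (the snapshot format is an assumption, not a LAVA interface): record each device's identifier on successive reboots, then verify uniqueness within each snapshot and stability across snapshots.

```python
def check_identifiers(snapshots):
    """snapshots: one {device: identifier} mapping per reboot.
    Returns a list of rule violations: identifiers must be unique
    across devices and identical on every reboot."""
    problems = []
    baseline = snapshots[0]
    for boot, snap in enumerate(snapshots):
        seen = {}
        for dev, ident in snap.items():
            if ident in seen:
                problems.append(f"{dev} and {seen[ident]} share {ident}")
            seen[ident] = dev
            if baseline.get(dev) != ident:
                problems.append(f"{dev} changed identifier on boot {boot}")
    return problems
```

A device generating a random MAC address at boot fails the stability check on the second snapshot; two boards shipping with the same MAC fail the uniqueness check immediately.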

If the device uses fastboot, then the fastboot serial number must be unique so that the device can be uniquely identified and added to the correct container. Additionally, the fastboot serial number must not be modifiable except by the admins.

Example - the initial HiKey 960 integration was delayed because the firmware changed the fastboot serial number to a random value every time the device was rebooted.


Automation requires more than one device to be deployed - the current minimum is five devices. One device is permanently assigned to the staging environment to ensure that future code changes retain the correct support. In the early stages, this device will be assigned to one of the developers to integrate the device into LAVA. The devices will be deployed onto machines which have many other devices already running test jobs. The new device must not interfere with those devices and this makes some of the device requirements stricter than may be expected.

  • The aim of automation is to create a homogeneous test platform using heterogeneous devices and scalable infrastructure.

  • Do not complicate things.

  • Avoid extra customised hardware

    Relays, hardware modifications and mezzanine boards all increase complexity

    Examples - X15 needed two relay connections, the 96boards initially needed a mezzanine board where the design was rushed, causing months of serial disconnection issues.

  • More complexity raises failure risk nonlinearly

    Example - The lack of onboard serial meant that the 96boards devices could not be tested in isolation from the problematic mezzanine board. Numerous 96boards devices were deemed to be broken when the real fault lay with intermittent failures in the mezzanine. Removing and reconnecting a mezzanine had a high risk of damaging the mezzanine or the device. Once 96boards devices moved to direct connection of FTDI cables into the connector formerly used by the mezzanine, serial disconnection problems disappeared. The more custom hardware has to be designed / connected to a device to support automation, the more difficult it is to debug issues within that infrastructure.

  • Avoid unreliable protocols and connections

    Example - WiFi is not a reliable deployment method, especially inside a large lab with lots of competing signals and devices.

  • This document is not demanding enterprise or server grade support in devices.

    However, automation cannot scale with unreliable components.

    Example - HiKey 6220 and the serial mezzanine board caused massively complex problems when scaled up in LKFT.

  • Server support typically includes automation requirements as a subset:

    RAS, performance, efficiency, scalability, reliability, connectivity and uniqueness

  • Automation racks have similar requirements to data centres.

  • Things need to work reliably at scale

Scale issues also affect the infrastructure which supports the devices as well as the required reliability of the instance as a whole. It can be difficult to scale up from initial development to automation at scale. Numerous tools and utilities prove to be uncooperative, unreliable or poorly isolated from other processes. One result can be that the requirements of automation look more like the expectations of server-type hardware than of mobile hardware. The reality at scale is that server-type hardware has already had fixes implemented for scalability issues whereas many mobile devices only get tested as standalone units.

Connectivity and deployment methods

  • All test software is presumed broken until proven otherwise
  • All infrastructure and device integration support must be proven to be stable before tests can be reliable
  • All devices must provide at least one method of replacing the current software with the test software, at a level lower than you're testing.

The simplest method to automate is TFTP over physical ethernet, e.g. U-Boot or UEFI PXE. This also puts the least load on the device and automation hardware when delivering large images.
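
As a sketch, a typical U-Boot TFTP boot sequence looks like the following. The commands, variable names and file names are examples only; the exact sequence varies per board and bootloader build:

```shell
dhcp                                    # obtain an IP address (and often the TFTP server) via DHCP
setenv serverip <dispatcher-ip>         # point at the dispatcher's TFTP server if not set by DHCP
tftpboot ${kernel_addr_r} zImage        # fetch the kernel into RAM
tftpboot ${fdt_addr_r} board.dtb        # fetch the device tree
bootz ${kernel_addr_r} - ${fdt_addr_r}  # boot the freshly fetched kernel
```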

Manually writing software to SD is not suitable for automation. This tends to rule out many proposed methods for testing modified builds or configurations of firmware in automation.

See for more information on how the requirements of automation affect the hardware design requirements to provide access to SD card filesystems to and from the dispatcher and the device.

Some deployment methods require tools which must be constrained within an LXC. These include but are not limited to:

  • fastboot - due to a common need to have different versions installed for different hardware devices

    Example - Every fastboot device suffers from this problem - any running fastboot process will inspect the entire list of USB devices and attempt to connect to each one, locking out any other fastboot process which may be running at the time; the locked-out process sees no devices at all.

  • IoT deployment - some deployment tools require patches for specific devices or use tools which are too complex for use on the dispatcher.

    Example - the TI CC3220 IoT device needs a patched build of OpenOCD, the WaRP7 needs a custom flashing tool compiled from a github repository.

Wherever possible, existing deployment methods and common tools are strongly encouraged. New tools are not likely to be as reliable as the existing tools.

Deployments must not make permanent changes to the boot sequence or configuration.

Testing of OS installers may require modifying the installer so that it does not install an updated bootloader or modify the bootloader configuration. The automation needs to control whether the next reboot boots the newly deployed system or starts the next test job; for example, when a test job has been cancelled, the device needs to be immediately ready to run a different test job.


Automation requires driving the device over serial instead of via a touchscreen or other human interface device. This changes the way that the test is executed and can require the use of specialised software on the device to translate text based commands into graphical inputs.
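
The driving side of this is essentially expect-style automation: wait for a known prompt on the serial stream, then send a text command. A stdlib-only toy sketch, with the prompt and command invented for illustration (real harnesses such as pexpect add timeouts and richer pattern matching):

```python
import io

def send_when_ready(stream, command, prompt=b"# "):
    """Read the serial stream until the shell prompt appears, then
    write the command. `stream` is any binary file-like object; in a
    lab it would wrap a tty device."""
    buf = b""
    while not buf.endswith(prompt):
        ch = stream.read(1)
        if not ch:
            raise EOFError("stream closed before prompt appeared")
        buf += ch
    stream.write(command + b"\n")
    return buf

# Simulated device output standing in for a real serial port:
fake_port = io.BytesIO(b"booting...\nlogin ok\n# ")
banner = send_when_ready(fake_port, b"uname -r")
```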

It is possible to test video output in automation but it is not currently possible to drive automation through video input. This includes BIOS-type firmware interaction. UEFI can be used to automatically execute a bootloader like Grub which does support automation over serial. UEFI implementations which use graphical menus cannot be supported interactively.


The objective is to have automation support which runs test jobs reliably. Reproducible failures are easy to fix but intermittent faults easily consume months of engineering time and need to be designed out wherever possible. Reliable testing means only 3 or 4 test job failures per week due to hardware or infrastructure bugs across an entire test lab (or instance). This can involve thousands of test jobs across multiple devices. Some instances may have dozens of identical devices but they still must not exceed the same failure rate.
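
To put that target into numbers (the weekly job count here is an assumed example, not a LAVA figure):

```python
jobs_per_week = 5000    # assumed lab throughput, for illustration
infra_failures = 4      # the weekly ceiling described in the text
failure_rate = infra_failures / jobs_per_week
# i.e. 0.08% - fewer than one job in a thousand may fail
# due to hardware or infrastructure problems
```
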

All devices need to reach the minimum standard of reliability, or they are not fit for automation. Some of these criteria might seem rigid, but they are not exclusive to servers or enterprise devices. To be useful, mobile and IoT devices need to meet the same standards, even though the software involved and the deployment methods might be different. The reason is that the Continuous Integration strategy remains the same for all devices. The problem is the same, regardless of underlying considerations.

A developer makes a change; that change triggers a build; that build triggers a test; that test reports back to the developer whether that change worked or had unexpected side effects.

  • False positives and false negatives are expensive in terms of wasted engineering time.
  • False positives can arise when not enough of the software is fully tested, or if the testing is not rigorous enough to spot all problems.
  • False negatives arise when the test itself is unreliable, either because of the test software or the test hardware.

This becomes more noticeable when considering automated bisections which are very powerful in tracking the causes of potential bugs before the product gets released. Every test job must give a reliable result or the bisection will not reliably identify the correct change.

Automation and Risk

Linaro kernel functional test framework (LKFT)

We have seen with LKFT that complexity has a non-linear relationship with the reliability of any automation process. This section aims to set out some guidelines and recommendations on just what is acceptable in the tools needed to automate testing on a device. These guidelines are based on our joint lab and software team experiences with a wide variety of hardware and software.

Adding or modifying any tool has a risk of automation failure

Risk increases non-linearly with complexity. Some of this risk can be mitigated by testing the modified code and the complete system.

Dependencies installed count as code in terms of the risks of automation failure

This is a key lesson learnt from our experiences with LAVA V1. We added a remote worker method, which was necessary at the time to improve scalability, but it massively increased the risk of automation failure simply due to the extra complexity that came with the chosen design. These failures did not just show up in the test jobs which actively used the extra features and tools; they caused problems for all jobs running on the system.

The ability in LAVA V2 to use containers for isolation is a key feature

For the majority of use cases, the small extension of the runtime of the test to set up and use a container is negligible. The extra reliability is more than worth the extra cost.

Persistent containers are themselves a risk to automation

Just as with any persistent change to the system.

Pre-installing dependencies in a persistent container does not necessarily lower the overall risk of failure. It merely substitutes one element of risk for another.

All code changes need to be tested

In unit tests and in functional tests. There is a dividing line where if something is installed as a dependency of LAVA, then when that something goes wrong, LAVA engineers will be pressured into fixing the code of that dependency whether or not we have any particular experience of that language, codebase or use case. Moving that code into a container moves that burden but also makes triage of that problem much easier by allowing debug builds / options to be substituted easily.

Complexity also increases the difficulty of debugging, again in a nonlinear fashion

A LAVA dependency needs a higher bar in terms of ease of triage.

Complexity cannot be easily measured

Although there are factors which contribute.


Large programs which appear as a single monolith are harder to debug than the UNIX model of one utility joined with other utilities to perform a wider task. (This applies to LAVA itself as much as any one dependency - again, a lesson from V1.)

Feature creep

Continually adding features beyond the original scope makes complex programs worse. A smaller codebase will tend to be simpler to triage than a large codebase, even if that codebase is not monolithic.

Targeted utilities are less risky than large environments

A program which supports protocol after protocol after protocol will be more difficult to maintain than 3 separate programs for each protocol. This only gets worse when the use case for that program only requires the use of one of the many protocols supported by the program. The fact that the other protocols are supported increases the complexity of the program beyond what the use case actually merits.

Metrics in this area are impossible

The risks are nonlinear, the failures are typically intermittent. Even obtaining or applying metrics takes up huge amounts of engineering time.

Mismatches in expectations

The use case of automation rarely matches up with the more widely tested use case of the upstream developers. We aren't testing the code flows typically tested by the upstream developers, so we find different bugs, raising the level of risk. Generally, the simpler it is to deploy a device in automation, the closer the test flow will be to the developer flow.

Most programs are written for the single developer model

Some very widely used programs are written to scale, but this is difficult to determine without experience of trying to run them at scale.

Some programs do require special consideration

QEMU would fail most of the guidelines above, but there are mitigating factors:

  • Programs which can be easily restricted to well understood use cases lower the risk of failure. Not all use cases of the same program need to be covered.
  • Programs which have excellent community and especially in-house support also lower the risk of failure. (Having QEMU experts in Linaro is a massive boost for having QEMU as a dispatcher dependency.)

Unfamiliar languages increase the difficulty of triage

This may affect dependencies in unexpected ways. A program which has lots of bindings into a range of other languages becomes entangled in transitions and bugs in those other languages. This commonly delays the availability of the latest version which may have a critical fix for one use case but which fails to function at all in what may seem to be an unrelated manner.

The dependency chain of the program itself increases the risk of failure in precisely the same manner as the program

In terms of maintenance, this can include the build dependencies of the program as those affect delivery / availability of LAVA in distributions like Debian.

Adding code to only one dispatcher amongst many increases the risk of failure on the instance as a whole

By having an untested element which is at variance to the rest of the system.

Conditional dependencies increase the risk

Optional components can be supported but only increase the testing burden by extending the matrix of installations.

Presence of the code in Debian main can reduce the risk of failure

This does not outweigh other considerations - there are plenty of packages in Debian (some complex, some not) which would be an unacceptable risk as a dependency of the dispatcher, fastboot for one. A small python utility from github can be a substantially lower risk than a larger program from Debian which has unused functionality.

Sometimes, "complex" simply means "buggy" or "badly designed"

fastboot is not actually a complex piece of code but we have learnt that it does not currently scale. This is a result of the disparity between the development model and the automation use case. Disparities like that actually equate to complexity, in terms of triage and maintenance. If fastboot was more complex at the codebase level, it may actually become a lower risk than currently.

Linaro as a whole does have a clear objective of harmonising the ecosystem

Adding yet another variant of existing support is at odds with the overall objective of the company. Many of the tools required in automation have no direct effect on the distinguishing factors for consumers. Adding another one "just because" is not a good reason to increase the risk of automation failure. Just as with standards.

Having the code on the dispatcher impedes development of that code

Bug fixes will take longer to be applied because the fix needs to go through a distribution or other packaging process managed by the lab admins. Applying a targeted fix inside an LXC is useful for proving that the fix works.

Not all programs can work in an LXC

LAVA also provides ways to test using those programs by deploying the code onto a test device. For example, the V2 support for fastmodels involves deploying the fastmodel inside a LAVA Test Shell on a test device, e.g. x86, mustang or Juno.

Speed of running a test job in LAVA is important for CI

The goal of speed must give way to the requirement for reliability of automation

Resubmitting a test job due to a reliability failure is more harmful to the CI process than letting tests take longer to execute without such failures. Test jobs which run quickly are easier to parallelize by adding more test hardware.

Modifying software on the device

Not all parts of the software stack can be replaced automatically, typically the firmware and/or bootloader will need to be considered carefully. The boot sequence will have important effects on what kind of testing can be done automatically. Automation relies on being able to predict the behaviour of the device, interrupt that default behaviour and then execute the test. For most devices, everything which executes on the device prior to the first point at which the boot sequence can be interrupted can be considered as part of the primary boot software. None of these elements can be safely replaced or modified in automation.

The objective is to deploy the device such that as much of the software stack can be replaced as possible whilst preserving the predictable behaviour of all devices of this type so that the next test job always gets a working, clean device in a known state.

Primary boot software

For many devices, this is the bootloader, e.g. U-Boot, UEFI or fastboot.

Some devices include support for a baseboard management controller (BMC) which allows the bootloader and other firmware to be updated even if the device is bricked. The BMC software itself must then be considered the primary boot software; it cannot be safely replaced.

All testing of the primary boot software will need to be done by developers using local devices. SDMux was an idea which only fitted one specific set of hardware; the problem of testing the primary boot software is a hydra. Adding customised hardware to try to sidestep the primary boot software always increases the complexity and failure rates of the devices.

It is possible to divide the pool of devices into some which only ever use known versions of the primary boot software controlled by admins and other devices which support modifying the primary boot software. However, this causes extra work when processing the results, submitting the test jobs and administering the devices.

A secondary problem here is that it is increasingly common for the methods of updating this software to be esoteric, hacky, restricted and even proprietary.

  • Click-through licences to obtain the tools

  • Greedy tools which hog everything in /dev/bus/usb

  • NIH tools which are almost the same as existing tools but add vendor-specific "functionality"

  • GUI tools

  • Changing jumpers or DIP switches,

    Often in inaccessible locations which require removal of other ancillary hardware

  • Random, untrusted, compiled vendor software running as root

  • The need to press and hold buttons and watch for changes in LED status.

We've seen all of these - in various combinations - just in 2017, as methods of getting devices into a mode where the primary boot software can be updated.

Copyright 2018 Neil Williams

Available under CC BY-SA 3.0:

by Neil Williams at June 06, 2018 14:19

June 18, 2018

Senthil Kumaran

lava-dispatcher docker images - part 1

Introduction, Details and Preparation

Linaro Automated Validation Architecture a.k.a LAVA project has released official docker images for lava-dispatcher only containers. This blog post series explains how to use these images in order to run independent LAVA workers with devices attached to them. The blog post series is split into three parts as follows:

  1. lava-dispatcher docker images - part 1 - Introduction, Details and Preparation
  2. lava-dispatcher docker images - part 2 - Docker based LAVA Worker running pure LXC job
  3. lava-dispatcher docker images - part 3 - Docker based LAVA Worker running Nexus 4 job with and without LXC Protocol

Before getting into the details of running these images, let us see how they are organized and what packages are available via them.

The lava-dispatcher only docker images will be officially supported by the LAVA project team and there will be regular releases of these images whenever there are updates or new releases. As of this writing there are two images released - production and staging. These docker images are based on Debian Stretch operating system, which is the recommended operating system for installing LAVA.

lava-dispatcher production docker images

The production docker image of lava-dispatcher is based on the official production-repo of the LAVA project. The production-repo holds the latest stable packages released by the LAVA team for each of the LAVA components. The production docker image is available in the following link:

Whenever there is a production release from the LAVA project, a corresponding image will be created with a matching tag name. The latest tag as of this writing is 2018.5-3. In order to know what these production docker images are built with, have a look at the DockerFile in

lava-dispatcher staging docker images

The staging docker image of lava-dispatcher is based on the official staging-repo of the LAVA project. The staging-repo holds the latest packages built every day by the LAVA team for each of the LAVA components, and is also a source for bleeding edge unreleased software. The staging docker image, which is rebuilt daily, is available in the following link:

Whenever there is a successful daily build of staging packages available, a docker image will be made available with the tag name 'latest'. Hence, at any point of time there will be only one tag, i.e. 'latest', in the staging docker image location. In order to know what these staging docker images are built with, have a look at the DockerFile in


Unlike regular installations of LAVA workers, installations via the above docker images use a package called lava-lxc-mocker instead of the lxc Debian package. lava-lxc-mocker is a pseudo implementation of lxc which mocks the lxc commands without actually running them on the machine, while providing the same output as the original lxc commands. The package exists to provide a (pseudo) alternative to lxc and to avoid the overhead of running nested containers, which simplifies things without losing the ability to run, unmodified, LAVA job definitions that have the LXC protocol defined.
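
The substitution idea can be illustrated with a toy sketch. This is not lava-lxc-mocker's actual code; the function and behaviour are invented to show the principle that callers see the same interface and exit status whether the real tool or the mock is installed:

```python
import subprocess

def lxc_start(name, use_mocker=True):
    """Dispatch to either the real lxc tool or a mock. The mock
    mirrors the real command's success exit status (0) without
    creating or starting any container, so job definitions using
    the LXC protocol run unmodified."""
    if use_mocker:
        return 0  # pretend success, as a mocker does
    return subprocess.call(["lxc-start", "-n", name])
```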

Having seen the details of the lava-dispatcher only docker images, let us now see three different use cases where jobs are run within a docker container, with and without using the LXC protocol, on an attached device such as a Nexus 4 phone.

In demonstrating all these use cases we will use the lava-dispatcher only staging docker images. We will use an existing instance as the LAVA master to which the docker based LAVA worker will connect. It is an encrypted LAVA instance which accepts connections only from authenticated LAVA workers. Read more about how to configure encrypted communication between LAVA master and LAVA worker in the documentation. The following is a preparation step to connect the docker based LAVA slave to the encrypted LAVA master instance.

Creating slave certificate

We will name the docker based LAVA worker 'docker-slave'. Let us create a slave certificate that can be shared with the LAVA master. On an existing LAVA worker, issue the following command to create a slave certificate:

stylesen@hanshu:~$ sudo /usr/share/lava-dispatcher/ \
Creating the certificate in /etc/lava-dispatcher/certificates.d
 - docker-slave-1.key
 - docker-slave-1.key_secret

We can see the certificates were created successfully in /etc/lava-dispatcher/certificates.d. Copy the public component of the above slave certificate to the master instance, as shown below:

stylesen@hanshu:~$ scp /etc/lava-dispatcher/certificates.d/docker-slave-1.key \

docker-slave-1.key                            100%  364     1.4KB/s   00:00   

Then log in to the master to do the actual copy as follows (since we need sudo rights to copy directly, this is done in two steps):

stylesen@hanshu:~$ ssh
stylesen@codehelp:~$ sudo mv /tmp/docker-slave-1.key /etc/lava-dispatcher/certificates.d/
[sudo] password for stylesen:
stylesen@codehelp:~$ sudo ls -alh /etc/lava-dispatcher/certificates.d/docker-slave-1.key
-rw-r--r-- 1 stylesen stylesen 364 Jun 18 00:05 /etc/lava-dispatcher/certificates.d/docker-slave-1.key

Now we have the slave certificate copied to the appropriate location on the LAVA master. For convenience, on the host machine from which we will start the docker based LAVA worker, copy the slave certificates to a specific directory as shown below:

stylesen@hanshu:~$ mkdir docker-slave-files
stylesen@hanshu:~$ cd docker-slave-files/
stylesen@hanshu:~/docker-slave-files$ cp /etc/lava-dispatcher/certificates.d/docker-slave-1.key* .

Similarly, copy the master certificate's public component to the above folder, in order to enable communication.

stylesen@hanshu:~/docker-slave-files$ scp \ .

master.key                                    100%  364     1.4KB/s   00:00   
stylesen@hanshu:~/docker-slave-files$ ls -alh
total 20K
drwxr-xr-x  2 stylesen stylesen 4.0K Jun 18 05:48 .
drwxr-xr-x 17 stylesen stylesen 4.0K Jun 18 05:45 ..
-rw-r--r--  1 stylesen stylesen  364 Jun 18 05:45 docker-slave-1.key
-rw-r--r--  1 stylesen stylesen  313 Jun 18 05:45 docker-slave-1.key_secret
-rw-r--r--  1 stylesen stylesen  364 Jun 18 05:48 master.key

We are all set with the required files to start and run our docker based LAVA workers.

... Continue Reading Part 2

by stylesen at June 06, 2018 02:30

lava-dispatcher docker images - part 2

This is part 2 of the three part blog post series on lava-dispatcher only docker images. If you haven't read part 1 already, read it first.

Docker based LAVA Worker running pure LXC job

This is the first use case, in which we will look at starting a docker based LAVA worker and running a job that requests an LXC device type. The following command is used to start a docker based LAVA worker:

stylesen@hanshu:~$ sudo docker run \
-v /home/stylesen/docker-slave-files:/fileshare \
-v /var/run/docker.sock:/var/run/docker.sock -itd \
-e HOSTNAME='docker-slave-1' -e MASTER='tcp://' \
-e SOCKET_ADDR='tcp://' -e LOG_LEVEL='DEBUG' \
-e ENCRYPT=1 -e MASTER_CERT='/fileshare/master.key' \
-e SLAVE_CERT='/fileshare/docker-slave-1.key_secret' -p 2222:22 \
--name ld-latest linaro/lava-dispatcher-staging-stretch-amd64:latest

Unable to find image 'linaro/lava-dispatcher-staging-stretch-amd64:latest' locally
latest: Pulling from linaro/lava-dispatcher-staging-stretch-amd64
cc1a78bfd46b: Pull complete
5ddb65a5b8b4: Pull complete
41d8dcd3278b: Pull complete
071cc3e7e971: Pull complete
39bedb7bda2f: Pull complete
Digest: sha256:1bc7c7b2bee09beda4a6bd31a2953ae80847c706e8500495f6d0667f38fe0c9c
Status: Downloaded newer image for linaro/lava-dispatcher-staging-stretch-amd64:latest

Let's have a closer look at the 'docker run' command above and see what the options do:

'-v /home/stylesen/docker-slave-files:/fileshare' - mounts the directory /home/stylesen/docker-slave-files from the host machine inside the docker container at the location /fileshare. This location is used to exchange files between the host and the container.

'-v /var/run/docker.sock:/var/run/docker.sock' - similarly, the docker socket file is exposed within the container. This is optional and may be required for advanced job runs and use cases.

For options such as '-itd', '-p' and '--name', refer to the docker run documentation to know what these options do.

'-e' - this option sets environment variables inside the docker container being run. The following environment variables are set in the above command line; they are consumed by the entrypoint script inside the container, which starts the lava-slave daemon based on their values.

  1. HOSTNAME - Name of the slave
  2. MASTER - Main master socket
  3. SOCKET_ADDR - Log socket
  4. LOG_LEVEL - Log level, defaults to INFO
  5. ENCRYPT - Encrypt messages
  6. MASTER_CERT - Master certificate file
  7. SLAVE_CERT - Slave certificate file
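As a rough sketch of how an entrypoint script could turn these variables into a lava-slave invocation (the real script shipped in the image may differ; the flag names and the master addresses below are assumptions):

```shell
#!/bin/bash
# Hypothetical sketch: build a lava-slave command line from the container
# environment. The actual entrypoint in the image may differ.
build_slave_cmd() {
  local cmd="lava-slave --hostname ${HOSTNAME} --master ${MASTER}"
  cmd="${cmd} --socket-addr ${SOCKET_ADDR} --level ${LOG_LEVEL:-INFO}"
  if [ "${ENCRYPT}" = "1" ]; then
    cmd="${cmd} --encrypt --master-cert ${MASTER_CERT} --slave-cert ${SLAVE_CERT}"
  fi
  echo "${cmd}"
}

HOSTNAME='docker-slave-1'
MASTER='tcp://master.example:5556'        # assumption: master address
SOCKET_ADDR='tcp://master.example:5555'   # assumption: log socket address
LOG_LEVEL='DEBUG'
ENCRYPT=1
MASTER_CERT='/fileshare/master.key'
SLAVE_CERT='/fileshare/docker-slave-1.key_secret'
build_slave_cmd
```

Note how ENCRYPT gates the certificate flags, which is why the MASTER_CERT and SLAVE_CERT variables only matter when encryption is enabled.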

We can see the docker based LAVA worker is started and running,

stylesen@hanshu:~$ sudo docker ps -a
CONTAINER ID        IMAGE                                               \
  COMMAND             CREATED              STATUS              PORTS    \

522f07964981        linaro/lava-dispatcher-staging-stretch-amd64:latest \
  "/"    About a minute ago   Up 58 seconds       \>22/tcp   ld-latest


If everything goes fine, we can see the LAVA master receiving ping messages from the above LAVA worker as seen below on the LAVA master logs:

stylesen@codehelp:~$ sudo tail -f /var/log/lava-server/lava-master.log
2018-06-18 00:24:30,878    INFO docker-slave-1 => HELLO
2018-06-18 00:24:30,878 WARNING New dispatcher <docker-slave-1>
2018-06-18 00:24:34,069   DEBUG lava-logs => PING(20)
2018-06-18 00:24:36,138   DEBUG docker-slave-1 => PING(20)

The worker will also get listed in the web UI. The docker based LAVA worker host docker-slave-1 is up and running. Let us add an LXC device to this worker, on which we will run our LXC protocol based job. The name of the LXC device we will add to docker-slave-1 is 'lxc-docker-slave-01'. Create a jinja2 template file for lxc-docker-slave-01 and copy it to /etc/lava-server/dispatcher-config/devices/ on the LAVA master instance:

stylesen@codehelp:~$ cat \

{% extends 'lxc.jinja2' %}
{% set exclusive = 'True' %}
stylesen@codehelp:~$ ls -alh \

-rw-r--r-- 1 lavaserver lavaserver 56 Jun 18 00:36 \


In order to add the above device lxc-docker-slave-01 to the LAVA master database and associate it with our docker based LAVA worker docker-slave-1, login to the LAVA master instance and issue the following command:

stylesen@codehelp:~$ sudo lava-server manage devices add \
--device-type lxc --worker docker-slave-1 lxc-docker-slave-01


The device will now be listed as part of the worker docker-slave-1 in the web UI.

The LXC job we will submit to the above device is a normal LXC job requesting an LXC device type; it runs a basic smoke test in a Debian based LXC container.

stylesen@harshu:/tmp$ lavacli -i lava.codehelp jobs submit lxc.yaml 

NOTE: lavacli is the official command line tool for interacting with LAVA instances. Read more about lavacli in its documentation.

Thus job 2486 was submitted successfully to the LAVA instance and ran successfully. This job used lava-lxc-mocker instead of lxc, as can be seen from the job log.

Read part 1 ... Continue Reading part 3

Read all parts of this blog post series from below links:

  1. lava-dispatcher docker images - part 1 - Introduction, Details and Preparation
  2. lava-dispatcher docker images - part 2 - Docker based LAVA Worker running pure LXC job
  3. lava-dispatcher docker images - part 3 - Docker based LAVA Worker running Nexus 4 job with and without LXC Protocol

by stylesen at June 06, 2018 02:30

lava-dispatcher docker images - part 3

This is part 3 of the three part blog post series on lava-dispatcher only docker images. If you haven't read part 2 already, read it first.

Docker based LAVA Worker running Nexus 4 job with LXC protocol

This is the second use case, in which we will look at starting a docker based LAVA worker and running a job that requests a Nexus 4 device type with the LXC protocol. The following command is used to start a docker based LAVA worker:

stylesen@hanshu:~$ sudo docker run \
-v /home/stylesen/docker-slave-files:/fileshare \
-v /var/run/docker.sock:/var/run/docker.sock -v /dev:/dev -itd --privileged \
-e HOSTNAME='docker-slave-1' -e MASTER='tcp://' \
-e SOCKET_ADDR='tcp://' -e LOG_LEVEL='DEBUG' \
-e ENCRYPT=1 -e MASTER_CERT='/fileshare/master.key' \
-e SLAVE_CERT='/fileshare/docker-slave-1.key_secret' -p 2222:22 \
--name ld-latest linaro/lava-dispatcher-staging-stretch-amd64:latest


There is not much difference between the above command and what we used in use case one, except for a couple of new options.

'-v /dev:/dev' - mounts the host machine's /dev directory inside the docker container at the location /dev. This is required when we deal with actual (physical) devices, in order to access them from within the docker container.

'--privileged' - this option is required to allow seamless passthrough and device access from within the container.

Once we have the docker based LAVA worker up and running with the new options in place, we can add a new nexus4 device to it. The name of the nexus4 device we will add to docker-slave-1 is 'nexus4-docker-slave-01'. Create a jinja2 template file for nexus4-docker-slave-01 and copy it to /etc/lava-server/dispatcher-config/devices/ on the LAVA master instance,

stylesen@codehelp:~$ sudo cat \

{% extends 'nexus4.jinja2' %}
{% set adb_serial_number = '04f228d1d9c76f39' %}
{% set fastboot_serial_number = '04f228d1d9c76f39' %}
{% set device_info = [{'board_id': '04f228d1d9c76f39'}] %}
{% set fastboot_options = ['-u'] %}
{% set flash_cmds_order = ['update', 'ptable', 'partition', 'cache', \
'userdata', 'system', 'vendor'] %}

{% set exclusive = 'True' %}
stylesen@codehelp:~$ sudo ls -alh \

-rw-r--r-- 1 lavaserver lavaserver 361 Jun 18 01:32 \


In order to add the above device nexus4-docker-slave-01 to the LAVA master database and associate it with our docker based LAVA worker docker-slave-1, login to the LAVA master instance and issue the following command:

stylesen@codehelp:~$ sudo lava-server manage devices add \
--device-type nexus4 --worker docker-slave-1 nexus4-docker-slave-01


The device will now be listed as part of the worker docker-slave-1 in the web UI.

The job definition we will submit to the above device is a normal job requesting a Nexus 4 device type; it runs a simple test on the device using the LXC protocol.

stylesen@harshu:/tmp$ lavacli -i lava.codehelp jobs submit nexus4.yaml 

Thus job 2491 was submitted successfully to the LAVA instance and ran successfully.

Docker based LAVA Worker running Nexus 4 job without LXC protocol

This is the third use case, in which we will look at starting a docker based LAVA worker and running a job that requests a Nexus 4 device type without the LXC protocol. The following command, exactly the same as in use case two, is used to start a docker based LAVA worker:

stylesen@hanshu:~$ sudo docker run \
-v /home/stylesen/docker-slave-files:/fileshare \
-v /var/run/docker.sock:/var/run/docker.sock -v /dev:/dev -itd --privileged \
-e HOSTNAME='docker-slave-1' -e MASTER='tcp://' \
-e SOCKET_ADDR='tcp://' -e LOG_LEVEL='DEBUG' \
-e ENCRYPT=1 -e MASTER_CERT='/fileshare/master.key' \
-e SLAVE_CERT='/fileshare/docker-slave-1.key_secret' -p 2222:22 \
--name ld-latest linaro/lava-dispatcher-staging-stretch-amd64:latest


We will use the same device added for use case two, i.e., 'nexus4-docker-slave-01', to execute this job.

The job we will submit to the above device is a normal job requesting a Nexus 4 device type; it runs a simple test on the device without invoking the LXC protocol.

stylesen@harshu:/tmp$ lavacli -i lava.codehelp jobs submit nexus4-minus-lxc.yaml 

Thus job 2492 was submitted successfully to the LAVA instance and ran successfully.

Hope this blog series helps you get started with lava-dispatcher only docker images and run your own docker based LAVA workers. If you have any doubts, questions or comments, feel free to email the LAVA team at lava-users [@] lists [dot] linaro [dot] org

Read part 2 ...

Read all parts of this blog post series from below links:

  1. lava-dispatcher docker images - part 1 - Introduction, Details and Preparation
  2. lava-dispatcher docker images - part 2 - Docker based LAVA Worker running pure LXC job
  3. lava-dispatcher docker images - part 3 - Docker based LAVA Worker running Nexus 4 job with and without LXC Protocol

by stylesen at June 06, 2018 02:30

June 17, 2018

Bin Chen

Understand Kubernetes 2: Operation Model

In the last article, we focused on the components in the work nodes. In this one, we'll switch our focus to the user and the components in the master node.

Operation Model

From the user's perspective, the model is quite simple: the user declares a State he wants the system to be in, and it is then k8s's job to achieve it.
The user sends Resources and Operations to k8s using the REST API, which is served by the API server inside the master node; the request is put into a state store (implemented using etcd). Depending on the type of resource, different Controllers are delegated to do the job.
The exact Operations available depend on the Resource type, but in most cases they amount to CRUD. For the create operation, there is a Specification defining the attributes of the resource to be created.
Here are a few examples:
  • create a Pod, according to a Pod spec.
  • create a Deployment called mypetstore, according to a Deployment spec.
  • update the mypetstore deployment with a new container image.
Each Resource (also called an Object) has three pieces of information: Spec, Status and Metadata, all of which are saved in the state store.
  • Spec is specified by the user on resource creation and update; it is the desired state of the resource.
  • Status is updated by the k8s system and queried by the user; it is the actual state of the resource.
  • Metadata is partly specified by the user and can be updated by the k8s system; it holds the labels of the resource.
The class diagram looks like this:
(Class diagram: a Resource holds a Spec created by the user, a Status updated by the k8s system, and Metadata that may be updated by both. A Controller creates, updates, deletes and queries Resources; the User defines the Spec and provides the Metadata.)

Sequence Diagram:

Let's see what really happens when you type kubectl create -f deployment/spec.yaml:
(Sequence diagram: the user runs 'kubectl create spec.yaml'; kubectl turns it into a REST call and POSTs to the API server; the API server saves the spec in the state store and replies asynchronously; the controller, unblocked by the new state, does the work to achieve it and writes back new information such as pod-to-node bindings; the work nodes, unblocked in turn, do their part to achieve the state.)


A k8s cluster is managed and accessed through a predefined API. kubectl is a client of that API: it converts the shell command into a REST call, as shown in the above sequence diagram.
You can build your own tools using those APIs, to add functionality that is currently not available. Since the API is versioned and stable, your tools stay portable.
Portability and extensibility are the most important benefits k8s brings. In other words, k8s is awesome not only because it does awesome things itself, but because it enables you and others to build awesome things on top of it.
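As a small illustration of that API, here is roughly how a resource name maps to a versioned REST path; the path layout follows the standard group/version/namespace convention, while the server address is a made-up placeholder:

```shell
#!/bin/bash
# Sketch: how kubectl-style operations map onto versioned REST paths.
api_path() {
  # $1 = group/version, $2 = namespace, $3 = resource kind (plural), $4 = name
  echo "/apis/${1}/namespaces/${2}/${3}/${4}"
}

server="https://k8s-master.example:6443"   # assumption: your API server address
path=$(api_path apps/v1 default deployments mypetstore)
echo "GET ${server}${path}"
```

Because the group and version are part of the path, a tool built against apps/v1 keeps working even as newer API versions are introduced.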


A Controller's job is to make sure the actual state of an object matches its desired state.
Matching the actual state to the desired state is the driving philosophy of k8s's design. It doesn't sound particularly novel, given that most declarative tools follow the same idea; for example, both Terraform and Ansible are declarative. What k8s does differently is that it keeps monitoring the system state and makes sure the desired state is always maintained. That means all the goodness of availability and scalability is built into k8s.
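The "drive actual toward desired" idea can be sketched as a toy loop. This is only the shape of a controller's reconcile cycle, not k8s code; real controllers work through watches and the API server:

```shell
#!/bin/bash
# Toy reconcile loop: converge the actual replica count toward the
# desired count, the way a Deployment controller would (minus watches,
# error handling and the API server).
desired=3
actual=0
while [ "${actual}" -ne "${desired}" ]; do
  if [ "${actual}" -lt "${desired}" ]; then
    actual=$((actual + 1))   # would start a Pod here
  else
    actual=$((actual - 1))   # would stop a surplus Pod here
  fi
done
echo "converged at ${actual} replicas"
```

The key property is that the loop is driven only by the difference between the two states, so it also repairs drift: if a Pod dies later, actual drops below desired and the same loop brings it back.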
The desired state is defined using a Spec, and that is what the user interacts with. It is k8s's job to do whatever you requested.
The most common specs are:
  • Deployments for stateless persistent apps (e.g. http servers)
  • StatefulSets for stateful persistent apps (e.g. databases)
  • Jobs for run-to-completion apps (e.g. batch jobs).
Let's take a close look at the Deployments Spec.

Deployment Spec

Below is a deployment spec that can be used to create a deployment of an nginx server with 3 replicas, each of which uses nginx:1.7.9 as the container image, with the application listening on port 80.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
This should be simple to understand. Compared with a plain Pod/Container specification, it has an extra replicas field, and kind is set to Deployment so that the right Controller will pick it up.
Lots of specs have a nested PodSpec, as shown below, since at the end of the day k8s is a Pod/Container management system.
(Class diagram: the DeploymentController in the k8s master creates, updates and monitors a Deployment object on the cluster. The Deployment's DeploymentSpec (replicas, selector, strategy) embeds a PodTemplateSpec, whose PodSpec (containers, volumes) describes the Pods, each of which carries its own Kind, Metadata, Spec and Status.)
For a complete reference of the field available for deployment spec, you can check here.


In this article, we looked at the components of the master node and the overall operation model of k8s: drive and maintain the actual state of the system to match the desired state specified by the user through various object specifications. In particular, we took a close look at the most used spec, the Deployment.

by Unknown ( at June 06, 2018 01:34

June 10, 2018

Ard Biesheuvel

UEFI driver pitfalls and PC-isms

Even though Intel created UEFI (still known by its TLA EFI at the time) for Itanium initially, x86 is by far the dominant architecture when it comes to UEFI deployments in the field, and even though the spec itself is remarkably portable to architectures such as ARM, there are a lot of x86 UEFI drivers out there that cut corners when it comes to spec compliance. There are a couple of reasons for this:

  • the x86 architecture is not as heterogeneous as other architectures, and while the form factor may vary, most implementations are essentially PCs;
  • the way the PC platform organizes its memory and especially its DMA happens to result in a configuration that is rather forgiving when it comes to UEFI spec violations.

UEFI drivers provided by third parties are mostly intended for plugin PCI cards, and are distributed as binary option ROM images. There are very few open source UEFI drivers available (apart from the _HCI class drivers and some drivers for niche hardware available in Tianocore), and even if they were widely available, you would still need to get them into the flash ROM of your particular card, which is not a practice hardware vendors are eager to support.
This means the gap between theory and practice is larger than we would like, and this becomes apparent when trying to run such code on platforms that deviate significantly from a PC.

The theory

As an example, here is some code from the EDK2 EHCI (USB2) host controller driver.

  Status = PciIo->AllocateBuffer (PciIo, AllocateAnyPages,
                     EfiBootServicesData, Pages, &BufHost, 0);
  if (EFI_ERROR (Status)) {
    goto FREE_BITARRAY;
  }

  Bytes = EFI_PAGES_TO_SIZE (Pages);
  Status = PciIo->Map (PciIo, EfiPciIoOperationBusMasterCommonBuffer,
                     BufHost, &Bytes, &MappedAddr, &Mapping);
  if (EFI_ERROR (Status) || (Bytes != EFI_PAGES_TO_SIZE (Pages))) {
    goto FREE_BUFFER;
  }

  Block->BufHost  = BufHost;
  Block->Buf      = (UINT8 *) ((UINTN) MappedAddr);
  Block->Mapping  = Mapping;

This is a fairly straightforward way of using UEFI's PCI DMA API, but there are a couple of things to note here:

  • PciIo->Map () may be called with the EfiPciIoOperationBusMasterCommonBuffer mapping type only if the memory was allocated using PciIo->AllocateBuffer ();
  • the physical address returned by PciIo->Map () in MappedAddr may deviate from both the virtual and physical addresses as seen by the CPU (note that UEFI maps VA to PA 1:1);
  • the size of the actual mapping may deviate from the requested size.

However, none of this matters on a PC, since its PCI is cache coherent and 1:1 mapped. So the following code will work just as well:

  Status = gBS->AllocatePages (AllocateAnyPages, EfiBootServicesData,
                  Pages, &BufHost);
  if (EFI_ERROR (Status)) {
    goto FREE_BITARRAY;
  }

  Block->BufHost  = BufHost;
  Block->Buf      = BufHost;

So let’s look at a couple of ways a non-PC platform can deviate from a PC when it comes to the layout of its physical address space.

DRAM starts at address 0x0

On a PC, DRAM starts at address 0x0, and most of the 32-bit addressable physical region is used for memory. Not only does this mean that inadvertent NULL pointer dereferences from UEFI code may go entirely unnoticed (one example of this is the NVidia GT218 driver), it also means that PCI devices that only support 32-bit DMA (or need a little kick to support more than that) will always be able to work. In fact, most UEFI implementations for x86 explicitly limit PCI DMA to 4 GB, and most UEFI PCI drivers don’t bother to set the mandatory EFI_PCI_IO_ATTRIBUTE_DUAL_ADDRESS_CYCLE attribute for >32 bit DMA capable hardware either.

On ARM systems, the amount of available 32-bit addressable RAM may be much smaller, or it may even be absent entirely. In the latter case, hardware that is only 32-bit DMA capable can only work if an IOMMU is present and wired into the PCI root bridge driver by the platform, or if DRAM is not mapped 1:1 in the PCI address space. But in general, it should be expected that ARM platforms use at least 40 bits of address space for DMA, and that drivers for 64-bit DMA capable peripherals enable this capability in the hardware.

PCI DMA is cache coherent

Although not that common, it is possible and permitted by the UEFI spec for PCI DMA to be non cache coherent. This is completely transparent to the driver, provided that it uses the APIs correctly. For instance, PciIo->AllocateBuffer () will return an uncached buffer in this case, and the Map () and Unmap () methods will perform cache maintenance under the hood to keep the CPU’s and the device’s view of memory in sync. Obviously, this use case breaks spectacularly if you cut corners like in the second example above.

PCI memory is mapped 1:1 with the CPU

On a PC, the two sides of the PCI host bridge are mapped 1:1. As illustrated in the example above, this means you can essentially ignore the device or bus address returned from the PciIo->Map () call, and just program the CPU physical address into the DMA registers/rings/etc. However, non-PC systems may have much more extravagant PCI topologies, and so a compliant driver should use the appropriate APIs to obtain these addresses. Note that this is not limited to inbound memory accesses (DMA) but also applies to outbound accesses, and so a driver should not interpret BAR fields from the PCI config space directly, given that the CPU side mapping of that BAR may be at a different address altogether.

PC has strongly ordered memory

Whatever. UEFI is uniprocessor anyway, and I don’t remember seeing any examples where this mattered.

Using encrypted memory for DMA

Interestingly, and luckily for us in the ARM world, there are other reasons why hardware vendors are forced to clean up their drivers: memory encryption. This case is actually rather similar to the non cache coherent DMA case, in the sense that the allocate, map and unmap actions all involve some extra work performed by the platform under the hood. Common DMA buffers are allocated from unencrypted memory, and mapping or unmapping involve decryption or encryption in place depending on the direction of the transfer (or bounce buffering if encryption in place is not possible, in which case the device address will deviate from the host address like in the non-1:1 mapped PCI case above). Cutting corners here means that attempted DMA transfers will produce corrupt data, usually a strong motivator to get your code fixed.


The bottom line is really that the UEFI APIs appear to be able to handle anything you throw at them when it comes to unconventional platform topologies, but this only works if you use them correctly, and having been tested on a PC doesn’t actually prove all that much in this regard.

by ardbiesheuvel at June 06, 2018 17:45

Bin Chen

Understand Kubernetes 1: Container Orchestration

So far, we know the benefits of containers and how a container is implemented using Linux primitives.
If we only need one or two containers, we should be satisfied; that's all we need. But if we want to run dozens or thousands of containers to build a stable and scalable web service able to serve millions of transactions per second, we have more problems to solve. To name a few:
  • scheduling: Which host to put a container?
  • update: How to update the container image and ensure zero downtime?
  • self-healing: How to detect and restart a container when it is down?
  • scaling: How to add more containers when more processing capacity is needed?
None of those issues is new; only the subject has changed to containers, rather than physical servers (in the old days) or, more recently, virtual machines. The functionality described above is usually referred to as Container Orchestration.


kubernetes, abbreviated as k8s, is one of many container orchestration solutions. But, as of mid-2018, many would agree the competition is over: k8s is the de facto standard. I think that is good news, freeing you from the hassle of picking from many options and worrying about investing in the wrong one. K8s is completely open source, with a variety of contributors, from big companies to individuals.
k8s has a very good documentation, mostly here and here.
In this article, we'll take a different perspective. Instead of starting with how to use the tools, we'll start with the very object the k8s platform is trying to manage: the container. We'll see what extra things k8s can do compared with a single machine container runtime such as runc or docker, and how k8s integrates with those runtimes.
However, we can't do that without an understanding of the high-level architecture of k8s.

At the highest level, k8s is a master and slave architecture, with a master node controlling multiple slave or work nodes. Master and slave nodes together are called a k8s cluster. The user talks to the cluster using the API, which is served by the master. We intentionally left the master node diagram empty, to focus on how things are connected on the work node.
The master talks to the work nodes through the kubelet, which primarily runs and stops Pods through CRI, which in turn is connected to a container runtime. The kubelet also monitors Pods for liveness and pulls debug information and logs.
We'll go over the components in a little more detail below.


There are two types of nodes: master nodes and slave nodes. A node can be either a physical machine or a virtual machine.
You can jam the whole k8s cluster into a single machine, such as using minikube.


Each work node has a kubelet; it is the agent that enables the master node to talk to the slaves.
The responsibilities of the kubelet include:
  • Creating/running Pods
  • Probing Pods
  • Monitoring Nodes/Pods
  • etc.
We can go nowhere without first introducing the Pod.


In k8s, the smallest scheduling or deployment unit is the Pod, not the container. But there shouldn't be any cognitive overhead if you already know containers well. The benefit of a Pod is that it adds another wrapper on top of containers, guaranteeing that closely coupled containers end up scheduled on the same host, so that they can share a volume or network that would otherwise be difficult or inefficient to implement if they were on different hosts.
A pod is a group of one or more containers, with shared storage and network, and a specification for how to run the containers. A pod’s contents are always co-located and co-scheduled and run in a shared context, such as namespaces and cgroups.
For details, you can find here.

Configuring, Scheduling and Running Pods

You configure a Pod using a yaml file, called its spec. As you can imagine, the Pod spec includes a configuration for each container, covering the image and the runtime configuration.
With this spec, k8s will pull the image and run the container, just as you would do with a simple docker command. Nothing quite innovative here.
What is new is that in the spec we also describe the resource requirements of the containers/Pod, and k8s uses that information, along with the current cluster status, to find a suitable host for the Pod. This is called Pod scheduling. The functionality and effectiveness of the scheduler may be overlooked; the borg paper mentions that a better scheduler could actually save millions of dollars at Google scale.
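A toy version of that placement decision might look as follows. Real k8s scheduling is far more involved (filtering, scoring, affinity, taints); the node names and free-CPU figures here are made up for illustration:

```shell
#!/bin/bash
# Toy scheduler sketch: pick the node with the most free CPU that still
# satisfies the Pod's request. Node data is invented for illustration.
request=2          # CPUs requested by the Pod spec
best_node=""
best_free=-1
for entry in "node-a:1" "node-b:4" "node-c:3"; do
  node="${entry%%:*}"
  free="${entry##*:}"
  if [ "${free}" -ge "${request}" ] && [ "${free}" -gt "${best_free}" ]; then
    best_node="${node}"
    best_free="${free}"
  fi
done
echo "schedule Pod on ${best_node}"
```

Even this toy shows why resource requests in the spec matter: node-a is filtered out because it cannot satisfy the request at all.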
In the spec, we can also specify the Liveness and Readiness Probes.

Probe Pods

The kubelet uses liveness probes to know when to restart a container, and readiness probes to know when a container is ready to start accepting traffic. The former is the foundation for self-healing; the latter, for load balancing.
Without k8s, you would have to do all this on your own. Time and $$ saved.
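The distinction between the two probes can be sketched like this; it is a toy model of what the kubelet does with probe results, not real kubelet logic:

```shell
#!/bin/bash
# Toy model of probe handling: a failed liveness probe triggers a restart,
# while a failed readiness probe only removes the Pod from load balancing.
handle_probes() {
  # $1 = liveness result (0 = ok), $2 = readiness result (0 = ok)
  if [ "${1}" -ne 0 ]; then
    echo "restart container"
  elif [ "${2}" -ne 0 ]; then
    echo "stop routing traffic to Pod"
  else
    echo "Pod serving traffic"
  fi
}

handle_probes 0 1
```

The asymmetry is the point: a Pod that is alive but still warming up should not be killed, only taken out of rotation until its readiness probe passes.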

Container Runtime: CRI

k8s isn't bound to a particular container runtime; instead, it defines an interface, CRI, for image management and the container runtime. Anything that implements the interface can be plugged into k8s or, to be more accurate, the kubelet.
There are multiple implementations of CRI. Docker has cri-containerd, which plugs containerd/docker into the kubelet. cri-o is another implementation, which wraps runc for the container runtime service and a bunch of other libraries for the image service. Both use cni for the network setup.
Assuming a Pod/Container is assigned to a particular node, the kubelet on that node will operate as follows:
(Sequence diagram: to run a container, the kubelet's CRI client sends a create request over gRPC to the CRI server; the image service pulls the image from a registry and unpacks it to create the rootfs; the runtime service creates the runtime config (config.json) from the pod spec and runs the container, e.g. via runc.)


We went through why we need a container orchestration system, and then the high-level architecture of k8s, with a focus on the components in the work node and their integration with the container runtime.

by Unknown ( at June 06, 2018 07:04