About happi

I have been programming for fun since 1980, and for profit since 1989 when I started my first company. I have been working on native code compilation of Erlang, with the implementation of Scala, on virtual machines, and with internet scale online payment solutions. Currently I am writing a book on the Erlang Runtime System (ERTS).

Opinions

Just the facts…

A word of warning for you who stumbled upon my blog: this blog is not going to be that fun or interesting to read.

I find that the blog post I really like are the ones where the author has very strong opinions about a subject, so strong opinions in fact that he or she just can’t shut up about it. Usually the stated opinion is on the border of insanity and far from my view on the subject. These types of posts are interesting to read. They make you think. They often also make you feel something.

I am not going to write that type of blog posts. I hate opinions. I only have three:

  1. I don’t like opinions.
  2. XSLT is a terrible programming language.
  3. Redefineable syntax like operation overloading and macros is evil and the worst language feature ever.

Other than that I try to stay to just the facts. I like facts. Still, I might do two more blog posts about opinion two and three sometime. In this post I will just talk about my number one opinion: I hate opinions.

One of the thing I hate about opinions is that everyone seems to expect that you should have an opinion on every subject.

”So, what do you think about…”, and then insert any subject like ”the new Mac Book?”, ”the situation in Crimea?”, ”this blouse?”

What? Why should I have an opinion on these things?

It’s a computer, I can use it to program, thats good.

I have not lived in Crimea, I have no idea about the history between those people, but war is never the answer. 

It is red. What do you want me to say? You look good in it. But you look good in anything.

See, now I was forced to have opinions on things that I don’t care to have opinions about. And then we can start arguing. And there is no way that we can settle such an argument, because it is based on opinions and not facts.

Therefore I am always trying to just stick to the facts. We can argue about the facts, for sure, but such an argument can be settled by finding out if the facts are true or not. Facts can be proven or disproven. Opinions can not. They just are. And they are often crazy.

It is fun to read other peoples crazy opinions, but I’ll try to not have any opinions. If I do have opinions I’ll try not to tell you or write them down in my blog. I will just write facts about how Erlang works, how the Erlang runtime system works, and how programming in general works. Without opinions. It will not be that fun, but perhaps you will learn something. Something based on facts.

And please don’t force me to have opinions about things I don’t care about, or even worse force me to have opinions about thing I do care about.

I hate opinions, but hey, that’s just my opinion.

When is the Book Coming Out?

”So Erik, When is the book coming out?”

Last week I attended and spoke at Erlang Factory San Francisco.

The number one question that I got, and I got it from basically everyone I have ever met before, was ”When is the book coming out?”.

It was really nice to see that there was such an interest for the book.

The sad part is that the answer, at least in part, is: ”I don’t know”.

It is unfortunately late already. I was planning for a release in early 2014, and clearly that didn’t happen. I have only written about one third of the content. None of the chapters I have written has been edited, proofread, rewritten or even really finished. 

There is a huge amount of work left.

In the summer and fall of 2013 I again ended up as a manager at Klarna with the responsibility to get ”Kred” or ”Klarna Online”, our (well Klarna’s) OLTP system, to survive the Christmas load. The system was in a slightly unstable state at the beginning, but the Kred Core team managed to get it stable, responsive and really performing well straight through the Christmas rush. Unfortunately this took all of my time and energy, and the book writing got put on hold.

Fortunately, this led me to finally take the step and resign

The plan now is to take a week of vacation here in California, and then I will work full time on the book. If all goes well I should be able to finish most of the writing by early summer. Then perhaps I can get reviewer comments by the end of the summer and have a ”final” version ready by September or so.

From now on I will also try to keep you posted on the progress here on the blog.

I'm writing a book, I got the page numbers done.

Moving On

Yes, I quit my job.

Since so many of the people I meet keep on asking me I thought I’d best make it official.

After working at Klarna for more than eight years I decided to try something new. So I resigned and started my own company HappiHacking. Tomorrow, Sunday March 9th, is formally my last day of employment.

Right now I am taking a vacation in California, and after that I will start working on the Erlang runtime system book.

And, no, I do not really know what I will do after that.

I am on my own now, and I can do what I want. This is both exciting and very scary at the same time.

 

Sorry, this post was neither funny nor very informative, it was just a public announcement I’ll try to do better next time.

 

Hh

The BEAM Dispatcher

The instruction decoder in Beam is implemented with a technique called directly threaded code. In this context the word threaded has nothing to do with OS threads, concurrency or parallelism. It is the execution path which is threaded through the virtual machine itself.

Imagine a simple language of arithmetic expressions, where we can write expressions like "8 + 17 * 2.":


Statement :=  Expression '.'
Expression := Number
           |  Expression Op Expression
Op         := '+' | '*'
Number     := [0..9]
           |  [0..9] Number

We can use the Erlang compiler to generate code for a simple stack machine which can evaluate these expressions.
(See full code on github)


compile(String) ->
    [ParseTree] = element(2,
			  erl_parse:parse_exprs(
			    element(2,
				    erl_scan:string(String)))),
    generate_code(ParseTree).

generate_code({op, _Line, '+', Arg1, Arg2}) -> 
    generate_code(Arg1) ++ generate_code(Arg2) ++ [add];
generate_code({op, _Line, '*', Arg1, Arg2}) -> 
    generate_code(Arg1) ++ generate_code(Arg2) ++ [multiply];
generate_code({integer, _Line, I}) -> [push, I].

And now we can write an simplistic virtual stack machine:


interpret(Code) -> interpret(Code, []).

interpret([push, I |Rest], Stack)              -> interpret(Rest, [I|Stack]);
interpret([add     |Rest], [Arg2, Arg1|Stack]) -> interpret(Rest, [Arg1+Arg2|Stack]);
interpret([multiply|Rest], [Arg2, Arg1|Stack]) -> interpret(Rest, [Arg1*Arg2|Stack]);
interpret([],              [Res|_])            -> Res.

And a quick test run gives us the answer:


1> stack_machine:interpret(stack_machine:compile("8 + 17 * 2.")).
42

Great, you have built your first virtual machine! Handling subtraction, division and the rest of the Erlang language is left as an exercise for the reader.

If we take a look at our naive stack machine for arithmetic expressions we see that we use Erlang atoms and pattern matching to decode which instruction to execute. This is a very heavy machinery to just decode machine instructions. In a real machine we would code each instruction as a “machine word” integer.

We can rewrite our stack machine to be a byte code machine implemented in C. First we rewrite the compiler so that it produces byte codes. This is pretty straight forward, just replace each instruction encoded as an atom with a byte representing the instruction. To be able to handle integers larger than 255 we encode integers with a size byte followed by the integer encoded in bytes.

(See full code on github)


compile(Expression, FileName) ->
  [ParseTree] = element(2, erl_parse:parse_exprs(
                  element(2, erl_scan:string(Expression)))),
  file:write_file(FileName, generate_code(ParseTree) ++ [stop()]).

generate_code({op, _Line, '+', Arg1, Arg2}) ->
  generate_code(Arg1) ++ generate_code(Arg2) ++ [add()];
generate_code({op, _Line, '*', Arg1, Arg2}) ->
  generate_code(Arg1) ++ generate_code(Arg2) ++ [multiply()];
generate_code({integer, _Line, I}) -> [push(), integer(I)].

stop()     -> 0.
add()      -> 1.
multiply() -> 2.
push()     -> 3.
integer(I) ->
  L = binary_to_list(binary:encode_unsigned(I)), 
  [length(L) | L].

Now lets write a simple virtual machine in C.


#define STOP 0
#define ADD 1
#define MUL 2
#define PUSH 3

#define pop() (stack[--sp])
#define push(X) (stack[sp++] = X)

int run(char *code) {
  int stack[1000];
  int sp = 0, size = 0, val = 0;
  char *ip = code;

  while (*ip != STOP) {
    switch (*ip++) {
      case ADD: push(pop() + pop()); break;
      case MUL: push(pop() * pop()); break;
      case PUSH:
        size = *ip++;
        val = 0;
        while (size--) { val = val * 256 + *ip++; }
        push(val);
        break;
    }
  }
  return pop();
}

You see, a virtual machine written in C does not need to be very complicated. This machine is just a loop checking the byte code at each instruction by looking at the value
pointed to by the instruction pointer (ip). For each byte code instruction it will switch on the instruction byte code and jump to the case which executes the instruction. This requires a decoding of the instruction and then a jump to the correct code. If we look at the assembly for vsm.c (gcc -S vsm.c) we see the inner loop of the decoder:


L11:
    movl   -16(%ebp), %eax
    movzbl    (%eax), %eax
    movsbl       %al, %eax
    addl          $1, -16(%ebp)
    cmpl          $2, %eax
    je            L7
    cmpl          $3, %eax
    je            L8
    cmpl          $1, %eax
    jne           L5

It has to compare the byte code with each instruction code and then do a conditional jump. In a real machine with many instructions this can become quite expensive.

A better solution would be to have a table with the address of the code then we could just use an index into the table to load the address and jump without the need to do a compare. This technique is sometimes called token threaded code. Taking this a step further we can actually store the address of the function implementing the instruction in the code memory. This is called subroutine threaded code.

This approach will make the decoding simpler at runtime, but it makes the whole VM more complicated by requiring a loader. The loader replaces the byte code instructions with addresses to functions implementing the instructions.

A loader might look like:

(See full code on github)


typedef void (*instructionp_t)(void);

instructionp_t *read_file(char *name) {
  FILE *file;
  instructionp_t *code;
  instructionp_t *cp;
  long size;
  char ch;
  unsigned int val;

  file = fopen(name, "r");

  if(file == NULL) exit(1);

  fseek(file, 0L, SEEK_END);
  size = ftell(file);
  code = calloc(size, sizeof(instructionp_t));
  if(code == NULL) exit(1);
  cp = code;

  fseek(file, 0L, SEEK_SET);
  while ( ( ch = fgetc(file) ) != EOF )
  {
    switch (ch) {
      case ADD: *cp++ = &add; break;
      case MUL: *cp++ = &mul; break;
      case PUSH:
        *cp++ = &pushi;
        ch = fgetc(file);
        val = 0;
        while (ch--) { val = val * 256 + fgetc(file); }
        *cp++ = (instructionp_t) val;
        break;
    }
  }
  *cp = &stop;

  fclose(file);
  return code;
}

As we can see, we do more work at load time here, including the decoding of integers larger than 255. (Yes, I know, the code is not safe for very large integers.)

The decode and dispatch loop of the VM becomes quite simple though:


int run() {
  sp = 0;
  running = 1;

  while (running) (*ip++)();

  return pop();
}

Then we just need to implement the instructions:


void add() { int x,y; x = pop(); y = pop(); push(x + y); }
void mul() { int x,y; x = pop(); y = pop(); push(x * y); }
void pushi(){ int x; x = (int)*ip++; push(x); }
void stop() { running = 0; }

In Beam this concept is taken one step further, and beam uses directly threaded code (sometimes called only thread code). In directly threaded code the call and return sequence is replaced by direct jumps to the implementation of the next instruction. In order
to implement this in C, Beam uses the GCC extension “labels as values”.

We can see how the Beam dispatcher is implemented by looking at the add instruction in beam_emu.c. The STORE_ARITH_RESULT macro actually hides the dispatch function which looks something like: I += 4; Goto(*I);.


#define OpCase(OpCode) lb_##OpCode
#define Goto(Rel) goto *(Rel)

...

OpCase(i_plus_jId):
{
  Eterm result;

  if (is_both_small(tmp_arg1, tmp_arg2)) {
    Sint i = signed_val(tmp_arg1) + signed_val(tmp_arg2);
    ASSERT(MY_IS_SSMALL(i) == IS_SSMALL(i));
    if (MY_IS_SSMALL(i)) {
      result = make_small(i);
      STORE_ARITH_RESULT(result);
    }

  }
  arith_func = ARITH_FUNC(mixed_plus);
  goto do_big_arith2;
}

I will talk about the Beam virtual machine at EUC, covering how to look at compiled Beam code, how preemption is handled and some of the Beam instructions are implemented.

Hope to see you there.

The Erlang Tagging Scheme

Erlang is dynamically typed. That is, types will be checked at runtime and if a type error occurs an exception is thrown. The compiler does not check the types at compile time, unlike in a statically typed language like C or Java where you can get a type error during compilation.

These aspects of the Erlang type system, strongly statically typed with an order on the types puts some constraints on the implementation of the language. In order to be able to check and compare types at runtime each Erlang term has to carry its type with it. This is solved by tagging the terms.

In the memory representation of an Erlang term a few bits are reserved for a type tag. For performance reasons the terms are divided into immediate and boxed terms. An immediate term can fit into a machine word, that is, in a register or on a stack slot. A boxed term consists of two parts: a tagged pointer and a number of words stored on the process heap. The boxes stored on the heap have a header and a body, unless it is a list. Currently ERTS uses a staged tag scheme, the history and reasoning behind the this scheme is explained in a technical report from the HiPE group. (See http://www.it.uu.se/research/publications/reports/2000-029/)

The tagging scheme is implemented in erl_term.h. The basic idea is to use the least significant bits for tags. Since most modern CPU architectures aligns 32- and 64-bit words, there are at least two bits that are “unused” for pointers. These bits can be used as tags instead. Unfortunately those two bits are not enough for all the types in Erlang, more bits are therefore used as needed.

The current (R15-R16) tagging scheme looks like this:

 aaaaaaaaaaaaaaaaaaaaaaaaaatttt00 HEADER (see below)
 pppppppppppppppppppppppppppppp01 CONS
 pppppppppppppppppppppppppppppp10 BOXED (pointer to header)
 iiiiiiiiiiiiiiiiiiiiiiiiiiii0011 PID
 iiiiiiiiiiiiiiiiiiiiiiiiiiii0111 PORT
 iiiiiiiiiiiiiiiiiiiiiiiiii001011 ATOM
 iiiiiiiiiiiiiiiiiiiiiiiiii011011 CATCH
 iiiiiiiiiiiiiiiiiiiiiiiiii111011 NIL (i always zero...)
 iiiiiiiiiiiiiiiiiiiiiiiiiiii1111 SMALL_INT

 aaaaaaaaaaaaaaaaaaaaaaaaaa000000 ARITYVAL (Tuple)
 vvvvvvvvvvvvvvvvvvvvvvvvvv000100 BINARY_AGGREGATE       |
 vvvvvvvvvvvvvvvvvvvvvvvvvv001x00 BIGNUM with sign bit   |
 vvvvvvvvvvvvvvvvvvvvvvvvvv010000 REF                    |
 vvvvvvvvvvvvvvvvvvvvvvvvvv010100 FUN                    | THINGS
 vvvvvvvvvvvvvvvvvvvvvvvvvv011000 FLONUM                 |
 vvvvvvvvvvvvvvvvvvvvvvvvvv011100 EXPOR                  |
 vvvvvvvvvvvvvvvvvvvvvvvvvv100000 REFC_BINARY |          |
 vvvvvvvvvvvvvvvvvvvvvvvvvv100100 HEAP_BINARY | BINARIES |
 vvvvvvvvvvvvvvvvvvvvvvvvvv101000 SUB_BINARY  |          |
 vvvvvvvvvvvvvvvvvvvvvvvvvv101100 Not used
 vvvvvvvvvvvvvvvvvvvvvvvvvv110000 EXTERNAL_PID  |        |
 vvvvvvvvvvvvvvvvvvvvvvvvvv110100 EXTERNAL_PORT | EXT    |
 vvvvvvvvvvvvvvvvvvvvvvvvvv111000 EXTERNAL_REF  |        | 
 vvvvvvvvvvvvvvvvvvvvvvvvvv111100 Not used

 

The Erlang Runtime System

I recently started writing a book about the Erlang Runtime System. The plan is to have the book done by the end of 2013 and published early 2014. During this time I plan to do some blog posts about the topics I am writing on so stay tuned.

Currently I am thinking of writing the following chapters:

  1. Introduction
  2. The Compiler
  3. Processes and the PCB
  4. The Erlang Virtual Machine: BEAM
  5. Modules and The BEAM File Format
  6. Scheduling
  7. The Erlang Type System and Tags
  8. The Memory Subsystem: Stacks, Heaps and Garbage Collection
  9. Advanced data structures (ETS, DETS, Mnesia)
  10. Different Types of Calls, Linking and Hot Code Loading
  11. BEAM Instructions (25p)
  12. Platform Independence
  13. IO, Ports and Networking
  14. BIFs NIFs and Linked in Drivers
  15. Native code/li>
  16. Building ERTS
  17. The Shell
  18. Operation and maintenance
  19. Crash dumps
  20. The Debugger
  21. Tracing and Profiling
  22. Tweaking and optimizing

Comments on the suggested content are welcome.

Inline Json in Erlang

While I was writing about parse transforms for the ERTS book I needed an
example of a simple parse transform. So just for fun I came up with
a little syntax for writing Json structures in-line in Erlang.

With this parse transform you can write things like


-module(json_test).
-compile({parse_transform, json_parser}).
-export([test/1]).

test(V) ->
    <<{{
      "name"  : "Jack (\"Bee\") Nimble",
      "format": {
                  "type"      : "rect",
                  "widths"     : [1920,1600],
                  "height"    : (-1080),
                  "interlace" : false,
                  "frame rate": V
                }
     }}>>.

The code is available on my GitHub account:
https://github.com/happi/json-transform

Getting Erlang shell history as a term

When experimenting in the Erlang shell I have often felt the need to
grab history entries and rewrite them. Unfortunately, the Erlang history
command h() only gives you a printout of the history.

So yesterday I wrote this function:


hist() ->
  {links, [Shell|_]} = hd(process_info(self(), [links])),
  Shell ! {shell_req, self(), get_cmd},
  receive 
   {shell_rep, Shell, R} -> R 
  end.

Which (from the shell) returns the shell history as an Erlang term.

I also wrote a function to just get the commands:


get_command_hist() ->
  H = hist(),
  lists:sort([{N,Cmd} || {{command, N}, Cmd} <- H]).

And a function to format them “nicely”:


%% @doc Pretty prints an enumerated list
%%      of commands from the history,
list() ->
  CommandHist = get_command_hist(),
  [case Trees of 
    [] -> ok;
    [T] -> 
      io:format("~w: ~s.~n",
                [N, erl_prettypr:format(T)]);
    [T|Ts] ->
      io:format("~w: ~s~s.~n",
                [N, erl_prettypr:format(T), 
                [", "++erl_prettypr:format(Tree)
                 || Tree <- Ts]])
   end || {N, Trees} <- CommandHist],
  ok.

Well, ok now I am back where I started with only a printout…
but by using hist instead of list I can do interesting things, like
grabbing a pice of history and turn it into a function:


-spec grab(Start::integer(), Stop::integer(),
           Exclude::[integer()], Name::atom())
          -> syntax_tools:syntax_tree().
%% @doc Creates a syntax tree for a function called Name from
%%      the commands in the history between Start and Stop,
%%      excluding the history lines enumerated in Exclude.
grab(Start, Stop, Exclude, Name) ->
    CommandHist = get_command_hist(),
    ListOfTrees = filter_range(CommandHist, Start,
                               Stop, Exclude),
    Tree = lists:append(ListOfTrees),
    Args = get_free_variables(Tree),
    Clause = [erl_syntax:clause(Args,[],Tree)],
    erl_syntax:function(erl_syntax:atom(Name), Clause).

Blogging from Emacs

I am now actually getting somewhere with my blogging, I have a number of
topics lined up and more importantly some time to write.

Geekish as I am I had to get blogging working from Emacs, the one true
editor. Fortunately, with Emacs there is always someone who has already
implemented whatever you can think of.

I quickly found weblogger.el which provides WebloggerMode. and
which was a breeze to set up through ELPA.

So now I am writing my blog posts in Emacs. Oh the joy!

The TV Wall

After ten years of longing I finally bought a flat screen TV.

The old fat TV with a "media PC" under it.

The old fat TV with a "media PC" under it.

I managed to by a new TV just a few years before flat screen TVs became affordable, and since my nice Philips TV kept on working just fine it took me over ten years until I finally bought a new TV. All I needed to do was to choose which TV and convince my wife Cilla that the time had come, a reasonable project for the summer. It turned out that Samsung just had launched a new series of 3D TVs. Since I apparently only buy one TV every ten years I might as well get the latest technology… I thought.

I soon realized that a 3D-TV required a 3D blu-ray player and a 3D capable receiver so the project grew a bit.  Now I needed a TV, a receiver, and a blu-ray player plus some new cables.

The old living room

This is what our living room looked like a year ago (that is in April 2009).

But wait, I can not have a nice new TV and a 3D capable receiver and a blu-ray player and just have two old Jamo Art speakers from 1988. Hm, I might as well also get five new speakers and a sub-woofer. The project grew some more. Fine, no problem, at least I now knew what I wanted. For the next step I only needed to convince my wife.

I started that part of the project by showing her how nice the new TV looked and how much space we would save compared to the old TV. My wife soon agreed that it was time to get a new TV but demanded that there could be no more cables in the living room. The project suddenly grew. OK so I needed some new hardware and some way of hiding all the cables. After a few weeks of surfing and thinking I decided to build a new wall to hang the TV on and to hide the cables in. The project grew a bit again.

I started sketching the solution and soon realized that my old media player wouldn’t fit, so I decided to upgrade to a mac mini. And what use is a HD TV without HD content? So a new Humax HD PVR system from Comhem was also added to the mix. Now I knew what I needed and after some surfing I knew the dimensions of all pieces and I could start to draw a blue print for the construction. I needed a 20 cm deep wall to fit the speakers in the wall, and I needed room for the TV and for storage of  DVDs and Whiskey. I also wanted to be able to hide all media equipment but still have easy, well relatively easy, access to the connections at the back. I decided on a home built shelf system with a removable side and top.

Now that we where going to redecorate our living room, we might just as well get some new furnitures also, so a new sofa, table, rug, curtains and lamps where added to the shopping list.

The construction could now start.

Placeholder for TV

I started by putting up some placeholders for the TV and speakers.

studs

Now the the vertical studs are in place.

chaos

The old sofa covered in mid work chaos.

more studs

Most studs in place and the first layer of plywood is up.

planning

The new sofa arrives and my friend Olle helps me do some planning.

grandpa

My father in law Lasse gives another helping hand getting the gypsum board in place. (In truth Lasse helped me a lot with the whole project.) In the right hand corner (above the temporary TV) you can see the new surge protecting power outlet.

holes

Now all gypsum boards are up and we have taken holes for cupboards and speakers in them.

TV and Tor

The speakers, the receiver and the TV has finally arrived, Tor is testing 3-D. Cilla has repainted the wall (in the same color as before) with a nice white paint down from the ceiling.

wallpapers

Cilla, with unborn Nils in her belly, is putting up the new wallpaper.

final touches

My mother in law is putting some final touches to the wallpapering, hiding the cupboard doors behind a perfect pattern match. Cilla is looking for a new sofa table on the mac mini on the new TV.

The finished wall

And here is the final result. Almost completely hidden speakers, hidden doors to storage, an alcove with ornaments with lights in adjustable colors.