Manual: good code

Good coding practices and tips that apply on Lever. Short page but extended over time. Learn to write dynamically typed code that is safer, more maintainable and more reliable than most statically typed code.

Table of contents ↑
Table of contents
01. Top-down, Most relevant/Most interesting first
02. The semantic meaning of code
03. Reliability through crashing
03.1. Multiple-dispatch errors
04. Premature everything
04.1. Premature generalization
04.2. Premature perfection

01. Top-down, Most relevant/Most interesting first

I used to explain context-free grammars to my dad. started by showing something like this:

expr_bot -> term
          | '(' expr_top ')'
addition -> addition '+' expr_bot
expr_top -> addition
statement -> expr_top ';'
file -> statement
      | file statement 

He explained it was hard to understand because it was written all upside down. The highest thing in the hierarcy, largest piece forming up from the other pieces was on the bottom.

He proposed the following would be much cleaner:

file -> statement
      | file statement 
statement -> expr_top ';'
expr_top -> addition
addition -> addition '+' expr_bot
expr_bot -> term
          | '(' expr_top ')'

I had written the earlier program as I had because the evaluation flow often forces you to put things that come first on the top because you have to instantiate something before you have a reference to it, like below:

pack1 = Packet("1")
pack2 = Packet("2")
platform1 = Platform(pack1, pack2)
platform2 = Platform(platform1)

I kind of understood that this was an important point and tried out writing some programs top-down, such that the entry point would end up first. This is how I write code today:

main = ():
    window = open_window()
    rc = load_resources(resources)
    render(window, rc)

resources = {
    a = "a.jpg"
    b = "b.jpg"

init = (window):
    ... # not revelant on illustration

render = (window, rc):
    blit_texture(window, rc.a)
    blit_texture(window, rc.b)

blit_texture = (window, texture):
    ... # omitted

I realised this way of ordering the hierarchy in a program this way was immensely valuable. If you scroll into the text you would find out immediately the overall idea of what the program is doing and have a context to understand what comes next.

Sometimes I would even do this on constants, that is position the constant lower into a file and use it first. The reason being that the value of the constant isn't often interesting before it is used.

02. The semantic meaning of code

When writing code that is correct, the data types aren't important. What is important is the semantic meaning of the code.

For example lets consider that we want to know an arithmetic mean value for a set of values. A simple implementation of such program in Lever would be:

total = 0
for value in sequence
    total += value
return total / sequence.length

The above program relies on lot of facts:

  1. The sequence is iterable
  2. The values returned by iterating a sequence can be added to '0'
  3. The subsequent values can be divided with their count
  4. The sequence has a length and it reports it correctly

The elements the program has been made of preserve their semantic meaning, eg. The addition doesn't become exponentiation

The program could be written like this as well:

it = iter(sequence)
count = 1
total =
for value in it
    total += value
    count += 1
return total / count

This program relies on different set of facts to be 'correct':

  1. The sequence is iterable and not empty
  2. The values returned by sequence can be added with each other
  3. The sum of values can be divided by their count

Both programs are incorrect if you pass in an infinite sequence. If you do so, they also do attempt to calculate the value and never recover from it on their own.

Both programs are incorrect if you give them values that do not correctly add up.

Also the correctness of the output is bound to the correctness of the input. The program turns into giving approximate values if you give them floating point values that are approximate by their nature.

Since Lever doesn't have full numeric stack right yet, if you give the above program too big integers, the integers will overflow and the program gives the wrong result.

But for surprisingly many inputs, the both above programs are perfectly valid.

You could try to argue the below program is less correct because it fails on empty sequences. It really depends on the context though. If the input sequence is not supposed to be empty, the lower program would enforce the rule and therefore be much more desirable to use.

The important thing to notice is that static type systems, often advertised with safety, fail to represent all the invariants of the above programs. Even if you did dependent typing, there would be problems to represent everything that the above program relies on to function in terms of type systems.

Often type systems resort to overconstraining the programs. For example you can guarantee that the sequence is not infinite and that the values can be added together if you force the sequence to be an array that contains integers. Forced correctness like this doesn't come for free.

If you later decide that instead of arrays or integers you need lists and floats, you really have to work through every definition that treats with the data before the program can be compiled and tested again. A hefty price to pay for local correctness. If you use automatic refactoring techniques, the refactoring program isn't aware about the constraints subjected to the type anyway, so you may introduce an error unknowingly even if everything was fully typed.

The efforts for perfection and ultimate correctness in statically typed languages is left as an exercise for academics who do not care about delivering or maintaining software.

03. Reliability through crashing

When a program encounters a condition it has not been designed to handle, it should crash. A good crash is always the better option when the another option is a good state corruption.

The origin of this concept traces back to Erlang where programs have been isolated from each other and monitored by other programs that restart them if they happen to go down.

With dynamic typing you are very likely to compose complex objects from simpler ones and then there comes a time to take them apart. You will likely end up with the following code:

if isinstance(a, str)
    str things
elif isinstance(a, int)
    int things
    assert false, repr(a)

The 'else' with an assert clause such as described here should be very common occurrence in code that branches by type.

It is likely that the language will get some syntax for dispatching from types and patterns, for now such trivial improvements have to wait.

03.1. Multiple-dispatch errors

The Texopic HTML generator had this kind of code:

    html_env(node, context, out)
except KeyError as ke
    out.append(html.Node("pre", [texopic.stringify([node])]))

The html_env dispatched code depending on which node type is being used. Probably needless to say, this code also catched every single key set/get error in the generator and suspended them.

The solution in this case was to add a default dispatch into texopic Scope objects and use them instead. Catching errors originating from such dispatch situations are problematic because they can very well cause surprising situations during recursion.

04. Premature everything

Some people like to watch the world burn. Then there are people who like to create a new class, new file and a new function every time they spot a task that might be separate or reusable.

Seemingly it's harmless to divide something into its own function. Especially when you saw something similar earlier on this very same page. Above the reason for additional functions was that we needed functions to illustrate hierarchy and I didn't pick up a real example because I might not find something that would exactly illustrate the point as well as an invented example.

The problem in building up a new function when you see an opportunity is that in any program there can be only that many indirections before it becomes garbled for the person reading the code. If you are eager to abstract things it means you're almost always losing the opportunity for the best abstraction you can make.

04.1. Premature generalization

Lever has 2-4 implementations of the same pretty printing algorithm. It is not an ideal situation. Usually you'd want just one implementation. There is a reason they haven't been unified yet. They are all slightly different and I haven't come up with the best design.

Only lately I've come up to understand the problem and aspects of it better to consider that I would be confident to design something that works for all the cases I have had. But I'm still postponing it until I need to pretty print things again and observe how the various printers behave.

04.2. Premature perfection

Everybody approves finished designs. Many would prefer to have a full design up-front before they get to create anything at all.

I used to erase my drawing if it doesn't turn out right. If the stroke wouldn't come out right I would press CTRL+Z and erase the stroke. Later I got some help by a stranger. He taught me to leave the flawed stroke there. He told me that's how they used to do it when drawing on the paper. By leaving the flawed stroke in you could draw more and then you would see where the right stroke should go to.

Also I vividly remember the lesson about "drawing the space around the subject" rather than the subject itself, that was fairly helpful advice too.

Achieving one failure can take a weekend, yet you can learn from it more than from two months of up-front design work. Also, they tend to say that couple of months in the laboratory can frequently save a couple of hours in the library. So there is a sort of hierarchy there.

So it's great if you can learn from your failures, even greater if you can learn from the failure of others.

It's weird that I had to be taught that thing during drawing, because by then I was already using dynamically typed languages all the time. These tools are designed for getting the first stroke on the paper as quickly as it is possible. The obvious end result is that at the first time you get something quickly that fails to attain all or some of your objectives. Then you figure out what went wrong and refine the solution until the problem gets solved.

Mastering this aspect in your workflow makes the dynamic programming languages a powerful tool in your repertuare.