Posts Tagged ‘software development’

Code generation

Code generation is a common tool in software. Usually it sits behind the scenes, just a layer of automation translating from one programming language to another. But sometimes it becomes useful to build your own code generator. In a project at work, we’ve implemented a code generator based off a DSL (domain specific language) that generates an implementation of our domain contract. There’s a lot of meaty terms in that last sentence, let’s spend some time and break them down.

Common use case: compilers

First, I said that code generation is common, but behind the scenes. That’s because most compilers have some form of code generation happening in them. So for example, compiling Typescript for a Web app, or compiling Swift for an iOS app.

Let’s talk a little more about compilers. A compiler’s job is to take source code that humans can read and produce output that a computer can run. This is broken up into parts:

  1. Parsing the source code
  2. Stuff
  3. Generate output.

Parsing the source code is all about understanding text and turning it into a format a computer can do stuff with.

Stuff could be interesting things, like in Typescript, adding static type checking to vanilla Javascript. This usually involves manipulating an abstract syntax tree. For example, you could enforce rules in the tree and make sure everything is valid.

Generate output is the code generation part. The tree needs to turn back into something that the computer can run. For Typescript this is generating Javascript, for Swift, this is producing machine code.

A quick example

2 + 2

Given the above syntax, parsing turns it into a syntax tree:

Then for stuff maybe we want to enforce that a “Plus” always has “integers” connected to it. In this case, our tree looks good.

For generate output, we need a rule for what to do with a “Plus”. I’m making up this rule:

  • Load the left hand number into the computer.
  • Load the right hand number into the computer.
  • Add them.

So we apply that rule to our tree and get:

  • Load a 2 into the computer.
  • Load a 2 into the computer.
  • Add them.

Which is approximately what machine language looks like 🙂 I linked to NAND to Tetris which my friend Devin Mork showed me. It reminds me of my computer architecture course in college.

Domain Specific Languages

A domain specific language is making your own programming language. This lets you (1) create a language that is more meaningful and specific, and (2) produce generated code that does something useful.

One example is defining an API between two microservices. You could exchange URL’s back and forth between developers:

https://example.com/api/add-numbers
https://example.com/api/get-result

But then you’d also need to agree on an API body format (maybe JSON), whether things are a GET or a POST, and a myriad of other details.

What if instead you could use a DSL that let you define the following:

api Math {
    command add-numbers(
        leftHand: int,
        rightHand: int
    );
    query get-result(
    );
}

This defines something meaningful to the domain (we have an api with a command and a query), but hides the implementation.

The implementation could be a Web API, it could be a “REST API,” it could be something like Apple’s XPC. This implementation would be built out as a set of code generators, but it is not part of the domain definition. So for example, we could generate a Jersey controller:

@Path("/Math")
public class MathController {

    @GET
    @Path("/get-result")
    public Response getResult() {
        //something
    }

    @POST
    @Path("/add-numbers")
    public Response addNumbers(@QueryParam("leftHand") String leftHand, @QueryParam("rightHand") String rightHand) {
        //something
    }
}

This is a partial example, you’d need a bridge between the generated code and hand-written code. For example, maybe “something” is a call to an interface that a developer will implement by hand.

This is not a new idea, Martin Fowler wrote a book “Domain Specific Languages.” For web APIs see also grpc and api blueprint. If those meet your needs, I would use those instead of making the investment in building your own DSL and code generator.

Event sourcing and CQRS

As part of the #2018TechReadingChallenge, I’ve been working my way through Microservices Patterns, by Chris Richardson (currently a Manning MEAP).

Event sourcing and CQRS are two key architecture concepts in the book that have less to do with the “micro” in “microservices,” and more to do with software architecture in general. While they are often mentioned together, they are separate concepts, and solve slightly different problems.

Event sourcing fits in the general category of an append only model, which is a way of persisting the data in your application. Instead of storing the current state, it stores history that led up to the present. It stores these using events, which are immutable, and represent business intent. Events record a thing that has happened.

A related concept in domain driven design is an aggregate, which rebuilds state by querying the event history.

CQRS stands for Command Query Responsibility Segregation. This relates to an OO principle, command query separation, which classifies methods into “commands,” which mutate the data (but does not return data), and “queries,” which return data only. In CQRS, we extend this principle to the design of a subdomain, and separate the responsibility of writing the data from reading the data. The two sides often communicate using events.

It solves a “stale data” problem in a collaborative domain. A collaborative domain is a business domain where multiple users are working together on a set of data, and expect that data to be coherent. The data is only as stale as the lag between the command side and the query side. However, by separating these sides and putting a visible part of the architecture between them, we have raised awareness that there will always be stale data in a collaborative system.

Another advantage to CQRS is that the read side and the command side can be scaled independently, so for example, if the number of queries is vastly greater than the commands, you can add more nodes that handle the queries.

For more on these concepts, see the CQRS Journey page on Microsoft, specifically the Exploring CQRS and Event Sourcing e-book. I’m also drawing from Patterns for Building Distributed Systems for The Enterprise, by Michael Perry on Pluralsight.

Dreyfus model of skill acquisition

I just finished reading Pragmatic Thinking and Learning, by Andy Hunt, as part of my 2018 Tech Reading Challenge. (It fits in the category, “a book about an abstract concept.”) In lieu of a full review, I’m sharing some key takeaways from chapter 2, “Journey from Novice to Expert.”

The Dreyfus model of skill acquisition is a tool to help you understand  how you are doing in a new skill. It can apply to any skill area, such as cooking, writing, bowling, or software development. It comes from a paper by the Dreyfus brothers, published in 1980. In it, they identify five stages of skill development. I’ll stick these at the bottom of the article (It’s a bunch of bulleted lists) and talk about some of my takeaways.

Problems

The Dreyfus model helps us understand a few different common problems in the world of professional software development.

First is a tendency to over-rate our own abilities. This can happen when we don’t know what we don’t know. This is known as the Dunning-Kruger effect.

Second is when we decide to stop learning, and yet think that expert status has been achieved. This has been described as the expert beginner, and Erik Dietrich has a set of articles about it over at his blog.

The Dreyfus model does not address how large or small a skill is scoped, but it seems you can break this up into small pieces. For example, the skill of unit testing is separate from the skill of refactoring, or from designing for test. In The Art of Unit Testing, Roy Osherove lists these as separate skills.

No one size fits all

My big takeaway from the 5 stages is that it can help explain why I am uncomfortable in different problem solving situations. For example, if I’m being asked to fit into a strict set of rules, and yet solve a problem that is very familiar to me, I can chafe a little. The rules are more suited to a novice, and I’m feeling more confident. On the flip side, if I have junior developers on the team, it is a good thing to have conventions, standards, and some guidelines for them to follow. And if I’m pursuing a senior engineer role, it’s my job to create the guidelines.

Movements target certain skill levels

The different stages help us identify how movements can target different skill levels with different attributes. For example, Agile could be viewed as a set of rules, rituals, and ceremonies, which would appeal to the advanced beginner, or it could be viewed as a way to learn and correct past performance, which is something a proficient stage person can do.

A great example of this is applying design patterns. An advanced beginner sees the potential for patterns everywhere! And the end up with something like EnterpriseFizzBuzz.

Conclusion: Check out the book

I recommend picking up a copy of Pragmatic Thinking and Learning, by Andy Hunt. I didn’t even cover all of chapter 2—there’s a great case study on how the model was applied to the nursing profession, and the author ties this to software development. The rest of the book has some great tips for learning and problem solving. I highly recommend it.

The five stages

Novice

  • they need a recipe
  • don’t know what they don’t know
  • prefer walkthroughs, in person if possible

Advanced beginner

  • start to break away from the fixed rule set
  • learn new tasks as the need arises
  • can use advice in a similar context
  • want information fast, but not the big picture
  • no broad-based conceptual understanding of the task set
  • solo learning is more possible

Competent

  • Troubleshooting skills
  • “They can mentor the novices and don’t annoy the experts overly much”
  • informal leadership role
  • more deliberate planning
  • Still can’t apply agile methods to the full extent (!)

Proficient

  • can correct previous poor task performance
  • learn from the experience of others
  • apply maxims (Don’t Repeat Yourself)
  • know what they don’t know

Expert

  • work from intuition
  • unconscious competence (can’t explain why)
  • can take context into account
  • very rare!