Cody Shepp

Twitter | Github | LinkedIn

Kasai - Pattern Matching in TypeScript

Published on 9/26/2015

Pattern matching has become one of my favorite language constructs. Whenever I'm working in a language that doesn't have native pattern matching support, I really start to miss its powerful, concise syntax.

Recently I've been spending the majority of my time writing applications in TypeScript, so I decided to put together a small library that adds pattern matching to the list of other great features that TypeScript offers.

Kasai

When I wrote Kasai, I wanted to focus on two main benefits of pattern matching - (1) logic as data, and (2) concise inspection of complex data structures.

Logic as data

Representing logic as data allows you to take something like this:


var f;
var a = "1";

if(isString(a)){
    f = compareStrings;
}
else if(isNumber(a)){
    f = compareNumbers;
}
else if(isArray(a)){
    f = compareArrays;
}
else if(isObject(a){
    f = compareObject;
});

return f(a);

...and represent it with an array of guard/value pairs like this:


var a = "1";
var pattern = [
    [isString, (x) => compareStrings],
    [isNumber, (x) => compareNumbers],
    [isArray,  (x) => compareArrays ],
    [isObject, (x) => compareObjects]
];

return match(a, pattern);

The pattern matching syntax is not only shorter, but the patterns can also be created dynamically at runtime.

Also note that you can provide a function as a pattern - Kasai will pass the value you're matching against to that function, and if the result is true, that pattern will be a match.

You can also pass a function as the result of a match, in which case the value you are matching against will be applied to that function before it is returned. Because of this, if you want to return an unevaluated function from a match set, you'll need to create a little lambda like I did in the example above, which just throws away the input value and returns the function instead.

Inspection of complex data structures

How many times have you written code like this?


var a = {
    name: 'Cody',
    contactInformation: {
        telephone: '123-456-7890',
        address: {
            street: '123 abc st',
            city: 'Nowhere',
            state: 'XX',
            zip: '12345'
        }
    }
};

if(a.hasOwnProperty('contactInformation')
  && a.contactInformation.hasOwnProperty('address')
  && a.contactInformation.address.hasOwnProperty('zip')
  && a.contactInformation.addess.zip.length === 5)
{

    return "domestic";
}

return "foreign";

Yuck, yuck, yuck.

With Kasai, that mess turns into this:


return match(a, [
    [{contactInformation: {address: { zip: (z) => z.length === 5 }}}, 'domestic'],
    [_, 'foreign']
]);

Note that _ is a wildcard token that matches anything.

If you'd like to check out Kasai, you can find it on Github.


Sokoban in ClojureScript

Published on 4/25/2015

For a fun project that was manageable in size, I decided to create a Sokoban clone/demo using ClojureScript and Quil. Quil provides ClojureScript users with a (mostly) idiomatic wrapper around processing.js for creating 2d graphics in the browser using the canvas.

I'm not going to go through my code step-by-step, but I would like to mention a few things I noticed while working on this project.

You can browse the full source code for this project on github.

Note: You can play the game above. Click inside the frame and use the [w], [a], [s], and [d] keys to move the player (@). The goal is to move all of the boxes (X) onto the goals (). If you get stuck, you can reset the game with the [r] key.

Game State and Data Structures

In the early stages of building Sokoban, I went through three separate iterations involving how to store the game's state. All three methods have the same general approach to setting/getting the state, in that the main state data structure is stored in an atom which is derefed and passed to the rendering function once per frame.

I also need a set of functions to get and set data within the state. The most common operations are finding the location of the player, finding the entities that neighbor a tile, and setting the entities at certain indices to reflect player movement.

Finally, all three iterations shared the same way of representing tiles. Each type of tile on the game board was represented by a keyword - :w for wall, :p for player, :g for goal, :b for box, and :n for nothing (empty space).

2d Vector Approach

My first approach to storing the game state was a simple 2d vector. It looked something like this:


[[:w :w :w :w :w :w :w]
 [:w :n :n :n :n :n :w]
 [:w :n :n :b :g :n :w]
 [:w :p :n :n :n :n :w]
 [:w :n :n :n :n :n :w]
 [:w :w :w :w :w :w :w]]

With this representation, the location of a given entity is a vector that contains the index of the column in the inner vector, as well as the index of the row within the outer vector.

In the example above, the player (:p) is located at [1 3] = [x y] = [column row].

If you want to find the location of the player, you need to map over the rows, and see if :p exists in them, and if so, at which index.

This is definitely a doable approach, but it seemed convoluted to me...so I decided to try a different method of organizing the game's state.

Normalized Coordinates Approach

My second attempt centered around the idea of normalizing the coordinates of each entity, and storing them explicitly. It looked something like this:


[{:coords [0 0]
  :entity :w}
 {:coords [0 1]
  :entity :w}
 {:coords [0 2]
  :entity :w}...]

With all of our entities stored in one vector, finding the location of the player becomes easier. I can simply filter the vector with a function that inspects the :entity value for each entity, and checks to see if it's :p.


(filter (fn [x] (= :p (:entity x)))
        game-state)

Likewise, finding the entity at a certain set of coordinates is also easy.


(filter (fn [x] (= [0 0] (:coords x)))
        game-state)

This approach also solves a problem with the 2d vector idea - the player can stand on goal tiles. This means that for any given frame, we might need to render two entities on one tile. We can't do that with the vector approach.

But something still isn't right with this approach. The state is messy and verbose, and I didn't like sifting through the entities as I was building and debugging the game. So, time for approach number three.

Separate Tiles and Entities Approach

My final approach combines some ideas from the first two.

First, I decided to split up the 'entities' and the 'tiles' into two separate structures. This allows for rendering both a "goal" and the player at the same location on the game board. In the rendering function, I just paint all of the tiles first, and then go through and paint the entities.

For the tiles, I decided to represent the game board as a 1d vector. Using some easy math and the width of the board, I can convert a set of coordinates to an index and vice-versa. This also means that finding a certain tile is as easy as .indexOf (more on this in a bit).

For the entities, I decided to just store their index in the game board's 1d vector. Here's what it looks like:


{:tiles [:w :w :w :w :n :n :w :w :g ... ]
 :entities {:players [2]
            :boxes   [6 12]}}

Converting back and forth between an index and a set of coords is just some simple math:


(defn index->coords [a w]
  (let [x (mod a w)
        y (int (/ a w))]
      [x y]))

(defn coords->index [x y w]
  (+ (* y w) x))

Finding neighboring tiles' indices is also simple:


(defn above [i w]
  (- i w))

(defn below [i w]
  (+ i w))

(defn left  [i w]
  (- i 1))

(defn right [i w]
  (+ i 1))

ClojureScript and .indexOf

One slightly annoying thing about working with vectors in ClojureScript is that finding the index at which a value exists is a bit verbose. In Clojure, we can use Java's indexOf method directly, but in ClojureScript, we must convert our vector to a JavaScript array before performing an .indexOf lookup:


(let [coll [:a :b :c :d]]
  (.indexOf (clj->js coll) (clj->js :b)))

It's a minor annoyance (and one that can be abstracted away), but still worth mentioning.

Update Loop and Player Movement

Lastly, I wanted to quickly touch on the method I used to determine how the game state changes when the user presses a key. In Sokoban, we only need to be concerned with the 2 tiles next to the player in the direction they'd like to move.

#########
#  @X####
#########

Given the game state above, let's assume the player wants to move to the right. In order to determine whether or not that's a valid move, we need only look at the next 2 tiles in that direction. The first tile is a box, and we know the player can move a box. The 2nd tile, however, is a wall, and we know that the player cannot move a wall. The move must then be invalid.

I used core.match to determine the validity of moves, as well as the actions that correspond to certain valid moves.



;; n is a vector that contains the next 2 tiles in the direction the player wants to move

(match [n]
    [[:n  _]] ;; move player
    [[:b :n]] ;; move player and box
    [[:b :g]] ;; move player and box
    [[:g  _]] ;; move player
    [[ _  _]] ;; do nothing

I found this a succinct and readable way to summarize Sokoban's game rules.

Thanks for reading! Feel free to tweet @cas002 with any questions/comments.


Exploring Genetic Algorithms with Clojure

Published on 3/22/2015

The term "genetic algorithm" has an aura of mystery surrounding it, so I decided to dive in and see what all the fuss is about. When you boil them down and look at the essentials, genetic algorithms aren't that complicated - and they can help you find an optimal solution to complex combinatorial problems.

The big idea at the heart of GAs is this: randomly create a "population" of possible solutions to your problem, combine and modify those solutions in different ways to create new solutions, evaluate the new solutions to see how effective the are, and then rise and repeat.

*Disclaimer: I am pretty new to Clojure, so my code may not be the most idiomatic or performant implementation. If you have suggestions for ways to improve something, feel free to tweet me a gist @cas002

Parts of a Genetic Algorithm

Genetic algorithms have five main parts:

  • Encoding
  • Evaluation
  • Selection
  • Crossover
  • Mutation
We'll go through each step to see what it does and how it works.

Encoding

GAs manipulate possible solutions to our problem, which means that we need to devise a way to format our solutions in such a way that an algorithm can process them. Throughout this post, I'm going to use the Traveling Salesman Problem (TSP) as an example. Suppose we have a salesman that needs to visit 10 different cities. We know the distances between each of the cities, and we want to find the shortest round-trip route for the salesman that will take him to each of the cities only once.

In this example, the solution to our problem will take the form of a path - an ordered list of cities that the salesman will visit. We can represent each city with an integer, and use a vector to make our list of stops. Assuming that we have 10 cities to visit, a possible solution would look like this:

[3 5 4 6 2 7 8 1 9 0]
The solution above means the salesman would start in city 3, travel to city 5, then city 4, etc (you get the idea) and finally return to city 3.

In my opinion, encoding is the most difficult part of creating a GA. While it may seem obvious for the TSP, the method of encoding is going to be different for most problem domains and you may have to get pretty creative.

In Clojure, creating a random population is pretty easy:

Evaluation

The end goal of our GA is to find the optimal solution to our problem, so we need to give our GA the ability to determine how effective a given solution is. This measure of a solution is know as its fitness score.

For our example, we want to find the shortest path that the salesman can take - that means that our fitness score would simply be the total distance the salesman must travel. Let's define our data and a helper function to get the distance between two cities.

Our distance data is contained in a 2d vector, but you can think of it like a table:

City 1 City 2 City 3 ...
City 1 0 28 57 ...
City 2 28 0 28 ...
City 3 57 28 0 ...
... ... ... ... ...

With our data in this format, we can define a function that will calculate the fitness score for a given solution.

Selection

The meat of a GA is in the crossover function, but before we get to that, we need to determine which solutions in our population will be paired up in the crossover. This process is often called selection, and there are many different approaches to explore. I'm going to cover two possible selection processes - tournament selection and roulette selection. The outcome of these selection processes is going to be a vector that places mates next to each other:

[a1 a2, b1 b2, c1 c2]

Tournament Selection

In tournament selection, we randomly select n number of participants from our population, and the "winner" of the tournament is the participant with the best fitness score. Remember that the lower the score, the better (at least in our example).

Roulette Selection

Roulette selection gives solutions with a better fitness score a higher chance of being selected. My implementation is a bit...odd. Here's the basic idea: we want to create a "roulette wheel" that we can spin to find a random solution. The better the solution's fitness score, the bigger its slice on the roulette wheel.

To create the roulette wheel, I create a vector of cumulative fitness scores...sort of. Because the lower fitness scores are better, we have to transform them a bit. I picked a number (700) that is larger than the fitness score of any solution (I just guessed here). Then, we can subtract our actually fitness score from 700, and now our bigger scores are better. Using the bigger-is-better scores, I create a vector of cumulative scores, and that's the roulette wheel.

Now that we have our wheel, we need to spin it once for every solution of the population. To do so, we'll choose a random number between 0 and the last element of the roulette wheel vector. We then find the first element in the vector that is greater than our random value, and the index of that value is the index of the solution in the population vector that we picked.

You know what? Let's just step through it. Assume we have this population:

[[0 1 2 3] [2 3 1 0] [4 2 3 1]]
with fitness scores of:
[200 400 500]
To build our roulette wheel, we want to subtract each score from 700. Our scores are now 500, 300, and 200 respectively. The higher scores are now better. Now, let's create our roulette wheel - remember that it's a vector of cumulative scores:
0 + 500 = 500
          500 + 300 = 800
                      800 + 200 = 1000
which results in:
[500 800 1000]

Hopefully you can see how we're assigning a bigger slice of the roulette wheel to the solutions with better fitness scores. My implementation of this process in Clojure is below.

(I also added a 'fitness multiplier' to help tune the results of roulette selection)

Crossover

Note: I gloss over a lot of the details in this section, and I might write a follow-up post that goes into more detail.

Now that we've selected which solutions in our population will be crossed-over (bred), we can finally get into the good stuff. Just like selection, there are a number of different methods to perform crossover. The method with which you encoded your solutions may dictate which method of crossover works best. In our example, I'm going to use cycle crossover. Cycle crossover will prevent the creation of invalid solutions. Remember that we must visit every city, and we can only visit each city once. Therefore, the solution below is invalid because we visit city 1 twice, and because we don't visit city 9:

[7 6 5 1 8 3 2 4 1]

The cycle crossover finds subsets of values in two solutions that occupy the same set of indices. These subsets are called cycles. Let's look at the example below.

0 4 3 6 2 9 5 7 1 8
8 2 3 4 5 6 1 7 9 0

Notice the highlighted cells - the subset of values (8 and 0) occupy the same subset of indices (0 and 9) in both solutions. This means they can safely be switched between the solutions, and the resulting solutions will still be valid!

8 4 3 6 2 9 5 7 1 0
0 2 3 4 5 6 1 7 9 8

Awesome! But...given a pair of solutions, how do we go about finding a cycle? My implementation is below, but there's probably a better way!

We can then apply the crossover to the entire population with the functions below:

Mutation

Mutation causes the values at random indices in a given solution to switch places. Mutation should only occur in a small percentage of the population. This helps keep the population from settling into a local optimum. To perform a mutation, we simply randomly choose two indices and swap their values. In the example below, the random indices that were chosen are 0 and 6.

8 4 3 6 2 9 5 7 1 0

Mutates into:

5 4 3 6 2 9 8 7 1 0

In my implementation below, 15% of the population will undergo mutation, and we actually swap two sets of indices.

Putting it all together

Now that we have all of the steps in the process of the algorithm complete, we can piece everything together.

In the code above, you'll notice the references to 'draw-chan' - in the full version of this experiment, I've included a visualization component using Quil.

You can find the full code on Github here.

Thanks for stopping by!


Next