Thursday 18 April 2013

I/O Redirection in Bash

Since Bash is a superset of Bourne shell, the IO redirection is the same across the Bourne shell family. To find out exactly which shell you are using in Unix or Linux, typing in command
echo $SHELL
shall display an executable command with its path, for example
PathShell
/bin/shBourne shell
/bin/bashBourne Again SHell
/bin/cshC SHell
/bin/kshKorn SHell

Bash takes input from standard input (stdin), writes output to standard output (stdout), and writes error output to standard error (stderr). Standard input is connected to the terminal keyboard, while standard output and error are connected to the terminal screen, by default. Redirection of I/O is accomplished by using a redirection metacharacter followed by the desired destination (stdout & stderr) or source (stdin). Bash uses file descriptor numbers to refer to the standard I/O, where 0 is standard input, 1 is standard output, and 2 is standard error. Below are the common redirection for the Bourne shell family:

Meta CharacterAction
>Redirect standard output
>>Append to standard output
2>Redirect standard error
2>&1Redirect standard error to standard ouput
2>&1|Redirect standard error to standard ouput, and pipe them to another command
<Redirect standard input
|Pipe standard output to another command

Please note that because < and > are referring to standard input and output respectively, the numbers 0 and 1 are not required in this case.

Thursday 4 April 2013

Sorting Algorithm Performance Comparisons with Caliper's Micro Benchmarking

After inspired by Martin Thompson's presentation on Performance Testing Java Application, I decided to try out the micro benchmarking framework Caliper on the sorting algorithms that I recently wrote in order to understand better how different sorting algorithms behaves given an array of randomly distributed values.

It was actually a bit involving to find out where I could download the caliper JAR file, as it was not available on their google code base. You can download it from the maven repository, or grab it from my git described at the bottom of this article.

I created a SortingAlgorithmsBenchmark class that extends SimpleBenchmark, and some public methods, whose names have prefix 'time'. Then use the caliper Runner to run the benchmarks on the SortingAlgorithmsBenchmark class, as shown below
import com.google.caliper.Runner;
import com.google.caliper.SimpleBenchmark;
import sorting.*;

import java.util.Arrays;
import java.util.Random;

public class SortingAlgorithmsBenchmark extends SimpleBenchmark {

    private static final int SIZE = 10000;
    private static final int MAX_VALUE = 10000;
    private int[] values;

    @Override
    protected void setUp() throws Exception {
        values = new int[SIZE];
        Random generator = new Random();
        for (int i = 0; i < values.length; i++) {
            values[i] = generator.nextInt(MAX_VALUE);
        }
    }

    public void timeBubbleSort(int reps) {
        for (int i = 0; i < reps; i++) {
            BubbleSort.sort(values);
        }
    }

    public void timeInsertionSort(int reps) {
        for (int i = 0; i < reps; i++) {
            InsertionSort.sort(values);
        }
    }

    public void timeSelectionSort(int reps) {
        for (int i = 0; i < reps; i++) {
            SelectionSort.sort(values);
        }
    }

    public void timeMergeSort(int reps) {
        for (int i = 0; i < reps; i++) {
            MergeSort.sort(values);
        }
    }

    public void timeShellSort(int reps) {
        for (int i = 0; i < reps; i++) {
            ShellSort.sort(values);
        }
    }

    public void timeArraysSort(int reps) {
        for (int i = 0; i < reps; i++) {
            Arrays.sort(values);
        }
    }

    public static void main(String[] args) {
        Runner.main(SortingAlgorithmsBenchmark.class, args);
    }

}

Here are the results when running the benchmarks with a random array of size 10,000. Apparently, bubble sort and selection sort are the slowest of all, with selection sort behaving rather unstable (you will notice the selection sort benchmark is quite different across different runs. It is pleased to see that the sorting algorithm (Dual-Pivot Quick Sort) used by Arrays performs fairly well with a medium size array.

0% Scenario{vm=java, trial=0, benchmark=BubbleSort} 75703105.85 ns; σ=3428843.80 ns @ 10 trials
17% Scenario{vm=java, trial=0, benchmark=InsertionSort} 73305.66 ns; σ=712.37 ns @ 6 trials
33% Scenario{vm=java, trial=0, benchmark=SelectionSort} 49624130.05 ns; σ=345774.06 ns @ 3 trials
50% Scenario{vm=java, trial=0, benchmark=MergeSort} 478932.15 ns; σ=683.99 ns @ 3 trials
67% Scenario{vm=java, trial=0, benchmark=ShellSort} 262061.27 ns; σ=11605.29 ns @ 10 trials
83% Scenario{vm=java, trial=0, benchmark=ArraysSort} 8063.53 ns; σ=20167.07 ns @ 10 trials

    benchmark       us linear runtime
   BubbleSort 75703.11 ==============================
InsertionSort    73.31 =
SelectionSort 49624.13 ===================
    MergeSort   478.93 =
    ShellSort   262.06 =
   ArraysSort     8.06 =

vm: java
trial: 0

Increasing the array size to 100,000, and excluding the bubble sort and selection sort in the benckmarks, we have another set of results, where insertion sort starts to behave terribly in this case, and the dual-pivot quick sort is still the quickest.

0% Scenario{vm=java, trial=0, benchmark=InsertionSort} 2031873339.00 ns; σ=3247182.51 ns @ 3 trials
25% Scenario{vm=java, trial=0, benchmark=MergeSort} 5701257.43 ns; σ=37183.00 ns @ 3 trials
50% Scenario{vm=java, trial=0, benchmark=ShellSort} 2277851.07 ns; σ=21077.62 ns @ 3 trials
75% Scenario{vm=java, trial=0, benchmark=ArraysSort} 210179.59 ns; σ=164843.22 ns @ 10 trials

    benchmark      us linear runtime
InsertionSort 2031873 ==============================
    MergeSort    5701 =
    ShellSort    2278 =
   ArraysSort     210 =

vm: java
trial: 0

You can find the source code for the sorting algorithms here, and the benchmarks runner here, all of which are part of my github project DataStructureAndAlgorithms

Wednesday 3 April 2013

A Different Builder Pattern Example in Java

Builder design pattern is often used to separate the construction of a complex object from its various representations. I will show you a nicer way of writing a Builder pattern, rather than the example shown in the wiki.

First of all, I created a POJO called Vehicle, which has some simple properties about its model, its make, etc. Then, nested within this POJO, I created a Builder that holds some default properties which can be used to build a vehicle. The builder can be created by a static method newBuilder(), and we can override the vehicle properties by chaining the with* methods, and finally return the constructed vehicle using the build() method.

public class Vehicle {

    private String model;
    private String make;
    private int numberOfWheels;
    private double height;
    private double width;
    private double length;
    private int numberOfDoors;

    private Vehicle(String model, String make, int numberOfWheels, double height, double width, double length, int numberOfDoors) {
        this.model = model;
        this.make = make;
        this.numberOfWheels = numberOfWheels;
        this.height = height;
        this.width = width;
        this.length = length;
        this.numberOfDoors = numberOfDoors;
    }

    public static class Builder {

        private String model = "M5";
        private String make = "BMW";
        private int numberOfWheels = 4;
        private double height = 1.45;
        private double width = 1.9;
        private double length = 4.62;
        private int numberOfDoors = 5;

        public static Builder newBuilder() {
            return new Builder();
        }

        public Builder withModel(String model) {
            this.model = model;
            return this;
        }

        public Builder withMake(String make) {
            this.make = make;
            return this;
        }

        public Vehicle build() {
            return new Vehicle(model, make, numberOfWheels, height, width,length, numberOfDoors);
        }
    }
}

Let me use a unit test, written in Groovy, to demonstrate exactly how we use the builder. I created a test to show how we can build a default vehicle without providing any overrides to the default vehicle properties, and another test to show how we can override the default vehicle properties to create a different vehicle.

import org.junit.Test

class VehicleTest {

    @Test
    public void canBuildADefaultVehicle() {
        def defaultVehicle = Vehicle.Builder.newBuilder().build()
        assert defaultVehicle.model == "M5"
        assert defaultVehicle.make == "BMW"
        assert defaultVehicle.numberOfWheels == 4
        assert defaultVehicle.height == 1.45
        assert defaultVehicle.width == 1.9
        assert defaultVehicle.length == 4.62
        assert defaultVehicle.numberOfDoors == 5
    }

    @Test
    public void canBuildADifferentVehicle() {
        def defaultVehicle = Vehicle.Builder.newBuilder().withModel("M3").build()
        assert defaultVehicle.model == "M3"
        assert defaultVehicle.make == "BMW"
        assert defaultVehicle.numberOfWheels == 4
        assert defaultVehicle.height == 1.45
        assert defaultVehicle.width == 1.9
        assert defaultVehicle.length == 4.62
        assert defaultVehicle.numberOfDoors == 5
    }
}

Some says the chaining syntax makes the building block easier to read. :)

Tuesday 2 April 2013

Solving Farmer-Wolf-Goat-Cabbage riddle with Groovy/Java


  • The Riddle - Farmer Wolf Goat Cabbage
  • A farmer is on the west bank of a river with a wolf, a goat and a cabbage in his care. Without his presence the wolf would eat the goat or the goat would eat the cabbage. The wolf is not interested in the cabbage. The farmer wishes to bring his three charges across the river. However the boat available to him can only carry one of the wolf, goat and cabbage besides himself. The puzzle is: is there a sequence of river crossings so that the farmer can transfer himself and the other three all intact to the east bank?
  • The Solution
  • It is a task that given an initial state, a final state, and a set of rules, one will start from the initial state and try to reach to the final state by passing through some intermediate states. Each move will transform from one state to the other, and there might be multiple valid moves from a given state. Such a collection of interconnected states can be represented by State Space Graph.
    Let's use 'f', 'c', 'g', 'w' to denote the farmer, the cabbage, the goat, and the wolf, and use '|' to separate the river where left of the '|' denotes west bank and right of the '|' denotes east bank. Initially, they are all at the west bank of the river, which is represented as 'fcgw |' as shown below. We can solve the riddle by figuring out what the possible and valid moves are, using either Breadth-First Search or Depth-First Search, on a state space graph shown below
       

  • Depth First Search
  • I use DFS to find the first possible solution to the riddle, where it looks like this

    or with a possibility of backtracking like this

  • Breadth First Search
  • I can build a complete state space graph using BFS, excluding the red states as shown below

  • Sample Source Code
  • Implemented in a mix of Groovy and Java, where Java is mainly used for the POJOs, the sample source code, including the unit test, demonstrates how to build a complete state space graph (which is also a directed graph) and how to find all possible solutions based on the state space graph.
    DirectedGraph.java
    import java.util.*;
    
    public class DirectedGraph<T> implements Iterable<T> {
    
        // key is a Node, value is a set of Nodes connected by outgoing edges from the key
        private final Map<T, Set<T>> graph = new HashMap<T, Set<T>>();
    
        public boolean addNode(T node) {
            if (graph.containsKey(node)) {
                return false;
            }
    
            graph.put(node, new HashSet<T>());
            return true;
        }
    
        public void addNodes(Collection<T> nodes) {
            for (T node : nodes) {
                addNode(node);
            }
        }
    
        public void addEdge(T src, T dest) {
            validateSourceAndDestinationNodes(src, dest);
    
            // Add the edge by adding the dest node into the outgoing edges
            graph.get(src).add(dest);
        }
    
        public void removeEdge(T src, T dest) {
            validateSourceAndDestinationNodes(src, dest);
    
            graph.get(src).remove(dest);
        }
    
        public boolean edgeExists(T src, T dest) {
            validateSourceAndDestinationNodes(src, dest);
    
            return graph.get(src).contains(dest);
        }
    
        public Set<T> edgesFrom(T node) {
            // Check that the node exists.
            Set<T> edges = graph.get(node);
            if (edges == null)
                throw new NoSuchElementException("Source node does not exist.");
    
            return Collections.unmodifiableSet(edges);
        }
    
        public Iterator<T> iterator() {
            return graph.keySet().iterator();
        }
    
        public int size() {
            return graph.size();
        }
    
        public boolean isEmpty() {
            return graph.isEmpty();
        }
    
        private void validateSourceAndDestinationNodes(T src, T dest) {
            // Confirm both endpoints exist
            if (!graph.containsKey(src) || !graph.containsKey(dest))
                throw new NoSuchElementException("Both nodes must be in the graph.");
        }
    
    }
    
    BoatLocation.java
    public enum BoatLocation {
        WestBank, EastBank
    }
    
    RiverRole.java, implemented Comparable interface in order to be used in a SortedSet
    public class RiverRole implements Comparable<RiverRole> {
    
        public final String name;
        public final boolean canSailTheBoat;
    
        public RiverRole(String name, boolean canSailTheBoat) {
            this.name = name;
            this.canSailTheBoat = canSailTheBoat;
        }
    
        @Override
        public int compareTo(RiverRole o) {
            return this.name.compareTo(o.name);
        }
    
        @Override
        public String toString() {
            return name;
        }
    }
    
    RiverState.java
    import java.util.Collection;
    import java.util.SortedSet;
    import java.util.TreeSet;
    
    public class RiverState {
    
        private SortedSet<RiverRole> westBank;
        private SortedSet<RiverRole> eastBank;
        private BoatLocation boatLocation;
    
        public RiverState(Collection<RiverRole> westBank, Collection<RiverRole> eastBank, BoatLocation boatLocation) {
            this.westBank = new TreeSet<RiverRole>(westBank);
            this.eastBank = new TreeSet<RiverRole>(eastBank);
            this.boatLocation = boatLocation;
        }
    
        public SortedSet<RiverRole> getWestBank() {
            return new TreeSet<RiverRole>(westBank);
        }
    
        public SortedSet<RiverRole> getEastBank() {
            return new TreeSet<RiverRole>(eastBank);
        }
    
        public BoatLocation getBoatLocation() {
            return boatLocation;
        }
    
        @Override
        public String toString() {
            return "State{westBank=" + westBank + ", eastBank=" + eastBank + ", boatLocation=" + boatLocation + "}";
        }
    
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof RiverState)) return false;
    
            RiverState riverState = (RiverState) o;
    
            return eastBank.equals(riverState.eastBank) &&
                   westBank.equals(riverState.westBank) &&
                   boatLocation.equals(riverState.boatLocation);
        }
    
        public int hashCode() {
            int result;
            result = westBank.hashCode();
            result = 31 * result + eastBank.hashCode();
            result = 31 * result + boatLocation.hashCode();
            return result;
        }
    }
    
    FarmerWolfGoatRiddle.groovy
    import graph.DirectedGraph
    import org.codehaus.groovy.runtime.MethodClosure
    
    import static fun.rivercrossing.BoatLocation.EastBank
    import static fun.rivercrossing.BoatLocation.WestBank
    
    public class FarmerWolfGoatRiddle {
    
        /*  Farmer Wolf Goat Puzzle
            A farmer is on the east bank of a river with a wolf, goat and cabbage in his care. Without his presence the wolf would eat
            the goat or the goat would eat the cabbage. The wolf is not interested in the cabbage. The farmer wishes to bring his three
            charges across the river. However the boat available to him can only carry one of the wolf, goat and cabbage besides
            himself. The puzzle is: is there a sequence of river crossings so that the farmer can transfer himself and the other three all
            intact to the west bank?
        */
    
        RiverRole farmer = new RiverRole("farmer", true);
        RiverRole wolf = new RiverRole("wolf", false);
        RiverRole goat = new RiverRole("goat", false);
        RiverRole cabbage = new RiverRole("cabbage", false);
    
        RiverState initialState = new RiverState([farmer, wolf, goat, cabbage], [], BoatLocation.WestBank)
        RiverState finalState = new RiverState([], [farmer, wolf, goat, cabbage], EastBank)
    
        Closure<Boolean> areRolesHarmonisedOnRiverBank = { Collection<RiverRole> roles ->
            // describe clearly what the rules are
            if (!roles.contains(farmer)) {
                if (roles.contains(wolf) && roles.contains(goat)) return false
                if (roles.contains(goat) && roles.contains(cabbage)) return false
            }
            return true
        }
    
    
        public Collection<Deque<RiverState>> findAllPossibleSolutions() {
            DirectedGraph<RiverState> graph = buildCompleteGraph()
    
            Collection<Deque<RiverState>> solutions = []
            Deque<RiverState> currentSolution = new ArrayDeque<RiverState>()
    
            explore(graph, initialState, currentSolution, solutions)
    
            return solutions
        }
    
        // using Depth First Search (Stack Implementation) to identify a solution
        private void explore(DirectedGraph<RiverState> graph, RiverState currentNode, Deque<RiverState> currentSolution, Collection<Deque<RiverState>> solutions) {
            currentSolution.push(currentNode)
            graph.edgesFrom(currentNode).each { RiverState node ->
                if (node == finalState) {
                    // add the current solution, including the final node, to the solutions
                    currentSolution.push(node)
                    solutions.add(new ArrayDeque<RiverState>(currentSolution))
                    currentSolution.pop()
                } else {
                    // recursively explore
                    explore(graph, node, currentSolution, solutions)
                }
            }
            currentSolution.pop()
        }
    
        // using Breath First Search to build the complete graph
        private DirectedGraph<RiverState> buildCompleteGraph() {
            DirectedGraph<RiverState> graph = new DirectedGraph<RiverState>()
            graph.addNode(initialState)
    
            Set<RiverState> visitedNodes = new HashSet<RiverState>()
    
            // use a queue to keep the states to be visited in sequence
            Deque<RiverState> currentStates = new ArrayDeque<RiverState>()
            currentStates.add(initialState)
    
            while (!currentStates.isEmpty()) {
                if (currentStates.peek() != finalState) {
                    def currentState = currentStates.removeFirst()
                    if (!visitedNodes.contains(currentState)) {
                        visitedNodes.add(currentState)
                        def possibleStates = findNextPossibleStates(currentState)
                        def validStates = possibleStates.findAll { RiverState possibleState ->
                            // matched state should not be one of the visited states, and it should pass the validations
                            return !visitedNodes.contains(possibleState) && passRuleValidations(possibleState)
                        }
                        // add successors and edges
                        graph.addNodes(validStates)
                        validStates.each { graph.addEdge(currentState, it) }
    
                        // add to current states so that we can visit them next
                        currentStates.addAll(validStates)
                    }
                } else {
                    currentStates.removeFirst() // remove the final state from the currentStates queue
                }
            }
    
            return graph
        }
    
        private boolean passRuleValidations(RiverState state) {
            return areRolesHarmonisedOnRiverBank(state.eastBank) && areRolesHarmonisedOnRiverBank(state.westBank)
        }
    
        private List<RiverState> findNextPossibleStates(RiverState currentState) {
            // from the last visited state, find out who can sail the boat, and where is the sailor
            List<RiverState> possibleStates = []
    
            transitFrom(currentState, possibleStates)
    
            return possibleStates
        }
    
        private void transitFrom(RiverState currentState, List<RiverState> possibleStates) {
            // check where the boat is
            BoatLocation boatOrigin = currentState.boatLocation
            BoatLocation boatDestination
    
            // we use method closure to get a copy of west bank or east bank roles
            MethodClosure origin, destination
    
            // multiple assignments
            (origin, destination, boatDestination) = boatOrigin == WestBank ?
                [currentState.&getWestBank, currentState.&getEastBank, EastBank] :
                [currentState.&getEastBank, currentState.&getWestBank, WestBank]
    
            def possibleSailors = origin().findAll { RiverRole role -> role.canSailTheBoat }
    
            possibleSailors?.each { RiverRole sailor ->
                // pick 0 passengers onto the boat
                Collection<RiverRole> onBoat = [sailor]
                possibleStates.add(createNextPossibleState(origin, destination, onBoat, boatDestination))
    
                Collection<RiverRole> possiblePassengers = origin() as Collection<RiverRole>
                possiblePassengers.remove(sailor)
    
                // pick 1 passenger onto the boat
                possiblePassengers.each { RiverRole passenger ->
                    onBoat = [passenger, sailor]
                    possibleStates.add(createNextPossibleState(origin, destination, onBoat, boatDestination))
                }
            }
        }
    
        private RiverState createNextPossibleState(MethodClosure origin, MethodClosure destination, Collection<RiverRole> onBoat, BoatLocation boatDestination) {
            Collection<RiverRole> newDestination = destination() as Collection<RiverRole>
            Collection<RiverRole> newOrigin = origin() as Collection<RiverRole>
    
            // boat transits from origin to destination
            newOrigin.removeAll(onBoat)
            newDestination.addAll(onBoat)
    
            return boatDestination == WestBank ? new RiverState(newDestination, newOrigin, boatDestination) : new RiverState(newOrigin, newDestination, boatDestination)
        }
    
    
    }
    
    FarmerWolfGoatRiddleTest.groovy
    import org.junit.Test
    
    import static fun.rivercrossing.BoatLocation.EastBank
    import static fun.rivercrossing.BoatLocation.WestBank
    
    
    class FarmerWolfGoatRiddleTest {
    
        FarmerWolfGoatRiddle riddle = new FarmerWolfGoatRiddle()
    
        @Test
        public void findAllPossibleSolutions() {
            def solutions = riddle.findAllPossibleSolutions()
            // the ordering of solutions will be random
            assert solutions.size() == 2
            assert solutions[0].size() == 8
            assert solutions[1].size() == 8
    
            def assertAllSolutions = { Deque<RiverState> solution1, Deque<RiverState> solution2 ->
                assertFirstSolution(solution1)
                assertSecondSolution(solution2)
            }
    
            // assert with the correct expectations regardless of which is the first solution
            if (solutions[0].contains(new RiverState([riddle.wolf], [riddle.farmer, riddle.goat, riddle.cabbage], EastBank))) {
                assertAllSolutions(solutions[0], solutions[1])
            } else {
                assertAllSolutions(solutions[1], solutions[0])
            }
        }
    
        private void assertFirstSolution(Deque<RiverState> solution) {
            assert solution.poll() == new RiverState([riddle.farmer, riddle.wolf, riddle.goat, riddle.cabbage], [], WestBank)
            assert solution.poll() == new RiverState([riddle.wolf, riddle.cabbage], [riddle.farmer, riddle.goat], EastBank)
            assert solution.poll() == new RiverState([riddle.farmer, riddle.wolf, riddle.cabbage], [riddle.goat], WestBank)
            assert solution.poll() == new RiverState([riddle.wolf], [riddle.farmer, riddle.goat, riddle.cabbage], EastBank)
            assert solution.poll() == new RiverState([riddle.farmer, riddle.wolf, riddle.goat], [riddle.cabbage], WestBank)
            assert solution.poll() == new RiverState([riddle.goat], [riddle.farmer, riddle.wolf, riddle.cabbage], EastBank)
            assert solution.poll() == new RiverState([riddle.farmer, riddle.goat], [riddle.wolf, riddle.cabbage], WestBank)
            assert solution.poll() == new RiverState([], [riddle.farmer, riddle.wolf, riddle.goat, riddle.cabbage], EastBank)
            assert solution.poll() == null
        }
    
        private void assertSecondSolution(Deque<RiverState> solution) {
            assert solution.poll() == new RiverState([riddle.farmer, riddle.wolf, riddle.goat, riddle.cabbage], [], WestBank)
            assert solution.poll() == new RiverState([riddle.wolf, riddle.cabbage], [riddle.farmer, riddle.goat], EastBank)
            assert solution.poll() == new RiverState([riddle.farmer, riddle.wolf, riddle.cabbage], [riddle.goat], WestBank)
            assert solution.poll() == new RiverState([riddle.cabbage], [riddle.farmer, riddle.goat, riddle.wolf], EastBank)
            assert solution.poll() == new RiverState([riddle.farmer, riddle.cabbage, riddle.goat], [riddle.wolf], WestBank)
            assert solution.poll() == new RiverState([riddle.goat], [riddle.farmer, riddle.wolf, riddle.cabbage], EastBank)
            assert solution.poll() == new RiverState([riddle.farmer, riddle.goat], [riddle.wolf, riddle.cabbage], WestBank)
            assert solution.poll() == new RiverState([], [riddle.farmer, riddle.wolf, riddle.goat, riddle.cabbage], EastBank)
            assert solution.poll() == null
        }
    }