Tuesday, September 27, 2011

Complexity on the workbench

Today’s technical systems contain more and more components which are typically networked and interacting with each other. So, these systems become very complex, which makes it difficult to engineer and maintain the system using traditional, hierarchical approaches.
Looking into complex systems in nature, we see that they are controlled by distributed self-organizing mechanisms that are simple, scalable, robust, and adaptive. However, putting a self-organizing approach into technical systems is not straightforward, because such complex systems are typically hard to predict. A particular change in an interaction mechanism might even have counter-intuitive effects.
In nature, the driving mechanism behind building self-organizing behavior is evolution - why not use the very same method in form of an evolutionary algorithm?
However, there is a need to integrate different tools and models like neural networks, mutation and recombination, and problem-specific simulations. With our tool FREVO we provide a unifying framework to reduce this problem to basically three components: a problem representation, an agent representation and an evolutionary algorithm.
FREVO has been used to solve quite different problems and is available as open source to everyone. It is a very flexible framework open to new components and simulations, thus, we are looking forward to see you testing your ideas with it :-)

This talk was given by István Fehérvári at FET 2011 in the science café. This work was supported in part by the Lakeside Labs project MESON (Modeling and Engineering of Self-Organizing Networks) and the Lakeside Labs GmbH.

Thursday, September 22, 2011

Replacing the Java random generator

The Java random number generator was implemented in 1995. It is based on a linear congruential generator. Since then, there was considerable advancement in algorithms for pseudo random number generators. For most applications, Java's random number generator might be sufficient, but at a closer look, the algorithm has a fairly short period of 248 and fails some tests on the randomness of its output.
The good news is that there is an algorithm which is faster and better: the Xorshift algorithm. Xorshift is using a few (in the following implementation exactly three) of shift and exclusive-or operations. The best way to integrate it to Java is to make a subclass of java.util.random and to overwrite the seed variable and the next() method:

import java.util.Random;

 * A subclass of java.util.random that implements the 
 * Xorshift random number generator

public class XSRandom extends Random {
 private long seed;

 public XSRandom(long seed) {
  this.seed = seed;

 protected int next(int nbits) {
  long x = seed;
  x ^= (x << 21);
  x ^= (x >>> 35);
  x ^= (x << 4);
  seed = x;
  x &= ((1L << nbits) - 1);
  return (int) x;

Since all methods of the Random generator (nextBoolean(), nextInt(), nextLong(), nextFloat(), nextDouble()), nextBytes(), nextGaussian()) depend on the next() method, this efficiently changes all number generation to the Xorshift algorithm.
A complete Java class including also a clone() method can be downloaded here (Code is under LGPL Version 3). Note that this implementation, unlike the java.util.Random is not exactly thread-safe - concurrent threads might access and change the seed variable inconsistently. However, note that concurrent access on the same random object would anyway end up in a nondeterminstic sequence of numbers for each thread.
This implementation is about 30% faster than the generator from java.util.random. It's output passes the Dieharder test suite with no fail and only two announced weaknesses. To use the class in legacy code, you may also instantiate an XSRandom object and assign it to a java.util.Random variable:
  java.util.Random rand = new XSRandom();
All method calls to rand are then using the newer, faster implementation.

Tuesday, September 20, 2011

Pseudo random number generators

In the previous post, we learned that a complex system can exhibit deterministic, but unpredictable behavior. Actually, this aspect has a strong application in computers that has been around for a long time: random number generators.
Many computer programs, like for example games or Monte Carlo simulation require a random function. Typically, a computer operating on defined binary states that only change at particular clock instants has no true randomness - its operations are deterministic. So, typically pseudo random number generators are used to generate number sequences similar to random ones.
In 1946, John von Neumann suggested such an algorithm named the middle-square method. Von Neumann's algorithm is simple: take a number as the "seed", square it, remove the middle digits of the resulting number (when viewed in decimal format) and take the remaining digits as your "random number". This number becomes also the new seed. The algorithm is fast, but statistic tests on the generated random numbers show that the distribution of numbers significantly differs from true random numbers. Worse, once the seed becomes zero, the algorithm gets stuck.
A few decades after Neumann, the programming language C came with a library function for generating random numbers using the functions rand (to pull a number) and srand (to set the seed). For generating one number, the algorithm linearly combines several terms based on constants derived from the seed. This means a higher implementation effort, but the test passes most statistical tests on randomness, such as for example are given ba the Dieharder random test suite.
The famous home computer Commodore 64 had a different algorithm for random number generation, which was simpler: The last seed is multiplied with a big number, a small number is added and the resulting number (given in a floating point format) is mixed up by switching bytes. Although the C64 algorithm was considered practically useful in a publication from Bowman, the Diehard test quickly shows a lot of deficiencies of the algorithm.
In 1997, Makoto Matsumoto and Takuji Nishimura came up with a very good algorithm, the Mersenne Twister. While this algorithm passes most of the Dieharder tests, the implementation has also grown in its complexity. For example, a common implementation of the algorithm requires an array of 624 numbers.

So - do you think now there is a tradeoff between quality of randomness and computation effort? Wrong!

The late (and great) George Marsaglia showed us a simple way to generate high quality numbers: the Xorshift algorithm.
public long randomLong() {
  x ^= (x << 21);
  x ^= (x >>> 35);
  x ^= (x << 4);
  return x;
The computational effort of the algorithm is obviously very low. If you chose "good" values for the three magic numbers (e.g., 21, 35, 4 as in the above example) Xorshift passes the Diehard tests. As it is the case in many complex systems, there is a simple way to control the systems behavior as intended, but is very difficult to find that simple way. In the case of pseudo random numbers it took 60 years from John von Neumann's middle-square to George Marsaglia's Xorshift.
  • J. von Neumann. "Various techniques used in connection with random digits", in A.S. Householder, G.E. Forsythe, and H.H. Germond, eds., Monte Carlo Method, National Bureau of Standards Applied Mathematics Series, 12:36-38, reprinted 1951.
  • M. Matsumoto and T. Nishimura. "Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator". ACM Transactions on Modeling and Computer Simulation 8(1):3–30, 1998
  • R. Bowman. Evaluating Pseudo-Random Number Generators. Comput. &amb; Graphics, Vol. 19, No. 2, pp. 315-324, 1995.
  • G. Marsaglia. "Xorshift RNGs". Journal of Statistical Software Vol. 8 No. 14, 2003.

Wednesday, September 7, 2011

Langton's Ant - from simple rules to complex behavior

The following system is created using a very simple set of rules:
  1. Let's assume an infinite 2D grid world. Each grid cell can be either black or white.
  2. For simplicity, all grid cells are white at the beginning.
  3. There is an ant, which can move up to four directions (N,E,S,W).
  4. Whenever the ant enters a white field, it toggles the grid color and performs a clockwise (right) turn.
  5. Whenever the ant enters a black field, it toggles the grid color and performs a counter-clockwise (left) turn.
You got the rules? Let's play! Get yourself some squared paper, a pencil and a rubber and start drawing. Or, if you are lazy, just use the Java applet below. Unfortunately I failed in simulating the infinite grid, so the simulation reverses whenever the ant leaves the system boundary, but this does not make a difference for the first 10000 iterations. You may use the mousewheel to scroll and modify the simulation speed in order to investigate what the ant is up to.

Despite the simple rules, the ant is following an interesting and unexpected path. This relates to what Carlos Gershenson has pointed out in his blog: a deterministic system does not necessarily mean that it is predictable. A system is deterministic if no randomness is involved in the development of future states of the system. Thus, given a particular starting condition, the system will always develop its future states in the same way. The real physical world, considering the randomness in quantum effects, is not deterministic. But the simulated ant in our example is. On the other hand, prediction means the ability to make a forecast of a system's state (either qualitatively or quantitatively). These two terms represent different concepts and should not be mixed up.

In our deterministic ant system, it is not possible to predict the behavior by looking at the rules unless you have tried out the simulation beforehand. Although we have just one agent, we face a complex system, since the agent is interacting with every single cell it arrives. And each move changes the cell and orientation, thus creates new information. The interesting and unexpected thing is what the ant is doing after roughly 10200 steps. Until then the movement is chaotic. But subsequently, the ant produces a "highway" by walking away to the south-east in repeated sequences of 104 steps. From that point on, the future system states can be predicted for all future states.
If you liked this, check out the work of Propp et al. on Langton's ants with multiple grid colors. They present a three-color model where it is not known if the ant will ever enter a stage producing a predictable highway. So far on deterministic models - in your face, predictability!