# Archive for the 'Java' Category

### A TestNG Listener can register itself with TestNG

The TestNG documentation lists 5 ways of registering a TestNG listener.  Here’s a sixth, that’s sometimes useful:

If your listener is specific to a test suite (it’s not really general purpose) or if you don’t mind dropping the source into each test suite, then relying on the fact that the @Listeners annotation makes the named listener(s) available to the entire suite (not just the one annotated class) you can make your listener itself be a test class!

• Put your listener class in the same place as the rest of the test classes in your suite (or name it in your testng.xml, or whatever).
• Put an @Listeners annotation in front of the class, referring to itself.
• Add a method annotated with @Test inside the listener class.

Now the listener class will be identified as a test class, and it’s @Listener annotation will be respected, and thus it itself will be a listener.

Note that your test method in the listener class can have @Test(enabled=false) (so it doesn’t pollute your results) and it still works!

### Breadth-First Traversal with Alternating Directions at Each Level

Yesterday I discussed the interview coding problem I blew: Code, in Java, a breadth-first traversal of a binary tree, alternating directions at each level.

The key observation is obvious: At each level you process nodes in the reverse order of the previous level.  (Duh! That’s the problem statement.)  To turn it around: At each level you must preserve for later processing the child nodes in reverse order of the nodes you’re processing at this level.  A data structure which let’s you pull off nodes in reverse order in which you saved them is a stack, one of the simplest data structures of all. Continue reading ‘Breadth-First Traversal with Alternating Directions at Each Level’

### Two coding interview questions done badly, for different reasons

In my recent job search I was asked very few coding questions.  Not sure why not—I was interviewing for individual contributor roles as per usual.  Anyway, two coding questions stand out for basic errors I made. One was at the whiteboard, one was a “homework” problem.

The whiteboard problem had a very simple statement: Code, in Java, a breadth-first search, but switch the traversal direction at each level.  A nice twist on a typical problem!  I did it by inserting a sentinel into the standard queue of pending nodes.  Using a Java Queue<Object> I inserted an (arbitrary) Integer as the flag, but it would have been just as reasonable to use a Queue<Node> and insert the tree root as sentinel.  Remember to check for termination: if you’ve just dequeued the sentinel then if the queue size is zero, you’re done.  Very short and sweet—but wrong! It hit me like a ton of bricks as soon as I drove off!  It doesn’t traverse alternate levels in different directions, but only the nodes of alternate levels in different directions.  Bungled!  The error, my fault, was that I didn’t fully work things out, with an example, before I started coding.  Had I, I would immediately have caught my error.  In my defense, the interviewer and I were under very tight time constraints-it was a screening interview, it started late (scheduling mishap at their end), and we did a bunch of design questions first.  On the other hand, he seemed very happy with the solution: he complimented it and said I would be invited back for a full loop (though I bailed on that for unrelated reasons).  (Anyway, the correct solution. which immediately followed my shocking realization, is here.)

Now, what is it that you should never do with an interview homework problem? Continue reading ‘Two coding interview questions done badly, for different reasons’

### Comment on: A Memory-Efficient Queue for Primitive Types (in Java)

Have you ever designed and coding something and got it working and felt really satisfied and then went to bed and woke up then next morning with the sparkling clear thought in your mind that all of that work was unneeded, that there was a much better and much simpler way to do the exact same thing?   Did that happen after you had blogged about your cool piece of work?  Well, it just happened to me.

Well, in my previous post I carefully justified a queue that directly contained primitive types, in order to have a queue that could hold hundreds of millions of elements.  That’s alright, and the discussion of memory consumption and when to optimize it is good, but …

It is now clear as can be to me that the right way to handle this queue is to write my tuples-of-primitive-types to a file, not keep them in memory at all, and to run the file as a queue.  The code is very simple – here it is: externalqueue.zip.  Basically, I open a RandomAccessFile and keep track of head and tail filePointers.  The queue class is parameterized by a small interface that the caller must pass in to the constructor, where the caller takes responsibility for serializing and deserializing his element objects to via a DataInput or DataOutput.  (This isn’t official Java serialization, where objects are tracked.  This queue is meant largely for “struct-like” things.)

This simple class is enough for my needs but it has a limitation that might need to be fixed, depending on the usage.  The external file grows to contain all the elements that have ever been written to the queue.  It is only truncated when the queue becomes empty.  For my application (the retrograde analysis of the 15-puzzle) the queue never becomes empty until the algorithm terminates, yet this is fine since all the enqueued elements will still fit on disk (it should take no more than 5-7Gb).  But in other applications it might be desirable to enhance this queue to start using alternate files once the files reach a certain threshold.  Then you can start deleting files once the head pointer (next element to be dequeued) advances past the end of a file and you rotate to the next one.

### A Memory-Efficient Queue for Primitive Types (in Java)

Data structures in Java can take more memory than equivalent structures in C++ or C#, for various reasons, including general per-object overhead, and the dichotomy between primitive types and objects. For many applications that doesn’t really matter, but for some the excess memory usage in Java is critical and can mean the difference between success and failure.

I’m studying combinatorial search techniques now, using (for some reason) Java. At this point I’m using retrograde analysis to compute pattern databases for the 15-puzzle. (Retrograde analysis = searching backward from the goal state.) The easiest algorithm for this uses a breadth-first search of all positions from the goal.

Breadth-first search is done by enqueuing states-next-to-search onto a queue, and processing them one by one off the queue. For a lot of problems the queue size gets prohibitively large and can’t be used, which is why IDA* and other algorithms that go depth-first have been developed.

For building pattern databases for the 15-puzzle the queue can get quite long, but should still be tractable if care is taken. For Java, in particular, you can’t just blindly use the standard collection classes.

The two main Java classes that implement the interface Queue<E> are ArrayDeque, which is based on a Java array, and LinkedList, which is a typical linked list with external links.

Suppose, for the sake of argument, that we’re working with a 32-bit OS, and with queues of 100 million elements.

Well, if you’re talking 100 million distinct objects you’re already in trouble. Each object takes 16 bytes, minimum, and your 100M objects will need at least 1.6G of memory, more heap than you can get (with the Sun Hotspot VM). But maybe you’re talking primitive types, like long. (A 15-puzzle state will fit into a long – a 64-bit word: 16 tile positions of 4 bits each.) 100M longs will fit in 800M bytes of memory, if placed in an array, so your queue could work.

Except for a few things. First, both ArrayDeque and LinkedList use as their representation type the type that they’re instantiated with. Generics in Java can’t be instantiated with primitive types, so you need to use, e.g., Long instead of long. This means boxing all of the longs you want to put in the queue, which means you’re back to separate objects of 16 bytes instead of array slots of 8 bytes. (And of course, things are worse, proportionally, if you want a queue of int or byte. Why would you want a queue of 100M bytes given that there are only 256 unique values for a byte? Answer: If you really want to have a queue of a tuple of long and byte, and you’re going to run it as two queues of primitive objects, rather than one queue of a reference type. Which is the case for the breath-first search in the 15-puzzle.)

In addition to your boxed elements, the LinkedList has a 16-byte object for every element in the queue. So each element in the queue actually takes 32-bytes, and furthermore, is a separate object to manage.

That point is also important: In an experiment I ran with a 1Gb heap space, the test program started thrashing in the garbage collector and made no further progress after only 33,740,000 Longs were allocated and put into an ArrayDeque. (That’s only 540Mb, there should have been plenty of space left.)

Anyway, to make a long story short, I implemented a class, CompactQueue, that works with either primitive types or reference types, and, if using a primitive type, stores the enqueued elements directly into an array without boxing them. The operations of add() and remove() are constant time. (By the way, this isn’t true of ArrayDeque where add() is only amortized constant time because of the need to occasionally reallocate the array holding the elements, if the queue size increases.)

Using this class, and the heap size set to 1000Mb, I was able to put 126M longs, or 1012M bytes, into the queue. And there’s no GC performance problem caused by having 126M separate objects to manage (252M for the LinkedList!).

The class uses two techniques: First, it is parameterized by a array wrapper class that provides a factory for arrays of primitive (or reference) type, and array-like get() and set() operations. And second, it uses many smaller arrays (“blocks”) to store the elements of the queue, instead of one large array (as in ArrayDeque) or an object-per-element (as in LinkedList). This means that it can smoothly expand to take all necessary (available?) memory without hitting a roadblock when the array size doubles (as in ArrayDeque).

Code is provided here. (Note that only the array wrapper classes for byte and long are provided, array wrapper classes for the other primitive types are left as an exercise for the reader.)

### An Unbounded Spigot Algorithm for Pi – the Java version

Some time ago I read Jeremy Gibbons’ article “Unbounded Spigot Algorithms for the Digits of Pi” and liked it a lot- the problem was amusing, and the article was easy to underestand.

A spigot algorithm is an algorithm for producing digits of an unbounded sequence without reusing digits after they have been computed.  The digits are produced “one by one, as if from a leaky tap” (Gibbons).  An unbounded spigot algorithm doesn’t need to have some predetermined number of digits to run on—it’ll just keep going and going until you run out of memory.

I was recently experimenting with programming an FPGA and thought of this paper; my idea was that a little coprocessor (the FPGA) hanging off my server could just start producing digits of pi and keep running indefinitely.  Even though it isn’t the fastest way go generate digits of pi (not by a longshot) it still seemed like fun.  And I thought of a nice “agent” architecture where I’d take the digits and run them past an array of recognizers, each one to do some kind of statistics on the digits of pi, or look for special sequences like ten digits in a row made of one each of the ten digits.  Those recognizers would all run in hardware, in parallel.

I remembered there was a “catch” about spigot algorithms, so I reread the paper.  And the catch is that you obviously can’t generate digits of pi, an irrational (also transcendental of course), from a finite process.  So even though the algorithm can be easily specified and for sure will generate one digit at a time, the implementation might need to use arbitrary amounts of memory to produce that digit.  In fact, the memory complexity is hidden inside the bignum rational arithmetic that the algorithm uses.

Now, this wasn’t necessarily a stopper.  I thought that maybe the problem was that as digits were generated the bignum rational arithmetic was generally done on reasonably sized bignums (maybe several thousand bits of numerator or denominator), and only every once in awhile did you have a few terms which blew up to much larger sizes.  In that case you could mainly run the algorithm on the FPGA but which, when it discovered a multiplication that resulted in a too-large-for-the-hardware result, would trap back to the server which would run that particular multiplication with much more resources available, and would then pass the result back to the FPGA, which could continue.

But no, that wasn’t to be.  I experimented with an implementation in Java and discovered that the bignums get larger and larger from the very beginning, and even if you not only reduce each bignum to smallest terms, but also do the same for the 2×2 matrix that the algorithm uses, there’s no use.  You’re doing really big bignum arithmetic after the first few terms.

Anyway.  I was happy to see that the Haskell algorithms in the paper transferred very easily and very directly to Java (once I supplied a BigRational class).  Here’s the code: spigot.tar

One curious bug kept me puzzled for hours: I got bit by a 32-bit integer overflow.  Turns out if you’re computing 3(3i+1)(3i+2)(5i-2) for i from 1 to ∞ … well, that simple expression overflows along around i = 280.  I didn’t expect it at all and looked everywhere else in my bignum-using expressions for it.

At this time I’d like to point out a couple of typographical errors in the programs in the Gibbons’ paper cited above, “An Unbounded Spigot Algorithm for Pi”, so you can save some time if you want to investigate this algorithm yourself.

• pg 7, definition of extr—the / should be %
• more crucially, pg 9, definition of next+15 should be -12, also in conjecture 1, definition of n

### Object-Oriented Permutation Generation

A classic interview coding question is—or at least used to be—to write a program to generate all the permutations of a string.

I used to like to get it because it had a simple and elegant recursive solution that was easy to right the first time, when writing it on the whiteboard. And coming up with a good recursive solution in an interview used to impress interviewers, especially if you explained it nicely in a “mathematical induction” sort of way.

Recently I had two additional thoughts about that easy solution. The first was that recursive solutions were sort of the epitome of procedural programming, but what would be the object-oriented variant, and would it be as easy to write it correctly at the whiteboard?

The second thought was a bit more interesting: suppose you actually needed a permutation generator in your program (I never have) then you probably would not want the recursive solution anyway, which would generate all permutations—and process them—before returning to the caller. Instead, you’d want a demand-oriented, that is, iterator-style, solution that you could pull solutions out of when you wanted to. That sort of solution would be more difficult to express in a procedural language, and you couldn’t easily change a recursive solution into an demand-oriented solution, but the object-oriented solution might be easier to adapt. And could you still do it at the whiteboard?

I was going to make a blog post about this—but it turned into quite the megilla and I put it on CodeProject as a tutorial article, here.

In that article I not only show the object-oriented data-pipeline version that corresponds directly to the procedural (recursive) solution, but I also show two variants of it in “pull-style” as a iterator.

However, here is the object-oriented solution to generating permutations. It is in fact as simple as the recursive solution, and perhaps even easier to get right the first time while standing at the whiteboard. I don’t know why it hasn’t been seen, as an example, earlier.