Harald Kirsch

genug Unfug.

2014-10-29

Use Cases for ValEx

The last blog entry introduced the idea of ValEx as a variation of Java's Optional. It can be initialized not only with a value, but alternatively with an exception which explains why no value is available. The central method of this class is the getter for the contained value:

    public class ValEx<T,E extends RuntimeException> {
      ...
      public T get() throws E {
        if (t==null) throw e;
        return t;
      }
    }

It allows the caller to decide how sure (s)he is that a value is available. If, from our code structure, we are sure that there is a value, we can just call get() without first checking with isEmpty(). If we are wrong, this is a bug and an unchecked exception is thrown. How does this feel in practical use? Let's look at some typical exceptions in Java.

Exceptions When Parsing Strings

There are many cases where strings are parsed or converted into objects. This includes parsing dates and regular expressions, or even very simple things like getting a Charset for an encoding name. As already shown in the last blog entry, converting static strings should not throw a checked exception. Who has not yet written code like

      try {
        ...
        Writer w = new OutputStreamWriter(out, "UTF-8");
        ...
      } catch (UnsupportedEncodingException e) {
        log.error("how can UTF-8 be missing", e);
        ...
      }

and wondered why the wrong character set results in a checked exception. The solution here is actually to use a slightly different call:

      Writer w = new OutputStreamWriter(out, Charset.forName("UTF-8"));

where forName() throws an unchecked exception that will only be thrown if things are getting weird. Does ValEx have an application here? Well, not as long as 100% of the cases involve static strings. But suppose the character set name is read from a property.

      String csName = System.getProperty("application.charset.name");
      Writer w = new OutputStreamWriter(out, Charset.forName(csName));

This code is now missing a try/catch, because it is not improbable that a system property contains a wrong string. In this particular case we can switch back to not using forName, but with a hypothetical factory method returning a ValEx we can have both easily. With a static string

       Writer w = Files.streamWriter(out, "UTF-8").get();

we get an exception from the get() in case we have a typo in the code. And with a character set name from a property

      String csName = System.getProperty("application.charset.name");
      ValEx<Writer,?> vw = Files.streamWriter(out, csName);
      if (vw.isEmpty()) {
        // handle the problem, possibly re-throwing
        throw vw.getException();
      }
      Writer w = vw.get();

we can first check whether we got something back. So the ValEx allows us to have both: the convenience of an unchecked exception, with the minor price to pay being the unshielded get() call, as well as the awareness of a potentially unsuccessful operation that can be checked for with isEmpty().
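To make the two call styles concrete, here is a minimal sketch of such a factory. The names Writers.streamWriter, failed() and the demo class are assumptions made up for this illustration (the post only imagines a Files.streamWriter); the only real library call is Charset.forName(), whose unchecked exceptions all derive from IllegalArgumentException.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

// Minimal ValEx: holds either a value or the unchecked exception explaining its absence.
final class ValEx<T, E extends RuntimeException> {
  private final T value;
  private final E exception;

  private ValEx(T value, E exception) {
    this.value = value;
    this.exception = exception;
  }

  static <T, E extends RuntimeException> ValEx<T, E> of(T value) {
    return new ValEx<>(value, null);
  }

  static <T, E extends RuntimeException> ValEx<T, E> failed(E exception) {
    return new ValEx<>(null, exception);
  }

  boolean isEmpty() { return value == null; }

  E getException() { return exception; }

  T get() throws E {
    if (value == null) throw exception;
    return value;
  }
}

// Hypothetical helper playing the role of the imagined Files.streamWriter().
final class Writers {
  static ValEx<Writer, IllegalArgumentException> streamWriter(OutputStream out, String csName) {
    try {
      return ValEx.of(new OutputStreamWriter(out, Charset.forName(csName)));
    } catch (IllegalArgumentException e) {
      // Covers illegal, unsupported and null charset names.
      return ValEx.failed(e);
    }
  }
}

final class StreamWriterDemo {
  public static void main(String[] args) {
    OutputStream out = new ByteArrayOutputStream();
    // Static string: call get() directly; a typo would surface as an unchecked exception.
    Writer w = Writers.streamWriter(out, "UTF-8").get();
    // Name from configuration: check before use.
    ValEx<Writer, IllegalArgumentException> vw = Writers.streamWriter(out, "no-such-charset");
    System.out.println(vw.isEmpty() ? "unusable charset: " + vw.getException() : "writer ready");
  }
}
```

Note that in this sketch the exception is still created inside Charset.forName(); the ValEx only changes who decides whether it gets thrown.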

IOException

An IOException is probably one of the most frequent exceptions to deal with. Let's see whether ValEx can help here too. With a hypothetical factory method again, assume we had

      String readFile(String name) throws IOException {
        ValEx<Reader,IOException> vr = Files.newReader(name);
        if (vr.isEmpty()) {
          throw vr.getException();
        }
        Reader r = vr.get();
        ...
      }

Is this any better than the try/catch version? Probably not if used this way. What still bugs me is that an exception is prepared in newReader() and eventually thrown despite the fact that absolutely nothing exceptional is going on. Opening a file and finding that this cannot be done is completely normal business. Even if we just wrote the file, some other process or thread could already have intervened and messed with the file in all kinds of ways that prevent us from opening it. That is normal, not exceptional!

In this case it may be worth considering this implementation.

    ValEx<String,String> readFile(String name) {
      ValEx<Reader,String> vr = Files.newReader(name);
      if (vr.isEmpty()) {
        return ValEx.empty(vr.getCause());
      }
      Reader r = vr.get();
      ...
    }

Here I start to change my mind with regard to how exactly ValEx should be implemented. In the previous blog entry I proposed to always use an exception as the description of what went wrong. But this forces the provider of the ValEx to create an exception with its full stack trace, even for normal business. Therefore I would now rather implement ValEx with a completely unrestrained second generic argument.

    public class ValEx<T,Cause> {
      ...
      public static <T,Cause> ValEx<T,Cause> of(T value) {
        return new ValEx<>(value, null);
      }
      public static <T,Cause> ValEx<T,Cause> empty(Cause cause) {
        return new ValEx<>(null, cause);
      }
    }

The get() method becomes a bit more involved, but would basically throw an IllegalStateException with the Cause either as the message or as the cause of the exception.
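One possible reading of this get() is sketched below, under stated assumptions: the type parameter is shortened to C, getCause() and the helper FileReading.readFile are names invented for the example, and the choice to use a Throwable cause as the exception's cause and anything else as its message is just one way to fill in "basically".

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// ValEx with an unrestrained cause type C (a String, an exception, an enum, ...).
final class ValEx<T, C> {
  private final T value;
  private final C cause;

  private ValEx(T value, C cause) { this.value = value; this.cause = cause; }

  static <T, C> ValEx<T, C> of(T value) { return new ValEx<>(value, null); }
  static <T, C> ValEx<T, C> empty(C cause) { return new ValEx<>(null, cause); }

  boolean isEmpty() { return value == null; }
  C getCause() { return cause; }

  // Calling get() on an empty ValEx is treated as a programming error.
  T get() {
    if (value != null) return value;
    if (cause instanceof Throwable) {
      throw new IllegalStateException("no value available", (Throwable) cause);
    }
    throw new IllegalStateException(String.valueOf(cause));
  }
}

// Hypothetical helper: failure to read a file is reported as a plain message.
final class FileReading {
  static ValEx<String, String> readFile(String name) {
    Path path = Paths.get(name);
    if (!Files.isReadable(path)) {
      return ValEx.empty("cannot read " + name); // normal business, no exception created
    }
    try {
      return ValEx.of(new String(Files.readAllBytes(path), StandardCharsets.UTF_8));
    } catch (IOException e) {
      // The file may still vanish between the check and the read.
      return ValEx.empty(e.toString());
    }
  }
}
```

A caller that expects the file to be missing now and then checks isEmpty() and logs getCause(); a caller that is sure the file exists just calls get().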

Wrapup

It looks like ValEx allows us to combine the benefits of checked exceptions (visibility in the API) with the convenience of unchecked exceptions thrown only when the cause is a programming problem. By allowing ValEx to store a cause also just as a message, with no exception, the expensive exception generation can be avoided where a failure to provide a value is normal and expected. Still, if it is necessary to log the case, ValEx improves over Optional in that it can provide the cause why no value is available.

2014-10-27

To check or not to check, what is your Exception

Several years ago, Robert C. Martin declared in Clean Code that the debate is over with regard to the use of checked or unchecked exceptions in Java. I bought into this opinion for some time until it really got in the way recently, when an otherwise nice library to simplify JDBC use caught the SQLException deep within and converted it into a general RuntimeException. Since I wanted to catch the SQLException to perform a rollback, I was forced to dig into the code and follow all possible paths to see which kinds of unchecked exceptions may be thrown where, and then decide whether or not it is sensible to perform the rollback.

While the library declares its own UncheckedSqlException, I was still forced to look through the code to find out. But in case I missed one of the conversions to unchecked, the only safe way is to catch RuntimeException and then figure out whether it is a real runtime exception to bubble up or just an unchecked one that should trigger the rollback.

"What do you mean by real runtime exception?", I hear you ask. Well, the Java documentation puts it this way: Runtime exceptions represent problems that are the result of a programming problem, the canonical example being the NullPointerException. You could as well call it the result of a programmer's mistake. This seems clear-cut enough, doesn't it?

Then let's try this seemingly simple definition on an example: Pattern.compile(regex) throws an unchecked PatternSyntaxException if the string provided does not parse as a regular expression. So let's consider the code

      Pattern varname = Pattern.compile("[A-Za-z][0-9");

which is missing a close bracket in the regular expression. Clearly a programming problem, a programmer's mistake, so the unchecked exception is well deserved.

But now consider reading regular expressions from a configuration file. The programmer has no control over the strings found and whether they are well-formed regular expressions or not. For the pattern to not compile now is not a programming problem, but completely normal behavior, because the programmer has no means to check whether a string will compile other than by trying it. And we certainly do not want an additional method isWellFormedRegex(String s), because checking for well-formedness is as expensive as just trying to compile the string.

This example shows that, under the definition from the Oracle tutorials, for some APIs neither checked nor unchecked exceptions are correct for all use cases. Interestingly, parsing a date with DateFormat.parse() throws a checked exception, while the new LocalDate.parse() again throws an unchecked exception. In particular for date parsing I think it happens much more often on input data than on static strings, so I would prefer a checked exception.

Now what? When even the creators of the Java core class libraries are unsure, how am I supposed to know what exceptions to use? The string parsing examples show it quite well. When parsing static strings, a parse error is really a programming problem and deserves an unchecked exception. When parsing input data, however, a parse error is completely normal business. Wait, what did I just say, normal business? Why do the methods then throw an exception at all, when nothing exceptional is going on?

In C there are no exceptions (just core dumps). Functions that want to signal that they cannot return the expected value do so by returning a special value, often -1, and setting a global error code. Not that I want to go back there, but the idea to either return a value or a description of what went wrong, in particular when it is normal business that something goes "wrong", can be implemented in Java quite elegantly. Java 8 gave us Optional, but it only provides a value, or not. It does not allow passing an explanation of why a value is not available, which is what exceptions do. But we can roll our own.

We need something like Optional which does not only hold a value, but can also provide an explanation if no value is available. Since we are so used to exceptions, the explanation could be an exception with the full stack trace. An alternative could be just a message, but let's go for the exception first. In German I would not mind calling the class Üei, which is short for Überraschungsei = Kinder Surprise, but let's call it ValEx, because it can contain a value or an exception.

    public class ValEx<T,E extends RuntimeException> {
      private final T t;
      private final E e;
      public ValEx(T t) {this.t = t; this.e = null;}
      public ValEx(E e) {this.t = null; this.e = e;}
    }

This class can be instantiated with either a value or an exception. Now for the methods. Naturally we must be able to get the value.

    public class ValEx<T,E extends RuntimeException> {
      ...
      public T get() throws E {
        if (t==null) throw e;
        return t;
      }
    }

But we also want to be able to test whether there is a value available to avoid the exception.

    public class ValEx<T,E extends RuntimeException> {
      ...
      public boolean isEmpty() {
        return t==null;
      }
    }

Of course we can easily add more convenience methods like a get(T defaultValue) and a getter for the exception. Now suppose that a parsing method for regular expressions was declared to return a ValEx object like this:

      ValEx<Pattern,PatternSyntaxException> compile(String s);

Then we could now call it with a static string like so:

      Pattern p = Pattern.compile("[a-z]+").get();

Since PatternSyntaxException is an unchecked exception, we get what we deserve if we pass a string that is not a regular expression. On the other hand, if we are compiling a property value we got from a file, we would use

      String regex = props.get("pattern");
      ValEx<Pattern,PatternSyntaxException> p = Pattern.compile(regex);
      if (p.isEmpty()) {
        // do what is necessary, like logging the wrong pattern
        LOG.warn("ignoring wrong pattern "+regex, p.getException());
        return;
      }

Whether the explanation when no value is available should always be an exception is debatable. Creating exceptions with their whole stack trace is not a lightweight operation (so I heard). Another aspect of this implementation is that even on the fast, normal path of the computation, we always create the ValEx object, which is ready to be garbage collected very soon after.
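The fragments of this post can be assembled into one compilable sketch. Since the real Pattern.compile() cannot be changed, the wrapper Regexes.compile is a hypothetical stand-in for the imagined ValEx-returning method; the static factories of() and failed() are likewise assumptions, used here instead of the two constructors.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class ValEx<T, E extends RuntimeException> {
  private final T t;
  private final E e;

  private ValEx(T t, E e) { this.t = t; this.e = e; }

  static <T, E extends RuntimeException> ValEx<T, E> of(T t) { return new ValEx<>(t, null); }
  static <T, E extends RuntimeException> ValEx<T, E> failed(E e) { return new ValEx<>(null, e); }

  boolean isEmpty() { return t == null; }
  E getException() { return e; }

  // Throws the stored exception if no value is available.
  T get() throws E {
    if (t == null) throw e;
    return t;
  }

  // Convenience getter that never throws.
  T get(T defaultValue) { return t == null ? defaultValue : t; }
}

// Hypothetical wrapper standing in for a ValEx-returning Pattern.compile().
final class Regexes {
  static ValEx<Pattern, PatternSyntaxException> compile(String regex) {
    try {
      return ValEx.of(Pattern.compile(regex));
    } catch (PatternSyntaxException ex) {
      return ValEx.failed(ex);
    }
  }
}

final class RegexDemo {
  public static void main(String[] args) {
    // Static string: an error here is a programmer's mistake, so just call get().
    Pattern p = Regexes.compile("[a-z]+").get();
    System.out.println(p.matcher("abc").matches());

    // Input data: a malformed pattern is normal business, so check first.
    ValEx<Pattern, PatternSyntaxException> q = Regexes.compile("[A-Za-z][0-9");
    if (q.isEmpty()) {
      System.out.println("ignoring wrong pattern: " + q.getException().getDescription());
    }
  }
}
```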

2014-10-07

Java equals and canEqual

I stumbled upon canEqual in Scala code and immediately wondered why this is needed. Searching for Java, equals and canEqual brings up tons of hits on Google, but most of them will have some relation to Scala. When reading the first Google hit it becomes clear that the idea of canEqual has its place in Java too.

The article is well written and describes the problem using examples. I thought I would take a more compact and formal approach to concisely describe what an equals method may and may not do, in particular taking inheritance into consideration.

The equals contract

The Javadoc specifies that equals must implement an equivalence relation, which is a relation that is reflexive ($a=a$), symmetric ($a=b\Rightarrow b=a$) and transitive ($a=b\wedge b=c \Rightarrow a=c$).

Ok, symmetric, hmmm? Calling an object's method, like in a.equals(b), is inherently non-symmetric, since it is a's equals that is called and b is "only" a parameter. Suppose

A a = new A(...)

and of course when you implemented A you took utmost care to make its equals symmetric.

Comes along your colleague, half a year later, and writes:

class B extends A {...}

You forgot to make equals a final method to make sure no derived class can ruin your well crafted equals method. And your colleague thinks that B deserves its own, specific equals method. What are the constraints?

Forced to call super.equals

Symmetry, oooohkeeey!? With

B b = new B(...)

he is forced to ensure that when he implements equals such that

b.equals(a) $\to$ true

which uses B.equals, he has to make sure that also

a.equals(b) $\to$ true

which uses the "old" A.equals of your class. Now consider some arbitrary

A a1 = new A(...) such that a.equals(a1) $\to \alpha$ and
B b1 = new B(...) such that a.equals(b1) $\to \beta$.

The transitivity requirement forces your colleague to make sure that comparing b to a1 and b1 returns the exact same results $\alpha$ and $\beta$ as when comparing a. Since the two were arbitrary objects of A and B, this is true for all elements of these two classes. The safest way to get this result is to make sure that b.equals(...) calls super.equals(...). To summarize

Conclusion 1: If an object B b = new B(...) of a subclass of A shall have b.equals(a)$\to$true for at least one a of A, then for this b the parent implementation super.equals() should be called to treat b as if it were genuinely of class A.

The small room for a new (in)equality

If every b has at least one a of A with which it shall be equal, then obviously super.equals would be always called. But that would mean we don't need to implement B.equals in the first place.

Consequently there is at least one B bx = new B(...) which has

bx.equals(a)$\to$false for all A a = new A(...)

This leads to

Conclusion 2: If a subclass B overrides equals of its parent class A, its objects belong to one of two disjoint sets:
  1. those which have at least one B.equals partner from A and
  2. those that are not equal to any element of the superclass.
The second set must not be empty, since otherwise the derived equals need not be implemented.

Enforcing the inequality for the superclass

Due to symmetry, a.equals(bx) must also return false for all objects a created with new A(...). But how can it be that the implementation of A.equals, which was written when no bx yet existed, can do just the right thing when some such new type of object comes along? What A.equals typically does is test if (bx instanceof A), but this returns true for the objects of the derived class and makes no distinction between as and bs.

The solution is the canEqual method featured in the title. But since the mentioned article describes it so well, I don't need to repeat this here.
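For readers who want the gist without following the link, here is a minimal sketch of the canEqual idea as it is commonly presented. The classes Point and ColoredPoint are illustrative assumptions standing in for A and B, and in this simplified version every ColoredPoint belongs to the second set of Conclusion 2, i.e. it equals no plain Point.

```java
import java.util.Objects;

// Plays the role of class A.
class Point {
  private final int x;
  private final int y;

  Point(int x, int y) { this.x = x; this.y = y; }

  // Subclasses that redefine equality override this to refuse plain Points.
  protected boolean canEqual(Object other) { return other instanceof Point; }

  @Override public boolean equals(Object other) {
    if (!(other instanceof Point)) return false;
    Point that = (Point) other;
    // Ask the other object whether it is willing to be equal to this one.
    return that.canEqual(this) && x == that.x && y == that.y;
  }

  @Override public int hashCode() { return Objects.hash(x, y); }
}

// Plays the role of class B with its own notion of equality.
class ColoredPoint extends Point {
  private final String color;

  ColoredPoint(int x, int y, String color) { super(x, y); this.color = color; }

  @Override protected boolean canEqual(Object other) { return other instanceof ColoredPoint; }

  @Override public boolean equals(Object other) {
    if (!(other instanceof ColoredPoint)) return false;
    ColoredPoint that = (ColoredPoint) other;
    return that.canEqual(this) && super.equals(that) && color.equals(that.color);
  }

  @Override public int hashCode() { return Objects.hash(super.hashCode(), color); }
}
```

With this, new Point(1, 2).equals(new ColoredPoint(1, 2, "red")) is false in both directions, because the ColoredPoint refuses via canEqual, while a subclass that does not override canEqual and equals still compares equal to plain Points through the inherited implementation.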

2014-09-21

Photon contained in its own Schwarzschild radius

As a followup to my previous post, where I showed that the Schwarzschild radius $r_s$ of a photon with wave length $$\lambda_o =2\sqrt{2\pi}\, l_p = 2\sqrt{\frac{Gh}{c^3}} $$ is $\lambda_o/2$, I want to add a few simple fun calculations. The symbols used are:

$l_p$: Planck length
$G$: gravitational constant
$c$: speed of light
$h$: Planck's constant

The energy of a photon of frequency $\nu$ is $E_p = h\nu$, where $\nu=c/\lambda$ for a given wave length $\lambda$. With the specific $\lambda_o$ we get \begin{align*} E_o &= h c/\lambda_o \\ &= \frac{1}{2} hc \sqrt{\frac{c^3}{Gh}} \\ &= \frac{1}{2} \sqrt{\frac{h^2 c^5}{Gh}} = \frac{1}{2} \sqrt{\frac{h c^5}{G}} \end{align*}

Using Einstein's formula $E=mc^2$ relating energy $E$ and mass $m$, the mass of this photon is \begin{align*} m_o &= E_o/c^2 \\ &= \frac{1}{2} \sqrt{\frac{h c^5}{c^4 G}} = \frac{1}{2} \sqrt{\frac{h c}{G}} \end{align*} Replacing $h$ by $2\pi\hbar$ we arrive at $$ m_o = \frac{1}{2}\sqrt{\frac{2\pi\hbar c}{G}} = \frac{1}{2} \sqrt{2\pi}\, m_p $$ where $m_p$ is the Planck mass.

So the photon for which one wave length "fits" into its Schwarzschild radius sphere has a wave length of $2\sqrt{2\pi}$ times the Planck length and a mass of $\frac{1}{2}\sqrt{2\pi}$ times the Planck mass. And I wonder if I messed up something here to be left with these silly factors?

I also wanted to know the numerical value of the frequency of such a photon to see where it is located in the electromagnetic spectrum. The frequency is $$ \nu_o = c/\lambda_o = \frac{1}{2} \sqrt{\frac{c^5}{G h}} $$ which Google happily computes for us without the need to type in all those digits for the constants to be $$ \nu_o = 3.70003533\cdot 10^{42}\, \text{Hz} .$$ It is fun to note that this contains the Answer to the Ultimate Question of Life, the Universe, and Everything.

A more serious note is that this frequency is 23 orders of magnitude beyond gamma rays, where Wikipedia's description of the electromagnetic spectrum ends. My hunch is that we are not going to generate such a photon anytime soon.
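The numbers in this post can also be checked without Google. The following sketch just evaluates the formulas above; the class name is made up, and the constants are the CODATA values as far as I know them ($h$ and $c$ are exact by definition, $G$ is a measured value and is an assumption about which revision to use).

```java
final class PlanckPhoton {
  static final double G = 6.67430e-11;     // gravitational constant in m^3/(kg s^2)
  static final double H = 6.62607015e-34;  // Planck's constant in J s
  static final double C = 299_792_458.0;   // speed of light in m/s

  // lambda_o = 2 * sqrt(G h / c^3)
  static double wavelength() { return 2.0 * Math.sqrt(G * H / Math.pow(C, 3)); }

  // nu_o = c / lambda_o
  static double frequency() { return C / wavelength(); }

  // m_o = h / (lambda_o c)
  static double mass() { return H / (wavelength() * C); }

  // r_s = 2 G m_o / c^2, expected to come out as lambda_o / 2
  static double schwarzschildRadius() { return 2.0 * G * mass() / (C * C); }

  public static void main(String[] args) {
    System.out.println("wavelength in m:           " + wavelength());
    System.out.println("frequency in Hz:           " + frequency());
    System.out.println("mass in kg:                " + mass());
    System.out.println("Schwarzschild radius in m: " + schwarzschildRadius());
  }
}
```

The frequency comes out at roughly $3.7\cdot 10^{42}$ Hz; the trailing digits depend on the value of $G$ used.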

2014-09-19

Wavelength and Schwarzschild Radius of a Photon

Inspired by the question of whether there is a smallest length, I wondered what it takes, at least formally, to have a photon that might contain itself in its own black hole.

Mass is able to deflect the path of light or photons. The more mass there is, the stronger the deflection. If a given mass $m$ is compressed into a sphere smaller than its Schwarzschild radius, the light is no longer merely deflected, but cannot escape from that sphere anymore. The formula for the Schwarzschild radius $r_s$ of $m$ is $$r_s(m) = \frac{2Gm}{c^2}$$ where $G\approx 6.67\times 10^{-11}\frac{\mathrm{m}^3}{\mathrm{kg}\cdot \mathrm{s}^2}$ is the gravitational constant and $c=299\,792\,458\frac{\mathrm{m}}{\mathrm{s}}$ is the speed of light. For a photon with frequency $\nu$, its energy is $h\nu$, where $h\approx 6.63\times 10^{-34}\,\mathrm{Js}$ is Planck's constant. This energy can be related to a mass using Einstein's famous formula $E=mc^2$ to get $$m = h\nu/c^2 .$$ Due to the fixed relation $c=\lambda\nu$ between the frequency $\nu$ and the wave length $\lambda$ of a photon, we can express the mass also as $$ m = \frac{h}{\lambda c} .$$ We can insert this relation into the formula for $r_s(m)$ and get $$ r_s(m) = \frac{2Gh}{\lambda c^3} .$$

The interesting bit is that $r_s$ as well as $\lambda$ have the unit of length, so we are relating the wave length of a photon to its Schwarzschild radius. Further, as we decrease the wave length $\lambda$ of a photon, its frequency and thereby its energy increases, as does its Schwarzschild radius. Consequently we can ask when $\lambda$ and $r_s$ are equal. Or, rather, we can ask when a photon of wave length $\lambda$ "fits" into a sphere of radius $r_s$, i.e. $\lambda = 2r_s$ or $\lambda/2=r_s$.

This is the case when $r_s = a = \lambda/2$, where $a$ is an arbitrary new variable which we now enter into the last equation for $\lambda/2$ and $r_s(m)$ to get $$ a = \frac{Gh}{ac^3} .$$ We solve this for $a$ and get $$ a = \sqrt{\frac{Gh}{c^3}}. $$ So the wave length $\lambda$ of a photon "fits" into a sphere the size of its Schwarzschild radius $r_s(m)$ when both are equal to $a$, which is $$ r_s = \sqrt{\frac{Gh}{c^3}} = \lambda/2 .$$ This may not look very interesting, but physicists will recognize this square root as something they know, but not quite. Looking up the Planck length $$ l_p = \sqrt{\frac{G \hbar}{c^3}} $$ and knowing that $h = \hbar\cdot 2\pi$, we see that $$ r_s = \lambda/2 = \sqrt{\frac{G\cdot2\pi\hbar}{c^3}} = \sqrt{2\pi}\, l_p .$$

Does this mean that when we confine a photon of wave length $\lambda=2\sqrt{2\pi}\,l_p$ into a sphere with radius $\lambda/2$, it cannot escape and in particular cannot spread out of this sphere?

I wonder whether this quite simple result is trivial, given the definitions of Planck's units, or whether it is a deeper consequence of the theory behind the Schwarzschild radius.


2014-09-14

Java unmodifiable vs. immutable vs. recursively immutable

During my current experiments with abstract polynomials for Java, I thought that it would be good to implement them as immutable and so searched the Internet for an immutable list for Java. What I found were blogs that use immutable and unmodifiable synonymously, as well as at least one blog which clearly distinguishes the two, as also explained in this Stack Overflow answer.

For the sake of clarity, let me try to define three related concepts:

unmodifiable
shall denote an object that has no methods itself that change its state,
immutable
shall denote an object that is unmodifiable and, in addition, makes defensive shallow copies of incoming and outgoing objects stored in fields.
recursively immutable
shall denote an object that is immutable and has only fields with recursively immutable content. We leave perfidious changes to the object by reflection out of the picture.
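The difference between the first two definitions shows up as soon as the caller keeps a reference to what was passed in. A minimal sketch, with class names invented for the illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Unmodifiable: no mutating methods, but the caller's list is stored as is.
final class UnmodifiableNames {
  private final List<String> names;

  UnmodifiableNames(List<String> names) { this.names = names; }

  int size() { return names.size(); }
  String get(int i) { return names.get(i); }
}

// Immutable: additionally makes defensive shallow copies on the way in and out.
final class ImmutableNames {
  private final List<String> names;

  ImmutableNames(List<String> names) { this.names = new ArrayList<>(names); }

  int size() { return names.size(); }
  String get(int i) { return names.get(i); }
  List<String> toList() { return new ArrayList<>(names); }
}

final class NamesDemo {
  public static void main(String[] args) {
    List<String> source = new ArrayList<>();
    source.add("a");
    UnmodifiableNames u = new UnmodifiableNames(source);
    ImmutableNames i = new ImmutableNames(source);
    source.add("b"); // the caller still holds the original list
    System.out.println(u.size() + " vs " + i.size()); // prints "2 vs 1"
  }
}
```

Because String itself cannot change, ImmutableNames happens to be recursively immutable as well. The same class written for a List of StringBuilder would be immutable in the sense above, but not recursively immutable.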

The problem with Java is that it cannot fit immutable objects into the collection framework anymore. This becomes most obvious with Collections.singletonList(). While it returns an immutable list, as the documentation says, this list can be considered broken, for the simple reason that it is immutable. Although the List interface clearly allows operations on lists to throw an UnsupportedOperationException, this can lead to bugs which are hard to debug. The list will be passed around in the program from one place to the next and eventually some code tries to add an element to the list, because this is what one typically expects can be done with a list. Booom, you get an UnsupportedOperationException out of nowhere. And it is even an unchecked exception, to be even more surprising.
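The surprise is easy to reproduce. In this small sketch the method addDefault is a made-up example of code that, like most code, assumes a List can grow:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SingletonListDemo {
  // Written far away from where the list was created.
  static void addDefault(List<String> items) {
    items.add("default");
  }

  public static void main(String[] args) {
    List<String> mutable = new ArrayList<>(Collections.singletonList("only"));
    addDefault(mutable); // fine, the copy is an ordinary ArrayList

    List<String> fixed = Collections.singletonList("only");
    try {
      addDefault(fixed); // same static type, but this one cannot grow
    } catch (UnsupportedOperationException e) {
      System.out.println("cannot add to " + fixed);
    }
  }
}
```

Nothing in the signature of addDefault or in the static type List<String> warns about the second call.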

Making objects immutable by implementing an interface for mutable objects only halfway and throwing RuntimeExceptions from the mutating methods really looks like a hack. Some people argue that to fix this, Java's mutable collection interfaces need to inherit from immutable ones. That would require sneaking an ImmutableCollection in as a parent interface of Collection. Looking at the Scala approach, this might not be needed, but a completely new hierarchy of immutable collections would indeed be necessary.

2014-09-01

Everything new here

The time has come. I can generate my website statically from a few XML templates. Until two months ago I used a set of XSL transformations for this, which I had laboriously put together years ago. Then I wanted to change a tiny detail and had to realize for the umpteenth time that XSL is simply nonsense: there are too many things that, as a programmer, you are used to being able to do in any programming language, but that in XSL are either not possible at all or only by means of rather weird constructions.

I had had enough. The next attempt would have been a ready-made piece of software for generating static websites. But I did not like any of the various Google results. So I wrote it myself, based on and with the help of Xmldego, a package that I had already set up as an experiment in 2009. The result is a few simple Java classes with which I insert my HTML-formatted texts into simple templates, so that the complete website comes out. I will publish the package here shortly. The advantages over XSL: Java is a real programming language and not a caricature, and I have regained control.

My recent experiment is an HTML app showing OpenStreetMap maps. It is particularly targeted at mobile devices with JavaScript support for geolocation.

Here is the map.