Biggest mistake in C#: That strings can be null

I really like C#.  It could be better by adding lots of my favorite things … but as it stands it is very usable, very expressive, very readable.  And it has only one major mistake (IMO):  Strings (variables, parameters, fields, etc.) can be null.

Oh my, how many coding errors have been made by forgetting strings could be null?  How many crashes have users suffered?  Oh well.

Anyway, here’s a brief proposal on how to correct the problem.  It isn’t carefully thought through … just off the cuff as it were.  But:

Let there be a unary operator that, when applied to a typed value that might be null (something that isn’t dynamic), acts like this:  if the value is not null then the operator returns that value unchanged; if the value is null then the operator returns the result of calling the no-parameter constructor for the type of the value.  (Where the type of the value is whatever the compiler thinks it is using standard type inference, where it’s an error if the type doesn’t have a no-parameter constructor, etc.)

(For the sake of argument, assume the operator symbol is a postfixed exclamation point.)

Then you could easily (single character!) coerce any null value to a default constructed value of the proper type.  It would be easy to insert the operator at return statements, or after a method call where you weren’t sure if a null value might be returned, or on the use of a parameter.
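Since the operator doesn’t exist, here is a rough approximation of its behavior as an ordinary extension method.  The name OrNew is made up for illustration, and the string overload is my own assumption: string has no public no-parameter constructor, so I treat the empty string as its “default constructed” value.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the proposed postfix "!" operator.
public static class NullCoercion
{
    // Returns the value unchanged when it is non-null; otherwise a default-constructed T.
    // The new() constraint turns a missing no-parameter constructor into a compile-time error.
    public static T OrNew<T>(this T value) where T : class, new()
    {
        return value ?? new T();
    }

    // Assumption: string cannot satisfy new(), so "" stands in for its default value.
    public static string OrNew(this string value)
    {
        return value ?? string.Empty;
    }
}

public static class OperatorDemo
{
    // Safe even when names is null: the coercion supplies an empty list.
    public static int CountOf(List<string> names)
    {
        return names.OrNew().Count;
    }
}
```

With the real operator, names.OrNew().Count would shrink to something like names!.Count.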

And then, the next step is to allow that operator to be used in three more places: After the declared return type of a method, after the declared parameter type of any method parameter, and after the declared type of a property.  (It would work with generic parameter and return types too, if the generic type had the new() constraint.)  This annotation would mean that the compiler would automatically apply the operator at each return statement, and on each annotated parameter on method entry.  The annotation would also be a simple and easily understood way to communicate to the programmer the guarantee that the method never returns null and that, inside the method, the parameters are never null.
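To make the annotation concrete, here is roughly what I imagine the compiler emitting, written out by hand.  The annotated signature in the comment is hypothetical syntax, Greet is an invented example, and using the empty string as the default for string is again an assumption.

```csharp
using System;

public static class AnnotationDesugaring
{
    // Proposed (hypothetical) syntax:
    //     public static string! Greet(string! name) { ... }
    // Hand-written equivalent of what the compiler would generate:
    public static string Greet(string name)
    {
        // Operator applied to the annotated parameter on method entry.
        name = name ?? string.Empty;

        // The method body is free to produce null internally...
        string result = name.Length == 0 ? null : "Hello, " + name;

        // ...because the operator is applied again at the return statement.
        return result ?? string.Empty;
    }
}
```

A caller of Greet never needs a null check on the result, and the body never needs one on name.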

And, with those annotations in place, if you went ahead and modified the IL to incorporate the annotations (rather than just having it as a C# compiler implemented feature) then the JITter could perform flow analysis (of whatever complexity) and probably eliminate a bunch of explicit invocations of the operator.

The final step:  Annotate the .NET framework with the operator where appropriate, which would be practically everywhere.

Well … your input?  Good idea, I’ve neglected a major flaw, or what?

 

5 Responses to “Biggest mistake in C#: That strings can be null”


  • This is a very bad idea. Casting the potential null values in your program to “the result of calling the no-parameter constructor for the type of the value” doesn’t fix anything. In fact, it’ll break more things. Your program will not be able to distinguish between an Integer with value of 0, and a “null” that a function might return to denote an invalid value (a bad idea nevertheless).

    I concur with you in that nulls, in and of themselves are terrible. Tony Hoare, the inventor of the null reference said: “I call it my billion-dollar mistake. It was the invention of the null reference in 1965. At that time, I was designing the first comprehensive type system for references in an object oriented language (ALGOL W). My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes, which have probably caused a billion dollars of pain and damage in the last forty years.” (http://www.forbes.com/sites/quora/2013/08/09/what-is-the-worst-mistake-ever-made-in-computer-programming-that-proved-to-be-painful-for-programmers-for-years/)

    A better solution would be to use a programming language that does not have null references. This StackOverflow thread delves into this: http://stackoverflow.com/a/3990754

    To quote: “I think the succinct summary of why null is undesirable [in a programming language] is that meaningless states should not be representable.”

    • One, suggesting that in an explicitly strongly typed language the compiler wouldn’t know an integer 0 from a null is ludicrous. Two, null can be said to represent an indeterminate state, or, in other words, a wave function prior to its collapse.

  • “Your program will not be able to distinguish between an Integer with value of 0, and a ‘null’ that a function might return to denote an invalid value (a bad idea nevertheless).”

    Well, yes, it is a bad idea. So I say: don’t do it! Instead, if you really need to signal the use of an invalid value, for a value type like integer or a should-be-value type like string, then use any of a number of “Maybe” monad implementations for C# that you can find on the web. Or consider throwing an exception.

    Thanks for the Tony Hoare quotation. I’ve read that before but it deserves to be more widely known.
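
    For anyone who hasn’t seen one, a “Maybe” type can be sketched in a few lines. This is a minimal, hypothetical version (the names Maybe, Some, None, GetOrElse and ParseInt are mine, not from any particular library); the implementations you’ll find on the web are richer.

    ```csharp
    using System;

    // A minimal, hypothetical Maybe<T>: either holds a value or holds nothing.
    public readonly struct Maybe<T>
    {
        private readonly T _value;
        public bool HasValue { get; }

        private Maybe(T value) { _value = value; HasValue = true; }

        public static Maybe<T> Some(T value) => new Maybe<T>(value);

        // The default struct value has HasValue == false, so it serves as "nothing".
        public static Maybe<T> None => default(Maybe<T>);

        // The caller must supply a fallback instead of dereferencing blindly.
        public T GetOrElse(T fallback) => HasValue ? _value : fallback;
    }

    public static class Parsing
    {
        // "Not a valid number" is reported as None, not as null or a magic 0.
        public static Maybe<int> ParseInt(string text)
        {
            int parsed;
            return int.TryParse(text, out parsed) ? Maybe<int>.Some(parsed) : Maybe<int>.None;
        }
    }
    ```

    Here ParseInt("0") and ParseInt("abc") are distinguishable through HasValue, so the invalid case never has to be encoded as a null.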

  • I see your point, but if you’re going to create a value (int, string etc) when the value has not been set – you’re changing the data.

    For a simple example, take an Int32 which defines some number you want to manipulate. Let’s say the valid range is exactly the same as the range of an Int32 (-2B to +2B).

    How do you distinguish the “user hasn’t selected a number” state from the “user has selected the value that is the ‘default’” state? No matter what you select as the “default” value from the constructor, it’s a valid value for the user to choose. So now you can’t determine whether the user has selected the value or if it was chosen for you.

    And now, you’ve also hardcoded a compiler dependency. If the C# 8.0 compiler changes the default to -1 instead of 0, now you can’t tell whether zero OR -1 was explicitly chosen.

    Yes, I suppose you could define a companion variable “hasSelectedX” for each variable X, but now you might need to pass hasSelectedX into each function using X.

    Or consider what happens when you load a nullable value from a database. When you inspect the null value it changes to something new! What if you load a table row containing a null, inspect the null (changing it!) then change a different value and write it back? What gets written to the DB?

    Null has meaning (actually it can have many meanings). Your proposal does away with that meaning, and unintentionally destroys metadata.
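
    Here’s a small sketch of the Int32 case, assuming the proposal were applied to a nullable int (int?) and the default were 0. The method names Coerce and Describe are just for illustration.

    ```csharp
    using System;

    public static class SelectionDemo
    {
        // Mimics the proposed coercion on a nullable value: null becomes default(int), which is 0.
        public static int Coerce(int? selected)
        {
            return selected ?? default(int);
        }

        // With the null kept, "nothing chosen" and "chose 0" remain distinct.
        public static string Describe(int? selected)
        {
            return selected.HasValue ? "user chose " + selected.Value : "no selection";
        }
    }
    ```

    Coerce(null) and Coerce(0) both return 0, so after coercion the two states can’t be told apart, while Describe still can because it sees the null.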

  • This is a very good idea and could be implemented with reflection in C#. To take the idea a little farther, you could say that an object which lacks a public parameterless constructor can define a default constructor of its own, or just return null if it doesn’t define said constructor.

    I agree regarding nulls: they’re an important part of data modelling, and it could even be said that they represent an indeterminate state, or, in other words, a wave function prior to its collapse.
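
    A rough sketch of the reflection version, covering only the “return null if there is no public parameterless constructor” fallback. The method name OrDefaultInstance is made up, and the custom default constructor idea is not attempted here.

    ```csharp
    using System;
    using System.Reflection;

    public static class ReflectiveDefault
    {
        // Returns value if non-null; otherwise tries the public parameterless constructor
        // at run time, and falls back to null when the type has none (or is abstract).
        public static T OrDefaultInstance<T>(this T value) where T : class
        {
            if (value != null) return value;

            Type type = typeof(T);
            if (type.IsAbstract) return null;   // interfaces and abstract classes cannot be instantiated

            ConstructorInfo ctor = type.GetConstructor(Type.EmptyTypes);
            return ctor == null ? null : (T)ctor.Invoke(null);
        }
    }
    ```

    A null StringBuilder comes back as a fresh empty instance, while a null string stays null because string has no public parameterless constructor.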
