Background
Around a month ago, I got my hands on the XKCD What-If book. It is an excellent read (when I’m not chewing through Terry Pratchett).
It did prompt some math programming questions in my mind. These are things that Wolfram Alpha has likely solved ad infinitum. But I wanted to explore the challenge set a bit. So feel free to follow along on Github!
The Problem?
Units within Mathematics
When you conduct mathematics, you are dealing with three things:
- Quantities
- Units
- Equations
When programming, a whole family of logic errors comes from using the wrong units.
Take an equation like F=MA (Force is equal to Mass times Acceleration).
Say you were trying to find Mass. You solve for M = F/A. But say you type the parameters in the wrong order. Both are floats, so nothing complains.
You must now wait until the erroneous logic manifests through incorrect behavior. It might be subtle, or only arise in certain circumstances.
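To make the failure concrete, here is a minimal sketch. The `Physics.Mass` helper and the demo class are hypothetical names made up for illustration; they are not part of the project's code.

```csharp
public static class Physics
{
    // Solves F = M * A for M. Both parameters are plain floats,
    // so the compiler cannot tell a force from an acceleration.
    public static float Mass(float force, float acceleration)
    {
        return force / acceleration;
    }
}

public static class SwappedArgumentsDemo
{
    // Returns { intended result, result with the arguments swapped }.
    public static float[] Run()
    {
        float force = 20f;        // newtons
        float acceleration = 4f;  // meters per second squared

        float intended = Physics.Mass(force, acceleration);
        float swapped = Physics.Mass(acceleration, force); // compiles without complaint

        return new float[] { intended, swapped };
    }
}
```

Both calls compile and both return a perfectly ordinary float. Only one of them is the mass.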
Sinking Floats
Float is one of the most common math variables you encounter. Distances are stored as floats. Vectors and Quaternions are bundles of several floats.
Floats are highly flexible, but they offload the bookkeeping of what each value means onto the programmer.
A compiler is incredibly skilled at following arbitrary rules. Why not leverage the compiler to reduce or eliminate this entire class of logic errors?
A Prototype Solution
It started with Distance. I thought about making an AbstractDistance class. Then I would create a class for each type of unit I hoped to use. I quickly realized I should make an AbstractUnit to wrap the float value at the center of it all.
Then I made my AbstractDistance, so functions could ask for an AbstractDistance and you could provide any distance you had.
And then I made some child classes. C#’s abstract classes let me force child classes to implement certain functions, so you can take any AbstractDistance and call ToMeter() on it.
Each Distance class has a ‘raw-value’ constructor which takes a float and gives you a typed object.
Each distance class contains the conversion to every other distance unit. This becomes Pro #1 – avoid in-line conversions and Con #1 – need to implement each conversion.
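The hierarchy described above could look roughly like this. The class names AbstractUnit, AbstractDistance and ToMeter() come from the description; the field name `Value`, the `ToInch()` method, the `DistanceDemo` helper and all method bodies are assumptions of this sketch, and only two units are shown.

```csharp
public abstract class AbstractUnit
{
    public float Value; // the raw float at the center of every unit

    protected AbstractUnit(float value) { Value = value; }
}

public abstract class AbstractDistance : AbstractUnit
{
    protected AbstractDistance(float value) : base(value) { }

    // Every concrete distance must know how to express itself in each unit.
    public abstract Meter ToMeter();
    public abstract Inch ToInch();
}

public class Meter : AbstractDistance
{
    public Meter(float value) : base(value) { } // 'raw-value' constructor

    public override Meter ToMeter() { return new Meter(Value); }
    public override Inch ToInch() { return new Inch(Value / 0.0254f); }
}

public class Inch : AbstractDistance
{
    public Inch(float value) : base(value) { } // 'raw-value' constructor

    public override Meter ToMeter() { return new Meter(Value * 0.0254f); }
    public override Inch ToInch() { return new Inch(Value); }
}

public static class DistanceDemo
{
    // Accepts any distance; the conversion factor lives in exactly one place.
    public static float InMeters(AbstractDistance distance)
    {
        return distance.ToMeter().Value;
    }
}
```

Calling `DistanceDemo.InMeters(new Inch(74f))` goes through `Inch.ToMeter()`, so the caller never touches the 0.0254 factor directly.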
Take a look at a test class for some of the distances.
And the results:
- 74 Inches = 1.88 m, 6.17 ft, .00117 Mi.
- 3 Meters = 118 in, 9.84 ft, .00186 Mi.
- 238900 Miles = 3.84 * 10^8 m, 1.51 * 10 ^ 10 in, 1.26 * 10^9 ft.
- 3 Feet = .914 m, 36 in, .000568 Mi.
Seems like the math checks out.
Giving Names a Clearer Voice
I did learn something as I made examples. With strongly typed Units, I no longer needed to convey units in the names (ThreeFeet could’ve been named something else).
The type now says WHAT a value is, which frees the name to say WHY it exists. This is Pro #2.
Look at my DistanceToTheMoon variable. Its type conveys that the distance is stored in Miles. You could just as easily declare [Inch DistanceToTheMoon] and another programmer would have no trouble understanding your intent. The distance doesn’t change; you simply interact less directly with the floats (which is where human error comes in).
Moving Onward – Velocity
After distance (and time) comes Velocity. Nothing too surprising here given other units we have created.
We expose a MovementDistance and a TimeInterval. Note that both are abstract types, which means each child Velocity can store a different concrete unit in each field.
For instance, MetersPerSecond’s constructor takes a distance (in Meters) and a time interval (in Seconds). While we could take a reductionist approach and accept just a float, we ask for a Meter unit and a Second unit. We also keep both, because 30 m in 1 s isn’t quite the same measurement as 108,000 m in 3,600 s (30 m/s sustained for an hour), even though both reduce to 30 m/s. In short, storing those units for later might be useful. This elegance is Pro #3.
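A stripped-down sketch of that shape follows. MovementDistance, TimeInterval and MetersPerSecond are the names used above; the `Value` members, the `ToSecond()` method and the constructor bodies are assumptions made for this example.

```csharp
public abstract class AbstractUnit
{
    public float Value;
    protected AbstractUnit(float value) { Value = value; }
}

public abstract class AbstractDistance : AbstractUnit
{
    protected AbstractDistance(float value) : base(value) { }
    public abstract Meter ToMeter();
}

public abstract class AbstractTime : AbstractUnit
{
    protected AbstractTime(float value) : base(value) { }
    public abstract Second ToSecond();
}

public class Meter : AbstractDistance
{
    public Meter(float value) : base(value) { }
    public override Meter ToMeter() { return new Meter(Value); }
}

public class Second : AbstractTime
{
    public Second(float value) : base(value) { }
    public override Second ToSecond() { return new Second(Value); }
}

public abstract class AbstractVelocity
{
    // Abstract field types: each child decides which concrete units it stores.
    public AbstractDistance MovementDistance;
    public AbstractTime TimeInterval;

    protected AbstractVelocity(AbstractDistance distance, AbstractTime time)
    {
        MovementDistance = distance;
        TimeInterval = time;
    }
}

public class MetersPerSecond : AbstractVelocity
{
    // Typed parameters: passing a Second where a Meter is expected will not compile.
    public MetersPerSecond(Meter distance, Second time) : base(distance, time) { }

    // The reduced value; the original distance and interval stay available.
    public float Value
    {
        get { return MovementDistance.Value / TimeInterval.Value; }
    }
}
```

With this layout, `new MetersPerSecond(new Meter(30f), new Second(1f))` and `new MetersPerSecond(new Meter(108000f), new Second(3600f))` reduce to the same `Value`, yet each still remembers the distance and interval it was built from.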
The test is similar to the DistanceTest. We’re starting to see an unfortunate amount of new() usage. It also crops up frequently in the ToMilesPerHour type functions. This is Con #2.
Accelerating in Complexity
So the next inevitable challenge was acceleration: the second derivative of displacement with respect to time.
But how do we represent Second * Second? Surely we don’t want to make a SecondSqr unit. We’d have to make EVERY possible unit. Con #1’s excessive initial implementation is already big enough. There has to be a better way.
What about Generics<T>?
The idea occurred to me. I could implement a Squared<T> generic. This would represent that the unit is squared, allowing for easy conversion to and from the squared unit. You could provide functions for the square root, multiplying seconds together, etc.
Downside: What about Cubic? What about Inverse (raised to the negative first)? What about ^e?
Solution? Make it more generic!
I’ll step through this one a bit more slowly.
where T : AbstractUnit, new() where U : RaiseToPower, new(): These are C# constraints on what can be provided to this function. We only support ‘Pow’ing our AbstractUnits and raising them to a predefined Power, which the RaiseToPower class signifies.
T: Represents the unit we are raising to a power. So for Acceleration (m/s^2) T = Second
U: Represents the power you are raising the unit to. So U = Squared (instead of implementing a single Squared<T> generic, I made the concept of raising to powers generic).
public static Pow<T, U> New(T TVariable): This functions like our constructor (because generics don’t play nice with constructors). You provide the base unit which is being ‘Squared’. U is lightweight and has hardly any fields, which you’ll see in a moment.
Overall, it isn’t that gruesome. I even added a function inside of Pow<T, U> which you can call to get a value of T (after it has been raised to the power).
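Putting the pieces described above together, a minimal sketch might look like the following. The constraints and the `New` signature are the ones quoted above; the `Exponent` property, the `BaseUnit` and `Power` fields, and the `Raised()` method name are guesses made for illustration.

```csharp
using System;

public abstract class AbstractUnit
{
    public float Value;
}

public class Second : AbstractUnit
{
    public Second() { }                            // required by the new() constraint
    public Second(float value) { Value = value; }
}

// Marker base class for "a predefined power".
public abstract class RaiseToPower
{
    public abstract float Exponent { get; }
}

public class Squared : RaiseToPower
{
    public override float Exponent { get { return 2f; } }
}

public class Pow<T, U>
    where T : AbstractUnit, new()
    where U : RaiseToPower, new()
{
    public T BaseUnit;
    public U Power = new U();

    // Constructor-like factory: wraps a base unit in the chosen power.
    public static Pow<T, U> New(T TVariable)
    {
        Pow<T, U> result = new Pow<T, U>();
        result.BaseUnit = TVariable;
        return result;
    }

    // Returns a fresh T whose value is the base value raised to the power.
    public T Raised()
    {
        T raised = new T();
        raised.Value = (float)Math.Pow(BaseUnit.Value, Power.Exponent);
        return raised;
    }
}
```

Under these assumptions, `Pow<Second, Squared>.New(new Second(3f))` represents 3 s squared, and calling `Raised()` on it hands back a Second holding the squared value.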
So what does the Use case look like?
This does speak fairly well about what is going on.
The word new does occur four times in this line (three for generic new() constructors and once for our static, constructor-like New function), so another vote for Con #2: overuse of new().
There is an additional problem under the hood. AbstractAcceleration wanted to have a Pow<AbstractTime, Squared> variable, which children could assign into, the same way AbstractVelocity has an AbstractTime variable. The downside is that you can’t assign a Generic<Bar> to a variable of type Generic<Foo>, even if Bar : Foo.
This means that when dealing with Pow<T, U>, you lose some of the expressiveness that AbstractVelocity sought to achieve. Whether this is back breaking, I don’t know yet. I’ll label it Con #3 – Wonky Pow Generic issues for now.
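The restriction is generic invariance: C# generic classes never convert between type arguments, even related ones. The sketch below reproduces it with simplified stand-in types. It also shows one possible workaround, a covariant interface declared with `out`, which C# allows on interfaces and delegates. The `IPow` and `CovariantPow` names are hypothetical, the workaround is not something the original code does, and it assumes a compiler with C# 4 variance support.

```csharp
public abstract class AbstractTime
{
    public float Value;
}

public class Second : AbstractTime { }

public class Squared { }

// A generic class is invariant in its type parameters.
public class Pow<T, U> where T : AbstractTime
{
    public T BaseUnit;
}

// 'out' marks the type parameters as covariant (interfaces and delegates only).
public interface IPow<out T, out U> where T : AbstractTime
{
    T BaseUnit { get; }
}

public class CovariantPow<T, U> : IPow<T, U> where T : AbstractTime
{
    private readonly T baseUnit;

    public CovariantPow(T baseUnit) { this.baseUnit = baseUnit; }

    public T BaseUnit { get { return baseUnit; } }
}

public static class VarianceDemo
{
    public static float Run()
    {
        Pow<Second, Squared> concrete = new Pow<Second, Squared>();
        // Pow<AbstractTime, Squared> parent = concrete; // does not compile: no implicit conversion

        Second two = new Second();
        two.Value = 2f;

        // Allowed: the interface is covariant and Second derives from AbstractTime.
        IPow<AbstractTime, Squared> viaInterface = new CovariantPow<Second, Squared>(two);
        return viaInterface.BaseUnit.Value;
    }
}
```

The trade-off is that a covariant interface can only expose T in output positions (getters and return values), so a parent class could read the stored unit through it but not assign a new one.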
Final Example
As I move forward with this, I have implemented time, temperature, distance, velocity, several constants, exponents and more. I am moving towards force, pressure and eventually heat. I am interested to see how things continue to develop when I get to the Ideal Gas Laws.
As I was working towards AbstractForce, I realized I needed the concept of Area. So I thought about how I would implement it. Would it be yet another class?
This was the result. While it is by no means perfect, it took only 1 line of code to implement this new functionality (adding an empty constructor to Meter, since AbstractUnit can’t enforce everyone has an empty constructor).
I think there are better forms of Area (Length x Width, Pi*r^2, etc) but this served my immediate purpose. This does fall into another vote for Con #3 – Wonky Pow generic usage.
I’ll go ahead and give Pow<T, U> Pro #4 for flexibility.
Pros
- Conversions exist in code. This avoids lots of off-the-cuff conversions or potential mistakes.
- Types are well defined. It allows the type to speak to what it is, and the name to speak to its purpose.
- Certain compound units are surprisingly elegant
- Generic Pow<T, U> is very flexible.
Cons
- Plenty of initial programming to implement everything
- Excessive use of new() operators when using ToOtherUnit() functions
- Acceleration and Area get really interesting but painful with the Generic classes.
General Lessons
- You can achieve a lot with generics. However you can’t have a child pass Pow<Second, Squared> into a parent’s Pow<AbstractTime, Squared> variable. No implicit cast.
- Sometimes a good amount of up-front work can make your environment and code very expressive.
- Eliminating classes of errors is worth the experimentation.
Other Notes
There were a few things I didn’t bother to include. You can easily use operator overloading to support addition (even of unlike units).
With a signature like [public static Milogram operator +(Milogram c1, AbstractMass c2)] you can call c2.ToMilogram() and directly add the values. This works nicely because it removes the concept of units from addition, and converts to whatever you have on the left side of the +/- sign. If you get the target unit wrong, you get a compiler error rather than a logic error.
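A small sketch of that pattern, using hypothetical `Gram` and `Kilogram` classes in place of the project's own mass types (the class names, the `Value` field and the conversion methods are all assumptions):

```csharp
public abstract class AbstractMass
{
    public float Value;

    public abstract Gram ToGram();
    public abstract Kilogram ToKilogram();
}

public class Gram : AbstractMass
{
    public Gram(float value) { Value = value; }

    public override Gram ToGram() { return new Gram(Value); }
    public override Kilogram ToKilogram() { return new Kilogram(Value / 1000f); }

    // The left operand decides the result unit; the right one is converted.
    public static Gram operator +(Gram left, AbstractMass right)
    {
        return new Gram(left.Value + right.ToGram().Value);
    }
}

public class Kilogram : AbstractMass
{
    public Kilogram(float value) { Value = value; }

    public override Gram ToGram() { return new Gram(Value * 1000f); }
    public override Kilogram ToKilogram() { return new Kilogram(Value); }

    public static Kilogram operator +(Kilogram left, AbstractMass right)
    {
        return new Kilogram(left.Value + right.ToKilogram().Value);
    }
}
```

Here `new Gram(500f) + new Kilogram(2f)` yields a Gram, while `new Kilogram(2f) + new Gram(500f)` yields a Kilogram. Writing `Gram total = new Kilogram(2f) + new Gram(500f);` fails to compile, which is the point: the mismatch is caught before the program runs.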
To my understanding, I could achieve even stronger operator overloading in C++.
The concept of something being Unitless, or an ‘off-the-cuff’ unit like ‘Gazelles’, could be worth exploring. It could be the fallback type when the mathematics renders things Unitless.
Future Considerations
The purpose of this experiment was to explore what it is like to eliminate a class of errors rather than solving them individually over time. It is very easy to fix bug after bug, but identifying the root of why they keep occurring is difficult. Preventing that entire category tends to be an additional level of challenge.
C# was the easiest tool for prototyping these sorts of interactions, but probably one of the worst for keeping at it long term. The code I wrote is neither performant (all the new() usage) nor finished (the simple properties, the public variables, conversion duplication, etc).
C++ is a much more flexible tool for the way it handles operator overloading, certain aspects of generics and the concept of Phantom types. Phantom types allow you to achieve the strong types that I implemented in C# with less immediate effort.
If you didn’t see the link at the top, the code is available on Github (for Unity 5+)
Thanks for reading, hope you enjoyed this!
-Jonathan Palmer