Wednesday, January 5, 2011

Duality in Computer Science

In category theory every construction has a dual which is just as valid a structure as the original. The dual object is constructed by reversing the arrows in the diagram describing the original construction. This is very rigorous and well defined, and sometimes cuts the work in half as every time you go through the effort to describe something you have also described its dual. It is also nice as it makes explicit the relationship between certain concepts- like the difference between Cartesian products and disjoint unions in set theory.
I've seem some interesting duality in certain programming concepts. First a disclaimer- I don't think these dualities are always valid or rigorous, and I don't like the idea of over simplifying what are otherwise rich and complex concepts by lumping them into opposing sides. With that said, some of these are instructive and fun.

The first is OOP and FP. This video mentions the duality I'm talking about "http://channel9.msdn.com/Shows/Going+Deep/C9-Lectures-Dr-Ralf-Laemmel-Advanced-Functional-Programming-The-Expression-Problem" where FP (if we consider Algebraic Data Types as a FP specific thing) is very easy to add operations to manipulate the data, but hard to add new data types, where in OOP it is easy to add new data (by subclassing) but hard to add new operations. Both sides can solve their problems with some more advanced techniques, but it is interesting that at least naively they are the opposite.
Another OOP vs FP is the duality of abstraction and encapsulation (using certain definitions of the two) as found here: http://lukepalmer.wordpress.com/2010/11/23/encapsulation-considered-harmful/ where abstraction is considered more FP-like and encapsulation more OOP-like (at least in practice).
In OOP the data (objects) are complex and able to perform tasks (they are smart) while the control flow is typically just the usual structures like conditionals, iteration, ect. In FP the data is not so smart- it is simple and easy to describe, built up, and take apart, but it has no idea how to manipulate itself. The control flow is of course where the power is, and there are all sorts of inexpensive abstraction available to create ways to manipulate the data and create arbitrarily complex control structures.

Another is Forth vs Lisp. Both are homoiconic language that easily allow EDSLs (embedded domain specific languages) and both are fundamentally very simple. The differences are pretty interesting though- in the video found here http://archive.forthworks.com/ots/ots-01.mpg?torrent the speaker says that in Lisp "Code is Data" which is certainly true as we are able to manipulate code as symbol lists giving rise to macros (which is made possible because in lisp we program the abstract syntax tree directly). On the other hand, he says that in Forth "Data is Code" in that we can write config files and data files that are actually executable code, and forth is a untyped language so there is really no difference between the data and the code. Another, kind of silly, duality here is that in Lisp everything is in parenthesis and semicolons are comments, while in Forth things in parens are ignored as comments and semicolons are used to end the definition of words (and so appear all the time).

There is a paper called "Call-by-Value is Dual to Call-by-Name" by Phillip Wadler which is a more rigorous duality between evaluation orders in the lambda calculus through application of de Morgans law (using the dual nature of 'and' and 'or'.

Dynamic typing and static typing have a complex relationship that is not so much about duality, but the (very good) article found here http://cdsmith.wordpress.com/2011/01/09/an-old-article-i-wrote/ does mention one way in which they are opposite- static typing excludes program behavior through proofs (which will exclude certain "correct" programs) while dynamic typing (in practice) excludes program behavior by testing, which may include "incorrect" programs.

No comments:

Post a Comment