Monday, November 5, 2012

Evolution of Code

A common conception (if that's the proper word) is that code is alien, a foreign language difficult and tedious to learn. A certain assumption of weird magic is sometimes implicit. As mentioned, I dislike this opinion. While programming languages are built on the counter-intuitive zeroes and ones running around the circuit board, the evolution of programming languages has tended toward the more intuitive, creating languages made by humans for humans. At least, that's my hypothesis. In this post, I want to trace the trends and types of programming languages to see whether that's true. In the meantime, here's a cool graph. Also, an easy history.

Binary is the bedrock, the language of the processor. While some well-trained programmers understood how to navigate this world, each machine soon had an assembly language that named particular series of 0s and 1s for convenience. Names are fundamental to language and to how humans think about the world, so it's no surprise code immediately took this direction. Finding the connection between binary sequences and real-world concepts is the first step towards meaningful programming, but the only real transformation in this step is in how the programmer perceives the code. Already expediting this process were forward-thinking project leaders concerned with making code accessible to those without an expert background.
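To make that naming step concrete, here's a minimal sketch of an assembler as a lookup table (written in Python for readability; the mnemonics and bit patterns are invented for illustration, not any real instruction set):

    # Toy "assembler": maps invented mnemonics to invented opcode bits.
    # Real instruction sets differ; this only illustrates the naming idea.
    OPCODES = {
        "LOAD":  "0001",
        "ADD":   "0010",
        "STORE": "0011",
    }

    def assemble(lines):
        """Translate 'MNEMONIC operand' lines into binary strings."""
        machine_code = []
        for line in lines:
            mnemonic, operand = line.split()
            # The name lookup is the whole trick: 4 opcode bits + 4 operand bits.
            machine_code.append(OPCODES[mnemonic] + format(int(operand), "04b"))
        return machine_code

    print(assemble(["LOAD 2", "ADD 3", "STORE 7"]))
    # ['00010010', '00100011', '00110111']

The translation is mechanical, which is exactly the point: the only thing that changed is that a human can now read the left-hand side.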

The first real higher-level language was Fortran, developed primarily by John Backus (who had earlier built a precursor system called Speedcoding). The underlying difference between Fortran and assembly was the compiler, a concept pioneered and advocated for by the great Grace Hopper. A compiler translates code in a higher-level language into the appropriate list of assembly commands, allowing code to expand beyond line-by-line memory manipulation. Relative to current languages, few conveniences were available in Fortran, but the ability to declare and manipulate variables (a move towards a formal but familiar language) was revolutionary and intuitive for scientists.
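The appeal for scientists is easiest to see in a formula. Here's the idea sketched in Python rather than Fortran (the variable names are my own invention), but the "formula translation" spirit is the same: you write the math, and the compiler worries about the registers:

    # A formula written as a formula: the scientist thinks in terms of
    # named quantities, not memory addresses.
    g = 9.81                       # gravitational acceleration, m/s^2
    t = 3.0                        # seconds of free fall
    distance = 0.5 * g * t ** 2    # d = (1/2) g t^2, readable as math
    print(distance)                # 44.145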

The technical motive to shift away from the machine level was prompted by the machine-specific nature of code; compilers let one code according to intentions instead of a machine's technicalities. For a trade-off in efficiency, higher-level languages like Fortran are (almost by definition) more similar to natural language, disposing of or hiding technicalities like registers, call stacks, and memory addresses. In fact, in looking for a definition of a high-level language, the general consensus centered on progress towards natural language, which is delectably accurate. But the move to higher-level languages was not about a potential gain in programming efficiency; it was about making code more people-friendly. Example: Grace Hopper said the computer should “learn how to respond to people because I figured we weren’t going to teach the whole population of the United States how to write computer code. There had to be an interface built that would accept things that were people-oriented and then use the computer to translate to machine code.”
The practical nature of Fortran was quickly proven when it cut the number of lines of code by a factor of 20, transforming the role of the programmer from elite and sacred oracle into an accessible job.

Concurrently (or at least a couple of years after), an alternate approach to programming was taking form. LISP, created by John McCarthy, sought to characterize problems with functions and lists, shifting the focus away from manipulating memory and towards creating a practical notation for math. The motive was similar to Fortran's, but conceptualized very differently. Built on top of lambda calculus, this language was unorthodox and far ahead of its time, introducing recursion, garbage collection (automatically dealing with memory), and dynamic memory allocation while challenging the canonical approach to problem-solving. The purest version of LISP is theoretically one where all items are functions, but the practical version includes all the numbers, strings, and basic types programming languages are expected to have (a detailed look at the differences here). LISP was (and sometimes, though to a lesser degree, still is) considered one of the best AI languages and was fairly often used for teaching, which in itself is interesting. The relation of lambda calculus and LISP to their arguable intuitiveness is interesting but too deep to delve into here.
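To give a flavor of the functions-and-lists mindset, here's a LISP-style recursion sketched in Python (so take the syntax loosely): a list is its first element plus the rest of the list, and the definition calls itself on that rest.

    # LISP-flavored recursion: a list is a head plus a remainder,
    # and the empty list is the base case.
    def length(lst):
        if not lst:                    # base case: the empty list
            return 0
        return 1 + length(lst[1:])     # peel off the head, recurse on the rest

    print(length(["a", "b", "c"]))     # 3

No loop counter, no memory bookkeeping; the shape of the code mirrors the shape of the definition.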

Object-oriented code marks yet another direction. Conceptualizing large projects such as games and software as an interlocked system of objects makes them easier to manage and control. Languages like Java are so popular partly because they are used in large corporations and lend themselves well to programming little games for students. While functional and object-oriented code are currently considered opposites, the idea for objects came from LISP's atoms, like strings. Smalltalk, Eiffel, and C++ were all forerunners of this ideology.
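As a toy illustration of the object mindset (a hypothetical Python sketch, not drawn from any of those languages), a little game might model its pieces as objects that bundle state together with behavior:

    # Toy object-oriented sketch: each object owns its state (a position)
    # and its behavior (how it moves), and the program becomes a system
    # of such objects interacting.
    class Player:
        def __init__(self, name, x=0, y=0):
            self.name = name
            self.x, self.y = x, y

        def move(self, dx, dy):
            self.x += dx
            self.y += dy

    hero = Player("hero")
    hero.move(2, 3)
    print(hero.name, hero.x, hero.y)   # hero 2 3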

I hold these three programming trends to be the most important, though that inevitably leaves a huge amount out. Sometimes explicitly and sometimes less so, most significant changes in programming languages have been framed as attempts to open up the field through greater intuitiveness. Based on this small sample size, I will sit happy that my thesis is mostly accurate.

In many learning languages (i.e., languages made for or good at teaching programming), teachers aim to find a language that sounds as similar to English as they can. This makes me wonder whether higher-level programming languages are headed in that direction, or whether the conception of natural language as the ultimate form of expression is exaggerated.

In fact, Columbia's school of engineering counts programming as a foreign language. Not sure how to take that.