In the week of July 21, 2010, I attended my first OSCON conference.
The lineup of presentations is normally quite impressive already, but this year it was even more so with the inclusion of the Emerging Languages Camp. Kudos go to Brady Forrest (O’Reilly Media, Inc.) and Alex Payne (BankSimple) for recruiting such excellent speakers.
As the organizers put it in their own words, “new programming languages are born all the time. Some languages are created to tackle new problems. Some languages are evidence proofs towards a better way of programming. Some are created just for fun or to scratch an itch. The Emerging Languages Camp is a gathering of the creators of recent programming languages, their peers, colleagues, interested programmers, technologists, and journalists.”
The event was spread out over two days at OSCON 2010, in a dedicated room filled to its maximum capacity of 120. It was inspiring because of the enthusiasm with which each of the two dozen speakers presented their language: how they compared it historically with other languages, how they explained why the world needs yet another language, and how they gave compelling demos of sample code written in it.
The only problem with events like this is that I now feel inspired to design another language of my own, just because it would be cool and just because I can.
A running inside joke of the two-day session was how to write an accumulator in the speaker’s proposed language. Next time you develop a language, forget “hello world” and show an accumulator function if you want to get recognition from your peers!
Some random Twitter quotes:
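For readers who missed the joke, here is a minimal sketch of the exercise, assuming it refers to the classic accumulator generator: a function that takes a starting number and returns a function that keeps adding increments to it. It is written in Python rather than in any of the languages presented, and the name make_accumulator is made up for illustration.

```python
def make_accumulator(total):
    """Return a function that adds each increment to a private running total."""
    def accumulate(increment):
        nonlocal total          # update the variable captured from the enclosing call
        total += increment
        return total
    return accumulate


counter = make_accumulator(100)
first = counter(10)    # running total after the first increment
second = counter(10)   # the state survives between calls
print(first, second)
```

The interesting part for a language designer is the closure: the running total lives in the enclosing scope, so the exercise quickly shows how a language handles first-class functions and mutable captured state.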
- “Q: What kind of programs do you imagine writing with this language? A: I have no idea.”
- “There goes Lisp’s Last Great Hope,” as Rich Hickey dashed across the road in front of traffic.
- “When facing a new problem, you may look for the best language to solve it, or create your own”
Below follow the presentations I attended, with a short description of each and some personal highlights. Some I attended but could not take notes on due to brain overload.
Rob Pike (Google, Inc.)
“Go’s approach to concurrency differs from that of many languages, even those (such as Erlang) that make concurrency central, yet it has deep roots. The path from Hoare’s 1978 paper to Go provides insight into how and why Go works as it does.”
Rob gave three presentations at OSCON 2010, and in the first one he focussed specifically on concurrency. The real breakthrough in parallel programming came in 1978, when Tony Hoare published his seminal paper on Communicating Sequential Processes (CSP). Communication is fundamental. Parallel composition is multiprocessing. CSP has guarded commands, which let you express the willingness to execute a command based on a given expression. Furthermore, the composition language is very similar to the “pipe” construct in UNIX, a construct that no programming language had adopted up to 1978.
From the theoretical basics of CSP, many more practical descendants branched off, such as Occam, Erlang, Squeak (SIGGRAPH ’85), Newsqueak, Alef, Limbo, and Go. The important discovery along this evolution was to make communication channels first-class citizens. That means you can send channels over a channel. Process composition works by naming channels so that goroutines can communicate with each other.
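To make “sending a channel over a channel” concrete, here is a rough sketch in Python. Python has no channels or goroutines, so this assumes queue.Queue as a stand-in for a channel and a thread as a stand-in for a goroutine; the names squarer, requests, and reply are invented for the example and do not come from Rob’s talk.

```python
import queue
import threading


def squarer(requests):
    """Worker: reads (number, reply_channel) pairs until it receives None."""
    while True:
        message = requests.get()
        if message is None:              # sentinel: no more work
            break
        number, reply_channel = message
        reply_channel.put(number * number)   # answer on the channel we were handed


requests = queue.Queue()                 # the "channel" the worker listens on
worker = threading.Thread(target=squarer, args=(requests,))
worker.start()

reply = queue.Queue()                    # a private "channel" for this one request
requests.put((7, reply))                 # the reply channel travels inside the message
answer = reply.get(timeout=5)

requests.put(None)                       # ask the worker to stop
worker.join()
print(answer)
```

Because the reply channel is just a value inside the message, the worker needs no knowledge of who its clients are. That is the composition property the paragraph above describes.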
Ola Bini (ThoughtWorks)
“Ioke is a dynamically typed language – a language experiment with a focus on expressivity. It’s prototype based, object oriented, homoiconic and has powerful macro facilities – and runs both on the JVM and the CLR. Seph is a language currently being developed, based on Ioke. It’s a functional object oriented hybrid with explicit concurrency features inspired by Erlang and Clojure.”
Perhaps the most distinguishing contribution of Ioke (pronounced eye-oh-kee) is the notion of a formalized “condition system”. It is similar to continuable exceptions from Smalltalk, and quite unlike exception handling in Java, where developers constantly have to unwind stacks.
Ioke cares less about concurrency, as that was not the topic of study that led to Ioke. Interestingly, Ola reported on that fact almost apologetically, and I kind of agree he should feel a bit embarrassed. Modern languages should be *based* on concurrency, as that is the main problem to solve in the near future, with the increasing emergence of multi-core chipsets. My current desktop at work has a 12-core CPU, and at best I can exercise maybe 3 or 4 cores, while most stay dormant because the programming languages we use to develop contemporary programs have a hard time expressing concurrency across processes.
Ioke has a nice mechanism for DSLs. If you are into developing your own language, Ioke may be a language to investigate. Ioke allows you to change the underlying AST while the program is running, which greatly improves expressiveness, at a considerable cost in performance, of course.
Ioke has no floats, just BigDecimals. Kudos!
Phil Mercurio (Thyrd Informatics)
“Thyrd is an experimental visual programming language built as a proof of concept. Thyrd is reflective (a Thyrd program can inspect and modify itself) and concurrent. Visually, it resembles a spreadsheet. Underneath is a stack-based functional language in the same family as Forth, Joy, and Befunge. This talk will present the key concepts in Thyrd and some of the directions it might take.”
Thyrd leans heavily on the spreadsheet metaphor, using the same representation for both the user interface and the underlying data model. This allows you to define variables, lists, tables, and tree data structures, based on simple addressing semantics on the Thyrdspace. From a given cell, you can use relative addressing to get to other nodes.
Cell nodes are like spreadsheet functions and are constantly evaluated when dependent nodes change value. Expression evaluation uses a stack execution model, very similar to Forth (and PostScript). Using the “Fold” operator, you can easily develop a MapReduce equivalent.
Debugging is extremely visual: you drag a breakpoint into the code wave viewer. Similar tools have been developed for displaying the internal execution, allowing you to watch the execution stack being animated in a visual display.
I really enjoyed this presentation. At lunch I spoke with Phil and grilled him on the scalability and performance of visual programming techniques, and he alluded to both of those being hard problems to solve. Even something as simple as TicTacToe becomes an unwieldy set of colorful boxes. Phil remarked that the hierarchical nesting that the spreadsheet metaphor offers does help with navigation, as it encourages the brain to use its specialized spatial memory abilities: that sorting routine is somewhere to the top right, then 3 down, right next to that purple box.
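The stack execution model and the map-then-fold idea can be sketched with a toy evaluator. This is not Thyrd and does not use Thyrd’s syntax; it is a small Forth/Joy-flavored interpreter in Python, and every word name in it (dup, map, fold) is an assumption chosen for illustration.

```python
import operator


def binary(op):
    """Build a word that pops two operands and pushes op(left, right)."""
    def word(stack):
        right = stack.pop()
        left = stack.pop()
        stack.append(op(left, right))
    return word


def map_word(stack):
    """( items quote -- results ) run the quoted program once per item."""
    quote = stack.pop()
    items = stack.pop()
    stack.append([run(quote, [item]).pop() for item in items])


def fold_word(stack):
    """( items initial quote -- result ) combine the items left to right."""
    quote = stack.pop()
    accumulated = stack.pop()
    for item in stack.pop():
        accumulated = run(quote, [accumulated, item]).pop()
    stack.append(accumulated)


WORDS = {
    "dup": lambda stack: stack.append(stack[-1]),
    "+": binary(operator.add),
    "*": binary(operator.mul),
    "map": map_word,
    "fold": fold_word,
}


def run(program, stack):
    """Strings are words to execute; anything else is data to push."""
    for token in program:
        if isinstance(token, str):
            WORDS[token](stack)
        else:
            stack.append(token)      # numbers, lists (data), tuples (quoted programs)
    return stack


# The map step squares each number; the fold step reduces the squares to their sum.
sum_of_squares = run([[1, 2, 3, 4], ("dup", "*"), "map", 0, ("+",), "fold"], [])
print(sum_of_squares)
```

Quoted programs are represented as tuples so they can sit on the stack as data until map or fold runs them, which is roughly how Joy-style languages pass behavior around without named functions.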
Allison Randal (O’Reilly Media, Inc.)
“The Parrot virtual machine hit 2.0 in January of this year, and the 2.6 production release will be out the day before this talk. A virtual machine like no other, Parrot targets dynamic languages such as Perl, Ruby, Python and PHP. It incorporates an object-oriented assembly language, is register-based rather than stack-based, and employs continuations as the core means of flow control.”
A core component of Parrot is a parsing expression grammar (PEG), plus tree transformations to generate the instructions to run on the virtual machine. The transformations are heavily based on attribute grammars.
The work being focussed on now is “Lorito” (Spanish for ‘small parrot’), an effort to reconsider and refactor the existing VM, improve startup time and resource consumption, and target cloud and mobile architectures. Parrot now has 1,200+ static opcodes. That makes it hard to write a JIT. Lorito limits itself to 20 opcodes. Higher-level opcodes are encoded in lower-level opcodes to allow JITs to do a better job. Other improvements focus on better garbage collection algorithms and on reducing the cost of crossing between the various memory spaces. Lorito has only one object system, and hence no cost for going between C and the VM.
Allison gave a passionate and enjoyable overview of Parrot and it inspired me to read up some more on it. Although Parrot is not a language per se, I think the topic was quite appropriate for the audience, as environments like Parrot solve many challenges that language designers face and don’t necessarily want to focus on, such as parsing, garbage collection, and JIT compilation.
Adam Chlipala (Impredicative LLC)
“Ur/Web is a new domain-specific language for programming Web applications, based on a new general-purpose language called Ur. Ur features new abstraction and modularity features that make serious code reuse and metaprogramming possible within a strong static type system.”
The main goal of Ur/Web is to enforce entitlements on subsets of data structures, in particular those related to web applications, such as controlling access to a subtree of a given web document. It is one of the various approaches to solving the “browser problem”. A great side effect of Ur is its highly optimized packaging technology that automatically compresses code before sending it to the browser.
Alan Eliasen (Frink)
“Frink is a practical programming language and calculating tool designed to make physical calculations simple. It tracks units of measure through all calculations, ensuring that answers are correct. Back-of-the-envelope calculations become trivial, and more complex physical and engineering calculations become simpler to write and read, and allow transparent use of any units of measure.”
This was an awesome presentation. Alan started by explaining how he came to design Frink many years ago. A friend of his claimed that 9.5 years of farting creates the same amount of energy as an atomic bomb explosion. Alan had his doubts and set out to disprove the theory using C++. He quickly got stuck and realized he needed a much better language to reason about the numerous units of measure and conversions between them. The end result is a system where you can simply request the output of a nuclear explosion in kilotons, convert that to joules, convert to calories, and simply divide that by “9 years and 6 months”.
Frink empowers such computations by including a smart parser, lots of trivia, and a vast library of conversions between countless physical units, such as kilograms, meters, miles, and so on. The transformation rules in Frink remind me of peephole optimization tools for IR optimization; they use an elegant DSL that is very similar to the regular expressions I used myself to optimize IR opcodes in the Amsterdam Compiler Kit, some 20 years ago.
Another innovative contribution offered by Frink is the adoption of interval arithmetic. Try this: load your favorite compiler or interpreter and add 0.1 to itself 1,000 times. In systems that store 0.1 as a binary float, you will end up with a number that is not exactly 100, because 0.1 cannot be represented exactly in binary. Alan contributes to the IEEE standard that proposes a different type of math, one that lets you compute with ranges of numbers. Rather than saying x=0.1, you can say x=[2,4], which means that x can be any number between 2 and 4. In other words, x is all those possible values at the same time. This is almost as cool as quantum computing!
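The core idea of tracking units through a calculation can be sketched in a few lines of Python. This is not Frink syntax and has nothing like Frink’s unit library; the Quantity class and its method names are invented here, and the constants are assumptions: the thermochemical calorie (4.184 J), the conventional kiloton of TNT (4.184e12 J), and a Julian year of 365.25 days.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float   # magnitude expressed in SI base units
    dims: tuple    # exponents of (kilogram, meter, second)

    def __mul__(self, other):
        if isinstance(other, Quantity):
            dims = tuple(a + b for a, b in zip(self.dims, other.dims))
            return Quantity(self.value * other.value, dims)
        return Quantity(self.value * other, self.dims)   # plain number

    __rmul__ = __mul__

    def __truediv__(self, other):
        if isinstance(other, Quantity):
            dims = tuple(a - b for a, b in zip(self.dims, other.dims))
            return Quantity(self.value / other.value, dims)
        return Quantity(self.value / other, self.dims)

    def __add__(self, other):
        if self.dims != other.dims:
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.value + other.value, self.dims)

    def to(self, unit):
        """Express this quantity as a plain number of the given unit."""
        if self.dims != unit.dims:
            raise TypeError("incompatible units")
        return self.value / unit.value


kilogram = Quantity(1.0, (1, 0, 0))
meter = Quantity(1.0, (0, 1, 0))
second = Quantity(1.0, (0, 0, 1))

joule = kilogram * meter * meter / (second * second)
watt = joule / second
calorie = 4.184 * joule            # assumption: thermochemical calorie
kiloton_tnt = 4.184e12 * joule     # assumption: conventional TNT equivalent
year = 365.25 * 86400 * second     # assumption: Julian year

energy = 1 * kiloton_tnt
average_power = energy / (9.5 * year)   # energy spread over "9 years and 6 months"
print(energy.to(calorie), average_power.to(watt))
```

Asking for `energy.to(second)` or adding a joule to a second raises a TypeError, which is the kind of mistake that unit tracking is meant to catch before it reaches the answer.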
With interval math, Alan convincingly showed how much more reliable complex computations can be, as error margins can be controlled and reflected upon. In this way, we can explicitly compute not only the result, but also the risk behind a certain computation.
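Both experiments are easy to try. The sketch below first repeats the 0.1 addition with plain floats, then redoes it with a tiny interval type that rounds its bounds outward so the true result can never escape. It assumes Python 3.9 or later for math.nextafter; the Interval class is invented for illustration and is neither Frink’s implementation nor the IEEE proposal.

```python
import math  # math.nextafter requires Python 3.9 or later
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        # Nudge each bound outward by one float so rounding never shrinks the range.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def __contains__(self, value):
        return self.lo <= value <= self.hi


# Plain floats: the rounding error of every addition silently accumulates.
naive_total = 0.0
for _ in range(1000):
    naive_total += 0.1

# Intervals: start from the floats just below and just above the stored 0.1,
# which together bracket the real number 0.1.
tenth = Interval(math.nextafter(0.1, 0.0), math.nextafter(0.1, 1.0))
interval_total = Interval(0.0, 0.0)
for _ in range(1000):
    interval_total = interval_total + tenth

# The x=[2,4] example from the text: x + x covers everything from 4 to 8.
x = Interval(2.0, 4.0)
doubled = x + x

print(naive_total == 100.0)                   # is the plain float sum exactly 100?
print(100.0 in interval_total)                # is 100 inside the computed range?
print(interval_total.hi - interval_total.lo)  # width of the range = error margin
```

The width of the final interval is the explicit error margin: instead of one number that is quietly wrong, you get a range that is known to contain the right answer.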
Thoughts on the F# Productization
“F# was already a fairly mature language with roots in Microsoft Research, Cambridge, and a steadily growing user base when the decision was made to officially support it in Visual Studio 2010. Having just shipped F# 2.0, the goal of this talk is to outline the experiences, both positive and negative, we had in transitioning the F# language and its implementation.”
This presentation was a bit different from all the others, as it did not talk about F# at all, showed no sample code, and did not relate it to any other language. Instead, it was a great meta-topic discussion on the perils of success. Drawing on his personal experience, Joe elaborated on what the F# team needed to worry about to take the original research project to a larger (and more commercial) audience:
- much better error/warning messages
- a well-written and well-maintained spec
- dog-fooding (use your own language to implement your compiler)
- be a first-class CLR language, no shortcuts
- clean up the crud collected over the years in the runtime DLL (camelcase vs underscores)
- binary compatibility between different versions of runtimes
- modern tooling support (debugging, etc)
- writing documentation, automatic generation
- high quality, localizable diagnostics (changes how you print errors)
- figure out runtime deployment
- improved experience on alternative platforms (i.e., test on Mono)
It’s amazing how generic this list is. Substitute your own language, platform, or product for F#, and the lessons learned apply directly to your own project to prime it for successful adoption. I have been involved in quite a few large projects that went through this very same transition, and it was interesting to see how many moments of déjà vu I had during the presentation. In return for his great talk, I showed Joe the way to the beer at the local Python user group.
Jeremy Ashkenas (DocumentCloud)
As far as the syntax goes, CoffeeScript looks a lot like Python. It is an awesome language and implementation. Not only is Jeremy a natural presenter, his materials were also nicely prepared, and the various explanations and demos looked very polished. I bet many languages would be envious and wish they had such a great evangelist.
improved debugger integration with Firebug. This may help in debugging CoffeeScript.
Charles Nutter (Engine Yard, Inc.)
“Mirah (formerly Duby), is a Ruby-inspired, statically-typed, lightweight, platform-agnostic language with backends for JVM bytecode, Java source, and more platforms planned. It borrows features from several static and dynamic languages, but with a twist: no runtime dependency on any additional library; everything is done at compile time.”
The goal is to have the best of Ruby, but be as fast as Java, and see where the ship sinks. The code base is very small: only 10K lines of Ruby code.
For almost all samples, the generated JVM bytecode is much more condensed for Mirah than it is for JRuby. Mirah requires function arguments to be typed. It uses type inference to propagate type information, so locals and fields can be declared with more specific types than a dynamically typed JVM implementation could ever get.
It reminded me of the research work on “typed Smalltalk”. I implemented a Smalltalk to Java bytecode translator myself in the late nineties, and the opportunities for optimization are huge when generating code for the JVM with type signatures available. Lookup through the reflection API is horribly expensive both in code size and in CPU consumption.
Matt MacLaurin (Microsoft FUSE Labs)
“Kodu is a new, purpose-built programming language designed as a first programming experience for kids or folks who want a very accessible intro to programming. Kodu is a visual language embedded in a 3D world, with language features specifically aimed at game design and interactivity programming. While deceptively simple, Kodu also introduces advanced concepts such as concurrency and arbitration.”
Kodu was designed to make it very easy to develop Xbox games, assuming an underlying rendering engine.
The motivation is to reduce the impedance mismatch when going from gameplay to editing. Kodu uses simple metaphors that are close to the kids’ vocabulary, such as “score” instead of “int”. Furthermore, Kodu provides instant gratification by giving immediate feedback on scripts you just wrote. A debugger is not really needed, as recoding is really easy.
OK. I cannot wait to get home, download Kodu from Xbox Live, and sit down with my 13-year-old to build our own game. In the past, I have played with systems like Alice from Carnegie Mellon, but it is amazing to see what the latest generation of game engines allows you to do.
I also think the commercial/business software industry has been categorically ignoring lessons from gaming. I have yet to see an Xbox 360, PS3, or Wii game that really needs a manual. All UIs are extremely intuitive or simplified so as not to overwhelm the user or get in the way of the real aim: shoot monsters. Most commercial software treats the business function as subordinate to the application framework, and applications are over-engineered in the wrong areas. How many games do you think get 5 stars when they take 4 hours to install, or 5 minutes to start up?
Update: I told my son about Kodu this afternoon. Half an hour later, he had downloaded it by himself from the Indie game category at Xbox Live and had already implemented his first game. Amazing.
Rich Hickey (Clojure)
“This talk will provide a brief experience report on Clojure, a dynamic, functional language targeting the JVM. It will detail the challenges faced in providing a practical and approachable programming language featuring pervasive immutability on top of the commodity infrastructure of the JVM.”
Rich talked about a topic he is working on right now that may make it into Clojure, and then again, may not. I think it should, as the concept is very similar to the project I am now working on at Bank of America/Merrill Lynch, and we agree on the need for such an approach.
Rich’s goal is to implement persistence as a stored graph, where each node is immutable. As soon as you write to a node in the graph, a new path is created to provide you a new view of the world with the updated value. Only you will see your changes. Others use their own access paths, which takes them to the old values, not your new values. This persistence model strongly encourages concurrent modifications to the graph, facilitating caching, and replication and redundancy.
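The “new path, old nodes untouched” idea is usually called path copying. Here is a minimal sketch, assuming a binary search tree as the stored graph. It is written in Python rather than Clojure and is only meant to show the principle, not Rich’s design; the names Node, insert, and lookup are made up for the example.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """An immutable tree node; NamedTuple fields cannot be reassigned."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key, value):
    """Return the root of a NEW tree; only nodes on the path to key are copied."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return node._replace(left=insert(node.left, key, value))
    if key > node.key:
        return node._replace(right=insert(node.right, key, value))
    return node._replace(value=value)


def lookup(node, key):
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


old_view = None
for key, value in [(50, "a"), (30, "b"), (70, "c")]:
    old_view = insert(old_view, key, value)

new_view = insert(old_view, 30, "B")     # "write" to node 30

print(lookup(old_view, 30), lookup(new_view, 30))
print(new_view.right is old_view.right)  # is the untouched subtree shared?
```

Readers holding old_view keep seeing the old value without any locking, and the two versions share every subtree that was not on the modified path, which is what makes caching and replication of such a structure cheap.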
Clojure remains an extremely elegant system. I just cannot get used to the syntax, as my mind has been too polluted by C, Java, and more recently Python.
Mark Miller (Google, Inc.)
The second goal is to improve concurrency. Tradeoffs are shared state vs. message passing in one dimension and blocking vs. non-blocking in the second dimension.
Mark explained the various meanings of the word Caja, and reinforced another lesson. If you are designing your own programming language, platform, or framework, spend at least as much time on choosing the right name as you do on implementing it.
Tom Van Cutsem (Vrije Universiteit Brussel)
“AmbientTalk can best be summarized as “a scripting language for mobile phones”. It’s a dynamic, object-oriented, JVM-compatible, distributed programming language. AmbientTalk’s focus is on applications to be deployed in so-called “mobile ad hoc networks” – networks of mobile devices that communicate peer-to-peer using wireless communication technology, such as WiFi or Bluetooth.”
I thought the coolest takeaway from AmbientTalk is the notion of message queues and the fact that network failures are not exceptions, but are to be expected and anticipated. A lot of the plumbing in AmbientTalk concerns itself with talking to partners that we have no physical connection with yet, and dealing with messages that time out after a certain time.
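The idea that a disconnected peer is a normal state rather than an error can be sketched as a small outbox that buffers messages and expires them after a deadline. This is Python, not AmbientTalk, and it is only an assumption about how such plumbing might look; the Outbox class, its methods, and the explicit now argument (used instead of a real clock to keep the example deterministic) are all invented for illustration.

```python
from collections import deque


class Outbox:
    """Buffers messages for a peer that may currently be out of range."""

    def __init__(self):
        self.pending = deque()   # (message, deadline) pairs waiting for a connection
        self.connected = False
        self.delivered = []      # stand-in for messages actually sent over the network
        self.expired = []        # messages whose timeout passed before delivery

    def send(self, message, now, timeout):
        """Queue a message; being disconnected is normal, not an exception."""
        self.pending.append((message, now + timeout))
        self.flush(now)

    def set_connected(self, connected, now):
        self.connected = connected
        self.flush(now)

    def flush(self, now):
        still_waiting = deque()
        for message, deadline in self.pending:
            if now >= deadline:
                self.expired.append(message)
            elif self.connected:
                self.delivered.append(message)
            else:
                still_waiting.append((message, deadline))
        self.pending = still_waiting


outbox = Outbox()
outbox.send("hello", now=0, timeout=10)   # peer not reachable yet: just buffered
outbox.send("ping", now=1, timeout=2)     # short-lived message
outbox.set_connected(True, now=5)         # peer comes into range at t=5

print(outbox.delivered, outbox.expired)
```

The caller never sees a network exception. It only finds out later which messages were delivered and which timed out, which matches the “expected and anticipated” view of failures described above.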
OK, that wraps up my brain dump of a subset of the languages being presented at the Emerging Languages Camp 2010. I am hoping we will see many more of them in the future…