Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The flip side of this is that if you're writing something in C, you actually do have to understand how details like closures are implemented on real hardware. These are not trivial problems, yet many programming language design papers and articles effectively brush the whole issue of actually implementing these things under the carpet.

Go ahead and search for that particular example. There is almost nothing on the Web about practical, real world techniques for implementing that sort of programming language feature, despite its ubiquity in functional languages and the increasing presence of -- or emphasis on -- higher order functions and related tools in mainstream imperative languages. The same could be said of many other non-trivial language features, say laziness, or various concurrency models.

I think this is regrettable, because it creates a real barrier for anyone who's interested in learning about these features and maybe writing their own languages one day, but who doesn't come from an academic background where this kind of material was taught. It also makes it unnecessarily difficult for someone from an imperative programming background -- which is probably still the overwhelming majority of programmers -- to understand the real costs and performance implications of using these kinds of features.



This is not quite true. There is SPJ's famous "The implementation of functional programming languages" [1]. Then there is lots of material on various abstract machines, like the SECD machine, the CAM, the many variants of the Krivine machine, the G-machine and plenty of others whose name I forget.

All that said, I kind-of agree with you. It would be nice to have some easy to grasp explanations of how to compile higher-order functions to actual machines, x86, ARM, MIPS ... This material can be found e.g. in Appel's book [2], but I don't think it's available freely (and legally) on the web. I think this is in parts because the compilation process is not so easy: you need to understand the compilation of normal languages, in particular stack layout, and then add more pointers. You also need to think about how to do memory management if a closure lives longer than its creating context. This is quite rich material, and not even usually covered in undergraduate compilers courses.

The situation with concurrency is much worse, but that only reflects the fact that this is far from a solved problem on many levels. To cite a recent paper [3]: "Despite decades of research, we do not have [in 2015] a satisfactory concurrency semantics for any general-purpose programming language that aims to support concurrent systems code [...] Disturbingly, 40+ years after the first relaxed-memory hardware was introduced (the IBM 370/158MP), the field still does not have a credible proposal for the concurrency semantics of any general-purpose high-level language that includes high-performance shared-memory concurrency primitives. This is a major open problem for programming language semantics."

And memory models are but one issue to deal with in concurrency.

[1] http://research.microsoft.com/en-us/um/people/simonpj/papers...

[2] A. Appel, Modern Compiler Implementation in ...

[3] M. Batty et al, The Problem of Programming Language Concurrency Semantics.


> The flip side of this is that if you're writing something in C, you actually do have to understand how details like closures are implemented on real hardware.

If you're writing a compiler targeting real hardware and not a high level virtual machine, you'll need to learn how to implement closures regardless of the "host" language you write your compiler in.

But you're correct, when implementing an interpreter, it's so much easier if you have a garbage collector and an object system of some kind at hand. You'll have to implement this yourself if you're working with C.


If you're writing a compiler targeting real hardware and not a high level virtual machine, you'll need to learn how to implement closures regardless of the "host" language you write your compiler in.

Right, but a lot of the tutorial material in this area doesn't really consider that issue, even though it's rather fundamental to practical applications. Instead the explanation often settles for some sort of indirect dependency on the implementation language's existing version of a similar feature. I think this is unfortunate, not least because it reduces us to a model where most people implementing these ideas depend on some sort of magic run-time environment instead of learning how to actually write that run-time.


Thats because tutorial material is going to keep things simple. If you really want to implement a garbage collector on your own then you should go looking for an advanced compiler book instead of a tutorial.


You can also implement an interpreter in Haskell/ML without using closures to implement closures. It would still be a lot less painful than C.


Of course. But it's still reasonable to ask how you'd write a compiler that implements such features in a free-standing way, without depending on either the built-in tools of the language you're writing an interpreter in or a black box run-time environment/VM that does all the tricky stuff for you by magic. With a relatively low-level language like C, you don't have any choice but to confront such issues head-on, while many tutorials manage to side-step them using higher-level languages. Of course it's easier to write almost anything in a suitable higher-level language, but for educational purposes that isn't really the point IMHO.


In that case, why not write a compiler in Haskell to C ;-)


You're confusing levels. If you are compiling to C or assembly, you have to understand how the constructs you are building work. That places no requirement on the language of implementation of the compiler.


Yes, but IME many tutorial presentations on these subjects never get as far as actually building a complete, functional compiler to demonstrate the concepts. Often they use some form of interpreter instead, which means they can use structures in the interpreter's implementation language that somewhat reflect the run-time structure of the implemented language using the analogous language features.


Well, it makes sense for tutorials to use what they can to simplify the bits they're not focusing on, but certainly implementing a construct by mapping it into an identical construct is not going to be terribly informative as to the workings of that construct. That may be totally appropriate if the focus of the tutorial is on the parsing or such, but it sounds like that sort of tutorial (of which there are many, and which are perfectly useful to other purposes) are not what you are looking for.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: