Dispatch Magic and Concurrency

I have long enjoyed the convenience of the “dot” dispatch syntax: instance.log("Ethereal cognistrands do not support quantum entanglement") The magic of Universal Function Call Syntax makes this sugar for a function call: log(instance, "Ethereal cognistrands do not support quantum entanglement") The broad popularity of method dispatch is driven by these benefits: Ad hoc polymorphism. When used in conjunction with method overloading, we can reuse the same easier-to-remember semantic names across multiple types and parameter configurations, with minimal-to-no ambiguity on which implementation to use.

Data Flow Analysis

Note: This is a heavily revised version of an earlier post The Cone compiler performs a data flow analysis pass after name resolution and type checking. Given that this sort of analysis is rarely covered by compiler literature, I thought it might be useful to jot down some thoughts about its purpose and intriguing mechanics. Trigger Warning: This blog post is highly technical and brief. It reads more like an organizing outline for a design spec than a typical essay-oriented post.

Region Modules: The Rest of the Story

The previous Lifecycle of a Reference post illuminates the working relationship between references and regions. It explains how a reference’s region annotations are programmer-definable struct-like types, which can specify an allocated object’s region state using fields, and region-based operations on an object (e.g., allocate and free) using methods. This, however, leaves out an important part of the story about region definitions: their global state and global runtime logic. Holistically, a Cone region is fully defined by an importable module, within which lies:

The Lifecycle of a Reference

Over two and half years ago, I described something I called Gradual Memory Management. Inspired by Rust and Pony, my paper proposed that it is feasible and desirable for a systems programming language to allow programs to exploit multiple memory management and permission strategies. Without putting memory and data race safety at risk, doing so would facilitate significant improvements to throughput and latency, even in multi-threaded architectures. It is easy to propose wild ideas in words.

Unifying Type and Value Expressions

I seem to be on a roll with syntactic concerns lately, which is not where I typically spend most of my time. Today’s issue is, at least, more unusual than stylistic preferences regarding semicolons and significant indentation. This post explores the issues that arise when teaching the compiler to be agnostic as to whether it is parsing a value expression or a type expression. A value expression computes some typed value when evaluated:

Significant Indentation

The last post focused narrowly on evaluating various techniques for semi-colon inference, thereby enabling a program to successfully compile even when the programmer omits statement-ending semicolons. This post broadens this theme, looding at how the compiler can take advantage of significant indentation when handling block inference and multi-line string literals. I wander into this topic knowing that for many programmers, compiler support for significant indentation is troubling and uncomfortable. They prefer the explicit clarity of required semi-colons to end statements and curly braces to delimit blocks.

Semicolon Inference

Many programming languages require that statements end with a semicolon. Some languages, such as Javascript, Go, Kotlin, Swift, Scala and Lua, make this requirement optional. Although a semicolon may be specified, it is not required to terminate a statement. This post explores the underlying rules various languages use to make semicolon inference possible. It also articulates the arguments for and against a language offering support for it.1 The Challenge The semicolon inference rules would be simple if every statement fit on a single line.

Practical Subtyping

The last two posts introduced the fundamental elements underlying Cone’s type system. They also distinguished between nominal vs. structural type equivalence. However, they were largely silent about subtyping relationships between types. Given how important subtyping is to Cone’s type versatility, I thought it might also be valuable to offer a detailed treatment of these mechanisms. My approach will be practical, rather than formal (alas! no subsumptions nor lattices), distilling lessons learned from implementing a rich, but type-safe collection of subtyping mechanisms.

Reference Type Semantics

The prior post described the semantics underlying Cone’s fundamental types, for all types except the all-important pointer and reference types. The presence of versatile pointers/references as an explicit and distinctly-separate kind of type is a distinguishing characteristic of systems programming languages. I choose to cover them here as a separate post, since the safety guarantees of Cone’s references make them semantically rich and complex. In harmony with the subatomic theme of the prior post, reference types are themselves composed of “quarks”: pointers, regions, permissions, and lifetimes.

Searching for Quarks in Systems Types

When I began work on the Cone programming language, I believed I had a confident handle on which primitive types this modern, safe systems language required: various sizes of atomic number types, fixed-size arrays, programmer-defined structs, raw pointers, safe references, discriminated (and raw) unions, generics, and limited support for void and tuples. My confidence was misplaced. Once or twice a year, implementation challenges and further research forced me to re-think aspects of the type system, each time deepening my understanding of what it required.