Types

It is time for Cone to get a proper module system. This design space is complex and rife with historical missteps. In order to distill the topography of the landscape and clarify the key requirements, I felt it necessary to begin the journey with cross-language research and contemplation. I wound up pursuing three rabbit-chasing adventures in module wonderland: What do programmers want from modularity? You can read about this adventure in my earlier post on modularity, which captures what we want from modularity, and summarizes how the three modularity capabilities are surfaced across different layers of programming language features (and program decomposition).

Cone’s module system has been on my mind of late. The best design for modules is neither easy or obvious, as evidenced by how much modules vary from one language to the next. To guide my approach for Cone, I went back to basics: What is the role (and benefit) that modularity plays in programming (languages)? What role do modules play within this larger picture. This post synthesizes my findings.

Over two and half years ago, I described something I called Gradual Memory Management. Inspired by Rust and Pony, my paper proposed that it is feasible and desirable for a systems programming language to allow programs to exploit multiple memory management and permission strategies. Without putting memory and data race safety at risk, doing so would facilitate significant improvements to throughput and latency, even in multi-threaded architectures. It is easy to propose wild ideas in words.

It is notable how often paradoxes arose in the historical journey that led to the Rise of Type Theory. Resolving Russell’s paradox led to his theory of types. Gödel’s Incompleteness theorems rested its proof around a variation of the Liar Paradox. Church and Turing recapitulated this result using their distinctive formalisms. Intuitionistic type theory was deliberately designed to avoid these paradoxes. And so it goes. As an undergraduate, I had the pleasure of studying Gödel’s ingenious proof as part of a course on Predicate Logic.

After three dreary posts on syntax, let’s change the pace and pursue an entirely different, deeper and more fun topic! In our community’s #theory channel on Discord, someone asked: “Is there a basis for some of the mathematics related to type theory and its relationship to its usage in programming languages?” This triggered a spirited dialogue about the historical antecedents of type theory, how type theory evolved from that, and the interplay between programming languages and type theory.

I seem to be on a roll with syntactic concerns lately, which is not where I typically spend most of my time. Today’s issue is, at least, more unusual than stylistic preferences regarding semicolons and significant indentation. This post explores the issues that arise when teaching the compiler to be agnostic as to whether it is parsing a value expression or a type expression. A value expression computes some typed value when evaluated:

The last two posts introduced the fundamental elements underlying Cone’s type system. They also distinguished between nominal vs. structural type equivalence. However, they were largely silent about subtyping relationships between types. Given how important subtyping is to Cone’s type versatility, I thought it might also be valuable to offer a detailed treatment of these mechanisms. My approach will be practical, rather than formal (alas! no subsumptions nor lattices), distilling lessons learned from implementing a rich, but type-safe collection of subtyping mechanisms.

The prior post described the semantics underlying Cone’s fundamental types, for all types except the all-important pointer and reference types. The presence of versatile pointers/references as an explicit and distinctly-separate kind of type is a distinguishing characteristic of systems programming languages. I choose to cover them here as a separate post, since the safety guarantees of Cone’s references make them semantically rich and complex. In harmony with the subatomic theme of the prior post, reference types are themselves composed of “quarks”: pointers, regions, permissions, and lifetimes.

When I began work on the Cone programming language, I believed I had a confident handle on which primitive types this modern, safe systems language required: various sizes of atomic number types, fixed-size arrays, programmer-defined structs, raw pointers, safe references, discriminated (and raw) unions, generics, and limited support for void and tuples. My confidence was misplaced. Once or twice a year, implementation challenges and further research forced me to re-think aspects of the type system, each time deepening my understanding of what it required.

Type theory has many kinds of typing mechanisms: product, sum, top, bottom, recursive, universal, existential, subtypes, linear, refinement, dependent, and so on. I have run across something different from all those, which I am calling “infectious typing”. If anyone knows a formal name and/or treatment for this mechanism, I would love to hear about it. Type composition allows us to create new types out of existing ones. A struct or class product types is composed of several fields, each with its own type.

Types

When Modules Are Not Just Namespaces

Modularity in Programming

The Lifecycle of a Reference

Weakening Cycles So That Turing Can Halt

The Rise of Type Theory

Unifying Type and Value Expressions

Practical Subtyping

Reference Type Semantics

Searching for Quarks in Systems Types

Infectious Typing