Notes

2025 In Review

2025 was odd.

It was the best of times. I am happier today than I ever have been. I learned gorgeous and utilitarian concepts alike; I crossed the employability threshold and work with incredible comrades; I live in a home with some of my favorite people on the planet. I'll be 20 soon and the future is bright.

It was the worst of times. I was mired in a haze of hopelessness and confusion. I spent months upon months in pain and insomnia; I was cripplingly sick for half a year. For the first time, I have regrets. Proper ones, ones that I'll never forget as long as I live.

It is good to have lived, maybe. I never want to let the preconditions for this year exist again.

Ideas

Spent the first half of the year thinking about how to think about neural networks. Mean-field approaches to spin glasses, nonequilibrium QFT, KL bounds via stochastic coupling, depth-width tradeoffs, representational capacity (benchmarked by circuit complexity classes), singular learning theory, computational mechanics, variational reformulations of RL. Some of it was misguided, much was interesting.

Vaintrob et al. have been pursuing a QFT-inspired approach to interpretability. It's really cool, and I'm excited to (hopefully) see their empirical work come out this year. Dmitry's post on "SLT as a thermodynamic theory of Bayesian learning, but not the thermodynamic theory of Bayesian learning" is one of the most interesting posts I've read this year.

Read Debreu on microeconomics. Yasheng Huang on Chinese capitalism. Ran a Land-focused reading group covering Bateson, Hegel, analytical Marxism, Kant, and Land himself. Finnegans Wake and EGA are both prime examples of idiolects. Yuxi Liu has a great blog. Read the formalization of Chomsky's syntax-semantics theory while recovering from surgery. Thought about chromosomal selection and inducing meiosis in oogonia at the Reproductive Frontiers conference.

Summer rolled around, I was briefly back to 95% capacity, and I was contracting for ILIAD (the org) & planning to intern at Softmax. Making an open problems list for ODYSSEY (the conference) was enlightening. Curation is very, very difficult! Making sure that your theoretical brainchildren touch grass is almost as hard!

Softmax forced me to properly learn to code. Programming is very different when done collaboratively in a shared codebase, doubly so when one's research code must be written quickly, efficiently, and in a manner interpretable to others. I'm glad I got to think about multi-agent RL and what makes good management good.

My thoughts for the rest of the year were less legible. Tiling is an important problem. Formalizing it is hard. Acausal coalitional structure is confusing. Astronomical waste may or may not be an issue. Transpiling meta-ethical frameworks is hard. Biosingularitarian governance is hard. Is macrohistory determined by the nucleation stages of technological development or by deep convergent pressures leading to certain outcomes? How would we know?

At SI, we think about meta-learning and recurrence, among other things. Attempting to understand these deeply has been fruitful. A Berkeley professor ran a seminar on Adorno and poetry with a wonderful reading list. Semiotic physics is confusing, but Owain Evans has some good frames. Evolutionary optimization is surprisingly sample-efficient. Training deep neural networks is hard. Post-AGI futures are confusing and hard to think about, but Korinek has good frames. Weight-sparse models are deeply interesting.

In 2026, I want to drastically shorten my map-territory feedback loops. Intellectually, my greatest flaw this year was not investing in them. Admittedly, I probably did not have the energy for it. Luckily, this year is shaping up to be different.

Some Randomly Sampled Moments

[redacted]

People

[redacted]

Miscellanea

[TBD]


commentary on Leviathan rights

[see Leviathan XVIII]

[1]. Revolution is illegitimate. Sovereign power is derived from the one-time consent of the governed. Not only is it illegitimate to attempt a contractual renegotiation, it is unjust; doubly so if done in the name of God.

[2]. Sovereigns do not contract with the People. Such a construction is structurally incoherent. Thus sovereign authority cannot be forfeited, and regardless the state maintains a monopoly on violence.

[3]. Protest is illegitimate.

[4]. A subject cannot justly critique a sovereign's actions, as partaking in the Covenant consequently places the responsibility for the sovereign's actions on the subject.

[5]. Sovereigns are unpunishable.

[6]. Matters of peace and defence, domestic or otherwise, are solely the purview of the sovereign. Education of the populace likewise.

[7]. Law-making is the sole purview of the sovereign, with the aim of ensuring the constituency acts peaceably and justly.

[8]. The power to judge "controversy", be it legal or factual, belongs to the sovereign.

[9]. The sovereign has the sole authority to make war and peace.

[10]. The sovereign has the sole authority to staff an executive.

[11]. The sovereign may reward or punish his subjects as he sees fit.

[12]. The sovereign maintains a monopoly on status signifiers and honors conferred on subjects.


Criticality in Value Formation

underspecified thesis: qualitative differences in phenomenal effects are primarily determined by the conditions under which nucleation occurs; that is, the environmental conditions at the moment of a phase transition are the primary determinants of long-run behavior.

  • examples: prion diseases, ritonavir. cases where a structure exhibits polymorphism & the particular polymorph propagated is sensitive to initial conditions (see the toy sketch after this list)
  • counterexamples: error-correcting codes (robust to perturbation), some chaotic systems (no 'qualitatively different' basins in double pendulum behavior), mutational reproductive success (more fit mutations will propagate more widely, this is not generally determined by the time at which the mutation appears in the population)
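
A minimal toy of the polymorphism intuition (the lattice, dynamics, and function names are all hypothetical illustration, not a model of any of the systems above): the first nucleation event picks a "polymorph" at random, and growth merely templates the seed, so identical dynamics yield qualitatively different long-run states depending on initial conditions.

```python
import random

def nucleate_and_grow(n_sites=100, polymorphs=("A", "B"), seed=None):
    """Toy nucleation model: the first conversion event picks a polymorph
    at random; every later conversion copies an already-converted neighbor,
    so the lattice's final composition is fixed by the nucleation event."""
    rng = random.Random(seed)
    lattice = [None] * n_sites
    site = rng.randrange(n_sites)
    lattice[site] = rng.choice(polymorphs)  # nucleation: the sensitive initial condition
    frontier = {site}
    while frontier:
        site = frontier.pop()
        for nbr in (site - 1, site + 1):
            if 0 <= nbr < n_sites and lattice[nbr] is None:
                lattice[nbr] = lattice[site]  # growth just templates the seed
                frontier.add(nbr)
    return lattice

# Same dynamics, divergent long-run behavior across nucleation events:
for trial in range(3):
    final = nucleate_and_grow(seed=trial)
    print(f"trial {trial}: lattice converged to polymorph {final[0]!r}")
```

The counterexamples above are exactly the cases where this templating structure is absent: either the attractor is unique, or later selection re-sorts outcomes regardless of when they first appeared.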

is this true for value formation? some cases:

  • broadly, "developmental interpretability," insofar as one is interested in characterizing the stage-wise development of a neural network's policy. the SLT thesis as pursued by Timaeus (see Influence Dynamics and Stagewise Data Attribution, Embryology of a Language Model, Modes of Sequence Models and Learning Coefficients) falls in this category, as do characterizing the inductive bias of SGD, expanding the SLT story to encompass RL, attempts to link algorithmic information theory to modern training dynamics, and the "neural nets as QFTs" perspective (see Grokking as a First Order Phase Transition in Two Layer Networks). (a toy sketch of the key SLT measurable appears after this list)
    • pros: empirical work on actual neural nets we can train and try to interpret!
    • cons: much work involves toy models and doesn't address the "what are values" question; there's a streetlighting effect where we find structure that we look for & ignore the parts of the network which look "random" from this perspective
    • meta-con of all? the theoretical interp work being somewhat predicated on the thesis that the algorithmic structure of the learned policy is determined by phase transitions in some thermodynamic-ish measurables of the network
      • comp-mech/Simplex not like this
    • success of these agendas should be evidence in favor of the thesis
  • sharp left turn discourse
    • existence of sharp left turns implies criticality in value formation (not polymorphism)
    • my summary of the original argument:
      • "being generally capable" is instrumentally useful in a way that "being aligned" is not (also my understanding of the corrigibility is anti-natural argument), so there exists a strong attractor towards capability improvement that does not exist for alignment, alignment & capabilities are not aligned in the limit thus capabilities generalize farther & faster than alignment so your alignment breaks
    • i don't quite understand the arguments or counterarguments or really the arguments for why corrigibility is anti-natural?
    • one way I want to concretize this is to say something about the stability of a logical inductor's valuation of statements which refer to itself (goals are 'just' beliefs about future actions, values are 'just' persistent goals)
      • LIs have Introspection (4.11; roughly, the market assigns probability near 1 to true statements about its own current beliefs) and Self-Trust (4.12; roughly, $\mathbb{P}_n(\phi \mid \mathbb{P}_\infty(\phi) > p) \gtrsim p$, with the conditioning statement quoted into the object language), which makes their behavior nice in the limit
      • plausibly you'd want to study beliefs in a game-like setting, either with information revelation over agent preferences or environment state, and see what happens?
  • humans
    • trauma / philosophy / psychosis / abnormal psychological effects can induce extreme value shifts. this does not seem to be accompanied by an overall increase in individual performance
    • humans raised in a slightly abnormal environment are pretty normal. humans raised outside of society are not very normal.
    • the 'philosopher AI concern' comes from a belief that at some point the agents will be able to arbitrarily reflect & decide what their values should be. i feel like consequentialist agents at time $t_0$ are incentivized not to let this happen at time $t_1 > t_0$.
    • in particular humans cannot arbitrarily intervene on their values very well
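
Since the Timaeus-style story above leans on the local learning coefficient (LLC) as its thermodynamic-ish measurable, here is a minimal sketch of the SGLD-based estimator $\hat{\lambda} = n\beta\,(\mathbb{E}_w[L_n(w)] - L_n(w^*))$ with $\beta = 1/\log n$, run on a toy potential. The potential, hyperparameters, and sampler settings are my own illustrative choices, not Timaeus's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 10_000              # nominal dataset size entering the estimator
beta = 1.0 / np.log(n)  # WBIC-style inverse temperature

def L(w):
    # Toy singular "loss" L(w) = w0^2 * w1^2; its learning coefficient
    # at the origin is exactly 1/2 (a genuinely singular minimum).
    return (w[0] ** 2) * (w[1] ** 2)

def grad_L(w):
    return np.array([2 * w[0] * w[1] ** 2, 2 * w[0] ** 2 * w[1]])

w_star = np.zeros(2)    # the singular minimum we localize around
w = w_star.copy()
eps, gamma = 1e-4, 1.0  # SGLD step size; localization strength

losses = []
for step in range(200_000):
    # SGLD on the tempered, localized posterior
    # exp(-n*beta*L(w) - (gamma/2)*||w - w_star||^2).
    drift = n * beta * grad_L(w) + gamma * (w - w_star)
    w = w - (eps / 2) * drift + rng.normal(size=2) * np.sqrt(eps)
    if step >= 50_000:  # discard burn-in
        losses.append(L(w))

lambda_hat = n * beta * (np.mean(losses) - L(w_star))
print(f"estimated LLC ~ {lambda_hat:.2f} (true value 0.5 for this potential)")
```

The thesis-relevant reading: developmental interpretability treats changes in $\hat{\lambda}$ over training as signatures of exactly the phase transitions hypothesized above.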

Hobbling-Induced Innovation

  • Rather famously, Tesla refuses to use LIDAR and Autopilot only takes 2D observational video data as input. Autopilot is the only production-ready, end-to-end self-driving model. Waymo currently relies on a modular architecture using LIDAR, but is pivoting to end-to-end as well. Tesla seems to have made the correct long-term technical bet (end-to-end models for self-driving), but at the cost of a prima facie nonsensical constraint (strictly less sensory input).
  • AlphaGo Zero was the first of its kind to be trained only on self-play, without reliance on human data. It crushed the Lee Sedol-beating version of AlphaGo 100 games to 0, and the rest is history.
  • At Softmax, we made the Cogs face in their chosen direction before taking a step. This made the agents harder to train and led to less consistent behavioral patterns. However, we made progress on our goal-conditioning agenda.
  • Apple refused to support Flash on iOS, publishing "Thoughts on Flash" in 2010 and pivoting to a solely HTML5-based stack. Adobe stopped developing Flash for mobile in 2011 and discontinued Flash entirely in 2020. Apple lost market share in the short term but clearly won (Flash was not a good product).
  • Rust's borrow checker forbids shared mutable aliasing. As a result, memory-safety errors (use-after-free, data races, and the like) are drastically reduced compared to C/C++, and whole classes of security vulnerabilities are ruled out at compile time.

All five of these share the property of "removing functionality to hopefully raise the long-term ceiling of performance." It is unclear if all of these modifications did raise the ceiling! Hindsight informs us that supervised learning on human data for two-player, zero-sum, perfect-information games is indeed a crutch. But it seems to be relatively straightforward to integrate LIDAR or radar data into an E2E self-driving training stack, and both grant visibility in environments where video-only data is differentially disadvantaged.

Picking at the Tesla case more: it is true that LIDAR per-unit prices were at ~$8,000 in 2019, and integrating one would have killed any chance at making an affordable FSD consumer vehicle. Today, Luminar has brought this down to $500 in the USA, and Chinese manufacturer Hesai sells sensors for $200 a pop. Prices will continue to fall, LIDAR will no longer be price-prohibitive, and Rivian plans to take advantage of the full sensor array when developing its FSD model. What gives?

Google X has the mindset that one must kill good things to make way for truly great ones. "Necessity is the mother of invention." Making a 10x breakthrough is only 2x harder. And for sure, constraining the problem to only its essential inputs can result in more scalable and successful solutions (SpaceX's Raptor 3 is a case in point). But was it fundamentally necessary for Tesla to ban LIDAR?

Argument for: LIDAR was prohibitively expensive, and Tesla would have failed to get the distribution necessary for data collection had it used LIDAR. Counter: fair, but this doesn't address why there's a lack of radar (very useful in low-visibility scenarios, cheap, would have improved safety).

Argument for: Elon-culture is a package deal, Elon-culture was the determinative factor in the development of Autopilot, and Elon-culture takes hardcore minimalism and runs with it. Counter: I can believe this (Casey Handmer says as much), but it still seems so obviously optimal that once the 0-1 is achieved you optimize for having a good product. Human eyes are not optimal sensors for driving; there's no point sticking to the human form factor!

Moving away from Tesla: I think we can construct a typology of reasons why one would intentionally hobble their development (via restriction) for the sake of innovation. First, because it bakes in a fundamental limitation (AlphaGo is like this, Tesla's original argument can be argued to be like this). Second, because restriction allows for better design (as in the case of Rust and Apple's refusal to use Flash), and better design creates a healthier ecosystem (this seems to be mostly applicable to platform-based products). Third, because adopting the stance of doing a Hard thing is useful, and artificially increasing the Hardness of the task has better consequences (I think of Elon like this, within limits: push up to the boundaries set by physics and no farther).

It takes skill to understand the directions in which one can productively make a problem harder. Facebook famously failed at its pivot to HTML5 around the same time as Apple's. Tesla's removal of radar ruffled feathers in the engineering team. Survivorship bias rules all, and given PMF it's probably easier to make development too hard than too easy (following customer incentive-gradients sets a floor & a strong signal).

It's probably good to implement a kind of regularization in research-heavy, 0-1 product development: strip out all the assumptions, solve the core task, then add additional configurations on top of a good foundation. I don't think it's necessary to continue hobbling oneself once it's proven unnecessary. That is masochism, and your competitors will beat you.


Idiolects?

  1. French fluency is neither necessary nor sufficient for understanding EGA.
  2. There's a certain sense in which understanding a particular French "dialect" (the collection of words + localized grammar + shared mental context required to make sense of EGA, the one which forms the basis for modern French algebraic geometry (?)) is a sufficient condition for understanding EGA.
  3. There's also a sense in which understanding this French algebro-geometric dialect is an almost necessary condition for understanding EGA past a certain point (happy to consider disputations; perhaps the understanding the necessity condition buys is directed less at the concepts the literature built upon and more at the peculiarities of Grothendieck et al.'s mental states & historical context).
  4. Packaging "shared mental context" with a "dialect" and subsequently claiming that understanding the "dialect" is necessary and sufficient for understanding the embedded concepts is begging the question.
  5. It seems like there is this restricted language associated with a set of concepts, the concepts themselves are understood in the context of the restricted language, the concepts are mostly divorced from the embedded grammar of the parent language, and we don't have a very good way of drawing a boundary around this "restricted language."
  6. In a general sense, this kind of "conceptual binding" is not rigid. Strong Sapir-Whorf is incorrect, the Ghanaian can learn English, I can just read Hartshorne or solely Anglophone literature to learn algebraic geometry.
  7. However, canonical boundaries make sense even when the boundaries are leaky. A species is not completely closed under reproduction; however, it makes sense to think of species as effectually reproductively closed. A cell wall separates a cell from its environment, even if osmosis or active transport allows various molecules to pass in and out.
  8. One might expect this binding to be "stronger" when the inferential distance between the typical concepts of some reference class of language-speaker and the concepts discussed in the "dialect" is larger.
  9. A general description of a language used by a group of communicators is the tuple (alphabet, shared conception of grammatical rules, shared semantic conception of language atoms & combinator outputs).
  10. Outside of purely formal settings, the shared conceptions of grammar & semantics will be leaky. How much can be purely recovered from shared words?
  11. However, there are natural attractors in this space. Ex. traditional dialects, modern languages. Shared-conception diffs between speakers of the same language are significantly smaller than those between speakers of two different languages (the latter is by default unresolvable unless there's some shared conception of translation, at which point they're sort of speaking the same conceptual language?)
  12. When talking about algebraic geometry, it feels like an English geometer and a French geometer are speaking more similar languages than a French geometer and a French cafe owner.
  13. I want to say: "an idiolect is a natural attractor in the space of languages for a group of communicators discussing a certain set of concepts, the idioms of the idiolect are identified with the concepts discussed, and the idiolect is quasi-closed under idiomatic composition." (a toy sketch of this closure criterion appears after this list)
  14. Identifying shared languages as emergent coordination structures between a group of communicators feels satisfying.
  15. However, returning to the case of algebraic geometry, it feels like I can "grok" the definitions of the structures described without understanding the embedded French grammar in EGA. Maybe the correct decomposition of a shared language is (shared idiomatic conception) + (translation interface), and we should just care about the "pre-idiolect."
  16. This is just a world model? Describable without reference to other communicators? Loses some aspect of "coordination"?
  17. Maybe the pre-idiolect is s.t. n communicators can communicate idioms & their compositions with minimal description of a translation interface.
  18. The idiom <-> concept correspondence feels correct. Like, on some level, one of the primary purposes of a grammatical structure is to take the concepts which are primarily bound to words & make sense of their composition, and lexicogenesis is a large part of language-making. But it feels like restricting to word-like atoms is too constraining: there are structural atoms that carry semantic meaning, and "idiom" can encompass these.
  19. How do you reify concept-space enough to chunk it into non-overlapping parts?
  20. I am trying to point at a superstructure and say "here is a superstructure." I am trying to identify the superstructure by a closure criterion, and I am trying to understand what the closure criterion is. Something language-like should be identifiable this way? And the appropriate notion of closure will then let us chunk correctly?
  21. Maybe superstructures are not generally identifiable via closure?
  22. The load-bearing constraint for considering species as superorganisms is a closure property. They're not particularly well-describable by Dennett's intentional stance.
  23. I want to say "idiolect:communicator:idiom :: species:member-organism:gene."
  24. I don't want to identify lexemes as the atoms of a language-like structure. Chomsky et al.'s new mathematical merge formalism is cool but contrived, and I have not seen a clean way to differentiate meaningful lexeme composition from non-meaningful lexeme composition.
  25. "Shared understanding" feels better? The point of a language is a mechanism by which communicators communicate, and it so happens that languages happen to be characterizable by some general formal propeties.