<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></title><description><![CDATA[Abhinav Madahar (अभिनव ਮਦਾਹਰ) is an independent computer scientist. His research focuses on advancing the reasoning capabilities of frontier artificial intelligence models.]]></description><link>https://abhinavmadahar.com</link><image><url>https://substackcdn.com/image/fetch/$s_!thXZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2bd3875-1ece-444f-ae16-8ae902b26b59_1022x1022.png</url><title>Abhinav Madahar · अभिनव ਮਦਾਹਰ</title><link>https://abhinavmadahar.com</link></image><generator>Substack</generator><lastBuildDate>Mon, 06 Apr 2026 06:13:34 GMT</lastBuildDate><atom:link href="https://abhinavmadahar.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[abhinavmadahar@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[abhinavmadahar@substack.com]]></itunes:email><itunes:name><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></itunes:name></itunes:owner><itunes:author><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></itunes:author><googleplay:owner><![CDATA[abhinavmadahar@substack.com]]></googleplay:owner><googleplay:email><![CDATA[abhinavmadahar@substack.com]]></googleplay:email><googleplay:author><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Why Do We Pursue Science?]]></title><description><![CDATA[A Reflection on the Scientific 
Endeavour]]></description><link>https://abhinavmadahar.com/p/why-do-we-pursue-science</link><guid isPermaLink="false">https://abhinavmadahar.com/p/why-do-we-pursue-science</guid><dc:creator><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></dc:creator><pubDate>Tue, 03 Feb 2026 11:23:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!thXZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2bd3875-1ece-444f-ae16-8ae902b26b59_1022x1022.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>My life is very, very unusual.</p><p>I spend my time attempting to construct a mathematical structure which can reason generally. To do so, I need to examine what intelligence itself is, abstracting away from any particular realization.</p><p>Conceptually, this is very far removed from finding berries and hunting gazelle, the behavioural envelope against which humans are optimized. Science, generally, is far removed from immediate practical use.</p><p>Why do we pursue science? Why do we examine the zeros of functions, the behaviour of enzymes, and the motion of particles? I do not ask this question while hoping for a vaguely self-congratulatory answer, e.g. that we pursue science to satisfy a sense of curiosity inherent to humans. <em>Really, why do we pursue science?</em> At the fundamental level, science is pursued because its results are useful. The criteria of usefulness may change, but this structure is steady.</p><p>Science was first used because its results were theologically useful. Religious dynamics turned on the visible behaviour of celestial objects, so improvements in the ability to predict these behaviours <em>ex ante</em> improved the ability to plan religious activities. 
Because celestial behaviour was complex enough that studying it only through its utility to religion slowed progress, astronomers emerged who studied celestial behaviour directly, without individually considering the downstream theological utility of the knowledge they produced. Consequently, astronomy was the first science to emerge, in the sense of being the first field which saw intergenerationally sustained growth in an understanding of a matter which was not itself applied. We can contrast this with, for example, improvements to agricultural techniques in the millennia preceding the third-millennium BCE emergence of astronomy; although intergenerational growth was certainly present, it was growth in an applied technique, not in a body of knowledge whose value derived from having that knowledge itself.</p><p>Astronomy in this context was funded in the hope that its findings would eventually be theologically useful, but individual astronomers and individual findings were not evaluated by directly forecasting their eventual theological utility. Instead, they were evaluated on the extent to which they advanced astronomical knowledge itself, and the eventual theological utility of this knowledge and its producers was considered separately, by examining the extent to which astronomical advancement imparted theological utility.</p><p>This structure has persisted since the first emergence of science in the third millennium BCE, and it arose in all other contexts where science emerged and lasted: individual scientists and individual scientific findings were evaluated on the extent to which they advanced science itself, while scientific fields as a whole were evaluated on the extent to which they were useful for matters of importance. 
This was necessary: as bodies of scientific knowledge grow, it becomes increasingly difficult to evaluate how useful individual findings are for downstream uses, forcing an evaluative decoupling.</p><p>This decoupling is where I want to focus my attention, and it lets us sharpen our earlier question. In place of our earlier query of why science is pursued, we can ask whether advances in science should be evaluated exclusively on the extent to which they advance their respective fields, exclusively on the extent to which they affect downstream utility, or in some other way.</p><p>This is a difficult question to answer. When scientific funding was limited, the difficulty of the question was immaterial, as its answer had minimal society-wide impact. However, as scientific funding rose rapidly with the transition to the modern era, the question became increasingly salient: states and societies became compelled to justify their increasingly large expenditure on basic scientific research.</p><p>The scientific community in the nineteenth century attempted to evade this question by constructing an understanding of how basic scientific work is eventually translated into practical utility. Explicit attention was systematically paid to the chains through which findings in a given field would eventually yield practical utility, chains which formed networks in substance even if not explicitly articulated as such. For example, it became recognized that discoveries in how specific chemical reactions proceed could plausibly improve the collective understanding of certain biological processes, which could in turn plausibly lead to improved medicine. 
As such, it became structurally routine and central to articulate the pathway by which basic research could eventually be translated into practical utility.</p><p>These translational pathways have grown in scope and nuance, and for many areas of science, they obviate the need to ask the question. If a field is sufficiently mature that the form of its findings can be reliably predicted <em>ex ante</em>, then translational pathways can be identified for its findings, and the field need not answer this question for any finding with such a pathway. This is not an obscure observation&#8212;far from it. In many fields, identifying such translational pathways, and thereby sidestepping the question, is a major research activity, and engaging in it is seen not as defensive justification but as responsible consideration of downstream utility. Given the difficulty of this question and the possible benefits of certain scientific advances, this is less a cowardly evasion and more a case of individuals needing to conduct important work before philosophers finish answering a question.</p><p>This question does not need to be answered once an appropriate translational pathway has been constructed.</p><p>My own research has such a translational pathway. Advances in our understanding of the nature of intelligence improve our ability to construct an artificial intelligence which can answer the questions we ask and conduct intellectual labour we consider useful. The translational pathway is so simple as to be nearly comical: as long as there are questions worth answering and problems worth solving, advances in intelligence are useful.</p><p>My expectation is that, over a very long horizon, continued economic growth will make this question progressively less important. 
If increases in scientific funding at the onset of the modern era forced this question, increases in the economic base that free more funding for science would lessen its significance. Perhaps humanity in the distant future will look back at our current attempts to justify science as quaint endeavours from an era when scientific funding was limited and in need of rationing.</p><p>My education was in mathematics, with my education in artificial intelligence fitted atop it. As the reader might expect, I deeply internalized the prevailing norm in mathematics that advances in the field should be evaluated on their contributions to mathematics, not on their practical utility. More than most fields, mathematics is structurally forced to address this question, as its translational pathways are arguably the most complex of any scientific field. Its continued existence rests not on articulated translational pathways but on an exceptional risk&#8211;reward structure: even if almost all mathematical findings are not useful, the few which are justify the field.</p><p>I will most likely continue thinking about this question for the remainder of my life, assuming the philosophers of science do not provide an unexpected answer, even though its answer is immaterial for my research. If there does exist a satisfying answer to this question, my research in intelligence may even lead to the production of an artificial intelligence which answers it for us. 
My hope is that a solution is in our stars.</p>]]></content:encoded></item><item><title><![CDATA[Artificial Intelligence Is Arriving in the World We Already Live In]]></title><description><![CDATA[Why Artificial Intelligence Must Be Addressed at the International Level]]></description><link>https://abhinavmadahar.com/p/artificial-intelligence-is-arriving</link><guid isPermaLink="false">https://abhinavmadahar.com/p/artificial-intelligence-is-arriving</guid><dc:creator><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></dc:creator><pubDate>Fri, 23 Jan 2026 02:34:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!thXZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2bd3875-1ece-444f-ae16-8ae902b26b59_1022x1022.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Discussions of advanced artificial intelligence often assume a comforting distance: that systems with genuinely transformative capabilities belong to a future society&#8212;one that has already resolved today&#8217;s political, legal, and institutional challenges. That assumption is increasingly untenable.</p><p>The pace of progress in artificial intelligence has already exceeded what many experts expected only a few years ago. While it is possible that this period of rapid improvement will slow, the more responsible assumption is that capability growth will continue, and may even accelerate. We should therefore seriously consider the possibility that highly capable AI systems will emerge not in a hypothetical future, but in the world as it exists now: fragmented, unequal, and governed by institutions that were not designed with such systems in mind.</p><p>This matters because the benefits and risks of artificial intelligence scale together. 
Systems capable of solving harder problems can materially improve medicine, make infrastructure safer, and expand access to education for people who are currently under-served. At the same time, those same systems can cause harm&#8212;either through intentional misuse or through unintended consequences. Preventing such harm is not a problem we can defer until after these systems exist. It is a problem that must be addressed in advance.</p><p>Crucially, many of the most serious risks posed by advanced AI cannot be managed by individual actors acting alone. They are international in structure, cross-border in impact, and coordination-dependent by nature.</p><h2>Preventing Misuse Requires International Coordination</h2><p>Highly advanced AI systems require enormous upfront investment to develop. As a result, only a small number of states and well-resourced non-state actors are capable of creating them. This concentration is often overlooked, but it is important: while the <em>use</em> of AI systems can be widely distributed, the <em>creation</em> of frontier systems is not.</p><p>That distinction creates a narrow surface on which safety measures can be applied. Regulating individual users of AI is a highly dispersed and often intractable task. But the fact that only a limited number of entities can build the most capable systems means that meaningful safeguards are, in principle, feasible&#8212;if approached at the right level.</p><p>The difficulty is that misuse can be deliberately obscured through decentralization. A malicious actor need not commit an overtly harmful act in any single jurisdiction. Instead, harmful activity can be partitioned across systems, locations, or intermediaries, such that each individual use appears benign in isolation. The malicious intent becomes visible only when the full pattern is examined.</p><p>For example, consider a hypothetical actor attempting to develop a novel biological weapon. 
Rather than using a single AI system to pursue this goal directly, they might distribute the process across multiple systems located in different jurisdictions&#8212;using each system for tasks that appear innocuous on their own. No single state, acting alone, would necessarily detect the misuse. The risk emerges only at the global level.</p><p>This is not an argument that international coordination is preferable. It is an argument that, for the most difficult and dangerous forms of misuse, international coordination is <em>necessary</em>. Individual states and organizations can prevent some misuse on their own. But the forms of misuse that are hardest to detect and most consequential to prevent are precisely those that evade unilateral oversight.</p><p>It is therefore appropriate to say that only the international community is positioned to prevent the most challenging cases of AI misuse&#8212;not because it holds unique authority, but because it is the only level at which the relevant patterns are visible.</p><p>Importantly, this does not require prescribing specific legal mechanisms or enforcement strategies. The point is not to dictate how prevention must occur, but to recognize where responsibility ultimately lies. Once that responsibility is acknowledged, different states can pursue different implementation paths consistent with their legal and political systems.</p><p>At the same time, the scientific community has made substantial progress in understanding how to constrain AI systems so that they behave in ways humans consider acceptable. The question of <em>how</em> to influence system behaviour is increasingly a technical one. What remains unresolved&#8212;and cannot be answered by engineers alone&#8212;is <em>which behaviours should be permitted and which should not</em>. 
That is a normative question, and it belongs to society and its institutions, not to laboratories.</p><h2>Accidents Are a Different Kind of Risk</h2><p>Not all harm from artificial intelligence arises from malicious intent. As AI systems become more capable, they can engage in increasingly complex behaviour. In some cases, this complexity leads to decisions or actions that were not intended by their human users, and that nonetheless cause harm.</p><p>Unlike intentional misuse, such accidents cannot always be predicted in advance. They arise from interactions between system capabilities, deployment contexts, and real-world environments that are difficult to model exhaustively. As AI systems are deployed at global scale, the consequences of such failures are unlikely to remain confined within national borders.</p><p>A single state may be able to regulate how AI systems are used within its own jurisdiction. But it is far more difficult for any state to ensure that systems deployed elsewhere do not cause harm within its borders. When systems operate across digital and physical infrastructure that spans countries, impact&#8212;not jurisdiction&#8212;becomes the relevant unit of analysis.</p><p>Containing the risk of unintended harm therefore requires governance frameworks that assume cross-border spillovers as the default, not as an exception. At that scale, bespoke bilateral treaties or narrowly scoped agreements are unlikely to be sufficient. What is required instead is coordinated international effort that treats accidental harm as a shared problem rather than a series of isolated failures.</p><p>This, again, is not a call for centralized control or uniform regulation. It is a recognition that when impact crosses borders by design, governance must do so as well.</p><h2>The Role of the International Community</h2><p>Artificial intelligence presents challenges that cannot be fully addressed by individual states or private actors acting independently. 
As systems grow in capability and reach, so too does their potential to help and to harm. While many concerns can and should be handled locally, those involving AI systems deployed and used across jurisdictions quickly become difficult&#8212;often intractable&#8212;for any single actor to manage alone.</p><p>At the same time, the scientific community continues to advance rapidly in its understanding of intelligence and in its ability to realize it in machine form. Institutional progress must occur in parallel. The international community has a responsibility to examine how these systems can be constrained, how risks can be anticipated, and how harms that were never intended can nonetheless be prevented.</p><p>The question is not whether advanced AI will arrive. It is whether our institutions will be prepared when it does.</p>]]></content:encoded></item><item><title><![CDATA[Toward 'Theory of Reasoning' Reasoning Architectures]]></title><description><![CDATA[A cognitive-scientific framework for unifying reasoning architectures under a theory of reasoning]]></description><link>https://abhinavmadahar.com/p/toward-theory-of-reasoning-reasoning</link><guid isPermaLink="false">https://abhinavmadahar.com/p/toward-theory-of-reasoning-reasoning</guid><dc:creator><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></dc:creator><pubDate>Fri, 24 Oct 2025 03:32:48 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!thXZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2bd3875-1ece-444f-ae16-8ae902b26b59_1022x1022.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3>Introduction</h3><p>Recent years have seen the rise of <em>reasoning architectures</em>&#8212;frameworks such as Chain-of-Thought (CoT), Tree-of-Thoughts (ToT), and their successors&#8212;that enable large models to reason through structured intermediate steps. 
Yet despite impressive empirical results, these systems remain heuristic and cognitively ungrounded.</p><p>This white paper introduces <strong>Theory-of-Reasoning Reasoning Architectures (ToR-RAs)</strong>, a programme to formalise the <em>cognitive architecture</em> of reasoning itself: how coherence, metacognitive control, and resource allocation jointly give rise to structured thought.</p><div><hr></div><h3>Why This Work</h3><p>Reasoning architectures today resemble early computing before Turing&#8212;powerful, but lacking a theory of what they actually <em>are</em>. Just as the theory of computation defined &#8216;what it means to compute&#8217;, a theory of reasoning should define &#8216;what it means to reason&#8217;.</p><p>The ToR-RA framework treats reasoning as a cognitive architecture composed of interacting subsystems for inference, evaluation, and metareasoning operating under bounded resources. The aim is not merely to engineer better reasoning algorithms, but to <strong>found a science of reasoning architectures</strong> linking cognitive-scientific principles with computational design.</p><div><hr></div><h3>Conceptual Foundations</h3><p>A reasoning architecture, as defined in the paper, satisfies four invariants:</p><ul><li><p><strong>Coherence:</strong> Local inferences support global consistency.</p></li><li><p><strong>Causality of thought:</strong> Representational changes have identifiable precursors.</p></li><li><p><strong>Introspectability:</strong> The system can form meta-representations of its own reasoning.</p></li><li><p><strong>Economy:</strong> Inference optimises progress per unit of cognitive cost.</p></li></ul><p>These invariants align with long-standing frameworks in cognitive science&#8212;Marr&#8217;s three levels of analysis, rational metareasoning, and dual-process models&#8212;and together they provide a bridge between biological and artificial reasoning.</p><div><hr></div><h3>Relation to Prior Work</h3><p>ToR-RAs extend my 
earlier developments in reasoning architectures:</p><ul><li><p><strong>Lateral Tree-of-Thoughts (LToT)</strong> &#8212; reasoning breadth and cognitive economy.</p></li><li><p><strong>Natural Language Edge Labelling (NLEL)</strong> &#8212; semantic self-instruction and metacognitive control.</p></li></ul><p>Both suggested that reasoning architectures could serve as <em>models of reasoning</em> rather than merely as engineering tools. ToR-RAs make that synthesis explicit.</p><div><hr></div><h3>Significance</h3><p>ToR-RAs aim to unify cognitive science and reasoning-architecture research under a shared theoretical language, integrating interpretability, efficiency, and scientific clarity.</p><p>They represent a step toward understanding reasoning not as a black-box capability but as an <em>architectural phenomenon</em>&#8212;something that can be studied, designed, and improved with the same rigour once brought to computation itself.</p><div><hr></div><p><strong>Abhinav Madahar &#183; &#2309;&#2349;&#2367;&#2344;&#2357; &#2606;&#2598;&#2622;&#2617;&#2608;</strong><br>Independent Computer Scientist<br><a href="https://abhinavmadahar.com">abhinavmadahar.com</a>&#8195;|&#8195;abhinav@abhinavmadahar.com</p>]]></content:encoded></item><item><title><![CDATA[Toward a General Cure for Cancer]]></title><description><![CDATA[Why We Might Need an AGI Researcher]]></description><link>https://abhinavmadahar.com/p/toward-a-general-cure-for-cancer</link><guid isPermaLink="false">https://abhinavmadahar.com/p/toward-a-general-cure-for-cancer</guid><dc:creator><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></dc:creator><pubDate>Wed, 08 Oct 2025 19:33:39 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!thXZ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2bd3875-1ece-444f-ae16-8ae902b26b59_1022x1022.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>One-sentence 
summary:</strong><br>Rather than relying solely on incremental, subtype-by-subtype progress, the world should fund and govern a globally accessible <em>AGI cancer researcher</em> &#8212; a language-based, tool-using system coupled to autonomous laboratories &#8212; explicitly tasked with discovering general interventions that restore growth control across tumors.</p><div><hr></div><h2>What unites cancers &#8212; and why a general strategy is thinkable</h2><p>Cancer&#8217;s common denominator is loss of growth control: cells sustain proliferative signaling, evade suppressors, resist death, and bypass intrinsic and extrinsic checkpoints. These capabilities recur across diverse tissues even as the initiating lesions vary; the <em>hallmarks of cancer</em> framework distills this convergence and has been repeatedly refined with new dimensions, including phenotypic plasticity, non-mutational epigenetics, and the tumor microenvironment [1&#8211;3]. Large-scale pan-cancer efforts show conserved disruptions to pathways across histologies, supporting the premise that cross-tumor interventions can be reasoned about without denying heterogeneity [4, 5].</p><p>Clinical practice already acknowledges shared vulnerabilities. Tumor-agnostic approvals &#8212; pembrolizumab for MSI-H/dMMR tumors and for TMB-high disease; larotrectinib for <em>NTRK</em> fusions &#8212; demonstrate that targeting a common failure mode (mismatch-repair deficiency, hypermutation, or an oncogenic fusion) can cut across tissue of origin. 
These are not &#8220;general cures,&#8221; but they prove the concept that lineage-agnostic levers exist [6&#8211;9].</p><div><hr></div><h2>Two routes to a general cure</h2><p><strong>The direct route</strong> is today&#8217;s default: mobilize human expertise to map mechanisms, integrate multi-omic and clinical data, run trials, and &#8212; through cumulative advances &#8212; approach broadly applicable interventions.</p><p><strong>The indirect route</strong> is technically plausible now: build an artificial general intelligence (AGI) capable of cross-domain scientific reasoning at the effective scale of the global research enterprise and assign it the problem <em>&#8220;discover a general cure for cancer.&#8221;</em> Such a system would read and synthesize literature; generate and prune hypotheses; design, schedule, and interpret experiments; integrate multi-modal data; and orchestrate long-horizon programs with automated labs and human clinicians.</p><p>Importantly, this augments rather than displaces scientists: by coupling models to <em>self-driving laboratories</em>, hypotheses must survive experiment, and error-prone reasoning is bounded by closed-loop validation and independent safety gates. The constituent technologies exist and are improving: &#8220;robot scientist&#8221; systems in biology and modern autonomous experimentation in chemistry, materials, and protein engineering have demonstrated end-to-end loops from hypothesis to result [10, 11].</p><p>The practice is not without precedent. When a concise, hand-checkable route was out of reach in mathematics, the Four-Colour Theorem was settled via a computer-assisted proof, later formalized in a proof assistant. 
A comparable modus operandi &#8212; computational procedures that scale beyond unaided human capacity, paired with verification &#8212; could apply to discovering pan-cancer interventions [12, 13].</p><div><hr></div><h2>Why a language-centric AGI is a plausible substrate for science</h2><p>Skeptics doubt language-model-based systems can reliably &#8220;do science.&#8221; Yet a substantial literature treats language as a scaffold for abstract thought and self-regulation, with &#8220;inner speech&#8221; coordinating cognition. If much high-level reasoning is conducted <em>through</em> language, scaling language-native models &#8212; augmented with tools, external memory, and verification &#8212; becomes a plausible route to general scientific competence (not the only route, but a powerful one) [14&#8211;16].</p><p>Just as relevant, contemporary reasoning models improve when afforded more <em>test-time compute</em>: structured deliberation, multiple solution paths, and self-consistency checks raise accuracy on hard problems. For frontier queries &#8212; e.g., <em>&#8220;design a multi-component regimen that durably restores growth control across contexts&#8221;</em> &#8212; reliability will hinge as much on inference-time thinking and verification as on pretraining scale [17&#8211;19].</p><div><hr></div><h2>A pragmatic program sketch</h2><p>A credible AGI-for-cancer effort should be practical and staged:</p><ol><li><p><strong>Knowledge integration.</strong> Unified corpora (literature, protocols, negative results) with governed access to clinical and genomic datasets; privacy-preserving analytics (e.g., federated learning) to respect data sovereignty and patient consent while enabling cross-site modeling. 
Frameworks from the Global Alliance for Genomics and Health (GA4GH) and large biobanks illustrate workable governance [20, 21].</p></li><li><p><strong>Hypothesis generation and curation.</strong> Multi-omic integration to surface lineage-agnostic vulnerabilities (DNA-damage response, cell-cycle checkpoints, apoptotic priming, telomere maintenance), systematically cross-validated against pan-cancer resources [3, 5].</p></li><li><p><strong>Automated experimentation.</strong> High-throughput perturbation (CRISPR/RNAi, small-molecule/protein modalities) run by self-driving labs to test and refine mechanistic claims, with pre-registered plans and independent replication nodes [10, 11].</p></li><li><p><strong>Design against evolution.</strong> From the outset, incorporate adaptive-therapy principles &#8212; explicitly optimizing combinations and schedules to delay or prevent resistance &#8212; so candidates are evaluated not just for response but for evolutionary stability [22, 23].</p></li><li><p><strong>Translational pathways.</strong> Use tumor-agnostic regulatory precedents as a template for trials centered on shared biomarkers or pathway states, then iterate toward progressively broader indications [6&#8211;9].</p></li></ol><div><hr></div><h2>The economics: train once, think many times</h2><p>Public estimates indicate that compute for recent frontier models already costs in the tens to low hundreds of millions of dollars, with several analyses projecting over $1 billion training runs within the next few years if current trends persist. The Stanford <strong>AI Index</strong> and Epoch AI both document rapid growth in training cost and compute; importantly, the total cost to <em>reason</em> on hard scientific questions will also include substantial inference-time computation (deliberation, retrieval, simulation, verification). 
Planning should assume wide error bars but a clear directional trend: training may be expensive, and sustained &#8220;thinking&#8221; on a problem this hard will be, too [24, 25].</p><div><hr></div><h2>Who pays &#8212; and how to keep it fair, safe, and sustainable</h2><p>At this scale, durable funding and legitimate governance exceed the remit of any single philanthropist, firm, or nation. Analogues exist: CERN&#8217;s council and shared capital facilities; ITER&#8217;s cost-sharing treaty; pooled health-financing and access mechanisms in the Global Fund and Gavi; and UNESCO&#8217;s <em>Open Science Recommendation</em> [26&#8211;30].</p><p>A treaty-backed, UN-anchored vehicle could (i) pool contributions, (ii) procure compute, power, and data at utility scale, (iii) embed auditability, safety testing, and biosecurity review, (iv) guarantee open-science outputs proportionate to public funding, and (v) allocate inference budgets to a short list of humanity-scale questions &#8212; with <em>a general cure for cancer</em> as a flagship.</p><p>Because public investors will expect safeguards, such a body should align with the WHO&#8217;s guidance on artificial intelligence in health and with emerging international safety norms (for example, the Bletchley Declaration). Concretely: staged capability evaluations, independent red-teaming, strict separation between <em>in silico</em> design and wet-lab actuation, tiered access to high-risk tools, and human-in-the-loop oversight for any experimental transition &#8212; especially in biology [31, 32].</p><p>Data stewardship should be designed in, not bolted on. Federated learning and federated analysis let models learn from distributed datasets without moving patient data; GA4GH&#8217;s frameworks provide principled governance; and biobank access regimes supply working templates for consent, oversight, and transparency [20, 21].</p><p>Biosecurity must be non-negotiable. 
Wet-lab integration should comply with existing nucleic-acid synthesis-screening frameworks (for example, the U.S. HHS guidance for providers of synthetic double-stranded DNA) and the International Gene Synthesis Consortium&#8217;s <em>Harmonized Screening Protocol</em>, with procurement requirements that restrict access to screened providers and benchtop devices. Independent review boards should have veto authority over any agent-design capabilities, and audit logs should be mandatory for all tool calls that could touch sequence design or synthesis [33, 34].</p><p>Environmental externalities must be managed up front. Training and large-scale inference consume energy and, depending on siting and cooling, water. Best-practice engineering &#8212; efficient hardware and algorithms, carbon-free energy procurement, low-water cooling, circularity targets &#8212; materially reduces impact. Credible social license will require binding carbon and water budgets, transparent accounting, and independent auditing [35, 36].</p><div><hr></div><h2>What success would mean</h2><p>If such an undertaking succeeded, it would culminate nearly two centuries of pathology and oncology &#8212; from Virchow&#8217;s cellular pathology, through the molecular and genomic eras, to a general restoration of growth control delivered with machine reasoning. The field&#8217;s <em>telos</em> &#8212; durably ending malignant proliferation &#8212; would be realized not by displacing human scientists, but by completing their project with a new kind of colleague.</p><div><hr></div><h2><strong>References</strong></h2><p>[1] Hanahan D. &amp; Weinberg R.A. <em>The Hallmarks of Cancer.</em> Cell 100, 57&#8211;70 (2000).<br>[2] Hanahan D. &amp; Weinberg R.A. <em>Hallmarks of Cancer: The Next Generation.</em> Cell 144, 646&#8211;674 (2011).<br>[3] Hanahan D. <em>Hallmarks of Cancer: New Dimensions.</em> Cancer Discovery 12, 31&#8211;46 (2022).<br>[4] Vogelstein B. 
<em>et al.</em> <em>Cancer Genome Landscapes.</em> Science 339, 1546&#8211;1558 (2013).<br>[5] ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. <em>Pan-Cancer Analysis of Whole Genomes.</em> Nature 578, 82&#8211;93 (2020).<br>[6] Le D.T. <em>et al.</em> <em>Mismatch-repair deficiency predicts response of solid tumors to PD-1 blockade.</em> N. Engl. J. Med. 372, 2509&#8211;2520 (2015).<br>[7] Marabelle A. <em>et al.</em> <em>Tumour mutational burden and outcomes with pembrolizumab (KEYNOTE-158).</em> Ann. Oncol. 31, 131&#8211;141 (2020).<br>[8] Drilon A. <em>et al.</em> <em>Larotrectinib in TRK fusion-positive cancers.</em> N. Engl. J. Med. 378, 731&#8211;739 (2018).<br>[9] U.S. FDA. <em>Pembrolizumab for TMB-H solid tumors.</em> News Release (2020).<br>[10] King R.D. <em>et al.</em> <em>The Automation of Science.</em> Science 324, 85&#8211;89 (2009).<br>[11] MacLeod B.P. <em>et al.</em> <em>Self-driving laboratory for accelerated materials discovery.</em> Matter 3, 613&#8211;628 (2020).<br>[12] Appel K. &amp; Haken W. <em>Every planar map is four colorable.</em> Illinois J. Math. 21, 429&#8211;567 (1977).<br>[13] Gonthier G. <em>Formal Proof &#8212; The Four-Color Theorem.</em> Notices Amer. Math. Soc. 55, 1382&#8211;1393 (2008).<br>[14] Alderson-Day B. &amp; Fernyhough C. <em>Inner Speech: Development, Cognitive Functions, Phenomenology, and Neurobiology.</em> Psychol. Bull. 141, 931&#8211;965 (2015).<br>[15] Carruthers P. <em>The Cognitive Functions of Language.</em> Behav. Brain Sci. 25, 657&#8211;726 (2002).<br>[16] Lupyan G. &amp; Clark A. <em>Words and the World.</em> Curr. Dir. Psychol. Sci. 24, 279&#8211;284 (2015).<br>[17] Wang X. <em>et al.</em> <em>Self-Consistency Improves Chain-of-Thought Reasoning.</em> arXiv:2203.11171 (2022).<br>[18] Yao S. <em>et al.</em> <em>Tree of Thoughts: Deliberate Problem Solving with LLMs.</em> arXiv:2305.10601 (2023).<br>[19] OpenAI. 
<em>Learning to Reason with LLMs (o1).</em> Technical Report (2024).<br>[20] Global Alliance for Genomics and Health. <em>Framework for Responsible Sharing of Genomic and Health-Related Data.</em> (2018).<br>[21] Bycroft C. <em>et al.</em> <em>The UK Biobank Resource with Deep Phenotyping and Genomic Data.</em> Nature 562, 203&#8211;209 (2018).<br>[22] Gatenby R.A. <em>et al.</em> <em>Adaptive Therapy.</em> Cancer Res. 69, 4894&#8211;4903 (2009).<br>[23] Gatenby R.A. &amp; Brown J.S. <em>Evolution and Ecology of Resistance in Cancer Therapy.</em> Cold Spring Harb. Perspect. Med. 8, a033415 (2018).<br>[24] Stanford HAI. <em>AI Index Report 2024.</em> (2024).<br>[25] Epoch AI. <em>Trends in AI Training Compute and Costs.</em> (2024).<br>[26] CERN. <em>Convention and Governance.</em> (1953 &#8211; present).<br>[27] ITER Organization. <em>Agreement on the Establishment of the ITER Project.</em> (2006).<br>[28] The Global Fund. <em>Results Report.</em> (2025).<br>[29] Gavi, the Vaccine Alliance. <em>Governance and Board Resources.</em> (2025).<br>[30] UNESCO. <em>Recommendation on Open Science.</em> (2021).<br>[31] World Health Organization. <em>Ethics and Governance of Artificial Intelligence for Health.</em> (2021).<br>[32] UK Government. <em>The Bletchley Declaration.</em> (2023).<br>[33] U.S. HHS. <em>Screening Framework Guidance for Providers of Synthetic Double-Stranded DNA.</em> (2010).<br>[34] International Gene Synthesis Consortium. <em>Harmonized Screening Protocol v2.0.</em> (2017).<br>[35] Patterson D. <em>et al.</em> <em>The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink.</em> IEEE Computer 55, 18&#8211;28 (2022).<br>[36] Masanet E. 
<em>et al.</em> <em>Recalibrating Global Data Center Energy-Use Estimates.</em> Science 367, 984&#8211;986 (2020).</p><div><hr></div><p><strong>Note:</strong> This essay is adapted from a scholarly draft intended for submission as a <em>Nature Comment</em>; it has not been peer-reviewed.</p>]]></content:encoded></item><item><title><![CDATA[If AI is a GPT, Are We Missing a Complement?]]></title><description><![CDATA[Examining Current AI Discourse from a History-Informed Approach]]></description><link>https://abhinavmadahar.com/p/if-ai-is-a-gpt-are-we-missing-a-complement</link><guid isPermaLink="false">https://abhinavmadahar.com/p/if-ai-is-a-gpt-are-we-missing-a-complement</guid><dc:creator><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></dc:creator><pubDate>Thu, 03 Jul 2025 09:07:33 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/92ce1d8f-d8a1-4659-a2e0-e01d0492ffce_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Abstract</h2><p>Artificial intelligence is plausibly a general&#8209;purpose technology (GPT), and the current AI discourse overlooks at least one potentially decisive complement: the ability to scale continuously-available, low&#8209;carbon electricity.  Drawing on nineteenth&#8209;century mis&#8209;predictions&#8212;Qing vs.&#8239;Tokugawa and Ottoman/Turkish vs.&#8239;Meiji&#8212;this paper argues that <strong>rapidly deployable nuclear&#8209;fission baseload, especially via small&#8209;modular reactors (SMRs), may determine which economies can translate frontier AI into broad productivity gains</strong>.  Empirical power&#8209;curve data show frontier&#8209;model training demand doubling roughly every year and inference demand doubling every three years, implying a plausible multi&#8209;hundred&#8209;terawatt&#8209;hour baseload gap by the early 2030s&#8212;far beyond the scaling potential of renewables&#8209;plus&#8209;storage.  
Mainland China&#8217;s 150&#8239;GW nuclear pipeline illustrates how overlooked complements can invert the rankings suggested by GPU or talent metrics alone, allowing it to outpace the United States despite the latter&#8217;s semiconductor lead.</p><p>The analysis concludes that states should hedge complement uncertainty by fast&#8209;tracking SMR licensing and finance alongside other policy initiatives.  Otherwise, today&#8217;s AI&#8209;readiness dashboards may prove as myopic as nineteenth&#8209;century trade&#8209;balance tables that failed to foresee Japan&#8217;s industrial leapfrogging.</p><h2>1. Introduction</h2><p>A <strong>general&#8209;purpose technology&#8239;(GPT)</strong> is a breakthrough that, like the steam engine or electrification, finds applications across many sectors, improves rapidly, and demands complementary investments&#8212;skills, infrastructure, business models&#8212;before its full economic impact is felt.  Artificial intelligence&#8212;especially its prospective culmination in artificial general intelligence&#8239;(AGI)&#8212;fits this definition: it is already seeping into code writing, molecular design, and logistics, its capabilities double on an annual cadence, and it calls for costly complements.</p><p>Current AI&#8209;readiness conversations typically underline a familiar constellation of complements: frontier&#8209;class GPU clusters and specialised accelerators; vast, well&#8209;curated proprietary data sets; deep pools of machine&#8209;learning talent (often buttressed by liberal immigration regimes); clear regulatory frameworks for data privacy, liability, and safety evaluation; and ample venture or sovereign capital capable of underwriting billion&#8209;dollar training runs. 
These are crucial&#8212;but they share an unstated assumption: <strong>electricity supply can scale up as necessary and does not form a bottleneck.</strong>  This paper argues that <em>nuclear fission capacity</em>, especially via small&#8209;modular reactors, may prove a more decisive complement&#8212;yet it is largely absent from current readiness indices and investment road&#8209;maps.</p><h2>2. General&#8209;Purpose Technologies</h2><p>General&#8209;purpose technologies (GPTs) are foundational innovations whose impact radiates far beyond any one sector.  Economists Bresnahan and Trajtenberg (1995) define a GPT by three traits:</p><ol><li><p><strong>Pervasiveness</strong>&#8212;the technology can be applied across many industries (e.g. steam engines powered mines, factories, and railways).</p></li><li><p><strong>Continuous technical improvement</strong>&#8212;the core technology keeps getting cheaper and more capable, making new uses viable (electric motors from Edison dynamos to micro&#8209;drives).</p></li><li><p><strong>Wide&#8209;ranging complementarities</strong>&#8212;real benefits emerge only when firms and governments invest in co&#8209;inventions such as new skills, business models, or infrastructure.</p></li></ol><p>Classic GPTs include the Watt steam engine, electrification, and digital computing.  Each triggered growth waves only after decades of complementary investment: factory re&#8209;layout for electric motors, interstate highways for automobiles, organisational IT restructuring for computers.</p><p>Artificial intelligence&#8212;especially its potential culmination in artificial general intelligence (AGI)&#8212;exhibits the same signature.  Large language models already play roles in code generation, drug design, and logistics; capabilities improve on an annual cadence; and they demand costly complements ranging from specialised chips to data&#8209;governance regimes.  
Whether AGI becomes transformative therefore hinges less on raw algorithmic progress than on which societies assemble the right <strong>hidden complements</strong>.  The sections that follow show how past observers repeatedly mis&#8209;identified those complements&#8212;and why energy&#8209;system readiness may prove the overlooked GPT enabler today.</p><h2>3. Historical Mis&#8209;Specification</h2><p><strong>Tokugawa Japan vs. High&#8209;Qing Mainland China (c.&#8239;1800)</strong><br>The transformative GPT on the horizon in 1800 was the <strong>steam&#8209;powered factory system</strong>&#8212;initially textile spindles and iron rolling mills.  European merchants assumed the Qing Empire, with its gigantic domestic market and entrenched silver flows, would adopt industrial steam first.  Yet two <em>unseen complements</em> were brewing in Tokugawa Japan:</p><ul><li><p><strong>Mass literacy and numeracy.</strong>  By 1800 as many as <strong>40 % of commoners</strong> attended <em>terakoya</em> schools; printed <em>kanbun</em> primers circulated through commercial lending libraries.  That literacy pool became essential for running and maintaining imported steam machinery and for double&#8209;entry cloth&#8209;mill accounting.</p></li><li><p><strong>Proto&#8209;financial depth.</strong>  Osaka&#8217;s rice&#8209;futures exchange&#8212;arguably the world&#8217;s first standardised commodities market&#8212;provided credit instruments, forward contracts and risk&#8209;transfer mechanisms that lowered capital costs for mechanised mills.</p></li></ul><p>Qing China, despite larger factor endowments, lacked both complements: broad&#8209;based literacy remained confined to the exam&#8209;elite; merchant remittance networks were powerful but regionally siloed.  
When Western iron steamers arrived, <strong>Japan could mobilise talent and finance in a decade; the Qing could not.</strong> By 1890 Meiji cotton&#8209;spindle capacity was three times China&#8217;s, and by 1895 Japan&#8217;s steam&#8209;powered warships had defeated the Beiyang Fleet.</p><p><strong>Meiji Japan vs. Ottoman/Turkish Reforms (1868&#8211;1923)</strong><br>By the 1880s the next GPT wave&#8212;<strong>late&#8209;steam railways, telegraphy and early electrification</strong>&#8212;required both engineering talent and centralised fiscal capacity.  European economists wagered the Ottoman Empire would industrialise alongside Japan: it enjoyed access to London capital markets and was already importing British locomotives.  Two mis&#8209;specified complements upended that forecast:</p><ul><li><p><strong>Universal elementary schooling.</strong>  Japan&#8217;s 1872 <em>Gakusei</em> law pushed enrolment above <strong>90&#8239;% by 1905</strong>, creating a broad technician and clerk class that could absorb telegraph codes, blueprint reading and dynamo maintenance.  Ottoman primary enrolment remained under <strong>20&#8239;%</strong> until the Young Turk era, limiting the skilled workforce.</p></li><li><p><strong>Cohesive fiscal centralisation.</strong>  The Meiji state overhauled tax collection, created the Bank of Japan (1882), and channelled surplus into <em>zaibatsu</em> industrial combines.  Ottoman finance remained fragmented across provinces and subject to European debt control, choking railway extensions and delaying electrified streetcar concessions.</p></li></ul><p>By 1913 Japan&#8217;s factory output was growing <strong>~9&#8239;% annually</strong>, while Ottoman industrial share stagnated.  Agrarian exports, not electrified factories, dominated Turkish GDP through the 1920s.</p><p><strong>Lesson.</strong> Historically, observers have repeatedly failed to identify the <em>actual</em> complements a GPT would require.  
That historical blind spot warns us that today&#8217;s overlooked complement may be something as unglamorous&#8212;but indispensable&#8212;as Osaka merchants trading rice futures. </p><div><hr></div><h2>4. Complement Uncertainty Today</h2><blockquote><p><em>&#8220;If you look at Chinese power production versus U.S. power production, the Chinese graph is straight up and to the right &#8230; AI capabilities ultimately depend on incredible amounts of energy.&#8221;</em> &#8212; <strong>Alexandr Wang</strong> (CSIS 2025)</p></blockquote><h3>4.1 Why electricity moves from cost variable to hard constraint</h3><p>The transformative nature of AI makes decade&#8209;long forecasts especially treacherous: language&#8209;model <strong>inference demand depends on how deeply AGI capabilities become woven into the wider economy&#8212;an adoption curve that is itself hard to forecast ex&#8239;ante&#8212;so its decade&#8209;scale growth is intrinsically uncertain</strong>, whereas the <strong>training cost of frontier models follows a clearer hardware&#8209;scaling trajectory that analysts have tracked for half a decade.</strong>  That distinction shapes the stress&#8209;test below. 
Recent studies separate AI workloads into two very different energy trajectories:</p><div id="datawrapper-iframe" class="datawrapper-wrap outer" data-attrs="{&quot;url&quot;:&quot;https://datawrapper.dwcdn.net/8mqiz/1/&quot;,&quot;thumbnail_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c008890-ff78-44f1-a1e2-2eb07feb9b09_1260x660.png&quot;,&quot;thumbnail_url_full&quot;:&quot;&quot;,&quot;height&quot;:274,&quot;title&quot;:&quot;Forecasting AI Energy Demand a Decade from Today&quot;,&quot;description&quot;:&quot;Create interactive, responsive &amp; beautiful charts &#8212; no code required.&quot;}" data-component-name="DatawrapperToDOM"><iframe id="iframe-datawrapper" class="datawrapper-iframe" src="https://datawrapper.dwcdn.net/8mqiz/1/" width="730" height="274" frameborder="0" scrolling="no"></iframe><script type="text/javascript">!function(){"use strict";window.addEventListener("message",(function(e){if(void 0!==e.data["datawrapper-height"]){var t=document.querySelectorAll("iframe");for(var a in e.data["datawrapper-height"])for(var r=0;r<t.length;r++){if(t[r].contentWindow===e.source)t[r].style.height=e.data["datawrapper-height"][a]+"px"}}}))}();</script></div><p>Mainstream forecasts&#8212;such as the IEA&#8217;s 2024 <em>Electricity</em> outlook, which projects data&#8209;centre use merely doubling by 2030&#8212;rely on <strong>linear extrapolations of efficiency gains</strong> and assume limited adoption breadth.  Our stress&#8209;test suggests those models may <strong>severely underestimate</strong> long&#8209;run electricity demand because they omit two compounding drivers: (i) scaling laws that push every new frontier model up by an order of magnitude and (ii) second&#8209;order economic diffusion once AGI becomes a genuinely general&#8209;purpose technology.  
If either factor materialises at half the historical rate observed in early LLM deployments, aggregate load would already exceed IEA&#8217;s &#8216;high&#8217; scenario by the early 2030s.</p><p>By 2035 AI could plausibly require a continuous electricity flow equivalent to <strong>30 % of all 2024 global generation</strong>&#8212;most of it 24&#8239;/&#8239;7 baseload for multi&#8209;week training runs.</p><h3>4.2 Why only nuclear fission can close the 24&#8239;/&#8239;7 gap in the 2030s window</h3><ul><li><p><strong>Intermittent renewables</strong> require massive over&#8209;build plus long&#8209;duration storage to deliver round&#8209;the&#8209;clock power.  Delivering a steady 1&#8239;GW load from solar&#8209;wind hybrids typically demands 3&#8209;5&#8239;GW of nameplate capacity <strong>plus</strong> 8&#8209;10&#8239;hours of battery or pumped&#8209;hydro storage, pushing levelised costs above $120&#8239;MWh&#8315;&#185; (IEA 2024).  Scaling that to multi&#8209;gigawatt AI campuses compounds mineral bottlenecks&#8212;lithium, nickel, rare&#8209;earth magnets&#8212;already stressed by EV demand.</p></li><li><p><strong>Fossil generation</strong> can deliver 24&#8239;/&#8239;7 power today, but doing so at scale clashes with tightening carbon&#8209;pricing regimes and statutory net&#8209;zero targets.  At a CO&#8322; price of <strong>US&#8239;$60&#8211;100&#8239;t&#8315;&#185;</strong> (already legislated in the EU ETS and proposed in several U.S. states), unabated gas&#8209;turbine electricity exceeds <strong>US&#8239;$110&#8239;MWh&#8315;&#185;</strong>&#8212;on par with or costlier than SMRs, while adding &gt;&#8239;400&#8239;g&#8239;CO&#8322;&#8239;kWh&#8315;&#185;.  Carbon&#8209;capture retrofits can cut emissions by 90&#8239;% but add another $25&#8209;40&#8239;MWh&#8315;&#185; and consume ~15&#8239;% of plant output, further eroding economic viability.  
Consequently, most G20 decarbonisation road&#8209;maps leave little headroom for new unabated fossil capacity in the 2030s, making it a politically and financially risky bet for long&#8209;term AI baseload.</p></li><li><p><strong>Hydro&#8239;&amp;&#8239;geothermal</strong> can offer true baseload in a handful of locations&#8212;think Iceland&#8217;s geothermal datacentres or Norway&#8217;s surplus hydro&#8212;but their global expansion potential is severely limited.  The best hydro sites are already dammed; new large dams face ecological&#8209;social pushback, and prospective capacity adds only ~70&#8239;GW worldwide by 2035 (IEA&#8239;2024).  Geothermal is constrained by favourable geology; even generous DOE estimates put scalable flash&#8209;steam potential below 200&#8239;GW globally&#8212;&lt;&#8239;3&#8239;% of the stress&#8209;test AI load.  Crucially, most of the regions where <em>large&#8209;scale AI compute centres</em> are clustering&#8212;Pacific Northwest cloud zones, Northern&#8239;Virginia&#8217;s "Data&#8209;Center Alley," Phoenix&#8217;s solar&#8209;fed campuses, Dublin and Frankfurt in the EU, and the Seoul&#8209;Incheon and Shenzhen&#8209;Guangzhou megaregions in Asia&#8212;possess little untapped hydro headroom and minimal high&#8209;enthalpy geothermal resources.  
The compute hubs cannot count on local renewable baseload, sharpening the case for deploy&#8209;anywhere fission.</p></li><li><p><strong>Small&#8209;modular reactors (SMRs)</strong> offer a fundamentally different nuclear&#8209;build paradigm:</p><ul><li><p><strong>Factory&#8209;fabricated modules.</strong> Pressure vessels, steam generators, and containment systems are built on dedicated production lines, shipped by rail or barge, and installed &#8220;plug&#8209;and&#8209;play&#8221; on site&#8212;cutting quality&#8209;control risks that plagued gigawatt&#8209;scale stick&#8209;built reactors.</p></li><li><p><strong>Shorter schedules.</strong> Standardised modules plus reduced on&#8209;site civil works target <strong>36&#8211;48&#8239;months</strong> from first concrete to grid&#8209;sync for FOAK units; clone copies could drop to <strong>&lt; 30&#8239;months</strong>.</p></li><li><p><strong>High capacity factor (&#8805;&#8239;90&#8239;%) and load&#8209;following.</strong> Passive safety systems (natural&#8209;circulation core cooling, gravity&#8209;fed ECCS) allow flexible output without costly fossil peakers, making SMRs ideal for variable AI&#8209;training loads that nonetheless require 24&#8239;/&#8239;7 availability.</p></li><li><p><strong>10&#8211;300&#8239;MW electric blocks.</strong> A single 77&#8239;MW NuScale VOYGR or 100&#8239;MW GE&#8209;Hitachi BWRX&#8209;300 module can be co&#8209;located with a hyperscale datacentre; sites can scale by adding more pods as demand grows&#8212;mirroring cloud &#8220;availability&#8209;zone&#8221; doctrine.</p></li><li><p><strong>Minimal exclusion zones.</strong> Integrated containment and passive decay&#8209;heat removal shrink emergency&#8209;planning zones to the plant fence line (&lt;&#8239;1&#8239;km), unlocking brownfield siting at retired coal plants or even industrial parks.</p></li><li><p><strong>Competitive cost trajectory.</strong> DOE Cost Study (2024) projects 
<strong>US&#8239;$75&#8211;90&#8239;MWh&#8315;&#185;</strong> LCOE for nth&#8209;of&#8209;a&#8209;kind SMRs by early 2030s&#8212;20&#8211;40&#8239;% below dispatchable renewables + storage and ~30&#8239;% below unabated gas at $80&#8239;t&#8209;CO&#8322;.</p></li><li><p><strong>Environmental profile.</strong>  Life&#8209;cycle analyses show SMRs emit <strong>&lt;&#8239;15&#8239;g CO&#8322;&#8209;eq&#8239;kWh&#8315;&#185;</strong>&#8212;on par with wind and an order of magnitude below gas with CCS (UNSCEAR 2016; IPCC 2022).  Land footprint is tiny: a 300&#8239;MW SMR campus fits in ~12 ha versus &gt;&#8239;5&#8239;000 ha of solar&#8209;plus&#8209;storage for the same firm output.  Cooling demand can be air&#8209;based or use recycled municipal water, mitigating freshwater constraints.  High&#8209;level waste volume per kWh is comparable to large reactors, but smaller core inventories enable on&#8209;site dry&#8209;cask storage for decades, buying time for eventual deep&#8209;geologic repositories.</p></li></ul></li></ul><p>These traits map cleanly onto AI&#8209;compute needs: modularity matches datacentre scaling, high uptime suits multi&#8209;week training runs, small footprint eases siting near load, and low life&#8209;cycle carbon aligns with net&#8209;zero mandates.</p><h3>4.3 Fission as a Hidden GPT Complement</h3><p>The examples above imply an unsettling parallel with the 19th&#8209;century cases: observers in 1800 tracked Qing trade volumes yet missed Tokugawa literacy; planners in 1880 prized Ottoman access to European capital yet overlooked universal schooling.  In the same vein, <strong>rapidly deployable fission capacity may constitute the overlooked complement that determines which states can transform AI from laboratory showcase to economy&#8209;wide GPT</strong>.  
If the stress&#8209;test electricity gap materialises and only a subset of nations can bring SMRs online at scale&#8212;due to supply&#8209;chain depth, licensing agility, or public&#8209;finance tools&#8212;those nations could enjoy a multi&#8209;year window of compute abundance that accelerates domestic productivity, defence applications, and AI research itself.  The complement is &#8220;hidden&#8221; not because energy is unfamiliar, but because today&#8217;s readiness frameworks treat energy as a price variable, not a binding technical pre&#8209;condition.</p><p>Mainstream projections&#8212;IEA&#8217;s &#8220;data&#8209;centre demand doubles&#8221; case, Goldman Sachs&#8217; 3&#215; scenario&#8212;<strong>anchor policy debates</strong>, yet they almost certainly <strong>low&#8209;ball plausible electricity demand</strong> because they assume linear efficiency gains and limited GPT diffusion.  When forecasters under&#8209;shoot the requirement by an order of magnitude, energy is framed as a <em>cost</em> variable, not a <em>capacity</em> constraint.</p><p>That framing means the ability to <strong>rapidly scale fission baseload&#8212;particularly via SMRs&#8212;plausibly remains a complement whose very </strong><em><strong>status</strong></em><strong> as a complement is under&#8209;recognized ex&#8239;ante.</strong>  The blind spot rhymes with 19th&#8209;century mis&#8209;specifications: observers prized Qing trade volumes yet missed Meiji literacy, or hailed Ottoman capital inflows while overlooking universal schooling.  If nuclear agility turns out to be decisive, today&#8217;s readiness dashboards will look as myopic tomorrow as those earlier benchmarks appear in hindsight.</p><div><hr></div><h2>5. Illustrative 2025 Contrast: Mainland China vs. 
United States</h2><p>Mainland China and the United States appear neck&#8209;and&#8209;neck on most headline AI&#8209;readiness metrics&#8212;GPU clusters, talent pipelines, venture funding&#8212;but they diverge sharply on their ability to deliver <strong>continuous, low&#8209;carbon baseload</strong> in the crucial 2025&#8209;2035 window.</p><div id="datawrapper-iframe" class="datawrapper-wrap outer" data-attrs="{&quot;url&quot;:&quot;https://datawrapper.dwcdn.net/XNfKw/1/&quot;,&quot;thumbnail_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb24d24d-bae7-4d03-b9a4-601eb999a1c8_1260x660.png&quot;,&quot;thumbnail_url_full&quot;:&quot;&quot;,&quot;height&quot;:538,&quot;title&quot;:&quot;Comparison between Mainland China and the United States&quot;,&quot;description&quot;:&quot;Create interactive, responsive &amp; beautiful charts &#8212; no code required.&quot;}" data-component-name="DatawrapperToDOM"><iframe id="iframe-datawrapper" class="datawrapper-iframe" src="https://datawrapper.dwcdn.net/XNfKw/1/" width="730" height="538" frameborder="0" scrolling="no"></iframe><script type="text/javascript">!function(){"use strict";window.addEventListener("message",(function(e){if(void 0!==e.data["datawrapper-height"]){var t=document.querySelectorAll("iframe");for(var a in e.data["datawrapper-height"])for(var r=0;r<t.length;r++){if(t[r].contentWindow===e.source)t[r].style.height=e.data["datawrapper-height"][a]+"px"}}}))}();</script></div><p><strong>Implications.</strong> If fission baseload turns out to be the hidden complement that gates AGI scale, Beijing could unlock tens of GW of new 24&#8239;/&#8239;7 power by early 2030s, while Washington might still be waiting for FOAK SMRs to clear regulatory and funding hurdles.  That gap mirrors the 19th&#8209;century mis&#8209;specification: outside observers bet on the Ottoman Empire&#8217;s European capital inflows, yet Japan&#8217;s overlooked literacy and fiscal cohesion let it industrialise first.  
Today&#8217;s analysts may overweight U.S. semiconductor design dominance while under&#8209;weighting Mainland China&#8217;s energy&#8209;infrastructure surge.</p><h2>6. Conclusion</h2><p>History teaches a sober lesson: <strong>the complements that unlock a general&#8209;purpose technology often remain invisible until after diffusion begins</strong>.  Early&#8209;nineteenth&#8209;century merchants prized Qing silver flows and Ottoman access to London capital, yet mass literacy, fiscal cohesion, and proto&#8209;finance turned out to be the catalytic enablers in Meiji Japan.  Today&#8217;s AI&#8209;readiness dashboards risk the same error.  By treating electricity as a priced&#8209;in commodity rather than a hard technical prerequisite, they under&#8209;estimate a looming constraint that even optimistic energy outlooks do not capture.</p><p>If the empirically observed 12&#8209;month doubling in training power and three&#8209;year doubling in inference load persist&#8212;even at half speed&#8212;AI workloads alone could require <strong>hundreds of terawatt&#8209;hours of continuous, low&#8209;carbon baseload within a decade</strong>.  Intermittent renewables with storage, geographically capped hydro, and politically constrained fossil generation cannot scale quickly or cleanly enough.  <strong>Small&#8209;modular fission reactors are, for now, the only technology offering grid&#8209;agnostic, 90&#8239;%&#8209;uptime power at a levelised cost below dispatchable renewables.</strong>  Yet not one major readiness index tracks SMR licensing agility, supply&#8209;chain depth, or nuclear&#8209;finance instruments.</p><p>The policy implication is straightforward.  States that wish to lead&#8212;or simply remain competitive&#8212;in an AGI world must adopt <strong>portfolio hedging</strong> against complement uncertainty.  Rapid fission build&#8209;out should sit alongside chip subsidies, data&#8209;governance reforms, and talent visas.  
Governments that begin permitting, financing, and standardising SMRs today will create an option value that those who wait cannot replicate on short notice.</p><p>In the nineteenth century, literacy and proto&#8209;finance were the hidden complements that let late&#8209;arriving Japan overtake larger polities.  In the twenty&#8209;first, <strong>scalable nuclear baseload may play the same quiet, decisive role</strong>.  </p><div><hr></div><h2>References</h2><p>Bresnahan, Timothy F., and Manuel Trajtenberg. 1995. &#8220;General Purpose Technologies: &#8216;Engines of Growth&#8217;?&#8221; <em>Journal of Econometrics</em> 65 (1): 83&#8211;108.</p><p>Datawrapper. 2025. <em>&#8220;Nuclear-Build Readiness: Mainland China vs. United States.&#8221;</em></p><p>DOE (U.S. Department of Energy). 2024. <em>Advanced Small Modular Reactor Cost Study: nth-of-a-Kind Projections for the Early 2030s.</em> Washington, DC: Office of Nuclear Energy.</p><p>Epoch AI. 2025. <em>Compute and Power Trends Across Large AI Models, 2019-2025.</em> Epoch Policy Brief 11.</p><p>Goldman Sachs Research. 2024. <em>&#8220;The Power Behind the Cloud: AI Electricity Demand Could Triple by 2030.&#8221;</em> Global Energy Note, October 2024.</p><p>IEA. 2024. <em>Electricity 2024 &#8211; Annex: Artificial Intelligence and Data-Centre Outlook.</em> Paris: International Energy Agency.</p><p>IPCC. 2022. <em>Climate Change 2022: Mitigation of Climate Change.</em> Contribution of Working Group III to the Sixth Assessment Report. Geneva: Intergovernmental Panel on Climate Change.</p><p>OECD. 2024. <em>OECD AI Policy Observatory &#8211; Going Digital Indicators.</em> Paris: Organisation for Economic Co-operation and Development.</p><p>Oxford Insights. 2024. <em>Government AI Readiness Index 2024.</em> London: Oxford Insights.</p><p>Stanford Institute for Human-Centered AI. 2025. <em>AI Index Report 2025.</em> Stanford, CA: Stanford University.</p><p>UNSCEAR. 2016. 
<em>Sources, Effects and Risks of Ionizing Radiation: 2016 Report to the General Assembly.</em> New York: United Nations Scientific Committee on the Effects of Atomic Radiation.</p><p>Wang, Alexandr. 2025. &#8220;Scale AI&#8217;s Alexandr Wang on Securing U.S. AI Leadership.&#8221; Remarks at the Center for Strategic and International Studies, May 1, 2025. Transcript lines 121&#8211;124.</p><p>Wells Fargo Securities. 2024. &#8220;AI Compute Energy Demand: A First Look at the Power Curve.&#8221; Investor Note, March 2024.</p>]]></content:encoded></item><item><title><![CDATA[Constraining Malicious AGI]]></title><description><![CDATA[Exploring the Capacities of States and Non-State Actors to Develop and Use Artificial General Intelligence for Harm, and the Legal Barriers That May Limit Them]]></description><link>https://abhinavmadahar.com/p/constraining-malicious-agi</link><guid isPermaLink="false">https://abhinavmadahar.com/p/constraining-malicious-agi</guid><dc:creator><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></dc:creator><pubDate>Sun, 25 May 2025 12:59:55 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/245df239-be8e-4e2e-b4dc-c914023083e8_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Author&#8217;s Note:</strong> This article introduces an artificial intelligence policy research question I intend to develop over the coming months: <em>to what extent do states and non-state actors differ in their capacities to create and use artificial general intelligence for malicious purposes, and how effectively can current legal regimes constrain them? </em>This is an increasingly important question to consider, but current discourse addresses it inadequately. 
My coming publications will partially fill this lacuna.</p><div><hr></div><h2><strong>Introduction</strong></h2><p>As the capabilities of artificial intelligence (AI) systems continue to advance, the prospect of artificial general intelligence (AGI) has transitioned from speculative possibility to plausible medium-term outcome. This transition invites a reconsideration of how the global system might constrain the malicious use of AGI&#8212;not merely in the abstract, but across the concrete capacities of actors who may seek to develop and deploy it. While international discourse often emphasizes state behavior, non-state actors&#8212;from research collectives to organized illicit networks&#8212;may also possess the capabilities necessary for AGI creation and use. Understanding these actors&#8217; comparative capacities, the legal structures constraining them, and the gaps therein, is critical to effective governance in the years ahead.</p><div><hr></div><h2><strong>Background: Artificial General Intelligence</strong></h2><p>Artificial intelligence refers to the field of creating machines capable of solving problems that require reasoning. Since the mid-twentieth century, this field has undergone several paradigm shifts, with the multilayered perceptron marking an early milestone. The modern era of AI began in earnest with the 2012 demonstration by Krizhevsky et al. that increasing model size yields large performance gains, inaugurating the era of scaled deep learning.</p><p>AGI refers to systems capable of solving nearly all problems that humans can solve using reasoning. In contrast to narrow AI&#8212;designed for specialized tasks such as image recognition or translation&#8212;AGI is intended to perform across domains and adapt to new problems. Recent progress toward AGI is largely driven by large language models (LLMs), which integrate symbolic reasoning, memory, and latent concept formation in a single architecture. 
These models are trained on massive corpora using transformer architectures and have shown generalization capacity beyond initial expectations.</p><p>Developing AGI entails challenges across four domains: scientific (developing new architectures and optimization techniques), engineering (operationalizing and scaling frontier techniques), data (curating, processing, and aligning high-quality training sets), and compute (securing the hardware and energy necessary for training large models). Scientific progress is largely confined to elite research communities. Engineering implementation is more broadly accessible but still requires deep technical skill. Data and compute are limited by institutional capacity.</p><p>In the medium-term future&#8212;defined here as the next three to seven years&#8212;AGI systems are likely to first solve all closed-form expert-level problems (e.g., those in the Humanity&#8217;s Last Exam benchmark), then extend to highly structured intellectual labor (e.g., evaluating weapons design feasibility), and finally encroach upon unstructured problem-solving domains (e.g., developing operational plans for unconventional biological attacks). These capacities will not emerge uniformly, but rather across a distributed ecosystem of actors with differing capabilities and constraints.</p><div><hr></div><h2><strong>Background: Political Science and International Law</strong></h2><h3><strong>What is a state, and what is its capacity?</strong></h3><p>Under the Montevideo Convention (1933), a state is defined as a political entity with a defined territory, permanent population, governing structure, and the capacity to enter into relations with other states. Jurisprudential debate remains on whether this capacity requires institutional potential or actual functional engagement. 
Although the Convention is binding only on its American signatories, it is often treated as a codification of customary international law.</p><p>Alternative conceptions include the Weberian view, where the state holds a monopoly on legitimate violence, and the recognition-based positivist view, where statehood derives from international recognition. The concept of the state must also be distinguished from that of a polity&#8212;a broader term encompassing any internally governed political community. The administrative, fiscal, and military capacity associated with modern states has expanded considerably since the eighteenth and nineteenth centuries, when centralization and institutional consolidation became core to statecraft.</p><h3><strong>What is a non-state actor, and what is its capacity?</strong></h3><p>Non-state actors include all entities operating outside formal state institutions. These range from multinational corporations to criminal networks, religious movements, philanthropic foundations, and research collectives. Importantly, non-state actors are not inherently benevolent or malicious; M&#233;decins Sans Fronti&#232;res, for example, is widely regarded as a highly benevolent non-state actor.</p><p>Some non-state actors operate with substantial structural coherence and strategic sophistication. While they typically lack sovereign legal status, treaty access, or diplomatic protections (e.g., immunity), they may possess state-like capacities such as territorial control, coercion, and infrastructure. Their ability to develop AGI depends on access to compute, data, engineering talent, scientific insight, and their ability to evade regulatory oversight.</p><h3><strong>Comparative capacity analysis</strong></h3><p>States and non-state actors share several core capacities: the ability to organize labor, allocate resources, and pursue high-risk technical projects. 
However, states derive these from sovereignty, taxation, recognition, and institutional permanence; non-state actors rely on market position, ideological coherence, or network dynamics. States enjoy superior access to scale-dependent assets&#8212;e.g., national labs, classified datasets, and strategic infrastructure&#8212;but are often hindered by bureaucratic inertia. Non-state actors can iterate more rapidly and evade scrutiny, particularly if operating transnationally.</p><p>Legal exposure also differs: states may be shielded by sovereign immunity or enforcement asymmetries, while non-state actors face more direct coercive constraints&#8212;but also benefit from attributional ambiguity. In practice, state initiatives often depend on non-state contractors, blurring distinctions and creating hybrid dependencies.</p><h3><strong>How are their respective actions externally regulated?</strong></h3><p>International law lacks a centralized enforcement mechanism. Instead, compliance is encouraged through reputational, reciprocal, and institutional means&#8212;treaties, soft law, judicial forums, and multilateral sanctions. The Westphalian system used today, centered on the principle of anarchy&#8212;the absence of any supranational authority above states, and the presumption that no state holds formal legal authority over another&#8212;is similar to other systems of international relations, such as the Sinocentric tributary system and the Islamic jurisprudential <em>siyar</em>, in that all three lacked centralized enforcement.</p><p>States are bound by treaties and customary norms, but enforcement is politically contingent. Non-state actors are regulated indirectly&#8212;primarily through state actions (e.g., sanctions, criminalization). 
AGI-related governance will likely exacerbate these structural enforcement gaps, especially given problems of jurisdiction, attribution, and multi-actor diffusion.</p><h2><strong>Literature Review: International Constraints on Malicious AGI Development and Use</strong></h2><p>International legal and normative frameworks aimed at constraining the malicious development and use of artificial general intelligence (AGI) are presently limited in both scope and enforceability. While efforts to govern artificial intelligence more broadly have accelerated in recent years, few existing instruments directly address the risks posed by general-purpose reasoning systems. This section reviews the primary international treaties, non-binding norms, and emerging institutional mechanisms intended to regulate harmful AGI use by states and non-state actors. Emphasis is placed on both the formal architecture of these regimes and their practical limitations in a world where AGI development is rapidly diffusing.</p><h3><strong>Binding Instruments and Legal Principles</strong></h3><p>The most consequential legally binding instrument to date is the Council of Europe&#8217;s <em>Framework Convention on Artificial Intelligence, Human Rights, Democracy and the Rule of Law</em> (2024). This treaty obliges signatories to ensure that AI development and deployment, including across the full system lifecycle, conforms to human rights obligations and democratic norms. Its jurisdictional reach remains regionally anchored but open to global accession. Crucially, it exempts national security and defense domains&#8212;precisely those where malicious AGI use may be most likely to arise.</p><p>Other binding legal sources derive from existing international humanitarian and criminal law. The Geneva Conventions and their Additional Protocols already govern state conduct in armed conflict, including the deployment of new technologies. 
Article 36 of Additional Protocol I requires legal review of novel weapons systems, a mandate applicable to AGI-enabled military tools. Nevertheless, compliance is self-administered, and enforcement is rare. Proposals for an AGI-specific arms control treaty remain speculative, though the UN General Assembly&#8217;s 2024 resolution calling for negotiations on lethal autonomous weapons (LAWS) marks a notable development. While not binding, this resolution signals growing consensus on the need for preemptive restriction of AI systems capable of causing mass harm.</p><p>In the realm of international criminal law, no specific offense of &#8220;malicious AGI development&#8221; exists. However, state or non-state actors who deploy AGI in ways that result in mass atrocity or systemic rights violations could, in principle, be held liable under existing war crimes or crimes against humanity statutes. Yet such avenues are reactive, depend on attribution, and lack preventive force.</p><h3><strong>Soft Law and Normative Frameworks</strong></h3><p>Non-binding norms have proliferated in the absence of comprehensive treaty law. UNESCO&#8217;s <em>Recommendation on the Ethics of Artificial Intelligence</em> (2021), adopted unanimously by 193 member states, articulates global principles around transparency, accountability, and non-maleficence. Similarly, the OECD AI Principles (2019), now endorsed by over 40 countries, emphasize human-centered design, safety, and robustness. These frameworks implicitly apply to AGI, although neither uses the term directly. Their persuasive authority lies in norm diffusion, peer review, and reputational pressure, rather than coercive sanction.</p><p>Multilateral declarations&#8212;such as the 2023 <em>Bletchley Park Declaration</em> on AI safety&#8212;extend these efforts into the frontier-AI domain. 
While voluntary, such documents increasingly reflect consensus that advanced AI systems, including potential AGI, require international coordination and risk mitigation strategies. Proposals from the UN Secretary-General&#8217;s High-Level Advisory Body on AI suggest the eventual emergence of a distributed global governance framework, though institutional design remains underdefined.</p><p>Efforts to regulate lethal autonomous weapons within the Convention on Certain Conventional Weapons (CCW) process have similarly produced guiding principles affirming the necessity of human control over use-of-force decisions. However, treaty negotiations have stalled amid geopolitical disagreement, and the potential for AGI-enhanced LAWS development remains unconstrained.</p><h3><strong>Enforceability and Structural Gaps</strong></h3><p>International legal enforceability remains weak across these regimes. Binding treaties are implemented domestically, with uneven fidelity. Non-binding norms rely on reputational incentives and voluntary compliance. There is no centralized AI oversight body equivalent to the International Atomic Energy Agency, and existing mechanisms&#8212;such as export control regimes or domestic regulatory frameworks&#8212;function in a fragmented and uncoordinated fashion.</p><p>Attribution poses a critical barrier to enforcement. As AGI systems become more autonomous and development more transnational, assigning responsibility for harms becomes increasingly complex. This is particularly true for non-state actors operating with plausible deniability or in permissive jurisdictions. Although international law holds states responsible for activities emanating from their territory, enforcement depends on proof of knowledge and effective control&#8212;both difficult to establish in distributed AGI projects.</p><p>Moreover, jurisdictional diffusion undermines regulatory coherence. 
An AGI trained across multiple cloud providers in different countries, developed by a decentralized research collective, and deployed via anonymized infrastructure, may elude any single legal regime. Existing doctrines of universal jurisdiction or state responsibility offer limited recourse in such scenarios.</p><h3><strong>Prospects for Future Constraint</strong></h3><p>While existing instruments fall short of comprehensive AGI governance, they lay normative groundwork for future regimes. A precautionary principle&#8212;requiring risk assessment and mitigation prior to deployment&#8212;has begun to take shape across ethical guidelines and summit declarations. Discussions around international safety standards, model evaluation protocols, and cross-border incident response mechanisms suggest emerging pathways for formalization.</p><p>Nonetheless, the current international legal order lacks the institutional density, technical capacity, and political alignment necessary to effectively constrain malicious AGI development and use. Any future regime will need to address enforcement asymmetries, enhance attribution mechanisms, and develop institutional architectures capable of monitoring high-risk AGI trajectories. Without such evolution, the gap between technological capability and legal constraint will only widen.</p><p>This literature review thus identifies both the partial scaffolding and profound insufficiency of current international frameworks for constraining malicious AGI. The burden of anticipatory governance remains unmet. Recognizing this shortfall is a necessary precondition for building effective future constraints.</p><h2><strong>Conclusion</strong></h2><p>As artificial general intelligence transitions from theoretical construct to foreseeable reality, the question of who may develop and wield such systems&#8212;and under what constraints&#8212;gains strategic urgency. 
This article has examined the comparative capacities of states and non-state actors to maliciously create and use AGI, as well as the external legal frameworks that currently exist to restrain such actions. States possess clear advantages in scale, infrastructure, and regulatory power, while non-state actors benefit from agility, transnational reach, and opacity. Both actor types face significant, though asymmetrical, constraints imposed by international law, norms, and treaty regimes.</p><p>Yet these constraints remain uneven, fragmented, and frequently reactive. The existing patchwork of international instruments&#8212;while foundational&#8212;does not yet rise to the level of a comprehensive AGI governance regime. Enforcement asymmetries, attribution difficulties, and normative ambiguities present systemic vulnerabilities that could be exploited by actors pursuing harmful ends. Bridging these gaps will require deliberate institutional innovation: the development of actor-specific safeguards, the refinement of legal attribution doctrines, and potentially the establishment of new treaty frameworks or monitoring institutions.</p><p>A clear understanding of which actors are most capable of&#8212;and most likely to&#8212;pursue malicious AGI trajectories is a prerequisite to meaningful intervention. Future work must build on this foundation, identifying specific technical thresholds, institutional failure modes, and enforcement architectures that can mitigate cross-actor risk. The regulatory space around AGI remains under-structured, but it need not remain so. 
By anticipating the shape of its challenges and the actors involved, policy can begin to move at the speed of the technologies it seeks to govern.</p>]]></content:encoded></item><item><title><![CDATA[Artificial Intelligence Pedagogy Must Incorporate Interdisciplinary Exposure to Ensure Safety]]></title><description><![CDATA[Why Training in Economics, Political Science, and Other Fields Is Essential for AI Researchers to Responsibly Anticipate the Societal Consequences of Their Research]]></description><link>https://abhinavmadahar.com/p/artificial-intelligence-pedagogy</link><guid isPermaLink="false">https://abhinavmadahar.com/p/artificial-intelligence-pedagogy</guid><dc:creator><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></dc:creator><pubDate>Fri, 02 May 2025 19:07:12 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/623466ee-d0e2-4be5-9408-c00e7051bcc0_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I have noticed that, when computer scientists are asked about the potential impacts of artificial intelligence, they rarely use frameworks from economics, political science, sociology, etc. 
Instead, they will give answers roughly comparable to what someone with no academic training in the relevant fields would give, e.g., &#8216;AI models will be able to replace receptionists at paper companies&#8217;, etc.</p><p><em>Prima facie</em>, I would expect exceptionally well-trained computer scientists to be able to situate their advancements within the broader ecosystem of our world, analysing, for example, how artificial general intelligence would impact the environment using a framework informed by political-science-theoretic models of green energy production, e.g., &#8216;AGI models would cause less environmental degradation in a world where states are able to engage in popularly disfavoured but expert-supported activities&#8212;fission production in this case.&#8217;</p><p>My guess is that this derives from the currently dominant pedagogical models we have from American-led education [1], where students are siloed into their discipline, with limited exposure to other fields. Without my informal education, for example, I would personally not be able to situate my research in artificial general intelligence within a macroeconomic framework [2].</p><p>I am very inclined towards the current trend towards more interdisciplinary work. More specifically, I would strongly encourage computer science departments to require early-stage undergraduates interested in artificial intelligence to take coursework in economics and political science, focusing primarily on the fundamentals of macroeconomics and the fundamentals of democratic institutions, respectively. Do we really want our leading computer scientists to lack an understanding of how their research impacts our broader social ecosystem? Do we not want Dr Frankenstein to understand how his Adam would affect the townspeople?</p><p>[1]: Levine, Emily J. 2024. &#8216;Research &amp; Teaching: Lasting Union or House Divided?&#8217; <em>American Academy of Arts &amp; Sciences</em>, 30 May 2024. 
<a href="https://amacad.org/publication/daedalus/research-teaching-lasting-union-or-house-divided">https://amacad.org/publication/daedalus/research-teaching-lasting-union-or-house-divided</a>.</p><p>[2]: Though I emphasize that my ability to do so is a fraction of what someone with doctoral training in macroeconomics would be able to accomplish.</p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://abhinavmadahar.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Are you the type of person who enjoys the idea of being trapped in a seminar hall with a bunch of computer science nerds and political science wonks? I can&#8217;t trap you in that seminar hall, but I can recreate the vibe in your inbox.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[How Poland and Its Peers Can Advance Their Economies Using Artificial General Intelligence]]></title><description><![CDATA[How Economies on the Low End of High-Income Can Use Artificial General Intelligence to Reach the Frontier]]></description><link>https://abhinavmadahar.com/p/poland-and-agi</link><guid isPermaLink="false">https://abhinavmadahar.com/p/poland-and-agi</guid><dc:creator><![CDATA[Abhinav Madahar · अभिनव ਮਦਾਹਰ]]></dc:creator><pubDate>Tue, 22 Apr 2025 19:18:41 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/93676d49-172e-45d2-9430-3d024e680835_1536x1024.png" length="0" 
type="image/jpeg"/><content:encoded><![CDATA[<p>A friend recently asked me whether Poland and its peers can meaningfully incorporate artificial general intelligence into their respective economies. This piece is adapted from my response.</p><div><hr></div><p>Yes, the Polish economy would gain from exposure to artificial general intelligence. However, the Polish economy can realistically enter only some areas of the artificial general intelligence industry. First, let's map out that industry, examining the different scientific and engineering questions it poses as well as which economies are able to insert themselves into its different areas.</p><p>I am not very familiar with hardware matters in artificial intelligence, e.g. the creation of more advanced GPUs which are able to perform tensor operations more quickly, etc. I will focus on the areas with which I am familiar&#8212;I recommend learning more about hardware matters from someone better versed in that area.</p><p>Not considering hardware, artificial general intelligence can be divided into three major areas<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>: the field of scientific research into more advanced techniques which will hopefully achieve artificial general intelligence; the field of engineering which applies already-known techniques to create artificial intelligence models; and the field of business, engineering, and design which uses existing artificial intelligence models to create software which solves business problems. Of these three areas, the first two are out of Poland's reach, but the third is arguably not only achievable but necessary.</p><p>The first area, the scientific field which aims to discover the techniques necessary to achieve artificial general intelligence, poses a high barrier to entry. 
For an economy to be competitive here, it needs a strong academic community, something incredibly difficult to foster. Germany, for example, had arguably the strongest academic community of any economy prior to the Second World War, but the war destroyed the German academic community, especially because German academics were disproportionately Jewish. Germany still has not recovered its academic edge, and there is no indication that it will.</p><p>Currently, only a handful of economies have academic communities able to make meaningful strides towards artificial general intelligence: the United States, mainland China, the United Kingdom, Canada, the Republic of Korea, and the State of Israel. Of these, the United States and mainland China are dramatically ahead of the rest, with the United States having an edge over mainland China. As previously established, it is almost impossible for an economy to nurture an academic community competitive on the world stage, so it is not feasible for Poland to achieve this.</p><p>A similar story presents itself for the second area. Here, we consider the field which applies existing scientific knowledge about techniques towards artificial general intelligence to engineer artificial-general-intelligence-like models. This area presents an even higher barrier to entry than the first. While work in the first area is present in economies such as the United Kingdom, Canada, the Republic of Korea, and the State of Israel, only two economies meaningfully participate in the second area: the United States and mainland China, though the United Kingdom makes a modest contribution through Google DeepMind.</p><p>This barrier is arguably even higher as training frontier models now costs around one billion USD, a figure which I expect to rise rapidly. In particular, I expect that training a frontier model will cost around one hundred billion USD in five years' time. 
Building a technology industry and ecosystem is already incredibly difficult, but the incredible training costs associated with frontier models turn this from difficult to effectively impossible without continental support. Even if Poland and other European economies are able to attract the necessary engineering talent and nurture their respective business environments, the cost of the compute, data, and electricity necessary would require a non-negligible portion of Europe's GDP to be spent on training frontier models. While the United States and mainland China have governments prepared for this, European institutions generally have not been as friendly. Unless Europe as a whole changes course to facilitate massive, continent-wide endeavours in building frontier models, it is not possible for Poland to establish a domestic industry focused on engineering frontier artificial-general-intelligence-like models.</p><p>This brings me to the third area, which is not only much more reachable for an economy like Poland but, arguably, necessary. The landmark World Bank report <em>China 2030</em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> argues that, for an economy to advance from upper-middle income to high-income and, onwards, to the economic frontier, it needs to grow what is often referred to as its 'innovation economy'. The report does not use that terminology, and I generally avoid using that term as well because it derives primarily from applied work in industry rather than from academic sources in economics and political science, but those whose expertise derives primarily from applied fields can certainly use that term if it is helpful.</p><p>The argument raised in <em>China 2030</em>, which is fairly widely held within the academic community, is that advanced economies, especially those at the economic frontier, e.g. 
the United States and Taiwan, derive their economic strength from their respective scientific and technological industries. That is to say, Taiwan is an advanced economy because there exist organizations on the island which make discoveries and advancements which are not made elsewhere. For example, TSMC is the world's most advanced contract chipmaker; when it discovers a novel technique for increasing chip performance, it is the first in the world to make this discovery, so it can charge a premium to anyone who wants to purchase a chip it manufactures with this advancement. Through scientific and technological advancement, Taiwanese firms are able to achieve particularly high revenues, which keeps the Taiwanese economy at the frontier. To advance from the lower levels of high-income, which is where the Polish economy is currently, to the economic frontier, Poland needs to nurture its scientific and technological industry.</p><p>Specifically, it would behoove an economy like Poland's to have a strong industry focused on creating software which applies existing artificial-general-intelligence-like models to solve business problems. A large portion of startups in the recent few Y Combinator batches focus on applying artificial-general-intelligence-like models to, for example, improve medical insurance processing or facilitate education by streamlining the educators' toil tasks<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>. An economy like Poland's is much better suited to address these problems. They require far lower upfront capital costs, generally only the standard single-digit million USD or equivalent figures expected of a tech startup, in stark contrast to the punch-a-hole-in-your-nation's-GDP figures seen when training frontier artificial-general-intelligence-like models. 
Poland would benefit from a business ecosystem and industry where startups and enterprises regularly apply already-built artificial-general-intelligence-like models to create software which solves business problems. For example, perhaps a startup in Krak&#243;w will be formed which aims to use artificial general intelligence to help large enterprises manage their supply chains, responding to blockages and relieving bottlenecks.</p><p>From here, the next step is to consider the specifics of Poland and its economy to form policy recommendations which can facilitate the growth of a nascent Polish startup ecosystem and industry in this way.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://abhinavmadahar.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://abhinavmadahar.com/subscribe?"><span>Subscribe now</span></a></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>This is just one useful way to structure the field; others are valid depending on the context.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>World Bank and Development Research Center of the State Council, P. R. China. <em>China 2030: Building a Modern, Harmonious, and Creative Society</em>. Washington, DC: World Bank, 2013. 
<a href="https://documents.worldbank.org/en/publication/documents-reports/documentdetail/781101468239669951/china-2030-building-a-modern-harmonious-and-creative-society">https://documents.worldbank.org/en/publication/documents-reports/documentdetail/781101468239669951/china-2030-building-a-modern-harmonious-and-creative-society</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>I do not know whether any startups in the recent few Y Combinator batches pursue these exact directions. These are just examples I pulled from the aether.</p></div></div>]]></content:encoded></item></channel></rss>