The Short Case for Closed AI

Academic discourse, and Artificial Intelligence (AI) research in particular, strongly embodies the so-called 'communist pillar' of scientific ethos1, in that heretofore the vast majority of AI research output has been made publicly available. It is not clear, however, that this open approach can or should continue into the future. OpenAI’s decision to incorporate OpenAI LP was ostensibly driven in part by security concerns2, and represents an implicit, portentous admission of the credibility of 'The Bitter Lesson'3: that the leveraging of compute resources (and not necessarily solely intelligent algorithms) is integral to present and future advancements in AI.

Despite recent community and media attention, very little has been written on the topic of 'Closed AI' (CAI) – AI research the development and results of which are not made public – and its benefits. This essay will seek to address this gap in the literature by outlining some arguments opposing Open AI (OAI) development going forward. I describe how OAI represents a far greater existential threat than CAI, using the vulnerable world hypothesis (VWH)4 as an exploratory framework. Additionally, I show how OAI can exacerbate other existential threats, create new ones, and plausibly lead to historically unprecedented levels of sentient suffering.

0: Limitations of the vulnerable worlds typology

The VWH, roughly stated, holds that there are certain inventions which, if conceived, would make the destruction of civilization (DOC) inevitable. It hypothesizes white, gray, and black balls within an urn, which are picked out one by one with no a priori knowledge of what color will be picked. White balls signify beneficial inventions, gray balls signify inventions yielding mixed blessings, and black balls represent those inventions which, if discovered, 'invariably or by default' cause DOC. A typology of possible DOC inventions/vulnerabilities is outlined, which I will make reference to but not expound on.

Type-1 inventions are those so destructive and so easy to use that the actions of actors in the 'apocalyptic residual' make DOC extremely likely.

Type-2A vulnerabilities constitute those inventions with which powerful actors have the ability to produce DOC harms and, in the 'semi-anarchic default condition’, face incentives to use that ability5.

I hold that while the development of CAI in and of itself poses a serious existential risk as a likely Type-0 discovery – a technology with a 'hidden risk' such that the default outcome when it is discovered is inadvertent DOC – the development of OAI constitutes a more insidious type of risk. Indeed, one which does not fit neatly into the VWH typology but rather in addition to embodying Type-0 concerns, also invariably facilitates the creation of Type-1 and Type-2A vulnerabilities beyond the inherent human capacity to do so.

I: OAI as an enabler of superintelligence

I begin by considering the ways in which CAI can be viewed as a Type-0 invention. I argue that if one accepts any of the various singularity arguments, then CAI must represent a Type-0 invention. These arguments hold that, absent defeaters, the advent of a human-level AI (whether CAI or OAI) can logically lead to an 'intelligence explosion' – the emergence of a superintelligent AI capable of eradicating humanity, whether maliciously or as a byproduct of the fulfillment of its goals. Note that 'inadvertent' (from the definition of Type-0) refers to the fact that key actors continue to develop and use the technology (believing that the benefits outweigh the costs), despite the threat of DOC. This is of course the scenario in which we find ourselves, due to the various economic and humanitarian pressures impelling the development of AI more generally6. I assume for this essay that the semi-anarchic default condition holds, and is in fact the normative condition in which we find ourselves.

It is not the case, however, that CAI and OAI are equally bad in the singularity scenario. As noted, OAI development results in a more rapid pace of progress than CAI in the short-to-medium term. For our purposes, it follows that OAI serves only to push us ever closer to the existential cliff by accelerating research. However, just because a civilization-ending invention is created does not mean that civilization will end immediately. In this light it is desirable to retard the research process by keeping research outcomes classified. This would have the beneficial effect of giving us more time to develop sufficient countermeasures, as well as reducing the number of people capable of working on the problem, thereby decelerating progress. For example, the above 'intelligence explosion' may not necessarily be a bad thing if we solve the 'alignment problem'7 first. Since in the case of certain singularity both CAI and OAI represent a clear existential threat, I will for the rest of this essay assume that a superintelligent AI will not emerge (e.g. perhaps because of physical limitations on intelligence). Additionally, I stress that it is not necessary to subscribe to the above singularity arguments in order to find developmental retardation and secrecy palatable.

II: Distributed Von Neumann Agents

While a CAI itself (that is, an AI developed in secret and accessible only to a small number of individuals) can be viewed as some combination of Type-0 and Type-2A threats, OAI carries with it additional risks. Consider that AI is inherently different to nuclear weapons, which are also a Type-2A threat. I hold that the development of AI is not akin to Von Neumann creating the atomic bomb, but rather it is akin to creating Von Neumann himself. In this way, even as a mere tool (or collaborator), OAI could prove more dangerous than CAI in that distributed access to such an intellect would constitute a severe information hazard8, ratchet up Type-2A tensions through proliferation9, and empower malicious actors.

I define a Von Neumann Agent (VNA) as an agent moderately more intelligent than John Von Neumann (widely considered the smartest human to have ever lived), but possessing no agency or desires of its own. Such an agent would doubtless contribute to the rapid advancement of scientific progress across the board, effectively speeding up the rate at which we pluck balls from the urn. This is of course undesirable, given the increased likelihood of pulling a dreaded black ball. However, the true danger of such a development lies in the creation of Distributed Von Neumann Agents (DVNAs). Such agents, if made widely available by the open-sourcing of the code or data necessary for their implementation, would increase the number of actors pulling balls from the urn, the rate of that pulling, and the number of 'easy' black balls in the urn (discussed below). Indeed, they would also facilitate those 'apocalyptic residuals' dedicated to finding a black ball. In this way, OAI represents a greater threat than CAI even at 'weakly superhuman'10 levels of intelligence.

I now consider why such DVNAs would also increase the number of black balls in the urn. As noted before, it is only by sheer luck that humanity does not have access to 'easy nukes', which are classified as a Type-1 invention. Consider that a Type-1 invention requires that it be 'either extremely easy to cause a moderate amount of harm or moderately easy to cause an extreme amount of harm'. Implicit in the self-evident non-existence of 'easy nukes' (in the truest sense, of genuine nuclear weapons being 'easy' to create) is the proposal that nuclear weapons are not 'easy' for humans. It is clear, however, that the creation of DVNAs would open up new strata of Type-1 inventions.

Suppose the existence of 'easy nukes' as touched on by Bostrom, which require only glass, metal, and batteries. I posit that at least a human-level intellect is necessary to make such a discovery 'easy'. It must hold, then, that there lie discoveries just beyond the realm of human knowability that require at least superhuman levels of intellect and/or speed11. In this scenario, the creation of a VNA increases the number of possible black balls in the urn, while the proliferation of DVNAs greatly increases our capacity for, and likelihood of, plucking a black ball from the urn in proportion to the number of agents with access to it.

III: Attendant malice

As is the case with all technological advances, seemingly innocuous discoveries in disparate areas may have unintended applications in malicious areas, and vice versa. There are some applications of CAI which are normatively simpler to argue against, such as the direct development of autonomous weapons. These issues are only exacerbated by the proliferation of their OAI counterparts, and indeed of their constituent components. For example, if the code for an AI which can reliably recognize objects is openly available, it enables both the detection of human casualties in rubble and the unprecedented oppressive surveillance of minority groups in China and elsewhere. AI-powered emotion-detection systems directly enable both improved elderly-patient care and the real-time monitoring of 'potentially problematic' factory workers. OpenAI themselves cited the undesirability of granting public access to highly effective 'fake news generators'12 which have the potential to destabilize nations.

I posit that it is untenable (short of an oppressive 'singleton solution') to regulate the underlying statistical methods employed by certain areas of AI research, and that to do so would be no more efficacious than Australia’s recent ban on encryption. More generally, though all technology can be viewed as a tool with no intrinsic moral valence, the increasing ubiquity of mass-effect technologies (those which can adversely affect a large number of people) provides ample ground to contest the morally-neutral view of technology. I argue that in the same way that the circulation of plutonium is closely monitored and regulated, so too should AI research be monitored and regulated.

IV: Ubiquitous deities

Here I illustrate one intended consequence of OAI which, in and of itself, may not pose an existential risk but is nevertheless morally reprehensible – the creation of AI minds. One immediate consequence of the creation of such 'digital minds' is 'the potential for a huge population explosion'13. The progression of AI research to the point where the brain capacity of even a semi-intelligent animal (equivalent to that of a young human child) could be simulated raises serious ethical concerns. It is not difficult to imagine entire digital universes and billions of minds ruled over by sadistic humans for subjective eternities14. In this light the question of OAI becomes: should self-deification be open-source?

Even in the (extremely unlikely) absence of any malicious intent, the aforementioned 'subjective eternities' themselves represent an ethical dilemma reminiscent of the 'doggy living' thought experiment, in which it is proposed that a 'weakly superhuman' mind running cognitive processes at far higher speeds than human minds would likely 'burn out' within a few weeks of 'outside time' (that time generally experienced by normal humans). 'Burning out' can here be understood as the mental collapse of a mind alike, or even superior to, our own as a result of its experience of many years of subjective time during which it has no ability to effect change in its environment. For example, consider a digital world in which one minute of 'outside time' is equivalent to 1,000 subjective years.
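To make the scale of such a speed differential concrete, the arithmetic implied by the example above can be sketched as follows (the 1,000-years-per-minute ratio is taken from the example; everything else is straightforward unit conversion):

```python
# Subjective time dilation implied by the example above:
# one minute of 'outside time' corresponds to 1,000 subjective years.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes in an average year

subjective_years_per_outside_minute = 1_000

# Overall speed-up factor: subjective minutes elapsed per outside minute.
speedup = subjective_years_per_outside_minute * MINUTES_PER_YEAR

# Subjective time elapsed during short spans of outside time.
subjective_years_per_outside_hour = subjective_years_per_outside_minute * 60
subjective_years_per_outside_week = subjective_years_per_outside_minute * 60 * 24 * 7

print(f"speed-up factor: {speedup:,.0f}x")
print(f"one outside hour  = {subjective_years_per_outside_hour:,} subjective years")
print(f"one outside week  = {subjective_years_per_outside_week:,} subjective years")
```

Even the 'few weeks of outside time' before burnout thus corresponds to tens of millions of subjective years, which is the force of the 'subjective eternities' worry.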

It is worth noting too that in such a scenario, the statistical likelihood of being 'born' into a universe of torture is higher than that of being born into the 'real' universe (that is, the empirically normative universe we inhabit), or into one without suffering15. Setting aside for now the nonidentity problem16, OAI’s inherent capacities for the temporal and 'corporeal' torture of digital minds seem reason enough to never widely distribute it, lest we create unspeakable torment. Nevertheless, our reliably poor intuitions regarding the value of human life at scale17 do not instill much hope that this perhaps most egregious consequence of OAI will be taken seriously.


I have outlined why OAI poses a far greater threat to humanity than CAI, both existentially and morally. If the singularity argument holds, then it seems likely that only the adoption of CAI research can offer humanity a chance to implement countermeasures (e.g. solving the alignment problem, reaching other planets and cutting off contact, etc.). In the case that it does not hold, I have shown how the distribution of superhuman knowledge constitutes an 'information hazard' and carries with it the capacity for the immense suffering of sentient beings. While I do not offer prescriptive solutions here, I hope to highlight that pluralistic ignorance18 regarding OAI is not a tenable position.


Robert K Merton. The normative structure of science. The sociology of science: Theoretical and empirical investigations, pages 267–278, 1979.


Alec Radford. Better language models and their implications, Mar 2019.


Richard Sutton. The bitter lesson, Mar 2019.


Nick Bostrom. The vulnerable world hypothesis. 2018.


Nick Bostrom. Superintelligence. Dunod, 2017.


Nick Bostrom. Strategic implications of openness in ai development. Global Policy, 8(2):135–148, 2017.


Eliezer Yudkowsky. Creating friendly ai 1.0: The analysis and design of benevolent goal architectures. The Singularity Institute, San Francisco, USA, 2001.


Nick Bostrom et al. Information hazards: a typology of potential harms from knowledge. Review of Contemporary Philosophy, (10):44–79, 2011.


Robert Sparrow. Killer robots. Journal of applied philosophy, 24(1):62–77, 2007.


Vernor Vinge. The coming technological singularity: How to survive in the post-human era. Science Fiction Criticism: An Anthology of Essential Writings, pages 352–363, 1993.


John R Searle. Minds, brains, and programs. Behavioral and brain sciences, 3(3):417–424, 1980.


Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. Language models are unsupervised multitask learners. Technical report, OpenAI, 2019.


Robin Hanson. If uploads come first. 1994.


Rebecca Anne Peters. Re-creation, consciousness transfer and digital duplication: The question of what counts as human? in Charlie Brooker's Black Mirror. In Film-Philosophy Conference 2018, 2018.


Nick Bostrom. Are we living in a computer simulation? The Philosophical Quarterly, 53(211):243–255, 2003.


Derek Parfit. Future people, the non-identity problem, and person-affecting principles. Philosophy & Public Affairs, 45(2):118–157, 2017.


David Fetherstonhaugh, Paul Slovic, Stephen Johnson, and James Friedrich. Insensitivity to the value of human life: A study of psychophysical numbing. Journal of Risk and uncertainty, 14(3):283–300, 1997.


Daniel Kahneman. A perspective on judgment and choice: mapping bounded rationality. American psychologist, 58(9):697, 2003.