Nov. 20th, 2021 02:21 pm

It is possible that efforts to contemplate some risk area—say, existential risk—will do more harm than good. One might suppose that thinking about a topic should be entirely harmless, but this is not necessarily so. If one gets a good idea, one will be tempted to share it; and in so doing one might create an information hazard. Still, one likes to believe that, on balance, investigations into existential risks and most other risk areas will tend to reduce rather than increase the risks of their subject matter.
"one likes to believe" is not a reasoned argument. Arguably this author's writing was part of the chain of events leading to the creation of OpenAI, which Yudkowsky says "trashed humanity's chance of survival" (though he doesn't mention Bostrom as a cause)
I have no particular reason to believe that's true. In any case, although some pieces of writing do have a large impact, it's hard to know in advance what that impact will be.
Jessica Taylor's post about her experience at CFAR talks about the intersection of unverifiable claims of future negative impacts and ordinary power, like how you don't want your boss or your friends to be mad at you.
I was discouraged from writing a blog post estimating when AI would be developed, on the basis that a real conversation about this topic among rationalists would cause AI to come sooner, which would be more dangerous (the blog post in question would have been similar to the AI forecasting work I did later, here and here; judge for yourself how dangerous this is). This made it hard to talk about the silencing dynamic; if you don’t have the freedom to speak about the institution and limits of freedom of speech, then you don’t have freedom of speech.
(Is it a surprise that, after over a year in an environment where I was encouraged to think seriously about the possibility that simple actions such as writing blog posts about AI forecasting could destroy the world, I would develop the belief that I could destroy everything through subtle mental movements that manipulate people?)
One can only speculate whether the motivation for telling Taylor not to write the blog post was really the one given, but there's reason for suspicion. The idea is just that someone might have ordinary petty reasons not to want you to do something, and since they're in a position to make pronouncements on long-term consequences, they say the consequences are bad. For a blatant example, see Justinian's decree against homosexuality, which cites Sodom and Gomorrah and says of course we don't want that to happen here; of course that's not the real reason he disapproves of homosexuality.
The important thing here is that there's really no way to prove that it won't cause God to destroy the city. Well, now we generally think that kind of thing doesn't happen. But with "this blog post could lead to unfriendly AI," sure, that could happen. The trouble is that we're multiplying a very small number by a very large number: the probability that this blog post makes the difference, times the badness of the disaster. Anyone with experience in numerical computing can tell you what goes wrong there: the product is dominated by the error in the tiny factor, and nobody can estimate that factor to within several orders of magnitude. That leaves plenty of room for superstition, and for the accepted answer to just be social consensus, which someone in power has the power to shape.
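To make that concrete, here's a minimal sketch with made-up numbers, purely for illustration (the probabilities and the "badness" figure are assumptions, not anyone's actual estimates):

```python
# Hypothetical expected-harm calculation (all numbers are invented for illustration):
# expected_harm = P(this blog post is the decisive cause of disaster) * badness_of_disaster

badness = 1e15  # assumed "badness of disaster" in arbitrary utility units

# Equally defensible guesses for the tiny probability, spanning many orders of magnitude:
for p in (1e-20, 1e-15, 1e-10, 1e-6):
    print(f"P = {p:.0e}  ->  expected harm = {p * badness:.2e}")

# P = 1e-20  ->  expected harm = 1.00e-05
# P = 1e-15  ->  expected harm = 1.00e+00
# P = 1e-10  ->  expected harm = 1.00e+05
# P = 1e-06  ->  expected harm = 1.00e+09
```

The answer ranges from negligible to catastrophic depending entirely on which guess for the probability you favor, and that guess is exactly the thing social consensus gets to set.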