Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: One bit of observation can unlock many of optimization - but at what cost?, published by dr s on April 29, 2023 on LessWrong.
This question by johnswentworth nerd-sniped me, so I ended up thinking a lot about the relationship between information and control over the world in the simplified scenario of a single tape of bits. The question asked how many bits of optimization one could unlock with a single bit of observation; the answer turned out to be "arbitrarily many", as proven by a simple example: suppose you have a rule that says that if the first bits of your action A match those of your observation O, then the rest of A gets copied to your target value Y. Then there's no limit to how many bits you can copy, and the only information you need to leverage is knowledge of the secret "key". We know this works because it is exactly how locks work in real life too.
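The "key unlocks an arbitrarily long copy" rule can be sketched in a few lines of Python. This is purely my own illustration of the idea, not code from the original question; the function name and argument layout are hypothetical.

```python
def apply_rule(action, observation, key_len):
    """Toy version of the lock rule: if the first key_len bits of the
    action match the first key_len bits of the observation, the rest
    of the action is copied to the target; otherwise the target stays
    all zeros. (Illustrative sketch, not the post's formal model.)"""
    if action[:key_len] == observation[:key_len]:
        return action[key_len:]               # payload copied to Y
    return [0] * (len(action) - key_len)      # wrong key: no effect

# Knowing a short key gates arbitrarily many payload bits:
obs = [1, 0, 1, 1]
payload = [1, 1, 0, 1, 0, 1]
print(apply_rule(obs[:4] + payload, obs, 4))  # [1, 1, 0, 1, 0, 1]
```

The payload can be made as long as we like while the key stays four bits, which is the sense in which one observation unlocks "arbitrarily many" bits of optimization.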
If I possess someone's bank account password, or the combination to a safe, or the US nuclear codes, then I'm able to produce effects disproportionate to the tiny size of that knowledge, just because I can also rely on the state of the world and its laws being set up in such a way that I can use that knowledge as a pivot to trigger much bigger effects.
The thing I wanted to focus on then was those conditions: even given that I know the "password", what else do I need to know about the world at large, and what limits are there on my power to optimize the final state? I decided to focus on the following model:
a world string W of N bits, prepared in some initial state W0, with some regions known and some randomized;
a discrete map f that determines the evolution of this world, such that W1 = f(W0), W2 = f(f(W0)), and so on. The map is reversible: there exists an inverse map f⁻¹ such that Wi−1 = f⁻¹(Wi);
The focus on the map being reversible is because in the real world the laws of physics are time symmetric too and microscopically should not destroy information. Irreversible computing allows the deletion of information, which reduces the entropy of the system. In a computer embedded in a larger world this can be compensated by creating entropy somewhere else, but if our string has to represent the entire world, then it should preserve information. The world string has several identifiable regions:
an action region A, within which we can set up bits arbitrarily in the initial state;
an observation region O, which we can't affect but whose contents we know exactly in the initial state;
a target region Y, to which we aim at writing certain bits so as to maximize the mutual information I(Y;G) with some goal string G;
two fuel regions F0 and F1, at the ends of the string, filled respectively with all 0s and all 1s. We'll see the use for them in a moment.
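The layout above can be written down as a small data structure. A minimal sketch, assuming an arbitrary 16-bit world; the class name, region boundaries, and slice-based representation are my own choices, not the post's:

```python
from dataclasses import dataclass

@dataclass
class World:
    bits: list   # the full world string W, N bits long
    A: slice     # action region: bits we set arbitrarily at t=0
    O: slice     # observation region: known exactly, but not controlled
    Y: slice     # target region: we want high mutual information I(Y;G)
    F0: slice    # fuel region prepared with all 0s
    F1: slice    # fuel region prepared with all 1s

N = 16
w = World(
    bits=[0] * N,
    F0=slice(0, 2),    # fuel at one end of the string...
    A=slice(2, 6),
    O=slice(6, 10),
    Y=slice(10, 14),
    F1=slice(14, 16),  # ...and at the other
)
for i in range(*w.F1.indices(N)):
    w.bits[i] = 1      # F1 starts as all 1s, per the model
```

Any concrete program then acts on `w.bits` as a whole; the regions only label which bits we control, observe, or care about.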
The map f can be defined as a series of instructions. Since we're doing reversible computing, we can use a single universal logic gate, like the Toffoli gate (CCNOT) or the Fredkin gate (CSWAP). These take three bits as arguments, so once we've chosen our gate, an entire program can be composed simply of triples of bit addresses, where each address requires L = ⌈log2(N)⌉ bits, meaning a program's size is 3L bits per instruction. Consider the simplest possible version of a "lock-like" program, which compares b1 and b2 and, if they're identical, swaps b3 and b4 (note that we can't simply copy b3: that would erase information and not be reversible). We will also need two "fuel" bits f1 and f2, prepared in the 0 and 1 states respectively. The program can be written with just three Fredkin gates:
CSWAP b_1 f_1 f_2
CSWAP b_2 f_1 f_2
CSWAP f_2 b_3 b_4
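This circuit can be checked by direct simulation. A minimal sketch, where the gate ordering is my reconstruction of the standard three-Fredkin-gate compare-and-swap using fuel bits prepared as 0 and 1:

```python
def cswap(bits, c, a, b):
    """Fredkin gate: if bits[c] == 1, swap bits[a] and bits[b].
    Reversible: applying the same gate twice restores the state."""
    if bits[c] == 1:
        bits[a], bits[b] = bits[b], bits[a]

def compare_and_swap(bits, b1, b2, b3, b4, f1, f2):
    """If bits[b1] == bits[b2], swap bits[b3] and bits[b4].
    Requires fuel bits: bits[f1] prepared as 0, bits[f2] as 1."""
    cswap(bits, b1, f1, f2)  # f2 flips if b1 is set
    cswap(bits, b2, f1, f2)  # f2 flips again if b2 is set
    cswap(bits, f2, b3, b4)  # f2 ends up 1 exactly when b1 == b2

# b1=1, b2=1 (equal), b3=0, b4=1, fuel f1=0, f2=1:
state = [1, 1, 0, 1, 0, 1]
compare_and_swap(state, 0, 1, 2, 3, 4, 5)
print(state[2:4])  # [1, 0]: b3 and b4 were swapped
```

After the first two gates, f2 holds 1 ⊕ b1 ⊕ b2, which is 1 precisely when the two compared bits agree; the third gate then uses it as the swap control. This is also why the fuel is consumed: when b1 ≠ b2, the known (0, 1) pair ends up scrambled into (1, 0), carrying away the comparison's record instead of deleting it.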
After this, the two "fuel" bits are used up and can't be relied on any more for future calculations. From the viewpoint of the entire string, of course, the entropy is constant and the process is entirely reversible; but if you only looke...