I’ve been working on a post on predictions which has rather gotten away from me in scope. This is the first of a couple of building-block posts which I expect to spin out so I have things to reference when I finally make it to the main point. This post fits neatly into my old (2014!) sequence on systems theory and should be considered a belated addition to that.
Systems can be deterministic or random. A system that is random is, of course… random. I’m glad the difficult half of this essay is out of the way! Kidding aside, the interesting part is that from the inside, a system that is deterministic also appears random. This claim is technically a bit stronger than I can really argue, but it guides the intuition better than the more formal version.
Because no proper subsystem can perfectly simulate its parent, every inside-the-system simulation must ultimately exclude information, either via the use of lossy abstractions or by choosing to simulate only a proper, open subsystem of the parent. In either case, the excluded information effectively appears in the simulation as randomness: fundamentally unpredictable additional input.
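To make this concrete, here is a toy sketch (entirely my own construction, with invented names and numbers) of a fully deterministic system whose hidden half shows up as apparent randomness to a model that only tracks the visible half:

```python
# Toy sketch: a deterministic "parent" system. A model that tracks only
# the `visible` component sees the influence of `hidden` as unpredictable
# extra input -- i.e. as randomness -- even though nothing here is random.

def step(hidden, visible):
    # Fully deterministic update rule for the whole system.
    new_hidden = (hidden * 7 + 3) % 101
    new_visible = (visible + hidden) % 10
    return new_hidden, new_visible

hidden, visible = 42, 0
increments = []
for _ in range(10):
    old = visible
    hidden, visible = step(hidden, visible)
    # An inside observer tracking only `visible` cannot predict these
    # increments without access to `hidden`, which it has excluded.
    increments.append((visible - old) % 10)

print(increments)
```

Run it twice and you get the same "random-looking" sequence both times; the randomness is entirely an artifact of the excluded information.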
This has some interesting implications if reality is a system and we’re inside it, as I believe to be the case. First it means that we cannot ever conclusively prove whether the universe is deterministic (a la Laplace’s Demon) or random. We can still make some strong probabilistic arguments, but a full proof becomes impossible.
Second, it means that we can safely assume the existence of “atomic randomness” in all of our models. If the system is random, then atomic randomness is in some sense “real” and we’re done. But if the system is deterministic, then we can pretend atomic randomness is real, because the information necessary to dispel that apparent randomness is provably unavailable to us. In some sense the distinction doesn’t even matter anymore; whether the information is provably unavailable or just doesn’t exist, our models look the same.
Whoops, it’s been over a month since I finished my last post (life got in the way) and so now I’m going to have to dig a bit to figure out where I wanted to go with that. Let’s see…
We ended up with the concept of a mechanical brain mapping complex inputs to physical reactions. The next obviously useful layer of complexity is for our brain to store some internal state, permitting the same inputs to produce different outputs based on the current situation. Of course this state information is going to be effectively analogue in a biological system, implemented via chemical balances. If this sounds familiar, it really should: it’s effectively a simple emotional system.
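A minimal sketch of that idea (my own toy, not a biological model): the same input produces different reactions depending on a piece of analogue internal state, here a scalar "fear" level standing in for a chemical balance.

```python
# Toy "brain" that maps inputs to reactions *through* internal state.
# Names and thresholds are illustrative only.
class StatefulBrain:
    def __init__(self):
        self.fear = 0.0  # analogue internal state, like a chemical balance

    def react(self, stimulus):
        if stimulus == "loud_noise":
            self.fear = min(1.0, self.fear + 0.5)
        else:
            self.fear = max(0.0, self.fear - 0.1)  # fear decays over time
        # The same input produces different outputs depending on state.
        if stimulus == "shadow":
            return "flee" if self.fear > 0.3 else "ignore"
        return "startle" if stimulus == "loud_noise" else "ignore"

brain = StatefulBrain()
reactions = [brain.react(s) for s in ["shadow", "loud_noise", "shadow"]]
print(reactions)  # ['ignore', 'startle', 'flee']
```

The identical "shadow" input gets ignored when calm and fled from when afraid, which is all the emotional layer needs to buy us.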
The next step is strictly Pavlovian. With one form of internal state memory already present, the growth of another, complementary layer is not far-fetched. Learning that one input precedes a second input with high probability, and then creating a new reaction for that first input, is predictably mechanical, though still mostly beyond what modern AI has been able to accomplish even ignoring tokenized input. But here we must also tie back to the idea of tokenized input (which I discussed in the previous post). As the complexity of tokenized input grows, so does the abstracting power of the mind able to recognize the multitude of shapes, colours, sounds, etc. and turn them into the ideas of “animal” or “tree” or what have you. When this abstracting power is combined with simple memory and turned back on the tokens it is already producing, we end up with something that is otherwise very hard to construct: mimicry.
In order for an animal to mimic the behaviour of another, it must be able to tokenize its sense input in a relevant way, draw the abstract parallel between the animal it sees and itself, store that abstract process in at least a temporary way, and apply it to new situations. This is an immensely complex task, and yet it falls naturally out of the abilities I have so far laid out. (If you couldn’t tell, this is where I leave baseless speculation behind and engage in outrageous hand-waving).
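The Pavlovian step above is mechanical enough to sketch directly. This is a hedged illustration with invented names and thresholds: once input A is observed to precede input B (which already has a hard-wired reaction) often enough, the system starts reacting to A itself.

```python
# Toy Pavlovian conditioner: learns that one stimulus reliably precedes
# another and transfers the innate reaction. Thresholds are arbitrary.
from collections import defaultdict

class Conditioner:
    def __init__(self, innate):
        self.innate = innate          # hard-wired stimulus -> reaction map
        self.follows = defaultdict(lambda: defaultdict(int))
        self.prev = None

    def observe(self, stimulus):
        if self.prev is not None:
            self.follows[self.prev][stimulus] += 1
        self.prev = stimulus
        if stimulus in self.innate:
            return self.innate[stimulus]
        # Learned: react to A as if B, once A -> B is seen often enough.
        counts = self.follows[stimulus]
        total = sum(counts.values())
        for nxt, n in counts.items():
            if nxt in self.innate and total >= 3 and n / total > 0.8:
                return self.innate[nxt]
        return None

dog = Conditioner({"food": "salivate"})
for _ in range(5):
    dog.observe("bell"); dog.observe("food")
learned = dog.observe("bell")
print(learned)  # after conditioning: salivate
```

Nothing here requires anything beyond counting and lookup, which is the point: the conditioning layer is cheap once state memory exists.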
And now I’m out of time, just as I’m getting back in the swing of things. Hopefully the next update comes sooner!
I now take a not-so-brief detour to lay out a theory of brain/mind, from a roughly evolutionary point of view, that will lay the foundation for my interrupted discussion of self-hood and identity. I tied in several related problems when working this out, in no particular order:
- The brain is massively complex; given an understanding of evolution, what is at least one potential path for this complexity to grow while still being adaptively useful at every step?
- “Strong” Artificial Intelligence as a field has failed again and again with various approaches; why?
- All the questions of philosophy of identity I mentioned in my previous post.
- Given a roughly physicalist answer to the mind-body problem (which I guess I’ve implied a few times but never really spelled out), how do you explain the experiential nature of consciousness?
Observant readers may note that I briefly touched this subject once before. What follows here is a much longer, more complex exposition but follows the same basic ideas; I’ve tweaked a few things and filled in a lot more blanks, but the broad approach is roughly the same.
Let’s start with the so-called “lizard hindbrain”, capable only of immediate, instinctual reactions to sensory input. This includes stuff like automatically pulling away your hand when you touch something hot. AI research has long been able to trivially replicate this; it’s a pretty simple mapping of inputs to reactions. Not a whole lot to see here, a very basic and (importantly) mechanical process. Even the most ardent dualists would have trouble arguing that creatures with only this kind of brain have something special going on inside. This lizard hindbrain is a good candidate for our “initial evolutionary step”; all it takes is a simple cluster of nerve fibres and voila.
The next step isn’t so much a discrete step as an increase, specifically in the complexity of inputs recognized. While it’s easy to understand and emulate a rule matching “pain”, it’s much harder to understand and emulate a rule matching “the sight of another animal”. In fact it is this (apparently) simple step where a lot of hard AI falls down, because the pattern matching required to turn raw sense data into “tokens” (objects etc.) is incredibly difficult, and without these tokens the rest of the process of consciousness doesn’t really have a foundation. Trying to build a decision-making model without tokenized sense input seems to me a bit like trying to build an airplane out of cheese: you just don’t have the right parts to work with.
So now we have a nerve cluster that recognizes non-trivial patterns in sense input and triggers physical reactions. While this is something that AI has trouble with, it’s still trivially a mechanical process, just a very complex one. The next step is perhaps less obviously mechanical, but this post is long enough, so you’ll just have to wait for it 🙂
From the nature of the brain, through the nature of the mind, we now move on to the last of this particular triumvirate: the nature of intelligence.
A good definition of intelligence follows relatively cleanly from my previous two posts. Since the brain is a modelling subsystem of reality, it follows that some brains simply have more information-theoretic power than others. However, I believe that this is not the whole story. Certainly a strictly bigger brain will be able to store more complex abstractions (as a computer with more memory can do bigger computations), but the actual physical size of human brains is not strongly correlated with our individual intelligence (however you measure it).
Instead I posit the following: intelligence, roughly speaking, is related to the ability for the brain to match new patterns and derive new abstractions. This is information-theoretic compression in a sense. The more abstract and compact the ideas that one is able to reason with, the more powerful the models one is able to use.
The actual root of this ability is almost certainly structural within the brain somehow, but the exact mechanics are irrelevant. It is more important to note that the resulting stronger abstractions are not the cause of raw intelligence so much as an effect: the cause is the ability to take disparate data and factor out all the patterns, reducing it down to as close to raw Shannon entropy as possible.
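A rough illustration of the compression framing, using an off-the-shelf compressor as a (very weak) stand-in for the brain's abstraction machinery: data with exploitable patterns can be represented in far fewer bits than pattern-free data of the same length.

```python
# Patterned data compresses well; noise is already near raw Shannon
# entropy and barely compresses at all. zlib is just a convenient proxy.
import random
import zlib

random.seed(0)
patterned = b"the cat sat on the mat. " * 100
noise = bytes(random.randrange(256) for _ in range(len(patterned)))

small = len(zlib.compress(patterned))  # patterns factored out
big = len(zlib.compress(noise))        # nearly incompressible

print(small, "bytes for patterned data,", big, "bytes for noise")
```

The gap between those two numbers is, on this view, the room in which intelligence operates.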
Having just covered in summary the nature of the brain, we now turn to the much knottier issue of what constitutes the mind. Specifically I want to turn to the nature of self-awareness and true intelligence. Advances in modern computing have left most people with little doubt that we can simulate behavioural intelligence to within certain limits. But there still seems to be that missing spark that separates even the best computer from an actual human being.
That spark, I believe, boils down to recursive predictive self-modelling. The brain, as seen on Monday, can be viewed as a modelling subsystem of reality. But why should it be limited to modelling other parts of reality? Since from an information-theoretic perspective it must already be dealing in abstractions in order to model as much of reality as it can, there is nothing at all to prevent it from building an abstraction of itself and modelling that as well. Recursively, ad nauseam, until the resolution (in number of bits) of the abstraction no longer permits.
This self-modelling provides, in a very literal way, a sense of self. It also lets us make sense of certain idioms of speech, such as “I surprised myself”. On most theories of the mind, that notion of surprising oneself can only be a figure of speech, but self-modelling can actually make sense of it: your brain’s model of itself made a false prediction; the abstraction broke down.
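The “I surprised myself” case can be sketched in a few lines. This is purely illustrative (the decision rules and the "mood" variable are invented for the example): the self-model is a lower-resolution abstraction of the actual decision procedure, so it can make false predictions about its own system.

```python
# Toy sketch of "I surprised myself": the self-model abstracts away
# state (here, mood) that the real decision procedure depends on.

def actual_decision(offer, mood):
    # The full process depends on internal state.
    return "accept" if offer > 50 or mood == "reckless" else "decline"

def self_model(offer):
    # Lower-resolution abstraction of the above: mood is omitted.
    return "accept" if offer > 50 else "decline"

offer, mood = 30, "reckless"
predicted = self_model(offer)
actual = actual_decision(offer, mood)
surprised = predicted != actual
print(predicted, actual, surprised)
```

When `surprised` comes out true, the abstraction has broken down in exactly the sense described above: the brain's model of itself made a false prediction.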
Our little subsection on biology and genetics has covered the core points I wanted to mention, so now we take a sharp left turn and head back to an application of systems theory. Specifically, the next couple of posts will deal with philosophy’s classic mind-body problem. If you haven’t already, I suggest you skim through my systems-theory posts, in particular “Reality as a System”. They really set the stage for what’s coming here.
As suggested in my last systems-theory post, if we view reality as a system then we can draw some interesting information-theoretic conclusions about our brains. Specifically, our brains must be seen as open (i.e. not closed), recursively modelling subsystems of reality.
Simply by being part of reality, the brain must be a subsystem therein. Because it interacts with other parts of reality, it is open, not closed. The claim that it provides a recursive model of (part of) reality is perhaps less obvious, but should still be intuitive on reflection. When we imagine what it would be like to make some decision, what else is our brain doing but simulating that part of reality? Obviously it is not simulating the actual underlying reality (atoms or molecules or whatever) but it is simulating some relevant abstraction of that.
In fact, I will argue later that this is effectively all our brains do: they recursively model an abstraction of reality. But this is obviously a more contentious claim, so I will leave it for another day.
(Note: my roadmap originally had planned a post on Gödel’s Incompleteness Theorems, but that’s not going to happen. It’s a fascinating topic with some interesting applications, but it’s even more mathematically dense than a lot of my other stuff, and isn’t strictly necessary, so I’m skipping it, for now. Maybe I’ll come back to it later. Read the wiki page if you’re interested.)
This post marks the final cherry on top of this whole series on systems theory, and the part where we finally get to make practical philosophical use of the whole abstract structure we’ve been building up. I’ve telegraphed the whole thing in the roadmap, and the thesis is in the title, so let’s just dive right in: reality is a system. It’s already laid out right there in axioms #3 and #5.
We can also tie this in with our definitions of truth and knowledge. If the absolute underlying reality of what is (forming absolute truth) is a system, then the relative truth that we regularly refer to as “truth” is just a set of abstractions layered on top of the underlying reality.
Dogs and cats and chairs and tables are just abstractions on top of molecules. Molecules are just an abstraction on top of atoms. Atoms, on top of protons, electrons, and neutrons. Protons and neutrons on top of quarks and other fundamental particles I don’t understand. The absolute true underlying system is, in this view, not possible to know. In fact, since we as persons are inside the system (we can in fact be seen as subsystems of it), then we literally cannot model the entire thing with complete fidelity. It is fundamentally impossible. The best we can do is to model an abstraction within the bounds of the entropy of the system. This is in some distant sense a restatement of the circular trap.
Given my previous definition of system simulation (aka modelling) it seems intuitive that a finite system cannot model itself except insofar as it is itself. Even more obviously, no proper subsystem of a system could simulate its “parent”. A proper subsystem by definition has a smaller size than the enclosing system, but needs to be at least as big in order to model it.
(An infinite subsystem of an infinite system is not a case I care to think too hard about, though in theory it could violate this rule? Although some infinities are bigger than others, so… ask a set theorist.)
However, an abstraction of a system can be substantially smaller (i.e. require fewer bits of information) than the underlying system. This means that a system can have subsystems which recursively model abstractions of their parents. Going back to our game-of-life/glider example, this means that you could have a section of a game of life which computationally modeled the behaviour of gliders in that very same system. The model cannot be perfect (that would require the subsystem to be larger than its “parent”) so the abstraction must of necessity be incomplete, but as we saw in that example being incomplete doesn’t make it useless.
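The glider case is concrete enough to run. Below is a minimal textbook Game of Life step plus the abstract glider model: the abstraction tracks only the glider's shape and its known drift of one cell down-right every four generations, discarding everything else about the grid, yet it predicts the full simulation exactly.

```python
# Full Game of Life update versus a tiny abstract model of a glider.
from collections import Counter

def life_step(cells):
    # Standard rules: birth on 3 neighbours, survival on 2 or 3.
    counts = Counter((x + dx, y + dy) for (x, y) in cells
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    return {c for c, n in counts.items()
            if n == 3 or (n == 2 and c in cells)}

# Canonical glider; it repeats every 4 steps, shifted by (+1, +1).
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

cells = set(glider)
for _ in range(4):
    cells = life_step(cells)

# Abstract model: "after 4 ticks, the same shape one cell down-right".
predicted = {(x + 1, y + 1) for (x, y) in glider}
print(cells == predicted)  # True
```

The abstract model uses a handful of bits where the full simulation tracks every cell, and it would fail the moment the glider hit other live cells, which is exactly the incompleteness the text describes.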
Now that we have the link between systems theory and information theory explicitly on the table, there are a couple of other interesting topics we can introduce. For example, the famous Turing machine can both:
- Be viewed as a system.
- Model (aka simulate) any other possible system.
And it is on the combination of these points that I want to focus. First, I shall define the size of a system as the total number of bits that are needed to represent the totality of its information. This can of course change as the entropy of the system changes, so the size is always specific to a particular state of a system.
With this definition in hand (and considering as an example the Turing machine above), we can say that a system can be perfectly simulated by any other system whose maximum size is at least as large as the maximum size of the system being simulated. The Turing machine, given its unlimited memory, has an infinite maximum size and can therefore simulate any system. This leads nicely to the concept of being Turing complete.
(Note that an unlimited memory is not in itself sufficient for Turing completeness. The system’s rules must also be sufficiently complex or else the entropy over time of the system reduces to a constant value.)
In several of my last few posts I have touched on or made tangential reference to the topic known as information theory. It’s kind of a big and important field, so I’ll give you a few minutes to at least skim the Wikipedia entry before we continue.
Alright, ready? Let’s dive in. First note that in my original definition of a system I defined an element as a mapping from each property in the system to a distinct piece of information. This was not an accident. Systems, fundamentally, are nothing more than sets of information bound together by rules for processing that information (which are themselves information, in the relevant sense). The properties set is nothing more than useful labels for distinguishing pieces of information; labels are also a form of information, of course.
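That claim can be encoded almost verbatim (this is my own informal rendering, not anything canonical): both the state, as a mapping from property labels to pieces of information, and the rule over that state are plain data.

```python
# A system as "information plus rules over that information": the state
# is a properties -> information mapping and the rule is itself a value.
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, int]  # property labels -> pieces of information

@dataclass
class System:
    state: State
    rule: Callable[[State], State]  # the rule is also information

    def step(self):
        self.state = self.rule(self.state)

# Example: a two-property system with a simple deterministic rule.
sys_ = System(
    state={"position": 0, "velocity": 2},
    rule=lambda s: {"position": s["position"] + s["velocity"],
                    "velocity": s["velocity"]},
)
sys_.step(); sys_.step()
print(sys_.state)  # {'position': 4, 'velocity': 2}
```

Nothing in the definition distinguishes the labels, the values, and the rule in kind; they are all just bits, which is what lets information theory apply to the whole package.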
As such, we have all the rather immense mathematical power of information theory available to us when we talk about systems. In hindsight, this should probably have been the very next post I wrote after the introduction to systems theory; all of the other parts I’ve written between then and now (specifically the ones on patterns, entropy and abstraction) make far more sense given this idea as context.
In this view, patterns and abstractions go hand in hand as ways of using the low entropy of a system to produce representations of that system using fewer bits. They are, in fact, a form of compression (and what I called an incomplete abstraction simply means that the compression is lossy).
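The lossy-compression framing can be shown in miniature. Here block-averaging stands in, purely as a toy, for the brain's abstractions: the description shrinks, the fine detail is unrecoverable, but the coarse pattern survives.

```python
# Incomplete abstraction as lossy compression: replace a detailed signal
# with a smaller description, lose some bits, keep the useful pattern.

signal = [15 if i % 4 == 0 else 10 for i in range(16)]

# Lossy "abstraction": keep only a block average every 4 samples.
abstraction = [sum(signal[i:i + 4]) / 4 for i in range(0, len(signal), 4)]

# Best-effort reconstruction from the abstraction alone.
reconstructed = [v for v in abstraction for _ in range(4)]

print(len(abstraction), "numbers instead of", len(signal))
print(reconstructed == signal)  # False: lossy, the detail is gone
print(abstraction)              # but the coarse pattern survives
```

Four numbers instead of sixteen, and no way back to the original: an incomplete abstraction in exactly the sense used above.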