Abstractions on Inconsistent Data

[I’m not sure this makes any sense – it is mostly babble, as an attempt to express something that doesn’t want to be expressed. The ideas here may themselves be an abstraction on inconsistent data. Posting anyway because that’s what this blog is for.]

i. Abterpretations

Abstractions are (or at least are very closely related to) patterns, compression, and Shannon entropy. We take something that isn’t entirely random, and we use its predictability (its lack of randomness) to find a smaller representation that we can reason about and use to make predictions. Abstractions frequently lose information – the map does not capture every detail of the territory – but are still generally useful. There is a sense in which some things cannot be abstracted without loss – purely random data cannot be compressed by definition. There is another sense in which everything can be abstracted without loss, since even purely random data can be represented as the bit-string of itself. Pure randomness is in this sense somehow analogous to primeness – there is only one satisfactory function, and it is the identity.
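A rough way to see the compression claim in practice, assuming Python's standard zlib as a stand-in for "a compressor" (the helper name `compressed_ratio` and the sample inputs are mine, purely for illustration):

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower means more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# Illustrative inputs: one highly predictable, one from the OS random source.
patterned = b"ab" * 5000
random_bytes = os.urandom(10000)

print(f"patterned: {compressed_ratio(patterned):.3f}")
print(f"random:    {compressed_ratio(random_bytes):.3f}")
```

The patterned input should shrink to a tiny fraction of its size, while the random input should come out at roughly its original size or slightly larger. This is only one compressor, so it demonstrates the idea rather than proving it.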

A separate idea, heading in the same direction: Data cannot, in itself, be inconsistent – it can only be inconsistent with (or within) a given interpretation. Data alone is a string of bits with no interpretation whatsoever. The bit-string 01000001 is commonly interpreted both as the number 65, and as the character ‘A’, but that interpretation is not inherent to the bits; I could just as easily interpret it as the number 190, or as anything else. Sense data that I interpret as “my total life so far, and then an apple falling upwards” is inconsistent with the laws of gravity. But the apple falling up is not inconsistent with my total life so far – it’s only inconsistent with gravity, as my interpretation of that data.
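To make the 01000001 example concrete, here is a small sketch of three readings of the same bits. The function name is hypothetical, and I am assuming the "190" reading means flipping every bit before reading the result as an unsigned number. That is one interpretation that happens to yield 190; nothing about the bits requires it.

```python
def interpretations(bits: str) -> dict:
    """Return several equally arbitrary readings of one bit-string."""
    value = int(bits, 2)            # read as an unsigned binary number
    mask = (1 << len(bits)) - 1     # all ones, same width as the input
    return {
        "unsigned": value,          # the usual numeric reading
        "ascii": chr(value),        # the usual character reading
        "complement": value ^ mask, # assumed reading: flip every bit first
    }


print(interpretations("01000001"))
# {'unsigned': 65, 'ascii': 'A', 'complement': 190}
```

The bits never change; only the function applied to them does.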

There is a sense in which some data cannot be consistently interpreted – purely random data cannot be consistently mapped onto anything useful. There is another sense in which everything can be consistently interpreted, since even purely random data can be consistently mapped onto itself: the territory is the territory. Primeness as an analogue, again.

Abstraction and interpretation are both functions, mapping data onto other data. There is a sense in which they are the same function. There is another sense in which they are inverses. Both senses are true.

ii. Errplanations

Assuming no errors, one piece of inconsistent data is enough to invalidate an entire interpretation. In practice, errors abound. We don’t throw out all of physics every time a grad student does too much LSD.

Sometimes locating the error is easy. The apple falling up is a hallucination, because you did LSD.

Sometimes locating the error is harder. I feel repulsion at the naive utilitarian idea of killing one healthy patient to save five. Is that an error in my feelings, and I should bite the bullet? Is that a true inconsistency, and I should throw out utilitarianism? Or is that an error in the framing of the question, and No True Utilitarian endorses that action?

Locating the error is meaningless without explaining the error. You hallucinated the apple because LSD does things to your brain. Your model of the world now includes the error. The error is predictable.

Locating the error without explaining it is attributing the error to phlogiston, or epicycles. There may be an error in my feelings about the transplant case, but it is not yet predictable. I cannot distinguish between a missing errplanation and a true inconsistency.

iii. Intuitions

If ethical frameworks are abterpretations of our moral intuitions, then there is a sense in which no ethical framework can be generally true – our moral intuitions do not always satisfy the axioms of preference, and cannot be consistently interpreted.

There is another sense in which there is a generally true ethical framework for any possible set of moral intuitions: there is always one satisfactory function, and it is the identity.

Primeness as an analogue.

Link #84 – Compressionism: A New Theory of the Mind Based on Data Compression

http://jessic.at/writing/rba.pdf

Note: not that “new” anymore; this is from 2011. It is both highly technical and philosophical, but it gestures towards some ideas I find interesting on the question of what intelligence might be.

Disclaimer: I don’t necessarily agree with or endorse everything that I link to. I link to things that are interesting and/or thought-provoking. Caveat lector.

Modelling Systems

Now that we have the link between systems theory and information theory explicitly on the table, there are a couple of other interesting topics we can introduce. For example, the famous Turing machine can both:

  1. Be viewed as a system.
  2. Model (aka simulate) any other possible system.

And it is on the combination of these points that I want to focus. First, I shall define the size of a system as the total number of bits that are needed to represent the totality of its information. This can of course change as the entropy of the system changes, so the size is always specific to a particular state of a system.

With this definition in hand (and considering as an example the Turing machine above), we can say that a system can be perfectly simulated by any other system whose maximum size is at least as large as the maximum size of the system being simulated. The Turing machine, given its unlimited memory, has an unbounded maximum size and can therefore simulate any system. This leads nicely to the concept of being Turing complete.

(Note that an unlimited memory is not in itself sufficient for Turing completeness. The system’s rules must also be sufficiently complex or else the entropy over time of the system reduces to a constant value.)
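As a rough sketch of the counting side of this argument: if a finite system has some number of distinguishable states, its size in bits is the smallest number of bits that can label them all, and a simulator with fewer bits cannot keep those states apart. The function names and the 3x3 grid below are illustrative assumptions. The check covers only the size condition, not whether the simulator's rules are rich enough (the point of the note above).

```python
def bits_needed(num_states: int) -> int:
    """Smallest number of bits that can give each state a distinct label."""
    if num_states < 1:
        raise ValueError("a system must have at least one state")
    return (num_states - 1).bit_length()


def large_enough(simulator_bits: int, target_states: int) -> bool:
    """Necessary (not sufficient) size condition for perfect simulation."""
    return 2 ** simulator_bits >= target_states


# Hypothetical example: a 3x3 grid of on/off cells has 2**9 = 512 states.
grid_states = 2 ** 9
print(bits_needed(grid_states))      # 9
print(large_enough(8, grid_states))  # False: 256 labels for 512 states
print(large_enough(9, grid_states))  # True
```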

Information Theory, Compression, and Representing Systems

In several of my last few posts I have touched on or made tangential reference to the topic known as information theory. It’s kind of a big and important field, so I’ll give you a few minutes to at least skim the Wikipedia entry before we continue.

Alright, ready? Let’s dive in. First note that in my original definition of a system I defined an element as a mapping from each property in the system to a distinct piece of information. This was not an accident. Systems, fundamentally, are nothing more than sets of information bound together by rules for processing that information (which are themselves information, in the relevant sense). The set of properties is nothing more than a collection of useful labels for distinguishing pieces of information; labels are also a form of information, of course.

As such, we have all the rather immense mathematical power of information theory available to us when we talk about systems. In hindsight, this should probably have been the very next post I wrote after the introduction to systems theory; all of the other parts I’ve written between then and now (specifically the ones on patterns, entropy and abstraction) make far more sense given this idea as context.

In this view, patterns and abstractions go hand in hand as ways of using the low entropy of a system to produce representations of that system using fewer bits. They are, in fact, a form of compression (and what I called an incomplete abstraction simply means that the compression is lossy).
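A minimal illustration of the lossless/lossy distinction, using run-length encoding as an assumed example of a pattern-based representation and a symbol count as an assumed example of an incomplete abstraction (both are my choices, and all names are hypothetical):

```python
from collections import Counter
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Lossless: store each run of repeated symbols as (symbol, length)."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Rebuild the original text exactly from its runs."""
    return "".join(symbol * length for symbol, length in pairs)


def symbol_counts(text: str) -> dict[str, int]:
    """Lossy: keep how often each symbol occurs, discard the ordering."""
    return dict(Counter(text))


row = "aaaaaaaabbbbaaaa"
encoded = rle_encode(row)
print(encoded)                     # [('a', 8), ('b', 4), ('a', 4)]
print(rle_decode(encoded) == row)  # True: nothing was lost
print(symbol_counts(row))          # {'a': 12, 'b': 4}
```

The run-length form can be inverted exactly. The counts cannot, since many different rows share the same counts.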

Patterns and Entropy

Our next foray into systems theory involves the definitions of patterns and the study of entropy (in the information-theoretical sense). Don’t worry too much about the math; I’m going to be working with a simple intuitive version for the most part, although if you have a background in computers or mathematics there are plenty of neat nooks and crannies to explore.

For a starting point, I will selectively quote Wikipedia’s opening paragraph on patterns (at time of writing):

A pattern, …is a discernible regularity… As such, the elements of a pattern repeat in a predictable manner.

I’ve snipped out the irrelevant bits, so the above definition is relatively meaty and covers the important points. First, a pattern is a discernible regularity. What does that mean? Well, unfortunately not a whole lot really, unless you’re hot on the concept of automata theory and recognizability. But it really doesn’t matter, since your intuitive concept of a pattern neatly covers all of the relevant facts for our purposes.

But what does this have to do with systems theory? Well, consider our reliable example, Conway’s Game of Life. A pattern in Life is a fairly obvious thing: a big long line of living cells is a pattern, for example. This brings us to the second part of the above quote: the elements of a pattern repeat. This should be obvious from the example. Of course you can have other patterns in Life; a checkerboard grid is another obvious pattern, and the relatively famous glider is also a pattern.
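For the glider in particular, the repetition can be checked directly. The sketch below assumes the standard Life rules (a dead cell with exactly three live neighbours is born; a live cell with two or three survives) on an unbounded grid stored as a set of live coordinates. The coordinate layout and the name `step` are my own choices.

```python
from collections import Counter


def step(live: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """Advance one generation of Life on an unbounded grid."""
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# A glider, with x increasing to the right and y increasing downwards.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

state = glider
for _ in range(4):
    state = step(state)

# After four generations the same shape reappears, moved one cell diagonally.
shifted = {(x + 1, y + 1) for (x, y) in glider}
print(state == shifted)  # True
```

So the glider "repeats in a predictable manner" in a slightly looser sense than a line or a checkerboard: the shape recurs in time, displaced in space.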

It seems, on review, that I am doing a poor job of explaining patterns; however, I will leave the above for lack of any better ideas at the moment. Just rest assured that your intuitive knowledge of what a pattern is should be sufficient.

For the more mathematically inclined, a pattern can be more usefully defined in terms of its information-theoretical entropy (also known as Shannon entropy after its inventor Claude Shannon). Technically anything that is at all non-random (aka predictable) is a pattern, though usually we are interested in patterns of particularly low entropy.
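For the curious, here is a minimal sketch of estimating Shannon entropy from symbol frequencies, in bits per symbol. It is the simplest version, assuming each symbol is drawn independently, so it only sees how often symbols occur and not their order. The function name is hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(data: bytes) -> float:
    """Estimate entropy in bits per symbol from observed byte frequencies."""
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * log2(c / total) for c in counts.values())


print(shannon_entropy(b"aaaaaaaa"))        # one symbol only: zero entropy
print(shannon_entropy(b"abababab"))        # two equally likely symbols: 1 bit
print(shannon_entropy(bytes(range(256))))  # every byte equally likely: 8 bits
```

Note the middle case: "abababab" is perfectly predictable, yet this frequency-only estimate gives it a full bit per symbol because it ignores ordering. Capturing that kind of pattern needs a model that looks at context, which is beyond this sketch.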

Apologies, this has ended up rather incoherent. Hopefully next post will be better. Reading the links may help, if you’re into that sort of thing.