Simplicity is held by many to be the key to general intelligence. Simpler models tend to “generalise”, identifying the cause or generator of data with greater sample efficiency. The implications of the correlation between simplicity and generalisation extend far beyond computer science, addressing questions of physics and even biology. Yet simplicity is a property of form, while generalisation is a property of function. In interactive settings, any correlation between the two depends on interpretation. In theory there could be no correlation, and yet in practice there is. Previous theoretical work showed generalisation to be a consequence of “weak” constraints implied by function, not form. Experiments demonstrated that choosing weak constraints over simple forms yielded a 110-500% improvement in generalisation rate. Here we show that all constraints can take equally simple forms, regardless of weakness. However, if forms are spatially extended, then function is represented using a finite subset of forms. If function is represented using a finite subset of forms, then we can force a correlation between simplicity and generalisation by making weak constraints take simple forms. If function is determined by a goal-directed process that favours versatility (e.g. natural selection), then efficiency demands that weak constraints take simple forms. Complexity has no causal influence on generalisation, but appears to because of confounding.
In Press: Accepted for publication in the Proceedings of the 17th Conference on Artificial General Intelligence, 2024.
Please note this poster and abstract are to be presented at ASSC27 in Tokyo. The full paper will be posted here soon. The hard problem of consciousness (Chalmers, 1995) asks why there is something it is like to be a conscious organism. We address this from first principles, by constructing a formalism that unifies lower and higher order theories of consciousness (Morin, 2006). We begin by assuming that humans are biological organisms that aim for survival at the fundamental level. We then assume pancomputationalism and hold that the environment learns organisms that exhibit fit behaviour via the algorithm we call natural selection. Selection learns organisms that learn to classify causes, facilitating adaptation. Recent experimental and mathematical work in computer science elucidates how (Bennett, 2023). For example, it has been shown that scaling this capacity implies progressively higher orders of “causal identity”, facilitating reafference (Merker, 2005) and P-consciousness, then A-consciousness (Block, 1995), then meta-self-awareness (Morin, 2006). In this paper we suggest that we can use this framework to address the hard problem in precise, computational terms. First, we deny that there can always exist a philosophical zombie as capable in all respects as a P-conscious being. This is because a variable presupposes an object to which a value is assigned. Whether X causes Y depends on the choice of X, so causality is learned by learning X such that X causes Y, not by presupposing X and then learning whether X causes Y (presupposing abstractions reduces sample efficiency). However, learning is a discriminatory process that requires that states be differentiated by a given value. Without objects, there is only quality. By this we mean an organism is attracted to or repulsed by a physical state (i.e. “stay alive”, “die”). Learning reduces quality into objects by constructing policies classifying causes thereof.
Where selection pressures require an organism to classify its own interventions, that policy (i.e. a “1st-order-self”) has a quality that persists across interventions, and so there is something it is like to be that organism. Thus organisms may have P-consciousness because it allows them to adapt with greater sample efficiency. We then argue that neither P-consciousness nor A-consciousness is remarkable in isolation, but that when P-consciousness combines with A-consciousness we may obtain “H-consciousness” (which Boltuc (2012) argued is the crux of the hard problem). We speculate that this occurs when natural selection pressures require organism X to infer organism Y's prediction of X's interventions (a “2nd-order-self” approximating intent). We suggest then that A-consciousness is the contents of 2nd-order-selves. In this paper we will show how, by predicting another's predictions of one's own 1st-order-self, it becomes possible to know what one knows and feels, and to act upon this information to communicate meaning in the Gricean sense (recognising communicative intent). We will formally present the psychophysical principle of causality, namely the process of learning and acting in accord with a hierarchy of causal identities that simplify the environment into classifiers of cause and effect. We will then show how this new framework ‘dissolves’ the hard problem of consciousness by simply going one level down and starting with the most fundamental drive of all living biological systems such as humans: stay alive!
Artificial general intelligence (AGI) may herald our extinction, according to AI safety research. Yet claims regarding AGI must rely upon mathematical formalisms – theoretical agents we may analyse or attempt to build. AIXI appears to be the only such formalism supported by proof that its behaviour is optimal, a consequence of its use of compression as a proxy for intelligence. Unfortunately, AIXI is incomputable and claims regarding its behaviour are highly subjective. We argue that this is because AIXI formalises cognition as taking place in isolation from the environment in which goals are pursued (Cartesian dualism). We propose an alternative, supported by proof and experiment, which overcomes these problems. Integrating research from cognitive science with AI, we formalise an enactive model of learning and reasoning to address the problem of subjectivity. This allows us to formulate a different proxy for intelligence, called weakness, which addresses the problem of incomputability. We prove optimal behaviour is attained when weakness is maximised. This proof is supplemented by experimental results comparing weakness and description length (the closest analogue to compression possible without reintroducing subjectivity). Weakness outperforms description length, suggesting it is a better proxy. Furthermore, we show that, if cognition is enactive, then minimisation of description length is neither necessary nor sufficient to attain optimal performance. These results undermine the notion that compression is closely related to intelligence. We conclude with a discussion of limitations, implications and future research. There remain several open questions regarding the implementation of scalable general intelligence. In the short term, these results may be best utilised to improve the performance of existing systems.
For example, our results explain why DeepMind’s Apperception Engine is able to generalise effectively, and how to replicate that performance by maximising weakness. Likewise, in the context of neural networks, our results suggest both the limitations of “scale is all you need” and how those limitations can be overcome.
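
The contrast between weakness and description length can be illustrated with a toy sketch. It is not the formalism or the experiments from the paper. It assumes, purely for illustration, that the weakness of a hypothesis can be read as the number of inputs it permits (the size of its extension) and that description length is the character count of its written form. The domain, the labelled examples and the candidate hypotheses are all made up.

```python
# Toy illustration only: the domain, examples and hypotheses are invented,
# and "weakness" is ASSUMED here to mean the size of a hypothesis's extension.
DOMAIN = range(16)
POSITIVE = {2, 4}  # inputs labelled as belonging to the concept
NEGATIVE = {3, 5}  # inputs labelled as not belonging to the concept

# Each hypothesis pairs a written form (the key) with the function it denotes.
HYPOTHESES = {
    "x%2==0": lambda x: x % 2 == 0,
    "x in (2,4)": lambda x: x in (2, 4),
    "x!=3 and x!=5": lambda x: x != 3 and x != 5,
    "x>4": lambda x: x > 4,
}


def extension(h):
    """Set of inputs in the domain that the hypothesis permits."""
    return {x for x in DOMAIN if h(x)}


def consistent(h):
    """True if h permits every positive example and no negative example."""
    ext = extension(h)
    return POSITIVE <= ext and not (NEGATIVE & ext)


def weakness(h):
    """Assumed proxy: the number of inputs the hypothesis permits."""
    return len(extension(h))


# Keep only hypotheses that agree with the labelled examples.
candidates = {form: h for form, h in HYPOTHESES.items() if consistent(h)}

# Proxy 1: the shortest written form. Proxy 2: the largest extension.
shortest = min(candidates, key=len)
weakest = max(candidates, key=lambda form: weakness(candidates[form]))

print("shortest consistent form:", shortest)
print("weakest consistent form: ", weakest)
```

In this toy setting the two proxies select different hypotheses, which shows only that form and function can come apart. Which choice generalises better depends on the task, and the claim that weakness is the better proxy rests on the proofs and experiments the abstract refers to, not on this sketch.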