system 2 theories #21

KnutJaegersberg opened this issue Aug 12, 2024 · 40 comments
@KnutJaegersberg

There is interesting literature from psychology/cognitive science on how system 2 might work. It doesn't describe thorough cognitive architectures, but it is relevant nonetheless. I'll grab some of it and drop it here.

I think it is in general a good perspective to think of system 2 as built on top of system 1, though involving its own neurocircuitry. I'd presume symbolic reasoning and logic emerge from association and biased / steered usage of system 1 components. Executive functions and attention are used to control neural activity, giving rise to reason. But it's not an entirely independent system. Reason is intuition doing things to itself.

The Pros and Cons of Identifying Critical Thinking with System 2 Processing
https://philpapers.org/rec/BONTPA-3

Three stages of system 2 (figure)

Analytic Thinking (Type 2 or “System 2”) for Large Language Models: using Psychology to address hallucination and reliability issues

https://osf.io/preprints/psyarxiv/n7pa4

gotta dig more

@KnutJaegersberg

One suggestion is that system 2 consists of repeated intuitive responses that are evaluated for fit and modified until something good comes up.
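
A minimal sketch of that generate-evaluate-revise picture as a loop around an LLM; `generate` and `score_fit` are hypothetical stand-ins for a sampling call and a critic, not any particular API:

```python
import random

# Hypothetical stand-ins for an LLM sampling call and a critic/verifier.
def generate(prompt: str) -> str:
    """Produce one 'intuitive' (system 1) response."""
    return f"draft-{random.random():.3f} for {prompt!r}"

def score_fit(response: str) -> float:
    """Evaluate how well a response fits the task."""
    return random.random()

def system2(prompt: str, max_rounds: int = 8, good_enough: float = 0.9) -> str:
    """Repeated intuitive responses, evaluated and kept/modified until one fits."""
    best, best_score = "", float("-inf")
    for _ in range(max_rounds):
        candidate = generate(prompt)
        score = score_fit(candidate)
        if score > best_score:
            best, best_score = candidate, score
        if best_score >= good_enough:
            break  # a good-enough intuition came up; stop deliberating
    return best

print(system2("What follows from A->B and A?"))
```

Swapping in a stricter critic or raising `good_enough` would change how long the loop 'deliberates'.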

@KnutJaegersberg

Note that it has also been suggested there might be a system 3. It's not mainstream.

https://www.moneyonthemind.org/post/on-the-hunt-for-system-3-is-it-real

@KnutJaegersberg

With regard to executive control, it is generally thought of as frontal parts of the brain biasing activity to be goal-directed.
Executive functions play a huge role in the 'mechanics' of system 2.

The cascade theory is prominent and has been around for quite a while. This works via prompt chains with LLMs, too (see the sketch after the link).

https://www.nature.com/articles/s41386-021-01152-w#Sec11
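
As a rough illustration of how the cascade picture maps onto prompt chains: each level of control conditions the level below it. `ask_llm` and the level templates are hypothetical stand-ins, not the paper's model:

```python
# Sketch of cascade-style control as a prompt chain: each level of control
# conditions the level below it. `ask_llm` is a hypothetical stand-in for
# any chat-completion call.
def ask_llm(prompt: str) -> str:
    return f"<answer to: {prompt[:50]}...>"

LEVELS = [
    "Given the overall goal '{goal}', state the current task context.",
    "Given the context '{prev}', pick the relevant rule or subgoal.",
    "Given the rule '{prev}', choose the next concrete action.",
]

def cascade(goal: str) -> str:
    prev = goal
    for template in LEVELS:
        # Higher (more rostral) level biases the more concrete level below.
        prev = ask_llm(template.format(goal=goal, prev=prev))
    return prev

print(cascade("summarize this paper accurately"))
```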

@KnutJaegersberg

The functional anatomy of cognitive control: A domain-general brain network for uncertainty processing

https://sci-hub.usualwant.com/10.1002/cne.24804

Cognitive Control as a Multivariate Optimization Problem

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8939373/

The role of PFC networks in cognitive control and executive function

https://www.nature.com/articles/s41386-021-01152-w

A middle ground where executive control meets semantics: the neural substrates of semantic control are topographically sandwiched between the multiple-demand and default-mode systems

https://academic.oup.com/cercor/article/33/8/4512/6706757?login=false

Neurokognitive Grundlagen des kreativen Denkens (Neurocognitive Foundations of Creative Thinking)

  • default mode network (daydreaming)
  • executive control areas to evaluate new ideas
  • salience network (noticing new things)

https://www.youtube.com/watch?v=dM_hambWQCk


https://www.psychologytoday.com/us/blog/experimentations/201802/your-brain-creativity

There is also the concept of semantic control, which seems important to me for logical reasoning.

Semantic cognition uses executive semantic control and hub-and-spoke semantic representation systems

https://www.sciencedirect.com/science/article/pii/S001094521830073X

Creativity in verbal associations is linked to semantic control

https://academic.oup.com/cercor/article/33/9/5135/6759328?login=false

A Tri-network Model of Human Semantic Processing


https://www.researchgate.net/publication/319645677_A_Tri-network_Model_of_Human_Semantic_Processing

Another model of attention and executive control, using dynamical systems theory, is described in this research:

https://www.youtube.com/watch?v=19ZqeQzXVV4

https://direct.mit.edu/netn/article/6/4/960/109066/It-s-about-time-Linking-dynamical-systems-with

@KnutJaegersberg

Another thing is that reason is likely dynamically learning new things, which is more than what an iteratively prompted LLM does.
I think fluid intelligence and reason are connected with consciousness, as described in the learning-to-be-conscious paper.

I think system 2 is actively learning new things on the fly.

https://www.semanticscholar.org/paper/Learning-to-Be-Conscious-Vermeiren-Cleeremans/a17ac90617afd51bcf94988217d0c96058b927aa

The active inference paper I tagged you in yesterday adds to this picture: it may be the process of the higher representational system predicting the contents of the lower representational systems that gives rise to the experience of access consciousness, in contrast to the learning-to-be-conscious account, which just sees it as learning a representation of those contents.

An active inference model of conscious access

https://www.sciencedirect.com/science/article/pii/S2665945X22000092
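
A toy sketch of that contrast, under loose assumptions of my own: a 'higher' linear model learns to predict the contents of a fixed 'lower' system from a partial readout, with the prediction error as a very crude proxy for whether a content is 'accessed':

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a fixed 'lower' system produces first-order representations;
# a 'higher' linear model learns to predict those contents from a partial
# readout, which loosely mirrors both accounts discussed above.
W_lower = rng.normal(size=(16, 8))
P = rng.normal(size=(6, 16)) / 4      # partial, fixed readout of lower activity
W_higher = np.zeros((16, 6))          # learned second-order predictor

def lower(x):
    return np.tanh(W_lower @ x)       # first-order content

for _ in range(2000):
    x = rng.normal(size=8)
    r = lower(x)
    cue = P @ r                       # the higher system only sees a summary
    err = r - W_higher @ cue
    W_higher += 0.02 * np.outer(err, cue)   # delta rule: shrink prediction error

# Low prediction error = the meta level models the first-order state well,
# which here stands in (very crudely) for a content being 'accessed'.
r = lower(rng.normal(size=8))
print(float(np.linalg.norm(r - W_higher @ P @ r)))
```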


@KnutJaegersberg

I think LLMs have the fundamental limitation that they don't optimize for a 'new cognition' - necessary for system 2 and logical thinking. Human logical reasoning is not infallible either, but I'd guess a good chunk of our higher accuracy compared to LLMs trained on logic is this active modulation, 'making thought fit the rule of logic, for new things'.
Logical thinking is not just template matching; otherwise we could not extrapolate.

Also this research here:

Bridging Machine Learning and Logical Reasoning by Abductive Learning

https://proceedings.neurips.cc/paper_files/paper/2019/hash/9c19a2aa1d84e04b0bd4bc888792bd1e-Abstract.html
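
To make the abductive-learning loop concrete, a heavily simplified sketch with a toy knowledge base (a + b = c) and a fake perception model; in the actual method the revised labels retrain the perception model:

```python
from itertools import product

# Heavily simplified abductive-learning loop with a toy knowledge base
# (a + b = c) and a fake 'perception' model. In the real method, the
# revised labels are used to retrain the perception model.
SYMBOLS = range(10)

def perceive(image) -> int:
    """Hypothetical neural classifier; here it just returns a stored guess."""
    return image["guess"]

def consistent(a, b, c) -> bool:
    return a + b == c                 # the logical knowledge base

def abduce(a, b, c):
    """Find a minimal label revision that satisfies the knowledge base."""
    best, best_changes = (a, b, c), 4
    for aa, bb, cc in product(SYMBOLS, repeat=3):
        changes = (aa != a) + (bb != b) + (cc != c)
        if consistent(aa, bb, cc) and changes < best_changes:
            best, best_changes = (aa, bb, cc), changes
    return best

images = [{"guess": 3}, {"guess": 4}, {"guess": 8}]   # perception says 3 + 4 = 8
labels = [perceive(img) for img in images]
if not consistent(*labels):
    labels = list(abduce(*labels))    # minimally revised, e.g. [3, 4, 7]
print(labels)
```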

@KnutJaegersberg

This is a bit confusing, because in GOFAI, logical reasoners are literally template matchers, and they work reliably.
I have the intuition that we humans recreate the templates all the time; they are never 'recalled', only reconstructed. Of course with very high fidelity. But I presume that GOFAI reasoners are actually kind of alien, we just don't see it.

@KnutJaegersberg

Well, that's my dynamical cognitivism speaking: it's all a flow. But in computer science, it was originally thought that there is 'stuff' in the mind.
The best book I found on that was:

Cognitive Dynamics: Conceptual and Representational Change in Humans and Machines

https://annas-archive.org/md5/39fdd62ff2c171c10a5a78aee755ae3c

I think it gives a reasonable account of what representations are. I've always felt during my whole psychology studies that the way mainstream theory talked about those things was kind of off.
I identify most with what the dynamical systems theory video is talking about; that seems most realistic to me.

@KnutJaegersberg

Now it's becoming mainstream in cognitive neuroscience, but I remember when they talked very differently about the brain, as if it literally did explicit computations. It works as if it did (as we see with ANNs), but there are no numbers in our heads.

@KnutJaegersberg

The simplest view of system 2 I get from all those theories is the iterative updating model of working memory, but used in a 'certain mode' to operate as reason. So I think in a way, you could 'abuse' LLMs as true thinking machines; that's what I feel. It might not be the most elegant thing to do, but my hunch is that if you just want it to think logically as well as we do, one can do 'stuff' to LLMs to achieve that goal.
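
A minimal sketch of that 'certain mode', assuming the iterative-updating picture: working memory as a small buffer whose contents mostly persist, with one item swapped per step; `think_once` stands in for an LLM call conditioned on the buffer:

```python
from collections import deque

# Working memory as a small buffer whose contents mostly persist between
# steps; each iteration, one new item is produced conditioned on the buffer
# and the oldest item falls out. `think_once` is a hypothetical LLM stand-in.
def think_once(buffer) -> str:
    return f"thought derived from {len(buffer)} items in WM"

def iterate(goal: str, capacity: int = 4, steps: int = 6) -> list:
    wm = deque([goal], maxlen=capacity)   # oldest item is evicted automatically
    trace = []
    for _ in range(steps):
        new_item = think_once(wm)         # system 1 proposes, conditioned on WM
        wm.append(new_item)               # partial update: most contents persist
        trace.append(new_item)
    return trace

for thought in iterate("prove the claim step by step"):
    print(thought)
```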

@KnutJaegersberg

This post here is also very important:

Towards a Control Theory of LLMs

https://aman-bhargava.com/ai/2023/12/17/towards-a-control-theory-of-LLMs.html
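
The post frames prompting control-theoretically: roughly, does a control input (a prompt prefix u) exist that steers the model's output to a target y? A toy rendering of that reachability question, with `llm` a hypothetical stand-in:

```python
# Toy rendering of the reachability question: does a control input (prompt
# prefix u) exist that steers the 'plant' (the LLM) to a target output y?
# `llm` is a hypothetical stand-in for a real model call.
def llm(text: str) -> str:
    return "yes" if "carefully" in text else "no"

def find_control(x: str, target: str, candidates: list) -> str | None:
    for u in candidates:              # search over admissible control inputs
        if llm(u + " " + x) == target:
            return u                  # a steering prompt exists
    return None                       # target not reachable from this set

u = find_control("is 7 prime?", "yes",
                 ["think", "think carefully", "answer quickly"])
print(u)  # -> 'think carefully'
```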

@KnutJaegersberg

While LLMs have flaws, I think they do simulate human linguistic thinking well enough to 'get something going' - which you could then call a thinking machine.

@KnutJaegersberg

That's what my HF post was about, too. There I suggest using differentiation to improve the accuracy of reasoning.

https://huggingface.co/blog/KnutJaegersberg/active-reasoning

@andreaskoepf self-assigned this Aug 13, 2024
@KnutJaegersberg

Rational rationalization and System 2


https://hal.science/hal-03025339/file/commentary%20cushman%20postprint.pdf

@KnutJaegersberg

The Elusive Notion of “Argument Quality”

https://www.wellesu.com/10.1007/s10503-017-9442-x


@KnutJaegersberg

Towards a metacognitive dual process theory of conditional reasoning

(important chapter)

https://annas-archive.org/md5/db1cc71fe7ddd30c2abb8a6c93bc0339

@KnutJaegersberg

Metacognition's effects on network structure (if integrated in training).

It's part of system 2, and apparently an impactful part.


https://arxiv.org/abs/2407.10188

@KnutJaegersberg

An old but prominent paper on working memory:

Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia


https://ccnlab.org/papers/OReillyFrank06.pdf
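
The paper's central mechanism is adaptive gating: basal ganglia signals decide which prefrontal 'stripes' update versus keep maintaining their contents. A toy sketch of just those gating dynamics, with the learned policy faked by random weights:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy sketch of the paper's gating idea: working-memory 'stripes' maintain
# their contents until a (here: fake, random) basal-ganglia policy opens
# them for update.
n_stripes, dim = 3, 4
stripes = np.zeros((n_stripes, dim))        # PFC: maintained representations
gate_w = rng.normal(size=(n_stripes, dim))  # stand-in for a learned gating policy

def step(stimulus: np.ndarray) -> np.ndarray:
    gates = (gate_w @ stimulus) > 0         # BG decides which stripes update
    for i in np.flatnonzero(gates):
        stripes[i] = stimulus               # gated stripe loads new content
    return gates                            # ungated stripes keep maintaining

for _ in range(3):
    print(step(rng.normal(size=dim)))
print(stripes)
```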

@KnutJaegersberg

Hmm, that paper by Schmidhuber is of course also relevant: learning to think.

On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models

In the video he calls it a kind of learned prompt-writing model, used to retrieve data that is useful for solving the kind of problems at hand.

https://arxiv.org/abs/1511.09249

@KnutJaegersberg

So you have a learned way to 'populate' working memory with useful contents for whatever system 2 is doing / pursuing.
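
A sketch of that idea under toy assumptions: a controller hill-climbs a query vector (its learned 'prompt') so that retrieval from the world model's memory fetches content useful for the current task; memories, task encoding, and reward are all made up here:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy sketch: a controller hill-climbs a query vector (its learned 'prompt')
# so that retrieval from the world model's memory fetches content that helps
# the current task. Memories, task encoding, and reward are all made up.
memories = rng.normal(size=(50, 8))           # world model M's stored experience
task = rng.normal(size=8)                     # encoding of the current problem
query = rng.normal(size=8)                    # controller C's learnable query

def retrieve(q: np.ndarray) -> np.ndarray:
    return memories[np.argmax(memories @ q)]  # nearest-by-dot-product recall

def usefulness(mem: np.ndarray) -> float:
    return float(mem @ task)                  # toy reward: alignment with task

for _ in range(200):                          # simple hill climbing on the query
    trial = query + 0.1 * rng.normal(size=8)
    if usefulness(retrieve(trial)) > usefulness(retrieve(query)):
        query = trial                         # keep queries that fetch better chunks

print(usefulness(retrieve(query)))
```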

@KnutJaegersberg

I think that was about retrieving the right abstractions according to Schmidhuber in the video - brilliant!

@KnutJaegersberg

Instead of simulating every small step, such an approach would help the system learn to grab the right chunks for the task at hand; it's a faster way of achieving a goal, as long as the world model is somehow useful.

@KnutJaegersberg

This is not accepted theory, but there are also different suggestions of a system 3. Of course that is also partly used for self-promotion, but there is something to it.
Different accounts of it:

  • as an imagination system
  • as executive inhibition of system 1 with system-2-established rules
  • as flow states or a kind of zen
  • as wise decision making

While I doubt that this necessitates naming a system of its own, the account from wise decision making that mentions a balance is something I find interesting.
System 2 is cold and calculating. Optimizing the balance between systems 1 and 2 sounds like an interesting thought, as system 2 only has the purpose of making human decision making more adaptive. It's not an end in itself, and it has flaws, too. So looking for something that optimizes 'the whole' of human intelligence in some way sounds intriguing to me.

https://static1.squarespace.com/static/60f9012ad56b966a9e76b364/t/611b77ba6d9e784015a8c4d3/1629190075410/Webb+%282020%29.pdf

@KnutJaegersberg

Wise decision making sounds like something the whole should optimize for, using whatever helps.

@KnutJaegersberg

I forgot one:

  • social understanding / social framing. A bit like stereotypical reasoning, talking in ways people get without much reflection, though again, as with imagination, I'm not sure one ought to call these things a system 3.
    But communicating in social frames can also be a subutility of wisdom, as imagination can be.

@KnutJaegersberg

What I was looking for was something to balance or tweak "mind", like a metacognitive objective.
Something an agent can learn over time to improve its decisions, whatever its basis is.
Wisdom sounds a bit cliché and common-sense-like, but some interpretation of it can be a good objective to optimize using all that cognitive gear.

@KnutJaegersberg

Of course one can also select other things to optimize for, like intelligence itself or, say, wit. But those targets that make the agent operate in harmony with its environment will likely all converge to wisdom.
Just prompt it to become a better Ravenclaw.

@KnutJaegersberg

Better give it some extra wit; Gandalf was not dumb either. To compensate for smart agents that are not wise.

@KnutJaegersberg

(image)

@KnutJaegersberg

Multi-Task Brain Network Reconfiguration is Inversely Associated with Human Intelligence

https://www.biorxiv.org/content/10.1101/2021.07.31.454563v2.full


@KnutJaegersberg

Integrated Intelligence from Distributed Brain Activity

https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(20)30169-8


@KnutJaegersberg

This notion of mixed selectivity and efficient, orthogonal task representations working together / filtering simultaneously, and mattering for gf, again reminds me of Reser's model. It's as if you only have to populate working memory in a smart way and then let it do its thing.

@KnutJaegersberg

Referring to active working memory here, using effort and executive control to populate it. But then again, that other paper showed that in intelligent people there is not a big difference in the brain activity involved between task and resting state. It's a kind of zen focused attention: letting it happen, not stiffening the focus in a cramped way.

@KnutJaegersberg

Reser's model is of course kind of short-term, so one needs constant long-term working memory utilization to stay in touch with 'bigger' goals - as intelligence is not lost in the immediate.

https://www.jneurosci.org/content/42/38/7276


@KnutJaegersberg

Not the same as a task list, more like aspirations, because it's not possible even for intelligent people to plan everything happening to them.

@KnutJaegersberg

(images)

@KnutJaegersberg

This reinforces my view that all reasoning can be understood as a kind of categorization, a kind of pigeonholing. When done right, the right pigeon goes into the right hole. Which is also why GPT is not entirely off. It's lacking recursion, though.

@KnutJaegersberg

It's important to note that the category can be made up on the fly. That's where GPT falls short. You need to place the attractor in semantics, not in a fixed distribution. There are different neural network architectures for this, but I bet one can tinker something together for LLMs which, with a little RAG help, will work at human level. One ought to endow the emerging thought with a kind of p-consciousness. That's not achieved with automated prompt engineering. Each p-consciousness, each thought, is original.
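
One way to read 'placing the attractor in semantics' with a little RAG help: the category is a prototype constructed at query time from retrieved exemplars, rather than a fixed label in a trained distribution. A toy sketch with made-up embeddings:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy sketch: a category as a prototype constructed at query time from
# retrieved exemplars (a little RAG help), rather than a fixed label in a
# trained distribution. Embeddings here are random stand-ins.
corpus = rng.normal(size=(100, 16))          # embedded documents (the RAG store)

def embed(text: str) -> np.ndarray:          # hypothetical encoder stand-in
    seeded = np.random.default_rng(abs(hash(text)) % 2**32)
    return seeded.normal(size=16)

def make_category(description: str, k: int = 5) -> np.ndarray:
    q = embed(description)
    nearest = np.argsort(corpus @ q)[-k:]    # retrieve the nearest exemplars
    return corpus[nearest].mean(axis=0)      # the ad-hoc category's attractor

def belongs(item: str, prototype: np.ndarray) -> bool:
    return float(embed(item) @ prototype) > 0

proto = make_category("things that block my current plan")
print(belongs("road closed ahead", proto))
```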

@KnutJaegersberg

Hot and cold executive functions in the brain: A prefrontal-cingular network

It's mixed, as I said.


https://journals.sagepub.com/doi/pdf/10.1177/23982128211007769
