Nonlinear Recurrent Connections

Unusually for me, I’ve generally agreed with the choice of linear circuits, especially when applied to practical stuff.  Noise, chaotic behavior, phase distortion – there’s a long list of reasons guiding designers away from a nonlinear approach in many diverse domains.  But I could never relate to nonlinear design’s harshest critiques: “strange, odd, risky” given the need for differential equations and the layout separation that often accompanies their implementation.

However, we can make mistakes when defaulting to linear design, regression, and analysis; many fundamental processes are elastic, and serious consequences in science, finance, and so on result when ignoring the ubiquitous nonlinear nature of interactions.  To me, the most interesting of these are the infinite number of self-organizing nonlinear transfer functions at the heart of neurons, which comprise a critical building block in the opaque essence of human consciousness.

In addition, in most large electrical systems, feedback loops nested throughout complex networks, or simpler cascading directed graphs, can give rise to incredible potential and features; and natural neural systems appear no different.  The recurrent wiring inherent across the neocortex and other critical regions is vital to an array of conscious and unconscious processes.  However, for quite a while, it seemed most neural network experiments were focused on feedforward networks.

It was an early time in the reintroduction of research into nonlinear connections and much time was spent on exploring basic tenets of their domain; however, another likely reason was the order-of-magnitude increase in complexity that recurrent connections introduce when trying to analyze and understand their chaotic behaviors in the presence of nonlinear systems.  This includes temporal effects, and how they influence short-term memory in the hippocampus, leading to the subsequent neocortical formation of long-term associative neural connections.

On the path to sentience, I believe recurrence is paramount to providing continuity of thought, convergence of auto-associative memory, as well as predictive abilities.  To me, nonlinear, recurrent, connectionist architectures of enormous scale have very little in common with current digital computers and much-hyped AI simulations, other than the basic ability of universal Turing machines to emulate similar interactions.  As such, to me this is an early, but fascinating time in the exploration of singularly unique machines, whose successors, I expect, will progressively give rise to systems of incredible promise.

As such, this blog entry will touch on the advantages and challenges of complex designs applying nonlinear architectures in recurrent connectionist systems, as compared to more deterministic approaches toward artificial intelligence.  More importantly, it will touch on how their use can diverge in important ways, including fundamental questions which, when taken to the limit, may even ultimately affect humanity.  Perhaps here I’m edging toward philosophy as much as computer science and cybernetics, the usual focus of this blog.  Thank you for indulging me.

Simple is better; however, newer technology and tools support malleable, easy-to-use nonlinear designs.  It’s an example of the multifaceted, combinatorial aspects of technology today; born of insight out of phase from past leaders, replaced by entrepreneurs with the mettle for exploring new territory, and in the process continually refining the state of the art.  I can’t remember running across a young electrical engineer worried about the use of nonlinear designs, once considered heresy by so many.

My perspective also hosts nostalgia and admiration for past mentors, whose incredible narratives and vision of the facilities and inventions they leveraged from the initial analog design phase inspired me; like Solomon secrets:  hosting a style of nonlinear thinking, deeply expressive, representing art as much as engineering.   An early phase in the remarkable era that inspired amazing combinations of analog, digital and nonlinear advances.  For example, the PSTN provided five 9s to millions long before incorporating  ICs.  Yet, ultimately losing the race to linear design and digital technology rapidly boosted by Moore’s Law, along with libraries of arrays of simple flip-flops; queued for fabrication techniques and ready to begin mass production, before there were projects and services ready to receive them.  The die was cast before we were aware of how digital dies could best be cast, and in their wake analog design faded fast.  Moores the pity and the law.

Perceptrons showed up early in the AI contests: supervised learning computers designed as linear classifiers, with support from Ivy League institutions, hyped by even the Navy as a breakthrough in image recognition, heralding great promise: “the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.”  That still sounds a bit goofy, and the Perceptron story a bit stale; probably recounted more often than it deserves, but an important catalyst for a broad effect on the legacy of AI that I still see echoes of today.  Somehow, along the way, we lost sight that op amps and transduction are realistic reflections of interfaces and processes everywhere and as such, should never be shed as engineering tools.  To me, digital and analog are symbiotic lexes useful on anyone’s journey to know more – it’s a self con to discard one for the other.

One of the Perceptron’s novelties was the ability to partition sections on a two-dimensional plane, classifying objects into distinct categories; this was a cool and intriguing first step, except when it became clear nonlinear patterns couldn’t be segmented in this fashion, due to the linear transfer function at the heart of this very particular paradigm’s training scheme. Several esteemed researchers gained notoriety for stomping on the Perceptron, proving even simple XOR problems couldn’t be linearly separated.  Funding quickly evaporated in vast, related fields.  Like ancient feuds, one can still find the very definition of Perceptrons debated, as if an attempt to repair their lack of malleability and justify an early stumble.  A lesson for me is that linear and nonlinear designs, as simply one example of many design considerations, ultimately lead to tremendous variance of design potential and philosophy of approach, as do countless other emerging ideas on the road to architect sentience.
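The XOR limitation can be sketched in a few lines of Python.  This is only a minimal illustration under my own assumptions, not the original Perceptron hardware or anyone’s historical code: the names `train_perceptron` and `xor_two_layer` are hypothetical, and the hidden-layer weights in the second function are picked by hand rather than learned.  A single threshold unit learns AND but never settles on XOR, while adding one layer of nonlinear (threshold) units is enough to represent it.

```python
# Illustrative sketch only: all names and weights here are assumptions.

def step(x):
    """Hard threshold transfer function."""
    return 1 if x > 0 else 0

def train_perceptron(samples, epochs=100, lr=1.0):
    """Classic perceptron learning rule on 2-input samples.

    Returns (weights, bias, converged). 'converged' is True only if a
    full pass over the samples produced no classification errors.
    """
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            out = step(w[0] * x1 + w[1] * x2 + b)
            delta = target - out
            if delta != 0:
                errors += 1
                w[0] += lr * delta * x1
                w[1] += lr * delta * x2
                b += lr * delta
        if errors == 0:
            return w, b, True
    return w, b, False

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def xor_two_layer(x1, x2):
    """XOR from two hidden threshold units (OR and AND), hand-wired."""
    h_or = step(x1 + x2 - 0.5)
    h_and = step(x1 + x2 - 1.5)
    return step(h_or - h_and - 0.5)

and_converged = train_perceptron(AND)[2]
xor_converged = train_perceptron(XOR)[2]
xor_outputs = [xor_two_layer(a, b) for (a, b), _ in XOR]

print(and_converged)  # True: AND is linearly separable
print(xor_converged)  # False: no single line separates XOR
print(xor_outputs)    # [0, 1, 1, 0]
```

The hand-wired network sidesteps the training question entirely, which is the part that took the field much longer to answer.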

Of course now there are oodles of endless nonlinear transfer functions of every shape, wrap, band-pass, cadence, and other creative names in recurrent, deep hierarchies that have been, and continue to be, proven mathematically to be able to emulate most, if not any, kind of function.  So much for the powerful XOR nonlinear problem that derailed an entire branch of research, funding, and opportunity.  For most complex, modern networks and transfer functions, the more important research question is typically how networks of infinite forms of transfer functions can be most effectively trained and stabilized.  Many are also becoming aware there may be issues constraining them, once scaled up to trillions of concurrent interactions, which we have no current hope of gleaning insight into and perhaps even interrupting once launched.

Curiously, I recently read that new generations of AI students seem to go through a predictable cycle: initially excited by its prospects and promise, soaking up directed training/unsupervised learning paradigms, making incremental progress from decades of documented dabbling, and embedding them in all sorts of emerging distributed and mainframe contexts.  Ultimately financial needs, the end of their thesis, time, etc. steer them to embrace more practical areas of deterministic AI where progress (self-driving cars and an endless succession of other cool stuff on the horizon) is ever easier to achieve and a career easier to sustain.  But most realize whatever they are creating is a compromise and another delta in updating the ongoing definition of Weak AI.

Students and researchers seem bummed by memories of endless training sessions, disillusionment as they ultimately glean insight their solutions are completely opaque, and incredibly difficult to reliably reproduce; yet, surprisingly, frustratingly, and ironically astonishingly powerful in the narrow domain where targeted – an irony similar to many tenets of physics and science in general.  The field is clearly having second thoughts of late, brandishing their frustrations across several decades of slow, incremental progress, and focusing anew with ever more powerful contexts of software and machines toward parallel, self-organizing connectionism.  The idea seems to be to encapsulate them in discrete, controlled environments for safety and focus of function.  A tall order that.

It’s interesting how some have decided the answer is a stone soup of all corners of current AI approaches, and at least one recent book I read challenged the reader to get on with assembling it, as if we simply need to follow through with his recipe.  It appears some support deterministic supervision using expert systems, a pinch of SVMs, and so on, along with a host of other attributes that miscellaneous paradigms still in vogue can offer to balance, provide safety, and add value to the whole.  I wonder how they skip over the obvious fact the brain doesn’t use any current AI paradigms.  For anyone who is familiar with business, it’s easy to sympathize with the reality that their investors now probably demand to see AI as part of the corporate portfolio, but that doesn’t mean a solution is nascent.  Nor has it ever been.

A proof-based, deterministic approach eclipsed everything AI for decades after connectionism was temporarily abandoned, continually fostering interesting, new experiments and prototypes; wafting about a bright future and financial fortunes, while blowing oodles of taxpayer and private money alike.  Quite a few seemed to shimmer until limitations dragged reality back to the forefront, and the next ‘breakthrough’ percolated up.  Regardless, it’s clear there were and continue to be many stunning successes: in particular, exhaustive search in the form of chess, combinatorial statistics based strategies that beat Jeopardy champions while refining their approach over distributed architectures, as well as other domain specific problems which gave momentum to an approach which still encourages many, if not most developers, who likely, in this life, will continue to believe in it as a panacea.

I’m a pragmatic fan of determinism and Weak AI for intellectual and other earthy reasons.  Yet, I’ll never empathize with why some were given so much credit for compering the clarion call that set back research in important, tangential fields, like nonlinear connectionist systems, for decades.  Conversely, Grossberg’s Adaptive Resonance Theory is still considered a watershed in advancing understanding of convergence within connectionist systems and the power it brings to neural systems.  I can ostensibly forever reflect on it with new insight.  Thank you, Stephen; yours is the closest point of perigee to understanding how neurons converge towards quiescence IMHO.

To me, nonlinear complexity seems to hold an interesting psychoanalytical relationship with humans, given the immense potential and power it represents.  My own ten-cent psychology leads me to believe it’s upsetting to realize something so powerful converges toward the incomprehensible.  As such, we’re prone to classify it as non-science, regardless of the theoretical potential and hard evidence of realization via our own neural network, the human mind.  No wonder it was set aside for the much more practical, immediate, and rapidly expanding digital world.

So it’s odd that posterior probabilities, and random statistical emulations, are clearly part of the kindling that, it appears to me, is fundamental for creating and testing ‘Broad or Strong AI,’ which leads to a more capable, complex, and ultimately opaque intelligence that inevitably will learn on its own.  This is in contrast to ‘Narrow or Weak AI,’ typically a deterministic, potentially comprehensible approach usually built from reason and logical inference.  Perhaps a more precise understanding follows if we see what the neocortex might or can potentially grasp at some specific level, e.g. Narrow/Weak AI, and in the case of Broad/Strong AI, what it never will, when taken to the limit, as the system naturally, irrevocably becomes both erudite and impenetrable.

To refine the idea, perhaps the point where human intellect becomes challenged in understanding a complex system is simply somewhere a bit before the crossover to what gives rise to systems which can bootstrap their own intelligence.  Even if the process of launching sentient life is known, the resulting system will likely be unknowable by us.  In contrast, Narrow or Weak AI can already do pretty amazing stuff and, given enough effort, showcase malleable attributes that create systems directed toward specific goals that quite clearly appear imbued with more than just determinism – perhaps a kindling of human-like intelligence, but clearly not the same.

But it’s Broad or Strong AI that both threatens and beguiles us toward something permanently beyond ourselves that may lead to fundamental, perhaps one-way changes. That’s a bit concerning to many already, perhaps as it should be, especially as time and systems keep advancing exponentially.  Yet it’s difficult to imagine thwarting the vector of human will to innovate; ironically illustrated in the ending of Player Piano, as a vending machine is repaired in the ruins of an apocalypse brought on by machines, and the effort applauded.  Like many I don’t see an answer, just concern over something important and fascinating, moving rapidly, and which we appear unable to grasp, much less define currently.

The analytically impervious nature of neocortex meshes of connections hosts the impossible to comprehend: concurrent, millisecond synapses and an average of eight thousand axonal to dendritic pairs between billions of neocortex neurons, along with as yet unimaginable numbers of other discoveries ahead.  I agree with those who classify our own neocortex as the first and only example we’ve seen of Broad or Strong AI.  The issue then is that there may ultimately be more examples which we inadvertently play a role in bootstrapping and quite feasibly lose control of; all the while only dimly aware of the initial catalyst we stumbled across, much less the inestimable outcomes.  That systems become opaque to human understanding as they become complex seems self-evident.  The problem is then they can also become uber capable in ways we just can’t relate to in our present form.

Back to incremental progress over the past few decades, the many learning paradigms I’ve reviewed seem to have little to do with natural self-organization:  inspired by the fundamental ways nature pieces everything together, from molecular crystals to information in the nucleus of cells used as a function to create life which gives rise to consciousness.  Perhaps we could foster vital insight through staid, sustained focus on how self-organization of massively parallel, recurrent meshes of neural connections with nonlinear transfer functions, low pass filtered by glial cells and awash in hormones, fosters sentient thought.  Breakthroughs might follow faster than we expect.  For a simple comparison, we marvel at the fewer than one hundred years it took to go from first flight to space travel.

The outcome now of many a blur of Narrow and Broad AI projects, seems to have been to attract investment toward small eddies next to sand bars that spin until the money drains away; where it’s easy to get lost in goofy goals cast as zillion dollar future IPOs, launched by the chemically clouded/technically vacant evangelists who whisper frantic fear to confidants far afield from their typical, initial over the top braggadocio confidence – eventually lost in disillusionment and a dissolved financial stake they’ll never pay back, representing monkeys on their backs for the duration of most of their lives, forcing them and their beneficiaries to forget the beautiful ideas that once inspired them.  Do yourself a favor and run from them.

But it doesn’t have to be that way.  Conversely, developers can give quiet rise to some of the most potent concepts and instantiations that make progress, with small, even singular cadres of developers who seek anonymity; fostered by development contexts which cost virtually nothing; save a compiler, an average computer, perhaps a Faraday Cage and a box of ARM machines with pieces of magic that cost pennies.  Some desire nothing more than to develop systems that seek answers at the top of a mountain in the clouds.  Even if it leads nowhere to nothing, except a closer understanding of being human.  Seems to me it’s a magnificent time to occupy a digital abbey.

In addition, all this culture of AI may be getting in the way of itself.  Perhaps we should stop naming new learning paradigms, like we do compilers and languages, and instead itemize their coefficients and parameters, such as nonlinear matched filters, spiking patterns, and so on in biological and evolutionary sets.  At any rate, it appears many now see this in a more mature context; that self-awareness from other than carbon based life forms is probably going to get here in a few generations.  If and when so, I hope they have mercy on us before they decide to escape this important rock, with its thin, gentle, paradise biosphere.

So it seems clear to me there are well-known connectionist architectures with specific attributes that look to possess incredible potential.  Although an easy trap to get lost in, and wholly unwieldy, recurrent connections seem fundamental for sustaining the catalyst for vital aspects of AI.  As a personal preference, I often select recurrence as one of the building blocks, chosen from the many variants of auto-associative memories, bidirectional or otherwise.  This often stands out to me as more important than an overriding focus on scaling up the number of hidden neuron layers across hierarchies, which has received much attention given its relationship to deep learning.  In my experience nonlinear, recurrent connectionist systems become entirely opaque, as they distribute complex functions across vast, cavernous, meshed layers of both forward and coiled, backward connections; encapsulated in massively parallel, recursive copies of the same.  They quickly become imperceptible as their potential exponentially rises.  As the decades pass and new instantiations are realized, I wonder how long we will be a part of the progress.
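For readers who haven’t met a recurrent auto-associative memory, here is a minimal sketch.  I’m assuming a small Hopfield-style network with Hebbian (outer-product) weights and a hard-threshold transfer function; that choice, the names `train_hopfield` and `recall`, and the two toy patterns are all my own illustration, not a description of any particular system mentioned in this entry.  Every unit feeds back into every other unit, and repeated updates pull a corrupted input back toward a stored pattern.

```python
# Illustrative sketch only: a tiny Hopfield-style auto-associative memory.

def sign(x):
    """Hard-threshold (nonlinear) transfer function over {-1, +1}."""
    return 1 if x >= 0 else -1

def train_hopfield(patterns):
    """Hebbian outer-product weights with no self-connections."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def recall(w, state, max_sweeps=20):
    """Update one unit at a time until a full sweep changes nothing."""
    s = list(state)
    n = len(s)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            # Each unit sees the current state of all the others:
            # this feedback is the recurrence.
            new = sign(sum(w[i][j] * s[j] for j in range(n)))
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s

stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
w = train_hopfield(stored)

noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # first pattern, one unit flipped
recalled = recall(w, noisy)
print(recalled == stored[0])  # True
```

With only eight units and two orthogonal patterns the behavior is easy to follow by hand; the opacity described above shows up as the unit count and the number of stored patterns grow.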

The contexts I enjoy tinkering with in this field are small, embedded, concurrent Turing machines that emulate nonlinear transfer functions with more backward referenced connections than forward, hosting countless varieties of varying coefficients influenced by time, phase, and frequency matched Kalman-like filters.  These trillions of what appear to be packets of order statistics seem to saturate themselves, for me, in the form of mesh topologies, an interesting aspect of which is that time doesn’t always waft across the connections at the same rate, unlike a CPU or Stratum clock tick; while generally overall producing a system that might look like mud to the casual observer.  There are endless small loci of effects, where every neuron fires within its own context, while asynchronously interacting with neurons several inches away across a small skull holding a three pound, impossibly opaque, squashy chaos of 100 billion cells that can produce anything from wit to imagination via what looks like a scaffold of semi-organized mud.

How to wash it in simulated software hormones that can mimic catalysts for system vectors on the path to sentience eludes me: one of millions of hidden heuristics that I believe hinder progress across the spectrum of the ostensibly inestimable deltas of knowledge prerequisite for neocortex emulation.  There are many other mysteries which I have confidence will ultimately fall and leave us not sure what is the next step, much less the path forward.  I agree with those who say we must plan for that eventuality.  I don’t think it’s wise to tinker with large systems of this ilk, much less share it in tangible form.  What’s daunting and damning to the churn of the future is apparently all you need are a few billion, concurrent, emulated Arduinos/Raspberry Pis along with a not so weak mind, which is probably pretty common in any State University CS program.

For the deterministic mind-set, I do believe some logic bounds exist at certain levels to give some semblance of control, for which we take great pride in ourselves as conscious beings that can cast our own destiny.  Infinite quantum waves collapse from infinite particle potentials, which free will aggregates into the reality we experience, supporting the ego joy we revel in, as we convince ourselves that we’ve charted our unique future while threads of other universes have, as I perceive the theory, effectively vanished from our current ability to get to them.  I agree with Richard Feynman that I’ll never understand quantum mechanics, even as quantum theory will likely, and ironically in a few years, usher in quantum machines that have a very realistic shot at emulating true sentience.  Go figure.  As a very respected physicist has said: “The universe is under no obligation to make sense to you.”

This mental magic transpires, completely oblivious to us, interwoven in the vast, unexplored characteristics of the wiring of the brain which rattles consciousness along following no granularity of time; with eigenvectors of electrochemical effects washing concurrently in every direction across a massively parallel cortex firing in infinite slices, roughly at the millisecond level, hosting magic that somehow integrates thought from a billion corners within a small skull, enabling me to put nouns and verbs together.  It’s daunting to say the least, to consider how far we would have to go to get near anything resembling comprehension of the least of the details of this incredible dance of infinite potential.

The self-organization of our minds is a beautiful counterexample to entropy and a characteristic of life that is likely ubiquitous across the countless galaxies that Hubble-collected photons reveal after weeks of looking at a pinpoint in the blackness of space.  It’s painfully apparent how little we understand of ourselves and how what we perceive as consciousness has emerged; but most now realize it’s extremely unlikely we’re the only creatures, made of star stuff, who have pondered, are pondering, or will ponder their own meaning and origin.  How can a human not be spiritual in the face of this?

An interesting analyst of consciousness once said “The essence of society is the repression of the individual and the essence of the individual is repression of himself.”  Psychoanalytical considerations aside, I expect unraveling ourselves from the id to the axon to glutamic acids is an important step toward questions many orders of magnitude more interesting, and will ultimately affect society in ways we can’t predict.  This may be especially significant when insight inescapably compels forward engineering of sentience instantiations that immediately cross over, beyond each of us individually, and echo back onto the larger locus of humanity that will, in my opinion, affect everyone, everywhere.  What then of societal and psychoanalytical repression forms that have held us wherever and whatever we currently are?  What happens to sentience when it can self-evolve via massively parallel, self-organizing systems in a microsecond what took carbon based life forms several billion years?  A clear reason Artificial Intelligence is, and will increasingly be, the major question worth pursuing.  Plausibly the only question rightly worth pursuing, as it determines answers to so many other questions.
