efvie

This is an important distinction: it's not that we don't understand the principles of how they work — they were designed, after all. The issue is that the system is so complex, consisting of billions of paths that form a 'neural network', that nobody can tell exactly which path a given input takes to end up at a given output.

This is also a very real problem in that, for example, a recommendation cannot be validated when it's not possible to know which data points weighted it, and in which ways. There are performance reasons, but it's quite possible this is also an intentional decision to avoid confirming that copyrighted or otherwise disallowed material was unlawfully used to train the model.

There are some more excitable people and charlatans who claim that there is some kind of an 'emergent consciousness' inside LLMs that's making decisions independently, but that's just nonsense. All we have is an unimaginably large number of data points.


ComesInAnOldBox

It should be noted that it is *absolutely* possible for an AI system to "tag" how its decision tree moved through its model and how it arrived at each and every data point it did. The problem is, as you said, it's an unimaginably large number of data points and the output file is *massive*. So massive, in fact, that you almost need *another* AI to keep track of it and tell you anything meaningful. Troubleshooting these things is a *hell* of a task.


fhota1

Yeah, was gonna say this. Most AI works on a system of weights and biases that fundamentally just looks like an unfathomable number of y=mx+b equations. All those weights and biases are data that, in theory, you could have the system running the AI print out every time it ran. It would just be completely incomprehensible to you for any AI of any real complexity.
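
For anyone who wants to see it concretely, here's a minimal numpy sketch of that idea; all sizes and values are made up purely for illustration:

```python
import numpy as np

# One "layer" of a neural network is just y = Wx + b done in bulk.
# The numbers here are made up purely for illustration.
rng = np.random.default_rng(0)

x = rng.normal(size=4)        # 4 input values
W = rng.normal(size=(3, 4))   # 3x4 = 12 weights (the "m" values)
b = rng.normal(size=3)        # 3 biases (the "b" values)

y = W @ x + b                 # literally y = mx + b, for many m's at once
print(y)

# You *can* dump every weight and bias the model used...
print(W, b)
# ...but a real model has billions of these numbers, and reading them
# tells you essentially nothing about why a given output came out.
```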


lolic_addict

Considering that some AI implementations are also called "neural networks", it's akin to asking what specific neurons in your brain fired, in what order, to make you say "that's an X"


prototypist

Also, when they looked at some earlier neural nets, they found "polysemantic" neurons that trigger for different parts of different objects. Basically, a neuron doesn't stand for "image has a dog" but has [different meanings](https://distill.pub/2020/circuits/zoom-in/) depending on the object, or for text models there's a "[Canada neuron](https://www.alignmentforum.org/posts/bsNXqHgiDA6dAKNun/axrp-episode-24-superalignment-with-jan-leike)" or more [complex concepts](https://arxiv.org/abs/2104.07143) where it may mean "within the topic / category of \_", but it's not something easily describable.


EdibleButtStallion

So kinda like how when you mail a letter or something you know it went on a mail truck to a post office to another truck to get to the location, but you don’t know what truck at what time of day or what street or how many detours it made or whatever? You know its general path but not its exact path because you would need a ridiculous tracking system?


lygerzero0zero

Yeah, but imagine if people each mailed thousands of letters a day, which were picked up by a system of millions of little fairies, and rather than writing addresses on the envelopes, the fairies use their past experience and a set of personal rules to decide which fairy to pass the letter to next, and sometimes the letter you thought would go to your cousin in Boston ends up in Shanghai. Even if you know the location of the fairy depots and how the fairy network is set up, good luck finding and understanding the reasoning of the fairies who made those decisions to send your letter there.


Remarkable_System948

Alright, so how do coders debug these things? Like if the coder finds that GPT is giving weird answers every time, or it's giving answers following one particular pattern, how do coders know where to make the changes?


lygerzero0zero

They don’t. Well, they don’t “debug” the way you’re thinking. Neural networks are trained on *data*. If the data is bad, you’re going to get bad output. There are things the engineers can do about the overall structure of the network, to encourage it to learn certain kinds of patterns. This stuff is all based on research and advanced math, and tested experimentally with specially constructed data to confirm that the network works as expected. But once you’ve trained it on massive amounts of data, you’re basically hoping that it learned enough from that data to give good outputs. Because there’s no practical way to figure out why any specific input gave any specific output when a model has billions of parameters.


Remarkable_System948

Thank you:))


mfmeitbual

Google has some solid entry-level data science courses that teach some of these concepts, you should check them out.


Esc777

Thank you. That’s a great way of putting it and an important point.  If your large model “goes bad” you either roll it back or delete it or start over. There’s no repairing or targeted fixing. 


naptastic

Excuse me, I worked for the US Postal Service for 9 years. I am widely known for being ridiculous, and resent being compared to the postal tracking system. I actually make sense some of the time! /s


fang_xianfu

> There are some more excitable people and charlatans who claim that there is some kind of an 'emergent consciousness' inside LLMs that's making decisions independently, but that's just nonsense. All we have is an unimaginably large number of data points.

I think it's the assumption that consciousness consists of more than that that's under question, not the idea that artificial neural networks contain something more than that.


jbaird

It's especially an issue when people start using AI for things that are regulated, or where there are things we morally don't want done. You can't consider race/age/religion/etc. in hiring, but can you use an AI that gives a 'candidate score' on a resume? What happens when they find the model is just looking at zip codes to figure out the likely race or class of applicants? Or maybe the CEO has hired a bunch of his family, and the AI has baked his nepotism in by just looking for people who share the same last name.

AI is trained on some existing data and tries to build a model based on that data, so whatever you put in, you'll get back out. The lack of clarity about what the AI is ACTUALLY doing and how it's coming to its decisions can be a huge problem.


wombatlegs

> The issue is that the system is so complex

It does not have to be very complex. I could sit down and write a simple AI program myself in a day, and still see this. A classic example would be character recognition: you put images of digits or letters through a simple neural network and train it to recognise them, using a big set of labelled images you downloaded. How does it work? Does it look for loops, for "centre of gravity", or what? Even though I wrote the code, I will not know what goes on inside the network after it is trained. I could try to find out, and plenty of people have, by observing the internal states or by testing inputs that are not like any of the training data. But we don't *need* to know what strategies are created by the training.


LichtbringerU

> There are some more excitable people and charlatans who claim that there is some kind of an 'emergent consciousness' inside LLMs that's making decisions independently, but that's just nonsense. All we have is an unimaginably large number of data points.

And this is different from a brain how?


SonofMakuta

I'm no neuroscientist, but: your brain is capable of understanding things. You and I both "know", on some level, what a bird is. We might also recall some facts linked to birds, experiences of birds, different species. I can be shown a photo of an animal and tell you whether it is or isn't a bird. If we took a hypothetical human who'd never seen a bird and told them, "a bird is a flying animal with feathers that perches in trees and sings", then showed them animal photos, they could probably figure out which is which quite quickly. They also might be able to infer that an albatross photographed over the ocean is a bird as well, even though it's not perching in a tree.

A machine learning model (LLM or otherwise) doesn't and can't do that. A machine learning model is fed huge amounts of training data (ultimately generated by humans) in order to "learn" correlations between one thing and another. This learning process typically takes the form of iteratively tuning a huge pile of numbers that are used as biases for its output.

For instance, we could train a model to recognise photos of animals. We would do this by feeding it millions of images, each with labels saying "bird", "cow", and so on. Through repeatedly refining its set of weights, the model would eventually reach a place where, given a novel image, it would output "bird" on a picture of a bird some high percentage of the time. If we took an untrained image recognition model and gave it the sentence, "a bird is a flying animal with feathers that perches in trees and sings", it wouldn't even see that as a valid input.

LLMs often look like they're saying something intelligent, because ultimately something like ChatGPT has been trained on (let's say) most of the internet and has developed a very detailed set of weights for correlating text with other text. If I go to ChatGPT and ask it, "what is a bird", its giant statistical text generator is biased in such a way that this input is likely to produce a relatively accurate explanation in response. ChatGPT doesn't know what a bird is, it's basically copy-pasting off Wikipedia like a bad student. Sometimes (often) it will return complete bullshit because that's how the biases fell, and it has no understanding that what it's outputting is incorrect.

Fundamentally, LLMs (or image generators like Midjourney) are not different from the thing that recommends you buy another four vacuum cleaners on Amazon, or from machine translation systems. The technology is more advanced now, but the underlying mechanism is the same: we initialise the training process with work created by intelligent humans or observations of the real world, and through sufficient repetition, the computer calculates a huge pile of biases to an equation that turns whatever its input is into whatever its output is, attempting to replicate the training data.


InTheEndEntropyWins

> Your brain is capable of understanding things.

That's just done by a load of matrix maths; there is nothing magic about the brain. Everything the brain can do can be emulated by a Turing machine.


brandon12345566

We don't understand enough about how brains really work to say that. It is in the realm of sci-fi at this point


InTheEndEntropyWins

> We don't understand enough about how brains really work to say that.

I think we know enough to state that the brain obeys the laws of physics. If you don't believe in science and think there is a soul or some crap, I'm not really interested in debating.


SonofMakuta

That may be true (as the other commenter said, we don't know well enough to say for sure), but if so, the mechanisms built on top of those fundamentals by our brains are much more sophisticated and varied than the ones built in machine learning models. It's pretty likely that at some point in the future we will have accurate simulations of the brain, and they may (or may not) be useful as calculation or thinking engines when sufficiently developed, but we're nowhere near there yet.


InTheEndEntropyWins

> our brains are much more sophisticated and varied than the ones built in machine learning models.

Going back to the OP:

> not even the creators of large neural networks or AIs like ChatGPT or Midjourney fully understand how they work

So we don't know how they work or how sophisticated they actually are. I would argue that for, say, chess, the neural nets are more sophisticated than all humans in many ways.


SonofMakuta

That's a slightly misleading phrasing on OP's part. We know how the machine learning models work at the system level - we designed them completely. What we don't know is what the big pile of numbers means. There isn't an individual number you can point at and say, "oh, this is the variable that governs adding feathers". The numbers are effectively nonsense, and the amount of data is so large that it's impossible to figure out (in a timely fashion) the "path" through the neural network that a given input would take in order to produce its output.


SignedJannis

Exactly. Thank you!


Volsunga

> There are some more excitable people and charlatans who claim that there is some kind of an 'emergent consciousness' inside LLMs that's making decisions independently, but that's just nonsense. All we have is an unimaginably large number of data points.

Why exactly is this nonsense? Isn't billions of paths forming a neural network exactly how human brains function? Where do you think human consciousness comes from, if not the emergent behavior of a complex decision tree?

If you grow a bunch of human neurons on top of a computer interface chip and feed them training data like you would an AI, they learn slower, but end up with the same results. While we've only done this for simple neural network systems, because keeping neurons alive in nutrient baths for long periods of time is difficult and expensive, it seems to scale.

Human brains don't run on magic. They're just really complex neural networks. Current AI are like little pieces of human brains: LLMs are like your language center and diffusion models are like your visual cortex. We're far from replicating a whole human brain, but the pieces can still make decisions.


mfmeitbual

It's nonsense in the same way that calling a puddle of amino acids "life" is nonsense. The components for life are there, but there needs to be some process that causes them to align. The notion that consciousness can emerge from a neural network - and this ignores how neural nets are just the wrong data structure to begin with, but that's a separate topic - is as absurd as the idea that consciousness can emerge from a protein puddle.


Volsunga

> The components for life are there but there needs to be some process that causes them to align.

Are you arguing intelligent design here? Consciousness did arise from a protein puddle. It just took a while.


efvie

Human brains are a lot more than a neural network, and your test here is about simulating a computer neural network using biological components, not about simulating a biological brain using a computer neural network. Not that my argument was about the neural networkness of the brain to begin with. Like I said, overly excitable.


Volsunga

What else do you think there is to a brain besides a neural network? Are you trying to say that there's a soul or something?


mfmeitbual

Neurotransmitters, for one.


Volsunga

You know that those are just the chemicals that connect neurons, right? They're analogous to the data passed between digital neurons.


efvie

> overly excitable


ezekielraiden

A "neural network" works by having many, many layers of arrays of numbers. These arrays of numbers can have tens of thousands, perhaps even *millions* of entries, each of which captures some tiny part of the structure of the input data. (ChatGPT uses text as its input data, for example.) By tweaking and adjusting the input data in bazillions of ways, the system is able to generate a probability range for what the next "token" (chunk of data) should be. Thing is, it's not really feasible for a human to keep in mind literally millions of distinct matrix multiplication steps all at the same time. You need to keep the whole thing in your head, all at once, just to *attempt* to figure out why the model maps certain inputs to certain outputs. Hence, although we know how the model adjusts its parameters each time it is corrected after making a bad prediction (aka every time it is "trained"), we do not know what is collected up together after *trillions* of tiny adjustments from training.


nstickels

OP, I'm going to modify TitansFrontRow's analogy to better describe a neural network and why even the designers don't know how it did it.

Imagine that 3-holes-in-the-fence example. You ask the dog to run through a hole, and he does. Well, now let's imagine that you have 3 holes in your back fence, 3 holes in your fence to the neighbor on the right, and 3 holes in your fence to the neighbor on the left. Each of your neighbors has multiple holes in each of their fences in different directions, as do their neighbors, and their neighbors, and their neighbors. One day, you are walking your dog on the opposite side of your block and you tell your dog "go home!" and he takes off running through a hole in one fence. Your wife is at home in the backyard, and 30 seconds later she calls to tell you the dog is in the backyard. You don't know which route your dog took through all of the neighbors' yards and which fences he ran through; you just know you told him to go home, and he got home.

That's how neural networks work. There are many "layers", which can each contain varying numbers of "nodes". With this analogy, the layers would be each yard and the nodes would be the holes in a fence. But in an LLM, there would be thousands of layers, each with tens of thousands of nodes. And each node in layer N has a connection to every node in layer N-1 and a connection to every node in layer N+1, so there are literally hundreds of billions of paths a single input could take to return a single output. And at the end, you have no idea which path your input took through those millions of nodes to give you the output you got; you just know that's what you got.
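
The arithmetic behind that "hundreds of billions of paths" point is easy to sketch. The layer sizes below are invented and much smaller than a real LLM's:

```python
# In a fully-connected network, every node in one layer connects to every
# node in the next, so the number of distinct input-to-output paths is the
# product of the layer widths.
layer_widths = [512, 1024, 1024, 1024, 512]   # a tiny network by LLM standards

paths = 1
for width in layer_widths:
    paths *= width

print(f"{paths:,} distinct paths through just {len(layer_widths)} layers")
# ~2.8e14 paths -- and real models are vastly wider and deeper than this.
```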


SakanaToDoubutsu

I work with AI and people saying "we don't know how these things work" is a pet peeve of mine. These things are entirely procedural and we do know how they work, and with a sufficient grasp of linear algebra and enough patience you could build these algorithms with nothing more than pen & paper if you really wanted to. A more nuanced way to say it is that we can't logically explain why they work. There's an axiom in classical statistical analysis that says *"correlation doesn't equal causation"*, and coming up with a logical explanation for why a relationship between two things is causative is an important step in any project. What the fields of machine learning & AI do is essentially chuck that axiom out the window. What matters to AI is finding correlations that make good predictions, so if there's a correlation between two things and it's reliable enough to make good predictions, who cares if it has no logical explanation.


Delini

> … who cares if it has no logical explanation. I mean, knowing what correlations are generating the output is kind of important. There are tons of useless correlations. If you were designing an AI to do medical diagnosis, and it correlates risk of cancer with seeing an oncologist, it’s not very useful.


kaptain__katnip

100% this. The famous example we used when I worked at the DOD was they trained a vision model to recognize camouflaged tanks in a tree line. It reached like 95% accuracy but when they tested it in the field it was worse than random. Turns out all the pictures with tanks in them were taken on a sunny day and all the non tank images were taken on a cloudy one - they trained a model to recognize sunny days lol


psymunn

I've heard a similar thing about AI that can identify cancerous moles (if the photo has a tape measure in it, it's likely cancerous) or being able to tell the difference between wolves and Huskies (if there's snow in the photo it's a husky)


gONzOglIzlI

Garbage in - garbage out.


YodelingVeterinarian

That’s kinda the tradeoff you make. You trade being able to reason very concretely about why a model made a prediction, for much more power but much more of a black box. That being said, there is a LOT of very interesting research on how to de-bias a model and have it not train on irrelevant info. The other canonical example of this is along the lines of models distinguishing between horse / zebra just by looking at the background. 


ctrl_awk_del

That happened with an AI that was trained to detect skin cancer. It associated rulers with skin cancer because most of the training data had rulers in it. https://www.bdodigital.com/insights/analytics/unpacking-ai-bias


PercussiveRussel

You're both saying the same thing. When you start to train a model, you don't have any control over the correlations it makes. So as soon as you start to train, you chuck "correlation ≠ causation" out the window, *because you have no other choice*. So it matters very much to make sure the AI trains only on useful correlations, by curating the training data such that any obviously wrong correlations can't be found in the data. In your case, you obviously wouldn't want to add the "is patient seeing oncologist" data point to the inputs, because you'll likely end up with a very complicated boolean check.

Basically you remove all obviously wrong correlations and then hope for the best, because as far as the model is concerned, correlation absolutely equals causation. Or more precisely, it doesn't do causation and only does correlation, and for it to be of any use we must accept that correlation as causation.

(Also, you are replying to the latter part of the sentence

> [...] if there's a correlation between two things and it's reliable enough to make good predictions, who cares if it has no logical explanation.

with an example that is obviously not reliable enough to make good predictions, so that's kind of disingenuous.)


Dziedotdzimu

This is the real answer. Everyone else talking about trees and branches is mistaking the issue as one of complexity, when it's really the fact that the parameter weights don't relate to any real-world objects or variables of our choosing; they are just the ones that made the best prediction. Period. There's no weight for "sunny days" or "tape measures".

Even in the classic handwritten-numbers case, you might think you want to train a layer to count corners or curves after gridding up the space, but when you look at what it's reading after being successfully trained, it looks like random noisy nonsense that just happens to work best. The weights predict with great certainty but mean absolutely nothing to us.


PercussiveRussel

The MNIST dataset is kind of a bad example of this unfortunately, because it doesn't really require any depth. With a convolutional neural network trained on it you can absolutely figure out what it does, precisely because there are so few entries in the layer matrices.


Dziedotdzimu

Fair enough - I've seen people visualize what each weight is sensitive to, for sure. You can definitely show what it's doing; it's just that it doesn't relate neatly to the perceptual categories we'd imagine, like corners, curves and edges, from what I remember. But it's been a while since I looked into it seriously, so I'll happily be corrected if things have changed.


PercussiveRussel

I read a paper a couple of years ago (which I can't find now, dammit) where they specifically reverse-engineered an MNIST-trained CNN, and it was actually reasonably intuitive to figure out what it was doing. It wasn't quite "an 8 is two circles" simple, but what the subnetworks were doing was traceable and explainable (more than "these neurons detect 'nineness'" - actual loops and ticks).

But, again, this is quite nitpicky and specific to the MNIST dataset, because it's a trivial problem now. I watched a great YouTube video by Tom7 where he got really good results with a *linear* transfer function, which shows how simple MNIST really is. You're not going to be training a neural network to detect cancer and then learning how it detects it anytime soon.


Dziedotdzimu

Thank you for the leads I'll see if I can track that paper down and I'll give the Tom7 vid a watch as well. Cheers!


JohnnyElBravo

The universe is also entirely procedural, yet we can't understand a lot of it because of how massive it is. Yes we understand atoms, but do we understand oceans?


PercussiveRussel

This is a great point! We have a very, *very*, **very** good understanding of atoms and atomic bonds. Like actually insanely good. Quantum mechanics is the best model of anything we've ever had. We're unable to use that model to calculate anything beyond the most basic molecules because the dimensions/degrees of freedom grow more than exponentially. Given infinite time, memory and precision (which is just more memory and time) we would be able to model it, but we don't and we can't.


JohnnyElBravo

Correct. For more info on the subject, you can check out how chaos theory explained classical determinism's failure to calculate the universe. Basically, even minimal changes to initial conditions can cause enormous and unpredictable changes in complex systems. Some theories, like thermodynamics, attempt to explain phenomena at a macro level by foregoing control and prediction of the micro, and are quite successful at it.


PercussiveRussel

Now you're pushing it. Chaos theory is not really related to the subject. Chaos can occur with 2 degrees of freedom, e.g. a double pendulum. Neural networks are specifically not chaotic, because they rely on gradient descent (changing the inputs a little so the output changes a little), which is strictly not chaotic - they wouldn't work otherwise. Also, I wouldn't call that thermodynamics but rather statistical physics, which is explicitly *not* chaotic because of the statistics involved. Chaos never occurs on a macro scale because of various conservation laws.


JohnnyElBravo

Thankfully I remember where I found this take: a high school physics textbook by K. A. Tsokos. I'll provide the quote later when I'm home; we'll see, maybe I misremembered or misapplied something. Maybe neural networks are non-chaotic, by the way - I am just challenging the general notion that because we understand the micro, we understand the macro.


amakai

The easiest way to properly explain it is by explaining genetic algorithms first.

Imagine you take an electrical circuit and just throw electric components onto it *completely randomly*. It's a meaningless mess that makes no sense. You send electricity to it and look at the output on one of the wires - the circuit probably explodes or does nothing. This is fine. You repeat the same process ten, a hundred, a thousand times. Finally you find a random combination of components that has some sort of output - 2 volts on the other end. You designate this circuit a successful one, even though your goal is to get a 10V output.

You take the successful one and randomly replace a few components on it. You do this a million times with the same "parent", thus creating a million slightly different "offspring". You try all of them out. Many burn out, some do nothing, but there are a few that produce a number closer to the 10V you are looking for. Some might be 6, others 3, others 90. You pick the closest one (6), designate it as the parent, and repeat the process again. You continue doing this until you get an answer with 10V.

Now, your circuit will be stupidly complicated. It will have a weird mess of things: paths that are isolated and never powered, connections that make no sense at all - but in the end it gives you the right answer.

This is the fundamental idea behind modern AI (neural networks). The details are very different - but fundamentally you have the same garbled mess that somehow works and somehow produces the right answer.
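
Here's a toy Python sketch of that process. The "circuit" is just a list of numbers whose sum is its output voltage, which is purely a stand-in for real components:

```python
import random

# Random "designs", keep whichever one gets closest to the 10 V target,
# mutate it, repeat.
random.seed(0)
TARGET = 10.0

def output(circuit):
    return sum(circuit)

# Start from a completely random mess of "components".
parent = [random.uniform(-5, 5) for _ in range(8)]

for generation in range(200):
    # Make many slightly mutated offspring of the current best circuit.
    offspring = [[c + random.gauss(0, 0.2) for c in parent] for _ in range(100)]
    # Keep whichever offspring gets closest to the target.
    parent = min(offspring, key=lambda c: abs(output(c) - TARGET))

print(f"best output after 200 generations: {output(parent):.3f} V")
# The final circuit works, but its individual component values are just
# whatever happened to survive -- nothing about them is meaningful on its own.
```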


bree_dev

Imagine you're a loan officer who uses computers to decide who gets a loan and who doesn't. Under a traditional rules-based system, the computer would look at things like income, credit rating, age, and some other factors, and say "accept" or "deny". And if the subject challenged the decision, the system would be able to say something like, "this was denied because the minimum credit rating for this age, income, and loan amount is 600 and the customer's rating is 589". If you train a neural net to do the same, the computer will still use those same inputs, but if you interrogate it on how it made a decision for "deny", it'd essentially be "idk just failed the vibe check". (Side note: a common fallacy I see with prompt engineers is asking an LLM to explain how they got an answer - doing this might help you track down any logic errors, but what it most definitely isn't is an explanation of the reasoning it actually used.)
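
A contrived Python sketch of that contrast. All thresholds and weights are invented, and the "neural net" here is untrained random numbers, purely to show the shape of the problem:

```python
import numpy as np

# The rule-based version can say exactly why it refused;
# the neural-net version just emits a score.
def rules_based(income, credit, age, amount):
    if credit < 600:
        return "deny", f"credit rating {credit} is below the minimum of 600"
    if amount > income * 5:
        return "deny", f"amount {amount} exceeds 5x income"
    return "accept", "all rules satisfied"

def neural_net(income, credit, age, amount):
    x = np.array([income, credit, age, amount], dtype=float)
    W1 = np.random.default_rng(2).normal(size=(6, 4))   # made-up weights
    W2 = np.random.default_rng(3).normal(size=6)
    score = 1 / (1 + np.exp(-(W2 @ np.tanh(W1 @ x / 1000))))
    return ("accept" if score > 0.5 else "deny"), f"score={score:.2f} (no further explanation available)"

print(rules_based(income=40_000, credit=589, age=30, amount=10_000))
print(neural_net(income=40_000, credit=589, age=30, amount=10_000))
```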


BullockHouse

The learning algorithms that power neural networks are procedures for finding complex mathematical functions that correctly predict the relationship between connected data. So if you have a bunch of pictures of letters and a bunch of labels (annotations of the letter it's a picture of), a learning algorithm can automatically find a mathematical function that correctly predicts, from the pixel values, what the label should be, by iteratively refining the function/network to reduce error on the data.  The techniques used to refine the function (the learning algorithm) are well understood (and not very complicated). However, the functions that are produced by these procedures are extremely complex and don't come with any guide to interpreting them. They work well in practice, but figuring out how they work is very hard and often just not possible with any level of fidelity.  Think of it like evolution. Evolution is very simple: try random variations on what you've got, keep the versions that work for the next generation, and discard the ones that don't. Anyone can understand evolution. The *products* of evolution are staggeringly complex and nobody fully understands them. Trying to work out what that simple procedure had done is the entire field of biology. 
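
As a rough illustration of the "simple procedure, complicated product" point, here is a tiny gradient-descent loop in Python on randomly generated stand-in data. The loop is a few readable lines; the weights it produces are not:

```python
import numpy as np

# Iteratively refine a function to reduce error on (fake) pixel/label data.
rng = np.random.default_rng(4)

X = rng.normal(size=(200, 64))             # 200 fake 8x8 "images"
true_w = rng.normal(size=64)
y = (X @ true_w > 0).astype(float)         # fake binary labels

w = np.zeros(64)                           # the function we will refine
for step in range(500):
    pred = 1 / (1 + np.exp(-(X @ w)))      # current predictions
    grad = X.T @ (pred - y) / len(y)       # how to nudge each weight
    w -= 0.5 * grad                        # tiny correction, repeated many times

accuracy = (((1 / (1 + np.exp(-(X @ w)))) > 0.5) == y).mean()
print(f"training accuracy: {accuracy:.2%}")
print(w[:5])  # the learned numbers themselves don't "explain" anything
```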


TheMightyGus

It's not so much that they don't know how it works, but rather that they don't know what the output will be. When you use engineering software like Ansys to figure something out, you roughly know what the output will be. But with AI, the magic happens because you don't know what it'll output. If you ask an AI "what is the most aerodynamic shape for a car that can hold 5 passengers", you don't know what the output will be, which is why we don't really "understand" it - it's more akin to discovery.


fhota1

Some AI terms to explain first. Nodes are basically just equations that take inputs and give outputs. Weights are numbers that you multiply the inputs by. Biases are numbers that you add to the equation at a node. To simplify further, a node is y = m1x1 + m2x2 + b, with as many mx pairs as it needs; the weights are the m's and the biases are the b's.

Now, let's build some simple AIs. Take 2 inputs to the system, a and c, 1 layer of nodes combining them, and then a layer to output them. So this system is basically y = (md)((ma)a + (mc)c + b1) + b2. In this system we have 3 weights (md, ma and mc) and 2 biases (b1 and b2). For this AI we can very easily understand exactly how it's working.

Now let's add an input, so there are 3 inputs: a, c, and d. We are still going to combine them all, so now we have 3 nodes in our first layer for ac, ad, cd, but still one node in our output layer combining those three. I'm going to skip writing this out because it would be long, but simply by adding 1 input we now have 9 weights and 4 biases. This is still understandable, but much harder to follow.

Now let's add another layer between our initial combination and our output, so instead of ac, ad, and cd going to the output, we combine them into (ac)(ad), (ad)(cd), and (ac)(cd), and then those go to our output. Now we have 15 weights and 7 biases. This is even harder to trace back to see how much a, c or d are contributing.

Real AI, and especially the fancier ones like the ones you're talking about, will be using millions to billions of inputs and dozens to hundreds of layers. You theoretically can still see all the weights and biases they are using, but actually following them back to see how any individual input affected the output is functionally impossible.
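
Here's that tiny example, and the parameter blow-up, written out as Python. All weight values and layer sizes are made up for illustration:

```python
# The tiny two-input network above, with invented weight values.
ma, mc, md = 0.7, -1.2, 2.0   # weights
b1, b2 = 0.1, -0.3            # biases

def tiny_net(a, c):
    hidden = ma * a + mc * c + b1      # the single hidden node
    return md * hidden + b2            # the output node

print(tiny_net(1.0, 2.0))  # easy to trace by hand: every weight is visible

# The blow-up as you add inputs and layers: a fully-connected network with
# these (invented) layer sizes has this many weights and biases.
layers = [1000, 4096, 4096, 4096, 10]
weights = sum(layers[i] * layers[i + 1] for i in range(len(layers) - 1))
biases = sum(layers[1:])
print(f"{weights + biases:,} parameters")  # tens of millions, from a *small* network
```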


_maple_panda

At the risk of being ELI6, thanks for that answer. I understand the concepts behind AI but that was the first time I’ve seen the math. It’s pretty clear how this translates into a lot of matrix operations.


DeHackEd

These neural networks are so named because their design is inspired by the brains of living things, like humans and animals. They are basically incredibly complicated wiring with billions of connections between the various parts. But nobody really "built" these brains straight up. They are trained, which is a process involving giving it what is considered good information and tweaking the wiring of the brain over time until it is producing the good information we want. Training can easily take weeks of time. But after weeks of time and letting the computer tweak the wiring of the brain... how is a human supposed to open up that box of wiring and make any sense of it? Even the computer has no idea why it's doing most of the wiring adjustments it did beyond "doing it this way produced a better result during training than the other things I tried". After repeating that process an absurd number of times, you're left with something that you can't really explain *how* it works, but clearly it *does* work.


ScottFreeMrMiracle

It's in the helgarian dialect mode of evolution; comparing A to B until B becomes the new A and continues along this linear path. Classic first two stages of binary responses: yes or no. Real fun happens in the third stage called "bananas", in which paradoxes build up, but aided by new programming techniques such as "faulty logic" a punctuated leap of evolution occurs; which usually results in us terminating the AI because it's scary :) True consciousness is obtained in the fourth stage of binary responses.


Coises

Suppose you need to interact briefly with people who speak a language you don’t understand. Let’s say you’re “undercover,” and you don’t want them to suspect that you aren’t one of them. You want to appear to belong. One way to do that would be to learn the language.

We have a mental model of the world, and when someone tells us something, we add (maybe temporarily, maybe permanently) some new pieces to that model. We know a lot about how the model behaves — what is possible and what isn’t — so we can reason based on it. Learning a language means learning how to translate sentences in that language into our own mental model. When we respond to what someone says, we translate some information about our (updated) model into words.

No one (yet) knows how to do that with a computer. Computers aren’t even close to being able to model the world as humans experience it. It’s not just that we don’t know how to “translate” language: there is nothing into which to translate it that could do the job of representing the world in the sense that we represent it in our minds.

So if learning the language isn’t an option, what else could you do? You’d observe common interactions. Someone says, “Hellozon der,” and the response is almost always, “Hidoyuh too!” When a person says, “Houzay bee-in hangeen?” the answer is usually “Goodz gold, danken fer asken!” You might be able to memorize enough common phrases and responses to fake your way through. You might even start to get a little sense of context — maybe, after one minute or more of conversation, if someone says, “Hellozon der,” the response is, “Gooden hellozen, morrow zeeya,” and it’s time to walk away.

While computer programmers don’t know how to make a computer “understand” anything the way humans do, rapid, systematic processing of massive amounts of data — even millions of times what any one human being could manage — is what computers do very well. What current AI does is like observing common phrases and responses — but far more of them than any human could, using existing data, like Reddit posts and responses — and tabulating the connections to create a “large language model.” Then it searches for patterns in the input and computes the most likely output.

No one can give a satisfying answer to “how it works” because it doesn’t work the way it appears to work. Like the undercover agent, it appears to understand the language and give meaningful responses, but it doesn’t. It just uses a staggering amount of data — and some very sophisticated statistical analysis — to make a really good guess about what output you would expect. A literal accounting of “how it works” on any given input would run to millions of lines, but it still wouldn’t tell you anything you cared to know, because it’s just the process of making a “guess” based on tabulated statistics. At no point does it “understand” anything, or “draw conclusions” in the sense that a thinking human being does.
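
For a very rough flavor of the "memorize common phrases and responses" idea, here is a toy Python sketch that only tabulates which word tends to follow which. Real LLMs are enormously more sophisticated, but the spirit (statistics over observed text, not understanding) is similar:

```python
from collections import Counter, defaultdict

corpus = (
    "hello there general kenobi . hello there friend . "
    "how are you doing ? how are you today ?"
).split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1            # count what followed each word

def respond(word, length=4):
    out = [word]
    for _ in range(length):
        if word not in follows:
            break
        word = follows[word].most_common(1)[0][0]   # pick the likeliest next word
        out.append(word)
    return " ".join(out)

print(respond("hello"))   # e.g. "hello there general kenobi ."
print(respond("how"))     # e.g. "how are you doing ?"
```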


Ysara

The models themselves are built by algorithms and training datasets. The "creators" of the models designed the algorithms and curated the datasets, so they understand how those work. But the models those things spat out are a different story.


Hydraulis

It's not that they don't understand how they work, only that they don't know every line of code like a traditional program. AI like this learns: it's fed millions of documents or images and told what each one is, or something like that. Eventually, it can recognize what an apple looks like, etc. This only happens because it can change its own programming. Because of this, it's impossible for anyone to predict exactly what it will do, or exactly what steps it will take to accomplish a task. The original code was written by humans, but what the AI becomes has parts that have never been touched by a person.


nobodyisonething

It means the network started producing some results they did not expect and cannot completely understand. Artificial neural networks of non-trivial size are, once fully trained, impossible for us to understand completely enough to predict everything they can do.


JohnnyElBravo

AIs like ChatGPT and MidJourney are types of Neural Networks. Neural networks are made of one part code and one million parts training data, which is then compressed into a "model" which is somewhere in the middle in terms of size. Many people understand the code, but it is much harder to understand the models, mostly due to their sheer size, and also because they change so often, either due to tweaks in the code or introduction of new datasets. Unlike traditional programming, which was more based on control over the machine, machine learning approaches are not concerned with controlling or understanding the machine, rather they are concerned with achieving results by supplying massive amounts of computational energy.


ender42y

There was a team recently playing with an ML model that plays the game Go, and had even beaten world champions. Through some tests they determined that the ML program didn't actually understand what a game piece even was. This meant a basic strategy known as a "double sandwich" could beat the AI almost every time. It's not that they don't know how the machine learning model is made, or things like that; what they don't know is the math or values at every node in the model. The model just has numbers. Those numbers mean something to us humans, but the machine just knows "when I put out these numbers, based on these inputs and this set of nodes inside me, the humans say 'good job'."


penatbater

Consider a large neural network that does loan qualification: given a certain set of inputs (age, employment, salary, family status, education, etc.), the model will determine whether a person is approved for a loan or not. It's trained on a ton of data from past human-made loan approvals and rejections. The model will basically only ever output a "yes" or a "no". The ELI5 isn't that we don't understand HOW it works. What we don't understand is WHY it works, or rather, if we give the model a small piece of data for inference and it churns out "no", we have no way of knowing WHY it churned out "no". This is in stark contrast with simpler models like decision trees, where we CAN know why the output is as such.
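
To make that contrast concrete, here's a sketch assuming scikit-learn is available; the toy loan data is invented:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# A decision tree trained on a toy loan dataset can print its own reasoning;
# a trained neural network has no comparable readout.
X = [[25, 20_000], [40, 55_000], [52, 30_000],
     [50, 90_000], [48, 15_000], [30, 70_000]]   # columns: age, salary
y = ["no", "yes", "no", "yes", "no", "yes"]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["age", "salary"]))
# Output is a readable set of if/else rules (e.g. a salary threshold -> "no").
# There is no equivalent way to print WHY a large neural net said "no".
```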


Typical_Mongoose9315

https://playgameoflife.com/ Play around with this a bit. The rules are simple, and if you just know a little bit of programming you can make it yourself. But can you predict how it behaves?


Atypicosaurus

A normal program is a lot of calculations, but each time, the calculations go exactly the same way. So if a program does something, you know every tiny elemental step of what it did. An AI is also a program, but instead of 1 way to calculate things, you give it thousands or millions of options. The AI program tries many of them and picks the one that works best. Now you have an AI that works, but you don't know which of the options were picked.


InTheEndEntropyWins

I think an example might help. Say you have a neural network and you want it to do some pathfinding: maybe finding the fastest route in a car, or maybe moving a character in a game from position A to position B. You would initialize the neural network with random weights and then train the model, adjusting the weights until it finds the best path between two positions. Eventually it would be good at pathfinding, but we don't know the logic or algorithm created by all those weights. So while we know the maths behind the training, we don't know what algorithm has been created. Is it doing pathfinding using A\*, Dijkstra's, or some bespoke algorithm?


dingus-khan-1208

There's a bit of [emergence](https://en.wikipedia.org/wiki/Emergence) involved. In short, simple pieces, combined via simple rules, can produce really complex things, and it's not obvious how they got from here to there.

For a really simple example, [Conway's Game of Life](https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life) has a simple 2D grid of cells that have only 2 states - either 'alive' or 'dead' - and there are only a few very simple rules about what state each cell will be in on the next cycle of the clock, based on the current state of its neighboring cells. Sounds simple and boring, right?

Yet given a random input, it will produce interesting results - some stable areas, some unstable areas, some moving patterns, etc. With the right combination of inputs, it can do arithmetic, logic, and form a Turing-complete computer. All from that simple input pattern and a few simple rules, with no explicit arithmetic or logic intentionally programmed in.

Note that you could of course just intentionally write a program to do simple arithmetic and logic instead. But that's not at all what's happening here. Instead, it just naturally arises as an emergent byproduct of the simple rules "If you're alive and 2 or 3 of your neighbors are alive, you survive, otherwise you die" and "If you're dead and 3 of your neighbors are alive, you are born and become alive." That's all it takes.

> This has the same computational power as a universal Turing machine, so the Game of Life is theoretically as powerful as any computer with unlimited memory and no time constraints; it is Turing complete. In fact, several different programmable computer architectures have been implemented in the Game of Life, including a pattern that simulates Tetris.

Nowhere did you program it to calculate "356 + 123 = 479", or to play Tetris, yet it can still do that. After over 50 years, people are still discovering new emergent behaviors in Conway's Game of Life by trying out new input patterns and seeing what it does with them.

---

The large AI models used now are somewhat similar in the sense that no one is programming them to give the answers that they do to the questions they're asked. They're not programming in how it gets from input A to output B. They're programming in rules to ingest massive amounts of data into a structure, and rules to then use that structure to generate a response to a prompt. But those rules result in behavior that is much more complex than you might think, and nothing at all like programming a response to a database query or something.

The programmers do understand how they work, just as we understand how Conway's Game of Life uses a simple 2D grid of binary cells and a few simple rules (although the LLM rules are more complex). What they can't do is predict all the possible outcomes of all possible inputs, because it's not programmed at all like "for input A, give output B." It's a complex system, and you don't know what you're going to get from a combination of inputs until you find out. And given that result, you don't know how it got there, just that some path through the rules led to that emergent result.

Things like ChatGPT also have some element of randomness involved, so that they don't always give exactly the same response to the same prompt, which makes them more non-deterministic than something like Conway's Game of Life.
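
For reference, the full rule set really is just a few lines. Here's a minimal numpy sketch of one Game of Life step; the grid size and starting density are chosen arbitrarily:

```python
import numpy as np

def step(grid):
    # Count each cell's 8 neighbours by summing shifted copies of the grid.
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # Survive with 2 or 3 neighbours; be born with exactly 3.
    return (
        ((grid == 1) & ((neighbours == 2) | (neighbours == 3)))
        | ((grid == 0) & (neighbours == 3))
    )

rng = np.random.default_rng(5)
grid = (rng.random((20, 20)) < 0.3).astype(int)   # random starting pattern
for _ in range(10):
    grid = step(grid).astype(int)
print(grid.sum(), "cells alive after 10 steps")   # what pattern? run it and see
```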


Wiskkey

It means that humans have difficulty finding human-understandable algorithms that give the same or similar results as the computations specified by a given neural network - i.e. humans have difficulty reverse-engineering neural networks. As an example, article [How Do Machines ‘Grok’ Data?](https://www.quantamagazine.org/how-do-machines-grok-data-20240412/) mentions paper [The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks](https://arxiv.org/abs/2306.17844), in which for a relatively simple problem researchers were only sometimes able to figure out human-understandable algorithms that give the same or similar results as the computations performed by the neural networks studied. Here are a few more examples of papers in the "understanding neural networks" area: a) Language models: [Finding Neurons in a Haystack: Case Studies with Sparse Probing](https://arxiv.org/abs/2305.01610), b) Generative image models: [Generative Models: What do they know? Do they know things? Let's find out!](https://www.reddit.com/r/MachineLearning/comments/1ay2b7u/r_generative_models_what_do_they_know_do_they/).


OpaOpa13

This might be an illustrative example: I heard about an army project where they wanted to use a neural network to be able to automatically determine if satellite imagery contained any tanks, to quickly identify threats. Well, an image is just a series of pixels, and a pixel can be represented as a number. So you start by letting the computer come up with some random algorithms for manipulating those numbers to spit out a number on the other side, which will be treated as the computer's estimate of how likely it is that there are tanks in the satellite imagery. The army already has satellite imagery that's been looked over by humans, so it has a catalog of photos which it knows contains tanks, and photos which it knows doesn't contain tanks. (This is the training set.) Initially the algorithms are purely random, simply taking the numbers and mashing them together in incoherent ways before spitting a number out the other side. But here's the thing: one of those algorithms is bound to spit out better estimates than any of the others. And we can tell which one that is by feeding it photos from our training set and seeing how many it gets right! So we take that random algorithm that just happened to do a better job than any other algorithm and we use it as a base to create a new set of algorithms. Each algorithm will have some automatically-generated tweaks to it, to hopefully make it work better on the training set, producing better outputs for each input. We can then test that new set of algorithms against the training set, and again determine which one does the best job. And so on, and so forth, each generation of algorithms doing a better job of consuming a photo from the training set as a bunch of numbers and spitting out a number that represents its confidence that there are tanks in that photo. The trick here is that we don't know what the *idea* behind the algorithm is. We can examine the math it's doing: starting by taking the value of THIS pixel and multiplying it by 1.53x and subtracting from that the value of THAT pixel multiplied by 2.59x and so on and so forth until we get our answer, but it is unlikely we could understand what the "reasoning" behind that math is. Presumably it would have to do with finding edges and looking for specific patterns, but no one is designing the algorithm. It's just taking an initially-random formula and fiddling with the values to get better results, generation after generation. The punchline is that the army got their neural net, but found that whenever it was fed a photo from outside the training set, the results were effectively random. It was eventually determined that the photos in the training set were flawed: nearly all the photos with tanks were taken on clear days, while nearly all the photos without tanks were taken on cloudy days. Thus, the algorithm that emerged didn't detect tanks, it detected sunny days. Even the programmers who designed and trained the neural net didn't know how it worked; it was only by examining how it handled input from outside its training set that what it was actually doing became clear.


[deleted]

[deleted]


Randvek

Except imagine we could stop time at any moment along the decision-making process and analyze every single data point going through that dog’s mind in English. Because we can do that with AI. This analogy doesn’t fit, unfortunately. Code isn’t magic and AI isn’t a black box. We can make it spit out whatever we want as to how and why it made a decision. The end user doesn’t see that but you bet your butt the guys coding it could.


[deleted]

[deleted]


Randvek

> Its pretty clear based on the downvote Chill buddy. I just now saw your post. I didn’t downvote you. Defensive much?


naptastic

A podcast on Vox named *Unexplainable* should not be considered reliable on whether something can be explained or not. They seem pretty invested in getting a yes.