How to actually learn
Learning = (Consequential Decisions / unit time) x Quality of feedback loop
Everyone who has done anything meaningful in life tells young people some version of the same thing: at first, go where you'll learn the most. It is good advice. But it can also feel strangely useless, like telling someone lost in a city to "go where you need to be." Great. Which street do I take?
I have been chewing on this for a while, and I think the reason this advice falls flat is that people rarely define what a high-learning environment actually looks like in concrete terms. So let me try.
A concrete model of learning
Your rate of learning is roughly:
Rate of consequential decisions you make x Quality of feedback you get on those decisions
We care about learning because as you learn, you get better at things, and your life improves. Plenty of people smarter than me have written beautifully about why learning matters. What I want to do here is narrower and, I think, more useful: bridge the gap between the wise people who have already reached their destination and forgotten the specific turns they took, and the young person standing at the intersection with no idea which turn to take.
"Pursue learning" makes perfect sense in retrospect and is nearly worthless in advance. It is not actionable until you can point at a place, a job, a situation, and say: that one, specifically, will teach me more than the alternatives.
Think about how machines learn. Think about AlphaGo. What did it do? Two things, over and over. It made a decision (place a stone), and it got a signal back (did that help me win or not). A number. Clean, immediate, unambiguous. And it did this millions of times.
That is the whole trick. Lots of decisions, tight feedback. Do it fast enough and you discover patterns no one explicitly taught you.
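To see that trick stripped to its bones, here is a toy sketch in Python. It is a three-option bandit, not AlphaGo's actual algorithm, and the hidden win rates and exploration rate are made up for illustration. The point is the shape of the loop: decide, get a number back, update, repeat.

```python
import random

# Toy version of the decide -> feedback -> update loop.
# Three "moves" with hidden win rates the learner does not know.
hidden_win_rates = [0.2, 0.5, 0.8]
estimates = [0.0, 0.0, 0.0]   # what the learner currently believes
counts = [0, 0, 0]

for step in range(10_000):
    # Decision: usually pick the move that looks best, sometimes explore.
    if random.random() < 0.1:
        move = random.randrange(3)
    else:
        move = max(range(3), key=lambda m: estimates[m])

    # Feedback: a clean, immediate, unambiguous number.
    reward = 1.0 if random.random() < hidden_win_rates[move] else 0.0

    # Update: nudge the belief toward what the feedback said.
    counts[move] += 1
    estimates[move] += (reward - estimates[move]) / counts[move]

print(estimates)  # drifts toward the hidden win rates, with no one teaching it
```

Ten thousand decisions, ten thousand clean signals. That is a budget a human never gets.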
So why can we not just do that? Because we are not machines, in two important ways.
First, humans are slow. Physically and cognitively, we cannot crank through decisions at anything resembling the pace of an algorithm.
Second, and more importantly, the real world is noisy in ways Go is not. In a game like Go or chess, the feedback signal is clean. You won or lost, and there is a rating that tells you how well you played. In life, you make a decision and maybe three years later something happens that might have been caused by that decision, or by seventeen other things you cannot untangle. The signal is buried in noise.
And you cannot strip away the noise without stripping away the nuance that makes real-world judgment valuable. This is one reason general intelligence is hard for AI. The moment you reduce your feedback to a clean number, you throw away most of what matters about reality. You get something superhuman in a narrow corridor and helpless everywhere else.
So we cannot be algorithms. But we can move in that direction with two levers.
Lever 1: consequential decisions, faster
Not just any decisions. Consequential ones. This matters enormously, and I think most people miss it.
Your brain is lazy, productively lazy, but lazy. It will not spend the calories to deeply process a decision unless something is at stake. Think about food. Most people cruise on autopilot until something forces the issue: a health scare, a partner who cares about nutrition, or an internal shift where you suddenly decide this matters.
The key word is consequences. A decision that changes nothing teaches nothing, because your brain literally will not bother closing the loop. Why would it? There is nothing to update on. It is like trying to train a neural net with a loss function that always returns zero.
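To make that analogy literal, here is a minimal sketch in plain Python (the starting weight and learning rate are arbitrary). A loss that never moves produces a gradient of zero, so a thousand "updates" change nothing.

```python
# Toy illustration: gradient descent on a loss that always returns zero.
# The numerical gradient is zero everywhere, so the weight never moves.

def loss(weight):
    return 0.0  # nothing at stake, no matter what you decide

weight = 1.5
learning_rate = 0.1
eps = 1e-6

for _ in range(1000):
    gradient = (loss(weight + eps) - loss(weight - eps)) / (2 * eps)
    weight -= learning_rate * gradient  # gradient is 0.0, so this is a no-op

print(weight)  # still 1.5 after a thousand "updates"
```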
There is also a time dimension that makes this harder. The closer a consequence lands to the decision that caused it, the more powerfully you learn from it. Touch a hot stove and you learn instantly. Start saving for retirement at 25 and you may not feel the consequence for decades, which is one reason fewer people do it even though it is obviously correct.
If you make a decision and someone you love gets hurt because of it, you will never forget. That lesson burns itself into your brain whether you want it to or not.
If you make a decision whose consequences arrive after you are dead, you learn exactly nothing from it.
So the first piece looks something like:
Learning = Rate of decisions x (Magnitude of consequences / Time until those consequences hit)
You want environments where you make real calls (not observe other people making them, not run hypotheticals in your head) and where results come back fast and hit hard enough that your brain pays attention.
Lever 2: feedback loop
This is where it gets interesting, because this is where we leave the machine analogy behind.
AlphaGo gets a number. We cannot get a clean number in real life. The world has too many interconnected parts, too many externalities, too much noise. The Stoics understood this well: outcomes are often a lousy measure of decision quality because luck and circumstance contaminate them beyond recognition.
If we cannot grade ourselves with a number, what do we use? People.
People as grading functions
Specifically, people who are further along and have strong judgment. This is mentorship, yes, but I think most people miss why it works mechanically. It works because experienced people are sophisticated grading functions. They have internalized thousands of patterns across messy, high-dimensional situations. They can look at your decision, your reasoning, and sometimes even your instincts, and give you a signal that cuts through noisy outcomes.
The best mentors do not even need to wait for consequences to play out. They can evaluate your process directly. That is an incredible shortcut because it means you do not have to wait for the world to tell you whether you were right. You get feedback now.
If you go to the gym, a good personal trainer can tell you whether your form is right before your body changes. You do not have to wait six months to discover you have been doing the exercise wrong. The trainer is a faster feedback loop than the mirror.
Not all people are equally good at this. Here is my rough stack rank (more intuition than science):
- People who have reached the level you are aiming for and have helped others get there too. Gold standard.
- People who have helped others reach that level, even if they have not done it themselves. Great coaches are not always great players.
- People who reached that level but have never mentored anyone. They may struggle to articulate what they know.
- People who have not reached that level. Their feedback is mostly noise.
The better someone's judgment, the cleaner the signal they can give you. Simple and hard.
So if you want to maximize learning speed, you need two things at once: a place where you make real decisions at a high rate, and access to people who can quickly and accurately tell you what you got right and wrong without waiting for reality to run its slow, ambiguous experiment.
Lab A vs. Lab B
Back in November, I faced a decision between two AI labs.
Lab A was the obvious pick: strong brand, stable, great name on paper. I would be a new grad working on a narrow slice of something non-existential for the company. Smart people, kind people, solid environment. But I could already see the shape of it. Maybe three real decisions in six months, and long feedback loops softened by polite, vaguely directional comments.
Lab B looked scarier: less brand recognition, everything still being figured out, the kind of place where the org chart changes on a Tuesday and nobody blinks. But the team was world class, the scope was absurdly broad, and in a single week I made ten consequential calls and got feedback from a manager who could see through my reasoning like glass. Every conversation felt like compressing three months of learning into an hour.
Now do the math. Lab A compounds fewer than two sluggish decisions per quarter through a mediocre feedback loop. Lab B compounds ten decisions per week through a feedback loop that corrects errors in near real time. Run that for seven months and it is not close. Not 2x or 5x. It is a different universe of accumulated judgment.
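Here is that back-of-the-envelope math spelled out. The decision rates come from the two offers as I saw them; the feedback-quality weights are purely my own guesses, so treat this as a sketch of the model, not a measurement.

```python
# "Now do the math," roughly. Feedback weights are assumed, not measured.
MONTHS = 7
WEEKS = MONTHS * 4.33          # ~30 weeks
QUARTERS = MONTHS / 3          # ~2.3 quarters

lab_a_decisions = 2 * QUARTERS   # fewer than two real calls per quarter
lab_b_decisions = 10 * WEEKS     # ten consequential calls per week

lab_a_feedback = 0.3   # polite, vaguely directional comments (my guess)
lab_b_feedback = 0.9   # near real-time, high-signal corrections (my guess)

lab_a_learning = lab_a_decisions * lab_a_feedback
lab_b_learning = lab_b_decisions * lab_b_feedback

print(round(lab_b_learning / lab_a_learning))  # roughly 200x, not 2x or 5x
```

Swap in whatever feedback weights you like; the ratio stays lopsided because the decision counts alone differ by more than an order of magnitude.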
Yes, Lab B was riskier. It did not have the same brand or stability. But risk is not the same thing as recklessness. Many young people confuse those, pick the safer-sounding option, and call it wisdom. Often, that safe path is where you learn less, compound slower, and wake up years later with a clean resume and underdeveloped decision instincts.
Hierarchy of learning situations
There is a hierarchy of learning situations, and it maps almost perfectly to how much skin you have in the game:
- You make a decision and face consequences. Maximum learning. This is the sweet spot.
- You watch someone else decide and see their consequences. Dangerous territory, especially when you are young and inclined to defer. It feels like learning, but often you are absorbing conclusions without running the computation yourself.
- You make a decision but face no real consequences. Low learning. Simulations, hypotheticals, consulting, most classroom exercises. Your brain knows it is a game and does not fully engage.
- You neither decide nor face consequences. You learn nothing.
The uncomfortable truth is that the situations where you learn fastest are also the situations where you can get hurt. That is not a coincidence.
Find the room
So the next time someone tells you to go where you will learn the most, you can actually evaluate it.
Look at the place. Count the decisions you would make that carry real weight. Ask how quickly consequences land. Then look at the people. Are they the kind who can read your thinking and give you a signal clean enough to update on, or the kind who nod and say "looks good" while you drift? Multiply those factors and compound over months. The gap grows fast.
Two people can start in the same place and end up in different universes of judgment, not because one is smarter, but because one was in the right room.
Learning = (Number of consequential decisions / unit time) x (Magnitude of consequences / Time until those consequences hit) x Quality of feedback loop
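If it helps, the whole model condenses into one small function. The units are arbitrary and every input is a subjective estimate; only the comparison between two environments means anything. The numbers below are illustrative, not the ones from my November decision.

```python
def learning_rate(decisions_per_week, consequence_magnitude,
                  weeks_until_consequences, feedback_quality):
    # The model above, written as code. Rough, subjective inputs;
    # only ratios between environments are meaningful.
    return (decisions_per_week
            * (consequence_magnitude / weeks_until_consequences)
            * feedback_quality)

# Same person, two rooms (all numbers made up for illustration).
quiet_room = learning_rate(0.2, 3, 12, 0.3)
live_room = learning_rate(10, 3, 1, 0.9)
print(live_room / quiet_room)  # the gap, before it even starts compounding
```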
Stack ranking for decisions (best to worst):
- You make the decision and face the consequences.
- You watch someone else's decision and see the consequences.
- You make a decision but face no real consequences.
- You do not make the decision and do not face consequences.
Stack ranking for graders (best to worst):
- People who reached the level you want and helped others reach it too.
- People who helped others reach that level, even if they have not gotten there themselves.
- People who reached that level but never guided anyone.
- People who have not reached that level at all.