
AI Readiness Is a Power Law

Why the people who are best with AI never took a prompt-engineering class.

Figure: a normal distribution (most cluster near the average) vs. a power law (a few pull far ahead).

The best AI users you know did not take a class for it. No one gave them a prompt-engineering certificate. They opened the same chat box as everyone else and got far more out of it. Many other smart, motivated people tried the same tools and came away with average results.

The usual explanation is that this is a training gap, and that skill with AI is spread the way most human traits are: on a bell curve, where a short course lifts everyone a little. If that were true, the fix would be simple. Teach the prompts, move the average.

Skill with AI does not follow a bell curve. A few people get exceptional results, most get ordinary ones, and the gap between them is widening rather than closing. That is the signature of a power law. This piece argues that AI effectiveness is a power law because the skills behind it multiply rather than add, so a weakness in any one skill drags the whole result down. It is a synthesis of public sources, not Happily first-party data. The three figures are illustrative models, labeled as such; every cited number comes from a named public source in the methodology box and the references.

Miss any one of the four skills, and the value you get from AI falls toward zero, however strong the other three.

Why this matters

If you hire, train, or plan around AI, the shape of the distribution decides the strategy. If readiness were additive, you would give everyone a prompt course and move on. If it is multiplicative, that course does nothing for the person whose weak skill is judgment. The work is to find the one skill that is holding someone back and raise it.

Sources and method
What this is: A synthesis of public sources, not Happily first-party data. There is no internal "n=" sample.
Sources: World Inequality Database; Kremer (1993); O'Boyle & Aguinis (2012); Dell'Acqua et al. (2023); Brynjolfsson, Li & Raymond (2023); Kruger & Dunning (1999).
The four skills: Critical thinking, experience and judgment, communication, and domain expertise. The multiplicative core, not the whole space.
Illustrative figures: Figures 1 to 3 are schematic models, not measurements, and are labeled on the chart. The long-tail shape rests on O'Boyle & Aguinis (2012).
Claim discipline: The cited studies are observational or single field experiments. The piece argues a mechanism the evidence is consistent with, not a controlled result.

When a bell curve is the right model

A bell curve appears when an outcome is the sum of many small, independent effects. This is the central limit theorem, and adult height is the clearest case: hundreds of genetic and environmental contributions add up, and the result lands in a narrow, symmetric band around the average. IQ-test scores look normal in part because the tests are built and scored to make them so.

The defining feature of an additive trait is that the top and the middle are close. The tallest adult is not ten times the height of the median person. If skill with AI worked this way, the best user would be a little ahead of the average, and a prompt course really would lift the whole curve at once.
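The additive case can be checked with a small simulation. This is a minimal sketch, not a model of any real trait: the function names, the 200 effects per person, and the uniform distribution of each effect are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_additive_trait(n_people=10_000, n_effects=200, seed=0):
    """Each person's trait is the SUM of many small independent effects.

    The effect count and the uniform(0, 1) effects are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() for _ in range(n_effects))
        for _ in range(n_people)
    ]


def skewness(values):
    """Sample skewness: close to 0 when the distribution is symmetric."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


traits = simulate_additive_trait()
median = statistics.median(traits)

print(f"skewness:     {skewness(traits):+.3f}")
print(f"max / median: {max(traits) / median:.2f}")
```

In a run like this, the skewness should sit near zero and the largest value should be only modestly above the median, which is the "top and middle are close" property described above.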

When the curve grows a tail

Other outcomes are not a sum of small parts but a product of factors that all have to line up. When factors multiply, the distribution stops being symmetric and grows a long tail to the right: most cases stay small, and a few run away.
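The same kind of simulation shows what changes when factors multiply. This is a rough sketch under stated assumptions: eight factors per person, each drawn uniformly between 0.1 and 2.0, with hypothetical function names. Strictly, a product of independent factors tends toward a log-normal rather than an exact power law; the sketch only illustrates the feature the argument relies on, a long right tail.

```python
import math
import random
import statistics


def simulate_multiplicative_outcome(n_people=10_000, n_factors=8, seed=0):
    """Each person's outcome is the PRODUCT of several independent factors.

    The factor count and the uniform(0.1, 2.0) range are illustrative assumptions.
    """
    rng = random.Random(seed)
    return [
        math.prod(rng.uniform(0.1, 2.0) for _ in range(n_factors))
        for _ in range(n_people)
    ]


def top_share(values, fraction=0.10):
    """Share of the total held by the top `fraction` of people."""
    ranked = sorted(values, reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    return sum(ranked[:cutoff]) / sum(ranked)


outcomes = simulate_multiplicative_outcome()

print(f"mean:           {statistics.fmean(outcomes):.3f}")
print(f"median:         {statistics.median(outcomes):.3f}")
print(f"top 10% share:  {top_share(outcomes):.1%}")
```

Here the mean lands well above the median and a small group holds a large share of the total, the opposite of the additive case.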

The everyday examples are easy to spot once you look for them, and they split cleanly into two families.

Two families of distribution
Adds up → bell curve | Multiplies → power law
Adult height | Income and wealth
Body weight, blood pressure | City populations (Zipf's law)
IQ-test and exam scores | Word frequency in language (Zipf's law)
Shoe size | Book, song, and film sales (the long tail)
Daily temperature | Followers, website traffic, earthquake size

Income is the textbook power law. In the United States, the top 10% of earners took about 47% of pre-tax national income in 2023, up from 34% in 1980 (World Inequality Database). Earnings depend on a long chain of factors that compound, including education, network, location, timing, and capital. The tell is the distance from the middle to the top: even the tallest adult on record was less than twice the median height, while the highest earners make thousands of times the median wage.

Economists model why a chain like this produces a tail. In Michael Kremer's O-ring theory (1993), production is a set of tasks that each have to be done well, so their qualities multiply. One botched task, like the single failed O-ring that destroyed the Challenger, ruins the value of all the others. In that model, output rises steeply with skill, and small gaps in ability open into large gaps in outcome.
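A stripped-down version of that logic can be sketched in a few lines. This keeps only the multiplicative core (output as the product of task qualities) and leaves out the other terms in Kremer's production function; the ten tasks and the quality values are hypothetical numbers chosen for illustration.

```python
import math


def oring_output(task_qualities):
    """Output as the product of task qualities, each between 0 and 1.

    A simplification: only the multiplicative core of the O-ring idea.
    """
    return math.prod(task_qualities)


N_TASKS = 10  # hypothetical chain length

strong_team = [0.95] * N_TASKS                 # every task done at 0.95
weaker_team = [0.85] * N_TASKS                 # every task done at 0.85
one_botched = [0.95] * (N_TASKS - 1) + [0.05]  # one task nearly fails

for name, team in [
    ("strong team", strong_team),
    ("weaker team", weaker_team),
    ("one botched task", one_botched),
]:
    print(f"{name:<17} output = {oring_output(team):.3f}")
```

A gap of 0.10 in per-task quality opens into roughly a threefold gap in output across ten tasks, and a single botched task erases most of the value of nine well-done ones.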

Human performance already behaves this way. O'Boyle and Aguinis (2012) studied 633,263 people across 198 samples of researchers, entertainers, politicians, and athletes, and found that individual performance fit a power-law distribution better than a normal one, most clearly in complex work. The bell curve is the special case, not the default.

If a skill multiplies, a few people pull far ahead
Figure 1 Illustrative. An additive skill forms a bell curve clustered at the average. A multiplicative skill forms a long tail: most people bunch lower and a few pull far ahead. Axes: share of people against effectiveness with AI per person, from lower to higher. Long-tail basis: O'Boyle & Aguinis (2012), individual performance is Paretian, not normal.

AI effectiveness multiplies four skills

Working well with AI is not one skill but at least four, and they combine like factors in a product rather than terms in a sum: critical thinking, experience and judgment, communication, and domain expertise.

AI readiness, written as a formula
AI effectiveness = Critical thinking × Judgment × Communication × Domain expertise
Figure 2 Illustrative. AI readiness modeled as the product of four composite skills, each scored 0 to 1. A near-zero in any one factor pulls the whole product toward zero.

Each factor does a specific job:

  • Communication. State the problem, the constraints, and what a good answer looks like. If you cannot put the task into words, the model has nothing to work with.
  • Domain expertise. Know the field well enough to ask for the right thing and to recognize an answer that looks right but is not.
  • Critical thinking. Reason about cause and effect as you go, so each round of edits moves toward the answer instead of away from it.
  • Experience and judgment. Know when an answer is good enough to ship and when it is only plausible.

The reason this is a power law is that the four factors multiply. Score each from 0 to 1, and the value you get is closer to their product than their average. A near-zero in any one factor pulls the whole product toward zero, no matter how strong the other three are. A clear communicator with deep domain knowledge and no critical thinking will iterate with confidence in the wrong direction, and the value of the tool falls away.
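The pull toward zero can be written as a one-line bound. The symbols below are labels introduced here for illustration (s_1 to s_4 for the four skill scores, E for effectiveness, and \bar{s} for their average); they are not notation from the cited studies.

```latex
% Each skill score lies between 0 and 1, so multiplying by the
% other three scores can only shrink the weakest one.
E = s_1 \, s_2 \, s_3 \, s_4, \qquad 0 \le s_i \le 1
\quad\Longrightarrow\quad
E \;\le\; \min_i s_i \;\le\; \bar{s} = \tfrac{1}{4}\sum_{i=1}^{4} s_i .

% Worked example with one weak factor:
% s = (0.9,\ 0.9,\ 0.9,\ 0.05)
E = 0.9^3 \times 0.05 = 0.03645, \qquad \bar{s} = 0.6875 .
```

Under this model, effectiveness can never exceed the weakest score, while the average can stay high no matter how low one score falls.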

This is not only arithmetic. In one field experiment, on a task set beyond AI's frontier, consultants who used AI were 19 percentage points less likely to be correct than those who worked without it (Dell'Acqua et al., 2023). The tool did not help them fail less. It helped them fail more, because they could not tell when to stop trusting it.

One weak skill, and most of the value is gone

Because the factors multiply, the damage from a weak skill compounds. Lose one and the result falls hard; lose two and it nearly disappears. A simple average hides this, because averaging treats the skills as if they add. Figure 3 shows the same four profiles two ways: the average of the skills, and their product.

Lose one factor, and most of the value goes
Profile | Skill scores | Average of the four skills | Effectiveness when they multiply
All four strong | 0.9 · 0.9 · 0.9 · 0.9 | 90% | 66%
Missing one | 0.9 · 0.9 · 0.9 · 0.2 | 73% | 15%
Missing two | 0.9 · 0.9 · 0.2 · 0.2 | 55% | 3%
Missing three | 0.9 · 0.2 · 0.2 · 0.2 | 38% | 1%
One weak link drops 66% to 15%.
Figure 3 Illustrative. Four skills scored 0 to 1: their average vs. their product. As more factors weaken, the product collapses far faster than the average: 66% to 15% to 3% to 1%. "Missing" sets a factor to 0.2.

The averages decline gently, from 90% to 38%. The products fall off a cliff, from 66% to 1%. A person who is strong on three skills and weak on one still averages a respectable 73%, yet performs at 15%. Judge readiness by the average and that person looks ready. The math says they are not.
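The Figure 3 numbers are simple to reproduce. The sketch below uses the same 0.9 and 0.2 values as the figure; the variable and function names are assumptions introduced for the example.

```python
import math
import statistics

STRONG, WEAK = 0.9, 0.2  # the "strong" and "missing" scores used in Figure 3

profiles = {
    "All four strong": [STRONG] * 4,
    "Missing one": [STRONG] * 3 + [WEAK],
    "Missing two": [STRONG] * 2 + [WEAK] * 2,
    "Missing three": [STRONG] + [WEAK] * 3,
}


def summarize(scores):
    """Return (average, product) for one profile of skill scores."""
    return statistics.fmean(scores), math.prod(scores)


for name, scores in profiles.items():
    average, product = summarize(scores)
    print(f"{name:<16} average = {average:.3f}   product = {product:.4f}")
```

Rounded to whole percentages, the output matches the figure: the averages step down slowly while the products drop by a large factor with each additional weak skill.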

The same idea is easier to see in people than in percentages. Each missing factor shows up as a specific failure with AI.

What each missing skill looks like with AI
Profile | Weak skill | What happens with AI
The complete operator | None | Frames the task, checks the output, and ships expert-level work fast
The fluent novice | Domain expertise | Accepts answers that sound authoritative but are wrong
The expert who won't check | Judgment | Ships fluent work that is subtly wrong
The vague specialist | Communication | Never gets the model to the real question
The confident guesser | Critical thinking | Iterates in the wrong direction; errors compound
The one-trick contributor | Three of four | A single strength cannot carry the rest

Where AI narrows the gap instead

The pattern is not universal, and the clearest counter-evidence is worth stating. Brynjolfsson, Li, and Raymond (2023) studied 5,179 customer-support agents given an AI assistant. Productivity rose 14% on average, but the gain was 34% for novices and close to nothing for the most experienced agents. Here, AI narrowed the skill gap rather than widening it.

The difference is the kind of task. Customer support runs on bounded, scripted problems with a known good answer, and the assistant hands novices the experts' playbook. The power law appears where the work is open-ended, where there is no script and the person has to decide what a good answer even is. That is where the four skills stop being helpful extras and start to multiply.

What two AI field experiments found
Study | Setting | Result
Dell'Acqua et al. (2023) | Consulting tasks inside AI's frontier | 12.2% more tasks completed, 25.1% faster, 40%+ higher quality
Dell'Acqua et al. (2023) | One task beyond AI's frontier | 19 percentage points less likely to be correct
Brynjolfsson et al. (2023) | Customer support, all agents | +14% issues resolved per hour
Brynjolfsson et al. (2023) | Customer support, novices | +34%
Brynjolfsson et al. (2023) | Customer support, most experienced | ~no change

Why the gap is hard to see

The hardest part is that the missing skill is usually invisible to the person missing it. Judging an AI's answer takes judgment and domain knowledge, so the people most likely to accept a weak answer are the least equipped to notice it is weak. Kruger and Dunning (1999) described this kind of metacognitive trap: low skill in a field comes with low ability to recognize low skill. The size of their original effect is debated, but the mechanism is intuitive.

AI tightens the trap, because its output is fluent and confident by default. A wrong answer still reads well. If judgment or domain knowledge is the weak factor, the tool does not feel like it is failing. It feels like it is working. That is how capable people plateau without noticing.

What this means

Two things follow. First, you cannot fix a multiplicative system by adding to a factor that is already high. More prompt tricks do nothing for the person whose limiting factor is judgment. The return comes from finding the lowest factor and raising it, which is the opposite of running one prompt course for everyone.
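The first point can be made concrete with the same product model. This is a toy sketch: the profile scores, the step size of 0.1, and the function names are hypothetical. It compares the gain from raising each factor by the same amount.

```python
import math


def effectiveness(skills):
    """Product of the skill scores, each between 0 and 1 (the illustrative model)."""
    return math.prod(skills.values())


def gain_from_raising(skills, factor, step=0.1):
    """Change in effectiveness if one factor rises by `step`, capped at 1.0."""
    raised = dict(skills)
    raised[factor] = min(1.0, raised[factor] + step)
    return effectiveness(raised) - effectiveness(skills)


# Hypothetical profile: strong on three skills, weak on judgment.
profile = {
    "communication": 0.9,
    "domain_expertise": 0.9,
    "critical_thinking": 0.9,
    "judgment": 0.2,
}

gains = {factor: gain_from_raising(profile, factor) for factor in profile}
best_investment = max(gains, key=gains.get)

for factor, gain in gains.items():
    print(f"{factor:<18} +{gain:.4f}")
print(f"largest return: {best_investment}")
```

In this toy profile, the same step applied to the weakest factor returns several times more than it does when applied to any factor that is already high.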

Second, the four skills are general-purpose. Critical thinking, judgment, communication, and domain expertise pay off with or without AI in the workflow. Developing them makes someone better with the tool, better at the job without it, and a faster learner. The investment does not depend on any particular model.

Reading the symptom back to the weak factor
What you see | Likely weak factor | Where to invest
Fluent output that is often subtly wrong, and shipped anyway | Judgment | Calibration: review past calls against what actually happened
Long back-and-forth that never converges | Critical thinking | Reasoning about cause and effect; stating a hypothesis before iterating
Vague prompts and frequent "that is not what I meant" | Communication | Writing the problem, constraints, and success criteria first
Plausible answers accepted in an unfamiliar area | Domain expertise | Depth in the actual field, not more tool training

Evaluate the conversation, not the result

This points to a better way to assess AI readiness. A finished result hides the process that got it there. A polished deliverable can come from luck or from copying without understanding; a rough one can come from sound reasoning that ran out of time. The skills, or the lack of them, are revealed in the conversation that produces the result.

Watch how someone works a problem through a chat interface: how they frame it, which assumptions they test, where they push back and where they accept. The weak factor shows itself: the communicator who cannot specify, the expert who never checks, the careful reasoner who lacks the domain knowledge to know what to check.

That is the principle behind MyCulture.ai, the AI-readiness assessment we built alongside Happily. It puts a person in a chat with a real task, simulates what can go right and wrong, and scores the conversation rather than the answer, across communication, critical thinking, judgment, and domain expertise. See who's ready and who's not, and develop the skills that are missing.

Limitations

  • This is a synthesis of public sources plus illustrative models. It is not a Happily measurement, and Figures 1 to 3 are schematic, not data.
  • The cited studies are observational or single field experiments. They are consistent with the multiplicative account; they do not prove that raising these four skills causes better AI outcomes.
  • Four skills is a deliberate simplification. How people delegate and iterate matters too. Four is the memorable core, not the whole space.
  • The public figures span different countries, fields, and years. They illustrate distribution shapes and are not combined into one number.

References

  1. World Inequality Database (WID.world), via Our World in Data. Income share of the richest 10% (before tax). Accessed 2026.
  2. Kremer, M. (1993). The O-Ring Theory of Economic Development. Quarterly Journal of Economics, 108(3), 551-575.
  3. O'Boyle, E., & Aguinis, H. (2012). The Best and the Rest: Revisiting the Norm of Normality of Individual Performance. Personnel Psychology, 65(1), 79-119.
  4. Dell'Acqua, F., et al. (2023). Navigating the Jagged Technological Frontier. Harvard Business School Working Paper 24-013; published in Organization Science (2025).
  5. Brynjolfsson, E., Li, D., & Raymond, L. (2023). Generative AI at Work. NBER Working Paper 31161.
  6. Kruger, J., & Dunning, D. (1999). Unskilled and Unaware of It. Journal of Personality and Social Psychology, 77(6), 1121-1134.

Find the factor that is near zero.

AI readiness is not a prompt course. It is a composite of skills that multiply, and the fastest way to raise it is to see which one is lowest. MyCulture.ai scores how a person works a problem, not just what they produce.
