Response time is a red herring: reply quality beats reply speed
"Respond within 24 hours" is the most common feedback rule managers are given. The data does not support it. Across 542 managers, faster replies do not predict better outcomes — and the fastest responders post the worst scores, because the same-day bucket is dominated by checkbox replies.
Managers are coached to reply fast. Feedback tools surface response-time leaderboards, and "close the loop within 24 hours" has become a default best practice. The assumption underneath it is a dose-response curve: the faster the reply, the better the outcome. We tested that assumption directly, and it does not hold.
We analyzed 542 managers across 100 companies over a 365-day window, linking how managers respond to employee feedback on Happily to three team outcomes: text feedback rate, checkin completion rate, and DEBI, a 0–100 team engagement index. Managers were grouped into five response buckets — same-day, 1–3 days, 4–7 days, 7+ days, and never — and we measured reply rate and reply quality alongside timing.
Two things were clear. Responding at all matters: never-responders lead teams scoring 14.7 DEBI points lower than responders (Cohen's d = 0.599, medium). But among managers who respond, the specific timing is irrelevant. The 1–3 day, 4–7 day, and 7+ day buckets all cluster at DEBI 46–51. There is no dose-response curve.
If your feedback program rewards speed, it is optimizing the wrong variable. A 24-hour SLA pushes managers toward exactly the behavior that correlates with the worst outcomes: fast, low-effort, checkbox replies. The actionable rule is "write something real," not "reply before end of day."
How we measured response behavior
When a manager receives feedback from their team through Happily, the system records when they reply and what they write. We used that to build three predictors and assign each manager to a response-time bucket based on their average response across the year.
The first predictor is response time — the bucket itself: same-day (under one day), 1–3 days, 4–7 days, 7+ days, or never. The second is reply rate: the share of feedback items a manager actually replied to. The third is reply quality: an algorithmic score of how substantive the reply content is, binned into low, medium, and high tiers.
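The bucket assignment can be sketched in a few lines. This is a minimal illustration, not the study's code: it assumes reply delays are stored in days and that the cut points fall at 1, 3, and 7 days. The function name `assign_bucket` and the exact bucket edges are assumptions; the text gives only the bucket labels and says managers are classified on their average.

```python
from statistics import mean


def assign_bucket(response_days):
    """Map a manager's reply delays (in days) to one response-time bucket.

    An empty list means the manager never replied to anything.
    """
    if not response_days:
        return "never"
    avg = mean(response_days)
    # Assumed cut points: the source names the buckets but not their exact edges.
    if avg < 1:
        return "same-day"
    if avg <= 3:
        return "1-3 days"
    if avg <= 7:
        return "4-7 days"
    return "7+ days"


# Hypothetical manager: two replies within hours, one after twelve days.
print(assign_bucket([0.2, 0.5, 12.0]))  # 4-7 days
```

The sample manager answers most items within hours yet lands in the 4-7 day bucket, so bucket membership describes a manager's average, not any single reply.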
Outcomes were measured at the team level. Text rate and checkin rate are participation measures — what share of a manager's team submits written feedback or completes checkins. DEBI is a composite engagement index on a 0–100 scale. We also manually coded 190 replies to read the qualitative texture behind the numbers.
Cohen's d benchmarks throughout: small ≥ 0.2, medium ≥ 0.5, large ≥ 0.8. DEBI is not available for every manager, so each bucket reports an n_DEBI alongside its full count.
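For readers who want to reproduce the effect-size labels, here is a minimal sketch of Cohen's d with the benchmarks above. It assumes the pooled-standard-deviation form for two independent groups; the source does not say which variant was used, and the function names are illustrative.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * variance(group_a) + (nb - 1) * variance(group_b)
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def effect_label(d):
    """Apply the benchmarks: small >= 0.2, medium >= 0.5, large >= 0.8."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Toy data only: both groups have sample variance 4, and the means differ by 1.
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
print(effect_label(0.599))             # medium
```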
Finding 1 — Never-responding is the only clear timing harm
Managers who never reply to any feedback lead measurably less engaged teams than managers who reply, taken as a group.
| Metric | Responders (n=454) | Never (n=88) | Difference | Cohen's d | Size |
|---|---|---|---|---|---|
| Text rate | 34.7% | 33.7% | +1.0 pp | 0.061 | Negligible |
| Checkin rate | 36.2% | 29.6% | +6.6 pp | 0.403 | Small |
| DEBI | 45.7 | 31.0 | +14.7 pts | 0.599 | Medium |
The DEBI gap is a medium effect, and the checkin-rate gap is small but real: teams whose managers never reply are less likely to complete checkins themselves. The text-rate gap, by contrast, is negligible — whether a manager replies has almost no bearing on whether employees write text feedback. Text submission appears more intrinsically motivated; checkin completion and overall engagement track manager behavior more closely.
This is observational. Managers who never respond may also be less engaged, less skilled, or leading teams that were already struggling. Never-responding may be a symptom rather than a cause. The signal is consistent and worth acting on, but it is not proof.
Finding 2 — Timing shows no dose-response pattern
If speed drove outcomes, the responding buckets would line up on a curve: same-day best, then 1–3 days, then 4–7, then 7+ days worst. They do not. Among managers who respond, the 1–3 day, 4–7 day, and 7+ day buckets all sit at DEBI 46–51. The same-day bucket sits far below them.
| Bucket | n | Text rate | Checkin rate | DEBI | n_DEBI | Reply rate | Reply quality |
|---|---|---|---|---|---|---|---|
| same-day | 91 | 29.95% | 32.47% | 30.43 | 47 | 46.93% | 71.94 |
| 1–3 days | 100 | 37.22% | 39.58% | 45.97 | 68 | 75.33% | 73.28 |
| 4–7 days | 45 | 38.00% | 37.79% | 51.27 | 29 | 68.26% | 68.71 |
| 7+ days | 218 | 34.94% | 35.86% | 48.62 | 186 | 45.95% | 65.39 |
| never | 88 | 33.73% | 29.62% | 31.01 | 65 | 0% | n/a |
The 4–7 day bucket posts the highest raw DEBI (51.27). The 7+ day bucket — the slowest responders — still scores 48.62 and carries the largest, most reliable DEBI sample (n_DEBI=186). The slowest responders do not lag the faster ones. There is no timing window that wins. Any non-same-day response is associated with substantially higher DEBI than same-day or never.
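One simple way to check for a dose-response pattern is to test whether DEBI falls at every step from the fastest bucket to the slowest. The sketch below uses the bucket means from the table above; the helper name is illustrative, and the check compares bucket means only, without any significance test.

```python
# Mean DEBI per responding bucket, ordered fastest to slowest (from the table above).
debi_by_bucket = [
    ("same-day", 30.43),
    ("1-3 days", 45.97),
    ("4-7 days", 51.27),
    ("7+ days", 48.62),
]


def is_monotonic_decreasing(values):
    """True only if every value is at least as large as the one after it."""
    return all(earlier >= later for earlier, later in zip(values, values[1:]))


scores = [debi for _, debi in debi_by_bucket]
# Change in DEBI at each step toward slower replies.
steps = [round(later - earlier, 2) for earlier, later in zip(scores, scores[1:])]

print(is_monotonic_decreasing(scores))  # False
print(steps)
```

A speed-driven curve would show a negative step everywhere. Here the first two steps are positive and only the last is negative, which is the opposite of what a 24-hour rule predicts.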
The same-day versus 1–3 day comparison alone is a medium effect: Cohen's d = 0.605, with the 1–3 day group scoring 15.5 DEBI points higher. If you read that gap as "speed hurts," you would conclude managers should slow down. That reading is wrong. The mechanism is not the clock.
Why same-day underperforms: the checkbox reply problem
Same-day responders post the lowest DEBI of any responding group (30.43). Three patterns in the data explain why, and none of them is about speed.
Same-day responders have one of the lowest reply rates among responders. They reply to only 46.93% of their feedback, against 75.33% for the 1–3 day group and 68.26% for the 4–7 day group; only the 7+ day group is marginally lower, at 45.95%. Many same-day managers are selective repliers: they fire off a quick note on the easy items and ignore the rest.
The bucket is bimodal. Manual review of replies found two distinct populations inside the same-day group — genuinely engaged fast responders, and checkbox responders who dash off "Thanks!" or "Noted" without substantive engagement. The checkbox group pulls the average down.
Quality score alone misses it. Same-day reply quality (71.94) is close to the 1–3 day score (73.28). But the quality metric measures the depth of the replies a manager does write. It cannot see the pattern of replying to a few items dismissively while leaving the rest untouched. That pattern is what separates same-day from the rest.
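The blind spot is easy to see with two made-up managers. In this sketch the quality scores, item counts, and the `summarize` helper are all hypothetical; the only assumption carried over from the text is that quality is averaged over the replies a manager actually wrote.

```python
from statistics import mean


def summarize(reply_scores, items_received):
    """Summarize one manager.

    reply_scores holds one hypothetical quality score per reply actually written;
    items that got no reply contribute nothing to mean_quality.
    """
    return {
        "reply_rate": len(reply_scores) / items_received,
        "mean_quality": mean(reply_scores),
    }


# Both hypothetical managers received 10 items and write replies scored 72.
selective = summarize([72] * 5, items_received=10)  # answers half the items
thorough = summarize([72] * 9, items_received=10)   # answers nearly all of them

print(selective)
print(thorough)
```

Both managers report the same mean quality, and only the reply rate separates them. A quality score has to be read next to coverage to catch selective replying.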
The lever is "write something real," not "wait before replying." A same-day manager who writes thoughtful, substantive replies to all their feedback would likely score in line with the other responding buckets. The problem is not speed; it is the checkbox behavior that happens to correlate with speed.
Finding 3 — Reply quality moves participation, within every bucket
Within each time bucket, high-quality repliers see higher text rates than low-quality repliers. In the 1–3 day, 4–7 day, and 7+ day buckets the gap is roughly 8–15 percentage points. In the same-day bucket it is only 3.4 points, and the medium tier (32.19%) outscores the high tier (29.48%).
Text rate by reply quality and response-time bucket:

| Quality | same-day | 1–3 days | 4–7 days | 7+ days |
|---|---|---|---|---|
| Low | 26.06% | 32.03% | 32.81% | 30.91% |
| Medium | 32.19% | 36.58% | 33.92% | 36.51% |
| High | 29.48% | 41.14% | 48.15% | 38.80% |

DEBI by reply quality and response-time bucket (n_DEBI in parentheses):

| Quality | same-day | 1–3 days | 4–7 days | 7+ days |
|---|---|---|---|---|
| Low | 28.50 (15) | 41.85 (17) | 45.45 (8) | 48.26 (75) |
| Medium | 35.84 (19) | 40.90 (26) | 45.72 (10) | 45.14 (69) |
| High | 24.76 (13) | 54.05 (25) | 60.55 (11) | 55.23 (41) |
For DEBI, the quality effect is strong in the 1–3 day and 4–7 day buckets but inverted in same-day, where the high-quality cell posts the lowest DEBI of any cell (24.76). That cell holds only 13 managers with DEBI scores and should be read cautiously — it may reflect a handful of highly engaged managers in already-struggling teams, or simply noise.
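The high-minus-low gaps can be read straight off the two tables above. The sketch below just does the subtraction per bucket; the dictionary layout and the `gaps` helper are illustrative.

```python
# (low-quality value, high-quality value) per bucket, copied from the tables above.
text_rate = {
    "same-day": (26.06, 29.48),
    "1-3 days": (32.03, 41.14),
    "4-7 days": (32.81, 48.15),
    "7+ days": (30.91, 38.80),
}
debi = {
    "same-day": (28.50, 24.76),
    "1-3 days": (41.85, 54.05),
    "4-7 days": (45.45, 60.55),
    "7+ days": (48.26, 55.23),
}


def gaps(table):
    """High-quality minus low-quality value for each bucket, rounded to 2 places."""
    return {bucket: round(high - low, 2) for bucket, (low, high) in table.items()}


print(gaps(text_rate))  # text-rate gaps in percentage points
print(gaps(debi))       # DEBI gaps in index points
```

The same-day bucket is the outlier on both measures: its text-rate gap is the smallest of the four, and its DEBI gap is the only negative one.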
Finding 4 — Quality amplifies the effect of timing
The largest effects in the study come from the combination of timing and quality, not from either one alone.
The best-versus-worst combination is a large effect. Comparing 1–3 days + high quality against same-day + low quality, DEBI runs 54.05 versus 28.50 — a 25.6-point gap, Cohen's d = 0.988. On text rate the same comparison runs 41.14% versus 26.06% (d = 0.854, also large).
| Time × Quality | n | Text rate | Checkin rate | DEBI | n_DEBI |
|---|---|---|---|---|---|
| same-day + low | 20 | 26.06% | 30.19% | 28.50 | 15 |
| same-day + medium | 41 | 32.19% | 35.16% | 35.84 | 19 |
| same-day + high | 30 | 29.48% | 30.33% | 24.76 | 13 |
| 1–3 days + low | 23 | 32.03% | 38.47% | 41.85 | 17 |
| 1–3 days + medium | 40 | 36.58% | 39.69% | 40.90 | 26 |
| 1–3 days + high | 37 | 41.14% | 40.13% | 54.05 | 25 |
| 4–7 days + low | 14 | 32.81% | 37.16% | 45.45 | 8 |
| 4–7 days + medium | 17 | 33.92% | 34.43% | 45.72 | 10 |
| 4–7 days + high | 14 | 48.15% | 42.49% | 60.55 | 11 |
| 7+ days + low | 83 | 30.91% | 34.99% | 48.26 | 75 |
| 7+ days + medium | 79 | 36.51% | 34.00% | 45.14 | 69 |
| 7+ days + high | 55 | 38.80% | 39.97% | 55.23 | 41 |
Within the 1–3 day bucket, moving from low to high quality adds 12.2 DEBI points. The highest cell overall is 4–7 days + high quality at DEBI 60.55, but with only 14 managers (n_DEBI=11) it is unreliable. The most trustworthy "best" cell is 1–3 days + high quality: 37 managers, n_DEBI=25, DEBI 54.05. Quality amplifies timing; within same-day, where checkbox behavior dominates, that amplification breaks down.
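As a consistency check, the bucket-level DEBI figures should be recoverable from the cell table by weighting each cell's DEBI by its n_DEBI. The sketch below does that with the published numbers; the `weighted_debi` helper is illustrative, and the check assumes bucket DEBI is a simple manager-weighted mean.

```python
# (DEBI, n_DEBI) for the low, medium, and high quality cells of each bucket.
cells = {
    "same-day": [(28.50, 15), (35.84, 19), (24.76, 13)],
    "1-3 days": [(41.85, 17), (40.90, 26), (54.05, 25)],
    "4-7 days": [(45.45, 8), (45.72, 10), (60.55, 11)],
    "7+ days": [(48.26, 75), (45.14, 69), (55.23, 41)],
}


def weighted_debi(rows):
    """Mean DEBI across cells, weighted by the number of managers with DEBI data."""
    total_n = sum(n for _, n in rows)
    return sum(value * n for value, n in rows) / total_n


for bucket, rows in cells.items():
    print(bucket, round(weighted_debi(rows), 2))
```

The first three buckets reproduce the reported 30.43, 45.97, and 51.27. The 7+ day bucket comes out near 48.64 against a reported 48.62; its three cells sum to n_DEBI = 185, one short of the 186 reported for the bucket.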
Finding 5 — Company culture explains most of the effect
The raw correlations between response behavior and outcomes look modest but real. After adjusting for company, the response-behavior correlations collapse toward zero.
| Relationship | Raw r | Adjusted r | Reduction |
|---|---|---|---|
| Reply rate → DEBI | 0.228 | 0.061 | 73% |
| Reply quality → text rate | 0.194 | 0.026 | 87% |
| Response days → DEBI | 0.075 | 0.038 | 49% |
What survives company adjustment is not how an individual manager responds but feedback volume. The number of feedback items received still correlates with DEBI at r = 0.317 after adjustment, and both items received and items replied to track outcomes throughout the analysis. The dominant variable is whether the feedback loop is active at all, which is an organizational property, not an individual one.
In plain terms: companies where managers respond quickly and well are also companies where teams are more engaged. But within a given company, the manager who responds faster does not lead a meaningfully more engaged team than the one who responds slower. The company's overall feedback culture is doing most of the work.
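One common way to adjust a correlation for company is to subtract each company's mean from both variables and correlate what is left. The sketch below shows that approach on made-up data. The source does not state which adjustment method was used, so the demeaning step, the function names, and the six sample rows are all assumptions.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def demean_by_company(rows):
    """rows are (company, x, y); subtract each company's own mean from x and y."""
    groups = defaultdict(list)
    for company, x, y in rows:
        groups[company].append((x, y))
    xs, ys = [], []
    for members in groups.values():
        mx = sum(x for x, _ in members) / len(members)
        my = sum(y for _, y in members) / len(members)
        xs += [x - mx for x, _ in members]
        ys += [y - my for _, y in members]
    return xs, ys


def reduction_pct(raw_r, adjusted_r):
    """Share of the raw correlation that disappears after adjustment, in percent."""
    return round((1 - adjusted_r / raw_r) * 100)


# Hypothetical managers: (company, reply rate, team DEBI).
rows = [("A", 0.2, 31), ("A", 0.3, 29), ("A", 0.4, 31),
        ("B", 0.7, 61), ("B", 0.8, 59), ("B", 0.9, 61)]

raw_r = pearson([x for _, x, _ in rows], [y for _, _, y in rows])
adj_r = pearson(*demean_by_company(rows))
print(round(raw_r, 2), round(adj_r, 2))

# The Reduction column in the table follows from the same formula.
print(reduction_pct(0.228, 0.061), reduction_pct(0.194, 0.026), reduction_pct(0.075, 0.038))
```

In the toy data, company B has both higher reply rates and higher DEBI, which produces a strong raw correlation. Inside each company the two variables are unrelated, so the adjusted correlation falls to about zero.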
The pattern does not hold inside individual managers either
We computed within-manager correlations — does a single manager see better team outcomes during periods when they respond faster or better? The correlations are weak and extremely noisy. The strongest signal, response time to text rate, averages r = 0.251 across 275 managers, but with a standard deviation of 0.516. Roughly a third of managers show the opposite pattern. Reply quality shows essentially no within-manager effect (average r = −0.079). There is no reliable individual-level mechanism by which faster or better responses drive participation.
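The "roughly a third" figure is about what a mean of 0.251 and a standard deviation of 0.516 imply if the within-manager correlations are approximately normally distributed. That normality is an assumption made here for illustration, and a rough one, since correlations are bounded between -1 and 1.

```python
from statistics import NormalDist

# Assumed normal approximation to the distribution of within-manager r values.
within_manager_r = NormalDist(mu=0.251, sigma=0.516)

# Share of managers whose correlation would fall below zero under that assumption.
share_negative = within_manager_r.cdf(0.0)
print(round(share_negative, 2))  # 0.31
```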
What the qualitative review added
The manual review of 190 replies confirmed the checkbox story and added detail. Dismissive replies appear in every time bucket, not just same-day — "Thanks for sharing" and "Noted" turn up at every speed. The same-day bucket's defining feature is its bimodality: checkbox responders and highly engaged managers sitting side by side. The 7+ day bucket contains a different artifact — batch catch-up behavior, including AI-generated catch-up replies with response times of 90–220 days, where a manager or system clears a backlog in one session.
What this means for HR
The headline rule most feedback programs ship with — respond within 24 hours — is not supported by this data. Speed is not the lever. Here is what the evidence does and does not back.
| Practice | Verdict |
|---|---|
| Require managers to respond to feedback at all | Supported. Never-responding carries a 14.7-point DEBI gap (d = 0.599). Make response expected, regardless of speed. |
| Coach managers to write substantive, specific replies | Supported. High-quality replies lift text rate 8–15 pp in the 1–3 day, 4–7 day, and 7+ day buckets (3.4 pp in same-day); combined with 1–3 day timing they produce the largest effect in the study (d = 0.988). |
| Set a 24-hour response SLA | Not supported. Same-day responders score the lowest DEBI of any responding group. A speed SLA pushes managers toward checkbox replies. |
| Reward or rank managers on response speed | Not supported. There is no dose-response curve. The 4–7 day and 7+ day buckets score as well as faster ones. |
| Treat reply quality as a standalone lever | Use with caution. After company adjustment, quality correlates near zero with outcomes on its own; it works in combination with timing, not independently. |
| Coach individual managers based on their response time | Not supported. Within-manager variation shows no reliable link between changes in response behavior and changes in team outcomes. |
| Invest in company-wide feedback culture | Supported. Company effects dominate: 73% of the reply-rate–DEBI correlation is organizational. Normalizing response across the company beats coaching managers one at a time. |
The simplest version of the rule: replace "reply fast" with "reply for real." A manager who writes a thoughtful response to every piece of feedback three days later is doing better than one who fires off "Noted" the same afternoon and skips half the items. The clock is not the signal. The reply is.
Limitations
This study measures associations between response behavior and team outcomes. It does not establish that responding faster or better causes those outcomes.
- Observational design. All findings are correlational. The most parsimonious explanation for many patterns is shared underlying causes — company culture, manager capability — rather than a causal feedback loop.
- Company culture confound. Finding 5 shows most between-manager variance in the response-outcome relationship is company-level. This substantially limits causal reading of Findings 1–4.
- Small cell sizes. Several quality × time cells have fewer than 15 managers with DEBI scores. The same-day + high quality (n_DEBI=13), 4–7 day + low (n_DEBI=8), 4–7 day + medium (n_DEBI=10), and 4–7 day + high (n_DEBI=11) estimates are unreliable.
- Bucket assignment uses averages. Managers are placed in their average response bucket, which obscures within-manager variation. A manager who replies same-day on some items and after 7+ days on others is classified on the mean.
- Reply quality is algorithmic. Quality scoring is automated, not human-judged, and may not capture what employees actually perceive as helpful.
- DEBI survivorship. Not all managers have DEBI scores; managers without DEBI data are excluded, which may introduce selection bias.
- No causal identification strategy. The study lacks instrumental variables, natural experiments, or longitudinal designs that would support causal claims.
Happily Research (2026). Response Time Is a Red Herring: Reply Quality Beats Reply Speed. happily.ai/research/response-time-feedback/
References
- Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum Associates. Source of the d effect-size benchmarks (small ≥ 0.2, medium ≥ 0.5, large ≥ 0.8) used throughout.
- Happily People Science (March 2026). Manager Response Time and Team Engagement. Internal analysis, 542 managers across 100 companies, 365-day window.