Research Dataset

The Happily Workplace Dataset

The Happily Workplace Dataset is a longitudinal, continuously collected record of employee well-being and workplace behavior, spanning 10M+ interactions across 350+ organizations since 2017. It pairs daily mood check-ins, validated well-being (WHO-5), eNPS, peer recognition and feedback networks, manager behavior, and AI-rated power skills, with free-text feedback in Thai and English. It is available to license for academic, people-science, and network-science research. Licensing inquiries: tareef@happily.ai.

10M+ employee interactions
350+ workplaces
Collecting since 2017

Most workplace datasets available to researchers are annual engagement surveys or one-time panels: a single snapshot of a moving picture. Happily's data is collected differently. Through an employee-experience platform that companies use every working day, the same people respond to short prompts on most working days, year after year. That produces a continuous, per-person record of mood, well-being, recognition, and feedback, rather than an end-of-year summary.

Because the same individuals respond repeatedly inside real reporting structures, the data is naturally longitudinal and relational. It supports questions that a survey cannot reach: how a habit holds or drifts across a full year, how one team diverges from another over many months, what a trajectory looks like in the 90 days before someone resigns, and how a behavior spreads from one person to the next through an org chart.

The studies on this site are built from this data. The same data is available, de-identified and scoped to a research question, for institutions that want to analyze it directly.

379,866 daily check-ins came from a single 12-month, 72-company cohort, one recent slice of a dataset that has been collected every working day since 2017.

Why this dataset is rare

Five things rarely appear together in workplace data, and they appear here at once:

  • Daily granularity. Mood and behavior are recorded most working days, not once a year, so short-window dynamics are visible.
  • Real workplaces, not a lab or a paid panel. The records are generated in the flow of work by employees at companies using the platform.
  • Network structure. Recognition and feedback are directed events (who recognizes whom, who asks whom for feedback), layered on actual reporting lines.
  • Paired text, numbers, and outcomes. Free-text feedback sits next to numeric ratings and later outcomes such as exit, so themes can be read against what happened.
  • Multilingual and clinically grounded. A largely Thai- and English-speaking population, underrepresented in published workplace research, measured with validated instruments such as the WHO-5.

The dataset at a glance
Scale: 10M+ employee interactions across 350+ workplaces.
Time span: Continuous collection since 2017, nine years and counting. The published studies analyze recent windows; the full history is available for licensed extracts.
Structure: Per-person time series plus directed recognition and feedback networks on real reporting lines.
Population: Employees at companies using the Happily platform, primarily Thailand and the wider Southeast Asia region.
Languages: Free text is largely Thai, with English and mixed responses.
Instruments: WHO-5 Well-being Index, eNPS, daily check-ins, weekly stress, peer recognition and feedback, AI-rated power skills.
Unit of analysis: De-identified, aggregated employee responses. Every extract states its own sample.
Access: By license, scoped to a research question. De-identified extracts or aggregated cohort tables under a data-sharing agreement.
Formats: Tabular extracts (for example CSV or Parquet) with field-level documentation.
Citation: Happily Research (2026). The Happily Workplace Dataset. happily.ai/research/dataset/

What's in the data

The data is organized around a handful of recurring signals. Each is captured at the grain shown below, and most can be delivered as a per-person time series or aggregated to team, cohort, or company level.

| Signal | Each record represents | What it carries | Cadence |
|---|---|---|---|
| Daily check-in | One employee on one day | Mood on a five-point scale, optional free text | Daily |
| Weekly stress | One employee, one pulse | Stress on a four-point scale, optional free-text source | Weekly |
| WHO-5 well-being | One employee, one assessment | Five sub-scores combined into a 0–100 index | Quarterly |
| eNPS | One employee, one survey | 0–10 recommendation score, optional barrier text | Periodic |
| Peer recognition | One recognition event | Giver → receiver, points, optional values tag | Per event |
| Peer feedback | One feedback request | Requester → chosen giver, written feedback | Per event |
| Manager reply | One feedback item | Employee text, manager reply, AI quality flags | Per event |
| Power skills | One written feedback item | Six-skill volume score plus AI quality rating (0.5–5) | Per event |
| Performance review | One review | Goal and culture ratings | Periodic |
| Org hierarchy | One employee | Reporting line, multiple levels deep | Maintained |

Extracts are delivered as de-identified records: stable hashed identifiers in place of names, no company identities, with test and internal accounts removed. The schema and field-level documentation are shared with each engagement so the data can be loaded and analyzed directly.
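As an illustration of rolling a per-person series up to team level, here is a minimal sketch in Python. The record layout (hashed employee id, team id, date, mood from 1 to 5) and all sample values are hypothetical; the actual field names come from the documentation shared with each engagement.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical rows: (hashed employee id, team id, check-in date, mood 1-5).
# Real extracts define their own field names and identifiers.
checkins = [
    ("a1f3", "team_x", date(2025, 3, 3), 4),
    ("a1f3", "team_x", date(2025, 3, 4), 3),
    ("9c2e", "team_x", date(2025, 3, 4), 5),
    ("77bd", "team_y", date(2025, 3, 10), 2),
]

def weekly_team_mood(records):
    """Average mood per (team, ISO year, ISO week) across all check-ins."""
    buckets = defaultdict(list)
    for _employee, team, day, mood in records:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        buckets[(team, iso[0], iso[1])].append(mood)
    return {key: mean(values) for key, values in buckets.items()}

weekly = weekly_team_mood(checkins)
```

This version averages over check-ins, so employees who respond more often carry more weight. A study that needs equal weight per person would first average within each employee and then across the team.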

Measurement instruments

Where possible the data uses established, externally validated measures rather than metrics invented in-house, which keeps results legible against work done elsewhere.

  • WHO-5 Well-being Index. A five-item scale from the World Health Organization, validated and used in clinical and academic research worldwide, scored 0–100 with established thresholds.
  • eNPS. The standard single-item measure of whether an employee would recommend their workplace, on a 0–10 scale, with an optional open-ended reason.
  • Daily check-in. A short daily prompt on how an employee feels, answered on a five-point scale.
  • Weekly stress. A four-point self-report of stress, with an optional free-text source.
  • Peer recognition and feedback. Structured records of who recognizes, and gives feedback to, whom, forming directed networks across teams and companies.
  • Power skills. Six human skills (critical thinking, self-awareness, optimism, leadership, initiative, empathy) extracted and rated from written feedback. Five of the six align with the top human skills in the World Economic Forum's Future of Jobs 2025 report.
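For readers less familiar with the two standard instruments, the sketch below shows their conventional scoring. It assumes the published WHO-5 convention (five items rated 0–5, raw sum multiplied by 4) and the usual Net Promoter cut-offs (9–10 promoters, 0–6 detractors). This page does not spell out the exact formulas used in the dataset, so treat the function names and details as illustrative.

```python
def who5_index(items):
    """Conventional WHO-5 scoring: five items rated 0-5, raw sum (0-25) times 4."""
    if len(items) != 5 or any(not 0 <= x <= 5 for x in items):
        raise ValueError("WHO-5 expects five item scores, each from 0 to 5")
    return sum(items) * 4

def enps(scores):
    """Percent promoters (9-10) minus percent detractors (0-6); range -100 to 100."""
    if not scores:
        raise ValueError("eNPS needs at least one response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

# Made-up responses, for illustration only.
example_index = who5_index([3, 4, 2, 3, 4])       # raw sum is 16
example_enps = enps([10, 9, 8, 7, 6, 3, 10, 9])   # 4 promoters, 2 detractors, 8 responses
```

Scores of 7 and 8 count toward the denominator but toward neither group, which is why eNPS can move even when the share of promoters stays flat.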

Data packs

The data is most useful when scoped to a question. We package it into named, de-identified data packs, each combining the signals a given research area needs. The sample sizes below are illustrative, drawn from published studies that analyzed only the most recent years of the record. The full archive reaches back to 2017, so a licensed extract can span far more history, and a larger population, than these figures suggest. Each extract is scoped to your design.

Nine years of depth

The figures in the table below reflect a recent analysis window of a few years. The complete dataset runs nine years deep, back to 2017, and is available for licensing. Longer time spans mean more within-person history: multi-year trajectories, full tenure arcs from onboarding to exit, and how habits hold across years.

| Data pack | What's inside | Illustrative scale |
|---|---|---|
| Well-being Longitudinal | WHO-5 assessments, daily mood, and weekly stress as per-person time series | 2,912 employees, 74 companies, 65,626 WHO-5 responses |
| Recognition & Trust Networks | Directed recognition and peer-feedback graphs layered on org hierarchy | 3,446 employees across 31 companies |
| Manager Behavior & Team Outcomes | Manager reply rate, quality, and timing, with team engagement and cascade structure | 633 managers, 60 companies |
| Attrition Early-Warning | Daily mood trajectories and text complaints paired with exit outcomes | 7,717 employees (4,532 left, 3,185 stayed), 39+ organizations |
| Multilingual Text + Behavior | Thai and English free-text feedback paired with numeric ratings, outcomes, and thematic codes | 34,803 eNPS responses; 1,681 coded barrier answers |
| Power Skills Development | AI-rated six-skill longitudinal panel with repeated measurements | 2,630 individuals over 180+ days, 80 companies |

Custom cohorts

If your question does not map to a pack, we scope a bespoke de-identified extract: a particular time window, role or tenure band, set of signals, or network slice. Tell us the design and we will tell you what the data can and cannot support.

Governance, ethics, and de-identification

Research access runs on de-identified, aggregated data. Individual employees are never identified, identifiers are hashed before any data leaves our systems, and test and internal accounts are excluded. Extracts respect minimum aggregation thresholds so that no result can single out a specific person, and company identities are not shared.
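The two safeguards named above, hashed identifiers and minimum aggregation thresholds, can be sketched generically as follows. This is not Happily's implementation: the keyed HMAC-SHA256 hash, the secret key, and the threshold of 5 are all assumptions chosen for illustration, since the page states only that identifiers are hashed and that thresholds exist.

```python
import hashlib
import hmac

# Hypothetical values: the page specifies neither a hashing scheme nor a threshold.
SECRET_KEY = b"replace-with-a-secret-kept-inside-the-source-system"
MIN_GROUP_SIZE = 5

def pseudonymize(employee_id: str) -> str:
    """Stable keyed hash: the same input always maps to the same 64-character token."""
    return hmac.new(SECRET_KEY, employee_id.encode("utf-8"), hashlib.sha256).hexdigest()

def suppress_small_groups(group_counts: dict) -> dict:
    """Drop any group with fewer than MIN_GROUP_SIZE members before release."""
    return {group: n for group, n in group_counts.items() if n >= MIN_GROUP_SIZE}

token = pseudonymize("employee-123")
safe = suppress_small_groups({"team_x": 12, "team_y": 3})  # team_y is withheld
```

A keyed hash is used in this sketch because a plain unsalted hash of a short identifier can be reversed by hashing every candidate value. The stable token still lets a researcher follow one person across time without learning who they are.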

Each engagement is governed by a written data-sharing and license agreement that sets out permitted use, retention, and publication terms. We are glad to support an institution's ethics or IRB review and to align the data-sharing terms with it. Where a study needs a higher bar, analysis can be arranged against aggregated cohort tables rather than record-level extracts.

In short

The findings travel; the individuals do not. We share the structure and the signals needed to do real research, never an identifiable person or company.

How licensing works

Access is inquiry-based and scoped to your project rather than sold off a fixed price list.

  1. Share your research question. Tell us the design, the signals you need, and the population you have in mind.
  2. We scope the extract. We confirm what the data can support, propose a pack or custom cohort, and flag any limitations up front.
  3. Agreement. We put a data-sharing and license agreement in place, aligned with your institution's ethics review.
  4. Delivery. You receive de-identified extracts (for example CSV or Parquet), or access to aggregated cohort tables, with field-level documentation.

To start a conversation, write to tareef@happily.ai with a sketch of what you want to study.

What researchers have already found

The data yields findings that hold up to scrutiny. These open studies, each with its own methodology box and limitations, show the kinds of questions it can answer:

  • WHO-5 Dimensions, where rest is consistently the lowest-scoring well-being dimension across quarters.
  • Trust Networks, where 72% of the most-trusted people across 31 companies hold no management title.
  • Attrition signals, where complaint themes in daily text are associated with sharply different exit rates.
  • Power Skills over time, a 2,630-person panel where human-skill ratings rise with sustained practice.
  • The Leadership Cascade, where reply behavior tracks a manager's own boss but not their skip-level.

You can browse all studies to see the breadth of what has been published from this data.

How to cite this dataset

The dataset and the studies built on it are meant to be referenced. When you cite the dataset itself, please attribute it to Happily Research and link this page.

Citation

Happily Research (2026). The Happily Workplace Dataset. happily.ai/research/dataset/

To cite a specific finding, cite the individual study and link its page, for example Happily Research (2026). The Stress Sweet Spot. happily.ai/research/stress-sweet-spot/.

Frequently asked questions

What is the Happily Workplace Dataset?

The Happily Workplace Dataset is a longitudinal, continuously collected record of employee well-being and workplace behavior. It spans 10M+ interactions across 350+ organizations since 2017 and combines daily mood check-ins, WHO-5 well-being, eNPS, peer recognition and feedback networks, manager behavior, and AI-rated power skills.

What variables and measures does the dataset include?

It includes daily mood (5-point), weekly stress (4-point), the WHO-5 Well-Being Index (0–100), eNPS (0–10) with free-text reasons, directed peer recognition and feedback networks, manager reply rate and quality, AI-rated power skills, performance ratings, and multi-level organizational hierarchy.

How large is the dataset and what time period does it cover?

Collection has run continuously since 2017, nine years and counting, across 350+ workplaces and 10M+ employee interactions. Individual published studies analyze recent multi-year windows; licensed extracts can draw on the full historical depth.

Where are participants located and what languages are represented?

Participants are employees at companies using the Happily platform, primarily in Thailand and the wider Southeast Asia region. Free-text responses are largely Thai, with some English and mixed-language text. This workforce is underrepresented in published workplace research.

How can researchers access or license the dataset?

Access is by license and scoped to a research question. Researchers email tareef@happily.ai with their design; Happily scopes a de-identified extract or aggregated cohort tables, puts a data-sharing agreement in place, and delivers the data with field-level documentation.

Is the data anonymized and how is privacy protected?

Yes. Research access uses de-identified, aggregated data. Identifiers are hashed before data leaves Happily's systems, no individual or company is identifiable, test and internal accounts are excluded, and minimum aggregation thresholds prevent any result from singling out a person.

What license and terms apply, and can it support IRB review?

Each engagement runs under a written data-sharing and license agreement covering permitted use, retention, and publication. Happily supports an institution's ethics or IRB review and can provide aggregated cohort tables rather than record-level data where a higher bar is required.

How should I cite the Happily Workplace Dataset?

Cite it as: Happily Research (2026). The Happily Workplace Dataset. happily.ai/research/dataset/. For a specific study built on the data, cite that study and link its page.

References

  1. Topp, C. W., Østergaard, S. D., Søndergaard, S., & Bech, P. (2015). The WHO-5 Well-Being Index: A Systematic Review of the Literature. Psychotherapy and Psychosomatics, 84(3), 167–176.
  2. Reichheld, F. F. (2003). The One Number You Need to Grow. Harvard Business Review.
  3. World Economic Forum (2025). The Future of Jobs Report 2025.

License the data

If your lab or institution wants to work with this dataset, tell us what you want to study and we will scope an extract.

Email tareef@happily.ai