The Happily Workplace Dataset
The Happily Workplace Dataset is a longitudinal, continuously collected record of employee well-being and workplace behavior, spanning 10M+ interactions across 350+ organizations since 2017. It pairs daily mood check-ins, validated well-being (WHO-5), eNPS, peer recognition and feedback networks, manager behavior, and AI-rated power skills, with free-text feedback in Thai and English. It is available to license for academic, people-science, and network-science research. Licensing inquiries: tareef@happily.ai.
Most workplace datasets available to researchers are annual engagement surveys or one-time panels: a single snapshot of a moving picture. Happily's data is collected differently. Through an employee-experience platform that companies use every working day, the same people respond to short prompts on most working days, year after year. That produces a continuous, per-person record of mood, well-being, recognition, and feedback, rather than an end-of-year summary.
Because the same individuals respond repeatedly inside real reporting structures, the data is naturally longitudinal and relational. It supports questions that a survey cannot reach: how a habit holds or drifts across a full year, how one team diverges from another over many months, what a trajectory looks like in the 90 days before someone resigns, and how a behavior spreads from one person to the next through an org chart.
The studies on this site are built from this data. The same data is available, de-identified and scoped to a research question, for institutions that want to analyze it directly.
Five things rarely appear together in workplace data, and they appear here at once:
- Daily granularity. Mood and behavior are recorded on most working days, not once a year, so short-window dynamics are visible.
- Real workplaces, not a lab or a paid panel. The records are generated in the flow of work by employees at companies using the platform.
- Network structure. Recognition and feedback are directed events (who recognizes whom, who asks whom for feedback) layered on true reporting lines.
- Paired text, numbers, and outcomes. Free-text feedback sits next to numeric ratings and later outcomes such as exit, so themes can be read against what happened.
- Multilingual and clinically grounded. A largely Thai- and English-language workforce, a group underrepresented in published workplace research, measured with validated instruments such as the WHO-5.
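The network-structure point can be made concrete with a small sketch. The event tuples, the short IDs standing in for hashed identifiers, and the function name are all hypothetical and do not reflect the dataset's actual schema. The sketch only shows how a list of directed recognition events turns into in-degree and out-degree counts per person.

```python
from collections import Counter

# Hypothetical recognition events as (giver_id, receiver_id) pairs.
# The IDs stand in for hashed identifiers; the layout is an assumption.
events = [
    ("a1", "b2"),
    ("c3", "b2"),
    ("a1", "c3"),
    ("d4", "b2"),
    ("b2", "a1"),
]

def degree_counts(edges):
    """Return (in_degree, out_degree) Counters for a list of directed edges."""
    in_degree = Counter(receiver for _, receiver in edges)
    out_degree = Counter(giver for giver, _ in edges)
    return in_degree, out_degree

in_degree, out_degree = degree_counts(events)

# The most-recognized person is the node with the highest in-degree.
most_recognized, count = in_degree.most_common(1)[0]
print(most_recognized, count)
```

Because the edges are directed, giving and receiving are counted separately, which is what lets an analysis distinguish people who are often recognized from people who often recognize others.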
What's in the data
The data is organized around a handful of recurring signals. Each is captured at the grain shown below, and most can be delivered as a per-person time series or aggregated to team, cohort, or company level.
| Signal | Each record represents | What it carries | Cadence |
|---|---|---|---|
| Daily check-in | One employee on one day | Mood on a five-point scale, optional free text | Daily |
| Weekly stress | One employee, one pulse | Stress on a four-point scale, optional free-text source | Weekly |
| WHO-5 well-being | One employee, one assessment | Five sub-scores combined into a 0–100 index | Quarterly |
| eNPS | One employee, one survey | 0–10 recommendation score, optional barrier text | Periodic |
| Peer recognition | One recognition event | Giver → receiver, points, optional values tag | Per event |
| Peer feedback | One feedback request | Requester → chosen giver, written feedback | Per event |
| Manager reply | One feedback item | Employee text, manager reply, AI quality flags | Per event |
| Power skills | One written feedback item | Six-skill volume score plus AI quality rating (0.5–5) | Per event |
| Performance review | One review | Goal and culture ratings | Periodic |
| Org hierarchy | One employee | Reporting line, multiple levels deep | Maintained |
Extracts are delivered as de-identified records: stable hashed identifiers in place of names, no company identities, with test and internal accounts removed. The schema and field-level documentation are shared with each engagement so the data can be loaded and analyzed directly.
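As a rough illustration of the two delivery grains described above, the sketch below parses a few daily check-in rows, builds a per-person time series, and aggregates the same rows to team level. The column names (`person_id`, `team_id`, `date`, `mood`) and the sample values are assumptions made for the example; the real schema is documented per engagement.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical extract: column names and values are illustrative assumptions,
# not the documented schema.
SAMPLE_CSV = """person_id,team_id,date,mood
p01,t1,2024-03-04,4
p01,t1,2024-03-05,3
p02,t1,2024-03-04,5
p03,t2,2024-03-04,2
p03,t2,2024-03-05,3
"""

def load_checkins(text):
    """Parse check-in rows from CSV text, converting mood to int."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["mood"] = int(row["mood"])
        rows.append(row)
    return rows

def per_person_series(rows):
    """Group rows into a date-ordered list of (date, mood) per person."""
    series = defaultdict(list)
    for row in rows:
        series[row["person_id"]].append((row["date"], row["mood"]))
    # ISO-formatted date strings sort chronologically.
    return {pid: sorted(points) for pid, points in series.items()}

def team_daily_mean(rows):
    """Aggregate to mean mood per (team_id, date)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[(row["team_id"], row["date"])].append(row["mood"])
    return {key: mean(values) for key, values in buckets.items()}

rows = load_checkins(SAMPLE_CSV)
print(per_person_series(rows)["p01"])
print(team_daily_mean(rows)[("t1", "2024-03-04")])
```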
Measurement instruments
Where possible the data uses established, externally validated measures rather than metrics invented in-house, which keeps results legible against work done elsewhere.
- WHO-5 Well-Being Index. A five-item scale from the World Health Organization, validated and used in clinical and academic research worldwide, scored 0–100 with established thresholds.
- eNPS. The standard single-item measure of whether an employee would recommend their workplace, on a 0–10 scale, with an optional open-ended reason.
- Daily check-in. A short daily prompt on how an employee feels, answered on a five-point scale.
- Weekly stress. A four-point self-report of stress, with an optional free-text source.
- Peer recognition and feedback. Structured records of who recognizes, and gives feedback to, whom, forming directed networks across teams and companies.
- Power skills. Six human skills (critical thinking, self-awareness, optimism, leadership, initiative, empathy) extracted and rated from written feedback. Five of the six align with the top human skills in the World Economic Forum's Future of Jobs Report 2025.
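For readers less familiar with the two survey instruments, here is a minimal sketch of their standard published scoring conventions: the WHO-5 sums five items rated 0 to 5 and multiplies by 4 to reach 0–100, and the Net Promoter convention subtracts the percentage of detractors (0–6) from the percentage of promoters (9–10). Whether the platform computes its fields in exactly this way is an assumption, and the function names are hypothetical.

```python
def who5_index(items):
    """Standard WHO-5 scoring: five items rated 0-5, raw sum times 4 gives 0-100."""
    if len(items) != 5:
        raise ValueError("WHO-5 has exactly five items")
    if any(not 0 <= item <= 5 for item in items):
        raise ValueError("each item must be between 0 and 5")
    return sum(items) * 4

def enps(scores):
    """Standard NPS convention: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("need at least one response")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

# Made-up responses for illustration only.
print(who5_index([3, 4, 2, 3, 1]))
print(enps([10, 9, 8, 7, 6, 3, 10, 9]))
```

Note that eNPS is a group-level statistic: a single 0–10 response has no eNPS of its own, so the score only exists once responses are aggregated over a team, cohort, or company.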
Data packs
The data is most useful when scoped to a question. We package it into named, de-identified data packs, each combining the signals a given research area needs, and each extract is scoped to your design.
The sample sizes in the table below are illustrative. They are drawn from published studies that analyzed a recent window of a few years, while the full archive runs nine years deep, back to 2017, and is available for licensing. A licensed extract can therefore span far more history, and a larger population, than these figures suggest. Longer time spans mean more within-person history: multi-year trajectories, full tenure arcs from onboarding to exit, and how habits hold across years.
| Data pack | What's inside | Illustrative scale |
|---|---|---|
| Well-being Longitudinal | WHO-5 assessments, daily mood, and weekly stress as per-person time series | 2,912 employees, 74 companies, 65,626 WHO-5 responses |
| Recognition & Trust Networks | Directed recognition and peer-feedback graphs layered on org hierarchy | 3,446 employees across 31 companies |
| Manager Behavior & Team Outcomes | Manager reply rate, quality, and timing, with team engagement and cascade structure | 633 managers, 60 companies |
| Attrition Early-Warning | Daily mood trajectories and text complaints paired with exit outcomes | 7,717 employees (4,532 left, 3,185 stayed), 39+ organizations |
| Multilingual Text + Behavior | Thai and English free-text feedback paired with numeric ratings, outcomes, and thematic codes | 34,803 eNPS responses; 1,681 coded barrier answers |
| Power Skills Development | AI-rated six-skill longitudinal panel with repeated measurements | 2,630 individuals over 180+ days, 80 companies |
If your question does not map to a pack, we scope a bespoke de-identified extract: a particular time window, role or tenure band, set of signals, or network slice. Tell us the design and we will tell you what the data can and cannot support.
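As one example of what scoping a time window can mean in practice, the sketch below cuts a single person's daily mood series down to the 90 days before an exit date and compares the earlier and later halves of that window. The data, the function names, and the half-versus-half comparison are all illustrative assumptions, not the method used in the published attrition study.

```python
from datetime import date, timedelta
from statistics import mean

def pre_exit_window(checkins, exit_date, days=90):
    """Keep (day, mood) check-ins from the `days` days before exit_date (exclusive)."""
    start = exit_date - timedelta(days=days)
    return [(d, m) for d, m in checkins if start <= d < exit_date]

def halves_mean_change(window):
    """Mean mood in the later half of the window minus mean in the earlier half."""
    ordered = sorted(window)
    mid = len(ordered) // 2
    if mid == 0:
        raise ValueError("need at least two check-ins")
    early = mean([m for _, m in ordered[:mid]])
    late = mean([m for _, m in ordered[mid:]])
    return late - early

# Hypothetical person: dates and mood values are made up for illustration.
checkins = [
    (date(2024, 1, 2), 4),
    (date(2024, 3, 10), 4),
    (date(2024, 4, 1), 3),
    (date(2024, 5, 1), 2),
    (date(2024, 5, 20), 2),
]
exit_date = date(2024, 6, 1)

window = pre_exit_window(checkins, exit_date)
print(len(window), halves_mean_change(window))
```

A negative change means mood was lower in the later half of the window than in the earlier half. A real analysis would also need a comparison group of people who stayed, which this sketch leaves out.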
Governance, ethics, and de-identification
Research access runs on de-identified, aggregated data. Individual employees are never identified, identifiers are hashed before any data leaves our systems, and test and internal accounts are excluded. Extracts respect minimum aggregation thresholds so that no result can single out a specific person, and company identities are not shared.
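The two safeguards named here, hashed identifiers and minimum aggregation thresholds, can be sketched generically. The code below is not a description of Happily's pipeline: the keyed hash (HMAC-SHA256), the threshold of five people, and every name in it are assumptions chosen to illustrate the general technique.

```python
import hashlib
import hmac
from collections import defaultdict
from statistics import mean

def pseudonymize(raw_id, secret_key):
    """Replace an identifier with a stable keyed hash (HMAC-SHA256, hex)."""
    return hmac.new(secret_key, raw_id.encode("utf-8"), hashlib.sha256).hexdigest()

def aggregate_with_threshold(rows, min_group_size=5):
    """Mean score per group, dropping groups with fewer distinct people than the threshold."""
    scores = defaultdict(list)
    people = defaultdict(set)
    for group, person, score in rows:
        scores[group].append(score)
        people[group].add(person)
    return {
        group: mean(values)
        for group, values in scores.items()
        if len(people[group]) >= min_group_size
    }

# Hypothetical key and rows; the key would be kept outside any shared extract.
key = b"example-key-kept-outside-the-extract"
rows = [("t1", pseudonymize(f"emp{i}", key), 60 + i) for i in range(5)]
rows += [("t2", pseudonymize("emp9", key), 40)]

print(aggregate_with_threshold(rows))
```

In this example the single-person group `t2` is suppressed, so its one score never appears in the output. The same input identifier always maps to the same hash under a given key, which keeps per-person time series linkable without exposing names.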
Each engagement is governed by a written data-sharing and license agreement that sets out permitted use, retention, and publication terms. We are glad to support an institution's ethics or IRB review and to align the data-sharing terms with it. Where a study needs a higher bar, analysis can be arranged against aggregated cohort tables rather than record-level extracts.
The findings travel; the individuals do not. We share the structure and the signals needed to do real research, never an identifiable person or company.
How licensing works
Access is inquiry-based and scoped to your project rather than sold off a fixed price list.
- Share your research question. Tell us the design, the signals you need, and the population you have in mind.
- We scope the extract. We confirm what the data can support, propose a pack or custom cohort, and flag any limitations up front.
- Agreement. We put a data-sharing and license agreement in place, aligned with your institution's ethics review.
- Delivery. You receive de-identified extracts (for example CSV or Parquet), or access to aggregated cohort tables, with field-level documentation.
To start a conversation, write to tareef@happily.ai with a sketch of what you want to study.
What researchers have already found
These open studies, each with its own methodology box and limitations, show the kinds of questions the data can answer:
- WHO-5 Dimensions, where rest is consistently the lowest-scoring well-being dimension across quarters.
- Trust Networks, where 72% of the most-trusted people across 31 companies hold no management title.
- Attrition signals, where complaint themes in daily text are associated with sharply different exit rates.
- Power Skills over time, a 2,630-person panel where human-skill ratings rise with sustained practice.
- The Leadership Cascade, where reply behavior tracks a manager's own boss but not their skip-level.
You can browse all studies to see the breadth of what has been published from this data.
How to cite this dataset
The dataset and the studies built on it are meant to be referenced. When you cite the dataset itself, please attribute it to Happily Research and link this page.
Happily Research (2026). The Happily Workplace Dataset. happily.ai/research/dataset/
To cite a specific finding, cite the individual study and link its page, for example Happily Research (2026). The Stress Sweet Spot. happily.ai/research/stress-sweet-spot/.
Frequently asked questions
What is the Happily Workplace Dataset?
The Happily Workplace Dataset is a longitudinal, continuously collected record of employee well-being and workplace behavior. It spans 10M+ interactions across 350+ organizations since 2017 and combines daily mood check-ins, WHO-5 well-being, eNPS, peer recognition and feedback networks, manager behavior, and AI-rated power skills.
What variables and measures does the dataset include?
It includes daily mood (5-point), weekly stress (4-point), the WHO-5 Well-Being Index (0–100), eNPS (0–10) with free-text reasons, directed peer recognition and feedback networks, manager reply rate and quality, AI-rated power skills, performance ratings, and multi-level organizational hierarchy.
How large is the dataset and what time period does it cover?
Collection has run continuously since 2017, nine years and counting, across 350+ workplaces and 10M+ employee interactions. Individual published studies analyze recent multi-year windows; licensed extracts can draw on the full historical depth.
Where are participants located and what languages are represented?
Participants are employees at companies using the Happily platform, primarily in Thailand and the wider Southeast Asia region, a workforce underrepresented in published workplace research. Free-text responses are largely Thai, with English and mixed-language text.
How can researchers access or license the dataset?
Access is by license and scoped to a research question. Researchers email tareef@happily.ai with their design; Happily scopes a de-identified extract or aggregated cohort tables, puts a data-sharing agreement in place, and delivers the data with field-level documentation.
Is the data anonymized and how is privacy protected?
Yes. Research access uses de-identified, aggregated data. Identifiers are hashed before data leaves Happily's systems, no individual or company is identifiable, test and internal accounts are excluded, and minimum aggregation thresholds prevent any result from singling out a person.
What license and terms apply, and can it support IRB review?
Each engagement runs under a written data-sharing and license agreement covering permitted use, retention, and publication. Happily supports an institution's ethics or IRB review and can provide aggregated cohort tables rather than record-level data where a higher bar is required.
How should I cite the Happily Workplace Dataset?
Cite it as: Happily Research (2026). The Happily Workplace Dataset. happily.ai/research/dataset/. For a specific study built on the data, cite that study and link its page.
References
- Topp, C. W., Østergaard, S. D., Søndergaard, S., & Bech, P. (2015). The WHO-5 Well-Being Index: A Systematic Review of the Literature. Psychotherapy and Psychosomatics, 84(3), 167–176.
- Reichheld, F. F. (2003). The One Number You Need to Grow. Harvard Business Review, 81(12), 46–54.
- World Economic Forum (2025). The Future of Jobs Report 2025.
If your lab or institution wants to work with this dataset, email tareef@happily.ai, tell us what you want to study, and we will scope an extract.