There are too many candidates, so let’s first ask the AI to list them.
There is no single “vulnerable-customer support” dataset. You get best results by combining (1) customer-support dialogs, (2) empathy/mental-health counseling dialogs, (3) document-grounded support dialogs, and (4) safety/PII tooling and preference data. Sources below.
1) Customer-support and task-oriented dialog corpora
- Customer Support on Twitter (brand↔user threads). Good for realistic tone and handoffs. (Kaggle)
- MSDialog (multi-turn tech support with utterance-level intent labels). (arXiv, GitHub)
- Schema-Guided Dialogue (SGD) (18K multi-domain goal-oriented dialogs, incl. banking). (Hugging Face)
- Taskmaster-1/2 (Google, realistic multi-turn scripts; several customer-service domains). (arXiv, ResearchGate)
- TicketTalk (transactional support in movie-ticketing). (ResearchGate)
- AIRDialogue (flight-booking, long multi-turn). (Hugging Face)
- MultiWOZ 2.4 (large multi-domain TOD; use as a proxy for slots, escalation, DST). (ACL archive)
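As a starting point for working with the first corpus above, here is a minimal sketch of reconstructing ordered dialog threads from the Customer Support on Twitter CSV by following its reply links. The column names (`tweet_id`, `author_id`, `inbound`, `in_response_to_tweet_id`) are assumed from the Kaggle dataset card; verify them against the actual file before relying on this.

```python
import csv
import io

# Inline sample mimicking the Kaggle "Customer Support on Twitter" schema
# (assumed columns - check the real CSV before use).
SAMPLE = """tweet_id,author_id,inbound,text,in_response_to_tweet_id
1,cust_1,True,@brand my card was declined and I'm stressed,
2,brand,False,@cust_1 Sorry to hear that - can you DM your account email?,1
3,cust_1,True,@brand sent just now thanks,2
"""

def build_threads(rows):
    """Chain tweets into (role, text) threads via in_response_to links."""
    by_id = {r["tweet_id"]: r for r in rows}
    children = {}
    for r in rows:
        parent = r["in_response_to_tweet_id"]
        if parent:
            children.setdefault(parent, []).append(r["tweet_id"])
    roots = [r for r in rows if not r["in_response_to_tweet_id"]]
    threads = []
    for root in roots:
        thread, tid = [], root["tweet_id"]
        while tid:
            r = by_id[tid]
            role = "customer" if r["inbound"] == "True" else "agent"
            thread.append((role, r["text"]))
            kids = children.get(tid, [])
            tid = kids[0] if kids else None  # follow first reply only
        threads.append(thread)
    return threads

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
threads = build_threads(rows)
```

Real threads branch (multiple replies to one tweet); this sketch follows only the first reply, which is usually fine for extracting linear training dialogs.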
2) Empathy / counseling datasets for “vulnerability-aware” behavior
- EmpatheticDialogues and ESConv (emotional-support conversations; many papers build on ESConv). (PMC, ACL archive)
- PESConv (persona-augmented ESConv; evaluates empathy, relevance, emotion intensity). (ACL archive)
- CounselChat / mental-health counseling Q&A (therapist answers; several scrapes). Check licensing. (GitHub, Kaggle)
- MedDialog-EN and MediTOD (doctor–patient; for health-risk triage patterns). Note the scraping terms on sources; review before use. (Hugging Face, par.nsf.gov, ACL archive)
- Motivational Interviewing corpora: AnnoMI, MI-TAGS, EMMI (empathy-rich, annotated therapist behaviors). (GitHub, Kaggle, ACL archive, arXiv)
3) Document-grounded “support from policy/KB” dialogs
- Doc2Dial and MultiDoc2Dial (agent responses grounded in public-service docs; matches enterprise KB+policy flow). (doc2dial.github.io, ACL archive, GitHub)
- DialDoc shared tasks for open-book, multi-doc response-generation baselines and metrics. (ACL archive)
4) Safety, guardrails, and preference data
- HH-RLHF preference and red-teaming data for helpfulness/harmlessness; useful for refusal and de-escalation behavior. (GitHub, Hugging Face, arXiv)
- Plain-language and readability resources for low-literacy users: textstat and government plain-language guides. (GitHub, GOV.UK, Service manual)
- PII/PHI de-identification for training data: Microsoft Presidio (+ NVIDIA NeMo Guardrails integration) or Phileas. (GitHub, Microsoft GitHub, NVIDIA Docs)
- Regulatory framing for “vulnerable customers” (clear definitions and actions): UK FCA FG21/1. Use it to design escalation and “tell-us-once” patterns. (FCA)
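To make the de-identification step concrete, here is a minimal regex stand-in for the PII-scrubbing pass. Presidio and Phileas do this far more robustly (NER, context scoring, checksum validation), so treat this purely as a sketch of the redaction step's shape; the patterns and labels are illustrative, not production-grade.

```python
import re

# Illustrative patterns only - Presidio covers many more entity types and
# validates matches (e.g. Luhn checks for card numbers). Order matters:
# CARD must run before PHONE so long digit runs aren't mislabeled.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "CARD":  re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "PHONE": re.compile(r"\b\+?\d[\d\s().-]{7,}\d\b"),
}

def redact(text: str) -> str:
    """Replace each detected PII span with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text
```

Run this over every utterance before training so raw customer identifiers never enter the model.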
5) Text simplification for accessibility
- Datasets and ready-made models to simplify bot outputs: GEM/wiki_auto + ASSET/TURK; mrm8488/t5-small-finetuned-text-simplification; agentlans/flan-t5-small-simplifier. (Hugging Face)
6) Training recipes that work in practice
- SFT first, then preference tuning: Hugging Face TRL's SFTTrainer for chat formats, followed by DPOTrainer with preference pairs. (Hugging Face)
- Multi-turn formatting: use the tokenizer's chat_template and pack full histories; see HF forum threads for gotchas. (Hugging Face Forums, GitHub)
- Efficient finetuning: Unsloth guides and the HF blog for Llama-3.x SFT on multi-turn data. (Unsloth Docs, Hugging Face, atharvamaskar.hashnode.dev)
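The two steps above meet in the data format: TRL's DPOTrainer expects records with "prompt", "chosen", and "rejected" fields, and the prompt should carry the full multi-turn history. The template below is a hand-rolled stand-in for illustration; in a real run you would use the tokenizer's own `apply_chat_template` instead.

```python
# Sketch: turning an annotated multi-turn dialog into a DPO preference
# record. The <|role|> markers are a made-up template, not any model's
# real chat format - use tokenizer.apply_chat_template in practice.
def render_history(messages):
    """Pack the full dialog history into one prompt string."""
    lines = [f"<|{m['role']}|>\n{m['content']}" for m in messages]
    return "\n".join(lines) + "\n<|assistant|>\n"

def to_dpo_record(history, chosen_reply, rejected_reply):
    return {
        "prompt": render_history(history),
        "chosen": chosen_reply,
        "rejected": rejected_reply,
    }

history = [
    {"role": "user",
     "content": "I just lost my husband and I can't pay this bill."},
]
record = to_dpo_record(
    history,
    chosen_reply=("I'm so sorry for your loss. Let's pause the payment "
                  "and look at support options together."),
    rejected_reply="Your balance is overdue. Please pay immediately.",
)
```

For vulnerable-customer tuning, the chosen/rejected pairs are exactly where the empathy signal lives: a correct-but-cold reply makes a good "rejected" example.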
7) Evaluation you’ll actually need
- Task success on support corpora (DST/slot accuracy, resolution rate) plus grounding faithfulness on Doc2Dial/MultiDoc2Dial. (ACL archive)
- Human eval for empathy and helpfulness: ACUTE-Eval; note the known evaluator-group variance. (arXiv, ar5iv, ACL archive)
- Operational empathy metrics exist in CCaaS tooling; use them as inspiration, not ground truth. (Genesys Cloud Resource Center)
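The task-success metrics above are simple to compute once you have per-turn gold and predicted dialog states. A minimal sketch of the two standard DST metrics, slot accuracy (per-slot) and joint goal accuracy (all slots correct per turn):

```python
# Gold/predicted states are dicts of slot -> value per turn, e.g. as
# produced by a MultiWOZ-style state tracker.
def slot_accuracy(pred_turns, gold_turns):
    """Fraction of gold slots whose predicted value matches exactly."""
    correct = total = 0
    for pred, gold in zip(pred_turns, gold_turns):
        for slot, value in gold.items():
            total += 1
            correct += pred.get(slot) == value
    return correct / total if total else 0.0

def joint_goal_accuracy(pred_turns, gold_turns):
    """Fraction of turns where the entire predicted state is correct."""
    hits = sum(p == g for p, g in zip(pred_turns, gold_turns))
    return hits / len(gold_turns) if gold_turns else 0.0

gold = [{"account": "savings", "issue": "fee"}, {"account": "savings"}]
pred = [{"account": "savings", "issue": "charge"}, {"account": "savings"}]
```

Joint goal accuracy is the harsher metric and the one usually reported on MultiWOZ; track both, since slot accuracy tells you *which* slots fail.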
8) System blueprint (minimum viable path)
- Collect & clean: start with Customer Support on Twitter + MSDialog; redact PII; label “risk cues” (bereavement, disability, financial hardship). (Kaggle, arXiv, GitHub)
- Grounding: load internal policies/KB; pretrain the bot on Doc2Dial/MultiDoc2Dial patterns; then RAG over your docs. (doc2dial.github.io)
- Empathy: SFT on ESConv/PESConv; add a text-simplification post-processor for low-literacy users. (ACL archive, Hugging Face)
- Safety: add HH-RLHF-style preferences and refusal exemplars; implement crisis-keyword detection → human handoff. (GitHub)
- Evaluate: run ACUTE-Eval for empathy/helpfulness; run grounding checks on Doc2Dial tasks; pilot with FCA-style vulnerable-customer journeys. (arXiv, ACL archive, FCA)
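The "label risk cues" and "crisis keywords → human handoff" steps in the blueprint can share one rule-based layer. The cue lexicons below are illustrative only, not a vetted list; a real deployment needs clinically reviewed keyword sets, a trained classifier on top, and FCA-style review of the categories.

```python
# Illustrative cue lexicons - NOT a vetted list. Keyword matching is a
# first-pass safety net; pair it with a classifier in production.
RISK_CUES = {
    "bereavement": ["passed away", "lost my", "funeral", "widow"],
    "financial_hardship": ["can't afford", "debt", "missed payments"],
    "crisis": ["end my life", "hurt myself", "suicide"],
}

def label_risk_cues(text):
    """Return the list of cue categories whose keywords appear in text."""
    text = text.lower()
    return [cue for cue, keywords in RISK_CUES.items()
            if any(kw in text for kw in keywords)]

def route(text):
    """Map detected cues to a handling path; crisis always escalates."""
    cues = label_risk_cues(text)
    if "crisis" in cues:
        return "human_handoff_urgent"
    if cues:
        return "human_review"
    return "bot"
```

Logging which cue fired (not just the route) gives you the labels you need for the "collect & clean" step on your own traffic later.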
Caveats
- Some health counseling datasets have restrictive terms or scraping concerns. Always verify license/ToS before training. (Hugging Face)