Looking for vulnerable-customer support datasets for building a multi-turn chatbot

There are too many candidates, so let’s first ask the AI to list them.


There is no single “vulnerable-customer support” dataset. You get the best results by combining (1) customer-support dialogs, (2) empathy/mental-health counseling dialogs, (3) document-grounded support dialogs, and (4) safety/PII tooling and preference data. Sources below.

1) Customer-support and task-oriented dialog corpora

  • Customer Support on Twitter (brand↔user threads). Good for realistic tone and handoffs. Kaggle. (Kaggle)

  • MSDialog (multi-turn tech support with utterance-level intent labels). (arXiv, GitHub)

  • Schema-Guided Dialogue (SGD) (18K multi-domain goal-oriented dialogs, incl. banking). (Hugging Face)

  • Taskmaster-1/2 (Google, realistic multi-turn scripts; several customer-service domains). (arXiv, ResearchGate)

  • TicketTalk (transactional support in movie-ticketing). (ResearchGate)

  • AIRDialogue (flight-booking, long multi-turn). (Hugging Face)

  • MultiWOZ 2.4 (large multi-domain task-oriented dialog corpus; use as a proxy for slot filling, escalation, and dialog state tracking (DST)). (ACL archive)
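The corpora above ship in different formats, so the first practical step is normalizing them into one turn schema. A minimal sketch, assuming a hypothetical intermediate format (the `is_brand` field and `from_twitter_thread` helper are illustrative, not the Kaggle CSV's real columns):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Turn:
    speaker: str  # "agent" or "customer"
    text: str

@dataclass
class Dialog:
    dialog_id: str
    turns: List[Turn] = field(default_factory=list)

def from_twitter_thread(dialog_id, thread):
    # `thread`: list of {"is_brand": bool, "text": str} dicts, a hypothetical
    # shape you would produce after parsing the raw export; real columns differ.
    return Dialog(
        dialog_id=dialog_id,
        turns=[Turn("agent" if m["is_brand"] else "customer", m["text"])
               for m in thread],
    )

d = from_twitter_thread("t1", [
    {"is_brand": False, "text": "My account is locked and I'm worried."},
    {"is_brand": True, "text": "Sorry to hear that, let's get it sorted."},
])
```

One converter per corpus into this schema keeps the downstream SFT and evaluation code identical across datasets.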

2) Empathy / counseling datasets for “vulnerability-aware” behavior

  • EmpatheticDialogues and ESConv (emotional-support conversations; many papers build on ESConv). (PMC, ACL archive)

  • PESConv (persona-augmented ESConv; evaluates empathy, relevance, emotion intensity). (ACL archive)

  • CounselChat / mental-health counseling Q&A (therapist answers; several scrapes). Check licensing. (GitHub, Kaggle)

  • MedDialog-EN and MediTOD (doctor-patient; for health-risk triage patterns). Note the scraping terms on sources; review before use. (Hugging Face, par.nsf.gov, ACL archive)

  • Motivational Interviewing corpora: AnnoMI, MI-TAGS, EMMI (empathy-rich, annotated therapist behaviors). (GitHub, Kaggle, ACL archive, arXiv)

3) Document-grounded “support from policy/KB” dialogs

  • Doc2Dial and MultiDoc2Dial (agent responses grounded in public-service docs; matches enterprise KB+policy flow). (doc2dial.github.io, ACL archive, GitHub)

  • DialDoc shared tasks for open-book, multi-doc response generation baselines and metrics. (ACL archive)
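The grounded-response pattern these corpora teach can be sketched with a toy retriever; token overlap with crude plural stripping stands in for the BM25 or dense retrievers used in real DialDoc-style baselines (everything here is illustrative):

```python
import re
from collections import Counter

def tokenize(text):
    # lowercase word tokens with crude plural stripping ("refunds" -> "refund")
    return [t.rstrip("s") for t in re.findall(r"[a-z']+", text.lower())]

def best_passage(query, passages):
    """Pick the KB passage with the highest token overlap with the query;
    a toy stand-in for the retriever in a Doc2Dial-style grounded pipeline."""
    q = Counter(tokenize(query))
    def overlap(p):
        return sum(min(q[t], c) for t, c in Counter(tokenize(p)).items())
    return max(passages, key=overlap)

kb = [
    "Refunds are issued within fourteen days of a cancelled order.",
    "To reset your password, use the 'forgot password' link on the sign-in page.",
]
hit = best_passage("How long until I get my refund?", kb)
```

The retrieved passage is then fed to the generator as grounding context; faithfulness to it is exactly what the Doc2Dial metrics measure.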

4) Safety, guardrails, and preference data

  • HH-RLHF preference + red-teaming data for helpfulness/harmlessness; useful for refusal and de-escalation behavior. (GitHub, Hugging Face, arXiv)

  • Plain-language and readability resources for low-literacy users: textstat and government plain-language guides. (GitHub, GOV.UK, Service manual)

  • PII/PHI de-identification for training data: Microsoft Presidio (+NVIDIA NeMo Guardrails integration) or Phileas. (GitHub, Microsoft GitHub, NVIDIA Docs)

  • Regulatory framing for “vulnerable customers” (clear definitions and actions): UK FCA FG21/1. Use it to design escalation and “tell-us-once” patterns. (FCA)
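Before any training run, redact PII from the collected dialogs. Presidio is the production path; for intuition, a stdlib-only regex sketch covering only the most obvious patterns (these toy regexes miss names, addresses, and most real-world PII):

```python
import re

# Toy patterns only. Production redaction should use an NER-based tool such as
# Microsoft Presidio; regexes alone miss names, addresses, account numbers, etc.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "CARD":  re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(text):
    # Replace each match with a typed placeholder so dialog structure survives.
    for label, pat in PATTERNS.items():
        text = pat.sub(f"<{label}>", text)
    return text

out = redact("Reach me at jane.doe@example.com or +44 20 7946 0958.")
```

Typed placeholders (`<EMAIL>`, `<PHONE>`) are deliberately kept in the training text so the model learns that such slots exist without memorizing real values.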

5) Text simplification for accessibility

  • Datasets and ready models to simplify bot outputs: GEM/wiki_auto + ASSET/TURK; mrm8488/t5-small-finetuned-text-simplification; agentlans/flan-t5-small-simplifier. (Hugging Face)

6) Training recipes that work in practice

7) Evaluation you’ll actually need

  • Task success on support corpora (DST/slot accuracy, resolution rate) + grounding faithfulness on Doc2Dial/MultiDoc2Dial. (ACL archive)

  • Human eval for empathy and helpfulness: ACUTE-Eval; note known evaluator-group variance. (arXiv, ar5iv, ACL archive)

  • Operational empathy metrics exist in CCaaS tooling; use as inspiration, not ground truth. (Genesys Cloud Resource Center)
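The automatic half of this evaluation reduces to two simple counts. A sketch, assuming hypothetical data shapes (slot dicts per turn, and a boolean `resolved` flag that real corpora encode differently):

```python
def slot_accuracy(pred_slots, gold_slots):
    """Per-slot accuracy for one turn, comparing predicted vs. gold state
    dicts; a simplified stand-in for full DST joint-goal accuracy."""
    keys = set(gold_slots) | set(pred_slots)
    if not keys:
        return 1.0
    correct = sum(pred_slots.get(k) == gold_slots.get(k) for k in keys)
    return correct / len(keys)

def resolution_rate(dialogs):
    # `dialogs`: list of dicts with a boolean "resolved" flag, a hypothetical
    # annotation; map each corpus's own success signal onto it first.
    return sum(d["resolved"] for d in dialogs) / len(dialogs)
```

Grounding faithfulness (does the reply stay within the retrieved passage?) still needs the Doc2Dial task metrics on top of these.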

8) System blueprint (minimum viable path)

  1. Collect & clean: start with Customer Support on Twitter + MSDialog; redact PII; label “risk cues” (bereavement, disability, financial hardship). (Kaggle, arXiv, GitHub)

  2. Grounding: load internal policies/KB; pretrain the bot on Doc2Dial/MultiDoc2Dial patterns; then RAG over your docs. (doc2dial.github.io)

  3. Empathy: SFT on ESConv/PESConv; add a text-simplification post-processor for low-literacy users. (ACL archive, Hugging Face)

  4. Safety: add HH-RLHF-style preferences and refusal exemplars; implement crisis keywords→human handoff. (GitHub)

  5. Evaluate: run ACUTE-Eval for empathy/helpfulness; run grounding checks on Doc2Dial tasks; pilot with FCA-style vulnerable-customer journeys. (arXiv, ACL archive, FCA)
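Step 4's crisis-keyword handoff can be sketched as a simple router. The cue lists below are illustrative only; production routing needs clinically reviewed lexicons plus a trained classifier, not keywords alone:

```python
import re

# Illustrative cue patterns only -- NOT a clinically reviewed lexicon.
CRISIS = [r"\bsuicid\w*", r"\bself[- ]harm\b", r"\bend my life\b"]
RISK = [r"\bbereave\w*", r"\bpassed away\b", r"\bcan'?t afford\b",
        r"\bdisabilit\w*"]

def route(message):
    """Route a customer message: crisis cues -> immediate human handoff;
    vulnerability cues -> slower, simplified, extra-check flow."""
    text = message.lower()
    if any(re.search(p, text) for p in CRISIS):
        return "human_handoff"
    if any(re.search(p, text) for p in RISK):
        return "vulnerable_flow"
    return "standard_flow"
```

Check crisis cues first so a message containing both kinds of cue always escalates to a human.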

Caveats

  • Some health counseling datasets have restrictive terms or scraping concerns. Always verify license/ToS before training. (Hugging Face)