FLUX.2 Klein 9B Schematic LoRA
This project was inspired by Vision Banana, which treats vision tasks such as depth estimation, surface normal estimation, and segmentation as image editing.
I wanted to test whether a similar idea could work with FLUX.2 [klein] 9B base by using small task-specific LoRA training runs.
This repository contains six task-specific LoRAs:
- relative depth
- surface normal
- body pose
- full pose
- binary segmentation
- amodal segmentation
The outputs are RGB schematic images. The quality is not production-ready, and these LoRAs are not intended to replace dedicated CV models.
For more details about the experiment and dataset construction, see the blog post:
Files
| Task | LoRA |
|---|---|
| Relative depth | loras/flux2-klein-schematic-relative-depth-lora.safetensors |
| Surface normal | loras/flux2-klein-schematic-surface-normal-lora.safetensors |
| Body pose | loras/flux2-klein-schematic-body-pose-lora.safetensors |
| Full pose | loras/flux2-klein-schematic-full-pose-lora.safetensors |
| Binary segmentation | loras/flux2-klein-schematic-binary-segmentation-lora.safetensors |
| Amodal segmentation | loras/flux2-klein-schematic-amodal-segmentation-lora.safetensors |
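The file names in the table follow a single pattern, so the path for a task can be derived from its name. The following is a minimal sketch, assuming the files are laid out exactly as listed in the table; `TASKS` and `lora_path` are hypothetical helper names for illustration and are not part of this repository.

```python
# Hypothetical helper: TASKS and lora_path are illustrative names,
# not something shipped with this repository.
TASKS = (
    "relative depth",
    "surface normal",
    "body pose",
    "full pose",
    "binary segmentation",
    "amodal segmentation",
)


def lora_path(task: str) -> str:
    """Return the relative LoRA path for a task, following the table's naming pattern."""
    key = " ".join(task.lower().split())  # normalize case and repeated whitespace
    if key not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    slug = key.replace(" ", "-")
    return f"loras/flux2-klein-schematic-{slug}-lora.safetensors"


if __name__ == "__main__":
    for name in TASKS:
        print(lora_path(name))
```

Rejecting unknown task names early avoids silently building a path to a file that does not exist.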
Examples
[Example images: Relative Depth, Surface Normal, Body Pose, Full Pose, Binary Segmentation, Amodal Segmentation]
Usage
Load the LoRA for the task you want on top of FLUX.2 [klein] 9B base and run it in an image-editing workflow.
These LoRAs were trained on the base model. They may not behave correctly with the distilled Klein models unless you also use an appropriate base-to-turbo / base-to-distilled compatibility LoRA.
Prompt Templates
Use simple command-style prompts.
Relative Depth
Generate a relative depth map of the input image.
Surface Normal
Generate a surface normal map of the input image.
Body Pose
Generate a body pose map of all visible people in the input image.
Full Pose
Generate a full pose map of all visible people in the input image.
Binary Segmentation
Generate a binary segmentation mask of [target] in the input image.
Amodal Segmentation
Generate an amodal segmentation mask of [target] in the input image.
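The templates above can be filled in programmatically. The following is a rough sketch: the prompt strings are copied from this section, while `PROMPTS` and `build_prompt` are hypothetical names introduced only for illustration. It assumes `[target]` is a placeholder to be replaced with a short description of the object to segment.

```python
from typing import Optional

# Prompt strings are taken from the templates in this section.
# PROMPTS and build_prompt are hypothetical names used for illustration.
PROMPTS = {
    "relative depth": "Generate a relative depth map of the input image.",
    "surface normal": "Generate a surface normal map of the input image.",
    "body pose": "Generate a body pose map of all visible people in the input image.",
    "full pose": "Generate a full pose map of all visible people in the input image.",
    "binary segmentation": "Generate a binary segmentation mask of [target] in the input image.",
    "amodal segmentation": "Generate an amodal segmentation mask of [target] in the input image.",
}


def build_prompt(task: str, target: Optional[str] = None) -> str:
    """Return the prompt for a task, substituting [target] where the template needs it."""
    template = PROMPTS[task.lower()]
    if "[target]" not in template:
        return template  # depth, normal, and pose prompts take no target
    if target is None or not target.strip():
        raise ValueError(f"task {task!r} requires a target description")
    return template.replace("[target]", target.strip())


if __name__ == "__main__":
    print(build_prompt("relative depth"))
    print(build_prompt("binary segmentation", "the red car"))
```

Keeping the wording identical to the templates matters here, since the LoRAs were trained on simple command-style prompts.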
ComfyUI Workflow
Notes
- This is not a drop-in replacement for dedicated preprocessors such as DWPose, Depth Anything, Lotus-2, or SAM.
- Pose is the least stable task. Small errors in color or skeleton topology are visually obvious.
- Segmentation can fail when the target description is ambiguous or when multiple similar objects are present.
- Amodal segmentation is especially experimental because the model must infer occluded parts.
- The dataset is small, so generalization is limited and results may vary across images.
Training Setup
- Base model: black-forest-labs/FLUX.2-klein-base-9B
- Training tool: ai-toolkit
- LoRA rank: linear 32 / conv 16
- Optimizer: adamw8bit
- Learning rate: 5e-5
- Batch size: 4
- Dataset size: 1920 image pairs across all tasks
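To put the batch size and dataset size in perspective, the sketch below records the listed settings and computes how many optimizer steps one pass over the data would take. The field names are hypothetical and are not ai-toolkit configuration keys. The step count assumes no gradient accumulation, which is not stated here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Values from the Training Setup list; field names are illustrative only."""

    base_model: str = "black-forest-labs/FLUX.2-klein-base-9B"
    linear_rank: int = 32
    conv_rank: int = 16
    optimizer: str = "adamw8bit"
    learning_rate: float = 5e-5
    batch_size: int = 4
    dataset_pairs: int = 1920

    def steps_per_epoch(self) -> int:
        # Assumption: one optimizer step per batch (no gradient accumulation).
        return math.ceil(self.dataset_pairs / self.batch_size)


if __name__ == "__main__":
    setup = TrainingSetup()
    print(setup.steps_per_epoch())  # 1920 / 4 = 480
```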
Dataset
The training dataset is available here:
License
Please follow the license and usage terms of the base model, black-forest-labs/FLUX.2-klein-base-9B.
This repository uses flux-non-commercial-license-v2.1.





