Adding standalone training script and update

This commit is contained in:
2025-12-13 09:28:24 +02:00
parent 908e9a5b82
commit aec0fbf83c
8 changed files with 1434 additions and 290 deletions


@@ -10,7 +10,7 @@ This document describes the implementation of 16-bit grayscale TIFF support for
✅ Converts to float32 [0-1] (NO uint8 conversion)
✅ Replicates grayscale → RGB (3 channels)
**Inference**: Passes numpy arrays directly to YOLO (no file I/O)
**Training**: Creates float32 3-channel TIFF dataset cache
**Training**: On-the-fly float32 conversion (NO disk caching)
✅ Uses Ultralytics YOLOv8/v11 models
✅ Works with segmentation models
✅ No data loss, no double normalization, no silent clipping
@@ -65,18 +65,18 @@ For 16-bit TIFF files during inference:
### For Training (train)
During training, YOLO's internal dataloader loads images from disk, so we create a cached 3-channel dataset:
Training now uses a custom dataset loader with on-the-fly conversion (NO disk caching):
1. **Detect**: Check if dataset contains 16-bit TIFF files
2. **Create Cache**: Build float32 3-channel TIFF dataset in `data/datasets/_float32_cache/`
3. **Convert Each Image**:
- Load 16-bit TIFF using `tifffile`
- Normalize to float32 [0-1]
- Replicate to 3 channels
- Save as float32 TIFF (preserves precision)
4. **Copy Labels**: Copy label files unchanged
5. **Generate data.yaml**: Points to cached 3-channel dataset
6. **Train**: YOLO trains on float32 3-channel TIFFs
1. **Custom Dataset**: Uses `Float32Dataset` class that extends Ultralytics' `YOLODataset`
2. **Load On-The-Fly**: Each image is loaded and converted during training:
- Detect 16-bit TIFF files automatically
- Load with `tifffile` (preserves uint16)
- Convert to float32 [0-1] in memory
- Replicate to 3 channels (RGB)
3. **No Disk Cache**: Conversion happens in memory, no files written
4. **Train**: YOLO trains on float32 [0-1] RGB arrays directly
See [`src/utils/train_ultralytics_float.py`](../src/utils/train_ultralytics_float.py) for implementation.
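The per-image conversion described in the steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `train_ultralytics_float.py`: the helper name `to_float32_rgb` is hypothetical, and the example assumes the image has already been read into a NumPy array (for instance with `tifffile.imread`). Scaling is chosen from the array's dtype rather than its maximum value, which is one way to avoid normalizing the same data twice.

```python
import numpy as np


def to_float32_rgb(img: np.ndarray) -> np.ndarray:
    """Hypothetical helper: return a float32 HxWx3 array scaled to [0, 1]."""
    if img.dtype == np.uint16:
        # 16-bit data: divide by the full uint16 range, no uint8 step in between
        out = img.astype(np.float32) / 65535.0
    elif img.dtype == np.uint8:
        out = img.astype(np.float32) / 255.0
    else:
        # Assumption: floating-point input is already in [0, 1]
        out = img.astype(np.float32)

    if out.ndim == 2:
        # Grayscale HxW -> HxWx3 by replicating the single channel
        out = np.repeat(out[..., None], 3, axis=2)
    elif out.ndim == 3 and out.shape[2] == 1:
        # HxWx1 -> HxWx3
        out = np.repeat(out, 3, axis=2)
    return out


# Tiny synthetic 16-bit "image" standing in for a loaded TIFF
raw = np.array([[0, 32768], [65535, 1000]], dtype=np.uint16)
rgb = to_float32_rgb(raw)
print(rgb.shape, rgb.dtype)
```

Because the dtype decides the scaling, passing an already converted float32 array through the helper again leaves it unchanged.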
### No Data Loss!
@@ -214,9 +214,9 @@ For a 2048×2048 single-channel image:
| Float32 3-channel | 48 MB | ~48 MB | Training cache |
| uint8 RGB (old) | 12 MB | ~12 MB | OLD approach with data loss |
The float32 approach uses ~4× more memory and disk space than uint8 but preserves **all information**.
The float32 approach uses ~4× the memory of uint8 during training (48 MB vs. 12 MB for a 2048×2048 3-channel image) but preserves **all information**.
**Cache Directory**: Training creates cached datasets in `data/datasets/_float32_cache/<dataset>_<hash>/`
**No Disk Cache**: The new on-the-fly approach eliminates the need for cached datasets on disk.
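The sizes in the table follow from height × width × channels × bytes per sample. A small sketch of that arithmetic is given here for checking; the function name `image_bytes` is illustrative only, and "MB" is taken to mean 1024 × 1024 bytes, which is the reading that reproduces the table's figures.

```python
def image_bytes(height: int, width: int, channels: int, bytes_per_sample: int) -> int:
    """Uncompressed in-memory size of an image array, in bytes."""
    return height * width * channels * bytes_per_sample


MIB = 1024 * 1024  # assumption: the table's "MB" means 2**20 bytes

float32_rgb = image_bytes(2048, 2048, 3, 4)  # float32 = 4 bytes per sample
uint8_rgb = image_bytes(2048, 2048, 3, 1)    # uint8 = 1 byte per sample

print(float32_rgb / MIB)        # 48.0
print(uint8_rgb / MIB)          # 12.0
print(float32_rgb / uint8_rgb)  # 4.0
```

For arrays of the same shape, float32 therefore needs four times the bytes of uint8.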
### Why Direct Numpy Array?
@@ -233,50 +233,31 @@ Ultralytics YOLO supports various input types:
- PIL Images: `PIL.Image`
- Torch tensors: `torch.Tensor`
## For Training with Custom Dataset
## Training with Float32 Dataset Loader
If you need to train YOLO on 16-bit TIFF images, you should create a custom dataset loader similar to the example provided by the user:
The system now includes a custom dataset loader for 16-bit TIFF training:
```python
import torch
import numpy as np
import tifffile as tiff
from pathlib import Path
from src.utils.train_ultralytics_float import train_with_float32_loader
class FloatYoloSegDataset(torch.utils.data.Dataset):
    def __init__(self, img_dir, label_dir, img_size=640):
        self.img_paths = sorted(Path(img_dir).glob('*'))
        self.label_dir = Path(label_dir)
        self.img_size = img_size
    def __len__(self):
        return len(self.img_paths)
    def __getitem__(self, idx):
        img_path = self.img_paths[idx]
        # Load 16-bit TIFF
        img = tiff.imread(img_path)
        # Convert to float32 [0-1]
        img = img.astype(np.float32)
        if img.max() > 1.5:  # Assume 16-bit if max > 1.5
            img /= 65535.0
        # Grayscale → RGB
        if img.ndim == 2:
            img = np.repeat(img[..., None], 3, axis=2)
        # HWC → CHW for PyTorch
        img = torch.from_numpy(img).permute(2, 0, 1).contiguous()
        # Load labels...
        # (implementation depends on your label format)
        return img, labels
# Train with on-the-fly float32 conversion
results = train_with_float32_loader(
model_path="yolov8s-seg.pt",
data_yaml="data/my_dataset/data.yaml",
epochs=100,
batch=16,
imgsz=640,
)
```
Then use this dataset with Ultralytics training API or custom training loop.
The `Float32Dataset` class automatically:
- Detects 16-bit TIFF files
- Loads with `tifffile` (not PIL/cv2)
- Converts to float32 [0-1] on-the-fly
- Replicates to 3 channels
- Integrates seamlessly with Ultralytics training pipeline
This is used automatically by the training tab in the GUI.
## Installation