Adding standalone training script and update

2025-12-13 09:28:24 +02:00
parent 908e9a5b82
commit aec0fbf83c
8 changed files with 1434 additions and 290 deletions
--- a/docs/16BIT_TIFF_SUPPORT.md
+++ b/docs/16BIT_TIFF_SUPPORT.md
@@ -10,7 +10,7 @@ This document describes the implementation of 16-bit grayscale TIFF support for
 ✅ Converts to float32 [0-1] (NO uint8 conversion)
 ✅ Replicates grayscale → RGB (3 channels)
 ✅ **Inference**: Passes numpy arrays directly to YOLO (no file I/O)
-✅ **Training**: Creates float32 3-channel TIFF dataset cache
+✅ **Training**: On-the-fly float32 conversion (NO disk caching)
 ✅ Uses Ultralytics YOLOv8/v11 models
 ✅ Works with segmentation models
 ✅ No data loss, no double normalization, no silent clipping
@@ -65,18 +65,18 @@ For 16-bit TIFF files during inference:

 ### For Training (train)

-During training, YOLO's internal dataloader loads images from disk, so we create a cached 3-channel dataset:
+Training now uses a custom dataset loader with on-the-fly conversion (NO disk caching):

-1. **Detect**: Check if dataset contains 16-bit TIFF files
-2. **Create Cache**: Build float32 3-channel TIFF dataset in `data/datasets/_float32_cache/`
-3. **Convert Each Image**:
-   - Load 16-bit TIFF using `tifffile`
-   - Normalize to float32 [0-1]
-   - Replicate to 3 channels
-   - Save as float32 TIFF (preserves precision)
-4. **Copy Labels**: Copy label files unchanged
-5. **Generate data.yaml**: Points to cached 3-channel dataset
-6. **Train**: YOLO trains on float32 3-channel TIFFs
+1. **Custom Dataset**: Uses the `Float32Dataset` class, which extends Ultralytics' `YOLODataset`
+2. **Load On-The-Fly**: Each image is loaded and converted during training:
+   - Detect 16-bit TIFF files automatically
+   - Load with `tifffile` (preserves uint16)
+   - Convert to float32 [0-1] in memory
+   - Replicate to 3 channels (RGB)
+3. **No Disk Cache**: Conversion happens in memory, no files written
+4. **Train**: YOLO trains on float32 [0-1] RGB arrays directly
+
+See [`src/utils/train_ultralytics_float.py`](../src/utils/train_ultralytics_float.py) for implementation.
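
The in-memory conversion in step 2 can be sketched as follows. This is a minimal illustration, not the code in `train_ultralytics_float.py`: the helper name `to_float32_rgb` is hypothetical, scaling by 65535 is an assumption about how the [0-1] range is obtained, and a small synthetic array stands in for the result of `tifffile.imread(path)`.

```python
import numpy as np


def to_float32_rgb(img: np.ndarray) -> np.ndarray:
    """Convert a uint16 grayscale image (H, W) to float32 RGB (H, W, 3) in [0, 1].

    Hypothetical helper for illustration; assumes full-range 16-bit scaling.
    """
    if img.dtype != np.uint16:
        raise TypeError(f"expected uint16 input, got {img.dtype}")
    if img.ndim != 2:
        raise ValueError(f"expected a 2-D grayscale image, got shape {img.shape}")

    # Divide in float32 so no intermediate uint8 conversion or clipping occurs.
    scaled = img.astype(np.float32) / 65535.0

    # Replicate the single channel three times along a new last axis (HWC layout).
    return np.repeat(scaled[..., None], 3, axis=2)


# Synthetic stand-in for a 16-bit TIFF loaded with tifffile.imread(path).
raw = np.array([[0, 32768], [65535, 1]], dtype=np.uint16)
rgb = to_float32_rgb(raw)
print(rgb.shape, rgb.dtype)
```

Nothing is written to disk here; the converted array exists only for the lifetime of the batch that uses it.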

 ### No Data Loss!

@@ -214,9 +214,9 @@ For a 2048×2048 single-channel image:
 | Float32 3-channel | 48 MB | ~48 MB | Training cache |
 | uint8 RGB (old) | 12 MB | ~12 MB | OLD approach with data loss |

-The float32 approach uses ~4× more memory and disk space than uint8 but preserves **all information**.
+The float32 approach uses ~4× more memory than uint8 during training (48 MB vs 12 MB per image in the table above) but preserves **all information**.
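
The per-image sizes in the table follow from height × width × channels × bytes per element. A minimal sketch of that arithmetic, where the helper name `array_mib` is illustrative and sizes are computed in MiB (2^20 bytes), which is what the table's "MB" figures correspond to:

```python
def array_mib(height: int, width: int, channels: int, bytes_per_element: int) -> float:
    """Return the in-memory size of a dense image array in MiB."""
    return height * width * channels * bytes_per_element / 2**20


# 2048x2048 image, 3 channels, float32 (4 bytes per element)
float32_rgb_mib = array_mib(2048, 2048, 3, 4)

# Same image as uint8 RGB (1 byte per element)
uint8_rgb_mib = array_mib(2048, 2048, 3, 1)

print(float32_rgb_mib, uint8_rgb_mib)  # prints: 48.0 12.0
```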

-**Cache Directory**: Training creates cached datasets in `data/datasets/_float32_cache/<dataset>_<hash>/`
+**No Disk Cache**: The new on-the-fly approach eliminates the need for cached datasets on disk.

 ### Why Direct Numpy Array?

@@ -233,50 +233,31 @@ Ultralytics YOLO supports various input types:
 - PIL Images: `PIL.Image`
 - Torch tensors: `torch.Tensor`

-## For Training with Custom Dataset
+## Training with Float32 Dataset Loader

-If you need to train YOLO on 16-bit TIFF images, you should create a custom dataset loader similar to the example provided by the user:
+The system now includes a custom dataset loader for 16-bit TIFF training:

 ```python
-import torch
-import numpy as np
-import tifffile as tiff
-from pathlib import Path
+from src.utils.train_ultralytics_float import train_with_float32_loader

-class FloatYoloSegDataset(torch.utils.data.Dataset):
-    def __init__(self, img_dir, label_dir, img_size=640):
-        self.img_paths = sorted(Path(img_dir).glob('*'))
-        self.label_dir = Path(label_dir)
-        self.img_size = img_size
-
-    def __len__(self):
-        return len(self.img_paths)
-
-    def __getitem__(self, idx):
-        img_path = self.img_paths[idx]
-        
-        # Load 16-bit TIFF
-        img = tiff.imread(img_path)
-        
-        # Convert to float32 [0-1]
-        img = img.astype(np.float32)
-        if img.max() > 1.5:  # Assume 16-bit if max > 1.5
-            img /= 65535.0
-        
-        # Grayscale → RGB
-        if img.ndim == 2:
-            img = np.repeat(img[..., None], 3, axis=2)
-        
-        # HWC → CHW for PyTorch
-        img = torch.from_numpy(img).permute(2, 0, 1).contiguous()
-        
-        # Load labels...
-        # (implementation depends on your label format)
-        
-        return img, labels
+# Train with on-the-fly float32 conversion
+results = train_with_float32_loader(
+    model_path="yolov8s-seg.pt",
+    data_yaml="data/my_dataset/data.yaml",
+    epochs=100,
+    batch=16,
+    imgsz=640,
+)
 ```

-Then use this dataset with Ultralytics training API or custom training loop.
+The `Float32Dataset` class automatically:
+- Detects 16-bit TIFF files
+- Loads with `tifffile` (not PIL/cv2)
+- Converts to float32 [0-1] on-the-fly
+- Replicates to 3 channels
+- Integrates with the Ultralytics training pipeline
+
+The GUI's training tab uses this loader automatically.

 ## Installation