Bug fix

2025-12-13 01:18:16 +02:00
parent edcd448a61
commit 908e9a5b82
4 changed files with 2102 additions and 21 deletions
--- a/docs/16BIT_TIFF_SUPPORT.md
+++ b/docs/16BIT_TIFF_SUPPORT.md
@@ -2,17 +2,18 @@

 ## Overview

-This document describes the implementation of 16-bit grayscale TIFF support for YOLO object detection. The system properly loads 16-bit TIFF images, normalizes them to float32 [0-1], and passes them directly to YOLO **without uint8 conversion** to preserve the full dynamic range and avoid data loss.
+This document describes the implementation of 16-bit grayscale TIFF support for YOLO object detection. The system properly loads 16-bit TIFF images, normalizes them to float32 [0-1], and handles them appropriately for both **inference** and **training** **without uint8 conversion** to preserve the full dynamic range and avoid data loss.

 ## Key Features

-✅ Reads 16-bit or float32 images using tifffile  
-✅ Converts to float32 [0-1] (NO uint8 conversion)  
-✅ Replicates grayscale → RGB (3 channels)  
-✅ Passes numpy arrays directly to YOLO (no file I/O)  
-✅ Uses Ultralytics YOLOv8/v11 models  
-✅ Works with segmentation models  
-✅ No data loss, no double normalization, no silent clipping  
+✅ Reads 16-bit or float32 images using tifffile
+✅ Converts to float32 [0-1] (NO uint8 conversion)
+✅ Replicates grayscale → RGB (3 channels)
+✅ **Inference**: Passes numpy arrays directly to YOLO (no file I/O)
+✅ **Training**: Creates float32 3-channel TIFF dataset cache
+✅ Uses Ultralytics YOLOv8/v11 models
+✅ Works with segmentation models
+✅ No data loss, no double normalization, no silent clipping

 ## Changes Made

@@ -46,7 +47,9 @@ Enhanced [`YOLOWrapper._prepare_source()`](../src/model/yolo_wrapper.py:231) to:

 ## Processing Pipeline

-For 16-bit TIFF files:
+### For Inference (predict)
+
+For 16-bit TIFF files during inference:

 1. **Load**: File loaded using `tifffile` → preserves 16-bit uint16 data
 2. **Normalize**: Convert to float32 and scale to [0, 1]
@@ -60,12 +63,28 @@ For 16-bit TIFF files:
 4. **Pass to YOLO**: Return float32 array directly (no uint8, no file I/O)
 5. **Inference**: YOLO processes the float32 [0-1] RGB array

+### For Training (train)
+
+During training, YOLO's internal dataloader loads images from disk, so we create a cached 3-channel dataset:
+
+1. **Detect**: Check if dataset contains 16-bit TIFF files
+2. **Create Cache**: Build float32 3-channel TIFF dataset in `data/datasets/_float32_cache/`
+3. **Convert Each Image**:
+   - Load 16-bit TIFF using `tifffile`
+   - Normalize to float32 [0-1]
+   - Replicate to 3 channels
+   - Save as float32 TIFF (preserves precision)
+4. **Copy Labels**: Copy label files unchanged
+5. **Generate data.yaml**: Points to cached 3-channel dataset
+6. **Train**: YOLO trains on float32 3-channel TIFFs
+
 ### No Data Loss!

-Unlike the previous approach that converted to uint8 (256 levels), the new implementation:
+Unlike approaches that convert to uint8 (256 levels), this implementation:
 - Preserves full 16-bit dynamic range (65536 levels)
 - Maintains precision with float32 representation
- Passes data directly without intermediate file conversions
+- For inference: passes data directly without file conversions
+- For training: uses float32 TIFFs (not uint8 PNGs)

 ## Usage

@@ -188,14 +207,16 @@ This test shows the old behavior (uint8 conversion) - kept for comparison.

 For a 2048×2048 single-channel image:

-| Format | Memory | Notes |
-|--------|--------|-------|
-| Original 16-bit | 8 MB | uint16 grayscale |
-| Float32 grayscale | 16 MB | Intermediate |
-| Float32 RGB | 48 MB | Final (3 channels) |
-| uint8 RGB (old) | 12 MB | OLD approach with data loss |
+| Format | Memory | Disk Space | Notes |
+|--------|--------|------------|-------|
+| Original 16-bit | 8 MB | ~8 MB | uint16 grayscale TIFF |
+| Float32 grayscale | 16 MB | - | Intermediate |
+| Float32 3-channel | 48 MB | ~48 MB | Training cache |
+| uint8 RGB (old) | 12 MB | ~12 MB | OLD approach with data loss |

-The float32 approach uses ~4× more memory than uint8 but preserves **all information**.
+The float32 approach uses ~4× more memory and disk space than uint8 but preserves **all information**.
+
+**Cache Directory**: Training creates cached datasets in `data/datasets/_float32_cache/<dataset>_<hash>/`

 ### Why Direct Numpy Array?