# 16-bit TIFF Support for YOLO Object Detection

## Overview

This document describes the implementation of 16-bit grayscale TIFF support for YOLO object detection. The system properly loads 16-bit TIFF images, normalizes them to float32 [0-1], and handles them appropriately for both **inference** and **training** **without uint8 conversion** to preserve the full dynamic range and avoid data loss.

## Key Features

- ✅ Reads 16-bit or float32 images using tifffile
- ✅ Converts to float32 [0-1] (NO uint8 conversion)
- ✅ Replicates grayscale → RGB (3 channels)
- ✅ **Inference**: Passes numpy arrays directly to YOLO (no file I/O)
- ✅ **Training**: On-the-fly float32 conversion (NO disk caching)
- ✅ Uses Ultralytics YOLOv8/v11 models
- ✅ Works with segmentation models
- ✅ No data loss, no double normalization, no silent clipping

## Changes Made

### 1. Dependencies ([`requirements.txt`](../requirements.txt:14))

- Added `tifffile>=2023.0.0` for reliable 16-bit TIFF loading

### 2. Image Loading ([`src/utils/image.py`](../src/utils/image.py))

#### Enhanced TIFF Loading

- Modified [`Image._load()`](../src/utils/image.py:87) to use `tifffile` for `.tif` and `.tiff` files
- Preserves original 16-bit data type during loading
- Properly handles both grayscale and multi-channel TIFF files

#### New Normalization Method

Added [`Image.to_normalized_float32()`](../src/utils/image.py:280) method that:

- Converts image data to `float32`
- Properly scales values to [0, 1] range:
  - **16-bit images**: divides by 65535 (full dynamic range)
  - 8-bit images: divides by 255
  - Float images: clips to [0, 1]
- Handles various data types automatically

### 3. YOLO Preprocessing ([`src/model/yolo_wrapper.py`](../src/model/yolo_wrapper.py))

Enhanced [`YOLOWrapper._prepare_source()`](../src/model/yolo_wrapper.py:231) to:

1. Detect 16-bit TIFF files automatically
2. Load and normalize to float32 [0-1] using the new method
3. Replicate grayscale to RGB (3 channels)
4. **Return numpy array directly** (NO file saving, NO uint8 conversion)
5. Pass float32 array directly to YOLO for inference

## Processing Pipeline

### For Inference (predict)

For 16-bit TIFF files during inference:

1. **Load**: File loaded using `tifffile` → preserves 16-bit uint16 data
2. **Normalize**: Convert to float32 and scale to [0, 1]
   ```python
   float_data = uint16_data.astype(np.float32) / 65535.0
   ```
3. **RGB Conversion**: Replicate grayscale to 3 channels
   ```python
   rgb_float = np.stack([float_data] * 3, axis=-1)
   ```
4. **Pass to YOLO**: Return float32 array directly (no uint8, no file I/O)
5. **Inference**: YOLO processes the float32 [0-1] RGB array

### For Training (train)

Training now uses a custom dataset loader with on-the-fly conversion (NO disk caching):

1. **Custom Dataset**: Uses `Float32Dataset` class that extends Ultralytics' `YOLODataset`
2. **Load On-The-Fly**: Each image is loaded and converted during training:
   - Detect 16-bit TIFF files automatically
   - Load with `tifffile` (preserves uint16)
   - Convert to float32 [0-1] in memory
   - Replicate to 3 channels (RGB)
3. **No Disk Cache**: Conversion happens in memory, no files written
4. **Train**: YOLO trains on float32 [0-1] RGB arrays directly

See [`src/utils/train_ultralytics_float.py`](../src/utils/train_ultralytics_float.py) for implementation.

### No Data Loss!
Unlike approaches that convert to uint8 (256 levels), this implementation:

- Preserves full 16-bit dynamic range (65536 levels)
- Maintains precision with float32 representation
- For inference: passes data directly without file conversions
- For training: converts to float32 in memory on the fly (no uint8 PNGs, no cached files)

## Usage

### Basic Image Loading

```python
from src.utils.image import Image

# Load a 16-bit TIFF file
img = Image("path/to/16bit_image.tif")

# Get normalized float32 data [0-1]
normalized = img.to_normalized_float32()  # Shape: (H, W), dtype: float32

# Original data is preserved
original = img.data  # Still uint16
```

### YOLO Inference

The preprocessing is automatic - just use YOLO as normal:

```python
from src.model.yolo_wrapper import YOLOWrapper

# Initialize model
yolo = YOLOWrapper("yolov8s-seg.pt")
yolo.load_model()

# Perform inference on 16-bit TIFF
# The image will be automatically normalized and passed as float32 [0-1]
detections = yolo.predict("path/to/16bit_image.tif", conf=0.25)
```

### With InferenceEngine

```python
from src.model.inference import InferenceEngine
from src.database.db_manager import DatabaseManager

# Setup
db = DatabaseManager("database.db")
engine = InferenceEngine("model.pt", db, model_id=1)

# Detect objects in 16-bit TIFF
result = engine.detect_single(
    image_path="path/to/16bit_image.tif",
    relative_path="images/16bit_image.tif",
    conf=0.25
)
```

## Testing

Three test scripts are provided:

### 1. Image Loading Test

```bash
./venv/bin/python tests/test_16bit_tiff_loading.py
```

Tests:

- Loading 16-bit TIFF files with tifffile
- Normalization to float32 [0-1]
- Data type and value range verification

### 2. Float32 Passthrough Test (Most Important!)
```bash
./venv/bin/python tests/test_yolo_16bit_float32.py
```

Tests:

- YOLO preprocessing returns numpy array (not file path)
- Data is float32 [0-1] (not uint8)
- No quantization to 256 levels (proves no uint8 conversion)

Sample output:

```
✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
  Shape: (200, 200, 3)
  Dtype: float32
  Min value: 0.000000
  Max value: 1.000000
  Unique values: 399

✓ SUCCESS: Data has 399 unique values (> 256)
  This confirms NO uint8 quantization occurred!
```

### 3. Legacy Test (Shows Old Behavior)

```bash
./venv/bin/python tests/test_yolo_16bit_preprocessing.py
```

This test shows the old behavior (uint8 conversion) - kept for comparison.

## Benefits

1. **No Data Loss**: Preserves full 16-bit dynamic range (65536 levels vs 256)
2. **High Precision**: Float32 maintains fine-grained intensity differences
3. **Automatic Processing**: No manual preprocessing needed
4. **YOLO Compatible**: Ultralytics YOLO accepts float32 [0-1] arrays
5. **Performance**: No intermediate file I/O for 16-bit TIFFs
6. **Backwards Compatible**: Regular images (8-bit PNG, JPEG, etc.) still work as before

## Technical Notes

### Float32 vs uint8

**With uint8 conversion (OLD - BAD):**

- 16-bit (65536 levels) → uint8 (256 levels) = **about 99.6% of intensity levels lost**
- Fine intensity differences are lost
- Quantization artifacts

**With float32 [0-1] (NEW - GOOD):**

- 16-bit (65536 levels) → float32 = **no levels lost** (all 65536 values remain distinct after scaling)
- Full dynamic range preserved
- Smooth gradients maintained

### Memory Considerations

For a 2048×2048 single-channel image:

| Format | Memory | Disk Space | Notes |
|--------|--------|------------|-------|
| Original 16-bit | 8 MB | ~8 MB | uint16 grayscale TIFF |
| Float32 grayscale | 16 MB | - | Intermediate |
| Float32 3-channel | 48 MB | - | In memory only (no disk cache) |
| uint8 RGB (old) | 12 MB | ~12 MB | OLD approach with data loss |

The float32 approach uses 4× more memory than uint8 during training (48 MB vs 12 MB per image) but preserves **all information**.
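The in-memory figures for a 2048×2048 image follow from element count times bytes per element. A minimal sketch of that arithmetic, using only the standard library; the helper `array_mib` and the `BYTES_PER_ELEMENT` table are illustrative names, not part of the project, and sizes are reported in MiB (2^20 bytes):

```python
import math

# Bytes per element for the dtypes discussed in this document (illustrative table).
BYTES_PER_ELEMENT = {"uint8": 1, "uint16": 2, "float32": 4}


def array_mib(shape, dtype):
    """Return the raw in-memory size, in MiB, of an array with this shape and dtype."""
    return math.prod(shape) * BYTES_PER_ELEMENT[dtype] / 2**20


h, w = 2048, 2048
sizes = {
    "uint16 grayscale": array_mib((h, w), "uint16"),
    "float32 grayscale": array_mib((h, w), "float32"),
    "float32 3-channel": array_mib((h, w, 3), "float32"),
    "uint8 RGB": array_mib((h, w, 3), "uint8"),
}

for name, mib in sizes.items():
    print(f"{name}: {mib:.0f} MiB")
# uint16 grayscale: 8 MiB
# float32 grayscale: 16 MiB
# float32 3-channel: 48 MiB
# uint8 RGB: 12 MiB
```

These are per-image figures for the raw pixel buffer only; a training batch multiplies them by the batch size, and any augmentation copies add to that.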
**No Disk Cache**: The new on-the-fly approach eliminates the need for cached datasets on disk.

### Why Direct Numpy Array?

Passing numpy arrays directly to YOLO (instead of saving to file):

1. **Faster**: No disk I/O overhead
2. **No Quantization**: Avoids PNG/JPEG quantization
3. **Memory Efficient**: Single copy in memory
4. **Cleaner**: No temp file management

Ultralytics YOLO supports various input types:

- File paths (str): `"image.jpg"`
- Numpy arrays: `np.ndarray` ← **we use this**
- PIL Images: `PIL.Image`
- Torch tensors: `torch.Tensor`

## Training with Float32 Dataset Loader

The system now includes a custom dataset loader for 16-bit TIFF training:

```python
from src.utils.train_ultralytics_float import train_with_float32_loader

# Train with on-the-fly float32 conversion
results = train_with_float32_loader(
    model_path="yolov8s-seg.pt",
    data_yaml="data/my_dataset/data.yaml",
    epochs=100,
    batch=16,
    imgsz=640,
)
```

The `Float32Dataset` class automatically:

- Detects 16-bit TIFF files
- Loads with `tifffile` (not PIL/cv2)
- Converts to float32 [0-1] on-the-fly
- Replicates to 3 channels
- Integrates seamlessly with Ultralytics training pipeline

This is used automatically by the training tab in the GUI.

## Installation

Install the updated dependencies:

```bash
./venv/bin/pip install -r requirements.txt
```

Or install tifffile directly (the quotes keep the shell from treating `>=` as an output redirect):

```bash
./venv/bin/pip install "tifffile>=2023.0.0"
```

## Example Test Output

```
=== Testing Float32 Passthrough (NO uint8) ===

Created test 16-bit TIFF: /tmp/tmpdt5hm0ab.tif
  Shape: (200, 200)
  Dtype: uint16
  Min value: 0
  Max value: 65535

Preprocessing result:
  Prepared source type:
✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
  Shape: (200, 200, 3)
  Dtype: float32
  Min value: 0.000000
  Max value: 1.000000
  Mean value: 0.499992
  Unique values: 399

✓ SUCCESS: Data has 399 unique values (> 256)
  This confirms NO uint8 quantization occurred!

✓ All float32 passthrough tests passed!