2025-12-13 00:32:32 +02:00
# 16-bit TIFF Support for YOLO Object Detection
## Overview
This document describes the implementation of 16-bit grayscale TIFF support for YOLO object detection. The system loads 16-bit TIFF images, normalizes them to float32 [0-1], and passes them to both **inference** and **training** **without uint8 conversion**, preserving the full dynamic range and avoiding data loss.
## Key Features
- ✅ Reads 16-bit or float32 images using tifffile
- ✅ Converts to float32 [0-1] (NO uint8 conversion)
- ✅ Replicates grayscale → RGB (3 channels)
- ✅ **Inference**: Passes numpy arrays directly to YOLO (no file I/O)
- ✅ **Training**: On-the-fly float32 conversion (NO disk caching)
- ✅ Uses Ultralytics YOLOv8/v11 models
- ✅ Works with segmentation models
- ✅ No data loss, no double normalization, no silent clipping
## Changes Made
### 1. Dependencies ([`requirements.txt`](../requirements.txt:14))
- Added `tifffile>=2023.0.0` for reliable 16-bit TIFF loading
### 2. Image Loading ([`src/utils/image.py`](../src/utils/image.py))
#### Enhanced TIFF Loading
- Modified [`Image._load()`](../src/utils/image.py:87) to use `tifffile` for `.tif` and `.tiff` files
- Preserves original 16-bit data type during loading
- Properly handles both grayscale and multi-channel TIFF files
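The extension-based dispatch can be sketched as follows. This is only an illustration under assumptions, not the actual `Image._load()` code: `is_tiff_path` and `load_pixels` are hypothetical names, and the case-insensitive suffix match is an assumption.

```python
from pathlib import Path

TIFF_SUFFIXES = {".tif", ".tiff"}  # extensions routed to tifffile


def is_tiff_path(path):
    """Return True when the extension marks a TIFF (case-insensitive; an assumption)."""
    return Path(path).suffix.lower() in TIFF_SUFFIXES


def load_pixels(path):
    """Hypothetical loader: TIFFs go through tifffile so uint16 data keeps its dtype."""
    if is_tiff_path(path):
        import tifffile  # imported lazily so other code paths do not require it

        return tifffile.imread(path)  # numpy array in the file's own dtype
    raise NotImplementedError("non-TIFF formats would use the existing loader")
```

Keeping the suffix check in its own function makes the routing decision testable without touching the disk.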
#### New Normalization Method
Added [`Image.to_normalized_float32()`](../src/utils/image.py:280) method that:
- Converts image data to `float32`
- Properly scales values to [0, 1] range:
  - **16-bit images**: divides by 65535 (full dynamic range)
  - 8-bit images: divides by 255
  - Float images: clips to [0, 1]
- Handles various data types automatically
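A minimal sketch of the dtype-dependent scaling described above, written as a free function rather than the real method. Raising `TypeError` for other dtypes is an assumption; the actual implementation may handle more types.

```python
import numpy as np


def to_normalized_float32(data):
    """Scale an image array to float32 in [0, 1] based on its dtype (illustrative)."""
    if data.dtype == np.uint16:
        return data.astype(np.float32) / 65535.0
    if data.dtype == np.uint8:
        return data.astype(np.float32) / 255.0
    if np.issubdtype(data.dtype, np.floating):
        return np.clip(data, 0.0, 1.0).astype(np.float32)
    raise TypeError(f"unsupported dtype: {data.dtype}")  # assumption, see lead-in


sample = np.array([[0, 32768, 65535]], dtype=np.uint16)
normalized = to_normalized_float32(sample)
print(normalized.dtype, normalized.min(), normalized.max())
```

Dividing by the type's maximum (not by the image's own maximum) keeps intensities comparable across images.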
### 3. YOLO Preprocessing ([`src/model/yolo_wrapper.py`](../src/model/yolo_wrapper.py))
Enhanced [`YOLOWrapper._prepare_source()`](../src/model/yolo_wrapper.py:231) to:
1. Detect 16-bit TIFF files automatically
2. Load and normalize to float32 [0-1] using the new method
3. Replicate grayscale to RGB (3 channels)
4. **Return numpy array directly** (NO file saving, NO uint8 conversion)
5. Pass float32 array directly to YOLO for inference
## Processing Pipeline
### For Inference (predict)
For 16-bit TIFF files during inference:
1. **Load**: File loaded using `tifffile` → preserves 16-bit uint16 data
2. **Normalize**: Convert to float32 and scale to [0, 1]
   ```python
   float_data = uint16_data.astype(np.float32) / 65535.0
   ```
3. **RGB Conversion**: Replicate grayscale to 3 channels
   ```python
   rgb_float = np.stack([float_data] * 3, axis=-1)
   ```
4. **Pass to YOLO**: Return float32 array directly (no uint8, no file I/O)
5. **Inference**: YOLO processes the float32 [0-1] RGB array
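Steps 2 and 3 can be combined into one helper with explicit shape checks. This is a sketch, not the code in `_prepare_source()`: `to_rgb_float32` is a hypothetical name, and accepting a trailing singleton channel axis is an assumption. A synthetic ramp stands in for a loaded TIFF.

```python
import numpy as np


def to_rgb_float32(gray16):
    """Turn a uint16 grayscale image into a float32 (H, W, 3) array in [0, 1]."""
    if gray16.dtype != np.uint16:
        raise TypeError("expected uint16 input")
    if gray16.ndim == 3 and gray16.shape[-1] == 1:
        gray16 = gray16[..., 0]  # drop a singleton channel axis
    if gray16.ndim != 2:
        raise ValueError("expected a single-channel image")
    float_data = gray16.astype(np.float32) / 65535.0
    return np.stack([float_data] * 3, axis=-1)


# Synthetic 4x5 ramp standing in for a loaded TIFF
ramp = np.linspace(0, 65535, 20).reshape(4, 5).astype(np.uint16)
rgb = to_rgb_float32(ramp)
print(rgb.shape, rgb.dtype)
```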
### For Training (train)
Training now uses a custom dataset loader with on-the-fly conversion (NO disk caching):
1. **Custom Dataset**: Uses `Float32Dataset` class that extends Ultralytics' `YOLODataset`
2. **Load On-The-Fly**: Each image is loaded and converted during training:
   - Detect 16-bit TIFF files automatically
   - Load with `tifffile` (preserves uint16)
   - Convert to float32 [0-1] in memory
   - Replicate to 3 channels (RGB)
3. **No Disk Cache**: Conversion happens in memory, no files written
4. **Train**: YOLO trains on float32 [0-1] RGB arrays directly

See [`src/utils/train_ultralytics_float.py`](../src/utils/train_ultralytics_float.py) for implementation.
### No Data Loss!
Unlike approaches that convert to uint8 (256 levels), this implementation:
- Preserves full 16-bit dynamic range (65536 levels)
- Maintains precision with float32 representation
- For inference: passes data directly without file conversions
- For training: converts to float32 in memory on the fly (no uint8 PNGs)
## Usage
### Basic Image Loading
```python
from src.utils.image import Image
# Load a 16-bit TIFF file
img = Image("path/to/16bit_image.tif")
# Get normalized float32 data [0-1]
normalized = img.to_normalized_float32() # Shape: (H, W), dtype: float32
# Original data is preserved
original = img.data # Still uint16
```
### YOLO Inference
The preprocessing is automatic - just use YOLO as normal:
```python
from src.model.yolo_wrapper import YOLOWrapper
# Initialize model
yolo = YOLOWrapper("yolov8s-seg.pt")
yolo.load_model()
# Perform inference on 16-bit TIFF
# The image will be automatically normalized and passed as float32 [0-1]
detections = yolo.predict("path/to/16bit_image.tif", conf=0.25)
```
### With InferenceEngine
```python
from src.model.inference import InferenceEngine
from src.database.db_manager import DatabaseManager
# Setup
db = DatabaseManager("database.db")
engine = InferenceEngine("model.pt", db, model_id=1)
# Detect objects in 16-bit TIFF
result = engine.detect_single(
    image_path="path/to/16bit_image.tif",
    relative_path="images/16bit_image.tif",
    conf=0.25,
)
```
## Testing
Three test scripts are provided:
### 1. Image Loading Test
```bash
./venv/bin/python tests/test_16bit_tiff_loading.py
```
Tests:
- Loading 16-bit TIFF files with tifffile
- Normalization to float32 [0-1]
- Data type and value range verification
### 2. Float32 Passthrough Test (Most Important!)
```bash
./venv/bin/python tests/test_yolo_16bit_float32.py
```
Tests:
- YOLO preprocessing returns numpy array (not file path)
- Data is float32 [0-1] (not uint8)
- No quantization to 256 levels (proves no uint8 conversion)
- Sample output:
```
✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
Shape: (200, 200, 3)
Dtype: float32
Min value: 0.000000
Max value: 1.000000
Unique values: 399
✓ SUCCESS: Data has 399 unique values (> 256)
This confirms NO uint8 quantization occurred!
```
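The unique-value check can be reproduced on synthetic data. This sketch is independent of the project's test script, and the bit shift standing in for the old uint8 path is an assumption about how that conversion worked.

```python
import numpy as np

# 1000 distinct intensities spread across the uint16 range
ramp16 = np.linspace(0, 65535, 1000).astype(np.uint16)

as_float32 = ramp16.astype(np.float32) / 65535.0  # float32 path
as_uint8 = (ramp16 >> 8).astype(np.uint8)         # uint8 path (keeps the top 8 bits)

unique_float = len(np.unique(as_float32))
unique_uint8 = len(np.unique(as_uint8))
print(unique_float, unique_uint8)
```

The float32 path keeps all 1000 input levels distinct, while the uint8 path can never produce more than 256.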
### 3. Legacy Test (Shows Old Behavior)
```bash
./venv/bin/python tests/test_yolo_16bit_preprocessing.py
```
This test shows the old behavior (uint8 conversion) - kept for comparison.
## Benefits
1. **No Data Loss**: Preserves full 16-bit dynamic range (65536 levels vs 256)
2. **High Precision**: Float32 maintains fine-grained intensity differences
3. **Automatic Processing**: No manual preprocessing needed
4. **YOLO Compatible**: Ultralytics YOLO accepts float32 [0-1] arrays
5. **Performance**: No intermediate file I/O for 16-bit TIFFs
6. **Backwards Compatible**: Regular images (8-bit PNG, JPEG, etc.) still work as before
## Technical Notes
### Float32 vs uint8
**With uint8 conversion (OLD - BAD):**
- 16-bit (65536 levels) → uint8 (256 levels) = **99.6% of intensity levels lost!**
- Fine intensity differences are lost
- Quantization artifacts

**With float32 [0-1] (NEW - GOOD):**
- 16-bit (65536 levels) → float32 (every 16-bit level stays distinct) = **No data loss**
- Full dynamic range preserved
- Smooth gradients maintained
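Both figures can be checked numerically. The sketch below computes the share of intensity levels lost under uint8 and verifies that every uint16 value survives a round trip through float32 normalization; variable names are illustrative.

```python
import numpy as np

levels_16bit = 2 ** 16
levels_8bit = 2 ** 8
fraction_lost = 1 - levels_8bit / levels_16bit  # share of levels that vanish under uint8
print(f"{fraction_lost:.1%}")

# Round trip: normalize every possible uint16 value, then scale back and round
all_values = np.arange(levels_16bit, dtype=np.uint16)
normalized = all_values.astype(np.float32) / 65535.0
restored = np.rint(normalized * 65535.0).astype(np.uint16)
round_trip_ok = bool(np.array_equal(all_values, restored))
print(round_trip_ok)
```

The round trip succeeds because float32 carries a 24-bit significand, which is more than enough to separate 65536 evenly spaced levels.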
### Memory Considerations
For a 2048×2048 single-channel image:
| Format | Memory | Disk Space | Notes |
|--------|--------|------------|-------|
| Original 16-bit | 8 MB | ~8 MB | uint16 grayscale TIFF |
| Float32 grayscale | 16 MB | - | Intermediate |
| Float32 3-channel | 48 MB | - | Fed to YOLO; held in memory only |
| uint8 RGB (old) | 12 MB | ~12 MB | OLD approach with data loss |
The float32 approach uses 4× the memory of uint8 during training (48 MB vs 12 MB for this image) but preserves **all information**.
**No Disk Cache**: The new on-the-fly approach eliminates the need for cached datasets on disk.
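The in-memory sizes in the table follow from height × width × channels × bytes per value. A small sketch of that arithmetic (`image_bytes` is a hypothetical helper, and sizes are reported in MiB):

```python
def image_bytes(height, width, channels, bytes_per_value):
    """Raw size of an uncompressed image buffer in bytes."""
    return height * width * channels * bytes_per_value


MIB = 1024 * 1024
H = W = 2048

sizes_mib = {
    "uint16 grayscale": image_bytes(H, W, 1, 2) / MIB,
    "float32 grayscale": image_bytes(H, W, 1, 4) / MIB,
    "float32 3-channel": image_bytes(H, W, 3, 4) / MIB,
    "uint8 RGB": image_bytes(H, W, 3, 1) / MIB,
}
for name, size in sizes_mib.items():
    print(f"{name}: {size:.0f} MiB")
```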
### Why Direct Numpy Array?
Passing numpy arrays directly to YOLO (instead of saving to file):
1. **Faster**: No disk I/O overhead
2. **No Quantization**: Avoids PNG/JPEG quantization
3. **Memory Efficient**: Single copy in memory
4. **Cleaner**: No temp file management
Ultralytics YOLO supports various input types:
- File paths (str): `"image.jpg"`
- Numpy arrays: `np.ndarray` ← **we use this**
- PIL Images: `PIL.Image`
- Torch tensors: `torch.Tensor`
## Training with Float32 Dataset Loader
The system now includes a custom dataset loader for 16-bit TIFF training:
```python
from src.utils.train_ultralytics_float import train_with_float32_loader

# Train with on-the-fly float32 conversion
results = train_with_float32_loader(
    model_path="yolov8s-seg.pt",
    data_yaml="data/my_dataset/data.yaml",
    epochs=100,
    batch=16,
    imgsz=640,
)
```
The `Float32Dataset` class automatically:
- Detects 16-bit TIFF files
- Loads with `tifffile` (not PIL/cv2)
- Converts to float32 [0-1] on-the-fly
- Replicates to 3 channels
- Integrates seamlessly with Ultralytics training pipeline
This is used automatically by the training tab in the GUI.
## Installation
Install the updated dependencies:
```bash
./venv/bin/pip install -r requirements.txt
```
Or install tifffile directly:
```bash
# Quote the requirement so the shell does not treat ">" as a redirect
./venv/bin/pip install "tifffile>=2023.0.0"
```
## Example Test Output
```
=== Testing Float32 Passthrough (NO uint8) ===
Created test 16-bit TIFF: /tmp/tmpdt5hm0ab.tif
Shape: (200, 200)
Dtype: uint16
Min value: 0
Max value: 65535
Preprocessing result:
Prepared source type: <class 'numpy.ndarray'>
✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
Shape: (200, 200, 3)
Dtype: float32
Min value: 0.000000
Max value: 1.000000
Mean value: 0.499992
Unique values: 399
✓ SUCCESS: Data has 399 unique values (> 256)
This confirms NO uint8 quantization occurred!
✓ All float32 passthrough tests passed!