16-bit TIFF Support for YOLO Object Detection
Overview
This document describes the implementation of 16-bit grayscale TIFF support for YOLO object detection. The system loads 16-bit TIFF images, normalizes them to float32 [0-1], and uses them for both inference and training without any uint8 conversion, which preserves the full dynamic range and avoids data loss.
Key Features
- ✅ Reads 16-bit or float32 images using tifffile
- ✅ Converts to float32 [0-1] (NO uint8 conversion)
- ✅ Replicates grayscale → RGB (3 channels)
- ✅ Inference: Passes numpy arrays directly to YOLO (no file I/O)
- ✅ Training: Creates float32 3-channel TIFF dataset cache
- ✅ Uses Ultralytics YOLOv8/v11 models
- ✅ Works with segmentation models
- ✅ No data loss, no double normalization, no silent clipping
Changes Made
1. Dependencies (requirements.txt)
- Added tifffile>=2023.0.0 for reliable 16-bit TIFF loading
2. Image Loading (src/utils/image.py)
Enhanced TIFF Loading
- Modified Image._load() to use tifffile for .tif and .tiff files
- Preserves original 16-bit data type during loading
- Properly handles both grayscale and multi-channel TIFF files
New Normalization Method
Added Image.to_normalized_float32() method that:
- Converts image data to float32
- Properly scales values to [0, 1] range:
  - 16-bit images: divides by 65535 (full dynamic range)
  - 8-bit images: divides by 255
  - Float images: clips to [0, 1]
- Handles various data types automatically
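The scaling rules above can be sketched as a standalone function. This is a minimal illustration and not the actual body of Image.to_normalized_float32(); the function name normalize_to_float32 and the fallback for other integer types are assumptions.

```python
import numpy as np


def normalize_to_float32(data: np.ndarray) -> np.ndarray:
    """Scale image data to float32 in [0, 1] based on its dtype (illustrative sketch)."""
    if data.dtype == np.uint16:
        # 16-bit: divide by the largest representable value
        return data.astype(np.float32) / np.float32(65535.0)
    if data.dtype == np.uint8:
        # 8-bit: divide by 255
        return data.astype(np.float32) / np.float32(255.0)
    if np.issubdtype(data.dtype, np.floating):
        # Float input is assumed to be in [0, 1] already; out-of-range values are clipped
        return np.clip(data, 0.0, 1.0).astype(np.float32)
    # Assumption: any other integer type is scaled by its dtype maximum
    info = np.iinfo(data.dtype)
    return (data.astype(np.float64) / info.max).clip(0.0, 1.0).astype(np.float32)
```

Dispatching on the dtype, rather than on the observed maximum value, keeps a dark 16-bit image from being mistaken for an already-normalized one.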
3. YOLO Preprocessing (src/model/yolo_wrapper.py)
Enhanced YOLOWrapper._prepare_source() to:
- Detect 16-bit TIFF files automatically
- Load and normalize to float32 [0-1] using the new method
- Replicate grayscale to RGB (3 channels)
- Return numpy array directly (NO file saving, NO uint8 conversion)
- Pass float32 array directly to YOLO for inference
Processing Pipeline
For Inference (predict)
For 16-bit TIFF files during inference:
- Load: File loaded using tifffile → preserves 16-bit uint16 data
- Normalize: Convert to float32 and scale to [0, 1]
  float_data = uint16_data.astype(np.float32) / 65535.0
- RGB Conversion: Replicate grayscale to 3 channels
  rgb_float = np.stack([float_data] * 3, axis=-1)
- Pass to YOLO: Return float32 array directly (no uint8, no file I/O)
- Inference: YOLO processes the float32 [0-1] RGB array
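The load, normalize, and replicate steps above can be put together roughly as follows. The function names to_rgb_float32 and prepare_tiff are hypothetical, and this is not the actual YOLOWrapper._prepare_source() code; it only mirrors the listed steps for a uint16 grayscale input.

```python
from pathlib import Path

import numpy as np


def to_rgb_float32(uint16_data: np.ndarray) -> np.ndarray:
    """Normalize a (H, W) uint16 array to float32 [0, 1] and replicate it to (H, W, 3)."""
    if uint16_data.dtype != np.uint16 or uint16_data.ndim != 2:
        raise ValueError("expected a 2-D uint16 grayscale array")
    float_data = uint16_data.astype(np.float32) / 65535.0
    # Stack three copies of the same plane along a new last axis
    return np.stack([float_data] * 3, axis=-1)


def prepare_tiff(path: str) -> np.ndarray:
    """Read a 16-bit grayscale TIFF from disk and return the normalized RGB array."""
    import tifffile  # imported lazily so to_rgb_float32 works without tifffile installed

    if Path(path).suffix.lower() not in {".tif", ".tiff"}:
        raise ValueError(f"not a TIFF file: {path}")
    return to_rgb_float32(tifffile.imread(path))
```

Keeping the array conversion separate from the file read makes the numeric part easy to unit-test without touching the disk.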
For Training (train)
During training, YOLO's internal dataloader loads images from disk, so we create a cached 3-channel dataset:
- Detect: Check if dataset contains 16-bit TIFF files
- Create Cache: Build float32 3-channel TIFF dataset in data/datasets/_float32_cache/
- Convert Each Image:
  - Load 16-bit TIFF using tifffile
  - Normalize to float32 [0-1]
  - Replicate to 3 channels
  - Save as float32 TIFF (preserves precision)
- Copy Labels: Copy label files unchanged
- Generate data.yaml: Points to cached 3-channel dataset
- Train: YOLO trains on float32 3-channel TIFFs
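A rough sketch of the conversion and label-copy steps is shown below. The images/ and labels/ subfolder layout, the .txt label extension, and all function names are assumptions made for illustration; the data.yaml generation step is left out because its contents depend on the dataset.

```python
import shutil
from pathlib import Path

import numpy as np


def convert_array(data: np.ndarray) -> np.ndarray:
    """uint16 grayscale (H, W) -> float32 [0, 1] with 3 identical channels (H, W, 3)."""
    scaled = data.astype(np.float32) / 65535.0
    return np.repeat(scaled[..., None], 3, axis=2)


def copy_labels(src_labels: Path, dst_labels: Path) -> int:
    """Copy label files unchanged; returns the number of files copied."""
    dst_labels.mkdir(parents=True, exist_ok=True)
    count = 0
    for label in sorted(src_labels.glob("*.txt")):
        shutil.copy2(label, dst_labels / label.name)
        count += 1
    return count


def build_float32_cache(src_root: Path, cache_root: Path) -> None:
    """Convert every TIFF under src_root/images and mirror src_root/labels."""
    import tifffile  # lazy import: only needed when images are actually converted

    dst_images = cache_root / "images"
    dst_images.mkdir(parents=True, exist_ok=True)
    for src in sorted((src_root / "images").iterdir()):
        if src.suffix.lower() not in {".tif", ".tiff"}:
            continue
        rgb = convert_array(tifffile.imread(src))
        # photometric="rgb" marks the three samples per pixel as color channels
        tifffile.imwrite(dst_images / src.name, rgb, photometric="rgb")
    copy_labels(src_root / "labels", cache_root / "labels")
```

Because the cached files keep their original names, the copied label files still match their images by file stem.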
No Data Loss!
Unlike approaches that convert to uint8 (256 levels), this implementation:
- Preserves full 16-bit dynamic range (65536 levels)
- Maintains precision with float32 representation
- For inference: passes data directly without file conversions
- For training: uses float32 TIFFs (not uint8 PNGs)
Usage
Basic Image Loading
from src.utils.image import Image
# Load a 16-bit TIFF file
img = Image("path/to/16bit_image.tif")
# Get normalized float32 data [0-1]
normalized = img.to_normalized_float32() # Shape: (H, W), dtype: float32
# Original data is preserved
original = img.data # Still uint16
YOLO Inference
The preprocessing is automatic - just use YOLO as normal:
from src.model.yolo_wrapper import YOLOWrapper
# Initialize model
yolo = YOLOWrapper("yolov8s-seg.pt")
yolo.load_model()
# Perform inference on 16-bit TIFF
# The image will be automatically normalized and passed as float32 [0-1]
detections = yolo.predict("path/to/16bit_image.tif", conf=0.25)
With InferenceEngine
from src.model.inference import InferenceEngine
from src.database.db_manager import DatabaseManager
# Setup
db = DatabaseManager("database.db")
engine = InferenceEngine("model.pt", db, model_id=1)
# Detect objects in 16-bit TIFF
result = engine.detect_single(
    image_path="path/to/16bit_image.tif",
    relative_path="images/16bit_image.tif",
    conf=0.25
)
Testing
Three test scripts are provided:
1. Image Loading Test
./venv/bin/python tests/test_16bit_tiff_loading.py
Tests:
- Loading 16-bit TIFF files with tifffile
- Normalization to float32 [0-1]
- Data type and value range verification
2. Float32 Passthrough Test (Most Important!)
./venv/bin/python tests/test_yolo_16bit_float32.py
Tests:
- YOLO preprocessing returns numpy array (not file path)
- Data is float32 [0-1] (not uint8)
- No quantization to 256 levels (proves no uint8 conversion)
- Sample output:
  ✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
  Shape: (200, 200, 3)
  Dtype: float32
  Min value: 0.000000
  Max value: 1.000000
  Unique values: 399
  ✓ SUCCESS: Data has 399 unique values (> 256)
  This confirms NO uint8 quantization occurred!
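The kind of check this test performs can be approximated as follows. The synthetic gradient image and the helper name count_levels are assumptions for illustration; the actual test script may build its image differently and will report a different number of unique values.

```python
import numpy as np


def count_levels(image: np.ndarray) -> int:
    """Number of distinct intensity values in an image."""
    return int(np.unique(image).size)


# Synthetic 16-bit gradient: 200 x 200 pixels spanning the full uint16 range
ramp = np.linspace(0, 65535, 200 * 200).astype(np.uint16).reshape(200, 200)

# Same preparation as the inference path: float32 [0, 1], replicated to 3 channels
prepared = np.stack([ramp.astype(np.float32) / 65535.0] * 3, axis=-1)

assert isinstance(prepared, np.ndarray)
assert prepared.dtype == np.float32
assert 0.0 <= prepared.min() and prepared.max() <= 1.0
# More than 256 distinct values cannot survive a uint8 step
assert count_levels(prepared) > 256
```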
3. Legacy Test (Shows Old Behavior)
./venv/bin/python tests/test_yolo_16bit_preprocessing.py
This test shows the old behavior (uint8 conversion) - kept for comparison.
Benefits
- No Data Loss: Preserves full 16-bit dynamic range (65536 levels vs 256)
- High Precision: Float32 maintains fine-grained intensity differences
- Automatic Processing: No manual preprocessing needed
- YOLO Compatible: Ultralytics YOLO accepts float32 [0-1] arrays
- Performance: No intermediate file I/O for 16-bit TIFFs during inference
- Backwards Compatible: Regular images (8-bit PNG, JPEG, etc.) still work as before
Technical Notes
Float32 vs uint8
With uint8 conversion (OLD - BAD):
- 16-bit (65536 levels) → uint8 (256 levels) = 99.6% of intensity levels lost!
- Fine intensity differences are lost
- Quantization artifacts
With float32 [0-1] (NEW - GOOD):
- 16-bit (65536 levels) → float32 (24-bit significand, so all 65536 levels stay distinct) = No data loss
- Full dynamic range preserved
- Smooth gradients maintained
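The comparison above can be checked numerically: every uint16 value survives a round trip through float32 [0-1], while an 8-bit conversion collapses the range to 256 levels. A small sketch follows; the 8-bit conversion by right shift is one possible choice, assumed here for illustration.

```python
import numpy as np

# Every possible 16-bit intensity
levels = np.arange(65536, dtype=np.uint16)

# float32 path: scale to [0, 1], then scale back and round to integers
as_float = levels.astype(np.float32) / 65535.0
restored = np.rint(as_float.astype(np.float64) * 65535.0).astype(np.uint16)

# uint8 path: keep only the top 8 bits
as_uint8 = (levels >> 8).astype(np.uint8)

lossless = bool(np.array_equal(restored, levels))
float_levels = int(np.unique(as_float).size)
uint8_levels = int(np.unique(as_uint8).size)
lost_fraction = 1.0 - uint8_levels / levels.size

print(lossless, float_levels, uint8_levels, f"{lost_fraction:.1%}")
```

The lost fraction computed here, 1 - 256/65536, is where the 99.6% figure comes from.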
Memory Considerations
For a 2048×2048 single-channel image:
| Format | Memory | Disk Space | Notes |
|---|---|---|---|
| Original 16-bit | 8 MB | ~8 MB | uint16 grayscale TIFF |
| Float32 grayscale | 16 MB | - | Intermediate |
| Float32 3-channel | 48 MB | ~48 MB | Training cache |
| uint8 RGB (old) | 12 MB | ~12 MB | OLD approach with data loss |
The float32 approach uses ~4× more memory and disk space than uint8 but preserves all information.
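The in-memory figures in the table follow from height × width × channels × bytes per value. A short sketch of the arithmetic is shown below (the helper name array_mib is hypothetical; sizes are in units of 2^20 bytes and ignore TIFF headers and compression).

```python
import numpy as np


def array_mib(height: int, width: int, channels: int, dtype) -> float:
    """Uncompressed size in MiB of an image array with the given shape and dtype."""
    return height * width * channels * np.dtype(dtype).itemsize / 2**20


for name, channels, dtype in [
    ("Original 16-bit", 1, np.uint16),
    ("Float32 grayscale", 1, np.float32),
    ("Float32 3-channel", 3, np.float32),
    ("uint8 RGB (old)", 3, np.uint8),
]:
    print(f"{name}: {array_mib(2048, 2048, channels, dtype):.0f} MiB")
```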
Cache Directory: Training creates cached datasets in data/datasets/_float32_cache/<dataset>_<hash>/
Why Direct Numpy Array?
Passing numpy arrays directly to YOLO (instead of saving to file):
- Faster: No disk I/O overhead
- No Quantization: Avoids PNG/JPEG quantization
- Memory Efficient: Single copy in memory
- Cleaner: No temp file management
Ultralytics YOLO supports various input types:
- File paths (str): "image.jpg"
- Numpy arrays: np.ndarray ← we use this
- PIL Images: PIL.Image
- Torch tensors: torch.Tensor
For Training with Custom Dataset
If you need to train YOLO on 16-bit TIFF images without the dataset cache, you can create a custom dataset loader along the lines of the following example:
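import numpy as np
import tifffile as tiff
import torch
from pathlib import Path

class FloatYoloSegDataset(torch.utils.data.Dataset):
    def __init__(self, img_dir, label_dir, img_size=640):
        # Collect TIFF files only, so stray files in the folder are skipped
        self.img_paths = sorted(
            p for p in Path(img_dir).iterdir()
            if p.suffix.lower() in {'.tif', '.tiff'}
        )
        self.label_dir = Path(label_dir)
        self.img_size = img_size

    def __len__(self):
        return len(self.img_paths)

    def __getitem__(self, idx):
        img_path = self.img_paths[idx]
        # Load TIFF with its original dtype
        img = tiff.imread(img_path)
        # Convert to float32 [0-1], scaling by the source dtype
        if img.dtype == np.uint16:
            img = img.astype(np.float32) / 65535.0
        elif img.dtype == np.uint8:
            img = img.astype(np.float32) / 255.0
        else:
            img = np.clip(img.astype(np.float32), 0.0, 1.0)
        # Grayscale → RGB
        if img.ndim == 2:
            img = np.repeat(img[..., None], 3, axis=2)
        # HWC → CHW for PyTorch
        img = torch.from_numpy(img).permute(2, 0, 1).contiguous()
        # Load labels (assumes one YOLO-format .txt per image; adapt to your label format)
        label_path = self.label_dir / f"{img_path.stem}.txt"
        labels = label_path.read_text().splitlines() if label_path.exists() else []
        return img, labels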
Then use this dataset with the Ultralytics training API or a custom training loop.
Installation
Install the updated dependencies:
./venv/bin/pip install -r requirements.txt
Or install tifffile directly:
./venv/bin/pip install "tifffile>=2023.0.0"
Example Test Output
=== Testing Float32 Passthrough (NO uint8) ===
Created test 16-bit TIFF: /tmp/tmpdt5hm0ab.tif
Shape: (200, 200)
Dtype: uint16
Min value: 0
Max value: 65535
Preprocessing result:
Prepared source type: <class 'numpy.ndarray'>
✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
Shape: (200, 200, 3)
Dtype: float32
Min value: 0.000000
Max value: 1.000000
Mean value: 0.499992
Unique values: 399
✓ SUCCESS: Data has 399 unique values (> 256)
This confirms NO uint8 quantization occurred!
✓ All float32 passthrough tests passed!