# 16-bit TIFF Support for YOLO Object Detection

## Overview

This document describes the implementation of 16-bit grayscale TIFF support for YOLO object detection. The system properly loads 16-bit TIFF images, normalizes them to float32 [0-1], and passes them directly to YOLO **without uint8 conversion** to preserve the full dynamic range and avoid data loss.

## Key Features

- ✅ Reads 16-bit or float32 images using tifffile
- ✅ Converts to float32 [0-1] (NO uint8 conversion)
- ✅ Replicates grayscale → RGB (3 channels)
- ✅ Passes numpy arrays directly to YOLO (no file I/O)
- ✅ Uses Ultralytics YOLOv8/v11 models
- ✅ Works with segmentation models
- ✅ No data loss, no double normalization, no silent clipping

## Changes Made

### 1. Dependencies ([`requirements.txt`](../requirements.txt:14))
- Added `tifffile>=2023.0.0` for reliable 16-bit TIFF loading

### 2. Image Loading ([`src/utils/image.py`](../src/utils/image.py))

#### Enhanced TIFF Loading
- Modified [`Image._load()`](../src/utils/image.py:87) to use `tifffile` for `.tif` and `.tiff` files
- Preserves original 16-bit data type during loading
- Properly handles both grayscale and multi-channel TIFF files

#### New Normalization Method
Added [`Image.to_normalized_float32()`](../src/utils/image.py:280) method that:
- Converts image data to `float32`
- Properly scales values to [0, 1] range:
  - **16-bit images**: divides by 65535 (full dynamic range)
  - 8-bit images: divides by 255
  - Float images: clips to [0, 1]
- Handles various data types automatically

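
The scaling rules above can be sketched as a standalone function. This is a minimal illustration, not the actual body of `Image.to_normalized_float32()`; the name `normalize_to_float32` and the fallback for other integer types are assumptions made for the example.

```python
import numpy as np


def normalize_to_float32(data: np.ndarray) -> np.ndarray:
    """Scale image data to float32 in [0, 1] based on its dtype (hypothetical helper)."""
    if data.dtype == np.uint16:
        # 16-bit: divide by the largest uint16 value.
        return data.astype(np.float32) / 65535.0
    if data.dtype == np.uint8:
        # 8-bit: divide by the largest uint8 value.
        return data.astype(np.float32) / 255.0
    if np.issubdtype(data.dtype, np.floating):
        # Float input is assumed to be in [0, 1] already; out-of-range values are clipped.
        return np.clip(data, 0.0, 1.0).astype(np.float32)
    # Assumption: any other integer type is scaled by its maximum representable value.
    info = np.iinfo(data.dtype)
    return (data.astype(np.float64) / info.max).clip(0.0, 1.0).astype(np.float32)


sample = np.array([[0, 32768, 65535]], dtype=np.uint16)
print(normalize_to_float32(sample).dtype)
```

Dispatching on the dtype rather than on the observed maximum keeps a dark 16-bit image from being mistaken for an 8-bit one.
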
### 3. YOLO Preprocessing ([`src/model/yolo_wrapper.py`](../src/model/yolo_wrapper.py))

Enhanced [`YOLOWrapper._prepare_source()`](../src/model/yolo_wrapper.py:231) to:
1. Detect 16-bit TIFF files automatically
2. Load and normalize to float32 [0-1] using the new method
3. Replicate grayscale to RGB (3 channels)
4. **Return numpy array directly** (NO file saving, NO uint8 conversion)
5. Pass float32 array directly to YOLO for inference

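
The five steps above might look roughly like the following. This is a sketch under stated assumptions, not the code in `YOLOWrapper._prepare_source()`: the helper names `is_tiff`, `to_rgb_float32`, and `prepare_source` are hypothetical, and detection here relies on the file extension plus the loaded dtype.

```python
from pathlib import Path

import numpy as np

TIFF_SUFFIXES = {".tif", ".tiff"}


def is_tiff(path: str) -> bool:
    """Return True when the file extension marks a TIFF file."""
    return Path(path).suffix.lower() in TIFF_SUFFIXES


def to_rgb_float32(data: np.ndarray) -> np.ndarray:
    """Normalize uint16 data to float32 [0, 1] and replicate grayscale to 3 channels."""
    if data.dtype != np.uint16:
        raise TypeError(f"expected uint16, got {data.dtype}")
    scaled = data.astype(np.float32) / 65535.0
    if scaled.ndim == 2:
        scaled = np.stack([scaled] * 3, axis=-1)
    return scaled


def prepare_source(source):
    """Return a float32 RGB array for 16-bit TIFFs; pass anything else through unchanged."""
    if isinstance(source, str) and is_tiff(source):
        import tifffile  # imported lazily so the helpers above work without it

        data = tifffile.imread(source)
        if data.dtype == np.uint16:
            return to_rgb_float32(data)
    return source


gray = np.array([[0, 65535], [1, 2]], dtype=np.uint16)
print(to_rgb_float32(gray).shape)
```

Non-TIFF sources are returned untouched, which is one way to keep regular 8-bit images on their existing code path.
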
## Processing Pipeline

For 16-bit TIFF files:

1. **Load**: File loaded using `tifffile` → preserves 16-bit uint16 data
2. **Normalize**: Convert to float32 and scale to [0, 1]
   ```python
   float_data = uint16_data.astype(np.float32) / 65535.0
   ```
3. **RGB Conversion**: Replicate grayscale to 3 channels
   ```python
   rgb_float = np.stack([float_data] * 3, axis=-1)
   ```
4. **Pass to YOLO**: Return float32 array directly (no uint8, no file I/O)
5. **Inference**: YOLO processes the float32 [0-1] RGB array

### No Data Loss!

Unlike the previous approach that converted to uint8 (256 levels), the new implementation:
- Preserves full 16-bit dynamic range (65536 levels)
- Maintains precision with float32 representation
- Passes data directly without intermediate file conversions

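
The claim can be checked numerically by pushing every possible 16-bit value through both paths. In this short sketch the "old path" is modelled as keeping the top 8 bits, which is an assumption about how the earlier uint8 conversion worked.

```python
import numpy as np

# Every possible 16-bit intensity value.
levels = np.arange(65536, dtype=np.uint16)

# New path: float32 in [0, 1], then mapped back to uint16 to check for loss.
as_float = levels.astype(np.float32) / 65535.0
restored = np.rint(as_float * 65535.0).astype(np.uint16)

# Old path (assumed): keep only the top 8 bits.
as_uint8 = (levels >> 8).astype(np.uint8)

float_levels = np.unique(as_float).size
uint8_levels = np.unique(as_uint8).size
print(float_levels, uint8_levels, np.array_equal(levels, restored))
```

This prints `65536 256 True`: all 16-bit levels stay distinct and recoverable in float32, while the uint8 path collapses them to 256.
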
## Usage

### Basic Image Loading

```python
from src.utils.image import Image

# Load a 16-bit TIFF file
img = Image("path/to/16bit_image.tif")

# Get normalized float32 data [0-1]
normalized = img.to_normalized_float32()  # Shape: (H, W), dtype: float32

# Original data is preserved
original = img.data  # Still uint16
```

### YOLO Inference

The preprocessing is automatic - just use YOLO as normal:

```python
from src.model.yolo_wrapper import YOLOWrapper

# Initialize model
yolo = YOLOWrapper("yolov8s-seg.pt")
yolo.load_model()

# Perform inference on 16-bit TIFF
# The image will be automatically normalized and passed as float32 [0-1]
detections = yolo.predict("path/to/16bit_image.tif", conf=0.25)
```

### With InferenceEngine

```python
from src.model.inference import InferenceEngine
from src.database.db_manager import DatabaseManager

# Setup
db = DatabaseManager("database.db")
engine = InferenceEngine("model.pt", db, model_id=1)

# Detect objects in 16-bit TIFF
result = engine.detect_single(
    image_path="path/to/16bit_image.tif",
    relative_path="images/16bit_image.tif",
    conf=0.25
)
```

## Testing

Three test scripts are provided:

### 1. Image Loading Test
```bash
./venv/bin/python tests/test_16bit_tiff_loading.py
```

Tests:
- Loading 16-bit TIFF files with tifffile
- Normalization to float32 [0-1]
- Data type and value range verification

### 2. Float32 Passthrough Test (Most Important!)
```bash
./venv/bin/python tests/test_yolo_16bit_float32.py
```

Tests:
- YOLO preprocessing returns numpy array (not file path)
- Data is float32 [0-1] (not uint8)
- No quantization to 256 levels (proves no uint8 conversion)
- Sample output:
  ```
  ✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
  Shape: (200, 200, 3)
  Dtype: float32
  Min value: 0.000000
  Max value: 1.000000
  Unique values: 399

  ✓ SUCCESS: Data has 399 unique values (> 256)
  This confirms NO uint8 quantization occurred!
  ```

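
A synthetic input consistent with the sample output above can be built from a diagonal gradient, where a 200×200 image takes exactly 399 distinct values. This is a guess at how such a test image might be constructed, not the contents of `tests/test_yolo_16bit_float32.py`; all variable names are assumptions.

```python
import numpy as np

size = 200
# Diagonal gradient: pixel (i, j) depends only on i + j, which takes 399 distinct values.
ramp = np.add.outer(np.arange(size), np.arange(size)) / (2 * size - 2)
image16 = (ramp * 65535).astype(np.uint16)

# Same preprocessing as the pipeline: float32 [0, 1], then grayscale -> RGB.
normalized = image16.astype(np.float32) / 65535.0
rgb = np.stack([normalized] * 3, axis=-1)

unique_values = np.unique(rgb).size
assert rgb.dtype == np.float32 and rgb.shape == (size, size, 3)
assert unique_values > 256, "uint8 quantization would leave at most 256 levels"
print(f"Unique values: {unique_values}")
```

Counting unique values is a cheap check: any detour through uint8 would cap the count at 256.
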
### 3. Legacy Test (Shows Old Behavior)
```bash
./venv/bin/python tests/test_yolo_16bit_preprocessing.py
```

This test shows the old behavior (uint8 conversion) - kept for comparison.

## Benefits

1. **No Data Loss**: Preserves full 16-bit dynamic range (65536 levels vs 256)
2. **High Precision**: Float32 maintains fine-grained intensity differences
3. **Automatic Processing**: No manual preprocessing needed
4. **YOLO Compatible**: Ultralytics YOLO accepts float32 [0-1] arrays
5. **Performance**: No intermediate file I/O for 16-bit TIFFs
6. **Backwards Compatible**: Regular images (8-bit PNG, JPEG, etc.) still work as before

## Technical Notes

### Float32 vs uint8

**With uint8 conversion (OLD - BAD):**
- 16-bit (65536 levels) → uint8 (256 levels) = **99.6% of intensity levels lost!**
- Fine intensity differences are lost
- Quantization artifacts

**With float32 [0-1] (NEW - GOOD):**
- 16-bit (65536 levels) → float32 (24-bit significand, so every 16-bit level stays distinct) = **No data loss**
- Full dynamic range preserved
- Smooth gradients maintained

### Memory Considerations

For a 2048×2048 single-channel image:

| Format | Memory | Notes |
|--------|--------|-------|
| Original 16-bit | 8 MiB | uint16 grayscale |
| Float32 grayscale | 16 MiB | Intermediate |
| Float32 RGB | 48 MiB | Final (3 channels) |
| uint8 RGB (old) | 12 MiB | OLD approach with data loss |

The float32 RGB array uses ~4× more memory than the old uint8 RGB array but preserves **all information**.

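
The figures in the table follow from height × width × channels × bytes per element. A small sketch that reproduces them, where the helper `image_mib` is a name chosen for this example and 1 MiB is taken as 2**20 bytes:

```python
import numpy as np


def image_mib(height: int, width: int, channels: int, dtype) -> float:
    """Memory footprint in MiB of an uncompressed image array."""
    return height * width * channels * np.dtype(dtype).itemsize / 2**20


h = w = 2048
sizes = {
    "uint16 grayscale": image_mib(h, w, 1, np.uint16),
    "float32 grayscale": image_mib(h, w, 1, np.float32),
    "float32 RGB": image_mib(h, w, 3, np.float32),
    "uint8 RGB": image_mib(h, w, 3, np.uint8),
}
for name, mib in sizes.items():
    print(f"{name}: {mib:.0f} MiB")
```
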
### Why Direct Numpy Array?

Passing numpy arrays directly to YOLO (instead of saving to file):

1. **Faster**: No disk I/O overhead
2. **No Quantization**: Avoids PNG/JPEG quantization
3. **Memory Efficient**: Single copy in memory
4. **Cleaner**: No temp file management

Ultralytics YOLO supports various input types:
- File paths (str): `"image.jpg"`
- Numpy arrays: `np.ndarray` ← **we use this**
- PIL Images: `PIL.Image`
- Torch tensors: `torch.Tensor`

Note that the Ultralytics documentation describes NumPy input as HWC, BGR, uint8 (0-255) and tensor input as BCHW, RGB, float32 (0.0-1.0). Verify with your installed version that a float32 [0-1] NumPy array is not divided by 255 a second time during preprocessing.

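
If a tensor input is preferred over a NumPy array, the Ultralytics documentation describes tensors as BCHW, RGB, float32 in the range 0.0-1.0. The layout change is sketched here in NumPy only. The helper name `hwc_to_bchw` is an assumption, and the result would still need `torch.from_numpy(...)` and an image size the model accepts.

```python
import numpy as np


def hwc_to_bchw(image: np.ndarray) -> np.ndarray:
    """Convert one HWC image to a contiguous float32 BCHW batch of size 1."""
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError(f"expected an (H, W, 3) array, got {image.shape}")
    chw = np.transpose(image, (2, 0, 1))  # HWC -> CHW
    # Add the batch axis and force a contiguous float32 copy.
    return np.ascontiguousarray(chw[None, ...], dtype=np.float32)


rgb_image = np.zeros((640, 640, 3), dtype=np.float32)
batch = hwc_to_bchw(rgb_image)
print(batch.shape)
```
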
## For Training with Custom Dataset

If you need to train YOLO on 16-bit TIFF images, you should create a custom dataset loader similar to the following example:

```python
import torch
import numpy as np
import tifffile as tiff
from pathlib import Path

class FloatYoloSegDataset(torch.utils.data.Dataset):
    def __init__(self, img_dir, label_dir, img_size=640):
        self.img_paths = sorted(Path(img_dir).glob('*'))
        self.label_dir = Path(label_dir)
        self.img_size = img_size

    def __len__(self):
        return len(self.img_paths)

    def __getitem__(self, idx):
        img_path = self.img_paths[idx]

        # Load 16-bit TIFF
        img = tiff.imread(img_path)

        # Convert to float32 [0-1]
        img = img.astype(np.float32)
        if img.max() > 1.5:  # Assume 16-bit if max > 1.5
            img /= 65535.0

        # Grayscale → RGB
        if img.ndim == 2:
            img = np.repeat(img[..., None], 3, axis=2)

        # HWC → CHW for PyTorch
        img = torch.from_numpy(img).permute(2, 0, 1).contiguous()

        # Load labels...
        # (implementation depends on your label format)
        # Placeholder so the example runs: an empty label tensor.
        labels = torch.zeros((0, 5), dtype=torch.float32)

        return img, labels
```

Then use this dataset with the Ultralytics training API (for example through a custom trainer) or a custom training loop.

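
For the label-loading step left open above, one possibility is the Ultralytics segmentation label format, in which each line of a `.txt` file holds a class index followed by normalized polygon coordinates (`class x1 y1 x2 y2 ...`). The sketch below parses such text. `parse_seg_labels` is a hypothetical helper, and whether your dataset uses this format is an assumption.

```python
import numpy as np


def parse_seg_labels(text: str):
    """Parse YOLO segmentation labels into (class_id, polygon) pairs.

    Each polygon is an (N, 2) float32 array of normalized x, y coordinates.
    """
    labels = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        coords = [float(value) for value in fields[1:]]
        if len(coords) < 6 or len(coords) % 2:
            raise ValueError(f"line {line_no}: expected at least 3 x/y pairs")
        polygon = np.asarray(coords, dtype=np.float32).reshape(-1, 2)
        labels.append((int(fields[0]), polygon))
    return labels


example = "0 0.1 0.1 0.9 0.1 0.5 0.8\n1 0.2 0.2 0.4 0.2 0.4 0.4 0.2 0.4\n"
parsed = parse_seg_labels(example)
for class_id, polygon in parsed:
    print(class_id, polygon.shape)
```

Inside `__getitem__`, the text would come from the label file that shares the image's stem, for example `(self.label_dir / f"{img_path.stem}.txt").read_text()`.
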
## Installation

Install the updated dependencies:

```bash
./venv/bin/pip install -r requirements.txt
```

Or install tifffile directly (the quotes keep the shell from treating `>` as a redirect):

```bash
./venv/bin/pip install "tifffile>=2023.0.0"
```

## Example Test Output

```
=== Testing Float32 Passthrough (NO uint8) ===
Created test 16-bit TIFF: /tmp/tmpdt5hm0ab.tif
Shape: (200, 200)
Dtype: uint16
Min value: 0
Max value: 65535

Preprocessing result:
Prepared source type: <class 'numpy.ndarray'>

✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
Shape: (200, 200, 3)
Dtype: float32
Min value: 0.000000
Max value: 1.000000
Mean value: 0.499992
Unique values: 399

✓ SUCCESS: Data has 399 unique values (> 256)
This confirms NO uint8 quantization occurred!

✓ All float32 passthrough tests passed!
```