Files

2025-12-13 01:18:16 +02:00

9.5 KiB

Raw Blame History

16-bit TIFF Support for YOLO Object Detection

Overview

This document describes the implementation of 16-bit grayscale TIFF support for YOLO object detection. The system properly loads 16-bit TIFF images, normalizes them to float32 [0-1], and handles them appropriately for both inference and training without uint8 conversion to preserve the full dynamic range and avoid data loss.

Key Features

✅ Reads 16-bit or float32 images using tifffile ✅ Converts to float32 [0-1] (NO uint8 conversion) ✅ Replicates grayscale → RGB (3 channels) ✅ Inference: Passes numpy arrays directly to YOLO (no file I/O) ✅ Training: Creates float32 3-channel TIFF dataset cache ✅ Uses Ultralytics YOLOv8/v11 models ✅ Works with segmentation models ✅ No data loss, no double normalization, no silent clipping

Changes Made

1. Dependencies (`requirements.txt`)

Added tifffile>=2023.0.0 for reliable 16-bit TIFF loading

2. Image Loading (`src/utils/image.py`)

Enhanced TIFF Loading

Modified Image._load() to use tifffile for .tif and .tiff files
Preserves original 16-bit data type during loading
Properly handles both grayscale and multi-channel TIFF files

New Normalization Method

Added Image.to_normalized_float32() method that:

Converts image data to float32
Properly scales values to [0, 1] range:
- 16-bit images: divides by 65535 (full dynamic range)
- 8-bit images: divides by 255
- Float images: clips to [0, 1]
Handles various data types automatically

3. YOLO Preprocessing (`src/model/yolo_wrapper.py`)

Enhanced YOLOWrapper._prepare_source() to:

Detect 16-bit TIFF files automatically
Load and normalize to float32 [0-1] using the new method
Replicate grayscale to RGB (3 channels)
Return numpy array directly (NO file saving, NO uint8 conversion)
Pass float32 array directly to YOLO for inference

Processing Pipeline

For Inference (predict)

For 16-bit TIFF files during inference:

Load: File loaded using tifffile → preserves 16-bit uint16 data

Normalize: Convert to float32 and scale to [0, 1]

float_data = uint16_data.astype(np.float32) / 65535.0

RGB Conversion: Replicate grayscale to 3 channels

rgb_float = np.stack([float_data] * 3, axis=-1)

Pass to YOLO: Return float32 array directly (no uint8, no file I/O)
Inference: YOLO processes the float32 [0-1] RGB array

For Training (train)

During training, YOLO's internal dataloader loads images from disk, so we create a cached 3-channel dataset:

Detect: Check if dataset contains 16-bit TIFF files
Create Cache: Build float32 3-channel TIFF dataset in data/datasets/_float32_cache/
Convert Each Image:
- Load 16-bit TIFF using tifffile
- Normalize to float32 [0-1]
- Replicate to 3 channels
- Save as float32 TIFF (preserves precision)
Copy Labels: Copy label files unchanged
Generate data.yaml: Points to cached 3-channel dataset
Train: YOLO trains on float32 3-channel TIFFs

No Data Loss!

Unlike approaches that convert to uint8 (256 levels), this implementation:

Preserves full 16-bit dynamic range (65536 levels)
Maintains precision with float32 representation
For inference: passes data directly without file conversions
For training: uses float32 TIFFs (not uint8 PNGs)

Usage

Basic Image Loading

from src.utils.image import Image

# Load a 16-bit TIFF file
img = Image("path/to/16bit_image.tif")

# Get normalized float32 data [0-1]
normalized = img.to_normalized_float32()  # Shape: (H, W), dtype: float32

# Original data is preserved
original = img.data  # Still uint16

YOLO Inference

The preprocessing is automatic - just use YOLO as normal:

from src.model.yolo_wrapper import YOLOWrapper

# Initialize model
yolo = YOLOWrapper("yolov8s-seg.pt")
yolo.load_model()

# Perform inference on 16-bit TIFF
# The image will be automatically normalized and passed as float32 [0-1]
detections = yolo.predict("path/to/16bit_image.tif", conf=0.25)

With InferenceEngine

from src.model.inference import InferenceEngine
from src.database.db_manager import DatabaseManager

# Setup
db = DatabaseManager("database.db")
engine = InferenceEngine("model.pt", db, model_id=1)

# Detect objects in 16-bit TIFF
result = engine.detect_single(
    image_path="path/to/16bit_image.tif",
    relative_path="images/16bit_image.tif",
    conf=0.25
)

Testing

Three test scripts are provided:

1. Image Loading Test

./venv/bin/python tests/test_16bit_tiff_loading.py

Tests:

Loading 16-bit TIFF files with tifffile
Normalization to float32 [0-1]
Data type and value range verification

2. Float32 Passthrough Test (Most Important!)

./venv/bin/python tests/test_yolo_16bit_float32.py

Tests:

YOLO preprocessing returns numpy array (not file path)
Data is float32 [0-1] (not uint8)
No quantization to 256 levels (proves no uint8 conversion)

Sample output:

✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
  Shape: (200, 200, 3)
  Dtype: float32
  Min value: 0.000000
  Max value: 1.000000
  Unique values: 399

✓ SUCCESS: Data has 399 unique values (> 256)
  This confirms NO uint8 quantization occurred!

3. Legacy Test (Shows Old Behavior)

./venv/bin/python tests/test_yolo_16bit_preprocessing.py

This test shows the old behavior (uint8 conversion) - kept for comparison.

Benefits

No Data Loss: Preserves full 16-bit dynamic range (65536 levels vs 256)
High Precision: Float32 maintains fine-grained intensity differences
Automatic Processing: No manual preprocessing needed
YOLO Compatible: Ultralytics YOLO accepts float32 [0-1] arrays
Performance: No intermediate file I/O for 16-bit TIFFs
Backwards Compatible: Regular images (8-bit PNG, JPEG, etc.) still work as before

Technical Notes

Float32 vs uint8

With uint8 conversion (OLD - BAD):

16-bit (65536 levels) → uint8 (256 levels) = 99.6% data loss!
Fine intensity differences are lost
Quantization artifacts

With float32 [0-1] (NEW - GOOD):

16-bit (65536 levels) → float32 (continuous) = No data loss
Full dynamic range preserved
Smooth gradients maintained

Memory Considerations

For a 2048×2048 single-channel image:

Format	Memory	Disk Space	Notes
Original 16-bit	8 MB	~8 MB	uint16 grayscale TIFF
Float32 grayscale	16 MB	-	Intermediate
Float32 3-channel	48 MB	~48 MB	Training cache
uint8 RGB (old)	12 MB	~12 MB	OLD approach with data loss

The float32 approach uses ~4× more memory and disk space than uint8 but preserves all information.

Cache Directory: Training creates cached datasets in data/datasets/_float32_cache/<dataset>_<hash>/

Why Direct Numpy Array?

Passing numpy arrays directly to YOLO (instead of saving to file):

Faster: No disk I/O overhead
No Quantization: Avoids PNG/JPEG quantization
Memory Efficient: Single copy in memory
Cleaner: No temp file management

Ultralytics YOLO supports various input types:

File paths (str): "image.jpg"
Numpy arrays: np.ndarray ← we use this
PIL Images: PIL.Image
Torch tensors: torch.Tensor

For Training with Custom Dataset

If you need to train YOLO on 16-bit TIFF images, you should create a custom dataset loader similar to the example provided by the user:

import torch
import numpy as np
import tifffile as tiff
from pathlib import Path

class FloatYoloSegDataset(torch.utils.data.Dataset):
    def __init__(self, img_dir, label_dir, img_size=640):
        self.img_paths = sorted(Path(img_dir).glob('*'))
        self.label_dir = Path(label_dir)
        self.img_size = img_size

    def __len__(self):
        return len(self.img_paths)

    def __getitem__(self, idx):
        img_path = self.img_paths[idx]
        
        # Load 16-bit TIFF
        img = tiff.imread(img_path)
        
        # Convert to float32 [0-1]
        img = img.astype(np.float32)
        if img.max() > 1.5:  # Assume 16-bit if max > 1.5
            img /= 65535.0
        
        # Grayscale → RGB
        if img.ndim == 2:
            img = np.repeat(img[..., None], 3, axis=2)
        
        # HWC → CHW for PyTorch
        img = torch.from_numpy(img).permute(2, 0, 1).contiguous()
        
        # Load labels...
        # (implementation depends on your label format)
        
        return img, labels

Then use this dataset with Ultralytics training API or custom training loop.

Installation

Install the updated dependencies:

./venv/bin/pip install -r requirements.txt

Or install tifffile directly:

./venv/bin/pip install tifffile>=2023.0.0

Example Test Output

=== Testing Float32 Passthrough (NO uint8) ===
Created test 16-bit TIFF: /tmp/tmpdt5hm0ab.tif
  Shape: (200, 200)
  Dtype: uint16
  Min value: 0
  Max value: 65535

Preprocessing result:
  Prepared source type: <class 'numpy.ndarray'>

✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
  Shape: (200, 200, 3)
  Dtype: float32
  Min value: 0.000000
  Max value: 1.000000
  Mean value: 0.499992
  Unique values: 399

✓ SUCCESS: Data has 399 unique values (> 256)
  This confirms NO uint8 quantization occurred!

✓ All float32 passthrough tests passed!

9.5 KiB Raw Blame History Unescape Escape