Adding standalone training script and update
@@ -10,7 +10,7 @@ This document describes the implementation of 16-bit grayscale TIFF support for
 ✅ Converts to float32 [0-1] (NO uint8 conversion)
 ✅ Replicates grayscale → RGB (3 channels)
 ✅ **Inference**: Passes numpy arrays directly to YOLO (no file I/O)
-✅ **Training**: Creates float32 3-channel TIFF dataset cache
+✅ **Training**: On-the-fly float32 conversion (NO disk caching)
 ✅ Uses Ultralytics YOLOv8/v11 models
 ✅ Works with segmentation models
 ✅ No data loss, no double normalization, no silent clipping
@@ -65,18 +65,18 @@ For 16-bit TIFF files during inference:
 
 ### For Training (train)
 
-During training, YOLO's internal dataloader loads images from disk, so we create a cached 3-channel dataset:
+Training now uses a custom dataset loader with on-the-fly conversion (NO disk caching):
 
-1. **Detect**: Check if dataset contains 16-bit TIFF files
-2. **Create Cache**: Build float32 3-channel TIFF dataset in `data/datasets/_float32_cache/`
-3. **Convert Each Image**:
-   - Load 16-bit TIFF using `tifffile`
-   - Normalize to float32 [0-1]
-   - Replicate to 3 channels
-   - Save as float32 TIFF (preserves precision)
-4. **Copy Labels**: Copy label files unchanged
-5. **Generate data.yaml**: Points to cached 3-channel dataset
-6. **Train**: YOLO trains on float32 3-channel TIFFs
+1. **Custom Dataset**: Uses `Float32Dataset` class that extends Ultralytics' `YOLODataset`
+2. **Load On-The-Fly**: Each image is loaded and converted during training:
+   - Detect 16-bit TIFF files automatically
+   - Load with `tifffile` (preserves uint16)
+   - Convert to float32 [0-1] in memory
+   - Replicate to 3 channels (RGB)
+3. **No Disk Cache**: Conversion happens in memory, no files written
+4. **Train**: YOLO trains on float32 [0-1] RGB arrays directly
 
+See [`src/utils/train_ultralytics_float.py`](../src/utils/train_ultralytics_float.py) for implementation.
+
 ### No Data Loss!
 
@@ -214,9 +214,9 @@ For a 2048×2048 single-channel image:
 | Float32 3-channel | 48 MB | ~48 MB | Training cache |
 | uint8 RGB (old) | 12 MB | ~12 MB | OLD approach with data loss |
 
-The float32 approach uses ~4× more memory and disk space than uint8 but preserves **all information**.
+The float32 approach uses ~3× more memory than uint8 during training but preserves **all information**.
 
-**Cache Directory**: Training creates cached datasets in `data/datasets/_float32_cache/<dataset>_<hash>/`
+**No Disk Cache**: The new on-the-fly approach eliminates the need for cached datasets on disk.
 
 ### Why Direct Numpy Array?
 
@@ -233,50 +233,31 @@ Ultralytics YOLO supports various input types:
 - PIL Images: `PIL.Image`
 - Torch tensors: `torch.Tensor`
 
-## For Training with Custom Dataset
+## Training with Float32 Dataset Loader
 
-If you need to train YOLO on 16-bit TIFF images, you should create a custom dataset loader similar to the example provided by the user:
+The system now includes a custom dataset loader for 16-bit TIFF training:
 
 ```python
-import torch
-import numpy as np
-import tifffile as tiff
-from pathlib import Path
+from src.utils.train_ultralytics_float import train_with_float32_loader
 
-class FloatYoloSegDataset(torch.utils.data.Dataset):
-    def __init__(self, img_dir, label_dir, img_size=640):
-        self.img_paths = sorted(Path(img_dir).glob('*'))
-        self.label_dir = Path(label_dir)
-        self.img_size = img_size
-
-    def __len__(self):
-        return len(self.img_paths)
-
-    def __getitem__(self, idx):
-        img_path = self.img_paths[idx]
-
-        # Load 16-bit TIFF
-        img = tiff.imread(img_path)
-
-        # Convert to float32 [0-1]
-        img = img.astype(np.float32)
-        if img.max() > 1.5:  # Assume 16-bit if max > 1.5
-            img /= 65535.0
-
-        # Grayscale → RGB
-        if img.ndim == 2:
-            img = np.repeat(img[..., None], 3, axis=2)
-
-        # HWC → CHW for PyTorch
-        img = torch.from_numpy(img).permute(2, 0, 1).contiguous()
-
-        # Load labels...
-        # (implementation depends on your label format)
-
-        return img, labels
+# Train with on-the-fly float32 conversion
+results = train_with_float32_loader(
+    model_path="yolov8s-seg.pt",
+    data_yaml="data/my_dataset/data.yaml",
+    epochs=100,
+    batch=16,
+    imgsz=640,
+)
 ```
 
-Then use this dataset with Ultralytics training API or custom training loop.
+The `Float32Dataset` class automatically:
+- Detects 16-bit TIFF files
+- Loads with `tifffile` (not PIL/cv2)
+- Converts to float32 [0-1] on-the-fly
+- Replicates to 3 channels
+- Integrates seamlessly with Ultralytics training pipeline
+
+This is used automatically by the training tab in the GUI.
 
 ## Installation
 
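For readers of this diff, the in-memory conversion that the updated steps describe (keep uint16, scale to float32 [0-1], replicate to 3 channels) might look roughly like the sketch below. It is not taken from `src/utils/train_ultralytics_float.py`; the helper name `to_float32_rgb` is hypothetical. It also assumes scaling is chosen from the array dtype, whereas the removed example used a `max() > 1.5` heuristic. A synthetic array stands in for the result of a `tifffile` read.

```python
import numpy as np


def to_float32_rgb(img: np.ndarray) -> np.ndarray:
    """Hypothetical helper: return a float32 HxWx3 array scaled to [0, 1]."""
    if img.dtype == np.uint16:
        out = img.astype(np.float32) / 65535.0  # full 16-bit range, no uint8 step
    elif img.dtype == np.uint8:
        out = img.astype(np.float32) / 255.0
    else:
        # Assumption: any other dtype is already normalized to [0, 1].
        out = img.astype(np.float32)
    if out.ndim == 2:
        # Grayscale -> 3 identical channels (HWC layout).
        out = np.repeat(out[..., None], 3, axis=2)
    return out


# Synthetic 16-bit grayscale frame standing in for a loaded TIFF.
frame = np.array([[0, 65535], [32768, 16384]], dtype=np.uint16)
rgb = to_float32_rgb(frame)
```

Branching on dtype rather than on the maximum pixel value avoids mis-scaling a dark 16-bit image whose values happen to be small. Whether the committed loader does the same is not visible in this diff.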