# Standalone Float32 Training Script for 16-bit TIFFs

## Overview

This standalone script (`train_float32_standalone.py`) trains YOLO models on 16-bit grayscale TIFF datasets with **no data loss**. It:

- Loads 16-bit TIFFs with `tifffile` (not PIL/cv2)
- Converts to float32 [0-1] on the fly (preserves full 16-bit precision)
- Replicates grayscale → 3-channel RGB in memory
- **Requires no disk caching**
- Uses a custom PyTorch Dataset + training loop (see the sketch below)
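
A minimal sketch of that per-image pipeline (the helper name is illustrative, not necessarily the function the script defines):

```python
# Minimal sketch of the per-image pipeline; `load_tiff_as_float32` is an
# illustrative name, not necessarily what the script uses.
import numpy as np
import tifffile
import torch

def load_tiff_as_float32(path: str) -> torch.Tensor:
    img = tifffile.imread(path)                # uint16, shape (H, W)
    img = img.astype(np.float32) / 65535.0     # float32 [0-1], full 16-bit precision kept
    img = np.stack([img] * 3, axis=-1)         # replicate gray -> (H, W, 3)
    return torch.from_numpy(img).permute(2, 0, 1).contiguous()  # (3, H, W) float32
```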

## Quick Start

```bash
# Activate virtual environment
source venv/bin/activate

# Train on your 16-bit TIFF dataset
python scripts/train_float32_standalone.py \
    --data data/my_dataset/data.yaml \
    --weights yolov8s-seg.pt \
    --epochs 100 \
    --batch 16 \
    --imgsz 640 \
    --lr 0.0001 \
    --save-dir runs/my_training \
    --device cuda
```

## Arguments

| Argument | Required | Default | Description |
|----------|----------|---------|-------------|
| `--data` | Yes | - | Path to YOLO data.yaml file |
| `--weights` | No | `yolov8s-seg.pt` | Pretrained model weights |
| `--epochs` | No | 100 | Number of training epochs |
| `--batch` | No | 16 | Batch size |
| `--imgsz` | No | 640 | Input image size |
| `--lr` | No | 0.0001 | Learning rate |
| `--save-dir` | No | `runs/train` | Directory to save checkpoints |
| `--device` | No | auto-detected | Training device (`cuda` if available, else `cpu`) |
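
For orientation, an argparse setup mirroring this table might look like the following (assumed to match the table; the script's actual parser may differ in details):

```python
# Sketch of an argparse parser matching the table above (illustrative).
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--data", required=True, help="Path to YOLO data.yaml")
parser.add_argument("--weights", default="yolov8s-seg.pt")
parser.add_argument("--epochs", type=int, default=100)
parser.add_argument("--batch", type=int, default=16)
parser.add_argument("--imgsz", type=int, default=640)
parser.add_argument("--lr", type=float, default=0.0001)
parser.add_argument("--save-dir", default="runs/train")
parser.add_argument("--device", default=None, help="cuda or cpu; auto-detected if omitted")
```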

## Dataset Format

Your data.yaml should follow the standard YOLO format:

```yaml
path: /path/to/dataset
train: train/images
val: val/images
test: test/images  # optional

names:
  0: class1
  1: class2

nc: 2
```

Directory structure:

```
dataset/
├── train/
│   ├── images/
│   │   ├── img1.tif   (16-bit grayscale TIFF)
│   │   └── img2.tif
│   └── labels/
│       ├── img1.txt   (YOLO format)
│       └── img2.txt
├── val/
│   ├── images/
│   └── labels/
└── data.yaml
```
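
Under this layout, each label path is derived from its image path by the usual YOLO convention; a tiny illustrative helper (not the script's own code):

```python
# Illustrative helper for the standard YOLO image -> label path mapping;
# the script's internal implementation may differ.
from pathlib import Path

def label_path_for(image_path: str) -> Path:
    p = Path(image_path)
    # dataset/train/images/img1.tif -> dataset/train/labels/img1.txt
    return p.parent.parent / "labels" / (p.stem + ".txt")
```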

## Output

The script saves:

- `epoch{N}.pt`: Checkpoint after each epoch
- `best.pt`: Best model weights (lowest loss)
- Training logs to the console
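
A minimal sketch of the checkpoint bookkeeping this implies (function and argument names are assumptions, simplified from a typical training loop):

```python
# Sketch of per-epoch checkpointing plus best-model tracking; names are
# illustrative, not the script's actual API.
from pathlib import Path
import torch

def save_checkpoints(model, epoch: int, epoch_loss: float,
                     best_loss: float, save_dir: Path) -> float:
    save_dir.mkdir(parents=True, exist_ok=True)
    torch.save(model.state_dict(), save_dir / f"epoch{epoch}.pt")
    if epoch_loss < best_loss:               # lowest loss so far wins best.pt
        best_loss = epoch_loss
        torch.save(model.state_dict(), save_dir / "best.pt")
    return best_loss
```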

## Features

✅ **16-bit precision preserved**: Float32 [0-1] maintains full dynamic range
✅ **No disk caching**: Conversion happens in memory
✅ **No PIL/cv2**: Direct tifffile loading
✅ **Variable-length labels**: Handles segmentation polygons (see the collate sketch below)
✅ **Checkpoint saving**: Resume training if interrupted
✅ **Best model tracking**: Automatically saves best weights
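
For the variable-length labels point, a minimal sketch of the kind of collate function involved (assumed approach, not the script's exact code):

```python
# DataLoader collate_fn sketch for variable-length segmentation labels:
# images stack into one batch tensor, label tensors stay in a per-image
# list because polygon vertex counts differ between images.
import torch

def collate_fn(batch):
    images, labels = zip(*batch)                  # sequence of (image, labels) pairs
    return torch.stack(images, 0), list(labels)   # (B, 3, H, W), list of label tensors
```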

## Example

Train a segmentation model on microscopy data:

```bash
python scripts/train_float32_standalone.py \
    --data data/microscopy/data.yaml \
    --weights yolov11s-seg.pt \
    --epochs 150 \
    --batch 8 \
    --imgsz 1024 \
    --lr 0.0003 \
    --save-dir data/models/microscopy_v1
```

## Troubleshooting

### Out of Memory (OOM)

Reduce the batch size:

```bash
--batch 4
```

### Slow Loading

Reduce `num_workers` (edit script line 208):

```python
num_workers=2  # instead of 4
```

### Different Image Sizes

The script expects all images to share the same dimensions. For variable sizes, either:

1. Implement letterbox/resize in the dataset's `_read_image()` (see the sketch below), or
2. Preprocess all images to the same size.
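
A minimal letterbox sketch for option 1, assuming a float32 (3, H, W) tensor as produced by the loading pipeline (labels would need the same scale and padding offsets applied, which is omitted here):

```python
# Letterbox sketch: scale the longer side to `size`, pad the rest with zeros.
# Illustrative only; label coordinates must be transformed the same way.
import torch
import torch.nn.functional as F

def letterbox(img: torch.Tensor, size: int = 640) -> torch.Tensor:
    _, h, w = img.shape
    scale = size / max(h, w)
    nh, nw = round(h * scale), round(w * scale)
    img = F.interpolate(img[None], size=(nh, nw),
                        mode="bilinear", align_corners=False)[0]
    # pad right/bottom with zeros to reach (3, size, size)
    return F.pad(img, (0, size - nw, 0, size - nh), value=0.0)
```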

### Loss Computation Errors

If you see "Cannot determine loss", the script may need adjustment for your Ultralytics version. Check:

```python
# In the train() function, the preds format may vary by Ultralytics version.
# The current script assumes preds is either a tuple containing the loss
# or a dict with a 'loss' key.
```
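
If you need to adapt it, a hedged sketch of loss extraction covering both assumed formats:

```python
# Illustrative loss extraction for the two assumed output formats; adjust
# to whatever your Ultralytics version actually returns.
def extract_loss(preds):
    if isinstance(preds, dict) and "loss" in preds:
        return preds["loss"]
    if isinstance(preds, tuple):
        return preds[0]      # assume the loss is the first element
    raise RuntimeError("Cannot determine loss from model output")
```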

## vs GUI Training

| Feature | Standalone Script | GUI Training Tab |
|---------|------------------|------------------|
| Float32 conversion | ✓ Yes | ✓ Yes (automatic) |
| Disk caching | ✗ None | ✗ None |
| Progress UI | ✗ Console only | ✓ Visual progress bar |
| Dataset selection | Manual CLI args | ✓ GUI browsing |
| Multi-stage training | Manual runs | ✓ Built-in |
| Use case | Advanced users | General users |

## Technical Details

### Data Loading Pipeline

```
16-bit TIFF file
  ↓ (tifffile.imread)
uint16 [0-65535]
  ↓ (/ 65535.0)
float32 [0-1]
  ↓ (replicate channels)
float32 RGB (H,W,3) [0-1]
  ↓ (permute to C,H,W)
torch.Tensor (3,H,W) float32
  ↓ (DataLoader stack)
Batch (B,3,H,W) float32
  ↓
YOLO Model
```

### Precision Comparison

| Method | Unique Values | Data Loss |
|--------|---------------|-----------|
| **float32 [0-1]** | ~65,536 | None ✓ |
| uint16 RGB | 65,536 | None ✓ |
| uint8 | 256 | 99.6% ✗ |

Example: pixel value 32,768 (middle intensity)

- Float32: 32768 / 65535.0 = 0.50000763 (exact)
- uint8: 32768 → 128, and every 256 consecutive 16-bit values collapse to the same 8-bit value
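
A quick NumPy check of these numbers, standalone and independent of the script:

```python
# Every uint16 value stays distinct after the float32 [0-1] conversion,
# while an 8-bit conversion keeps only 256 of 65,536 values (99.6% lost).
import numpy as np

values = np.arange(65536, dtype=np.uint16)
as_float32 = values.astype(np.float32) / 65535.0
print(np.unique(as_float32).size)            # 65536 -> no collisions
as_uint8 = (values >> 8).astype(np.uint8)    # drop low byte (common 16->8-bit conversion)
print(np.unique(as_uint8).size)              # 256   -> 99.6% of distinct values lost
```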

## License

Same as the main project.