# Standalone Float32 Training Script for 16-bit TIFFs

## Overview

This standalone script (`train_float32_standalone.py`) trains YOLO models on 16-bit grayscale TIFF datasets with **no data loss**.

- Loads 16-bit TIFFs with `tifffile` (not PIL/cv2)
- Converts to float32 [0-1] on the fly (preserves full 16-bit precision)
- Replicates grayscale → 3-channel RGB in memory
- **No disk caching required**
- Uses a custom PyTorch Dataset + training loop

## Quick Start

```bash
# Activate virtual environment
source venv/bin/activate

# Train on your 16-bit TIFF dataset
python scripts/train_float32_standalone.py \
    --data data/my_dataset/data.yaml \
    --weights yolov8s-seg.pt \
    --epochs 100 \
    --batch 16 \
    --imgsz 640 \
    --lr 0.0001 \
    --save-dir runs/my_training \
    --device cuda
```

## Arguments

| Argument | Required | Default | Description |
|----------|----------|---------|-------------|
| `--data` | Yes | - | Path to YOLO data.yaml file |
| `--weights` | No | yolov8s-seg.pt | Pretrained model weights |
| `--epochs` | No | 100 | Number of training epochs |
| `--batch` | No | 16 | Batch size |
| `--imgsz` | No | 640 | Input image size |
| `--lr` | No | 0.0001 | Learning rate |
| `--save-dir` | No | runs/train | Directory to save checkpoints |
| `--device` | No | cuda/cpu | Training device (auto-detected) |

## Dataset Format

Your data.yaml should follow the standard YOLO format:

```yaml
path: /path/to/dataset
train: train/images
val: val/images
test: test/images  # optional
names:
  0: class1
  1: class2
nc: 2
```

Directory structure:

```
dataset/
├── train/
│   ├── images/
│   │   ├── img1.tif   (16-bit grayscale TIFF)
│   │   └── img2.tif
│   └── labels/
│       ├── img1.txt   (YOLO format)
│       └── img2.txt
├── val/
│   ├── images/
│   └── labels/
└── data.yaml
```

## Output

The script saves:

- `epoch{N}.pt`: Checkpoint after each epoch
- `best.pt`: Best model weights (lowest loss)
- Training logs to the console

## Features

- ✅ **16-bit precision preserved**: Float32 [0-1] maintains the full dynamic range
- ✅ **No disk caching**: Conversion happens in memory
- ✅ **No PIL/cv2**: Direct tifffile loading
- ✅ **Variable-length labels**: Handles segmentation polygons
- ✅ **Checkpoint saving**: Resume training if interrupted
- ✅ **Best model tracking**: Automatically saves the best weights

## Example

Train a segmentation model on microscopy data:

```bash
python scripts/train_float32_standalone.py \
    --data data/microscopy/data.yaml \
    --weights yolov11s-seg.pt \
    --epochs 150 \
    --batch 8 \
    --imgsz 1024 \
    --lr 0.0003 \
    --save-dir data/models/microscopy_v1
```

## Troubleshooting

### Out of Memory (OOM)

Reduce the batch size:

```bash
--batch 4
```

### Slow Loading

Reduce `num_workers` (edit script line 208):

```python
num_workers=2  # instead of 4
```

### Different Image Sizes

The script expects all images to have the same dimensions. For variable sizes:

1. Implement letterbox/resize in the dataset's `_read_image()`, or
2. Preprocess the images to a common size.

### Loss Computation Errors

If you see "Cannot determine loss", the script may need adjustment for your Ultralytics version, since the format of the model's forward output varies between releases.
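The snippet below is a minimal, hedged sketch (not the script's actual code) of one defensive way to locate the loss, assuming the forward pass returns either a dict with a `'loss'` key or a tuple/list whose first element is the loss tensor; the helper name `extract_loss` is hypothetical.

```python
import torch


def extract_loss(output):
    """Best-effort loss extraction; the exact return format depends on the Ultralytics version."""
    # Case 1: a dict carrying the loss under a 'loss' key
    if isinstance(output, dict) and "loss" in output:
        return output["loss"]
    # Case 2: a tuple/list whose first element is the loss tensor
    if isinstance(output, (tuple, list)) and output and torch.is_tensor(output[0]):
        return output[0]
    raise RuntimeError("Cannot determine loss from model output")
```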
Compare this against the assumption documented in the script:

```python
# In train() function, the preds format may vary
# Current script assumes: preds is tuple with loss OR dict with 'loss' key
```

## vs GUI Training

| Feature | Standalone Script | GUI Training Tab |
|---------|------------------|------------------|
| Float32 conversion | ✓ Yes | ✓ Yes (automatic) |
| Disk caching | ✗ None | ✗ None |
| Progress UI | ✗ Console only | ✓ Visual progress bar |
| Dataset selection | Manual CLI args | ✓ GUI browsing |
| Multi-stage training | Manual runs | ✓ Built-in |
| Use case | Advanced users | General users |

## Technical Details

### Data Loading Pipeline

```
16-bit TIFF file
  ↓ (tifffile.imread)
uint16 [0-65535]
  ↓ (/ 65535.0)
float32 [0-1]
  ↓ (replicate channels)
float32 RGB (H,W,3) [0-1]
  ↓ (permute to C,H,W)
torch.Tensor (3,H,W) float32
  ↓ (DataLoader stack)
Batch (B,3,H,W) float32
  ↓
YOLO Model
```

A minimal code sketch of this pipeline is included in the appendix at the end of this document.

### Precision Comparison

| Method | Unique Values | Data Loss |
|--------|---------------|-----------|
| **float32 [0-1]** | 65,536 | None ✓ |
| uint16 RGB | 65,536 | None ✓ |
| uint8 | 256 | 99.6% ✗ |

Example: pixel value 32,768 (middle intensity)

- Float32: 32768 / 65535.0 = 0.50000763 (exact)
- uint8: 32768 → 128, so every 256 consecutive 16-bit values collapse onto a single 8-bit value

## License

Same as main project.
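## Appendix: Data Loading Sketch

For reference, the pipeline in the Technical Details section can be written as a short PyTorch `Dataset`. This is an illustrative sketch rather than the script's actual implementation; the class name `TiffFloat32Dataset` is a placeholder, and it assumes all images already share the same dimensions.

```python
import numpy as np
import tifffile
import torch
from torch.utils.data import Dataset


class TiffFloat32Dataset(Dataset):
    """Illustrative dataset: 16-bit grayscale TIFF -> float32 RGB tensor in [0, 1]."""

    def __init__(self, image_paths):
        self.image_paths = list(image_paths)

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        # uint16 [0-65535], shape (H, W)
        img = tifffile.imread(self.image_paths[idx])
        # float32 [0-1]; full 16-bit precision is preserved
        img = img.astype(np.float32) / 65535.0
        # replicate grayscale to 3 channels: (H, W) -> (H, W, 3)
        img = np.repeat(img[..., None], 3, axis=-1)
        # (H, W, 3) -> (3, H, W) for PyTorch
        return torch.from_numpy(img).permute(2, 0, 1)
```

With images of equal size, a standard `DataLoader` stacks these tensors into float32 batches of shape (B, 3, H, W), which is the format fed to the YOLO model.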