41 Commits

Author SHA1 Message Date
87095ec3f0 Merge branch 'float32integ' of code.sysbio.ioc.ee:martin/object-segmentation into float32integ 2025-12-16 13:25:42 +02:00
2dbfa54256 Update 2025-12-16 13:25:20 +02:00
c7e1271193 Adding file 2025-12-13 09:42:00 +02:00
aec0fbf83c Adding standalone training script and update 2025-12-13 09:28:24 +02:00
908e9a5b82 Bug fix 2025-12-13 01:18:16 +02:00
edcd448a61 Update, cleanup 2025-12-13 01:06:40 +02:00
2411223a14 Adding test scripts 2025-12-13 00:32:32 +02:00
b3b1e3acff Implementing float32 data management 2025-12-13 00:31:23 +02:00
9c4c39fb39 Adding image converter 2025-12-12 23:52:34 +02:00
20a87c9040 Updating config 2025-12-12 21:51:12 +02:00
9f7d2be1ac Updating the base model preset 2025-12-11 23:27:02 +02:00
dbde07c0e8 Making training tab scrollable 2025-12-11 23:12:39 +02:00
b3c5a51dbb Using QPolygonF instead of drawLine 2025-12-11 17:14:07 +02:00
9a221acb63 Making image manipulations through one class 2025-12-11 16:59:56 +02:00
32a6a122bd Fixing circular import 2025-12-11 16:06:39 +02:00
9ba44043ef Defining image extensions only in one place 2025-12-11 15:50:14 +02:00
8eb1cc8c86 Fixing grayscale conversion 2025-12-11 15:15:38 +02:00
e4ce882a18 Grayscale RGB conversion modified 2025-12-11 15:06:59 +02:00
6b6d6fad03 2-stage training fix 2025-12-11 12:50:34 +02:00
c0684a9c14 Implementing 2 stage training 2025-12-11 12:04:08 +02:00
221c80aa8c Small image showing fix 2025-12-11 11:20:20 +02:00
833b222fad Adding result shower 2025-12-10 16:55:28 +02:00
5370d31dce Merge pull request 'Update training' (#2) from training into main
Reviewed-on: #2
2025-12-10 15:47:00 +02:00
5d196c3a4a Update training 2025-12-10 15:46:26 +02:00
f719c7ec40 Merge pull request 'segmentation' (#1) from segmentation into main
Reviewed-on: #1
2025-12-10 12:08:54 +02:00
e6a5e74fa1 Adding feature to remove annotations 2025-12-10 00:19:59 +02:00
35e2398e95 Fixing bounding box drawing 2025-12-09 23:56:29 +02:00
c3d44ac945 Renaming Pen tool to polyline tool 2025-12-09 23:38:23 +02:00
dad5c2bf74 Updating 2025-12-09 22:44:23 +02:00
73cb698488 Saving state before replacing annotation tool 2025-12-09 22:00:56 +02:00
12f2bf94d5 Updating polyline saving and drawing 2025-12-09 15:42:42 +02:00
710b684456 Updating annotations 2025-12-08 23:59:44 +02:00
fc22479621 Adding pen tool for annotation 2025-12-08 23:15:54 +02:00
f84dea0bff Adding splitter and saving layout state when closing the app 2025-12-08 22:40:07 +02:00
bb26d43dd7 Adding image_display widget 2025-12-08 17:33:32 +02:00
4b5d2a7c45 Adding image loading 2025-12-08 16:28:58 +02:00
42fb2b782d Bug fix in installing and launching the program 2025-12-05 16:18:37 +02:00
310e0b2285 Making it an installable package and switching to segmentation mode 2025-12-05 15:51:16 +02:00
9011276584 Small update 2025-12-05 15:30:19 +02:00
6bd2b100ca Adding python files 2025-12-05 09:50:50 +02:00
c6143cd11a Adding documentation and main.py 2025-12-05 09:44:00 +02:00
59 changed files with 14753 additions and 58 deletions

View File

@@ -2,11 +2,11 @@
## Project Overview
A desktop application for detecting organelles and membrane branching structures in microscopy images using YOLOv8s, with comprehensive training, validation, and visualization capabilities.
A desktop application for detecting and segmenting organelles and membrane branching structures in microscopy images using YOLOv8s-seg, with comprehensive training, validation, and visualization capabilities including pixel-accurate segmentation masks.
## Technology Stack
- **ML Framework**: Ultralytics YOLOv8 (YOLOv8s.pt model)
- **ML Framework**: Ultralytics YOLOv8 (YOLOv8s-seg.pt segmentation model)
- **GUI Framework**: PySide6 (Qt6 for Python)
- **Visualization**: pyqtgraph
- **Database**: SQLite3
@@ -110,6 +110,7 @@ erDiagram
float x_max
float y_max
float confidence
text segmentation_mask
datetime detected_at
json metadata
}
@@ -122,6 +123,7 @@ erDiagram
float y_min
float x_max
float y_max
text segmentation_mask
string annotator
datetime created_at
boolean verified
@@ -139,7 +141,7 @@ Stores information about trained models and their versions.
| model_name | TEXT | NOT NULL | User-friendly model name |
| model_version | TEXT | NOT NULL | Version string (e.g., "v1.0") |
| model_path | TEXT | NOT NULL | Path to model weights file |
| base_model | TEXT | NOT NULL | Base model used (e.g., "yolov8s.pt") |
| base_model | TEXT | NOT NULL | Base model used (e.g., "yolov8s-seg.pt") |
| created_at | TIMESTAMP | DEFAULT CURRENT_TIMESTAMP | Model creation timestamp |
| training_params | JSON | | Training hyperparameters |
| metrics | JSON | | Validation metrics (mAP, precision, recall) |
@@ -159,7 +161,7 @@ Stores metadata about microscopy images.
| checksum | TEXT | | MD5 hash for integrity verification |
#### **detections** table
Stores object detection results.
Stores object detection results with optional segmentation masks (see the storage sketch after the table).
| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
@@ -172,11 +174,12 @@ Stores object detection results.
| x_max | REAL | NOT NULL | Bounding box right coordinate (normalized 0-1) |
| y_max | REAL | NOT NULL | Bounding box bottom coordinate (normalized 0-1) |
| confidence | REAL | NOT NULL | Detection confidence score (0-1) |
| segmentation_mask | TEXT | | JSON array of polygon coordinates [[x1,y1], [x2,y2], ...] (normalized 0-1) |
| detected_at | TIMESTAMP | DEFAULT CURRENT_TIMESTAMP | When detection was performed |
| metadata | JSON | | Additional metadata (processing time, etc.) |
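Because `segmentation_mask` is stored as plain TEXT, writing and reading it reduces to JSON encoding and decoding. A minimal sketch of that round trip (it assumes the table has an integer `id` primary key, which is not shown in this excerpt, and the helper names are illustrative):
```python
import json
import sqlite3
from typing import List, Optional

def save_detection_mask(conn: sqlite3.Connection, detection_id: int,
                        polygon: List[List[float]]) -> None:
    """Store a normalized polygon [[x1, y1], [x2, y2], ...] as JSON text."""
    conn.execute(
        "UPDATE detections SET segmentation_mask = ? WHERE id = ?",
        (json.dumps(polygon), detection_id),
    )
    conn.commit()

def load_detection_mask(conn: sqlite3.Connection,
                        detection_id: int) -> Optional[List[List[float]]]:
    """Read the polygon back; returns None when no mask was stored."""
    row = conn.execute(
        "SELECT segmentation_mask FROM detections WHERE id = ?",
        (detection_id,),
    ).fetchone()
    return json.loads(row[0]) if row and row[0] else None
```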
#### **annotations** table
Stores manual annotations for training data (future feature).
Stores manual annotations for training data with optional segmentation masks (future feature).
| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
@@ -187,6 +190,7 @@ Stores manual annotations for training data (future feature).
| y_min | REAL | NOT NULL | Bounding box top coordinate (normalized) |
| x_max | REAL | NOT NULL | Bounding box right coordinate (normalized) |
| y_max | REAL | NOT NULL | Bounding box bottom coordinate (normalized) |
| segmentation_mask | TEXT | | JSON array of polygon coordinates [[x1,y1], [x2,y2], ...] (normalized 0-1) |
| annotator | TEXT | | Name of person who created annotation |
| created_at | TIMESTAMP | DEFAULT CURRENT_TIMESTAMP | Annotation timestamp |
| verified | BOOLEAN | DEFAULT 0 | Whether annotation is verified |
@@ -245,8 +249,9 @@ graph TB
### Key Components
#### 1. **YOLO Wrapper** ([`src/model/yolo_wrapper.py`](src/model/yolo_wrapper.py))
Encapsulates YOLOv8 operations:
- Load pre-trained YOLOv8s model
Encapsulates YOLOv8-seg operations:
- Load pre-trained YOLOv8s-seg segmentation model
- Extract pixel-accurate segmentation masks
- Fine-tune on custom microscopy dataset
- Export trained models
- Provide training progress callbacks
@@ -255,10 +260,10 @@ Encapsulates YOLOv8 operations:
**Key Methods:**
```python
class YOLOWrapper:
    def __init__(self, model_path: str = "yolov8s.pt")
    def __init__(self, model_path: str = "yolov8s-seg.pt")
    def train(self, data_yaml: str, epochs: int, callbacks: dict)
    def validate(self, data_yaml: str) -> dict
    def predict(self, image_path: str, conf: float) -> list
    def predict(self, image_path: str, conf: float) -> list  # Returns detections with segmentation masks
    def export_model(self, format: str, output_path: str)
```
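A hedged usage sketch of this interface; the image path is illustrative, and the keys read from each returned detection are assumptions rather than the wrapper's documented output format:
```python
from src.model.yolo_wrapper import YOLOWrapper

# Load the segmentation base model and run a single prediction.
yolo = YOLOWrapper("yolov8s-seg.pt")
yolo.load_model()
detections = yolo.predict("data/datasets/example.tif", conf=0.25)

for det in detections:
    # Keys below are illustrative; the real return format is defined in yolo_wrapper.py.
    print(det.get("class_name"), det.get("confidence"))
    polygon = det.get("segmentation_mask")  # normalized [[x, y], ...] or None
    if polygon:
        print(f"  mask with {len(polygon)} points")
```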
@@ -435,7 +440,7 @@ image_repository:
allowed_extensions: [".jpg", ".jpeg", ".png", ".tif", ".tiff"]
models:
default_base_model: "yolov8s.pt"
default_base_model: "yolov8s-seg.pt"
models_directory: "data/models"
training:

BUILD.md Normal file

@@ -0,0 +1,178 @@
# Building and Publishing Guide
This guide explains how to build and publish the microscopy-object-detection package.
## Prerequisites
```bash
pip install build twine
```
## Building the Package
### 1. Clean Previous Builds
```bash
rm -rf build/ dist/ *.egg-info
```
### 2. Build Distribution Archives
```bash
python -m build
```
This will create both wheel (`.whl`) and source distribution (`.tar.gz`) in the `dist/` directory.
### 3. Verify the Build
```bash
ls dist/
# Should show:
# microscopy_object_detection-1.0.0-py3-none-any.whl
# microscopy_object_detection-1.0.0.tar.gz
```
## Testing the Package Locally
### Install in Development Mode
```bash
pip install -e .
```
### Install from Built Package
```bash
pip install dist/microscopy_object_detection-1.0.0-py3-none-any.whl
```
### Test the Installation
```bash
# Test CLI
microscopy-detect --version
# Test GUI launcher
microscopy-detect-gui
```
## Publishing to PyPI
### 1. Configure PyPI Credentials
Create or update `~/.pypirc`:
```ini
[pypi]
username = __token__
password = pypi-YOUR-API-TOKEN-HERE
```
### 2. Upload to Test PyPI (Recommended First)
```bash
python -m twine upload --repository testpypi dist/*
```
Then test installation:
```bash
pip install --index-url https://test.pypi.org/simple/ microscopy-object-detection
```
### 3. Upload to PyPI
```bash
python -m twine upload dist/*
```
## Version Management
Update the version string in multiple files (a consistency-check sketch follows this list):
- `setup.py`: Update `version` parameter
- `pyproject.toml`: Update `version` field
- `src/__init__.py`: Update `__version__` variable
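A small sketch for verifying that the three version strings agree before building; the regular expressions are illustrative and assume simple `version = "..."` style assignments:
```python
import re
from pathlib import Path

# Extract the version string from each file listed above.
patterns = {
    "setup.py": r"version\s*=\s*['\"]([^'\"]+)['\"]",
    "pyproject.toml": r"^version\s*=\s*['\"]([^'\"]+)['\"]",
    "src/__init__.py": r"__version__\s*=\s*['\"]([^'\"]+)['\"]",
}

versions = {}
for filename, pattern in patterns.items():
    match = re.search(pattern, Path(filename).read_text(), re.MULTILINE)
    versions[filename] = match.group(1) if match else None

print(versions)
if len(set(versions.values())) != 1:
    raise SystemExit("Version strings are out of sync, fix them before building!")
```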
## Git Tags
After publishing, tag the release:
```bash
git tag -a v1.0.0 -m "Release version 1.0.0"
git push origin v1.0.0
```
## Package Structure
The built package includes:
- All Python source files in `src/`
- Configuration files in `config/`
- Database schema file (`src/database/schema.sql`)
- Documentation files (README.md, LICENSE, etc.)
- Entry points for CLI and GUI
## Troubleshooting
### Import Errors
If you get import errors, ensure:
- All `__init__.py` files are present
- Package structure follows the setup configuration
- Dependencies are listed in `requirements.txt`
### Missing Files
If files are missing in the built package:
- Check `MANIFEST.in` includes the required patterns
- Check `pyproject.toml` package-data configuration
- Rebuild with `python -m build --no-isolation` for debugging
### Version Conflicts
If version conflicts occur:
- Ensure version is consistent across all files
- Clear build artifacts and rebuild
- Check for cached installations: `pip list | grep microscopy`
## CI/CD Integration
### GitHub Actions Example
```yaml
name: Build and Publish
on:
  release:
    types: [created]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: '3.8'
      - name: Install dependencies
        run: |
          pip install build twine
      - name: Build package
        run: python -m build
      - name: Publish to PyPI
        env:
          TWINE_USERNAME: __token__
          TWINE_PASSWORD: ${{ secrets.PYPI_TOKEN }}
        run: twine upload dist/*
```
## Best Practices
1. **Version Bumping**: Use semantic versioning (MAJOR.MINOR.PATCH)
2. **Testing**: Always test on Test PyPI before publishing to PyPI
3. **Documentation**: Update README.md and CHANGELOG.md for each release
4. **Git Tags**: Tag releases in git for easy reference
5. **Dependencies**: Keep requirements.txt updated and specify version ranges
## Resources
- [Python Packaging Guide](https://packaging.python.org/)
- [setuptools Documentation](https://setuptools.pypa.io/)
- [PyPI Publishing Guide](https://packaging.python.org/tutorials/packaging-projects/)

INSTALL_TEST.md Normal file

@@ -0,0 +1,236 @@
# Installation Testing Guide
This guide helps you verify that the package installation works correctly.
## Clean Installation Test
### 1. Remove Any Previous Installations
```bash
# Deactivate any active virtual environment
deactivate
# Remove old virtual environment (if exists)
rm -rf venv
# Create fresh virtual environment
python3 -m venv venv
source venv/bin/activate # On Linux/Mac
# or
venv\Scripts\activate # On Windows
```
### 2. Install the Package
#### Option A: Editable/Development Install
```bash
pip install -e .
```
This allows you to modify source code and see changes immediately.
#### Option B: Regular Install
```bash
pip install .
```
This installs the package as if it were from PyPI.
### 3. Verify Installation
```bash
# Check package is installed
pip list | grep microscopy
# Check version
microscopy-detect --version
# Expected output: microscopy-object-detection 1.0.0
# Test Python import
python -c "import src; print(src.__version__)"
# Expected output: 1.0.0
```
### 4. Test Entry Points
```bash
# Test CLI
microscopy-detect --help
# Test GUI launcher (will open window)
microscopy-detect-gui
```
### 5. Verify Package Contents
```python
# Run this in Python shell
import src
import src.database
import src.model
import src.gui
# Check schema file is included
from pathlib import Path
import src.database
db_path = Path(src.database.__file__).parent
schema_file = db_path / 'schema.sql'
print(f"Schema file exists: {schema_file.exists()}")
# Expected: Schema file exists: True
```
## Troubleshooting
### Issue: ModuleNotFoundError
**Error:**
```
ModuleNotFoundError: No module named 'src'
```
**Solution:**
```bash
# Reinstall with verbose output
pip install -e . -v
# Or try regular install
pip install . --force-reinstall
```
### Issue: Entry Points Not Working
**Error:**
```
microscopy-detect: command not found
```
**Solution:**
```bash
# Check if scripts are in PATH
which microscopy-detect
# If not found, check pip install location
pip show microscopy-object-detection
# You might need to add to PATH or use full path
~/.local/bin/microscopy-detect # Linux
```
### Issue: Import Errors for PySide6
**Error:**
```
ImportError: cannot import name 'QApplication' from 'PySide6.QtWidgets'
```
**Solution:**
```bash
# Install Qt dependencies (Linux only)
sudo apt-get install libxcb-xinerama0
# Reinstall PySide6
pip uninstall PySide6
pip install PySide6
```
### Issue: Config Files Not Found
**Error:**
```
FileNotFoundError: config/app_config.yaml
```
**Solution:**
The config file should be created automatically. If not:
```bash
# Create config directory in your home
mkdir -p ~/.microscopy-detect
cp config/app_config.yaml ~/.microscopy-detect/
# Or run from source directory first time
cd /home/martin/code/object_detection
python main.py
```
## Manual Testing Checklist
- [ ] Package installs without errors
- [ ] Version command works (`microscopy-detect --version`)
- [ ] Help command works (`microscopy-detect --help`)
- [ ] GUI launches (`microscopy-detect-gui`)
- [ ] Can import all modules in Python
- [ ] Database schema file is accessible
- [ ] Configuration loads correctly
## Build and Install from Wheel
```bash
# Build the package
python -m build
# Install from wheel
pip install dist/microscopy_object_detection-1.0.0-py3-none-any.whl
# Test
microscopy-detect --version
```
## Uninstall
```bash
pip uninstall microscopy-object-detection
```
## Development Workflow
### After Code Changes
If installed with `-e` (editable mode):
- Python code changes are immediately available
- No need to reinstall
If installed with regular `pip install .`:
- Reinstall after changes: `pip install . --force-reinstall`
### After Adding New Files
```bash
# Reinstall to include new files
pip install -e . --force-reinstall
```
## Expected Installation Output
```
Processing /home/martin/code/object_detection
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: microscopy-object-detection
Building wheel for microscopy-object-detection (pyproject.toml) ... done
Successfully built microscopy-object-detection
Installing collected packages: microscopy-object-detection
Successfully installed microscopy-object-detection-1.0.0
```
## Success Criteria
Installation is successful when:
1. ✅ No error messages during installation
2. ✅ `pip list` shows the package
3. ✅ `microscopy-detect --version` returns correct version
4. ✅ GUI launches without errors
5. ✅ All Python modules can be imported
6. ✅ Database operations work
7. ✅ Detection functionality works
## Next Steps After Successful Install
1. Configure image repository path
2. Run first detection
3. Train a custom model
4. Export results
For usage instructions, see [QUICKSTART.md](QUICKSTART.md)

LICENSE Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2024 Your Name
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

MANIFEST.in Normal file

@@ -0,0 +1,37 @@
# Include documentation files
include README.md
include LICENSE
include ARCHITECTURE.md
include IMPLEMENTATION_GUIDE.md
include QUICKSTART.md
include PLAN_SUMMARY.md
# Include requirements
include requirements.txt
# Include configuration files
recursive-include config *.yaml
recursive-include config *.yml
# Include database schema
recursive-include src/database *.sql
# Include tests
recursive-include tests *.py
# Exclude compiled Python files
global-exclude *.pyc
global-exclude *.pyo
global-exclude __pycache__
global-exclude *.so
global-exclude .DS_Store
# Exclude git and IDE files
global-exclude .git*
global-exclude .vscode
global-exclude .idea
# Exclude build artifacts
prune build
prune dist
prune *.egg-info

PLAN_SUMMARY.md Normal file

@@ -0,0 +1,323 @@
# Project Plan Summary - Microscopy Object Detection Application
## Executive Summary
This document provides a high-level overview of the planned microscopy object detection application. The application will enable users to train, validate, and deploy YOLOv8-based object detection models for microscopy images, with comprehensive GUI, database storage, and visualization capabilities.
## Project Goals
1. **Detect Objects**: Identify organelles and membrane branching structures in microscopy images
2. **Train Models**: Fine-tune YOLOv8s on custom datasets with YOLO format annotations
3. **Manage Results**: Store and query detection results in SQLite database
4. **Visualize Data**: Interactive plots and image viewers for analysis
5. **Export Results**: Flexible export options (CSV, JSON, Excel)
6. **Future Annotation**: Provide manual annotation capabilities for dataset preparation
## Key Technologies
| Component | Technology | Purpose |
|-----------|------------|---------|
| ML Framework | Ultralytics YOLOv8 | Object detection model |
| GUI | PySide6 | Desktop application interface |
| Visualization | pyqtgraph | Real-time plots and charts |
| Database | SQLite | Detection results storage |
| Image Processing | OpenCV, Pillow | Image manipulation |
## Application Architecture
### Three-Layer Architecture
```
┌─────────────────────────────────────┐
│ GUI Layer (PySide6) │
│ - Main Window with Tabs │
│ - Dialogs & Custom Widgets │
└──────────────┬──────────────────────┘
┌──────────────▼──────────────────────┐
│ Business Logic Layer │
│ - YOLO Wrapper │
│ - Inference Engine │
│ - Database Manager │
│ - Config Manager │
└──────────────┬──────────────────────┘
┌──────────────▼──────────────────────┐
│ Data Layer │
│ - SQLite Database │
│ - File System (images, models) │
│ - YOLOv8 Model Files │
└─────────────────────────────────────┘
```
## Core Features
### 1. Detection Tab ✨
- **Single Image Detection**: Select and detect objects in one image
- **Batch Processing**: Process entire folders of images
- **Real-time Preview**: View detections with bounding boxes
- **Confidence Control**: Adjustable threshold slider
- **Database Storage**: Automatic saving of results with metadata
### 2. Training Tab 🎓
- **Dataset Selection**: Browse for YOLO format datasets
- **Parameter Configuration**: Epochs, batch size, image size, learning rate
- **Progress Monitoring**: Real-time training metrics and loss curves
- **Model Versioning**: Automatic model naming and version tracking
- **Validation Metrics**: Track mAP, precision, recall during training
### 3. Validation Tab 📊
- **Model Evaluation**: Validate models on test datasets
- **Metrics Visualization**:
- Confusion matrix
- Precision-Recall curves
- Class-wise performance
- **Model Comparison**: Compare multiple model versions
### 4. Results Tab 📈
- **Detection Browser**: Searchable table of all detections
- **Advanced Filtering**: By date, model, class, confidence
- **Statistics Dashboard**:
- Detection count by class
- Confidence distribution
- Timeline visualization
- **Export Options**: CSV, JSON, Excel formats
### 5. Annotation Tab 🖊️ (Future Feature)
- **Image Browser**: Navigate through image repository
- **Drawing Tools**: Create bounding boxes
- **Class Assignment**: Label objects
- **YOLO Export**: Export annotations for training
## Database Design
### Tables Overview
| Table | Purpose | Key Fields |
|-------|---------|------------|
| **models** | Store trained models | name, version, path, metrics |
| **images** | Image metadata | path, dimensions, checksum |
| **detections** | Detection results | bbox, class, confidence |
| **annotations** | Manual labels | bbox, class, annotator |
### Key Relationships
- Each detection links to one image and one model
- Each image can have multiple detections from multiple models
- Annotations are separate from automated detections (a query sketch follows this list)
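For illustration, a hedged `sqlite3` sketch of querying across these relationships. The column names `id` and `model_id` are assumptions consistent with the tables overview above, not verified schema details:
```python
import sqlite3

conn = sqlite3.connect("data/detections.db")

# Count detections per model; join column names are assumed (see note above).
query = """
    SELECT m.model_name, COUNT(*) AS n
    FROM detections AS d
    JOIN models AS m ON m.id = d.model_id
    GROUP BY m.model_name
    ORDER BY n DESC
"""
for model_name, n in conn.execute(query):
    print(f"{model_name}: {n} detections")
conn.close()
```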
## Implementation Phases
### Phase 1: Core Foundation (Weeks 1-2)
- [x] Architecture design and documentation
- [ ] Project structure setup
- [ ] Database schema implementation
- [ ] YOLO wrapper basic functionality
- [ ] Database manager CRUD operations
### Phase 2: GUI Development (Weeks 3-4)
- [ ] Main window and tab structure
- [ ] Detection tab implementation
- [ ] Training tab implementation
- [ ] Basic visualization widgets
- [ ] Configuration management
### Phase 3: Advanced Features (Weeks 5-6)
- [ ] Validation tab with metrics
- [ ] Results tab with filtering
- [ ] Export functionality
- [ ] Error handling and logging
- [ ] Performance optimization
### Phase 4: Polish & Testing (Week 7)
- [ ] Unit tests for all components
- [ ] Integration testing
- [ ] User documentation
- [ ] Bug fixes and refinements
- [ ] Deployment preparation
### Phase 5: Future Enhancements (Post-Launch)
- [ ] Annotation tool
- [ ] Real-time camera detection
- [ ] Multi-model ensemble
- [ ] Cloud integration
## File Structure
```
object_detection/
├── main.py # Entry point
├── requirements.txt # Dependencies
├── README.md # User documentation
├── ARCHITECTURE.md # Technical architecture
├── IMPLEMENTATION_GUIDE.md # Development guide
├── PLAN_SUMMARY.md # This file
├── config/
│ └── app_config.yaml # Application settings
├── src/
│ ├── database/ # Database layer
│ │ ├── db_manager.py # Main database operations
│ │ ├── models.py # Data classes
│ │ └── schema.sql # SQL schema
│ │
│ ├── model/ # ML layer
│ │ ├── yolo_wrapper.py # YOLO interface
│ │ └── inference.py # Detection engine
│ │
│ ├── gui/ # Presentation layer
│ │ ├── main_window.py # Main app window
│ │ ├── tabs/ # Feature tabs
│ │ ├── dialogs/ # Popup dialogs
│ │ └── widgets/ # Custom widgets
│ │
│ └── utils/ # Utilities
│ ├── config_manager.py # Config handling
│ ├── logger.py # Logging setup
│ └── file_utils.py # File operations
├── data/ # Runtime data
│ ├── models/ # Trained models
│ ├── datasets/ # Training data
│ └── results/ # Detection outputs
├── tests/ # Test suite
│ ├── test_database.py
│ ├── test_model.py
│ └── test_gui.py
├── logs/ # Application logs
└── docs/ # Additional docs
```
## User Workflows
### Workflow 1: First-Time Setup
1. Launch application
2. Configure image repository path in Settings
3. Download/verify YOLOv8s.pt base model
4. Ready to detect!
### Workflow 2: Quick Detection
1. Open Detection tab
2. Select pre-trained model
3. Choose image or folder
4. Adjust confidence threshold
5. Click "Detect"
6. View results and save to database
### Workflow 3: Train Custom Model
1. Prepare dataset in YOLO format
2. Open Training tab
3. Select data.yaml file
4. Configure hyperparameters
5. Start training
6. Monitor progress
7. Model saved automatically with metrics
### Workflow 4: Analyze Results
1. Open Results tab
2. Apply filters (date range, class, model)
3. View statistics dashboard
4. Export filtered results to CSV/JSON
## Data Formats
### YOLO Annotation Format
```
<class_id> <x_center> <y_center> <width> <height>
```
- All coordinates normalized (0-1)
- One line per object
- Separate .txt file for each image (see the conversion sketch below)
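A small sketch of producing one such label line from a pixel-space bounding box; the function name and the example numbers are made up for illustration:
```python
def yolo_label_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to a normalized YOLO label line."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# A 100x80 px box with top-left corner (200, 150) in a 640x480 image:
print(yolo_label_line(0, 200, 150, 300, 230, 640, 480))
# -> "0 0.390625 0.395833 0.156250 0.166667"
```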
### Detection Output Format
```json
{
"image_id": 123,
"image_path": "relative/path/to/image.jpg",
"model_id": 5,
"model_name": "organelle_detector_v1",
"detections": [
{
"class_name": "organelle",
"confidence": 0.95,
"bbox": [0.1, 0.2, 0.3, 0.4],
"detected_at": "2024-12-04T15:30:00Z"
}
]
}
```
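A tiny parsing sketch for this format; it assumes one such result has been saved to a JSON file, and the path is illustrative:
```python
import json
from collections import Counter
from pathlib import Path

# Count detections per class in one exported result file (path is illustrative).
result = json.loads(Path("data/results/example_result.json").read_text())
counts = Counter(det["class_name"] for det in result["detections"])
print(result["image_path"], dict(counts))
```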
## Performance Requirements
| Metric | Target | Notes |
|--------|--------|-------|
| Detection Speed | < 1s per image | On GPU for 640x640 images |
| Training Start | < 5s | Model loading time |
| Database Query | < 100ms | For 10k detections |
| GUI Responsiveness | < 100ms | UI interactions |
| Memory Usage | < 4GB | During inference |
| Batch Processing | 100+ images | Without memory issues |
## Risk Mitigation
| Risk | Impact | Mitigation Strategy |
|------|--------|-------------------|
| CUDA/GPU issues | High | Fallback to CPU mode |
| Database corruption | Medium | Regular backups, transactions |
| Large datasets | Medium | Lazy loading, pagination |
| Memory overflow | High | Batch size limits, monitoring |
| Model compatibility | Low | Version checking, validation |
## Success Criteria
- ✅ Successfully detect objects in microscopy images
- ✅ Train custom models with >80% mAP50
- ✅ Store and retrieve detection results efficiently
- ✅ Intuitive GUI requiring minimal training
- ✅ Process 100+ images in batch mode
- ✅ Export results in multiple formats
- ✅ Comprehensive documentation
## Next Steps
1. **Review this plan** with stakeholders
2. **Gather feedback** on features and priorities
3. **Confirm requirements** are fully captured
4. **Switch to Code mode** to begin implementation
5. **Follow the implementation guide** for development
6. **Iterate based on testing** and user feedback
## Questions for Stakeholders
Before proceeding to implementation, please consider:
1. Are there any additional object classes beyond organelles and membrane branches?
2. What is the typical size of your microscopy images?
3. Do you need multi-user support or is single-user sufficient?
4. Are there specific export formats you need beyond CSV/JSON/Excel?
5. What is your expected dataset size (number of images)?
6. Do you need support for 3D/volumetric microscopy images?
7. Any specific visualization requirements?
8. Integration with other tools or workflows?
## Resources Required
### Development Time
- Phase 1-3: ~6 weeks development
- Phase 4: ~1 week testing & polish
- Total: ~7 weeks for initial release
### Hardware Requirements
- Development machine with CUDA GPU (recommended)
- 16GB RAM minimum
- 50GB storage for development
### Software Dependencies
- All listed in [`requirements.txt`](requirements.txt)
- Total size: ~2GB when installed
---
**Ready to proceed with implementation? Let's build this! 🚀**

View File

@@ -38,7 +38,7 @@ This will install:
- OpenCV and Pillow (image processing)
- And other dependencies
**Note:** The first run will automatically download the YOLOv8s.pt model (~22MB).
**Note:** The first run will automatically download the YOLOv8s-seg.pt segmentation model (~23MB).
### 4. Verify Installation
@@ -84,11 +84,11 @@ In the Settings dialog:
### Single Image Detection
1. Go to the **Detection** tab
2. Select a model from the dropdown (default: Base Model yolov8s.pt)
2. Select a model from the dropdown (default: Base Model yolov8s-seg.pt)
3. Adjust confidence threshold with the slider
4. Click "Detect Single Image"
5. Select an image file
6. View results in the results panel
6. View results with segmentation masks overlaid on the image
### Batch Detection
@@ -108,9 +108,18 @@ Detection results include:
- **Class names**: Types of objects detected (e.g., organelle, membrane_branch)
- **Confidence scores**: Detection confidence (0-1)
- **Bounding boxes**: Object locations (stored in database)
- **Segmentation masks**: Pixel-accurate polygon coordinates for each detected object
All results are stored in the SQLite database at [`data/detections.db`](data/detections.db).
### Segmentation Visualization
The application automatically displays segmentation masks when available (a drawing sketch follows this list):
- Semi-transparent colored overlay (30% opacity) showing the exact shape of detected objects
- Polygon contours outlining each segmentation
- Color-coded by object class
- Toggleable in future versions
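A hedged OpenCV sketch of such an overlay. The 30% opacity and contour outline follow the description above, the default color is the BGR form of the organelle color `#FF6B6B` from `config/app_config.yaml`, and the example polygon is made up:
```python
import cv2
import numpy as np

def draw_mask_overlay(image_bgr, polygon_norm, color_bgr=(107, 107, 255), alpha=0.3):
    """Blend a filled, normalized polygon over the image and outline its contour."""
    h, w = image_bgr.shape[:2]
    points = np.array([[int(x * w), int(y * h)] for x, y in polygon_norm], dtype=np.int32)
    overlay = image_bgr.copy()
    cv2.fillPoly(overlay, [points], color_bgr)                          # solid mask on a copy
    blended = cv2.addWeighted(overlay, alpha, image_bgr, 1 - alpha, 0)  # 30% opacity blend
    cv2.polylines(blended, [points], isClosed=True, color=color_bgr, thickness=2)
    return blended

# Example: a triangular mask on a plain test image.
img = np.full((480, 640, 3), 40, dtype=np.uint8)
result = draw_mask_overlay(img, [[0.2, 0.2], [0.6, 0.3], [0.4, 0.7]])
```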
## Database
The application uses SQLite to store:
@@ -176,7 +185,7 @@ sudo apt-get install libxcb-xinerama0
### Detection Not Working
**No models available**
- The base YOLOv8s model will be downloaded automatically on first use
- The base YOLOv8s-seg segmentation model will be downloaded automatically on first use
- Make sure you have internet connection for the first run
**Images not found**

View File

@@ -1,6 +1,6 @@
# Microscopy Object Detection Application
A desktop application for detecting organelles and membrane branching structures in microscopy images using YOLOv8, featuring comprehensive training, validation, and visualization capabilities.
A desktop application for detecting and segmenting organelles and membrane branching structures in microscopy images using YOLOv8-seg, featuring comprehensive training, validation, and visualization capabilities with pixel-accurate segmentation masks.
![Python](https://img.shields.io/badge/python-3.8+-blue.svg)
![PySide6](https://img.shields.io/badge/PySide6-6.5+-green.svg)
@@ -8,8 +8,8 @@ A desktop application for detecting organelles and membrane branching structures
## Features
- **🎯 Object Detection**: Real-time and batch detection of microscopy objects
- **🎓 Model Training**: Fine-tune YOLOv8s on custom microscopy datasets
- **🎯 Object Detection & Segmentation**: Real-time and batch detection with pixel-accurate segmentation masks
- **🎓 Model Training**: Fine-tune YOLOv8s-seg on custom microscopy datasets
- **📊 Validation & Metrics**: Comprehensive model validation with visualization
- **💾 Database Storage**: SQLite database for detection results and metadata
- **📈 Visualization**: Interactive plots and charts using pyqtgraph
@@ -34,14 +34,24 @@ A desktop application for detecting organelles and membrane branching structures
## Installation
### 1. Clone the Repository
### Option 1: Install from PyPI (Recommended)
```bash
pip install microscopy-object-detection
```
This will install the package and all its dependencies.
### Option 2: Install from Source
#### 1. Clone the Repository
```bash
git clone <repository-url>
cd object_detection
```
### 2. Create Virtual Environment
#### 2. Create Virtual Environment
```bash
# Linux/Mac
@@ -53,25 +63,44 @@ python -m venv venv
venv\Scripts\activate
```
### 3. Install Dependencies
#### 3. Install in Development Mode
```bash
pip install -r requirements.txt
# Install in editable mode with dev dependencies
pip install -e ".[dev]"
# Or install just the package
pip install .
```
### 4. Download Base Model
The application will automatically download the YOLOv8s.pt model on first use, or you can download it manually:
The application will automatically download the YOLOv8s-seg.pt segmentation model on first use, or you can download it manually:
```bash
# The model will be downloaded automatically by ultralytics
# Or download manually from: https://github.com/ultralytics/assets/releases
```
**Note:** YOLOv8s-seg is a segmentation model that provides pixel-accurate masks for detected objects, enabling more precise analysis than standard bounding box detection.
## Quick Start
### 1. Launch the Application
After installation, you can launch the application in two ways:
**Using the GUI launcher:**
```bash
microscopy-detect-gui
```
**Or using Python directly:**
```bash
python -m microscopy_object_detection
```
**If installed from source:**
```bash
python main.py
```
@@ -85,11 +114,12 @@ python main.py
### 3. Perform Detection
1. Navigate to the **Detection** tab
2. Select a model (default: yolov8s.pt)
2. Select a model (default: yolov8s-seg.pt)
3. Choose an image or folder
4. Set confidence threshold
5. Click **Detect**
6. View results and save to database
6. View results with segmentation masks overlaid
7. Save results to database
### 4. Train Custom Model
@@ -212,8 +242,8 @@ The application uses SQLite with the following main tables:
- **models**: Stores trained model information and metrics
- **images**: Stores image metadata and paths
- **detections**: Stores detection results with bounding boxes
- **annotations**: Stores manual annotations (future feature)
- **detections**: Stores detection results with bounding boxes and segmentation masks (polygon coordinates)
- **annotations**: Stores manual annotations with optional segmentation masks (future feature)
See [`ARCHITECTURE.md`](ARCHITECTURE.md) for detailed schema information.
@@ -230,7 +260,7 @@ image_repository:
allowed_extensions: [".jpg", ".jpeg", ".png", ".tif", ".tiff"]
models:
default_base_model: "yolov8s.pt"
default_base_model: "yolov8s-seg.pt"
models_directory: "data/models"
training:
@@ -258,7 +288,7 @@ visualization:
from src.model.yolo_wrapper import YOLOWrapper
# Initialize wrapper
yolo = YOLOWrapper("yolov8s.pt")
yolo = YOLOWrapper("yolov8s-seg.pt")
# Train model
results = yolo.train(
@@ -393,10 +423,10 @@ make html
**Issue**: Model not found error
**Solution**: Ensure YOLOv8s.pt is downloaded. Run:
**Solution**: Ensure YOLOv8s-seg.pt is downloaded. Run:
```python
from ultralytics import YOLO
model = YOLO('yolov8s.pt') # Will auto-download
model = YOLO('yolov8s-seg.pt') # Will auto-download
```

View File

@@ -1,48 +1,57 @@
database:
path: "data/detections.db"
path: data/detections.db
image_repository:
base_path: "" # Set by user through GUI
base_path: ''
allowed_extensions:
- ".jpg"
- ".jpeg"
- ".png"
- ".tif"
- ".tiff"
- ".bmp"
- .jpg
- .jpeg
- .png
- .tif
- .tiff
- .bmp
models:
default_base_model: "yolov8s.pt"
models_directory: "data/models"
default_base_model: yolov8s-seg.pt
models_directory: data/models
base_model_choices:
- yolov8s-seg.pt
- yolo11s-seg.pt
training:
default_epochs: 100
default_batch_size: 16
default_imgsz: 640
default_imgsz: 1024
default_patience: 50
default_lr0: 0.01
two_stage:
enabled: false
stage1:
epochs: 20
lr0: 0.0005
patience: 10
freeze: 10
stage2:
epochs: 150
lr0: 0.0003
patience: 30
last_dataset_yaml: /home/martin/code/object_detection/data/datasets/data.yaml
last_dataset_dir: /home/martin/code/object_detection/data/datasets
detection:
default_confidence: 0.25
default_iou: 0.45
max_batch_size: 100
visualization:
bbox_colors:
organelle: "#FF6B6B"
membrane_branch: "#4ECDC4"
default: "#00FF00"
organelle: '#FF6B6B'
membrane_branch: '#4ECDC4'
default: '#00FF00'
bbox_thickness: 2
font_size: 12
export:
formats:
- csv
- json
- excel
default_format: "csv"
default_format: csv
logging:
level: "INFO"
file: "logs/app.log"
format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
level: INFO
file: logs/app.log
format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'

docs/16BIT_TIFF_SUPPORT.md Normal file

@@ -0,0 +1,300 @@
# 16-bit TIFF Support for YOLO Object Detection
## Overview
This document describes the implementation of 16-bit grayscale TIFF support for YOLO object detection. The system properly loads 16-bit TIFF images, normalizes them to float32 [0-1], and handles them appropriately for both **inference** and **training** **without uint8 conversion** to preserve the full dynamic range and avoid data loss.
## Key Features
✅ Reads 16-bit or float32 images using tifffile
✅ Converts to float32 [0-1] (NO uint8 conversion)
✅ Replicates grayscale → RGB (3 channels)
✅ **Inference**: Passes numpy arrays directly to YOLO (no file I/O)
✅ **Training**: On-the-fly float32 conversion (NO disk caching)
✅ Uses Ultralytics YOLOv8/v11 models
✅ Works with segmentation models
✅ No data loss, no double normalization, no silent clipping
## Changes Made
### 1. Dependencies ([`requirements.txt`](../requirements.txt:14))
- Added `tifffile>=2023.0.0` for reliable 16-bit TIFF loading
### 2. Image Loading ([`src/utils/image.py`](../src/utils/image.py))
#### Enhanced TIFF Loading
- Modified [`Image._load()`](../src/utils/image.py:87) to use `tifffile` for `.tif` and `.tiff` files
- Preserves original 16-bit data type during loading
- Properly handles both grayscale and multi-channel TIFF files
#### New Normalization Method
Added the [`Image.to_normalized_float32()`](../src/utils/image.py:280) method (sketched below), which:
- Converts image data to `float32`
- Properly scales values to [0, 1] range:
- **16-bit images**: divides by 65535 (full dynamic range)
- **8-bit images**: divides by 255
- **Float images**: clips to [0, 1]
- Handles various data types automatically
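A condensed sketch of that scaling logic (an illustration of the rules above, not the exact body of the method):
```python
import numpy as np

def to_normalized_float32(data: np.ndarray) -> np.ndarray:
    """Scale image data to float32 in [0, 1] following the rules above."""
    if data.dtype == np.uint16:               # 16-bit: use the full dynamic range
        return data.astype(np.float32) / 65535.0
    if data.dtype == np.uint8:                # 8-bit
        return data.astype(np.float32) / 255.0
    return np.clip(data.astype(np.float32), 0.0, 1.0)  # float input: clip to range
```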
### 3. YOLO Preprocessing ([`src/model/yolo_wrapper.py`](../src/model/yolo_wrapper.py))
Enhanced [`YOLOWrapper._prepare_source()`](../src/model/yolo_wrapper.py:231) to:
1. Detect 16-bit TIFF files automatically
2. Load and normalize to float32 [0-1] using the new method
3. Replicate grayscale to RGB (3 channels)
4. **Return numpy array directly** (NO file saving, NO uint8 conversion)
5. Pass float32 array directly to YOLO for inference
## Processing Pipeline
### For Inference (predict)
For 16-bit TIFF files during inference:
1. **Load**: File loaded using `tifffile` → preserves 16-bit uint16 data
2. **Normalize**: Convert to float32 and scale to [0, 1]
```python
float_data = uint16_data.astype(np.float32) / 65535.0
```
3. **RGB Conversion**: Replicate grayscale to 3 channels
```python
rgb_float = np.stack([float_data] * 3, axis=-1)
```
4. **Pass to YOLO**: Return float32 array directly (no uint8, no file I/O)
5. **Inference**: YOLO processes the float32 [0-1] RGB array
### For Training (train)
Training now uses a custom dataset loader with on-the-fly conversion (NO disk caching):
1. **Custom Dataset**: Uses `Float32Dataset` class that extends Ultralytics' `YOLODataset`
2. **Load On-The-Fly**: Each image is loaded and converted during training:
- Detect 16-bit TIFF files automatically
- Load with `tifffile` (preserves uint16)
- Convert to float32 [0-1] in memory
- Replicate to 3 channels (RGB)
3. **No Disk Cache**: Conversion happens in memory, no files written
4. **Train**: YOLO trains on float32 [0-1] RGB arrays directly
See [`src/utils/train_ultralytics_float.py`](../src/utils/train_ultralytics_float.py) for implementation.
### No Data Loss!
Unlike approaches that convert to uint8 (256 levels), this implementation:
- Preserves full 16-bit dynamic range (65536 levels)
- Maintains precision with float32 representation
- For inference: passes data directly without file conversions
- For training: uses float32 TIFFs (not uint8 PNGs)
## Usage
### Basic Image Loading
```python
from src.utils.image import Image
# Load a 16-bit TIFF file
img = Image("path/to/16bit_image.tif")
# Get normalized float32 data [0-1]
normalized = img.to_normalized_float32() # Shape: (H, W), dtype: float32
# Original data is preserved
original = img.data # Still uint16
```
### YOLO Inference
The preprocessing is automatic - just use YOLO as normal:
```python
from src.model.yolo_wrapper import YOLOWrapper
# Initialize model
yolo = YOLOWrapper("yolov8s-seg.pt")
yolo.load_model()
# Perform inference on 16-bit TIFF
# The image will be automatically normalized and passed as float32 [0-1]
detections = yolo.predict("path/to/16bit_image.tif", conf=0.25)
```
### With InferenceEngine
```python
from src.model.inference import InferenceEngine
from src.database.db_manager import DatabaseManager
# Setup
db = DatabaseManager("database.db")
engine = InferenceEngine("model.pt", db, model_id=1)
# Detect objects in 16-bit TIFF
result = engine.detect_single(
image_path="path/to/16bit_image.tif",
relative_path="images/16bit_image.tif",
conf=0.25
)
```
## Testing
Three test scripts are provided:
### 1. Image Loading Test
```bash
./venv/bin/python tests/test_16bit_tiff_loading.py
```
Tests:
- Loading 16-bit TIFF files with tifffile
- Normalization to float32 [0-1]
- Data type and value range verification
### 2. Float32 Passthrough Test (Most Important!)
```bash
./venv/bin/python tests/test_yolo_16bit_float32.py
```
Tests:
- YOLO preprocessing returns numpy array (not file path)
- Data is float32 [0-1] (not uint8)
- No quantization to 256 levels (proves no uint8 conversion)
- Sample output:
```
✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
Shape: (200, 200, 3)
Dtype: float32
Min value: 0.000000
Max value: 1.000000
Unique values: 399
✓ SUCCESS: Data has 399 unique values (> 256)
This confirms NO uint8 quantization occurred!
```
### 3. Legacy Test (Shows Old Behavior)
```bash
./venv/bin/python tests/test_yolo_16bit_preprocessing.py
```
This test shows the old behavior (uint8 conversion) - kept for comparison.
## Benefits
1. **No Data Loss**: Preserves full 16-bit dynamic range (65536 levels vs 256)
2. **High Precision**: Float32 maintains fine-grained intensity differences
3. **Automatic Processing**: No manual preprocessing needed
4. **YOLO Compatible**: Ultralytics YOLO accepts float32 [0-1] arrays
5. **Performance**: No intermediate file I/O for 16-bit TIFFs
6. **Backwards Compatible**: Regular images (8-bit PNG, JPEG, etc.) still work as before
## Technical Notes
### Float32 vs uint8
**With uint8 conversion (OLD - BAD):**
- 16-bit (65536 levels) → uint8 (256 levels) = **99.6% data loss!**
- Fine intensity differences are lost
- Quantization artifacts
**With float32 [0-1] (NEW - GOOD):**
- 16-bit (65536 levels) → float32 (continuous) = **No data loss**
- Full dynamic range preserved
- Smooth gradients maintained
### Memory Considerations
For a 2048×2048 single-channel image:
| Format | Memory | Disk Space | Notes |
|--------|--------|------------|-------|
| Original 16-bit | 8 MB | ~8 MB | uint16 grayscale TIFF |
| Float32 grayscale | 16 MB | - | Intermediate |
| Float32 3-channel | 48 MB | ~48 MB | Training cache |
| uint8 RGB (old) | 12 MB | ~12 MB | OLD approach with data loss |
The float32 approach uses ~3× more memory than uint8 during training but preserves **all information**.
**No Disk Cache**: The new on-the-fly approach eliminates the need for cached datasets on disk.
### Why Direct Numpy Array?
Passing numpy arrays directly to YOLO (instead of saving to file):
1. **Faster**: No disk I/O overhead
2. **No Quantization**: Avoids PNG/JPEG quantization
3. **Memory Efficient**: Single copy in memory
4. **Cleaner**: No temp file management
Ultralytics YOLO supports various input types:
- File paths (str): `"image.jpg"`
- Numpy arrays: `np.ndarray` ← **we use this**
- PIL Images: `PIL.Image`
- Torch tensors: `torch.Tensor`
## Training with Float32 Dataset Loader
The system now includes a custom dataset loader for 16-bit TIFF training:
```python
from src.utils.train_ultralytics_float import train_with_float32_loader
# Train with on-the-fly float32 conversion
results = train_with_float32_loader(
model_path="yolov8s-seg.pt",
data_yaml="data/my_dataset/data.yaml",
epochs=100,
batch=16,
imgsz=640,
)
```
The `Float32Dataset` class automatically:
- Detects 16-bit TIFF files
- Loads with `tifffile` (not PIL/cv2)
- Converts to float32 [0-1] on-the-fly
- Replicates to 3 channels
- Integrates seamlessly with Ultralytics training pipeline
This is used automatically by the training tab in the GUI.
## Installation
Install the updated dependencies:
```bash
./venv/bin/pip install -r requirements.txt
```
Or install tifffile directly:
```bash
./venv/bin/pip install tifffile>=2023.0.0
```
## Example Test Output
```
=== Testing Float32 Passthrough (NO uint8) ===
Created test 16-bit TIFF: /tmp/tmpdt5hm0ab.tif
Shape: (200, 200)
Dtype: uint16
Min value: 0
Max value: 65535
Preprocessing result:
Prepared source type: <class 'numpy.ndarray'>
✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)
Shape: (200, 200, 3)
Dtype: float32
Min value: 0.000000
Max value: 1.000000
Mean value: 0.499992
Unique values: 399
✓ SUCCESS: Data has 399 unique values (> 256)
This confirms NO uint8 quantization occurred!
✓ All float32 passthrough tests passed!

docs/IMAGE_CLASS_USAGE.md Normal file

@@ -0,0 +1,220 @@
# Image Class Usage Guide
The `Image` class provides a convenient way to load and work with images in the microscopy object detection application.
## Supported Formats
The Image class supports the following image formats:
- `.jpg`, `.jpeg` - JPEG images
- `.png` - PNG images
- `.tif`, `.tiff` - TIFF images (commonly used in microscopy)
- `.bmp` - Bitmap images
## Basic Usage
### Loading an Image
```python
from src.utils import Image, ImageLoadError
# Load an image from a file path
try:
    img = Image("path/to/image.jpg")
    print(f"Loaded image: {img.width}x{img.height} pixels")
except ImageLoadError as e:
    print(f"Failed to load image: {e}")
```
### Accessing Image Properties
```python
img = Image("microscopy_image.tif")
# Basic properties
print(f"Width: {img.width} pixels")
print(f"Height: {img.height} pixels")
print(f"Channels: {img.channels}")
print(f"Format: {img.format}")
print(f"Shape: {img.shape}") # (height, width, channels)
# File information
print(f"File size: {img.size_mb:.2f} MB")
print(f"File size: {img.size_bytes} bytes")
# Image type checks
print(f"Is color: {img.is_color()}")
print(f"Is grayscale: {img.is_grayscale()}")
# String representation
print(img) # Shows summary of image properties
```
### Working with Image Data
```python
import numpy as np
img = Image("sample.png")
# Get image data as numpy array (OpenCV format, BGR)
bgr_data = img.data
print(f"Data shape: {bgr_data.shape}")
print(f"Data type: {bgr_data.dtype}")
# Get image as RGB (for display or processing)
rgb_data = img.get_rgb()
# Get grayscale version
gray_data = img.get_grayscale()
# Create a copy (for modifications)
img_copy = img.copy()
img_copy[0, 0] = [255, 255, 255] # Modify copy, original unchanged
# Resize image (returns new array, doesn't modify original)
resized = img.resize(640, 640)
```
### Using PIL Image
```python
img = Image("photo.jpg")
# Access as PIL Image (RGB format)
pil_img = img.pil_image
# Use PIL methods
pil_img.show() # Display image
pil_img.save("output.png") # Save with PIL
```
## Integration with YOLO
```python
from src.utils import Image
from ultralytics import YOLO
# Load model and image
model = YOLO("yolov8n.pt")
img = Image("microscopy/cell_01.tif")
# Run inference (YOLO accepts file paths or numpy arrays)
results = model(img.data)
# Or use the file path directly
results = model(str(img.path))
```
## Error Handling
```python
from src.utils import Image, ImageLoadError
def process_image(image_path):
    try:
        img = Image(image_path)
        # Process the image...
        return img
    except ImageLoadError as e:
        print(f"Cannot load image: {e}")
        return None
```
## Advanced Usage
### Batch Processing
```python
from pathlib import Path
from src.utils import Image, ImageLoadError
def process_image_directory(directory):
"""Process all images in a directory."""
image_paths = Path(directory).glob("*.tif")
for path in image_paths:
try:
img = Image(path)
print(f"Processing {img.path.name}: {img.width}x{img.height}")
# Process the image...
except ImageLoadError as e:
print(f"Skipping {path}: {e}")
```
### Using with OpenCV Operations
```python
import cv2
from src.utils import Image
img = Image("input.jpg")
# Apply OpenCV operations on the data
blurred = cv2.GaussianBlur(img.data, (5, 5), 0)
edges = cv2.Canny(img.data, 100, 200)
# Note: These operations don't modify the original img.data
```
### Memory Efficient Processing
```python
from src.utils import Image
# The Image class loads data into memory
img = Image("large_image.tif")
print(f"Image size in memory: {img.data.nbytes / (1024**2):.2f} MB")
# When processing many images, consider loading one at a time
# and releasing memory by deleting the object
del img
```
## Best Practices
1. **Always use try-except** when loading images to handle errors gracefully
2. **Check image properties** before processing to ensure compatibility
3. **Use copy()** when you need to modify image data without affecting the original
4. **Path objects work too** - The class accepts both strings and Path objects
5. **Consider memory usage** when working with large images or batches
## Example: Complete Workflow
```python
from src.utils import Image, ImageLoadError
from src.utils.file_utils import get_image_files
def analyze_microscopy_images(directory):
"""Analyze all microscopy images in a directory."""
# Get all image files
image_files = get_image_files(directory, recursive=True)
results = []
for image_path in image_files:
try:
# Load image
img = Image(image_path)
# Analyze
result = {
'filename': img.path.name,
'width': img.width,
'height': img.height,
'channels': img.channels,
'format': img.format,
'size_mb': img.size_mb,
'is_color': img.is_color()
}
results.append(result)
print(f"✓ Analyzed {img.path.name}")
except ImageLoadError as e:
print(f"✗ Failed to load {image_path}: {e}")
return results
# Run analysis
results = analyze_microscopy_images("data/datasets/cells")
print(f"\nProcessed {len(results)} images")

docs/TRAINING_16BIT_TIFF.md Normal file

@@ -0,0 +1,269 @@
# Training YOLO with 16-bit TIFF Datasets
## Quick Start
If your dataset contains 16-bit grayscale TIFF files, the training tab will automatically:
1. Detect 16-bit TIFF images in your dataset
2. Convert them to float32 [0-1] RGB **on-the-fly** during training
3. Train without any disk caching (memory-efficient)
**No manual intervention or disk space needed!**
## Why Float32 On-The-Fly Conversion?
### The Problem
YOLO's training expects:
- 3-channel images (RGB)
- Images loaded from disk by the dataloader
16-bit grayscale TIFFs are:
- 1-channel (grayscale)
- Need to be converted to RGB format
### The Solution
**NEW APPROACH (Current)**: On-the-fly float32 conversion
- Load 16-bit TIFF with `tifffile` (not PIL/cv2)
- Convert uint16 [0-65535] → float32 [0-1] in memory
- Replicate grayscale to 3 channels
- Pass directly to YOLO training pipeline
- **No disk caching required!**
**OLD APPROACH (Deprecated)**: Disk caching
- Created 16-bit RGB PNG cache files on disk
- Required ~2x dataset size in disk space
- Slower first training run
## How It Works
### Custom Dataset Loader
The system uses a custom `Float32Dataset` class that extends Ultralytics' `YOLODataset`:
```python
from src.utils.train_ultralytics_float import Float32Dataset
# This dataset loader:
# 1. Intercepts image loading
# 2. Detects 16-bit TIFFs
# 3. Converts to float32 [0-1] RGB on-the-fly
# 4. Passes to training pipeline
```
### Conversion Process
For each 16-bit grayscale TIFF during training:
```
1. Load with tifffile → uint16 [0, 65535]
2. Convert to float32 → img.astype(float32) / 65535.0
3. Replicate to RGB → np.stack([img] * 3, axis=-1)
4. Result: float32 [0, 1] RGB array, shape (H, W, 3)
```
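The same steps as a self-contained helper, a simplified stand-in for what the `Float32Dataset` loader does per image rather than its actual implementation:
```python
import numpy as np
import tifffile

def load_tiff_as_float32_rgb(path: str) -> np.ndarray:
    """Load a 16-bit grayscale TIFF as a float32 [0, 1] RGB array of shape (H, W, 3)."""
    img = tifffile.imread(path)             # uint16, shape (H, W)
    img = img.astype(np.float32) / 65535.0  # scale to [0, 1] without uint8 loss
    return np.stack([img] * 3, axis=-1)     # replicate grayscale to 3 channels
```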
### Memory vs Disk
| Aspect | On-the-fly (NEW) | Disk Cache (OLD) |
|--------|------------------|------------------|
| Disk Space | Dataset size only | ~2× dataset size |
| First Training | Fast | Slow (creates cache) |
| Subsequent Training | Fast | Fast |
| Data Loss | None | None |
| Setup Required | None | Cache creation |
## Data Preservation
### Float32 Precision
16-bit TIFF: 65,536 levels (0-65535)
Float32: ~7 decimal digits precision
**Conversion accuracy:**
```python
Original: 32768 (uint16, middle intensity)
Float32: 32768 / 65535 = 0.50000763 (exact)
```
Full 16-bit precision is preserved in float32 representation.
### Comparison to uint8
| Approach | Precision Loss | Recommended |
|----------|----------------|-------------|
| **float32 [0-1]** | None | ✓ YES |
| uint16 RGB | None | ✓ YES (but disk-heavy) |
| uint8 | 99.6% data loss | ✗ NO |
**Why NO uint8:**
```
Original values: 32768, 32769, 32770 (distinct)
Converted to uint8: 128, 128, 128 (collapsed!)
```
Multiple 16-bit values collapse to the same uint8 value.
## Training Tab Behavior
When you click "Start Training" with a 16-bit TIFF dataset:
```
[01:23:45] Exported 150 annotations across 50 image(s).
[01:23:45] Using Float32 on-the-fly loader for 16-bit TIFF support (no disk caching)
[01:23:45] Starting training run 'my_model_v1' using yolov8s-seg.pt
[01:23:46] Using Float32Dataset loader for 16-bit TIFF support
```
Every training run uses the same approach - fast and efficient!
## Inference vs Training
| Operation | Input | Processing | Output to YOLO |
|-----------|-------|------------|----------------|
| **Inference** | 16-bit TIFF file | Load → float32 [0-1] → 3ch | numpy array (float32) |
| **Training** | 16-bit TIFF dataset | Load on-the-fly → float32 [0-1] → 3ch | numpy array (float32) |
Both preserve full 16-bit precision using float32 representation.
## Technical Details
### Custom Dataset Class
Located in `src/utils/train_ultralytics_float.py`:
```python
class Float32Dataset(YOLODataset):
"""
Extends Ultralytics YOLODataset to handle 16-bit TIFFs.
Key methods:
- load_image(): Intercepts image loading
- Detects .tif/.tiff with dtype == uint16
- Converts: uint16 → float32 [0-1] → RGB (3-channel)
"""
```
### Integration with YOLO
The `YOLOWrapper.train()` method automatically uses the custom loader:
```python
# In src/model/yolo_wrapper.py
def train(self, data_yaml, use_float32_loader=True, **kwargs):
    if use_float32_loader:
        # Use custom Float32Dataset
        return train_with_float32_loader(...)
    else:
        # Standard YOLO training
        return self.model.train(...)
```
### No PIL or cv2 for 16-bit
16-bit TIFF loading uses `tifffile` directly:
- PIL: Can load 16-bit but converts during processing
- cv2: Limited 16-bit TIFF support
- tifffile: Native 16-bit support, numpy output
## Advantages Over Disk Caching
### 1. No Disk Space Required
```
Dataset: 1000 images × 12 MB = 12 GB
Old cache: Additional 24 GB (16-bit RGB PNGs)
New approach: 0 GB additional (on-the-fly)
```
### 2. Faster Setup
```
Old: First training requires cache creation (minutes)
New: Start training immediately (seconds)
```
### 3. Always In Sync
```
Old: Cache could become stale if images change
New: Always loads current version from disk
```
### 4. Simpler Workflow
```
Old: Manage cache directory, cleanup, etc.
New: Just point to dataset and train
```
## Troubleshooting
### Error: "expected input to have 3 channels, but got 1"
This shouldn't happen with the new Float32Dataset, but if it does:
1. Check that `use_float32_loader=True` in training call
2. Verify `Float32Dataset` is being used (check logs)
3. Ensure `tifffile` is installed: `pip install tifffile`
### Memory Usage
On-the-fly conversion uses memory during training:
- Image loaded: ~8 MB (2048×2048 uint16)
- Converted float32 RGB: ~48 MB (temporary)
- Released after augmentation pipeline
**Mitigation:**
- Reduce batch size if OOM errors occur
- Images are processed one at a time during loading
- Only active batch kept in memory
### Slow Training
If training seems slow:
- Check disk I/O (slow disk can bottleneck loading)
- Verify images aren't being re-converted each epoch (should cache after first load)
- Monitor CPU usage during loading
## Migration from Old Approach
If you have existing cached datasets:
```bash
# Old cache location (safe to delete)
rm -rf data/datasets/_float32_cache/
# The new approach doesn't use this directory
```
Your original dataset structure remains unchanged:
```
data/my_dataset/
├── train/
│ ├── images/ (original 16-bit TIFFs)
│ └── labels/
├── val/
│ ├── images/
│ └── labels/
└── data.yaml
```
Just point to the same `data.yaml` and train!
## Performance Comparison
| Metric | Old (Disk Cache) | New (On-the-fly) |
|--------|------------------|------------------|
| First training setup | 5-10 min | 0 sec |
| Disk space overhead | 100% | 0% |
| Training speed | Fast | Fast |
| Subsequent runs | Fast | Fast |
| Data accuracy | 16-bit preserved | 16-bit preserved |
## Summary
- **On-the-fly conversion**: Load and convert during training
- **No disk caching**: Zero additional disk space
- **Full precision**: Float32 preserves 16-bit dynamic range
- **No PIL/cv2**: Direct tifffile loading
- **Automatic**: Works transparently with training tab
- **Fast**: Efficient memory-based conversion
The new approach is simpler, faster to set up, and requires no disk space overhead!

151
examples/image_demo.py Normal file
View File

@@ -0,0 +1,151 @@
"""
Example script demonstrating the Image class functionality.
"""
import sys
from pathlib import Path
# Add parent directory to path for imports
sys.path.insert(0, str(Path(__file__).parent.parent))
from src.utils import Image, ImageLoadError
def demonstrate_image_loading():
"""Demonstrate basic image loading functionality."""
print("=" * 60)
print("Image Class Demonstration")
print("=" * 60)
# Example 1: Try to load an image (replace with your own path)
example_paths = [
"data/datasets/example.jpg",
"data/datasets/sample.png",
"tests/test_image.jpg",
]
loaded_img = None
for image_path in example_paths:
if Path(image_path).exists():
try:
print(f"\n1. Loading image: {image_path}")
img = Image(image_path)
loaded_img = img
print(f" ✓ Successfully loaded!")
print(f" {img}")
break
except ImageLoadError as e:
print(f" ✗ Failed: {e}")
else:
print(f"\n1. Image not found: {image_path}")
if loaded_img is None:
print("\nNo example images found. Creating a test image...")
create_test_image()
return
# Example 2: Access image properties
print(f"\n2. Image Properties:")
print(f" Width: {loaded_img.width} pixels")
print(f" Height: {loaded_img.height} pixels")
print(f" Channels: {loaded_img.channels}")
print(f" Format: {loaded_img.format.upper()}")
print(f" Shape: {loaded_img.shape}")
print(f" File size: {loaded_img.size_mb:.2f} MB")
print(f" Is color: {loaded_img.is_color()}")
print(f" Is grayscale: {loaded_img.is_grayscale()}")
# Example 3: Get different formats
print(f"\n3. Accessing Image Data:")
print(f" BGR data shape: {loaded_img.data.shape}")
print(f" RGB data shape: {loaded_img.get_rgb().shape}")
print(f" Grayscale shape: {loaded_img.get_grayscale().shape}")
print(f" PIL image mode: {loaded_img.pil_image.mode}")
# Example 4: Resizing
print(f"\n4. Resizing Image:")
resized = loaded_img.resize(320, 320)
print(f" Original size: {loaded_img.width}x{loaded_img.height}")
print(f" Resized to: {resized.shape[1]}x{resized.shape[0]}")
# Example 5: Working with copies
print(f"\n5. Creating Copies:")
copy = loaded_img.copy()
print(f" Created copy with shape: {copy.shape}")
print(f" Original data unchanged: {(loaded_img.data == copy).all()}")
print("\n" + "=" * 60)
print("Demonstration Complete!")
print("=" * 60)
def create_test_image():
"""Create a test image for demonstration purposes."""
import cv2
import numpy as np
print("\nCreating a test image...")
# Create a colorful test image
width, height = 400, 300
test_img = np.zeros((height, width, 3), dtype=np.uint8)
# Add some colors
test_img[:100, :] = [255, 0, 0] # Blue section
test_img[100:200, :] = [0, 255, 0] # Green section
test_img[200:, :] = [0, 0, 255] # Red section
# Save the test image
test_path = Path("test_demo_image.png")
cv2.imwrite(str(test_path), test_img)
print(f"Test image created: {test_path}")
# Now load and demonstrate with it
try:
img = Image(test_path)
print(f"\nLoaded test image: {img}")
print(f"Dimensions: {img.width}x{img.height}")
print(f"Channels: {img.channels}")
print(f"Format: {img.format}")
# Clean up
test_path.unlink()
print(f"\nTest image cleaned up.")
except ImageLoadError as e:
print(f"Error loading test image: {e}")
def demonstrate_error_handling():
"""Demonstrate error handling."""
print("\n" + "=" * 60)
print("Error Handling Demonstration")
print("=" * 60)
# Try to load non-existent file
print("\n1. Loading non-existent file:")
try:
img = Image("nonexistent.jpg")
except ImageLoadError as e:
print(f" ✓ Caught error: {e}")
# Try unsupported format
print("\n2. Loading unsupported format:")
try:
# Create a text file
test_file = Path("test.txt")
test_file.write_text("not an image")
img = Image(test_file)
except ImageLoadError as e:
print(f" ✓ Caught error: {e}")
test_file.unlink() # Clean up
print("\n" + "=" * 60)
if __name__ == "__main__":
print("\n")
demonstrate_image_loading()
print("\n")
demonstrate_error_handling()
print("\n")

55
main.py Normal file
View File

@@ -0,0 +1,55 @@
"""
Microscopy Object Detection Application
Main entry point for the application.
"""
import sys
from pathlib import Path
# Add src directory to path for development mode
sys.path.insert(0, str(Path(__file__).parent))
from PySide6.QtWidgets import QApplication
from PySide6.QtCore import Qt
from src import __version__
from src.gui.main_window import MainWindow
from src.utils.logger import setup_logging
from src.utils.config_manager import ConfigManager
def main():
"""Application entry point."""
# Setup logging
config_manager = ConfigManager()
log_config = config_manager.get_section("logging")
setup_logging(
log_file=log_config.get("file", "logs/app.log"),
level=log_config.get("level", "INFO"),
log_format=log_config.get("format"),
)
# Enable High DPI scaling
QApplication.setHighDpiScaleFactorRoundingPolicy(
Qt.HighDpiScaleFactorRoundingPolicy.PassThrough
)
# Create Qt application
app = QApplication(sys.argv)
app.setApplicationName("Microscopy Object Detection")
app.setOrganizationName("MicroscopyLab")
app.setApplicationVersion(__version__)
# Set application style
app.setStyle("Fusion")
# Create and show main window
window = MainWindow()
window.show()
# Run application
sys.exit(app.exec())
if __name__ == "__main__":
main()

102
pyproject.toml Normal file
View File

@@ -0,0 +1,102 @@
[build-system]
requires = ["setuptools>=45", "wheel"]
build-backend = "setuptools.build_meta"
[project]
name = "microscopy-object-detection"
version = "1.0.0"
description = "Desktop application for detecting and segmenting organelles in microscopy images using YOLOv8-seg"
readme = "README.md"
requires-python = ">=3.8"
license = { text = "MIT" }
authors = [{ name = "Your Name", email = "your.email@example.com" }]
keywords = [
"microscopy",
"yolov8",
"object-detection",
"segmentation",
"computer-vision",
"deep-learning",
]
classifiers = [
"Development Status :: 4 - Beta",
"Intended Audience :: Science/Research",
"Topic :: Scientific/Engineering :: Image Recognition",
"Topic :: Scientific/Engineering :: Bio-Informatics",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Operating System :: OS Independent",
]
dependencies = [
"ultralytics>=8.0.0",
"PySide6>=6.5.0",
"pyqtgraph>=0.13.0",
"numpy>=1.24.0",
"opencv-python>=4.8.0",
"Pillow>=10.0.0",
"PyYAML>=6.0",
"pandas>=2.0.0",
"openpyxl>=3.1.0",
]
[project.optional-dependencies]
dev = [
"pytest>=7.0.0",
"pytest-cov>=4.0.0",
"black>=23.0.0",
"pylint>=2.17.0",
"mypy>=1.0.0",
]
[project.urls]
Homepage = "https://github.com/yourusername/object_detection"
Documentation = "https://github.com/yourusername/object_detection/blob/main/README.md"
Repository = "https://github.com/yourusername/object_detection"
"Bug Tracker" = "https://github.com/yourusername/object_detection/issues"
[project.scripts]
microscopy-detect = "src.cli:main"
[project.gui-scripts]
microscopy-detect-gui = "src.gui_launcher:main"
[tool.setuptools]
packages = [
"src",
"src.database",
"src.model",
"src.gui",
"src.gui.tabs",
"src.gui.dialogs",
"src.gui.widgets",
"src.utils",
]
include-package-data = true
[tool.setuptools.package-data]
"src.database" = ["*.sql"]
[tool.black]
line-length = 88
target-version = ['py38', 'py39', 'py310', 'py311']
include = '\.pyi?$'
[tool.pylint.messages_control]
max-line-length = 88
[tool.mypy]
python_version = "3.8"
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = false
[tool.pytest.ini_options]
testpaths = ["tests"]
python_files = ["test_*.py"]
python_functions = ["test_*"]
addopts = "-v --cov=src --cov-report=term-missing"

View File

@@ -11,6 +11,7 @@ pyqtgraph>=0.13.0
opencv-python>=4.8.0
Pillow>=10.0.0
numpy>=1.24.0
tifffile>=2023.0.0
# Database
sqlalchemy>=2.0.0

View File

@@ -0,0 +1,179 @@
# Standalone Float32 Training Script for 16-bit TIFFs
## Overview
This standalone script (`train_float32_standalone.py`) trains YOLO models on 16-bit grayscale TIFF datasets with **no data loss**.
- Loads 16-bit TIFFs with `tifffile` (not PIL/cv2)
- Converts to float32 [0-1] on-the-fly (preserves full 16-bit precision)
- Replicates grayscale → 3-channel RGB in memory
- **No disk caching required**
- Uses custom PyTorch Dataset + training loop
## Quick Start
```bash
# Activate virtual environment
source venv/bin/activate
# Train on your 16-bit TIFF dataset
python scripts/train_float32_standalone.py \
--data data/my_dataset/data.yaml \
--weights yolov8s-seg.pt \
--epochs 100 \
--batch 16 \
--imgsz 640 \
--lr 0.0001 \
--save-dir runs/my_training \
--device cuda
```
## Arguments
| Argument | Required | Default | Description |
|----------|----------|---------|-------------|
| `--data` | Yes | - | Path to YOLO data.yaml file |
| `--weights` | No | yolov8s-seg.pt | Pretrained model weights |
| `--epochs` | No | 100 | Number of training epochs |
| `--batch` | No | 16 | Batch size |
| `--imgsz` | No | 640 | Input image size |
| `--lr` | No | 0.0001 | Learning rate |
| `--save-dir` | No | runs/train | Directory to save checkpoints |
| `--device` | No | cuda/cpu | Training device (auto-detected) |
## Dataset Format
Your data.yaml should follow standard YOLO format:
```yaml
path: /path/to/dataset
train: train/images
val: val/images
test: test/images # optional
names:
0: class1
1: class2
nc: 2
```
Directory structure:
```
dataset/
├── train/
│ ├── images/
│ │ ├── img1.tif (16-bit grayscale TIFF)
│ │ └── img2.tif
│ └── labels/
│ ├── img1.txt (YOLO format)
│ └── img2.txt
├── val/
│ ├── images/
│ └── labels/
└── data.yaml
```
## Output
The script saves:
- `epoch{N}.pt`: Checkpoint after each epoch
- `best.pt`: Best model weights (lowest loss)
- Training logs to console
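To resume from or reuse these files later, a minimal sketch (assuming `pt_model` and `optimizer` are built as in the script; the paths are hypothetical):
```python
import torch

# Per-epoch checkpoints are dicts saved by the training loop
ckpt = torch.load("runs/my_training/epoch42.pt", map_location="cpu")
pt_model.load_state_dict(ckpt["model_state_dict"])
optimizer.load_state_dict(ckpt["optimizer_state_dict"])
start_epoch = ckpt["epoch"]  # continue from the next epoch

# best.pt holds only the model weights (a bare state_dict)
pt_model.load_state_dict(torch.load("runs/my_training/best.pt", map_location="cpu"))
```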
## Features
- **16-bit precision preserved**: Float32 [0-1] maintains full dynamic range
- **No disk caching**: Conversion happens in memory
- **No PIL/cv2**: Direct tifffile loading
- **Variable-length labels**: Handles segmentation polygons
- **Checkpoint saving**: Resume training if interrupted
- **Best model tracking**: Automatically saves best weights
## Example
Train a segmentation model on microscopy data:
```bash
python scripts/train_float32_standalone.py \
--data data/microscopy/data.yaml \
    --weights yolo11s-seg.pt \
--epochs 150 \
--batch 8 \
--imgsz 1024 \
--lr 0.0003 \
--save-dir data/models/microscopy_v1
```
## Troubleshooting
### Out of Memory (OOM)
Reduce batch size:
```bash
--batch 4
```
### Slow Loading
Reduce num_workers (edit script line 208):
```python
num_workers=2 # instead of 4
```
### Different Image Sizes
The script resizes every image to a square `--imgsz`, which distorts the aspect ratio of non-square inputs. To avoid distortion:
1. Implement letterbox/resize in the dataset's `_read_image()` (see the sketch below)
2. Or preprocess images to same size
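A minimal letterbox sketch for the float32 pipeline (assumption: the image is already float32 [0-1] HxWx3 as produced by `_read_image()`; padding changes the geometry, so normalized YOLO labels would also need remapping, which this sketch does not do):
```python
import cv2
import numpy as np

def letterbox_float(img: np.ndarray, size: int = 640) -> np.ndarray:
    """Resize to size x size while keeping aspect ratio, padding with mid-gray."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    resized = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
    canvas = np.full((size, size, 3), 0.5, dtype=np.float32)  # mid-gray padding
    top, left = (size - new_h) // 2, (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas
```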
### Loss Computation Errors
If you see "Cannot determine loss", the script may need adjustment for your Ultralytics version. Check:
```python
# In train() function, the preds format may vary
# Current script assumes: preds is tuple with loss OR dict with 'loss' key
```
## vs GUI Training
| Feature | Standalone Script | GUI Training Tab |
|---------|------------------|------------------|
| Float32 conversion | ✓ Yes | ✓ Yes (automatic) |
| Disk caching | ✗ None | ✗ None |
| Progress UI | ✗ Console only | ✓ Visual progress bar |
| Dataset selection | Manual CLI args | ✓ GUI browsing |
| Multi-stage training | Manual runs | ✓ Built-in |
| Use case | Advanced users | General users |
## Technical Details
### Data Loading Pipeline
```
16-bit TIFF file
↓ (tifffile.imread)
uint16 [0-65535]
↓ (/ 65535.0)
float32 [0-1]
↓ (replicate channels)
float32 RGB (H,W,3) [0-1]
↓ (permute to C,H,W)
torch.Tensor (3,H,W) float32
↓ (DataLoader stack)
Batch (B,3,H,W) float32
YOLO Model
```
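A quick sanity check of this pipeline (a sketch assuming the `train_loader` built by the script with `--imgsz 640`):
```python
import torch

imgs, labels, names = next(iter(train_loader))  # train_loader as defined in the script
assert imgs.dtype == torch.float32
assert imgs.shape[1:] == (3, 640, 640)          # (B, 3, H, W)
print(imgs.min().item(), imgs.max().item())     # expected to stay within [0.0, 1.0]
```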
### Precision Comparison
| Method | Unique Values | Data Loss |
|--------|---------------|-----------|
| **float32 [0-1]** | ~65,536 | None ✓ |
| uint16 RGB | 65,536 | None ✓ |
| uint8 | 256 | 99.6% ✗ |
Example: Pixel value 32,768 (middle intensity)
- Float32: 32768 / 65535.0 = 0.50000763 (exact)
- uint8: 32768 → 128 → many values collapse!
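A three-value NumPy check makes the collapse concrete:
```python
import numpy as np

x = np.array([32768, 32769, 32770], dtype=np.uint16)

as_float32 = x.astype(np.float32) / 65535.0  # three distinct values survive
as_uint8 = (x // 256).astype(np.uint8)       # all three collapse to 128

print(as_float32)  # ~[0.5000076 0.5000229 0.5000381]
print(as_uint8)    # [128 128 128]
```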
## License
Same as main project.

View File

@@ -0,0 +1,351 @@
#!/usr/bin/env python3
"""
Standalone training script for YOLO with 16-bit TIFF float32 support.
This script trains YOLO models on 16-bit grayscale TIFF datasets without data loss.
Converts images to float32 [0-1] on-the-fly using tifffile (no PIL/cv2).
Usage:
python scripts/train_float32_standalone.py \\
--data path/to/data.yaml \\
--weights yolov8s-seg.pt \\
--epochs 100 \\
--batch 16 \\
--imgsz 640
Based on the custom dataset approach to avoid Ultralytics' channel conversion issues.
"""
import argparse
import os
import sys
import time
from pathlib import Path
import cv2
import numpy as np
import torch
import torch.nn as nn
import tifffile
import yaml
from torch.utils.data import Dataset, DataLoader
from ultralytics import YOLO
# Add project root to path
project_root = Path(__file__).parent.parent
sys.path.insert(0, str(project_root))
from src.utils.logger import get_logger
logger = get_logger(__name__)
# ===================== Dataset =====================
class Float32YOLODataset(Dataset):
"""PyTorch dataset for 16-bit TIFF images with float32 conversion."""
def __init__(self, images_dir, labels_dir, img_size=640):
self.images_dir = Path(images_dir)
self.labels_dir = Path(labels_dir)
self.img_size = img_size
# Find images
extensions = {".tif", ".tiff", ".png", ".jpg", ".jpeg", ".bmp"}
self.paths = sorted(
[
p
for p in self.images_dir.rglob("*")
if p.is_file() and p.suffix.lower() in extensions
]
)
if not self.paths:
raise ValueError(f"No images found in {images_dir}")
logger.info(f"Dataset: {len(self.paths)} images from {images_dir}")
def __len__(self):
return len(self.paths)
def _read_image(self, path: Path) -> np.ndarray:
"""Load image as float32 [0-1] RGB."""
# Load with tifffile
img = tifffile.imread(str(path))
# Convert to float32
img = img.astype(np.float32)
# Normalize 16-bit→[0,1]
if img.max() > 1.5:
img = img / 65535.0
img = np.clip(img, 0.0, 1.0)
# Grayscale→RGB
if img.ndim == 2:
img = np.repeat(img[..., None], 3, axis=2)
elif img.ndim == 3 and img.shape[2] == 1:
img = np.repeat(img, 3, axis=2)
# Resize to model input size
img = cv2.resize(img, (self.img_size, self.img_size))
        return img  # float32 (img_size, img_size, 3) [0,1] RGB
def _parse_label(self, path: Path) -> list:
"""Parse YOLO label with variable-length rows."""
if not path.exists():
return []
labels = []
with open(path, "r") as f:
for line in f:
vals = line.strip().split()
if len(vals) >= 5:
labels.append([float(v) for v in vals])
return labels
def __getitem__(self, idx):
img_path = self.paths[idx]
label_path = self.labels_dir / f"{img_path.stem}.txt"
# Load & convert to tensor (C,H,W)
img = self._read_image(img_path)
img_t = torch.from_numpy(img).permute(2, 0, 1).contiguous()
# Load labels
labels = self._parse_label(label_path)
return img_t, labels, str(img_path.name)
# ===================== Collate =====================
def collate_fn(batch):
"""Stack images, keep labels as list."""
imgs = torch.stack([b[0] for b in batch], dim=0)
labels = [b[1] for b in batch]
names = [b[2] for b in batch]
return imgs, labels, names
# ===================== Training =====================
def get_pytorch_model(ul_model):
"""Extract PyTorch model and loss from Ultralytics wrapper."""
pt_model = None
loss_fn = None
# Try common patterns
if hasattr(ul_model, "model"):
pt_model = ul_model.model
# Find loss
if pt_model and hasattr(pt_model, "loss"):
loss_fn = pt_model.loss
elif pt_model and hasattr(pt_model, "compute_loss"):
loss_fn = pt_model.compute_loss
if pt_model is None:
raise RuntimeError("Could not extract PyTorch model")
return pt_model, loss_fn
def train(args):
"""Main training function."""
device = args.device
logger.info(f"Device: {device}")
# Parse data.yaml
with open(args.data, "r") as f:
data_config = yaml.safe_load(f)
dataset_root = Path(data_config.get("path", Path(args.data).parent))
train_img = dataset_root / data_config.get("train", "train/images")
val_img = dataset_root / data_config.get("val", "val/images")
train_lbl = train_img.parent / "labels"
val_lbl = val_img.parent / "labels"
# Load model
logger.info(f"Loading {args.weights}")
ul_model = YOLO(args.weights)
pt_model, loss_fn = get_pytorch_model(ul_model)
# Configure model args
from types import SimpleNamespace
if not hasattr(pt_model, "args"):
pt_model.args = SimpleNamespace()
if isinstance(pt_model.args, dict):
pt_model.args = SimpleNamespace(**pt_model.args)
# Set segmentation loss args
pt_model.args.overlap_mask = getattr(pt_model.args, "overlap_mask", True)
pt_model.args.mask_ratio = getattr(pt_model.args, "mask_ratio", 4)
pt_model.args.task = "segment"
pt_model.to(device)
pt_model.train()
for param in pt_model.parameters():
param.requires_grad = True
# Create datasets
train_ds = Float32YOLODataset(str(train_img), str(train_lbl), args.imgsz)
val_ds = Float32YOLODataset(str(val_img), str(val_lbl), args.imgsz)
train_loader = DataLoader(
train_ds,
batch_size=args.batch,
shuffle=True,
num_workers=4,
pin_memory=(device == "cuda"),
collate_fn=collate_fn,
)
val_loader = DataLoader(
val_ds,
batch_size=args.batch,
shuffle=False,
num_workers=2,
pin_memory=(device == "cuda"),
collate_fn=collate_fn,
)
# Optimizer
optimizer = torch.optim.AdamW(pt_model.parameters(), lr=args.lr)
# Training loop
os.makedirs(args.save_dir, exist_ok=True)
best_loss = float("inf")
for epoch in range(args.epochs):
t0 = time.time()
running_loss = 0.0
num_batches = 0
for imgs, labels_list, names in train_loader:
imgs = imgs.to(device)
optimizer.zero_grad()
num_batches += 1
# Forward (simple approach - just use preds)
preds = pt_model(imgs)
# Try to compute loss
# Simplest fallback: if preds is tuple/list, assume last element is loss
if isinstance(preds, (tuple, list)):
# Often YOLO forward returns (preds, loss) in training mode
if (
len(preds) >= 2
and isinstance(preds[-1], dict)
and "loss" in preds[-1]
):
loss = preds[-1]["loss"]
elif len(preds) >= 2 and isinstance(preds[-1], torch.Tensor):
loss = preds[-1]
else:
# Manually compute using loss_fn if available
if loss_fn:
# This may fail - see logs
try:
loss_out = loss_fn(preds, labels_list)
if isinstance(loss_out, dict):
loss = loss_out["loss"]
elif isinstance(loss_out, (tuple, list)):
loss = loss_out[0]
else:
loss = loss_out
except Exception as e:
logger.error(f"Loss computation failed: {e}")
logger.error(
"Consider using Ultralytics .train() or check model/loss compatibility"
)
raise
else:
raise RuntimeError("Cannot determine loss from model output")
elif isinstance(preds, dict) and "loss" in preds:
loss = preds["loss"]
else:
raise RuntimeError(f"Unexpected preds format: {type(preds)}")
# Backward
loss = loss.mean()
loss.backward()
optimizer.step()
running_loss += loss.item()
if (num_batches % 10) == 0:
logger.info(
f"Epoch {epoch+1} Batch {num_batches} Loss: {loss.item():.4f}"
)
epoch_loss = running_loss / max(1, num_batches)
epoch_time = time.time() - t0
logger.info(
f"Epoch {epoch+1}/{args.epochs} - Loss: {epoch_loss:.4f}, Time: {epoch_time:.1f}s"
)
# Save checkpoint
ckpt = Path(args.save_dir) / f"epoch{epoch+1}.pt"
torch.save(
{
"epoch": epoch + 1,
"model_state_dict": pt_model.state_dict(),
"optimizer_state_dict": optimizer.state_dict(),
"loss": epoch_loss,
},
ckpt,
)
# Save best
if epoch_loss < best_loss:
best_loss = epoch_loss
best_ckpt = Path(args.save_dir) / "best.pt"
torch.save(pt_model.state_dict(), best_ckpt)
logger.info(f"New best: {best_ckpt}")
logger.info("Training complete")
# ===================== Main =====================
def parse_args():
parser = argparse.ArgumentParser(
description="Train YOLO on 16-bit TIFF with float32"
)
parser.add_argument("--data", type=str, required=True, help="Path to data.yaml")
parser.add_argument(
"--weights", type=str, default="yolov8s-seg.pt", help="Pretrained weights"
)
parser.add_argument("--epochs", type=int, default=100, help="Number of epochs")
parser.add_argument("--batch", type=int, default=16, help="Batch size")
parser.add_argument("--imgsz", type=int, default=640, help="Image size")
parser.add_argument("--lr", type=float, default=1e-4, help="Learning rate")
parser.add_argument(
"--save-dir", type=str, default="runs/train", help="Save directory"
)
parser.add_argument(
"--device", type=str, default="cuda" if torch.cuda.is_available() else "cpu"
)
return parser.parse_args()
if __name__ == "__main__":
args = parse_args()
logger.info("=" * 70)
logger.info("Float32 16-bit TIFF Training - Standalone Script")
logger.info("=" * 70)
logger.info(f"Data: {args.data}")
logger.info(f"Weights: {args.weights}")
logger.info(f"Epochs: {args.epochs}, Batch: {args.batch}, ImgSz: {args.imgsz}")
logger.info(f"LR: {args.lr}, Device: {args.device}")
logger.info("=" * 70)
train(args)

56
setup.py Normal file
View File

@@ -0,0 +1,56 @@
"""Setup script for Microscopy Object Detection Application."""
from setuptools import setup, find_packages
from pathlib import Path
# Read the contents of README file
this_directory = Path(__file__).parent
long_description = (this_directory / "README.md").read_text(encoding="utf-8")
# Read requirements
requirements = (this_directory / "requirements.txt").read_text().splitlines()
requirements = [
req.strip() for req in requirements if req.strip() and not req.startswith("#")
]
setup(
name="microscopy-object-detection",
version="1.0.0",
author="Your Name",
author_email="your.email@example.com",
description="Desktop application for detecting and segmenting organelles in microscopy images using YOLOv8-seg",
long_description=long_description,
long_description_content_type="text/markdown",
url="https://github.com/yourusername/object_detection",
packages=find_packages(exclude=["tests", "tests.*", "docs"]),
include_package_data=True,
install_requires=requirements,
python_requires=">=3.8",
classifiers=[
"Development Status :: 4 - Beta",
"Intended Audience :: Science/Research",
"Topic :: Scientific/Engineering :: Image Recognition",
"Topic :: Scientific/Engineering :: Bio-Informatics",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Operating System :: OS Independent",
],
entry_points={
"console_scripts": [
"microscopy-detect=src.cli:main",
],
"gui_scripts": [
"microscopy-detect-gui=src.gui_launcher:main",
],
},
keywords="microscopy yolov8 object-detection segmentation computer-vision deep-learning",
project_urls={
"Bug Reports": "https://github.com/yourusername/object_detection/issues",
"Source": "https://github.com/yourusername/object_detection",
"Documentation": "https://github.com/yourusername/object_detection/blob/main/README.md",
},
)

19
src/__init__.py Normal file
View File

@@ -0,0 +1,19 @@
"""
Microscopy Object Detection Application
A desktop application for detecting and segmenting organelles and membrane
branching structures in microscopy images using YOLOv8-seg.
"""
__version__ = "1.0.0"
__author__ = "Your Name"
__email__ = "your.email@example.com"
__license__ = "MIT"
# Package metadata
__all__ = [
"__version__",
"__author__",
"__email__",
"__license__",
]

61
src/cli.py Normal file
View File

@@ -0,0 +1,61 @@
"""
Command-line interface for microscopy object detection application.
"""
import sys
import argparse
from pathlib import Path
from src import __version__
def main():
"""Main CLI entry point."""
parser = argparse.ArgumentParser(
description="Microscopy Object Detection Application - CLI Interface",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
Examples:
# Launch GUI
microscopy-detect-gui
# Show version
microscopy-detect --version
# Get help
microscopy-detect --help
""",
)
parser.add_argument(
"--version",
action="version",
version=f"microscopy-object-detection {__version__}",
)
parser.add_argument(
"--gui",
action="store_true",
help="Launch the GUI application (same as microscopy-detect-gui)",
)
args = parser.parse_args()
if args.gui:
# Launch GUI
try:
from src.gui_launcher import main as gui_main
gui_main()
except Exception as e:
print(f"Error launching GUI: {e}", file=sys.stderr)
sys.exit(1)
else:
# Show help if no arguments provided
parser.print_help()
print("\nTo launch the GUI, use: microscopy-detect-gui")
return 0
if __name__ == "__main__":
sys.exit(main())

0
src/database/__init__.py Normal file
View File

1079
src/database/db_manager.py Normal file

File diff suppressed because it is too large

69
src/database/models.py Normal file
View File

@@ -0,0 +1,69 @@
"""
Data models for the microscopy object detection application.
These dataclasses represent the database entities.
"""
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Dict, Tuple, List
@dataclass
class Model:
"""Represents a trained model."""
id: Optional[int]
model_name: str
model_version: str
model_path: str
base_model: str
created_at: datetime
training_params: Optional[Dict]
metrics: Optional[Dict]
@dataclass
class Image:
"""Represents an image in the database."""
id: Optional[int]
relative_path: str
filename: str
width: int
height: int
captured_at: Optional[datetime]
added_at: datetime
checksum: Optional[str]
@dataclass
class Detection:
"""Represents a detection result."""
id: Optional[int]
image_id: int
model_id: int
class_name: str
bbox: Tuple[float, float, float, float] # (x_min, y_min, x_max, y_max)
confidence: float
segmentation_mask: Optional[
List[List[float]]
] # List of polygon coordinates [[x1,y1], [x2,y2], ...]
detected_at: datetime
metadata: Optional[Dict]
@dataclass
class Annotation:
"""Represents a manual annotation."""
id: Optional[int]
image_id: int
class_name: str
bbox: Tuple[float, float, float, float] # (x_min, y_min, x_max, y_max)
segmentation_mask: Optional[
List[List[float]]
] # List of polygon coordinates [[x1,y1], [x2,y2], ...]
annotator: str
created_at: datetime
verified: bool

91
src/database/schema.sql Normal file
View File

@@ -0,0 +1,91 @@
-- Microscopy Object Detection Application - Database Schema
-- SQLite Database Schema for storing models, images, detections, and annotations
-- Models table: stores trained model information
CREATE TABLE IF NOT EXISTS models (
id INTEGER PRIMARY KEY AUTOINCREMENT,
model_name TEXT NOT NULL,
model_version TEXT NOT NULL,
model_path TEXT NOT NULL,
base_model TEXT NOT NULL DEFAULT 'yolov8s.pt',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
training_params TEXT, -- JSON string of training parameters
metrics TEXT, -- JSON string of validation metrics
UNIQUE(model_name, model_version)
);
-- Images table: stores image metadata
CREATE TABLE IF NOT EXISTS images (
id INTEGER PRIMARY KEY AUTOINCREMENT,
relative_path TEXT NOT NULL UNIQUE,
filename TEXT NOT NULL,
width INTEGER,
height INTEGER,
captured_at TIMESTAMP,
added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
checksum TEXT
);
-- Detections table: stores detection results
CREATE TABLE IF NOT EXISTS detections (
id INTEGER PRIMARY KEY AUTOINCREMENT,
image_id INTEGER NOT NULL,
model_id INTEGER NOT NULL,
class_name TEXT NOT NULL,
x_min REAL NOT NULL CHECK(x_min >= 0 AND x_min <= 1),
y_min REAL NOT NULL CHECK(y_min >= 0 AND y_min <= 1),
x_max REAL NOT NULL CHECK(x_max >= 0 AND x_max <= 1),
y_max REAL NOT NULL CHECK(y_max >= 0 AND y_max <= 1),
confidence REAL NOT NULL CHECK(confidence >= 0 AND confidence <= 1),
segmentation_mask TEXT, -- JSON string of polygon coordinates [[x1,y1], [x2,y2], ...]
detected_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
metadata TEXT, -- JSON string for additional metadata
FOREIGN KEY (image_id) REFERENCES images (id) ON DELETE CASCADE,
FOREIGN KEY (model_id) REFERENCES models (id) ON DELETE CASCADE
);
-- Object classes table: stores annotation class definitions with colors
CREATE TABLE IF NOT EXISTS object_classes (
id INTEGER PRIMARY KEY AUTOINCREMENT,
class_name TEXT NOT NULL UNIQUE,
color TEXT NOT NULL, -- Hex color code (e.g., '#FF0000')
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
description TEXT
);
-- Insert default object classes
INSERT OR IGNORE INTO object_classes (class_name, color, description) VALUES
('cell', '#FF0000', 'Cell object'),
('nucleus', '#00FF00', 'Cell nucleus'),
('mitochondria', '#0000FF', 'Mitochondria'),
('vesicle', '#FFFF00', 'Vesicle');
-- Annotations table: stores manual annotations
CREATE TABLE IF NOT EXISTS annotations (
id INTEGER PRIMARY KEY AUTOINCREMENT,
image_id INTEGER NOT NULL,
class_id INTEGER NOT NULL,
x_min REAL NOT NULL CHECK(x_min >= 0 AND x_min <= 1),
y_min REAL NOT NULL CHECK(y_min >= 0 AND y_min <= 1),
x_max REAL NOT NULL CHECK(x_max >= 0 AND x_max <= 1),
y_max REAL NOT NULL CHECK(y_max >= 0 AND y_max <= 1),
segmentation_mask TEXT, -- JSON string of polygon coordinates [[x1,y1], [x2,y2], ...]
annotator TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
verified BOOLEAN DEFAULT 0,
FOREIGN KEY (image_id) REFERENCES images (id) ON DELETE CASCADE,
FOREIGN KEY (class_id) REFERENCES object_classes (id) ON DELETE CASCADE
);
-- Create indexes for performance optimization
CREATE INDEX IF NOT EXISTS idx_detections_image_id ON detections(image_id);
CREATE INDEX IF NOT EXISTS idx_detections_model_id ON detections(model_id);
CREATE INDEX IF NOT EXISTS idx_detections_class_name ON detections(class_name);
CREATE INDEX IF NOT EXISTS idx_detections_detected_at ON detections(detected_at);
CREATE INDEX IF NOT EXISTS idx_detections_confidence ON detections(confidence);
CREATE INDEX IF NOT EXISTS idx_images_relative_path ON images(relative_path);
CREATE INDEX IF NOT EXISTS idx_images_added_at ON images(added_at);
CREATE INDEX IF NOT EXISTS idx_annotations_image_id ON annotations(image_id);
CREATE INDEX IF NOT EXISTS idx_annotations_class_id ON annotations(class_id);
CREATE INDEX IF NOT EXISTS idx_models_created_at ON models(created_at);
CREATE INDEX IF NOT EXISTS idx_object_classes_class_name ON object_classes(class_name);

0
src/gui/__init__.py Normal file
View File

View File

View File

@@ -0,0 +1,291 @@
"""
Configuration dialog for the microscopy object detection application.
"""
from PySide6.QtWidgets import (
QDialog,
QVBoxLayout,
QHBoxLayout,
QFormLayout,
QPushButton,
QLineEdit,
QSpinBox,
QDoubleSpinBox,
QFileDialog,
QTabWidget,
QWidget,
QLabel,
QGroupBox,
)
from PySide6.QtCore import Qt
from src.utils.config_manager import ConfigManager
from src.utils.logger import get_logger
logger = get_logger(__name__)
class ConfigDialog(QDialog):
"""Configuration dialog window."""
def __init__(self, config_manager: ConfigManager, parent=None):
super().__init__(parent)
self.config_manager = config_manager
self.setWindowTitle("Settings")
self.setMinimumWidth(500)
self.setMinimumHeight(400)
self._setup_ui()
self._load_settings()
def _setup_ui(self):
"""Setup user interface."""
layout = QVBoxLayout()
# Create tab widget for different setting categories
self.tab_widget = QTabWidget()
# General settings tab
general_tab = self._create_general_tab()
self.tab_widget.addTab(general_tab, "General")
# Training settings tab
training_tab = self._create_training_tab()
self.tab_widget.addTab(training_tab, "Training")
# Detection settings tab
detection_tab = self._create_detection_tab()
self.tab_widget.addTab(detection_tab, "Detection")
layout.addWidget(self.tab_widget)
# Buttons
button_layout = QHBoxLayout()
button_layout.addStretch()
self.save_button = QPushButton("Save")
self.save_button.clicked.connect(self.accept)
button_layout.addWidget(self.save_button)
self.cancel_button = QPushButton("Cancel")
self.cancel_button.clicked.connect(self.reject)
button_layout.addWidget(self.cancel_button)
layout.addLayout(button_layout)
self.setLayout(layout)
def _create_general_tab(self) -> QWidget:
"""Create general settings tab."""
widget = QWidget()
layout = QVBoxLayout()
# Image repository group
repo_group = QGroupBox("Image Repository")
repo_layout = QFormLayout()
# Repository path
path_layout = QHBoxLayout()
self.repo_path_edit = QLineEdit()
self.repo_path_edit.setPlaceholderText("Path to image repository")
path_layout.addWidget(self.repo_path_edit)
browse_button = QPushButton("Browse...")
browse_button.clicked.connect(self._browse_repository)
path_layout.addWidget(browse_button)
repo_layout.addRow("Base Path:", path_layout)
repo_group.setLayout(repo_layout)
layout.addWidget(repo_group)
# Database group
db_group = QGroupBox("Database")
db_layout = QFormLayout()
self.db_path_edit = QLineEdit()
self.db_path_edit.setPlaceholderText("Path to database file")
db_layout.addRow("Database Path:", self.db_path_edit)
db_group.setLayout(db_layout)
layout.addWidget(db_group)
# Models group
models_group = QGroupBox("Models")
models_layout = QFormLayout()
self.models_dir_edit = QLineEdit()
self.models_dir_edit.setPlaceholderText("Directory for saved models")
models_layout.addRow("Models Directory:", self.models_dir_edit)
self.base_model_edit = QLineEdit()
self.base_model_edit.setPlaceholderText("yolov8s-seg.pt")
models_layout.addRow("Default Base Model:", self.base_model_edit)
models_group.setLayout(models_layout)
layout.addWidget(models_group)
layout.addStretch()
widget.setLayout(layout)
return widget
def _create_training_tab(self) -> QWidget:
"""Create training settings tab."""
widget = QWidget()
layout = QVBoxLayout()
form_layout = QFormLayout()
# Epochs
self.epochs_spin = QSpinBox()
self.epochs_spin.setRange(1, 1000)
self.epochs_spin.setValue(100)
form_layout.addRow("Default Epochs:", self.epochs_spin)
# Batch size
self.batch_size_spin = QSpinBox()
self.batch_size_spin.setRange(1, 128)
self.batch_size_spin.setValue(16)
form_layout.addRow("Default Batch Size:", self.batch_size_spin)
# Image size
self.imgsz_spin = QSpinBox()
self.imgsz_spin.setRange(320, 1280)
self.imgsz_spin.setSingleStep(32)
self.imgsz_spin.setValue(640)
form_layout.addRow("Default Image Size:", self.imgsz_spin)
# Patience
self.patience_spin = QSpinBox()
self.patience_spin.setRange(1, 200)
self.patience_spin.setValue(50)
form_layout.addRow("Default Patience:", self.patience_spin)
# Learning rate
self.lr_spin = QDoubleSpinBox()
self.lr_spin.setRange(0.0001, 0.1)
self.lr_spin.setSingleStep(0.001)
self.lr_spin.setDecimals(4)
self.lr_spin.setValue(0.01)
form_layout.addRow("Default Learning Rate:", self.lr_spin)
layout.addLayout(form_layout)
layout.addStretch()
widget.setLayout(layout)
return widget
def _create_detection_tab(self) -> QWidget:
"""Create detection settings tab."""
widget = QWidget()
layout = QVBoxLayout()
form_layout = QFormLayout()
# Confidence threshold
self.conf_spin = QDoubleSpinBox()
self.conf_spin.setRange(0.0, 1.0)
self.conf_spin.setSingleStep(0.05)
self.conf_spin.setDecimals(2)
self.conf_spin.setValue(0.25)
form_layout.addRow("Default Confidence:", self.conf_spin)
# IoU threshold
self.iou_spin = QDoubleSpinBox()
self.iou_spin.setRange(0.0, 1.0)
self.iou_spin.setSingleStep(0.05)
self.iou_spin.setDecimals(2)
self.iou_spin.setValue(0.45)
form_layout.addRow("Default IoU:", self.iou_spin)
# Max batch size
self.max_batch_spin = QSpinBox()
self.max_batch_spin.setRange(1, 1000)
self.max_batch_spin.setValue(100)
form_layout.addRow("Max Batch Size:", self.max_batch_spin)
layout.addLayout(form_layout)
layout.addStretch()
widget.setLayout(layout)
return widget
def _browse_repository(self):
"""Browse for image repository directory."""
directory = QFileDialog.getExistingDirectory(
self, "Select Image Repository", self.repo_path_edit.text()
)
if directory:
self.repo_path_edit.setText(directory)
def _load_settings(self):
"""Load current settings into dialog."""
# General settings
self.repo_path_edit.setText(
self.config_manager.get("image_repository.base_path", "")
)
self.db_path_edit.setText(
self.config_manager.get("database.path", "data/detections.db")
)
self.models_dir_edit.setText(
self.config_manager.get("models.models_directory", "data/models")
)
self.base_model_edit.setText(
self.config_manager.get("models.default_base_model", "yolov8s-seg.pt")
)
# Training settings
self.epochs_spin.setValue(
self.config_manager.get("training.default_epochs", 100)
)
self.batch_size_spin.setValue(
self.config_manager.get("training.default_batch_size", 16)
)
self.imgsz_spin.setValue(self.config_manager.get("training.default_imgsz", 640))
self.patience_spin.setValue(
self.config_manager.get("training.default_patience", 50)
)
self.lr_spin.setValue(self.config_manager.get("training.default_lr0", 0.01))
# Detection settings
self.conf_spin.setValue(
self.config_manager.get("detection.default_confidence", 0.25)
)
self.iou_spin.setValue(self.config_manager.get("detection.default_iou", 0.45))
self.max_batch_spin.setValue(
self.config_manager.get("detection.max_batch_size", 100)
)
def accept(self):
"""Save settings and close dialog."""
logger.info("Saving configuration")
# Save general settings
self.config_manager.set(
"image_repository.base_path", self.repo_path_edit.text()
)
self.config_manager.set("database.path", self.db_path_edit.text())
self.config_manager.set("models.models_directory", self.models_dir_edit.text())
self.config_manager.set(
"models.default_base_model", self.base_model_edit.text()
)
# Save training settings
self.config_manager.set("training.default_epochs", self.epochs_spin.value())
self.config_manager.set(
"training.default_batch_size", self.batch_size_spin.value()
)
self.config_manager.set("training.default_imgsz", self.imgsz_spin.value())
self.config_manager.set("training.default_patience", self.patience_spin.value())
self.config_manager.set("training.default_lr0", self.lr_spin.value())
# Save detection settings
self.config_manager.set("detection.default_confidence", self.conf_spin.value())
self.config_manager.set("detection.default_iou", self.iou_spin.value())
self.config_manager.set("detection.max_batch_size", self.max_batch_spin.value())
# Save to file
self.config_manager.save_config()
super().accept()

309
src/gui/main_window.py Normal file
View File

@@ -0,0 +1,309 @@
"""
Main window for the microscopy object detection application.
"""
from PySide6.QtWidgets import (
QMainWindow,
QTabWidget,
QMenuBar,
QMenu,
QStatusBar,
QMessageBox,
QWidget,
QVBoxLayout,
QLabel,
)
from PySide6.QtCore import Qt, QTimer, QSettings
from PySide6.QtGui import QAction, QKeySequence
from src.database.db_manager import DatabaseManager
from src.utils.config_manager import ConfigManager
from src.utils.logger import get_logger
from src.gui.dialogs.config_dialog import ConfigDialog
from src.gui.tabs.detection_tab import DetectionTab
from src.gui.tabs.training_tab import TrainingTab
from src.gui.tabs.validation_tab import ValidationTab
from src.gui.tabs.results_tab import ResultsTab
from src.gui.tabs.annotation_tab import AnnotationTab
logger = get_logger(__name__)
class MainWindow(QMainWindow):
"""Main application window."""
def __init__(self):
super().__init__()
# Initialize managers
self.config_manager = ConfigManager()
db_path = self.config_manager.get_database_path()
self.db_manager = DatabaseManager(db_path)
logger.info("Main window initializing")
# Setup UI
self.setWindowTitle("Microscopy Object Detection")
self.setMinimumSize(1200, 800)
self._create_menu_bar()
self._create_tab_widget()
self._create_status_bar()
# Restore window geometry or center window on screen
self._restore_window_state()
logger.info("Main window initialized")
def _create_menu_bar(self):
"""Create application menu bar."""
menubar = self.menuBar()
# File menu
file_menu = menubar.addMenu("&File")
settings_action = QAction("&Settings", self)
settings_action.setShortcut(QKeySequence("Ctrl+,"))
settings_action.triggered.connect(self._show_settings)
file_menu.addAction(settings_action)
file_menu.addSeparator()
exit_action = QAction("E&xit", self)
exit_action.setShortcut(QKeySequence("Ctrl+Q"))
exit_action.triggered.connect(self.close)
file_menu.addAction(exit_action)
# View menu
view_menu = menubar.addMenu("&View")
refresh_action = QAction("&Refresh", self)
refresh_action.setShortcut(QKeySequence("F5"))
refresh_action.triggered.connect(self._refresh_current_tab)
view_menu.addAction(refresh_action)
# Tools menu
tools_menu = menubar.addMenu("&Tools")
db_stats_action = QAction("Database &Statistics", self)
db_stats_action.triggered.connect(self._show_database_stats)
tools_menu.addAction(db_stats_action)
# Help menu
help_menu = menubar.addMenu("&Help")
about_action = QAction("&About", self)
about_action.triggered.connect(self._show_about)
help_menu.addAction(about_action)
docs_action = QAction("&Documentation", self)
docs_action.triggered.connect(self._show_documentation)
help_menu.addAction(docs_action)
def _create_tab_widget(self):
"""Create main tab widget with all tabs."""
self.tab_widget = QTabWidget()
self.tab_widget.setTabPosition(QTabWidget.North)
# Create tabs
try:
self.detection_tab = DetectionTab(self.db_manager, self.config_manager)
self.training_tab = TrainingTab(self.db_manager, self.config_manager)
self.validation_tab = ValidationTab(self.db_manager, self.config_manager)
self.results_tab = ResultsTab(self.db_manager, self.config_manager)
self.annotation_tab = AnnotationTab(self.db_manager, self.config_manager)
# Add tabs to widget
self.tab_widget.addTab(self.detection_tab, "Detection")
self.tab_widget.addTab(self.training_tab, "Training")
self.tab_widget.addTab(self.validation_tab, "Validation")
self.tab_widget.addTab(self.results_tab, "Results")
self.tab_widget.addTab(self.annotation_tab, "Annotation (Future)")
# Connect tab change signal
self.tab_widget.currentChanged.connect(self._on_tab_changed)
except Exception as e:
logger.error(f"Error creating tabs: {e}")
# Create placeholder
placeholder = QWidget()
layout = QVBoxLayout()
layout.addWidget(QLabel(f"Error creating tabs: {e}"))
placeholder.setLayout(layout)
self.tab_widget.addTab(placeholder, "Error")
self.setCentralWidget(self.tab_widget)
def _create_status_bar(self):
"""Create status bar."""
self.status_bar = QStatusBar()
self.setStatusBar(self.status_bar)
# Add permanent widgets to status bar
self.status_label = QLabel("Ready")
self.status_bar.addWidget(self.status_label)
# Initial status message
self._update_status("Ready")
def _center_window(self):
"""Center window on screen."""
screen = self.screen().geometry()
size = self.geometry()
self.move(
(screen.width() - size.width()) // 2, (screen.height() - size.height()) // 2
)
def _restore_window_state(self):
"""Restore window geometry from settings or center window."""
settings = QSettings("microscopy_app", "object_detection")
geometry = settings.value("main_window/geometry")
if geometry:
self.restoreGeometry(geometry)
logger.debug("Restored window geometry from settings")
else:
self._center_window()
logger.debug("Centered window on screen")
def _save_window_state(self):
"""Save window geometry to settings."""
settings = QSettings("microscopy_app", "object_detection")
settings.setValue("main_window/geometry", self.saveGeometry())
logger.debug("Saved window geometry to settings")
def _show_settings(self):
"""Show settings dialog."""
logger.info("Opening settings dialog")
dialog = ConfigDialog(self.config_manager, self)
if dialog.exec():
self._apply_settings()
self._update_status("Settings saved")
def _apply_settings(self):
"""Apply changed settings."""
logger.info("Applying settings changes")
# Reload configuration in all tabs if needed
try:
if hasattr(self, "detection_tab"):
self.detection_tab.refresh()
if hasattr(self, "training_tab"):
self.training_tab.refresh()
if hasattr(self, "results_tab"):
self.results_tab.refresh()
except Exception as e:
logger.error(f"Error applying settings: {e}")
def _refresh_current_tab(self):
"""Refresh the current tab."""
current_widget = self.tab_widget.currentWidget()
if hasattr(current_widget, "refresh"):
current_widget.refresh()
self._update_status("Tab refreshed")
def _on_tab_changed(self, index: int):
"""Handle tab change event."""
tab_name = self.tab_widget.tabText(index)
logger.debug(f"Switched to tab: {tab_name}")
self._update_status(f"Viewing: {tab_name}")
def _show_database_stats(self):
"""Show database statistics dialog."""
try:
stats = self.db_manager.get_detection_statistics()
message = f"""
<h3>Database Statistics</h3>
<p><b>Total Detections:</b> {stats.get('total_detections', 0)}</p>
<p><b>Average Confidence:</b> {stats.get('average_confidence', 0):.2%}</p>
<p><b>Classes:</b></p>
<ul>
"""
for class_name, count in stats.get("class_counts", {}).items():
message += f"<li>{class_name}: {count}</li>"
message += "</ul>"
QMessageBox.information(self, "Database Statistics", message)
except Exception as e:
logger.error(f"Error getting database stats: {e}")
QMessageBox.warning(
self, "Error", f"Failed to get database statistics:\n{str(e)}"
)
def _show_about(self):
"""Show about dialog."""
about_text = """
<h2>Microscopy Object Detection Application</h2>
<p><b>Version:</b> 1.0.0</p>
        <p>A desktop application for detecting and segmenting organelles and membrane branching
        structures in microscopy images using YOLOv8-seg.</p>
<p><b>Features:</b></p>
<ul>
<li>Object detection with YOLOv8</li>
<li>Model training and validation</li>
<li>Detection results storage</li>
<li>Interactive visualization</li>
<li>Export capabilities</li>
</ul>
<p><b>Technologies:</b></p>
<ul>
<li>Ultralytics YOLOv8</li>
<li>PySide6</li>
<li>pyqtgraph</li>
<li>SQLite</li>
</ul>
"""
QMessageBox.about(self, "About", about_text)
def _show_documentation(self):
"""Show documentation."""
QMessageBox.information(
self,
"Documentation",
"Please refer to README.md and ARCHITECTURE.md files in the project directory.",
)
def _update_status(self, message: str, timeout: int = 5000):
"""
Update status bar message.
Args:
message: Status message to display
timeout: Time in milliseconds to show message (0 for permanent)
"""
self.status_label.setText(message)
if timeout > 0:
QTimer.singleShot(timeout, lambda: self.status_label.setText("Ready"))
def closeEvent(self, event):
"""Handle window close event."""
reply = QMessageBox.question(
self,
"Confirm Exit",
"Are you sure you want to exit?",
QMessageBox.Yes | QMessageBox.No,
QMessageBox.No,
)
if reply == QMessageBox.Yes:
# Save window state before closing
self._save_window_state()
# Persist tab state and stop background work before exit
if hasattr(self, "training_tab"):
self.training_tab.shutdown()
if hasattr(self, "annotation_tab"):
self.annotation_tab.save_state()
logger.info("Application closing")
event.accept()
else:
event.ignore()

0
src/gui/tabs/__init__.py Normal file
View File

View File

@@ -0,0 +1,566 @@
"""
Annotation tab for the microscopy object detection application.
Manual annotation with pen tool and object class management.
"""
from PySide6.QtWidgets import (
QWidget,
QVBoxLayout,
QHBoxLayout,
QLabel,
QGroupBox,
QPushButton,
QFileDialog,
QMessageBox,
QSplitter,
)
from PySide6.QtCore import Qt, QSettings
from pathlib import Path
from src.database.db_manager import DatabaseManager
from src.utils.config_manager import ConfigManager
from src.utils.image import Image, ImageLoadError
from src.utils.logger import get_logger
from src.gui.widgets import AnnotationCanvasWidget, AnnotationToolsWidget
logger = get_logger(__name__)
class AnnotationTab(QWidget):
"""Annotation tab for manual image annotation."""
def __init__(
self, db_manager: DatabaseManager, config_manager: ConfigManager, parent=None
):
super().__init__(parent)
self.db_manager = db_manager
self.config_manager = config_manager
self.current_image = None
self.current_image_path = None
self.current_image_id = None
self.current_annotations = []
# IDs of annotations currently selected on the canvas (multi-select)
self.selected_annotation_ids = []
self._setup_ui()
def _setup_ui(self):
"""Setup user interface."""
layout = QVBoxLayout()
# Main horizontal splitter to divide left (image) and right (controls)
self.main_splitter = QSplitter(Qt.Horizontal)
self.main_splitter.setHandleWidth(10)
# { Left splitter for image display and zoom info
self.left_splitter = QSplitter(Qt.Vertical)
self.left_splitter.setHandleWidth(10)
# Annotation canvas section
canvas_group = QGroupBox("Annotation Canvas")
canvas_layout = QVBoxLayout()
# Use the AnnotationCanvasWidget
self.annotation_canvas = AnnotationCanvasWidget()
self.annotation_canvas.zoom_changed.connect(self._on_zoom_changed)
self.annotation_canvas.annotation_drawn.connect(self._on_annotation_drawn)
# Selection of existing polylines (when tool is not in drawing mode)
self.annotation_canvas.annotation_selected.connect(self._on_annotation_selected)
canvas_layout.addWidget(self.annotation_canvas)
canvas_group.setLayout(canvas_layout)
self.left_splitter.addWidget(canvas_group)
# Controls info
controls_info = QLabel(
"Zoom: Mouse wheel or +/- keys | Drawing: Enable pen and drag mouse"
)
controls_info.setStyleSheet("QLabel { color: #888; font-style: italic; }")
self.left_splitter.addWidget(controls_info)
# }
# { Right splitter for annotation tools and controls
self.right_splitter = QSplitter(Qt.Vertical)
self.right_splitter.setHandleWidth(10)
# Annotation tools section
self.annotation_tools = AnnotationToolsWidget(self.db_manager)
self.annotation_tools.polyline_enabled_changed.connect(
self.annotation_canvas.set_polyline_enabled
)
self.annotation_tools.polyline_pen_color_changed.connect(
self.annotation_canvas.set_polyline_pen_color
)
self.annotation_tools.polyline_pen_width_changed.connect(
self.annotation_canvas.set_polyline_pen_width
)
# Show / hide bounding boxes
self.annotation_tools.show_bboxes_changed.connect(
self.annotation_canvas.set_show_bboxes
)
# RDP simplification controls
self.annotation_tools.simplify_on_finish_changed.connect(
self._on_simplify_on_finish_changed
)
self.annotation_tools.simplify_epsilon_changed.connect(
self._on_simplify_epsilon_changed
)
# Class selection and class-color changes
self.annotation_tools.class_selected.connect(self._on_class_selected)
self.annotation_tools.class_color_changed.connect(self._on_class_color_changed)
self.annotation_tools.clear_annotations_requested.connect(
self._on_clear_annotations
)
# Delete selected annotation on canvas
self.annotation_tools.delete_selected_annotation_requested.connect(
self._on_delete_selected_annotation
)
self.right_splitter.addWidget(self.annotation_tools)
# Image loading section
load_group = QGroupBox("Image Loading")
load_layout = QVBoxLayout()
# Load image button
button_layout = QHBoxLayout()
self.load_image_btn = QPushButton("Load Image")
self.load_image_btn.clicked.connect(self._load_image)
button_layout.addWidget(self.load_image_btn)
button_layout.addStretch()
load_layout.addLayout(button_layout)
# Image info label
self.image_info_label = QLabel("No image loaded")
load_layout.addWidget(self.image_info_label)
load_group.setLayout(load_layout)
self.right_splitter.addWidget(load_group)
# }
# Add both splitters to the main horizontal splitter
self.main_splitter.addWidget(self.left_splitter)
self.main_splitter.addWidget(self.right_splitter)
# Set initial sizes: 75% for left (image), 25% for right (controls)
self.main_splitter.setSizes([750, 250])
layout.addWidget(self.main_splitter)
self.setLayout(layout)
# Restore splitter positions from settings
self._restore_state()
def _load_image(self):
"""Load and display an image file."""
# Get last opened directory from QSettings
settings = QSettings("microscopy_app", "object_detection")
last_dir = settings.value("annotation_tab/last_directory", None)
# Fallback to image repository path or home directory
if last_dir and Path(last_dir).exists():
start_dir = last_dir
else:
repo_path = self.config_manager.get_image_repository_path()
start_dir = repo_path if repo_path else str(Path.home())
# Open file dialog
file_path, _ = QFileDialog.getOpenFileName(
self,
"Select Image",
start_dir,
"Images (*" + " *".join(Image.SUPPORTED_EXTENSIONS) + ")",
)
if not file_path:
return
try:
# Load image using Image class
self.current_image = Image(file_path)
self.current_image_path = file_path
# Store the directory for next time
settings.setValue(
"annotation_tab/last_directory", str(Path(file_path).parent)
)
# Get or create image in database
relative_path = str(Path(file_path).name) # Simplified for now
self.current_image_id = self.db_manager.get_or_create_image(
relative_path,
Path(file_path).name,
self.current_image.width,
self.current_image.height,
)
# Display image using the AnnotationCanvasWidget
self.annotation_canvas.load_image(self.current_image)
# Load and display any existing annotations for this image
self._load_annotations_for_current_image()
# Update info label
self._update_image_info()
logger.info(f"Loaded image: {file_path} (DB ID: {self.current_image_id})")
except ImageLoadError as e:
logger.error(f"Failed to load image: {e}")
QMessageBox.critical(
self, "Error Loading Image", f"Failed to load image:\n{str(e)}"
)
except Exception as e:
logger.error(f"Unexpected error loading image: {e}")
QMessageBox.critical(self, "Error", f"Unexpected error:\n{str(e)}")
def _update_image_info(self):
"""Update the image info label with current image details."""
if self.current_image is None:
self.image_info_label.setText("No image loaded")
return
zoom_percentage = self.annotation_canvas.get_zoom_percentage()
info_text = (
f"File: {Path(self.current_image_path).name}\n"
f"Size: {self.current_image.width}x{self.current_image.height} pixels\n"
f"Channels: {self.current_image.channels}\n"
f"Data type: {self.current_image.dtype}\n"
f"Format: {self.current_image.format.upper()}\n"
f"File size: {self.current_image.size_mb:.2f} MB\n"
f"Zoom: {zoom_percentage}%"
)
self.image_info_label.setText(info_text)
def _on_zoom_changed(self, zoom_scale: float):
"""Handle zoom level changes from the annotation canvas."""
self._update_image_info()
def _on_annotation_drawn(self, points: list):
"""
Handle when an annotation stroke is drawn.
Saves the new annotation directly to the database and refreshes the
on-canvas display of annotations for the current image.
"""
# Ensure we have an image loaded and in the DB
if not self.current_image or not self.current_image_id:
logger.warning("Annotation drawn but no image loaded")
QMessageBox.warning(
self,
"No Image",
"Please load an image before drawing annotations.",
)
return
current_class = self.annotation_tools.get_current_class()
if not current_class:
logger.warning("Annotation drawn but no object class selected")
QMessageBox.warning(
self,
"No Class Selected",
"Please select an object class before drawing annotations.",
)
return
if not points:
logger.warning("Annotation drawn with no points, ignoring")
return
# points are [(x_norm, y_norm), ...]
xs = [p[0] for p in points]
ys = [p[1] for p in points]
x_min, x_max = min(xs), max(xs)
y_min, y_max = min(ys), max(ys)
# Store segmentation mask in [y_norm, x_norm] format to match DB
db_polyline = [[float(y), float(x)] for (x, y) in points]
try:
annotation_id = self.db_manager.add_annotation(
image_id=self.current_image_id,
class_id=current_class["id"],
bbox=(x_min, y_min, x_max, y_max),
annotator="manual",
segmentation_mask=db_polyline,
verified=False,
)
logger.info(
f"Saved annotation (ID: {annotation_id}) for class "
f"'{current_class['class_name']}' "
f"Bounding box: ({x_min:.3f}, {y_min:.3f}) to ({x_max:.3f}, {y_max:.3f})\n"
f"with {len(points)} polyline points"
)
# Reload annotations from DB and redraw (respecting current class filter)
self._load_annotations_for_current_image()
except Exception as e:
logger.error(f"Failed to save annotation: {e}")
QMessageBox.critical(self, "Error", f"Failed to save annotation:\n{str(e)}")
def _on_annotation_selected(self, annotation_ids):
"""
Handle selection of existing annotations on the canvas.
Args:
annotation_ids: List of selected annotation IDs, or None/empty if cleared.
"""
if not annotation_ids:
self.selected_annotation_ids = []
self.annotation_tools.set_has_selected_annotation(False)
logger.debug("Annotation selection cleared on canvas")
return
# Normalize to a unique, sorted list of integer IDs
ids = sorted({int(aid) for aid in annotation_ids if isinstance(aid, int)})
self.selected_annotation_ids = ids
self.annotation_tools.set_has_selected_annotation(bool(ids))
logger.debug(f"Annotations selected on canvas: IDs={ids}")
def _on_simplify_on_finish_changed(self, enabled: bool):
"""Update canvas simplify-on-finish flag from tools widget."""
self.annotation_canvas.simplify_on_finish = enabled
logger.debug(f"Annotation simplification on finish set to {enabled}")
def _on_simplify_epsilon_changed(self, epsilon: float):
"""Update canvas RDP epsilon from tools widget."""
self.annotation_canvas.simplify_epsilon = float(epsilon)
logger.debug(f"Annotation simplification epsilon set to {epsilon}")
def _on_class_color_changed(self):
"""
Handle changes to the selected object's class color.
When the user updates a class color in the tools widget, reload the
annotations for the current image so that all polylines are redrawn
using the updated per-class colors.
"""
if not self.current_image_id:
return
logger.debug(
f"Class color changed; reloading annotations for image ID {self.current_image_id}"
)
self._load_annotations_for_current_image()
def _on_class_selected(self, class_data):
"""
Handle when an object class is selected or cleared.
When a specific class is selected, only annotations of that class are drawn.
When the selection is cleared ("-- Select Class --"), all annotations are shown.
"""
if class_data:
logger.debug(f"Object class selected: {class_data['class_name']}")
else:
logger.debug(
'No class selected ("-- Select Class --"), showing all annotations'
)
# Changing the class filter invalidates any previous selection
self.selected_annotation_ids = []
self.annotation_tools.set_has_selected_annotation(False)
# Whenever the selection changes, update which annotations are visible
self._redraw_annotations_for_current_filter()
def _on_clear_annotations(self):
"""Handle clearing all annotations."""
self.annotation_canvas.clear_annotations()
# Clear in-memory state and selection, but keep DB entries unchanged
self.current_annotations = []
self.selected_annotation_ids = []
self.annotation_tools.set_has_selected_annotation(False)
logger.info("Cleared all annotations")
def _on_delete_selected_annotation(self):
"""Handle deleting the currently selected annotation(s) (if any)."""
if not self.selected_annotation_ids:
QMessageBox.information(
self,
"No Selection",
"No annotation is currently selected.",
)
return
count = len(self.selected_annotation_ids)
if count == 1:
question = "Are you sure you want to delete the selected annotation?"
title = "Delete Annotation"
else:
question = (
f"Are you sure you want to delete the {count} selected annotations?"
)
title = "Delete Annotations"
reply = QMessageBox.question(
self,
title,
question,
QMessageBox.Yes | QMessageBox.No,
QMessageBox.No,
)
if reply != QMessageBox.Yes:
return
failed_ids = []
try:
for ann_id in self.selected_annotation_ids:
try:
deleted = self.db_manager.delete_annotation(ann_id)
if not deleted:
failed_ids.append(ann_id)
except Exception as e:
logger.error(f"Failed to delete annotation ID {ann_id}: {e}")
failed_ids.append(ann_id)
if failed_ids:
QMessageBox.warning(
self,
"Partial Failure",
"Some annotations could not be deleted:\n"
+ ", ".join(str(a) for a in failed_ids),
)
else:
logger.info(
f"Deleted {count} annotation(s): "
+ ", ".join(str(a) for a in self.selected_annotation_ids)
)
# Clear selection and reload annotations for the current image from DB
self.selected_annotation_ids = []
self.annotation_tools.set_has_selected_annotation(False)
self._load_annotations_for_current_image()
except Exception as e:
logger.error(f"Failed to delete annotations: {e}")
QMessageBox.critical(
self,
"Error",
f"Failed to delete annotations:\n{str(e)}",
)
def _load_annotations_for_current_image(self):
"""
Load all annotations for the current image from the database and
redraw them on the canvas, honoring the currently selected class
filter (if any).
"""
if not self.current_image_id:
self.current_annotations = []
self.annotation_canvas.clear_annotations()
self.selected_annotation_ids = []
self.annotation_tools.set_has_selected_annotation(False)
return
try:
self.current_annotations = self.db_manager.get_annotations_for_image(
self.current_image_id
)
# New annotations loaded; reset any selection
self.selected_annotation_ids = []
self.annotation_tools.set_has_selected_annotation(False)
self._redraw_annotations_for_current_filter()
except Exception as e:
logger.error(
f"Failed to load annotations for image {self.current_image_id}: {e}"
)
QMessageBox.critical(
self,
"Error",
f"Failed to load annotations for this image:\n{str(e)}",
)
def _redraw_annotations_for_current_filter(self):
"""
Redraw annotations for the current image, optionally filtered by the
currently selected object class.
"""
# Clear current on-canvas annotations but keep the image
self.annotation_canvas.clear_annotations()
if not self.current_annotations:
return
current_class = self.annotation_tools.get_current_class()
selected_class_id = current_class["id"] if current_class else None
drawn_count = 0
for ann in self.current_annotations:
# Filter by class if one is selected
if (
selected_class_id is not None
and ann.get("class_id") != selected_class_id
):
continue
if ann.get("segmentation_mask"):
polyline = ann["segmentation_mask"]
color = ann.get("class_color", "#FF0000")
self.annotation_canvas.draw_saved_polyline(
polyline,
color,
width=3,
annotation_id=ann["id"],
)
self.annotation_canvas.draw_saved_bbox(
[ann["x_min"], ann["y_min"], ann["x_max"], ann["y_max"]],
color,
width=3,
)
drawn_count += 1
logger.info(
f"Displayed {drawn_count} annotation(s) for current image with "
f"{'no class filter' if selected_class_id is None else f'class_id={selected_class_id}'}"
)
def _restore_state(self):
"""Restore splitter positions from settings."""
settings = QSettings("microscopy_app", "object_detection")
# Restore main splitter state
main_state = settings.value("annotation_tab/main_splitter_state")
if main_state:
self.main_splitter.restoreState(main_state)
logger.debug("Restored main splitter state")
# Restore left splitter state
left_state = settings.value("annotation_tab/left_splitter_state")
if left_state:
self.left_splitter.restoreState(left_state)
logger.debug("Restored left splitter state")
# Restore right splitter state
right_state = settings.value("annotation_tab/right_splitter_state")
if right_state:
self.right_splitter.restoreState(right_state)
logger.debug("Restored right splitter state")
def save_state(self):
"""Save splitter positions to settings."""
settings = QSettings("microscopy_app", "object_detection")
# Save main splitter state
settings.setValue(
"annotation_tab/main_splitter_state", self.main_splitter.saveState()
)
# Save left splitter state
settings.setValue(
"annotation_tab/left_splitter_state", self.left_splitter.saveState()
)
# Save right splitter state
settings.setValue(
"annotation_tab/right_splitter_state", self.right_splitter.saveState()
)
logger.debug("Saved annotation tab splitter states")
def refresh(self):
"""Refresh the tab."""
pass


@@ -0,0 +1,466 @@
"""
Detection tab for the microscopy object detection application.
Handles single image and batch detection.
"""
from PySide6.QtWidgets import (
QWidget,
QVBoxLayout,
QHBoxLayout,
QPushButton,
QLabel,
QComboBox,
QSlider,
QFileDialog,
QMessageBox,
QProgressBar,
QTextEdit,
QGroupBox,
QFormLayout,
)
from PySide6.QtCore import Qt, QThread, Signal
from pathlib import Path
from typing import Optional
from src.database.db_manager import DatabaseManager
from src.utils.config_manager import ConfigManager
from src.utils.logger import get_logger
from src.utils.file_utils import get_image_files
from src.model.inference import InferenceEngine
from src.utils.image import Image
logger = get_logger(__name__)
class DetectionWorker(QThread):
"""Worker thread for running detection."""
progress = Signal(int, int, str) # current, total, message
finished = Signal(list) # results
error = Signal(str) # error message
def __init__(self, engine, image_paths, repo_root, conf):
super().__init__()
self.engine = engine
self.image_paths = image_paths
self.repo_root = repo_root
self.conf = conf
def run(self):
"""Run detection in background thread."""
try:
results = self.engine.detect_batch(
self.image_paths, self.repo_root, self.conf, self.progress.emit
)
self.finished.emit(results)
except Exception as e:
logger.error(f"Detection error: {e}")
self.error.emit(str(e))
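# Note: batch inference runs inside this QThread so the GUI event loop stays
# responsive; results and errors are handed back to the main thread via the
# progress/finished/error signals instead of touching widgets from the worker.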
class DetectionTab(QWidget):
"""Detection tab for single image and batch detection."""
def __init__(
self, db_manager: DatabaseManager, config_manager: ConfigManager, parent=None
):
super().__init__(parent)
self.db_manager = db_manager
self.config_manager = config_manager
self.inference_engine = None
self.current_model_id = None
self._setup_ui()
self._connect_signals()
self._load_models()
def _setup_ui(self):
"""Setup user interface."""
layout = QVBoxLayout()
# Model selection group
model_group = QGroupBox("Model Selection")
model_layout = QFormLayout()
self.model_combo = QComboBox()
self.model_combo.addItem("No models available", None)
model_layout.addRow("Model:", self.model_combo)
model_group.setLayout(model_layout)
layout.addWidget(model_group)
# Detection settings group
settings_group = QGroupBox("Detection Settings")
settings_layout = QFormLayout()
# Confidence threshold
conf_layout = QHBoxLayout()
self.conf_slider = QSlider(Qt.Horizontal)
self.conf_slider.setRange(0, 100)
self.conf_slider.setValue(25)
self.conf_slider.setTickPosition(QSlider.TicksBelow)
self.conf_slider.setTickInterval(10)
conf_layout.addWidget(self.conf_slider)
self.conf_label = QLabel("0.25")
conf_layout.addWidget(self.conf_label)
settings_layout.addRow("Confidence:", conf_layout)
settings_group.setLayout(settings_layout)
layout.addWidget(settings_group)
# Action buttons
button_layout = QHBoxLayout()
self.single_image_btn = QPushButton("Detect Single Image")
self.single_image_btn.clicked.connect(self._detect_single_image)
button_layout.addWidget(self.single_image_btn)
self.batch_btn = QPushButton("Detect Batch (Folder)")
self.batch_btn.clicked.connect(self._detect_batch)
button_layout.addWidget(self.batch_btn)
layout.addLayout(button_layout)
# Progress bar
self.progress_bar = QProgressBar()
self.progress_bar.setVisible(False)
layout.addWidget(self.progress_bar)
# Results display
results_group = QGroupBox("Detection Results")
results_layout = QVBoxLayout()
self.results_text = QTextEdit()
self.results_text.setReadOnly(True)
self.results_text.setMaximumHeight(200)
results_layout.addWidget(self.results_text)
results_group.setLayout(results_layout)
layout.addWidget(results_group)
layout.addStretch()
self.setLayout(layout)
def _connect_signals(self):
"""Connect signals and slots."""
self.conf_slider.valueChanged.connect(self._update_confidence_label)
self.model_combo.currentIndexChanged.connect(self._on_model_changed)
def _load_models(self):
"""Load available models from database and local storage."""
try:
self.model_combo.clear()
models = self.db_manager.get_models()
has_models = False
known_paths = set()
# Add base model option first (always available)
base_model = self.config_manager.get(
"models.default_base_model", "yolov8s-seg.pt"
)
if base_model:
base_data = {
"id": 0,
"path": base_model,
"model_name": Path(base_model).stem or "Base Model",
"model_version": "pretrained",
"base_model": base_model,
"source": "base",
}
self.model_combo.addItem(f"Base Model ({base_model})", base_data)
known_paths.add(self._normalize_model_path(base_model))
has_models = True
# Add trained models from database
for model in models:
display_name = f"{model['model_name']} v{model['model_version']}"
model_data = {**model, "path": model.get("model_path")}
normalized = self._normalize_model_path(model_data.get("path"))
if normalized:
known_paths.add(normalized)
self.model_combo.addItem(display_name, model_data)
has_models = True
# Discover local model files not yet in the database
local_models = self._discover_local_models()
for model_path in local_models:
normalized = self._normalize_model_path(model_path)
if normalized in known_paths:
continue
display_name = f"Local Model ({Path(model_path).stem})"
model_data = {
"id": None,
"path": str(model_path),
"model_name": Path(model_path).stem,
"model_version": "local",
"base_model": Path(model_path).stem,
"source": "local",
}
self.model_combo.addItem(display_name, model_data)
known_paths.add(normalized)
has_models = True
if not has_models:
self.model_combo.addItem("No models available", None)
self._set_buttons_enabled(False)
else:
self._set_buttons_enabled(True)
except Exception as e:
logger.error(f"Error loading models: {e}")
QMessageBox.warning(self, "Error", f"Failed to load models:\n{str(e)}")
def _on_model_changed(self, index: int):
"""Handle model selection change."""
model_data = self.model_combo.itemData(index)
if model_data and model_data["id"] != 0:
self.current_model_id = model_data["id"]
else:
self.current_model_id = None
def _update_confidence_label(self, value: int):
"""Update confidence label."""
conf = value / 100.0
self.conf_label.setText(f"{conf:.2f}")
def _detect_single_image(self):
"""Detect objects in a single image."""
# Get image file
repo_path = self.config_manager.get_image_repository_path()
start_dir = repo_path if repo_path else ""
file_path, _ = QFileDialog.getOpenFileName(
self,
"Select Image",
start_dir,
"Images (*" + " *".join(Image.SUPPORTED_EXTENSIONS) + ")",
)
if not file_path:
return
# Run detection
self._run_detection([file_path])
def _detect_batch(self):
"""Detect objects in batch (folder)."""
# Get folder
repo_path = self.config_manager.get_image_repository_path()
start_dir = repo_path if repo_path else ""
folder_path = QFileDialog.getExistingDirectory(self, "Select Folder", start_dir)
if not folder_path:
return
# Get all image files
allowed_ext = self.config_manager.get_allowed_extensions()
image_files = get_image_files(folder_path, allowed_ext, recursive=False)
if not image_files:
QMessageBox.information(
self, "No Images", "No image files found in selected folder."
)
return
# Confirm batch processing
reply = QMessageBox.question(
self,
"Confirm Batch Detection",
f"Process {len(image_files)} images?",
QMessageBox.Yes | QMessageBox.No,
)
if reply == QMessageBox.Yes:
self._run_detection(image_files)
def _run_detection(self, image_paths: list):
"""Run detection on image list."""
try:
# Get selected model
model_data = self.model_combo.currentData()
if not model_data:
QMessageBox.warning(self, "No Model", "Please select a model first.")
return
model_path = model_data.get("path")
if not model_path:
QMessageBox.warning(
self, "Invalid Model", "Selected model is missing a file path."
)
return
if not Path(model_path).exists():
QMessageBox.critical(
self,
"Model Not Found",
f"The selected model file could not be found:\n{model_path}",
)
return
model_id = model_data.get("id")
# Ensure we have a database entry for the selected model
if model_id in (None, 0):
model_id = self._ensure_model_record(model_data)
if not model_id:
QMessageBox.critical(
self,
"Model Registration Failed",
"Unable to register the selected model in the database.",
)
return
normalized_model_path = self._normalize_model_path(model_path) or model_path
# Create inference engine
self.inference_engine = InferenceEngine(
normalized_model_path, self.db_manager, model_id
)
# Get confidence threshold
conf = self.conf_slider.value() / 100.0
# Get repository root
repo_root = self.config_manager.get_image_repository_path()
if not repo_root:
repo_root = str(Path(image_paths[0]).parent)
# Show progress bar
self.progress_bar.setVisible(True)
self.progress_bar.setMaximum(len(image_paths))
self._set_buttons_enabled(False)
# Create and start worker thread
self.worker = DetectionWorker(
self.inference_engine, image_paths, repo_root, conf
)
self.worker.progress.connect(self._on_progress)
self.worker.finished.connect(self._on_detection_finished)
self.worker.error.connect(self._on_detection_error)
self.worker.start()
except Exception as e:
logger.error(f"Error starting detection: {e}")
QMessageBox.critical(self, "Error", f"Failed to start detection:\n{str(e)}")
self._set_buttons_enabled(True)
def _on_progress(self, current: int, total: int, message: str):
"""Handle progress update."""
self.progress_bar.setValue(current)
self.results_text.append(f"[{current}/{total}] {message}")
def _on_detection_finished(self, results: list):
"""Handle detection completion."""
self.progress_bar.setVisible(False)
self._set_buttons_enabled(True)
# Calculate statistics
total_detections = sum(r["count"] for r in results)
successful = sum(1 for r in results if r.get("success", False))
summary = f"\n=== Detection Complete ===\n"
summary += f"Processed: {len(results)} images\n"
summary += f"Successful: {successful}\n"
summary += f"Total detections: {total_detections}\n"
self.results_text.append(summary)
QMessageBox.information(
self,
"Detection Complete",
f"Processed {len(results)} images\n{total_detections} objects detected",
)
def _on_detection_error(self, error_msg: str):
"""Handle detection error."""
self.progress_bar.setVisible(False)
self._set_buttons_enabled(True)
self.results_text.append(f"\nERROR: {error_msg}")
QMessageBox.critical(self, "Detection Error", error_msg)
def _set_buttons_enabled(self, enabled: bool):
"""Enable/disable action buttons."""
self.single_image_btn.setEnabled(enabled)
self.batch_btn.setEnabled(enabled)
self.model_combo.setEnabled(enabled)
def _discover_local_models(self) -> list:
"""Scan the models directory for standalone .pt files."""
models_dir = self.config_manager.get_models_directory()
if not models_dir:
return []
models_path = Path(models_dir)
if not models_path.exists():
return []
try:
return sorted(
[p for p in models_path.rglob("*.pt") if p.is_file()],
key=lambda p: str(p).lower(),
)
except Exception as e:
logger.warning(f"Error discovering local models: {e}")
return []
def _normalize_model_path(self, path_value) -> str:
"""Return a normalized absolute path string for comparison."""
if not path_value:
return ""
try:
return str(Path(path_value).resolve())
except Exception:
return str(path_value)
def _ensure_model_record(self, model_data: dict) -> Optional[int]:
"""Ensure a database record exists for the selected model."""
model_path = model_data.get("path")
if not model_path:
return None
normalized_target = self._normalize_model_path(model_path)
try:
existing_models = self.db_manager.get_models()
for model in existing_models:
existing_path = model.get("model_path")
if not existing_path:
continue
normalized_existing = self._normalize_model_path(existing_path)
if (
normalized_existing == normalized_target
or existing_path == model_path
):
return model["id"]
model_name = (
model_data.get("model_name") or Path(model_path).stem or "Custom Model"
)
model_version = (
model_data.get("model_version") or model_data.get("source") or "local"
)
base_model = model_data.get(
"base_model",
self.config_manager.get("models.default_base_model", "yolov8s-seg.pt"),
)
return self.db_manager.add_model(
model_name=model_name,
model_version=model_version,
model_path=normalized_target,
base_model=base_model,
)
except Exception as e:
logger.error(f"Failed to ensure model record for {model_path}: {e}")
return None
def refresh(self):
"""Refresh the tab."""
self._load_models()
self.results_text.clear()

src/gui/tabs/results_tab.py

@@ -0,0 +1,439 @@
"""
Results tab for browsing stored detections and visualizing overlays.
"""
from pathlib import Path
from typing import Dict, List, Optional
from PySide6.QtWidgets import (
QWidget,
QVBoxLayout,
QHBoxLayout,
QLabel,
QGroupBox,
QPushButton,
QSplitter,
QTableWidget,
QTableWidgetItem,
QHeaderView,
QAbstractItemView,
QMessageBox,
QCheckBox,
)
from PySide6.QtCore import Qt
from src.database.db_manager import DatabaseManager
from src.utils.config_manager import ConfigManager
from src.utils.logger import get_logger
from src.utils.image import Image, ImageLoadError
from src.gui.widgets import AnnotationCanvasWidget
logger = get_logger(__name__)
class ResultsTab(QWidget):
"""Results tab showing detection history and preview overlays."""
def __init__(
self, db_manager: DatabaseManager, config_manager: ConfigManager, parent=None
):
super().__init__(parent)
self.db_manager = db_manager
self.config_manager = config_manager
self.detection_summary: List[Dict] = []
self.current_selection: Optional[Dict] = None
self.current_image: Optional[Image] = None
self.current_detections: List[Dict] = []
self._image_path_cache: Dict[str, str] = {}
self._setup_ui()
self.refresh()
def _setup_ui(self):
"""Setup user interface."""
layout = QVBoxLayout()
# Splitter for list + preview
splitter = QSplitter(Qt.Horizontal)
# Left pane: detection list
left_container = QWidget()
left_layout = QVBoxLayout()
left_layout.setContentsMargins(0, 0, 0, 0)
controls_layout = QHBoxLayout()
self.refresh_btn = QPushButton("Refresh")
self.refresh_btn.clicked.connect(self.refresh)
controls_layout.addWidget(self.refresh_btn)
controls_layout.addStretch()
left_layout.addLayout(controls_layout)
self.results_table = QTableWidget(0, 5)
self.results_table.setHorizontalHeaderLabels(
["Image", "Model", "Detections", "Classes", "Last Updated"]
)
self.results_table.horizontalHeader().setSectionResizeMode(
0, QHeaderView.Stretch
)
self.results_table.horizontalHeader().setSectionResizeMode(
1, QHeaderView.Stretch
)
self.results_table.horizontalHeader().setSectionResizeMode(
2, QHeaderView.ResizeToContents
)
self.results_table.horizontalHeader().setSectionResizeMode(
3, QHeaderView.Stretch
)
self.results_table.horizontalHeader().setSectionResizeMode(
4, QHeaderView.ResizeToContents
)
self.results_table.setSelectionBehavior(QAbstractItemView.SelectRows)
self.results_table.setSelectionMode(QAbstractItemView.SingleSelection)
self.results_table.setEditTriggers(QAbstractItemView.NoEditTriggers)
self.results_table.itemSelectionChanged.connect(self._on_result_selected)
left_layout.addWidget(self.results_table)
left_container.setLayout(left_layout)
# Right pane: preview canvas and controls
right_container = QWidget()
right_layout = QVBoxLayout()
right_layout.setContentsMargins(0, 0, 0, 0)
preview_group = QGroupBox("Detection Preview")
preview_layout = QVBoxLayout()
self.preview_canvas = AnnotationCanvasWidget()
self.preview_canvas.set_polyline_enabled(False)
self.preview_canvas.set_show_bboxes(True)
preview_layout.addWidget(self.preview_canvas)
toggles_layout = QHBoxLayout()
self.show_masks_checkbox = QCheckBox("Show Masks")
self.show_masks_checkbox.setChecked(False)
self.show_masks_checkbox.stateChanged.connect(self._apply_detection_overlays)
self.show_bboxes_checkbox = QCheckBox("Show Bounding Boxes")
self.show_bboxes_checkbox.setChecked(True)
self.show_bboxes_checkbox.stateChanged.connect(self._toggle_bboxes)
self.show_confidence_checkbox = QCheckBox("Show Confidence")
self.show_confidence_checkbox.setChecked(False)
self.show_confidence_checkbox.stateChanged.connect(
self._apply_detection_overlays
)
toggles_layout.addWidget(self.show_masks_checkbox)
toggles_layout.addWidget(self.show_bboxes_checkbox)
toggles_layout.addWidget(self.show_confidence_checkbox)
toggles_layout.addStretch()
preview_layout.addLayout(toggles_layout)
self.summary_label = QLabel("Select a detection result to preview.")
self.summary_label.setWordWrap(True)
preview_layout.addWidget(self.summary_label)
preview_group.setLayout(preview_layout)
right_layout.addWidget(preview_group)
right_container.setLayout(right_layout)
splitter.addWidget(left_container)
splitter.addWidget(right_container)
splitter.setStretchFactor(0, 1)
splitter.setStretchFactor(1, 2)
layout.addWidget(splitter)
self.setLayout(layout)
def refresh(self):
"""Refresh the detection list and preview."""
self._load_detection_summary()
self._populate_results_table()
self.current_selection = None
self.current_image = None
self.current_detections = []
self.preview_canvas.clear()
self.summary_label.setText("Select a detection result to preview.")
def _load_detection_summary(self):
"""Load latest detection summaries grouped by image + model."""
try:
detections = self.db_manager.get_detections(limit=500)
summary_map: Dict[tuple, Dict] = {}
for det in detections:
key = (det["image_id"], det["model_id"])
metadata = det.get("metadata") or {}
entry = summary_map.setdefault(
key,
{
"image_id": det["image_id"],
"model_id": det["model_id"],
"image_path": det.get("image_path"),
"image_filename": det.get("image_filename")
or det.get("image_path"),
"model_name": det.get("model_name", ""),
"model_version": det.get("model_version", ""),
"last_detected": det.get("detected_at"),
"count": 0,
"classes": set(),
"source_path": metadata.get("source_path"),
"repository_root": metadata.get("repository_root"),
},
)
entry["count"] += 1
if det.get("detected_at") and (
not entry.get("last_detected")
or str(det.get("detected_at")) > str(entry.get("last_detected"))
):
entry["last_detected"] = det.get("detected_at")
if det.get("class_name"):
entry["classes"].add(det["class_name"])
if metadata.get("source_path") and not entry.get("source_path"):
entry["source_path"] = metadata.get("source_path")
if metadata.get("repository_root") and not entry.get("repository_root"):
entry["repository_root"] = metadata.get("repository_root")
self.detection_summary = sorted(
summary_map.values(),
key=lambda x: str(x.get("last_detected") or ""),
reverse=True,
)
except Exception as e:
logger.error(f"Failed to load detection summary: {e}")
QMessageBox.critical(
self,
"Error",
f"Failed to load detection results:\n{str(e)}",
)
self.detection_summary = []
def _populate_results_table(self):
"""Populate the table widget with detection summaries."""
self.results_table.setRowCount(len(self.detection_summary))
for row, entry in enumerate(self.detection_summary):
model_label = f"{entry['model_name']} {entry['model_version']}".strip()
class_list = (
", ".join(sorted(entry["classes"])) if entry["classes"] else "-"
)
items = [
QTableWidgetItem(entry.get("image_filename", "")),
QTableWidgetItem(model_label),
QTableWidgetItem(str(entry.get("count", 0))),
QTableWidgetItem(class_list),
QTableWidgetItem(str(entry.get("last_detected") or "")),
]
for col, item in enumerate(items):
item.setData(Qt.UserRole, row)
self.results_table.setItem(row, col, item)
self.results_table.clearSelection()
def _on_result_selected(self):
"""Handle selection changes in the detection table."""
selected_items = self.results_table.selectedItems()
if not selected_items:
return
row = selected_items[0].data(Qt.UserRole)
if row is None or row >= len(self.detection_summary):
return
entry = self.detection_summary[row]
if (
self.current_selection
and self.current_selection.get("image_id") == entry["image_id"]
and self.current_selection.get("model_id") == entry["model_id"]
):
return
self.current_selection = entry
image_path = self._resolve_image_path(entry)
if not image_path:
QMessageBox.warning(
self,
"Image Not Found",
"Unable to locate the image file for this detection.",
)
return
try:
self.current_image = Image(image_path)
self.preview_canvas.load_image(self.current_image)
except ImageLoadError as e:
logger.error(f"Failed to load image '{image_path}': {e}")
QMessageBox.critical(
self,
"Image Error",
f"Failed to load image for preview:\n{str(e)}",
)
return
self._load_detections_for_selection(entry)
self._apply_detection_overlays()
self._update_summary_label(entry)
def _load_detections_for_selection(self, entry: Dict):
"""Load detection records for the selected image/model pair."""
self.current_detections = []
if not entry:
return
try:
filters = {"image_id": entry["image_id"], "model_id": entry["model_id"]}
self.current_detections = self.db_manager.get_detections(filters)
except Exception as e:
logger.error(f"Failed to load detections for preview: {e}")
QMessageBox.critical(
self,
"Error",
f"Failed to load detections for this image:\n{str(e)}",
)
self.current_detections = []
def _apply_detection_overlays(self):
"""Draw detections onto the preview canvas based on current toggles."""
self.preview_canvas.clear_annotations()
self.preview_canvas.set_show_bboxes(self.show_bboxes_checkbox.isChecked())
if not self.current_detections or not self.current_image:
return
for det in self.current_detections:
color = self._get_class_color(det.get("class_name"))
if self.show_masks_checkbox.isChecked() and det.get("segmentation_mask"):
mask_points = self._convert_mask(det["segmentation_mask"])
if mask_points:
self.preview_canvas.draw_saved_polyline(mask_points, color)
bbox = [
det.get("x_min"),
det.get("y_min"),
det.get("x_max"),
det.get("y_max"),
]
if all(v is not None for v in bbox):
label = None
if self.show_confidence_checkbox.isChecked():
confidence = det.get("confidence")
if confidence is not None:
label = f"{confidence:.2f}"
self.preview_canvas.draw_saved_bbox(bbox, color, label=label)
def _convert_mask(self, mask_points: List[List[float]]) -> List[List[float]]:
"""Convert stored [x, y] masks to [y, x] format for the canvas."""
converted = []
for point in mask_points:
if len(point) >= 2:
x, y = point[0], point[1]
converted.append([y, x])
return converted
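# Example (illustrative): a stored detection mask point [0.25, 0.75]
# (x_norm=0.25, y_norm=0.75) becomes [0.75, 0.25] in the [y, x] order that
# draw_saved_polyline() expects.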
def _toggle_bboxes(self):
"""Update bounding box visibility on the canvas."""
self.preview_canvas.set_show_bboxes(self.show_bboxes_checkbox.isChecked())
# Re-render to respect show/hide when toggled
self._apply_detection_overlays()
def _update_summary_label(self, entry: Dict):
"""Display textual summary for the selected detection run."""
classes = ", ".join(sorted(entry.get("classes", []))) or "-"
summary_text = (
f"Image: {entry.get('image_filename', 'unknown')}\n"
f"Model: {entry.get('model_name', '')} {entry.get('model_version', '')}\n"
f"Detections: {entry.get('count', 0)}\n"
f"Classes: {classes}\n"
f"Last Updated: {entry.get('last_detected', 'n/a')}"
)
self.summary_label.setText(summary_text)
def _resolve_image_path(self, entry: Dict) -> Optional[str]:
"""Resolve an image path using metadata, cache, and repository hints."""
relative_path = entry.get("image_path") if entry else None
cache_key = relative_path or entry.get("source_path")
if cache_key and cache_key in self._image_path_cache:
cached = Path(self._image_path_cache[cache_key])
if cached.exists():
return self._image_path_cache[cache_key]
del self._image_path_cache[cache_key]
candidates = []
source_path = entry.get("source_path") if entry else None
if source_path:
candidates.append(Path(source_path))
repo_roots = []
if entry.get("repository_root"):
repo_roots.append(entry["repository_root"])
config_repo = self.config_manager.get_image_repository_path()
if config_repo:
repo_roots.append(config_repo)
for root in repo_roots:
if relative_path:
candidates.append(Path(root) / relative_path)
if relative_path:
candidates.append(Path(relative_path))
for candidate in candidates:
try:
if candidate and candidate.exists():
resolved = str(candidate.resolve())
if cache_key:
self._image_path_cache[cache_key] = resolved
return resolved
except Exception:
continue
# Fallback: search by filename in known roots
filename = Path(relative_path).name if relative_path else None
if filename:
search_roots = [Path(root) for root in repo_roots if root]
if not search_roots:
search_roots = [Path("data")]
match = self._search_in_roots(filename, search_roots)
if match:
resolved = str(match.resolve())
if cache_key:
self._image_path_cache[cache_key] = resolved
return resolved
return None
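# Example of the fallback chain above (all paths hypothetical): with
# relative_path = "plates/img_001.tif" and a repository root of
# "/data/microscopy", the method tries the cache, then any metadata
# source_path, then "/data/microscopy/plates/img_001.tif", then
# "plates/img_001.tif" itself, and finally searches the known roots for a
# file named "img_001.tif".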
def _search_in_roots(self, filename: str, roots: List[Path]) -> Optional[Path]:
"""Search for a file name within a list of root directories."""
for root in roots:
try:
if not root.exists():
continue
for candidate in root.rglob(filename):
return candidate
except Exception as e:
logger.debug(f"Error searching for {filename} in {root}: {e}")
return None
def _get_class_color(self, class_name: Optional[str]) -> str:
"""Return consistent color hex for a class name."""
if not class_name:
return "#FF6B6B"
color_map = self.config_manager.get_bbox_colors()
if class_name in color_map:
return color_map[class_name]
# Deterministic fallback color derived from the class name; a character
# checksum is used instead of hash(), whose value for strings is salted
# per process and would give different colors on each run
palette = [
"#FF6B6B",
"#4ECDC4",
"#FFD166",
"#1D3557",
"#F4A261",
"#E76F51",
]
return palette[sum(ord(ch) for ch in class_name) % len(palette)]

src/gui/tabs/training_tab.py

File diff suppressed because it is too large

@@ -0,0 +1,46 @@
"""
Validation tab for the microscopy object detection application.
"""
from PySide6.QtWidgets import QWidget, QVBoxLayout, QLabel, QGroupBox
from src.database.db_manager import DatabaseManager
from src.utils.config_manager import ConfigManager
class ValidationTab(QWidget):
"""Validation tab placeholder."""
def __init__(
self, db_manager: DatabaseManager, config_manager: ConfigManager, parent=None
):
super().__init__(parent)
self.db_manager = db_manager
self.config_manager = config_manager
self._setup_ui()
def _setup_ui(self):
"""Setup user interface."""
layout = QVBoxLayout()
group = QGroupBox("Validation")
group_layout = QVBoxLayout()
label = QLabel(
"Validation functionality will be implemented here.\n\n"
"Features:\n"
"- Model validation\n"
"- Metrics visualization\n"
"- Confusion matrix\n"
"- Precision-Recall curves"
)
group_layout.addWidget(label)
group.setLayout(group_layout)
layout.addWidget(group)
layout.addStretch()
self.setLayout(layout)
def refresh(self):
"""Refresh the tab."""
pass


@@ -0,0 +1,7 @@
"""GUI widgets for the microscopy object detection application."""
from src.gui.widgets.image_display_widget import ImageDisplayWidget
from src.gui.widgets.annotation_canvas_widget import AnnotationCanvasWidget
from src.gui.widgets.annotation_tools_widget import AnnotationToolsWidget
__all__ = ["ImageDisplayWidget", "AnnotationCanvasWidget", "AnnotationToolsWidget"]


@@ -0,0 +1,931 @@
"""
Annotation canvas widget for drawing annotations on images.
Currently supports a polyline drawing tool with color selection for manual annotation.
"""
import numpy as np
import math
from PySide6.QtWidgets import QWidget, QVBoxLayout, QLabel, QScrollArea
from PySide6.QtGui import (
QPixmap,
QImage,
QPainter,
QPen,
QColor,
QKeyEvent,
QMouseEvent,
QPaintEvent,
QPolygonF,
)
from PySide6.QtCore import Qt, QEvent, Signal, QPoint, QPointF, QRect
from typing import Any, Dict, List, Optional, Tuple
from src.utils.image import Image, ImageLoadError
from src.utils.logger import get_logger
logger = get_logger(__name__)
def perpendicular_distance(
point: Tuple[float, float],
start: Tuple[float, float],
end: Tuple[float, float],
) -> float:
"""Perpendicular distance from `point` to the line defined by `start`->`end`."""
(x, y), (x1, y1), (x2, y2) = point, start, end
dx = x2 - x1
dy = y2 - y1
if dx == 0.0 and dy == 0.0:
return math.hypot(x - x1, y - y1)
num = abs(dy * x - dx * y + x2 * y1 - y2 * x1)
den = math.hypot(dx, dy)
return num / den
def rdp(points: List[Tuple[float, float]], epsilon: float) -> List[Tuple[float, float]]:
"""
Recursive Ramer-Douglas-Peucker (RDP) polyline simplification.
Args:
points: List of (x, y) points.
epsilon: Maximum allowed perpendicular distance in pixels.
Returns:
Simplified list of (x, y) points including first and last.
"""
if len(points) <= 2:
return list(points)
start = points[0]
end = points[-1]
max_dist = -1.0
index = -1
for i in range(1, len(points) - 1):
d = perpendicular_distance(points[i], start, end)
if d > max_dist:
max_dist = d
index = i
if max_dist > epsilon:
# Recursive split
left = rdp(points[: index + 1], epsilon)
right = rdp(points[index:], epsilon)
# Concatenate but avoid duplicate at split point
return left[:-1] + right
# Keep only start and end
return [start, end]
def simplify_polyline(
points: List[Tuple[float, float]], epsilon: float
) -> List[Tuple[float, float]]:
"""
Simplify a polyline with RDP while preserving closure semantics.
If the polyline is closed (first == last), the duplicate last point is removed
before simplification and then re-added after simplification.
"""
if not points:
return []
pts = [(float(x), float(y)) for x, y in points]
closed = False
if len(pts) >= 2 and pts[0] == pts[-1]:
closed = True
pts = pts[:-1] # remove duplicate last for simplification
if len(pts) <= 2:
simplified = list(pts)
else:
simplified = rdp(pts, epsilon)
if closed and simplified:
if simplified[0] != simplified[-1]:
simplified.append(simplified[0])
return simplified
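# A minimal, self-contained sketch (not called anywhere in the application)
# showing how epsilon drives the RDP helpers above; the coordinates are
# illustrative values only.
def _simplify_polyline_example() -> None:
    path = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.0)]
    # The middle point deviates 0.1 px from the start-end chord, so a tight
    # epsilon keeps the corner...
    assert simplify_polyline(path, epsilon=0.05) == path
    # ...while a looser epsilon collapses the path to its two endpoints.
    assert simplify_polyline(path, epsilon=0.5) == [(0.0, 0.0), (2.0, 0.0)]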
class AnnotationCanvasWidget(QWidget):
"""
Widget for displaying images and drawing annotations with zoom and drawing tools.
Features:
- Display images with zoom functionality
- Polyline tool for drawing annotations
- Configurable pen color and width
- Mouse-based drawing interface
- Zoom in/out with mouse wheel and keyboard
Signals:
zoom_changed: Emitted when zoom level changes (float zoom_scale)
annotation_drawn: Emitted when a new stroke is completed (list of points)
"""
zoom_changed = Signal(float)
annotation_drawn = Signal(list) # List of (x, y) points in normalized coordinates
# Emitted when the user selects existing polylines on the canvas.
# Carries a list of selected annotation IDs (ints), or None if the selection is cleared
annotation_selected = Signal(object)
def __init__(self, parent=None):
"""Initialize the annotation canvas widget."""
super().__init__(parent)
self.current_image = None
self.original_pixmap = None
self.annotation_pixmap = None # Overlay for annotations
self.zoom_scale = 1.0
self.zoom_min = 0.1
self.zoom_max = 10.0
self.zoom_step = 0.1
self.zoom_wheel_step = 0.15
# Drawing / interaction state
self.is_drawing = False
self.polyline_enabled = False
self.polyline_pen_color = QColor(255, 0, 0, 128) # Default red with 50% alpha
self.polyline_pen_width = 3
self.show_bboxes: bool = True # Control visibility of bounding boxes
# Current stroke and stored polylines (in image coordinates, pixel units)
self.current_stroke: List[Tuple[float, float]] = []
self.polylines: List[List[Tuple[float, float]]] = []
self.stroke_meta: List[Dict[str, Any]] = [] # per-polyline style (color, width)
# Optional DB annotation_id for each stored polyline (None for temporary / unsaved)
self.polyline_annotation_ids: List[Optional[int]] = []
# Indices in self.polylines of the currently selected polylines (multi-select)
self.selected_polyline_indices: List[int] = []
# Stored bounding boxes in normalized coordinates (x_min, y_min, x_max, y_max)
self.bboxes: List[List[float]] = []
self.bbox_meta: List[Dict[str, Any]] = [] # per-bbox style (color, width)
# Legacy collection of strokes in normalized coordinates (kept for API compatibility)
self.all_strokes: List[dict] = []
# RDP simplification parameters (in pixels)
self.simplify_on_finish: bool = True
self.simplify_epsilon: float = 2.0
self.sample_threshold: float = 2.0 # minimum movement to sample a new point
self._setup_ui()
def _setup_ui(self):
"""Setup user interface."""
layout = QVBoxLayout()
layout.setContentsMargins(0, 0, 0, 0)
# Scroll area for canvas
self.scroll_area = QScrollArea()
self.scroll_area.setWidgetResizable(True)
self.scroll_area.setMinimumHeight(400)
self.canvas_label = QLabel("No image loaded")
self.canvas_label.setAlignment(Qt.AlignCenter)
self.canvas_label.setStyleSheet(
"QLabel { background-color: #2b2b2b; color: #888; }"
)
self.canvas_label.setScaledContents(False)
self.canvas_label.setMouseTracking(True)
self.scroll_area.setWidget(self.canvas_label)
self.scroll_area.viewport().installEventFilter(self)
layout.addWidget(self.scroll_area)
self.setLayout(layout)
self.setFocusPolicy(Qt.StrongFocus)
def load_image(self, image: Image):
"""
Load and display an image.
Args:
image: Image object to display
"""
self.current_image = image
self.zoom_scale = 1.0
self.clear_annotations()
self._display_image()
logger.debug(
f"Loaded image into annotation canvas: {image.width}x{image.height}"
)
def clear(self):
"""Clear the displayed image and all annotations."""
self.current_image = None
self.original_pixmap = None
self.annotation_pixmap = None
self.zoom_scale = 1.0
self.clear_annotations()
self.canvas_label.setText("No image loaded")
self.canvas_label.setPixmap(QPixmap())
def clear_annotations(self):
"""Clear all drawn annotations."""
self.all_strokes = []
self.current_stroke = []
self.polylines = []
self.stroke_meta = []
self.polyline_annotation_ids = []
self.selected_polyline_indices = []
self.bboxes = []
self.bbox_meta = []
self.is_drawing = False
if self.annotation_pixmap:
self.annotation_pixmap.fill(Qt.transparent)
self._update_display()
def _display_image(self):
"""Display the current image in the canvas."""
if self.current_image is None:
return
try:
# Get image data in a format compatible with Qt
if self.current_image.channels in (3, 4):
image_data = self.current_image.get_rgb()
height, width = image_data.shape[:2]
else:
image_data = self.current_image.get_grayscale()
height, width = image_data.shape
image_data = np.ascontiguousarray(image_data)
bytes_per_line = image_data.strides[0]
qimage = QImage(
image_data.data,
width,
height,
bytes_per_line,
self.current_image.qtimage_format,
).copy() # Copy so Qt owns the buffer even after numpy array goes out of scope
self.original_pixmap = QPixmap.fromImage(qimage)
# Create transparent overlay for annotations
self.annotation_pixmap = QPixmap(self.original_pixmap.size())
self.annotation_pixmap.fill(Qt.transparent)
self._apply_zoom()
except Exception as e:
logger.error(f"Error displaying image: {e}")
raise ImageLoadError(f"Failed to display image: {str(e)}")
def _apply_zoom(self):
"""Apply current zoom level to the displayed image."""
if self.original_pixmap is None:
return
scaled_width = int(self.original_pixmap.width() * self.zoom_scale)
scaled_height = int(self.original_pixmap.height() * self.zoom_scale)
# Scale both image and annotations
scaled_image = self.original_pixmap.scaled(
scaled_width,
scaled_height,
Qt.KeepAspectRatio,
(
Qt.SmoothTransformation
if self.zoom_scale >= 1.0
else Qt.FastTransformation
),
)
scaled_annotations = self.annotation_pixmap.scaled(
scaled_width,
scaled_height,
Qt.KeepAspectRatio,
(
Qt.SmoothTransformation
if self.zoom_scale >= 1.0
else Qt.FastTransformation
),
)
# Composite image and annotations
combined = QPixmap(scaled_image.size())
painter = QPainter(combined)
painter.drawPixmap(0, 0, scaled_image)
painter.drawPixmap(0, 0, scaled_annotations)
painter.end()
self.canvas_label.setPixmap(combined)
self.canvas_label.setScaledContents(False)
self.canvas_label.adjustSize()
self.zoom_changed.emit(self.zoom_scale)
def _update_display(self):
"""Update display after drawing."""
self._apply_zoom()
def set_polyline_enabled(self, enabled: bool):
"""Enable or disable polyline tool."""
self.polyline_enabled = enabled
if enabled:
self.canvas_label.setCursor(Qt.CrossCursor)
else:
self.canvas_label.setCursor(Qt.ArrowCursor)
def set_polyline_pen_color(self, color: QColor):
"""Set polyline pen color."""
self.polyline_pen_color = color
def set_polyline_pen_width(self, width: int):
"""Set polyline pen width."""
self.polyline_pen_width = max(1, width)
def get_zoom_percentage(self) -> int:
"""Get current zoom level as percentage."""
return int(self.zoom_scale * 100)
def zoom_in(self):
"""Zoom in on the image."""
if self.original_pixmap is None:
return
new_scale = self.zoom_scale + self.zoom_step
if new_scale <= self.zoom_max:
self.zoom_scale = new_scale
self._apply_zoom()
def zoom_out(self):
"""Zoom out from the image."""
if self.original_pixmap is None:
return
new_scale = self.zoom_scale - self.zoom_step
if new_scale >= self.zoom_min:
self.zoom_scale = new_scale
self._apply_zoom()
def reset_zoom(self):
"""Reset zoom to 100%."""
if self.original_pixmap is None:
return
self.zoom_scale = 1.0
self._apply_zoom()
def _canvas_to_image_coords(self, pos: QPoint) -> Optional[Tuple[int, int]]:
"""Convert canvas coordinates to image coordinates, accounting for zoom and centering."""
if self.original_pixmap is None or self.canvas_label.pixmap() is None:
return None
# Get the displayed pixmap size (after zoom)
displayed_pixmap = self.canvas_label.pixmap()
displayed_width = displayed_pixmap.width()
displayed_height = displayed_pixmap.height()
# Calculate offset due to label centering (label might be larger than pixmap)
label_width = self.canvas_label.width()
label_height = self.canvas_label.height()
offset_x = max(0, (label_width - displayed_width) // 2)
offset_y = max(0, (label_height - displayed_height) // 2)
# Adjust position for offset and convert to image coordinates
x = (pos.x() - offset_x) / self.zoom_scale
y = (pos.y() - offset_y) / self.zoom_scale
# Check bounds
if (
0 <= x < self.original_pixmap.width()
and 0 <= y < self.original_pixmap.height()
):
return (int(x), int(y))
return None
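# Worked example (illustrative numbers): a 1000x800 image at zoom 2.0 gives a
# 2000x1600 pixmap; inside a 2200x1700 label the centering offsets are
# (100, 50), so a click at label position (1100, 850) maps to image pixel
# ((1100 - 100) / 2, (850 - 50) / 2) = (500, 400).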
def _find_polyline_at(
self, img_x: float, img_y: float, threshold_px: float = 5.0
) -> Optional[int]:
"""
Find index of polyline whose geometry is within threshold_px of (img_x, img_y).
Returns the index in self.polylines, or None if none is close enough.
"""
best_index: Optional[int] = None
best_dist: float = float("inf")
for idx, polyline in enumerate(self.polylines):
if len(polyline) < 2:
continue
# Quick bounding-box check to skip obviously distant polylines
xs = [p[0] for p in polyline]
ys = [p[1] for p in polyline]
if img_x < min(xs) - threshold_px or img_x > max(xs) + threshold_px:
continue
if img_y < min(ys) - threshold_px or img_y > max(ys) + threshold_px:
continue
# Precise distance to all segments
for (x1, y1), (x2, y2) in zip(polyline[:-1], polyline[1:]):
d = perpendicular_distance(
(img_x, img_y), (float(x1), float(y1)), (float(x2), float(y2))
)
if d < best_dist:
best_dist = d
best_index = idx
if best_index is not None and best_dist <= threshold_px:
return best_index
return None
def _image_to_normalized_coords(self, x: int, y: int) -> Tuple[float, float]:
"""Convert image coordinates to normalized coordinates (0-1)."""
if self.original_pixmap is None:
return (0.0, 0.0)
norm_x = x / self.original_pixmap.width()
norm_y = y / self.original_pixmap.height()
return (norm_x, norm_y)
def _add_polyline(
self,
img_points: List[Tuple[float, float]],
color: QColor,
width: int,
annotation_id: Optional[int] = None,
):
"""Store a polyline in image coordinates and redraw annotations."""
if not img_points or len(img_points) < 2:
return
# Ensure all points are (float, float) tuples in image pixel coordinates
float_points = [(float(x), float(y)) for x, y in img_points]
self.polylines.append(float_points)
self.stroke_meta.append({"color": QColor(color), "width": int(width)})
self.polyline_annotation_ids.append(annotation_id)
self._redraw_annotations()
def _redraw_annotations(self):
"""Redraw all stored polylines and (optionally) bounding boxes onto the annotation pixmap."""
if self.annotation_pixmap is None:
return
# Clear existing overlay
self.annotation_pixmap.fill(Qt.transparent)
painter = QPainter(self.annotation_pixmap)
# Draw polylines
for idx, (polyline, meta) in enumerate(zip(self.polylines, self.stroke_meta)):
pen_color: QColor = meta.get("color", self.polyline_pen_color)
width: int = meta.get("width", self.polyline_pen_width)
if idx in self.selected_polyline_indices:
# Highlight selected polylines in a distinct color / width
highlight_color = QColor(255, 255, 0, 200) # yellow, semi-opaque
pen = QPen(
highlight_color,
width + 1,
Qt.SolidLine,
Qt.RoundCap,
Qt.RoundJoin,
)
else:
pen = QPen(
pen_color,
width,
Qt.SolidLine,
Qt.RoundCap,
Qt.RoundJoin,
)
painter.setPen(pen)
# Use QPolygonF for efficient polygon rendering (single call vs N-1 calls)
# drawPolygon() automatically closes the shape, ensuring proper visual closure
polygon = QPolygonF([QPointF(x, y) for x, y in polyline])
painter.drawPolygon(polygon)
# Draw bounding boxes (dashed) if enabled
if self.show_bboxes and self.original_pixmap is not None and self.bboxes:
img_width = float(self.original_pixmap.width())
img_height = float(self.original_pixmap.height())
for bbox, meta in zip(self.bboxes, self.bbox_meta):
if len(bbox) != 4:
continue
x_min_norm, y_min_norm, x_max_norm, y_max_norm = bbox
x_min = int(x_min_norm * img_width)
y_min = int(y_min_norm * img_height)
x_max = int(x_max_norm * img_width)
y_max = int(y_max_norm * img_height)
rect_width = x_max - x_min
rect_height = y_max - y_min
pen_color: QColor = meta.get("color", QColor(255, 0, 0, 128))
width: int = meta.get("width", self.polyline_pen_width)
pen = QPen(
pen_color,
width,
Qt.DashLine,
Qt.SquareCap,
Qt.MiterJoin,
)
painter.setPen(pen)
painter.drawRect(x_min, y_min, rect_width, rect_height)
label_text = meta.get("label")
if label_text:
painter.save()
font = painter.font()
font.setPointSizeF(max(10.0, width + 4))
painter.setFont(font)
metrics = painter.fontMetrics()
text_width = metrics.horizontalAdvance(label_text)
text_height = metrics.height()
padding = 4
bg_width = text_width + padding * 2
bg_height = text_height + padding * 2
canvas_width = self.original_pixmap.width()
canvas_height = self.original_pixmap.height()
bg_x = max(0, min(x_min, canvas_width - bg_width))
bg_y = y_min - bg_height
if bg_y < 0:
bg_y = min(y_min, canvas_height - bg_height)
bg_y = max(0, bg_y)
background_rect = QRect(bg_x, bg_y, bg_width, bg_height)
background_color = QColor(pen_color)
background_color.setAlpha(220)
painter.fillRect(background_rect, background_color)
text_color = QColor(0, 0, 0)
if background_color.lightness() < 128:
text_color = QColor(255, 255, 255)
painter.setPen(text_color)
painter.drawText(
background_rect.adjusted(padding, padding, -padding, -padding),
Qt.AlignLeft | Qt.AlignVCenter,
label_text,
)
painter.restore()
painter.end()
self._update_display()
def mousePressEvent(self, event: QMouseEvent):
"""Handle mouse press events for drawing and selecting polylines."""
if self.annotation_pixmap is None:
super().mousePressEvent(event)
return
# Map click to image coordinates
label_pos = self.canvas_label.mapFromGlobal(event.globalPos())
img_coords = self._canvas_to_image_coords(label_pos)
# Left button + drawing tool enabled -> start a new stroke
if event.button() == Qt.LeftButton and self.polyline_enabled:
if img_coords:
self.is_drawing = True
self.current_stroke = [(float(img_coords[0]), float(img_coords[1]))]
return
# Left button + drawing tool disabled -> attempt selection of existing polyline
if event.button() == Qt.LeftButton and not self.polyline_enabled:
if img_coords:
idx = self._find_polyline_at(float(img_coords[0]), float(img_coords[1]))
if idx is not None:
if event.modifiers() & Qt.ShiftModifier:
# Multi-select mode: add to current selection (if not already selected)
if idx not in self.selected_polyline_indices:
self.selected_polyline_indices.append(idx)
else:
# Single-select mode: replace current selection
self.selected_polyline_indices = [idx]
# Build list of selected annotation IDs (ignore None entries)
selected_ids: List[int] = []
for sel_idx in self.selected_polyline_indices:
if 0 <= sel_idx < len(self.polyline_annotation_ids):
ann_id = self.polyline_annotation_ids[sel_idx]
if isinstance(ann_id, int):
selected_ids.append(ann_id)
if selected_ids:
self.annotation_selected.emit(selected_ids)
else:
# No valid DB-backed annotations in selection
self.annotation_selected.emit(None)
else:
# Clicked on empty space -> clear selection
self.selected_polyline_indices = []
self.annotation_selected.emit(None)
self._redraw_annotations()
return
# Fallback for other buttons / cases
super().mousePressEvent(event)
def mouseMoveEvent(self, event: QMouseEvent):
"""Handle mouse move events for drawing."""
if (
not self.is_drawing
or not self.polyline_enabled
or self.annotation_pixmap is None
):
super().mouseMoveEvent(event)
return
# Get accurate position using global coordinates
label_pos = self.canvas_label.mapFromGlobal(event.globalPos())
img_coords = self._canvas_to_image_coords(label_pos)
if img_coords and len(self.current_stroke) > 0:
last_point = self.current_stroke[-1]
dx = img_coords[0] - last_point[0]
dy = img_coords[1] - last_point[1]
# Only sample a new point if we moved enough pixels
if math.hypot(dx, dy) < self.sample_threshold:
return
# Draw line from last point to current point for interactive feedback
painter = QPainter(self.annotation_pixmap)
pen = QPen(
self.polyline_pen_color,
self.polyline_pen_width,
Qt.SolidLine,
Qt.RoundCap,
Qt.RoundJoin,
)
painter.setPen(pen)
painter.drawLine(
int(last_point[0]),
int(last_point[1]),
int(img_coords[0]),
int(img_coords[1]),
)
painter.end()
self.current_stroke.append((float(img_coords[0]), float(img_coords[1])))
self._update_display()
def mouseReleaseEvent(self, event: QMouseEvent):
"""Handle mouse release events to complete a stroke."""
if not self.is_drawing or event.button() != Qt.LeftButton:
super().mouseReleaseEvent(event)
return
self.is_drawing = False
if len(self.current_stroke) > 1 and self.original_pixmap is not None:
# Ensure the stroke is closed by connecting end -> start
raw_points = list(self.current_stroke)
if raw_points[0] != raw_points[-1]:
raw_points.append(raw_points[0])
# Optional RDP simplification (in image pixel space)
if self.simplify_on_finish:
simplified = simplify_polyline(raw_points, self.simplify_epsilon)
else:
simplified = raw_points
if len(simplified) >= 2:
# Store polyline and redraw all annotations
self._add_polyline(
simplified, self.polyline_pen_color, self.polyline_pen_width
)
# Convert to normalized coordinates for metadata + signal
normalized_stroke = [
self._image_to_normalized_coords(int(x), int(y))
for (x, y) in simplified
]
self.all_strokes.append(
{
"points": normalized_stroke,
"color": self.polyline_pen_color.name(),
"alpha": self.polyline_pen_color.alpha(),
"width": self.polyline_pen_width,
}
)
# Emit signal with normalized coordinates
self.annotation_drawn.emit(normalized_stroke)
logger.debug(
f"Completed stroke with {len(simplified)} points "
f"(normalized len={len(normalized_stroke)})"
)
self.current_stroke = []
def get_all_strokes(self) -> List[dict]:
"""Get all drawn strokes with metadata."""
return self.all_strokes
def get_annotation_parameters(self) -> Optional[List[Dict[str, Any]]]:
"""
Get all annotation parameters including bounding box and polyline.
Returns:
List of dictionaries, each containing:
- 'bbox': [x_min, y_min, x_max, y_max] in normalized image coordinates
- 'polyline': List of [y_norm, x_norm] points describing the polygon
"""
if self.original_pixmap is None or not self.polylines:
return None
img_width = float(self.original_pixmap.width())
img_height = float(self.original_pixmap.height())
params: List[Dict[str, Any]] = []
for idx, polyline in enumerate(self.polylines):
if len(polyline) < 2:
continue
xs = [p[0] for p in polyline]
ys = [p[1] for p in polyline]
x_min_norm = min(xs) / img_width
x_max_norm = max(xs) / img_width
y_min_norm = min(ys) / img_height
y_max_norm = max(ys) / img_height
# Store polyline as [y_norm, x_norm] to match DB convention and
# the expectations of draw_saved_polyline().
normalized_polyline = [
[y / img_height, x / img_width] for (x, y) in polyline
]
logger.debug(
f"Polyline {idx}: {len(polyline)} points, "
f"bbox=({x_min_norm:.3f}, {y_min_norm:.3f})-({x_max_norm:.3f}, {y_max_norm:.3f})"
)
params.append(
{
"bbox": [x_min_norm, y_min_norm, x_max_norm, y_max_norm],
"polyline": normalized_polyline,
}
)
return params or None
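# Example (illustrative): on a 1000x800 image, a polyline
# [(100, 40), (300, 40), (300, 160)] in pixel coordinates yields
# bbox = [0.1, 0.05, 0.3, 0.2] and polyline entries such as [0.05, 0.1]
# (i.e. [y/height, x/width]).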
def draw_saved_polyline(
self,
polyline: List[List[float]],
color: str,
width: int = 3,
annotation_id: Optional[int] = None,
):
"""
Draw a polyline from database coordinates onto the annotation canvas.
Args:
polyline: List of [y_norm, x_norm] coordinate pairs in normalized coordinates (0-1)
color: Color hex string (e.g., '#FF0000')
width: Line width in pixels
annotation_id: Optional database ID to associate with the drawn polyline
"""
if not self.annotation_pixmap or not self.original_pixmap:
logger.warning("Cannot draw polyline: no image loaded")
return
if len(polyline) < 2:
logger.warning("Polyline has less than 2 points, cannot draw")
return
# Convert normalized coordinates to image coordinates
# Polyline is stored as [[y_norm, x_norm], ...] (row_norm, col_norm format)
img_width = self.original_pixmap.width()
img_height = self.original_pixmap.height()
logger.debug(f"Loading polyline with {len(polyline)} points")
logger.debug(f" Image size: {img_width}x{img_height}")
logger.debug(f" First 3 normalized points from DB: {polyline[:3]}")
img_coords: List[Tuple[float, float]] = []
for y_norm, x_norm in polyline:
x = float(x_norm * img_width)
y = float(y_norm * img_height)
img_coords.append((x, y))
logger.debug(f" First 3 pixel coords: {img_coords[:3]}")
# Store and redraw using common pipeline
pen_color = QColor(color)
pen_color.setAlpha(128) # Add semi-transparency
self._add_polyline(img_coords, pen_color, width, annotation_id=annotation_id)
# Store in all_strokes for consistency (uses normalized coordinates)
self.all_strokes.append(
{"points": polyline, "color": color, "alpha": 128, "width": width}
)
logger.debug(
f"Drew saved polyline with {len(polyline)} points in color {color}"
)
def draw_saved_bbox(
self,
bbox: List[float],
color: str,
width: int = 3,
label: Optional[str] = None,
):
"""
Draw a bounding box from database coordinates onto the annotation canvas.
Args:
bbox: Bounding box as [x_min_norm, y_min_norm, x_max_norm, y_max_norm]
in normalized coordinates (0-1)
color: Color hex string (e.g., '#FF0000')
width: Line width in pixels
label: Optional text label to render near the bounding box
"""
if not self.annotation_pixmap or not self.original_pixmap:
logger.warning("Cannot draw bounding box: no image loaded")
return
if len(bbox) != 4:
logger.warning(
f"Invalid bounding box format: expected 4 values, got {len(bbox)}"
)
return
# Convert normalized coordinates to image coordinates (for logging/debug)
img_width = self.original_pixmap.width()
img_height = self.original_pixmap.height()
x_min_norm, y_min_norm, x_max_norm, y_max_norm = bbox
x_min = int(x_min_norm * img_width)
y_min = int(y_min_norm * img_height)
x_max = int(x_max_norm * img_width)
y_max = int(y_max_norm * img_height)
logger.debug(f"Drawing bounding box: {bbox}")
logger.debug(f" Image size: {img_width}x{img_height}")
logger.debug(f" Pixel coords: ({x_min}, {y_min}) to ({x_max}, {y_max})")
# Store bounding box (normalized) and its style; actual drawing happens
# in _redraw_annotations() together with all polylines.
pen_color = QColor(color)
pen_color.setAlpha(128) # Add semi-transparency
self.bboxes.append(
[float(x_min_norm), float(y_min_norm), float(x_max_norm), float(y_max_norm)]
)
self.bbox_meta.append({"color": pen_color, "width": int(width), "label": label})
# Store in all_strokes for consistency
self.all_strokes.append(
{"bbox": bbox, "color": color, "alpha": 128, "width": width, "label": label}
)
# Redraw overlay (polylines + all bounding boxes)
self._redraw_annotations()
logger.debug(f"Drew saved bounding box in color {color}")
def set_show_bboxes(self, show: bool):
"""
Enable or disable drawing of bounding boxes.
Args:
show: If True, draw bounding boxes; if False, hide them.
"""
self.show_bboxes = bool(show)
logger.debug(f"Set show_bboxes to {self.show_bboxes}")
self._redraw_annotations()
def keyPressEvent(self, event: QKeyEvent):
"""Handle keyboard events for zooming."""
if event.key() in (Qt.Key_Plus, Qt.Key_Equal):
self.zoom_in()
event.accept()
elif event.key() == Qt.Key_Minus:
self.zoom_out()
event.accept()
elif event.key() == Qt.Key_0 and event.modifiers() == Qt.ControlModifier:
self.reset_zoom()
event.accept()
else:
super().keyPressEvent(event)
def eventFilter(self, obj, event: QEvent) -> bool:
"""Event filter to capture wheel events for zooming."""
if event.type() == QEvent.Wheel:
wheel_event = event
if self.original_pixmap is not None:
delta = wheel_event.angleDelta().y()
if delta > 0:
new_scale = self.zoom_scale + self.zoom_wheel_step
if new_scale <= self.zoom_max:
self.zoom_scale = new_scale
self._apply_zoom()
else:
new_scale = self.zoom_scale - self.zoom_wheel_step
if new_scale >= self.zoom_min:
self.zoom_scale = new_scale
self._apply_zoom()
return True
return super().eventFilter(obj, event)
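For reference, a minimal, framework-free sketch of the coordinate convention used by this canvas: polylines are stored as [y_norm, x_norm] pairs while bounding boxes are stored as [x_min, y_min, x_max, y_max], all in the 0-1 range. The helper names below are illustrative only and not part of the file above.

# Illustrative sketch of the normalization round-trip (hypothetical helpers).
from typing import List, Tuple

def polyline_to_db(polyline: List[Tuple[float, float]], img_w: int, img_h: int) -> dict:
    """Pixel (x, y) points -> record with a normalized bbox and [y_norm, x_norm] points."""
    xs = [x for x, _ in polyline]
    ys = [y for _, y in polyline]
    return {
        "bbox": [min(xs) / img_w, min(ys) / img_h, max(xs) / img_w, max(ys) / img_h],
        "polyline": [[y / img_h, x / img_w] for (x, y) in polyline],
    }

def polyline_from_db(points: List[List[float]], img_w: int, img_h: int) -> List[Tuple[float, float]]:
    """Stored [y_norm, x_norm] points -> pixel (x, y) points, as draw_saved_polyline() does."""
    return [(x_norm * img_w, y_norm * img_h) for (y_norm, x_norm) in points]

pts = [(10.0, 20.0), (30.0, 40.0), (50.0, 25.0)]
record = polyline_to_db(pts, img_w=640, img_h=480)
restored = polyline_from_db(record["polyline"], 640, 480)
assert all(abs(a - b) < 1e-9 for p, q in zip(pts, restored) for a, b in zip(p, q))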

View File

@@ -0,0 +1,478 @@
"""
Annotation tools widget for controlling annotation parameters.
Includes polyline tool, color picker, class selection, and annotation management.
"""
from PySide6.QtWidgets import (
QWidget,
QVBoxLayout,
QHBoxLayout,
QLabel,
QGroupBox,
QPushButton,
QComboBox,
QSpinBox,
QDoubleSpinBox,
QCheckBox,
QColorDialog,
QInputDialog,
QMessageBox,
)
from PySide6.QtGui import QColor, QIcon, QPixmap, QPainter
from PySide6.QtCore import Qt, Signal
from typing import Optional, Dict
from src.database.db_manager import DatabaseManager
from src.utils.logger import get_logger
logger = get_logger(__name__)
class AnnotationToolsWidget(QWidget):
"""
Widget for annotation tool controls.
Features:
- Enable/disable polyline tool
- Color selection for polyline pen
- Object class selection
- Add new object classes
- Pen width control
- Clear annotations
Signals:
polyline_enabled_changed: Emitted when polyline tool is enabled/disabled (bool)
polyline_pen_color_changed: Emitted when polyline pen color changes (QColor)
polyline_pen_width_changed: Emitted when polyline pen width changes (int)
class_selected: Emitted when object class is selected (dict)
clear_annotations_requested: Emitted when clear button is pressed
"""
polyline_enabled_changed = Signal(bool)
polyline_pen_color_changed = Signal(QColor)
polyline_pen_width_changed = Signal(int)
simplify_on_finish_changed = Signal(bool)
simplify_epsilon_changed = Signal(float)
# Toggle visibility of bounding boxes on the canvas
show_bboxes_changed = Signal(bool)
class_selected = Signal(dict)
class_color_changed = Signal()
clear_annotations_requested = Signal()
# Request deletion of the currently selected annotation on the canvas
delete_selected_annotation_requested = Signal()
def __init__(self, db_manager: DatabaseManager, parent=None):
"""
Initialize annotation tools widget.
Args:
db_manager: Database manager instance
parent: Parent widget
"""
super().__init__(parent)
self.db_manager = db_manager
self.polyline_enabled = False
self.current_color = QColor(255, 0, 0, 128) # Red with 50% alpha
self.current_class = None
self._setup_ui()
self._load_object_classes()
def _setup_ui(self):
"""Setup user interface."""
layout = QVBoxLayout()
# Polyline Tool Group
polyline_group = QGroupBox("Polyline Tool")
polyline_layout = QVBoxLayout()
# Enable/Disable polyline tool
button_layout = QHBoxLayout()
self.polyline_toggle_btn = QPushButton("Start Drawing Polyline")
self.polyline_toggle_btn.setCheckable(True)
self.polyline_toggle_btn.clicked.connect(self._on_polyline_toggle)
button_layout.addWidget(self.polyline_toggle_btn)
polyline_layout.addLayout(button_layout)
# Polyline pen width control
width_layout = QHBoxLayout()
width_layout.addWidget(QLabel("Pen Width:"))
self.polyline_pen_width_spin = QSpinBox()
self.polyline_pen_width_spin.setMinimum(1)
self.polyline_pen_width_spin.setMaximum(20)
self.polyline_pen_width_spin.setValue(3)
self.polyline_pen_width_spin.valueChanged.connect(
self._on_polyline_pen_width_changed
)
width_layout.addWidget(self.polyline_pen_width_spin)
width_layout.addStretch()
polyline_layout.addLayout(width_layout)
# Simplification controls (RDP)
simplify_layout = QHBoxLayout()
self.simplify_checkbox = QCheckBox("Simplify on finish")
self.simplify_checkbox.setChecked(True)
self.simplify_checkbox.stateChanged.connect(self._on_simplify_toggle)
simplify_layout.addWidget(self.simplify_checkbox)
simplify_layout.addWidget(QLabel("epsilon (px):"))
self.eps_spin = QDoubleSpinBox()
self.eps_spin.setRange(0.0, 1000.0)
self.eps_spin.setSingleStep(0.5)
self.eps_spin.setValue(2.0)
self.eps_spin.valueChanged.connect(self._on_eps_change)
simplify_layout.addWidget(self.eps_spin)
simplify_layout.addStretch()
polyline_layout.addLayout(simplify_layout)
polyline_group.setLayout(polyline_layout)
layout.addWidget(polyline_group)
# Object Class Group
class_group = QGroupBox("Object Class")
class_layout = QVBoxLayout()
# Class selection dropdown
self.class_combo = QComboBox()
self.class_combo.currentIndexChanged.connect(self._on_class_selected)
class_layout.addWidget(self.class_combo)
# Add / manage classes
class_button_layout = QHBoxLayout()
self.add_class_btn = QPushButton("Add New Class")
self.add_class_btn.clicked.connect(self._on_add_class)
class_button_layout.addWidget(self.add_class_btn)
self.refresh_classes_btn = QPushButton("Refresh")
self.refresh_classes_btn.clicked.connect(self._load_object_classes)
class_button_layout.addWidget(self.refresh_classes_btn)
class_layout.addLayout(class_button_layout)
# Class color (associated with selected object class)
color_layout = QHBoxLayout()
color_layout.addWidget(QLabel("Class Color:"))
self.color_btn = QPushButton()
self.color_btn.setFixedSize(40, 30)
self.color_btn.clicked.connect(self._on_color_picker)
self._update_color_button()
color_layout.addWidget(self.color_btn)
color_layout.addStretch()
class_layout.addLayout(color_layout)
# Selected class info
self.class_info_label = QLabel("No class selected")
self.class_info_label.setWordWrap(True)
self.class_info_label.setStyleSheet(
"QLabel { color: #888; font-style: italic; }"
)
class_layout.addWidget(self.class_info_label)
class_group.setLayout(class_layout)
layout.addWidget(class_group)
# Actions Group
actions_group = QGroupBox("Actions")
actions_layout = QVBoxLayout()
# Show / hide bounding boxes
self.show_bboxes_checkbox = QCheckBox("Show bounding boxes")
self.show_bboxes_checkbox.setChecked(True)
self.show_bboxes_checkbox.stateChanged.connect(self._on_show_bboxes_toggle)
actions_layout.addWidget(self.show_bboxes_checkbox)
self.clear_btn = QPushButton("Clear All Annotations")
self.clear_btn.clicked.connect(self._on_clear_annotations)
actions_layout.addWidget(self.clear_btn)
# Delete currently selected annotation (enabled when a selection exists)
self.delete_selected_btn = QPushButton("Delete Selected Annotation")
self.delete_selected_btn.clicked.connect(self._on_delete_selected_annotation)
self.delete_selected_btn.setEnabled(False)
actions_layout.addWidget(self.delete_selected_btn)
actions_group.setLayout(actions_layout)
layout.addWidget(actions_group)
layout.addStretch()
self.setLayout(layout)
def _update_color_button(self):
"""Update the color button appearance with current color."""
pixmap = QPixmap(40, 30)
pixmap.fill(self.current_color)
# Add border
painter = QPainter(pixmap)
painter.setPen(Qt.black)
painter.drawRect(0, 0, pixmap.width() - 1, pixmap.height() - 1)
painter.end()
self.color_btn.setIcon(QIcon(pixmap))
self.color_btn.setStyleSheet(f"background-color: {self.current_color.name()};")
def _load_object_classes(self):
"""Load object classes from database and populate combo box."""
try:
classes = self.db_manager.get_object_classes()
# Clear and repopulate combo box
self.class_combo.clear()
self.class_combo.addItem("-- Select Class / Show All --", None)
for cls in classes:
self.class_combo.addItem(cls["class_name"], cls)
logger.debug(f"Loaded {len(classes)} object classes")
except Exception as e:
logger.error(f"Error loading object classes: {e}")
QMessageBox.warning(
self, "Error", f"Failed to load object classes:\n{str(e)}"
)
def _on_polyline_toggle(self, checked: bool):
"""Handle polyline tool enable/disable."""
self.polyline_enabled = checked
if checked:
self.polyline_toggle_btn.setText("Stop Drawing Polyline")
self.polyline_toggle_btn.setStyleSheet(
"QPushButton { background-color: #4CAF50; }"
)
else:
self.polyline_toggle_btn.setText("Start Drawing Polyline")
self.polyline_toggle_btn.setStyleSheet("")
self.polyline_enabled_changed.emit(self.polyline_enabled)
logger.debug(f"Polyline tool {'enabled' if checked else 'disabled'}")
def _on_polyline_pen_width_changed(self, width: int):
"""Handle polyline pen width changes."""
self.polyline_pen_width_changed.emit(width)
logger.debug(f"Polyline pen width changed to {width}")
def _on_simplify_toggle(self, state: int):
"""Handle simplify-on-finish checkbox toggle."""
enabled = bool(state)
self.simplify_on_finish_changed.emit(enabled)
logger.debug(f"Simplify on finish set to {enabled}")
def _on_eps_change(self, val: float):
"""Handle epsilon (RDP tolerance) value changes."""
epsilon = float(val)
self.simplify_epsilon_changed.emit(epsilon)
logger.debug(f"Simplification epsilon changed to {epsilon}")
def _on_show_bboxes_toggle(self, state: int):
"""Handle 'Show bounding boxes' checkbox toggle."""
show = bool(state)
self.show_bboxes_changed.emit(show)
logger.debug(f"Show bounding boxes set to {show}")
def _on_color_picker(self):
"""Open color picker dialog and update the selected object's class color."""
if not self.current_class:
QMessageBox.warning(
self,
"No Class Selected",
"Please select an object class before changing its color.",
)
return
# Use current class color (without alpha) as the base
base_color = QColor(self.current_class.get("color", self.current_color.name()))
color = QColorDialog.getColor(
base_color,
self,
"Select Class Color",
QColorDialog.ShowAlphaChannel, # Allow alpha in UI, but store RGB in DB
)
if not color.isValid():
return
# Normalize to opaque RGB for storage
new_color = QColor(color)
new_color.setAlpha(255)
hex_color = new_color.name()
try:
# Update in database
self.db_manager.update_object_class(
class_id=self.current_class["id"], color=hex_color
)
except Exception as e:
logger.error(f"Failed to update class color in database: {e}")
QMessageBox.critical(
self,
"Error",
f"Failed to update class color in database:\n{str(e)}",
)
return
# Update local class data and combo box item data
self.current_class["color"] = hex_color
current_index = self.class_combo.currentIndex()
if current_index >= 0:
self.class_combo.setItemData(current_index, dict(self.current_class))
# Update info label text
info_text = f"Class: {self.current_class['class_name']}\nColor: {hex_color}"
if self.current_class.get("description"):
info_text += f"\nDescription: {self.current_class['description']}"
self.class_info_label.setText(info_text)
# Use semi-transparent version for polyline pen / button preview
class_color = QColor(hex_color)
class_color.setAlpha(128)
self.current_color = class_color
self._update_color_button()
self.polyline_pen_color_changed.emit(class_color)
logger.debug(
f"Updated class '{self.current_class['class_name']}' color to "
f"{hex_color} (polyline pen alpha={class_color.alpha()})"
)
# Notify listeners (e.g., AnnotationTab) so they can reload/redraw
self.class_color_changed.emit()
def _on_class_selected(self, index: int):
"""Handle object class selection (including '-- Select Class --')."""
class_data = self.class_combo.currentData()
if class_data:
self.current_class = class_data
# Update info label
info_text = (
f"Class: {class_data['class_name']}\n" f"Color: {class_data['color']}"
)
if class_data.get("description"):
info_text += f"\nDescription: {class_data['description']}"
self.class_info_label.setText(info_text)
# Update polyline pen color to match class color with semi-transparency
class_color = QColor(class_data["color"])
if class_color.isValid():
# Add 50% alpha for semi-transparency
class_color.setAlpha(128)
self.current_color = class_color
self._update_color_button()
self.polyline_pen_color_changed.emit(class_color)
self.class_selected.emit(class_data)
logger.debug(f"Selected class: {class_data['class_name']}")
else:
# "-- Select Class --" chosen: clear current class and show all annotations
self.current_class = None
self.class_info_label.setText("No class selected")
self.class_selected.emit(None)
logger.debug("Class selection cleared: showing annotations for all classes")
def _on_add_class(self):
"""Handle adding a new object class."""
# Get class name
class_name, ok = QInputDialog.getText(
self, "Add Object Class", "Enter class name:"
)
if not ok or not class_name.strip():
return
class_name = class_name.strip()
# Check if class already exists
existing = self.db_manager.get_object_class_by_name(class_name)
if existing:
QMessageBox.warning(
self, "Class Exists", f"A class named '{class_name}' already exists."
)
return
# Get color
color = QColorDialog.getColor(self.current_color, self, "Select Class Color")
if not color.isValid():
return
# Get optional description
description, ok = QInputDialog.getText(
self, "Class Description", "Enter class description (optional):"
)
if not ok:
description = None
# Add to database
try:
class_id = self.db_manager.add_object_class(
class_name, color.name(), description.strip() if description else None
)
logger.info(f"Added new object class: {class_name} (ID: {class_id})")
# Reload classes and select the new one
self._load_object_classes()
# Find and select the newly added class
for i in range(self.class_combo.count()):
class_data = self.class_combo.itemData(i)
if class_data and class_data.get("id") == class_id:
self.class_combo.setCurrentIndex(i)
break
QMessageBox.information(
self, "Success", f"Class '{class_name}' added successfully!"
)
except Exception as e:
logger.error(f"Error adding object class: {e}")
QMessageBox.critical(
self, "Error", f"Failed to add object class:\n{str(e)}"
)
def _on_clear_annotations(self):
"""Handle clear annotations button."""
reply = QMessageBox.question(
self,
"Clear Annotations",
"Are you sure you want to clear all annotations?",
QMessageBox.Yes | QMessageBox.No,
QMessageBox.No,
)
if reply == QMessageBox.Yes:
self.clear_annotations_requested.emit()
logger.debug("Clear annotations requested")
def _on_delete_selected_annotation(self):
"""Handle delete selected annotation button."""
self.delete_selected_annotation_requested.emit()
logger.debug("Delete selected annotation requested")
def set_has_selected_annotation(self, has_selection: bool):
"""
Enable/disable actions that require a selected annotation.
Args:
has_selection: True if an annotation is currently selected on the canvas.
"""
self.delete_selected_btn.setEnabled(bool(has_selection))
def get_current_class(self) -> Optional[Dict]:
"""Get currently selected object class."""
return self.current_class
def get_polyline_pen_color(self) -> QColor:
"""Get current polyline pen color."""
return self.current_color
def get_polyline_pen_width(self) -> int:
"""Get current polyline pen width."""
return self.polyline_pen_width_spin.value()
def is_polyline_enabled(self) -> bool:
"""Check if polyline tool is enabled."""
return self.polyline_enabled
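A minimal usage sketch for AnnotationToolsWidget. The DatabaseManager constructor signature and the widget's import path are assumptions (they are not shown above) and may need adjusting.

import sys
from PySide6.QtWidgets import QApplication
from src.database.db_manager import DatabaseManager
# Import path assumed for illustration; adjust to the module that defines the widget.
from src.gui.annotation_tools import AnnotationToolsWidget

app = QApplication(sys.argv)
db = DatabaseManager("data/detections.db")  # constructor signature assumed
tools = AnnotationToolsWidget(db)

# React to tool state changes emitted by the widget.
tools.polyline_enabled_changed.connect(lambda on: print(f"polyline tool enabled: {on}"))
tools.polyline_pen_width_changed.connect(lambda w: print(f"pen width: {w}px"))
tools.class_selected.connect(lambda cls: print(f"selected class: {cls}"))

tools.show()
sys.exit(app.exec())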

View File

@@ -0,0 +1,282 @@
"""
Image display widget with zoom functionality for the microscopy object detection application.
Reusable widget for displaying images with zoom controls.
"""
from PySide6.QtWidgets import QWidget, QVBoxLayout, QLabel, QScrollArea
from PySide6.QtGui import QPixmap, QImage, QKeyEvent
from PySide6.QtCore import Qt, QEvent, Signal
from pathlib import Path
import numpy as np
from src.utils.image import Image, ImageLoadError
from src.utils.logger import get_logger
logger = get_logger(__name__)
class ImageDisplayWidget(QWidget):
"""
Reusable widget for displaying images with zoom functionality.
Features:
- Display images from Image objects
- Zoom in/out with mouse wheel
- Zoom in/out with +/- keyboard keys
- Reset zoom with Ctrl+0
- Scroll area for large images
Signals:
zoom_changed: Emitted when zoom level changes (float zoom_scale)
"""
zoom_changed = Signal(float) # Emitted when zoom level changes
def __init__(self, parent=None):
"""
Initialize the image display widget.
Args:
parent: Parent widget
"""
super().__init__(parent)
self.current_image = None
self.original_pixmap = None # Store original pixmap for zoom
self.zoom_scale = 1.0 # Current zoom scale
self.zoom_min = 0.1 # Minimum zoom (10%)
self.zoom_max = 10.0 # Maximum zoom (1000%)
self.zoom_step = 0.1 # Zoom step for +/- keys
self.zoom_wheel_step = 0.15 # Zoom step for mouse wheel
self._setup_ui()
def _setup_ui(self):
"""Setup user interface."""
layout = QVBoxLayout()
layout.setContentsMargins(0, 0, 0, 0)
# Scroll area for image
self.scroll_area = QScrollArea()
self.scroll_area.setWidgetResizable(True)
self.scroll_area.setMinimumHeight(400)
self.image_label = QLabel("No image loaded")
self.image_label.setAlignment(Qt.AlignCenter)
self.image_label.setStyleSheet(
"QLabel { background-color: #2b2b2b; color: #888; }"
)
self.image_label.setScaledContents(False)
# Enable mouse tracking for wheel events
self.image_label.setMouseTracking(True)
self.scroll_area.setWidget(self.image_label)
# Install event filter to capture wheel events on scroll area
self.scroll_area.viewport().installEventFilter(self)
layout.addWidget(self.scroll_area)
self.setLayout(layout)
# Set focus policy to receive keyboard events
self.setFocusPolicy(Qt.StrongFocus)
def load_image(self, image: Image):
"""
Load and display an image.
Args:
image: Image object to display
Raises:
ImageLoadError: If image cannot be displayed
"""
self.current_image = image
# Reset zoom when loading new image
self.zoom_scale = 1.0
# Convert to QPixmap and display
self._display_image()
logger.debug(f"Loaded image into display widget: {image.width}x{image.height}")
def clear(self):
"""Clear the displayed image."""
self.current_image = None
self.original_pixmap = None
self.zoom_scale = 1.0
self.image_label.setText("No image loaded")
self.image_label.setPixmap(QPixmap())
logger.debug("Cleared image display")
def _display_image(self):
"""Display the current image in the image label."""
if self.current_image is None:
return
try:
# Get RGB image data
if self.current_image.channels == 3:
image_data = self.current_image.get_rgb()
height, width, channels = image_data.shape
else:
image_data = self.current_image.get_grayscale()
height, width = image_data.shape
channels = 1
# Ensure data is contiguous for proper QImage display
image_data = np.ascontiguousarray(image_data)
# Use actual stride from numpy array for correct display
bytes_per_line = image_data.strides[0]
qimage = QImage(
image_data.data,
width,
height,
bytes_per_line,
self.current_image.qtimage_format,
).copy() # Copy to ensure Qt owns its memory after this scope
# Convert to pixmap
pixmap = QPixmap.fromImage(qimage)
# Store original pixmap for zooming
self.original_pixmap = pixmap
# Apply zoom and display
self._apply_zoom()
except Exception as e:
logger.error(f"Error displaying image: {e}")
raise ImageLoadError(f"Failed to display image: {str(e)}")
def _apply_zoom(self):
"""Apply current zoom level to the displayed image."""
if self.original_pixmap is None:
return
# Calculate scaled size
scaled_width = int(self.original_pixmap.width() * self.zoom_scale)
scaled_height = int(self.original_pixmap.height() * self.zoom_scale)
# Scale pixmap
scaled_pixmap = self.original_pixmap.scaled(
scaled_width,
scaled_height,
Qt.KeepAspectRatio,
(
Qt.SmoothTransformation
if self.zoom_scale >= 1.0
else Qt.FastTransformation
),
)
# Display in label
self.image_label.setPixmap(scaled_pixmap)
self.image_label.setScaledContents(False)
self.image_label.adjustSize()
# Emit zoom changed signal
self.zoom_changed.emit(self.zoom_scale)
def zoom_in(self):
"""Zoom in on the image."""
if self.original_pixmap is None:
return
new_scale = self.zoom_scale + self.zoom_step
if new_scale <= self.zoom_max:
self.zoom_scale = new_scale
self._apply_zoom()
logger.debug(f"Zoomed in to {int(self.zoom_scale * 100)}%")
def zoom_out(self):
"""Zoom out from the image."""
if self.original_pixmap is None:
return
new_scale = self.zoom_scale - self.zoom_step
if new_scale >= self.zoom_min:
self.zoom_scale = new_scale
self._apply_zoom()
logger.debug(f"Zoomed out to {int(self.zoom_scale * 100)}%")
def reset_zoom(self):
"""Reset zoom to 100%."""
if self.original_pixmap is None:
return
self.zoom_scale = 1.0
self._apply_zoom()
logger.debug("Reset zoom to 100%")
def set_zoom(self, scale: float):
"""
Set zoom to a specific scale.
Args:
scale: Zoom scale (1.0 = 100%)
"""
if self.original_pixmap is None:
return
# Clamp to min/max
scale = max(self.zoom_min, min(self.zoom_max, scale))
self.zoom_scale = scale
self._apply_zoom()
logger.debug(f"Set zoom to {int(self.zoom_scale * 100)}%")
def get_zoom_percentage(self) -> int:
"""
Get current zoom level as percentage.
Returns:
Zoom level as integer percentage (e.g., 100 for 100%)
"""
return int(self.zoom_scale * 100)
def keyPressEvent(self, event: QKeyEvent):
"""Handle keyboard events for zooming."""
if event.key() in (Qt.Key_Plus, Qt.Key_Equal):
# + or = key (= is the unshifted + on many keyboards)
self.zoom_in()
event.accept()
elif event.key() == Qt.Key_Minus:
# - key
self.zoom_out()
event.accept()
elif event.key() == Qt.Key_0 and event.modifiers() == Qt.ControlModifier:
# Ctrl+0 to reset zoom
self.reset_zoom()
event.accept()
else:
super().keyPressEvent(event)
def eventFilter(self, obj, event: QEvent) -> bool:
"""Event filter to capture wheel events for zooming."""
if event.type() == QEvent.Wheel:
wheel_event = event
if self.original_pixmap is not None:
# Get wheel angle delta
delta = wheel_event.angleDelta().y()
# Zoom in/out based on wheel direction
if delta > 0:
# Scroll up = zoom in
new_scale = self.zoom_scale + self.zoom_wheel_step
if new_scale <= self.zoom_max:
self.zoom_scale = new_scale
self._apply_zoom()
else:
# Scroll down = zoom out
new_scale = self.zoom_scale - self.zoom_wheel_step
if new_scale >= self.zoom_min:
self.zoom_scale = new_scale
self._apply_zoom()
return True # Event handled
return super().eventFilter(obj, event)
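A minimal usage sketch for ImageDisplayWidget. The widget's import path and the sample file name are placeholders.

import sys
from PySide6.QtWidgets import QApplication
from src.utils.image import Image
# Import path assumed for illustration; adjust to the module that defines the widget.
from src.gui.image_display import ImageDisplayWidget

app = QApplication(sys.argv)
viewer = ImageDisplayWidget()
viewer.zoom_changed.connect(lambda scale: print(f"zoom: {int(scale * 100)}%"))
viewer.load_image(Image("path/to/sample.tif"))  # any supported extension works
viewer.set_zoom(2.0)  # 200%, clamped to the widget's min/max range
viewer.show()
sys.exit(app.exec())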

49
src/gui_launcher.py Normal file
View File

@@ -0,0 +1,49 @@
"""GUI launcher module for microscopy object detection application."""
import sys
from pathlib import Path
from PySide6.QtWidgets import QApplication
from PySide6.QtCore import Qt
from src import __version__
from src.gui.main_window import MainWindow
from src.utils.logger import setup_logging
from src.utils.config_manager import ConfigManager
def main():
"""Launch the GUI application."""
# Setup logging
config_manager = ConfigManager()
log_config = config_manager.get_section("logging")
setup_logging(
log_file=log_config.get("file", "logs/app.log"),
level=log_config.get("level", "INFO"),
log_format=log_config.get("format"),
)
# Enable High DPI scaling
QApplication.setHighDpiScaleFactorRoundingPolicy(
Qt.HighDpiScaleFactorRoundingPolicy.PassThrough
)
# Create Qt application
app = QApplication(sys.argv)
app.setApplicationName("Microscopy Object Detection")
app.setOrganizationName("MicroscopyLab")
app.setApplicationVersion(__version__)
# Set application style
app.setStyle("Fusion")
# Create and show main window
window = MainWindow()
window.show()
# Run application
sys.exit(app.exec())
if __name__ == "__main__":
main()
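A thin top-level entry script that delegates to this launcher might look like the sketch below; the file name and location are assumptions.

"""main.py (sketch): thin wrapper that starts the GUI via the launcher above."""
from src.gui_launcher import main

if __name__ == "__main__":
    main()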

0
src/model/__init__.py Normal file
View File

384
src/model/inference.py Normal file
View File

@@ -0,0 +1,384 @@
"""
Inference engine for the microscopy object detection application.
Handles detection inference and result storage.
"""
from typing import List, Dict, Optional, Callable, Any
from pathlib import Path
import cv2
import numpy as np
from src.model.yolo_wrapper import YOLOWrapper
from src.database.db_manager import DatabaseManager
from src.utils.image import Image
from src.utils.logger import get_logger
from src.utils.file_utils import get_relative_path
logger = get_logger(__name__)
class InferenceEngine:
"""Handles detection inference and result storage."""
def __init__(self, model_path: str, db_manager: DatabaseManager, model_id: int):
"""
Initialize inference engine.
Args:
model_path: Path to YOLO model weights
db_manager: Database manager instance
model_id: ID of the model in database
"""
self.yolo = YOLOWrapper(model_path)
self.yolo.load_model()
self.db_manager = db_manager
self.model_id = model_id
logger.info(f"InferenceEngine initialized with model_id {model_id}")
def detect_single(
self,
image_path: str,
relative_path: str,
conf: float = 0.25,
save_to_db: bool = True,
repository_root: Optional[str] = None,
) -> Dict:
"""
Detect objects in a single image.
Args:
image_path: Absolute path to image file
relative_path: Relative path from repository root
conf: Confidence threshold
save_to_db: Whether to save results to database
repository_root: Base directory used to compute relative_path (if known)
Returns:
Dictionary with detection results
"""
try:
# Normalize storage path (fall back to absolute path when repo root is unknown)
stored_relative_path = relative_path
if not repository_root:
stored_relative_path = str(Path(image_path).resolve())
# Get image dimensions
img = Image(image_path)
width = img.width
height = img.height
# Perform detection
detections = self.yolo.predict(image_path, conf=conf)
# Add/get image in database
image_id = self.db_manager.get_or_create_image(
relative_path=stored_relative_path,
filename=Path(image_path).name,
width=width,
height=height,
)
inserted_count = 0
deleted_count = 0
# Save detections to database, replacing any previous results for this image/model
if save_to_db:
deleted_count = self.db_manager.delete_detections_for_image(
image_id, self.model_id
)
if detections:
detection_records = []
for det in detections:
# Use normalized bbox from detection
bbox_normalized = det[
"bbox_normalized"
] # [x_min, y_min, x_max, y_max]
metadata = {
"class_id": det["class_id"],
"source_path": str(Path(image_path).resolve()),
}
if repository_root:
metadata["repository_root"] = str(
Path(repository_root).resolve()
)
record = {
"image_id": image_id,
"model_id": self.model_id,
"class_name": det["class_name"],
"bbox": tuple(bbox_normalized),
"confidence": det["confidence"],
"segmentation_mask": det.get("segmentation_mask"),
"metadata": metadata,
}
detection_records.append(record)
inserted_count = self.db_manager.add_detections_batch(
detection_records
)
logger.info(
f"Saved {inserted_count} detections to database (replaced {deleted_count})"
)
else:
logger.info(
f"Detection run removed {deleted_count} stale entries but produced no new detections"
)
return {
"success": True,
"image_path": image_path,
"image_id": image_id,
"detections": detections,
"count": len(detections),
}
except Exception as e:
logger.error(f"Error detecting objects in {image_path}: {e}")
return {
"success": False,
"image_path": image_path,
"error": str(e),
"detections": [],
"count": 0,
}
def detect_batch(
self,
image_paths: List[str],
repository_root: str,
conf: float = 0.25,
progress_callback: Optional[Callable[[int, int, str], None]] = None,
) -> List[Dict]:
"""
Detect objects in multiple images.
Args:
image_paths: List of absolute image paths
repository_root: Root directory for relative paths
conf: Confidence threshold
progress_callback: Optional callback(current, total, message)
Returns:
List of detection result dictionaries
"""
results = []
total = len(image_paths)
logger.info(f"Starting batch detection on {total} images")
for i, image_path in enumerate(image_paths, 1):
# Calculate relative path
rel_path = get_relative_path(image_path, repository_root)
# Perform detection
result = self.detect_single(
image_path,
rel_path,
conf=conf,
repository_root=repository_root,
)
results.append(result)
# Update progress
if progress_callback:
progress_callback(i, total, f"Processed {rel_path}")
if i % 10 == 0:
logger.info(f"Processed {i}/{total} images")
logger.info(f"Batch detection complete: {total} images processed")
return results
def detect_with_visualization(
self,
image_path: str,
conf: float = 0.25,
bbox_thickness: int = 2,
bbox_colors: Optional[Dict[str, str]] = None,
draw_masks: bool = True,
) -> tuple:
"""
Detect objects and return annotated image.
Args:
image_path: Path to image
conf: Confidence threshold
bbox_thickness: Thickness of bounding boxes
bbox_colors: Dictionary mapping class names to hex colors
draw_masks: Whether to draw segmentation masks (if available)
Returns:
Tuple of (detections, annotated_image_array)
"""
try:
detections = self.yolo.predict(image_path, conf=conf)
# Load image
img = cv2.imread(image_path)
if img is None:
raise ValueError(f"Failed to load image: {image_path}")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
height, width = img.shape[:2]
# Default colors if not provided
if bbox_colors is None:
bbox_colors = {}
default_color = self._hex_to_bgr(bbox_colors.get("default", "#00FF00"))
# Draw detections
for det in detections:
# Get color for this class
class_name = det["class_name"]
color_hex = bbox_colors.get(
class_name, bbox_colors.get("default", "#00FF00")
)
# The image was converted to RGB above, so flip the helper's BGR tuple for drawing
color = self._hex_to_bgr(color_hex)[::-1]
# Draw segmentation mask if available and requested
if draw_masks and det.get("segmentation_mask"):
mask_normalized = det["segmentation_mask"]
if mask_normalized and len(mask_normalized) > 0:
# Convert normalized coordinates to absolute pixels
mask_points = np.array(
[
[int(pt[0] * width), int(pt[1] * height)]
for pt in mask_normalized
],
dtype=np.int32,
)
# Create a semi-transparent overlay
overlay = img.copy()
cv2.fillPoly(overlay, [mask_points], color)
# Blend with original image (30% opacity)
cv2.addWeighted(overlay, 0.3, img, 0.7, 0, img)
# Draw mask contour
cv2.polylines(img, [mask_points], True, color, bbox_thickness)
# Get absolute coordinates for bounding box
bbox_abs = det["bbox_absolute"]
x1, y1, x2, y2 = [int(v) for v in bbox_abs]
# Draw bounding box
cv2.rectangle(img, (x1, y1), (x2, y2), color, bbox_thickness)
# Prepare label
label = f"{class_name} {det['confidence']:.2f}"
# Draw label background
(label_w, label_h), baseline = cv2.getTextSize(
label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1
)
cv2.rectangle(
img,
(x1, y1 - label_h - baseline - 5),
(x1 + label_w, y1),
color,
-1,
)
# Draw label text
cv2.putText(
img,
label,
(x1, y1 - baseline - 5),
cv2.FONT_HERSHEY_SIMPLEX,
0.5,
(255, 255, 255),
1,
)
return detections, img
except Exception as e:
logger.error(f"Error creating visualization: {e}")
# Return empty detections and original image if possible
try:
img = cv2.imread(image_path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
return [], img
except Exception:
return [], np.zeros((480, 640, 3), dtype=np.uint8)
def get_detection_summary(self, detections: List[Dict]) -> Dict[str, Any]:
"""
Generate summary statistics for detections.
Args:
detections: List of detection dictionaries
Returns:
Dictionary with summary statistics
"""
if not detections:
return {
"total_count": 0,
"class_counts": {},
"avg_confidence": 0.0,
"confidence_range": (0.0, 0.0),
}
# Count by class
class_counts = {}
confidences = []
for det in detections:
class_name = det["class_name"]
class_counts[class_name] = class_counts.get(class_name, 0) + 1
confidences.append(det["confidence"])
return {
"total_count": len(detections),
"class_counts": class_counts,
"avg_confidence": sum(confidences) / len(confidences),
"confidence_range": (min(confidences), max(confidences)),
}
@staticmethod
def _hex_to_bgr(hex_color: str) -> tuple:
"""
Convert hex color to BGR tuple.
Args:
hex_color: Hex color string (e.g., '#FF0000')
Returns:
BGR tuple (B, G, R)
"""
hex_color = hex_color.lstrip("#")
if len(hex_color) != 6:
return (0, 255, 0) # Default green
try:
r = int(hex_color[0:2], 16)
g = int(hex_color[2:4], 16)
b = int(hex_color[4:6], 16)
return (b, g, r) # OpenCV uses BGR
except ValueError:
return (0, 255, 0) # Default green
def change_model(self, model_path: str, model_id: int) -> bool:
"""
Change the current model.
Args:
model_path: Path to new model weights
model_id: ID of new model in database
Returns:
True if successful, False otherwise
"""
try:
self.yolo = YOLOWrapper(model_path)
if self.yolo.load_model():
self.model_id = model_id
logger.info(f"Model changed to {model_path}")
return True
return False
except Exception as e:
logger.error(f"Error changing model: {e}")
return False
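A usage sketch for batch detection with InferenceEngine. The DatabaseManager constructor, the weights path, the repository root, and the model_id value are assumptions rather than values taken from this diff.

from src.database.db_manager import DatabaseManager
from src.model.inference import InferenceEngine
from src.utils.file_utils import get_image_files

db = DatabaseManager("data/detections.db")  # constructor signature assumed
engine = InferenceEngine("data/models/best.pt", db, model_id=1)

repo_root = "/data/microscopy"  # placeholder repository root
images = get_image_files(repo_root, recursive=True)

def on_progress(current, total, message):
    print(f"[{current}/{total}] {message}")

results = engine.detect_batch(images, repo_root, conf=0.25, progress_callback=on_progress)
for r in results:
    if r["success"]:
        summary = engine.get_detection_summary(r["detections"])
        print(r["image_path"], summary["total_count"], summary["class_counts"])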

519
src/model/yolo_wrapper.py Normal file
View File

@@ -0,0 +1,519 @@
"""
YOLO model wrapper for the microscopy object detection application.
Provides a clean interface to YOLOv8 for training, validation, and inference.
"""
from ultralytics import YOLO
from pathlib import Path
from typing import Optional, List, Dict, Callable, Any
import torch
import tempfile
import os
import numpy as np
from src.utils.image import Image, convert_grayscale_to_rgb_preserve_range
from src.utils.logger import get_logger
from src.utils.train_ultralytics_float import train_with_float32_loader
logger = get_logger(__name__)
class YOLOWrapper:
"""Wrapper for YOLOv8 model operations."""
def __init__(self, model_path: str = "yolov8s-seg.pt"):
"""
Initialize YOLO model.
Args:
model_path: Path to model weights (.pt file)
"""
self.model_path = model_path
self.model = None
self.device = "cuda" if torch.cuda.is_available() else "cpu"
logger.info(f"YOLOWrapper initialized with device: {self.device}")
def load_model(self) -> bool:
"""
Load YOLO model from path.
Returns:
True if loaded successfully, False otherwise
"""
try:
logger.info(f"Loading YOLO model from {self.model_path}")
self.model = YOLO(self.model_path)
self.model.to(self.device)
logger.info("Model loaded successfully")
return True
except Exception as e:
logger.error(f"Error loading model: {e}")
return False
def train(
self,
data_yaml: str,
epochs: int = 100,
imgsz: int = 640,
batch: int = 16,
patience: int = 50,
save_dir: str = "data/models",
name: str = "custom_model",
resume: bool = False,
callbacks: Optional[Dict[str, Callable]] = None,
use_float32_loader: bool = True,
**kwargs,
) -> Dict[str, Any]:
"""
Train the YOLO model with optional float32 loader for 16-bit TIFFs.
Args:
data_yaml: Path to data.yaml configuration file
epochs: Number of training epochs
imgsz: Input image size
batch: Batch size
patience: Early stopping patience
save_dir: Directory to save trained model
name: Name for the training run
resume: Resume training from last checkpoint
callbacks: Optional Ultralytics callback dictionary
use_float32_loader: Use custom Float32Dataset for 16-bit TIFFs (default: True)
**kwargs: Additional training arguments
Returns:
Dictionary with training results
"""
try:
logger.info(f"Starting training: {name}")
logger.info(
f"Data: {data_yaml}, Epochs: {epochs}, Batch: {batch}, ImgSz: {imgsz}"
)
# Check if dataset has 16-bit TIFFs and use float32 loader
if use_float32_loader:
logger.info("Using Float32Dataset loader for 16-bit TIFF support")
return train_with_float32_loader(
model_path=self.model_path,
data_yaml=data_yaml,
epochs=epochs,
imgsz=imgsz,
batch=batch,
patience=patience,
save_dir=save_dir,
name=name,
callbacks=callbacks,
device=self.device,
resume=resume,
**kwargs,
)
else:
# Standard training (old behavior)
if self.model is None:
if not self.load_model():
raise RuntimeError(
f"Failed to load model from {self.model_path}"
)
results = self.model.train(
data=data_yaml,
epochs=epochs,
imgsz=imgsz,
batch=batch,
patience=patience,
project=save_dir,
name=name,
device=self.device,
resume=resume,
**kwargs,
)
logger.info("Training completed successfully")
return self._format_training_results(results)
except Exception as e:
logger.error(f"Error during training: {e}")
raise
def validate(self, data_yaml: str, split: str = "val", **kwargs) -> Dict[str, Any]:
"""
Validate the model.
Args:
data_yaml: Path to data.yaml configuration file
split: Dataset split to validate on ('val' or 'test')
**kwargs: Additional validation arguments
Returns:
Dictionary with validation metrics
"""
if self.model is None:
if not self.load_model():
raise RuntimeError(f"Failed to load model from {self.model_path}")
try:
logger.info(f"Starting validation on {split} split")
results = self.model.val(
data=data_yaml, split=split, device=self.device, **kwargs
)
logger.info("Validation completed successfully")
return self._format_validation_results(results)
except Exception as e:
logger.error(f"Error during validation: {e}")
raise
def predict(
self,
source: str,
conf: float = 0.25,
iou: float = 0.45,
save: bool = False,
save_txt: bool = False,
save_conf: bool = False,
**kwargs,
) -> List[Dict]:
"""
Perform inference on image(s).
Args:
source: Path to image or directory
conf: Confidence threshold
iou: IoU threshold for NMS
save: Whether to save annotated images
save_txt: Whether to save labels to .txt files
save_conf: Whether to save confidence in labels
**kwargs: Additional prediction arguments
Returns:
List of detection dictionaries
"""
if self.model is None:
if not self.load_model():
raise RuntimeError(f"Failed to load model from {self.model_path}")
prepared_source, cleanup_path = self._prepare_source(source)
try:
logger.info(f"Running inference on {source}")
results = self.model.predict(
source=prepared_source,
conf=conf,
iou=iou,
save=save,
save_txt=save_txt,
save_conf=save_conf,
device=self.device,
**kwargs,
)
detections = self._format_prediction_results(results)
logger.info(f"Inference complete: {len(detections)} detections")
return detections
except Exception as e:
logger.error(f"Error during inference: {e}")
raise
finally:
# Clean up temporary files (only for non-16-bit images)
# 16-bit TIFFs return numpy arrays directly, so cleanup_path is None
if cleanup_path:
try:
os.remove(cleanup_path)
logger.debug(f"Cleaned up temporary file: {cleanup_path}")
except OSError as cleanup_error:
logger.warning(
f"Failed to delete temporary file {cleanup_path}: {cleanup_error}"
)
def export(
self, format: str = "onnx", output_path: Optional[str] = None, **kwargs
) -> str:
"""
Export model to different format.
Args:
format: Export format (onnx, torchscript, tflite, etc.)
output_path: Path for exported model
**kwargs: Additional export arguments
Returns:
Path to exported model
"""
if self.model is None:
if not self.load_model():
raise RuntimeError(f"Failed to load model from {self.model_path}")
try:
logger.info(f"Exporting model to {format} format")
export_path = self.model.export(format=format, **kwargs)
logger.info(f"Model exported to {export_path}")
return str(export_path)
except Exception as e:
logger.error(f"Error exporting model: {e}")
raise
def _prepare_source(self, source):
"""Convert single-channel images to RGB for inference.
For 16-bit TIFF files, this will:
1. Load using tifffile
2. Normalize to float32 [0-1] (NO uint8 conversion to avoid data loss)
3. Replicate grayscale → RGB (3 channels)
4. Pass directly as numpy array to YOLO
"""
cleanup_path = None
if isinstance(source, (str, Path)):
source_path = Path(source)
if source_path.is_file():
try:
img_obj = Image(source_path)
# Check if it's a 16-bit TIFF file
is_16bit_tiff = (
source_path.suffix.lower() in [".tif", ".tiff"]
and img_obj.dtype == np.uint16
)
if is_16bit_tiff:
# Process 16-bit TIFF: normalize to float32 [0-1]
# NO uint8 conversion - pass float32 directly to avoid data loss
normalized_float = img_obj.to_normalized_float32()
# Convert grayscale to RGB by replicating channels
if len(normalized_float.shape) == 2:
# Grayscale: H,W → H,W,3
rgb_float = np.stack([normalized_float] * 3, axis=-1)
elif (
len(normalized_float.shape) == 3
and normalized_float.shape[2] == 1
):
# Grayscale with channel dim: H,W,1 → H,W,3
rgb_float = np.repeat(normalized_float, 3, axis=2)
else:
# Already multi-channel
rgb_float = normalized_float
# Ensure contiguous array and float32
rgb_float = np.ascontiguousarray(rgb_float, dtype=np.float32)
logger.info(
f"Loaded 16-bit TIFF {source_path} as float32 [0-1] RGB "
f"(shape: {rgb_float.shape}, dtype: {rgb_float.dtype}, "
f"range: [{rgb_float.min():.4f}, {rgb_float.max():.4f}])"
)
# Return numpy array directly - YOLO can handle it
return rgb_float, cleanup_path
else:
# Standard processing for other images
pil_img = img_obj.pil_image
if len(pil_img.getbands()) == 1:
rgb_img = convert_grayscale_to_rgb_preserve_range(pil_img)
else:
rgb_img = pil_img.convert("RGB")
suffix = source_path.suffix or ".png"
tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
tmp_path = tmp.name
tmp.close()
rgb_img.save(tmp_path)
cleanup_path = tmp_path
logger.info(
f"Converted image {source_path} to RGB for inference at {tmp_path}"
)
return tmp_path, cleanup_path
except Exception as convert_error:
logger.warning(
f"Failed to preprocess {source_path} as RGB, continuing with original file: {convert_error}"
)
return source, cleanup_path
def _format_training_results(self, results) -> Dict[str, Any]:
"""Format training results into dictionary."""
try:
# Get the results dict
results_dict = (
results.results_dict if hasattr(results, "results_dict") else {}
)
formatted = {
"success": True,
"final_epoch": getattr(results, "epoch", 0),
"metrics": {
"mAP50": float(results_dict.get("metrics/mAP50(B)", 0)),
"mAP50-95": float(results_dict.get("metrics/mAP50-95(B)", 0)),
"precision": float(results_dict.get("metrics/precision(B)", 0)),
"recall": float(results_dict.get("metrics/recall(B)", 0)),
},
"best_model_path": str(Path(results.save_dir) / "weights" / "best.pt"),
"last_model_path": str(Path(results.save_dir) / "weights" / "last.pt"),
"save_dir": str(results.save_dir),
}
return formatted
except Exception as e:
logger.error(f"Error formatting training results: {e}")
return {"success": False, "error": str(e)}
def _format_validation_results(self, results) -> Dict[str, Any]:
"""Format validation results into dictionary."""
try:
box_metrics = results.box
formatted = {
"success": True,
"mAP50": float(box_metrics.map50),
"mAP50-95": float(box_metrics.map),
"precision": float(box_metrics.mp),
"recall": float(box_metrics.mr),
"fitness": (
float(results.fitness) if hasattr(results, "fitness") else 0.0
),
}
# Add per-class metrics if available
if hasattr(box_metrics, "ap") and hasattr(results, "names"):
class_metrics = {}
for idx, name in results.names.items():
if idx < len(box_metrics.ap):
class_metrics[name] = {
"ap": float(box_metrics.ap[idx]),
"ap50": (
float(box_metrics.ap50[idx])
if hasattr(box_metrics, "ap50")
else 0.0
),
}
formatted["class_metrics"] = class_metrics
return formatted
except Exception as e:
logger.error(f"Error formatting validation results: {e}")
return {"success": False, "error": str(e)}
def _format_prediction_results(self, results) -> List[Dict]:
"""Format prediction results into list of dictionaries."""
detections = []
try:
for result in results:
boxes = result.boxes
image_path = str(result.path)
orig_shape = result.orig_shape # (height, width)
height, width = orig_shape
# Check if this is a segmentation model with masks
has_masks = hasattr(result, "masks") and result.masks is not None
for i in range(len(boxes)):
# Get normalized coordinates
xyxyn = boxes.xyxyn[i].cpu().numpy() # Normalized [x1, y1, x2, y2]
detection = {
"image_path": image_path,
"class_id": int(boxes.cls[i]),
"class_name": result.names[int(boxes.cls[i])],
"confidence": float(boxes.conf[i]),
"bbox_normalized": [
float(v) for v in xyxyn
], # [x_min, y_min, x_max, y_max]
"bbox_absolute": [
float(v) for v in boxes.xyxy[i].cpu().numpy()
], # Absolute pixels
}
# Extract segmentation mask if available
if has_masks:
try:
# Get the mask for this detection
mask_data = result.masks.xy[
i
] # Polygon coordinates in absolute pixels
# Convert to normalized coordinates
if len(mask_data) > 0:
mask_normalized = []
for point in mask_data:
x_norm = float(point[0]) / width
y_norm = float(point[1]) / height
mask_normalized.append([x_norm, y_norm])
detection["segmentation_mask"] = mask_normalized
else:
detection["segmentation_mask"] = None
except Exception as mask_error:
logger.warning(
f"Error extracting mask for detection {i}: {mask_error}"
)
detection["segmentation_mask"] = None
else:
detection["segmentation_mask"] = None
detections.append(detection)
return detections
except Exception as e:
logger.error(f"Error formatting prediction results: {e}")
return []
@staticmethod
def convert_bbox_format(
bbox: List[float], format_from: str = "xywh", format_to: str = "xyxy"
) -> List[float]:
"""
Convert bounding box between formats.
Formats:
- xywh: [x_center, y_center, width, height]
- xyxy: [x_min, y_min, x_max, y_max]
Args:
bbox: Bounding box coordinates
format_from: Source format
format_to: Target format
Returns:
Converted bounding box
"""
if format_from == "xywh" and format_to == "xyxy":
x, y, w, h = bbox
return [x - w / 2, y - h / 2, x + w / 2, y + h / 2]
elif format_from == "xyxy" and format_to == "xywh":
x1, y1, x2, y2 = bbox
return [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]
else:
return bbox
def get_model_info(self) -> Dict[str, Any]:
"""
Get information about the loaded model.
Returns:
Dictionary with model information
"""
if self.model is None:
return {"error": "Model not loaded"}
try:
info = {
"model_path": self.model_path,
"device": self.device,
"task": getattr(self.model, "task", "unknown"),
}
# Try to get class names
if hasattr(self.model, "names"):
info["classes"] = self.model.names
info["num_classes"] = len(self.model.names)
return info
except Exception as e:
logger.error(f"Error getting model info: {e}")
return {"error": str(e)}

7
src/utils/__init__.py Normal file
View File

@@ -0,0 +1,7 @@
"""
Utility modules for the microscopy object detection application.
"""
from src.utils.image import Image, ImageLoadError
__all__ = ["Image", "ImageLoadError"]

230
src/utils/config_manager.py Normal file
View File

@@ -0,0 +1,230 @@
"""
Configuration manager for the microscopy object detection application.
Handles loading, saving, and accessing application configuration.
"""
import yaml
from pathlib import Path
from typing import Any, Dict, Optional
from src.utils.logger import get_logger
from src.utils.image import Image
logger = get_logger(__name__)
class ConfigManager:
"""Manages application configuration."""
def __init__(self, config_path: str = "config/app_config.yaml"):
"""
Initialize configuration manager.
Args:
config_path: Path to configuration file
"""
self.config_path = Path(config_path)
self.config: Dict[str, Any] = {}
self._load_config()
def _load_config(self) -> None:
"""Load configuration from YAML file."""
try:
if self.config_path.exists():
with open(self.config_path, "r") as f:
self.config = yaml.safe_load(f) or {}
logger.info(f"Configuration loaded from {self.config_path}")
else:
logger.warning(f"Configuration file not found: {self.config_path}")
self._create_default_config()
except Exception as e:
logger.error(f"Error loading configuration: {e}")
self._create_default_config()
def _create_default_config(self) -> None:
"""Create default configuration."""
self.config = {
"database": {"path": "data/detections.db"},
"image_repository": {
"base_path": "",
"allowed_extensions": Image.SUPPORTED_EXTENSIONS,
},
"models": {
"default_base_model": "yolov8s-seg.pt",
"models_directory": "data/models",
"base_model_choices": [
"yolov8s-seg.pt",
"yolov11s-seg.pt",
],
},
"training": {
"default_epochs": 100,
"default_batch_size": 16,
"default_imgsz": 640,
"default_patience": 50,
"default_lr0": 0.01,
"two_stage": {
"enabled": False,
"stage1": {
"epochs": 20,
"lr0": 0.0005,
"patience": 10,
"freeze": 10,
},
"stage2": {
"epochs": 150,
"lr0": 0.0003,
"patience": 30,
},
},
},
"detection": {
"default_confidence": 0.25,
"default_iou": 0.45,
"max_batch_size": 100,
},
"visualization": {
"bbox_colors": {
"organelle": "#FF6B6B",
"membrane_branch": "#4ECDC4",
"default": "#00FF00",
},
"bbox_thickness": 2,
"font_size": 12,
},
"export": {"formats": ["csv", "json", "excel"], "default_format": "csv"},
"logging": {
"level": "INFO",
"file": "logs/app.log",
"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s",
},
}
self.save_config()
def save_config(self) -> bool:
"""
Save current configuration to file.
Returns:
True if successful, False otherwise
"""
try:
# Create directory if it doesn't exist
self.config_path.parent.mkdir(parents=True, exist_ok=True)
with open(self.config_path, "w") as f:
yaml.dump(self.config, f, default_flow_style=False, sort_keys=False)
logger.info(f"Configuration saved to {self.config_path}")
return True
except Exception as e:
logger.error(f"Error saving configuration: {e}")
return False
def get(self, key: str, default: Any = None) -> Any:
"""
Get configuration value by key.
Args:
key: Configuration key (can use dot notation, e.g., 'database.path')
default: Default value if key not found
Returns:
Configuration value or default
"""
keys = key.split(".")
value = self.config
for k in keys:
if isinstance(value, dict) and k in value:
value = value[k]
else:
return default
return value
def set(self, key: str, value: Any) -> None:
"""
Set configuration value by key.
Args:
key: Configuration key (can use dot notation)
value: Value to set
"""
keys = key.split(".")
config = self.config
# Navigate to the nested dictionary
for k in keys[:-1]:
if k not in config:
config[k] = {}
config = config[k]
# Set the value
config[keys[-1]] = value
logger.debug(f"Configuration updated: {key} = {value}")
def get_section(self, section: str) -> Dict[str, Any]:
"""
Get entire configuration section.
Args:
section: Section name (e.g., 'database', 'training')
Returns:
Dictionary with section configuration
"""
return self.config.get(section, {})
def update_section(self, section: str, values: Dict[str, Any]) -> None:
"""
Update entire configuration section.
Args:
section: Section name
values: Dictionary with new values
"""
if section not in self.config:
self.config[section] = {}
self.config[section].update(values)
logger.debug(f"Configuration section updated: {section}")
def reload(self) -> None:
"""Reload configuration from file."""
self._load_config()
def get_database_path(self) -> str:
"""Get database path."""
return self.get("database.path", "data/detections.db")
def get_image_repository_path(self) -> str:
"""Get image repository base path."""
return self.get("image_repository.base_path", "")
def set_image_repository_path(self, path: str) -> None:
"""Set image repository base path."""
self.set("image_repository.base_path", path)
self.save_config()
def get_models_directory(self) -> str:
"""Get models directory path."""
return self.get("models.models_directory", "data/models")
def get_default_training_params(self) -> Dict[str, Any]:
"""Get default training parameters."""
return self.get_section("training")
def get_default_detection_params(self) -> Dict[str, Any]:
"""Get default detection parameters."""
return self.get_section("detection")
def get_bbox_colors(self) -> Dict[str, str]:
"""Get bounding box colors for different classes."""
return self.get("visualization.bbox_colors", {})
def get_allowed_extensions(self) -> list:
"""Get list of allowed image file extensions."""
return self.get(
"image_repository.allowed_extensions", Image.SUPPORTED_EXTENSIONS
)
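A usage sketch for ConfigManager's dot-notation access; the values shown mirror the defaults created above.

from src.utils.config_manager import ConfigManager

cfg = ConfigManager("config/app_config.yaml")
epochs = cfg.get("training.default_epochs", 100)
colors = cfg.get_bbox_colors()  # e.g. {'organelle': '#FF6B6B', ...}

cfg.set("detection.default_confidence", 0.3)
cfg.update_section("image_repository", {"base_path": "/data/microscopy"})
cfg.save_config()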

239
src/utils/file_utils.py Normal file
View File

@@ -0,0 +1,239 @@
"""
File utility functions for the microscopy object detection application.
"""
import os
from pathlib import Path
from typing import List, Optional
from src.utils.logger import get_logger
logger = get_logger(__name__)
def get_image_files(
directory: str,
allowed_extensions: Optional[List[str]] = None,
recursive: bool = False,
) -> List[str]:
"""
Get all image files in a directory.
Args:
directory: Directory path to search
allowed_extensions: List of allowed file extensions (e.g., ['.jpg', '.png'])
recursive: Whether to search recursively
Returns:
List of absolute paths to image files
"""
if allowed_extensions is None:
from src.utils.image import Image
allowed_extensions = Image.SUPPORTED_EXTENSIONS
# Normalize extensions to lowercase
allowed_extensions = [ext.lower() for ext in allowed_extensions]
image_files = []
directory_path = Path(directory)
if not directory_path.exists():
logger.error(f"Directory does not exist: {directory}")
return image_files
try:
if recursive:
# Recursive search
for ext in allowed_extensions:
image_files.extend(directory_path.rglob(f"*{ext}"))
# Also search uppercase extensions
image_files.extend(directory_path.rglob(f"*{ext.upper()}"))
else:
# Top-level search only
for ext in allowed_extensions:
image_files.extend(directory_path.glob(f"*{ext}"))
# Also search uppercase extensions
image_files.extend(directory_path.glob(f"*{ext.upper()}"))
# Convert to absolute paths and sort
image_files = sorted([str(f.absolute()) for f in image_files])
logger.info(f"Found {len(image_files)} image files in {directory}")
except Exception as e:
logger.error(f"Error searching for images: {e}")
return image_files
def ensure_directory(directory: str) -> bool:
"""
Ensure a directory exists, create if it doesn't.
Args:
directory: Directory path
Returns:
True if directory exists or was created successfully
"""
try:
Path(directory).mkdir(parents=True, exist_ok=True)
return True
except Exception as e:
logger.error(f"Error creating directory {directory}: {e}")
return False
def get_relative_path(file_path: str, base_path: str) -> str:
"""
Get relative path from base path.
Args:
file_path: Absolute file path
base_path: Base directory path
Returns:
Relative path string
"""
try:
return str(Path(file_path).relative_to(base_path))
except ValueError:
# If file_path is not relative to base_path, return the filename
return Path(file_path).name
def validate_file_path(file_path: str, must_exist: bool = True) -> bool:
"""
Validate a file path.
Args:
file_path: Path to validate
must_exist: Whether the file must exist
Returns:
True if valid, False otherwise
"""
path = Path(file_path)
if must_exist and not path.exists():
logger.error(f"File does not exist: {file_path}")
return False
if must_exist and not path.is_file():
logger.error(f"Path is not a file: {file_path}")
return False
return True
def get_file_size(file_path: str) -> int:
"""
Get file size in bytes.
Args:
file_path: Path to file
Returns:
File size in bytes, or 0 if error
"""
try:
return Path(file_path).stat().st_size
except Exception as e:
logger.error(f"Error getting file size for {file_path}: {e}")
return 0
def format_file_size(size_bytes: int) -> str:
"""
Format file size in human-readable format.
Args:
size_bytes: Size in bytes
Returns:
Formatted string (e.g., "1.5 MB")
"""
for unit in ["B", "KB", "MB", "GB"]:
if size_bytes < 1024.0:
return f"{size_bytes:.1f} {unit}"
size_bytes /= 1024.0
return f"{size_bytes:.1f} TB"
def create_unique_filename(directory: str, base_name: str, extension: str) -> str:
"""
Create a unique filename by adding a number suffix if file exists.
Args:
directory: Directory path
base_name: Base filename without extension
extension: File extension (with or without dot)
Returns:
Unique filename
"""
if not extension.startswith("."):
extension = "." + extension
directory_path = Path(directory)
filename = f"{base_name}{extension}"
file_path = directory_path / filename
if not file_path.exists():
return filename
# Add number suffix
counter = 1
while True:
filename = f"{base_name}_{counter}{extension}"
file_path = directory_path / filename
if not file_path.exists():
return filename
counter += 1
def is_image_file(
file_path: str, allowed_extensions: Optional[List[str]] = None
) -> bool:
"""
Check if a file is an image based on extension.
Args:
file_path: Path to file
allowed_extensions: List of allowed extensions
Returns:
True if file is an image
"""
if allowed_extensions is None:
from src.utils.image import Image
allowed_extensions = Image.SUPPORTED_EXTENSIONS
extension = Path(file_path).suffix.lower()
return extension in [ext.lower() for ext in allowed_extensions]
def safe_filename(filename: str) -> str:
"""
Convert a string to a safe filename by removing/replacing invalid characters.
Args:
filename: Original filename
Returns:
Safe filename
"""
# Replace invalid characters
invalid_chars = '<>:"/\\|?*'
for char in invalid_chars:
filename = filename.replace(char, "_")
# Remove leading/trailing spaces and dots
filename = filename.strip(". ")
# Ensure filename is not empty
if not filename:
filename = "unnamed"
return filename
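A usage sketch for the file helpers above; the repository path is a placeholder.

from src.utils.file_utils import (
    get_image_files,
    get_relative_path,
    get_file_size,
    format_file_size,
    create_unique_filename,
)

repo = "/data/microscopy"  # placeholder
for path in get_image_files(repo, recursive=True):
    rel = get_relative_path(path, repo)
    print(rel, format_file_size(get_file_size(path)))

# Pick a non-clashing name for an export inside the repository
name = create_unique_filename(repo, "detections_export", "csv")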

407
src/utils/image.py Normal file
View File

@@ -0,0 +1,407 @@
"""
Image loading and management utilities for the microscopy object detection application.
"""
import cv2
import numpy as np
from pathlib import Path
from typing import Optional, Tuple, Union
from PIL import Image as PILImage
import tifffile
from src.utils.logger import get_logger
from src.utils.file_utils import validate_file_path, is_image_file
from PySide6.QtGui import QImage
logger = get_logger(__name__)
class ImageLoadError(Exception):
"""Exception raised when an image cannot be loaded."""
pass
class Image:
"""
A class for loading and managing images from file paths.
Supports multiple image formats: .jpg, .jpeg, .png, .tif, .tiff, .bmp
Provides access to image data in multiple formats (OpenCV/numpy, PIL).
Attributes:
path: Path to the image file
data: Image data as numpy array (OpenCV format, BGR)
pil_image: Image data as PIL Image (RGB)
width: Image width in pixels
height: Image height in pixels
channels: Number of color channels
format: Image file format
size_bytes: File size in bytes
"""
SUPPORTED_EXTENSIONS = [".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp"]
def __init__(self, image_path: Union[str, Path]):
"""
Initialize an Image object by loading from a file path.
Args:
image_path: Path to the image file (string or Path object)
Raises:
ImageLoadError: If the image cannot be loaded or is invalid
"""
self.path = Path(image_path)
self._data: Optional[np.ndarray] = None
self._pil_image: Optional[PILImage.Image] = None
self._width: int = 0
self._height: int = 0
self._channels: int = 0
self._format: str = ""
self._size_bytes: int = 0
self._dtype: Optional[np.dtype] = None
# Load the image
self._load()
def _load(self) -> None:
"""
Load the image from disk.
Raises:
ImageLoadError: If the image cannot be loaded
"""
# Validate path
if not validate_file_path(str(self.path), must_exist=True):
raise ImageLoadError(f"Invalid or non-existent file path: {self.path}")
# Check file extension
if not is_image_file(str(self.path), self.SUPPORTED_EXTENSIONS):
ext = self.path.suffix.lower()
raise ImageLoadError(
f"Unsupported image format: {ext}. "
f"Supported formats: {', '.join(self.SUPPORTED_EXTENSIONS)}"
)
try:
# Check if it's a TIFF file - use tifffile for better support
if self.path.suffix.lower() in [".tif", ".tiff"]:
self._data = tifffile.imread(str(self.path))
if self._data is None:
raise ImageLoadError(
f"Failed to load TIFF with tifffile: {self.path}"
)
# Extract metadata
self._height, self._width = (
self._data.shape[:2]
if len(self._data.shape) >= 2
else (self._data.shape[0], 1)
)
self._channels = (
self._data.shape[2] if len(self._data.shape) == 3 else 1
)
self._format = self.path.suffix.lower().lstrip(".")
self._size_bytes = self.path.stat().st_size
self._dtype = self._data.dtype
# Load PIL version for compatibility (same call for grayscale and multi-channel TIFF data)
self._pil_image = PILImage.fromarray(self._data)
logger.info(
f"Successfully loaded TIFF image: {self.path.name} "
f"({self._width}x{self._height}, {self._channels} channels, "
f"dtype={self._dtype}, {self._format.upper()})"
)
else:
# Load with OpenCV (returns BGR format) for non-TIFF images
self._data = cv2.imread(str(self.path), cv2.IMREAD_UNCHANGED)
if self._data is None:
raise ImageLoadError(
f"Failed to load image with OpenCV: {self.path}"
)
# Extract metadata
self._height, self._width = self._data.shape[:2]
self._channels = (
self._data.shape[2] if len(self._data.shape) == 3 else 1
)
self._format = self.path.suffix.lower().lstrip(".")
self._size_bytes = self.path.stat().st_size
self._dtype = self._data.dtype
# Load PIL version for compatibility (convert BGR to RGB)
if self._channels == 3:
rgb_data = cv2.cvtColor(self._data, cv2.COLOR_BGR2RGB)
self._pil_image = PILImage.fromarray(rgb_data)
elif self._channels == 4:
rgba_data = cv2.cvtColor(self._data, cv2.COLOR_BGRA2RGBA)
self._pil_image = PILImage.fromarray(rgba_data)
else:
# Grayscale
self._pil_image = PILImage.fromarray(self._data)
logger.info(
f"Successfully loaded image: {self.path.name} "
f"({self._width}x{self._height}, {self._channels} channels, "
f"{self._format.upper()})"
)
except Exception as e:
logger.error(f"Error loading image {self.path}: {e}")
raise ImageLoadError(f"Failed to load image: {e}") from e
@property
def data(self) -> np.ndarray:
"""
Get image data as numpy array (OpenCV format, BGR or grayscale).
Returns:
Image data as numpy array
"""
if self._data is None:
raise ImageLoadError("Image data not available")
return self._data
@property
def pil_image(self) -> PILImage.Image:
"""
Get image data as PIL Image (RGB or grayscale).
Returns:
PIL Image object
"""
if self._pil_image is None:
raise ImageLoadError("PIL image not available")
return self._pil_image
@property
def width(self) -> int:
"""Get image width in pixels."""
return self._width
@property
def height(self) -> int:
"""Get image height in pixels."""
return self._height
@property
def shape(self) -> Tuple[int, int, int]:
"""
Get image shape as (height, width, channels).
Returns:
Tuple of (height, width, channels)
"""
print("shape", self._height, self._width, self._channels)
return (self._height, self._width, self._channels)
@property
def channels(self) -> int:
"""Get number of color channels."""
return self._channels
@property
def format(self) -> str:
"""Get image file format (e.g., 'jpg', 'png')."""
return self._format
@property
def size_bytes(self) -> int:
"""Get file size in bytes."""
return self._size_bytes
@property
def size_mb(self) -> float:
"""Get file size in megabytes."""
return self._size_bytes / (1024 * 1024)
@property
def dtype(self) -> np.dtype:
"""Get the data type of the image array."""
if self._dtype is None:
raise ImageLoadError("Image dtype not available")
return self._dtype
@property
def qtimage_format(self) -> QImage.Format:
"""
Get the appropriate QImage format for the image.
Returns:
QImage.Format enum value
"""
if self._channels == 3:
return QImage.Format_RGB888
elif self._channels == 4:
return QImage.Format_RGBA8888
elif self._channels == 1:
if self._dtype == np.uint16:
return QImage.Format_Grayscale16
else:
return QImage.Format_Grayscale8
else:
raise ImageLoadError(f"Unsupported number of channels: {self._channels}")
def get_rgb(self) -> np.ndarray:
"""
Get image data as RGB numpy array.
Returns:
Image data in RGB format as numpy array
"""
if self._channels == 3:
return cv2.cvtColor(self._data, cv2.COLOR_BGR2RGB)
elif self._channels == 4:
return cv2.cvtColor(self._data, cv2.COLOR_BGRA2RGBA)
else:
return self._data
def get_grayscale(self) -> np.ndarray:
"""
Get image as grayscale numpy array.
Returns:
Grayscale image as numpy array
"""
if self._channels == 1:
return self._data
else:
return cv2.cvtColor(self._data, cv2.COLOR_BGR2GRAY)
def copy(self) -> np.ndarray:
"""
Get a copy of the image data.
Returns:
Copy of image data as numpy array
"""
return self._data.copy()
def resize(self, width: int, height: int) -> np.ndarray:
"""
Resize the image to specified dimensions.
Args:
width: Target width in pixels
height: Target height in pixels
Returns:
Resized image as numpy array (does not modify original)
"""
return cv2.resize(self._data, (width, height))
def is_grayscale(self) -> bool:
"""
Check if image is grayscale.
Returns:
True if image is grayscale (1 channel)
"""
return self._channels == 1
def is_color(self) -> bool:
"""
Check if image is color.
Returns:
True if image has 3 or more channels
"""
return self._channels >= 3
def to_normalized_float32(self) -> np.ndarray:
"""
Convert image data to normalized float32 in range [0, 1].
For 16-bit images, this properly scales the full dynamic range.
For 8-bit images, divides by 255.
Already float images are clipped to [0, 1].
Returns:
Normalized image data as float32 numpy array [0, 1]
"""
data = self._data.astype(np.float32)
if self._dtype == np.uint16:
# 16-bit: normalize by max value (65535)
data = data / 65535.0
elif self._dtype == np.uint8:
# 8-bit: normalize by 255
data = data / 255.0
elif np.issubdtype(self._dtype, np.floating):
# Already float, just clip to [0, 1]
data = np.clip(data, 0.0, 1.0)
else:
# Other integer types: use dtype info
if np.issubdtype(self._dtype, np.integer):
max_val = np.iinfo(self._dtype).max
data = data / float(max_val)
else:
# Unknown type: attempt min-max normalization
min_val = data.min()
max_val = data.max()
if max_val > min_val:
data = (data - min_val) / (max_val - min_val)
else:
data = np.zeros_like(data)
return np.clip(data, 0.0, 1.0)
def __repr__(self) -> str:
"""String representation of the Image object."""
return (
f"Image(path='{self.path.name}', "
f"shape=({self._width}x{self._height}x{self._channels}), "
f"format={self._format}, "
f"size={self.size_mb:.2f}MB)"
)
def __str__(self) -> str:
"""String representation of the Image object."""
return self.__repr__()
def convert_grayscale_to_rgb_preserve_range(
pil_image: PILImage.Image,
) -> PILImage.Image:
"""Convert a single-channel PIL image to RGB while preserving dynamic range.
Args:
pil_image: Single-channel PIL image (e.g., 16-bit grayscale).
Returns:
PIL Image in RGB mode with intensities normalized to 0-255.
"""
if pil_image.mode == "RGB":
return pil_image
grayscale = np.array(pil_image)
if grayscale.ndim == 3:
grayscale = grayscale[:, :, 0]
original_dtype = grayscale.dtype
grayscale = grayscale.astype(np.float32)
if grayscale.size == 0:
return PILImage.new("RGB", pil_image.size, color=(0, 0, 0))
if np.issubdtype(original_dtype, np.integer):
denom = float(max(np.iinfo(original_dtype).max, 1))
else:
max_val = float(grayscale.max())
denom = max(max_val, 1.0)
grayscale = np.clip(grayscale / denom, 0.0, 1.0)
grayscale_u8 = (grayscale * 255.0).round().astype(np.uint8)
rgb_arr = np.repeat(grayscale_u8[:, :, None], 3, axis=2)
return PILImage.fromarray(rgb_arr, mode="RGB")
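
A short usage sketch for the `Image` class and the range-preserving RGB helper (the file path is hypothetical):

```python
from src.utils.image import Image, ImageLoadError, convert_grayscale_to_rgb_preserve_range

try:
    img = Image("data/images/cell_plate_01.tif")     # hypothetical 16-bit grayscale TIFF
except ImageLoadError as e:
    raise SystemExit(f"Could not load image: {e}")

print(img)                          # Image(path=..., shape=..., format=..., size=...)
print(img.dtype, img.channels)      # e.g. uint16, 1

# Full-precision float32 copy in [0, 1], suitable as model input.
arr = img.to_normalized_float32()

# 8-bit RGB PIL image for display, normalized over the full 16-bit range.
rgb_pil = convert_grayscale_to_rgb_preserve_range(img.pil_image)
```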


@@ -0,0 +1,122 @@
import numpy as np
from roifile import ImagejRoi
from tifffile import TiffFile, TiffWriter
from pathlib import Path
class UT:
"""
Loader for Operetta TIFF image sequences together with ROIs drawn in ImageJ.
Exports the ROIs as YOLO-style segmentation labels and the images as max projections.
"""
def __init__(self, roifile_fn: Path):
self.roifile_fn = roifile_fn
self.rois = ImagejRoi.fromfile(self.roifile_fn)
self.stem = self.roifile_fn.stem.removesuffix("-RoiSet")  # strip() would remove characters, not the suffix
self.image, self.image_props = self._load_images()
def _load_images(self):
"""Loading sequence of tif files
array sequence is CZYX
"""
print(self.roifile_fn.parent, self.stem)
fns = list(self.roifile_fn.parent.glob(f"{self.stem}*.tif*"))
stems = [fn.stem.split(self.stem)[-1] for fn in fns]
n_ch = len(set([stem.split("-ch")[-1].split("t")[0] for stem in stems]))
n_p = len(set([stem.split("-")[0] for stem in stems]))
n_t = len(set([stem.split("t")[1] for stem in stems]))
print(n_ch, n_p, n_t)
with TiffFile(fns[0]) as tif:
img = tif.asarray()
h, w = img.shape  # tifffile returns (rows, cols) = (height, width)
dtype = img.dtype
self.image_props = {
"channels": n_ch,
"planes": n_p,
"tiles": n_t,
"width": w,
"height": h,
"dtype": dtype,
}
image_stack = np.zeros((n_ch, n_p, h, w), dtype=dtype)
for fn in fns:
with TiffFile(fn) as tif:
img = tif.asarray()
stem = fn.stem.split(self.stem)[-1]
ch = int(stem.split("-ch")[-1].split("t")[0])
p = int(stem.split("-")[0].lstrip("p"))
t = int(stem.split("t")[1])
print(fn.stem, "ch", ch, "p", p, "t", t)
image_stack[ch - 1, p - 1] = img
print(image_stack.shape)
return image_stack, self.image_props
@property
def width(self):
return self.image_props["width"]
@property
def height(self):
return self.image_props["height"]
@property
def nchannels(self):
return self.image_props["channels"]
@property
def nplanes(self):
return self.image_props["planes"]
def export_rois(
self,
path: Path,
subfolder: str = "labels",
class_index: int = 0,
):
"""Export rois to a file"""
with open(path / subfolder / f"{self.stem}.txt", "w") as f:
for roi in self.rois:
# TODO add image coordinates normalization
coords = ""
for x, y in roi.subpixel_coordinates:
coords += f"{x/self.width} {y/self.height} "
f.write(f"{class_index} {coords}\n")
return
def export_image(
self,
path: Path,
subfolder: str = "images",
plane_mode: str = "max projection",
channel: int = 0,
):
"""Export image to a file"""
if plane_mode == "max projection":
self.image = np.max(self.image[channel], axis=0)
print(self.image.shape)
with TiffWriter(path / subfolder / f"{self.stem}.tif") as tif:
tif.write(self.image)
if __name__ == "__main__":
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("input", type=Path)
parser.add_argument("output", type=Path)
args = parser.parse_args()
for rfn in args.input.glob("*.zip"):
ut = UT(rfn)
ut.export_rois(args.output, class_index=0)
ut.export_image(args.output, plane_mode="max projection", channel=0)
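
For reference, `export_rois` writes one line per ROI into `<output>/labels/<stem>.txt` in the YOLO segmentation convention (class index followed by normalized polygon vertices), while `export_image` writes the matching max projection to `<output>/images/<stem>.tif`. A tiny sketch of how one such label line is formed (all values are made up):

```python
class_index = 0
polygon_px = [(120.0, 80.5), (260.0, 82.0), (255.5, 210.0)]   # hypothetical ROI vertices (x, y)
width, height = 1080, 1080                                     # hypothetical image size

coords = " ".join(f"{x / width} {y / height}" for x, y in polygon_px)
print(f"{class_index} {coords}")
# -> "0 0.111... 0.0745... 0.2407... 0.0759... ..."
```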

src/utils/logger.py (new file)
@@ -0,0 +1,75 @@
"""
Logging configuration for the microscopy object detection application.
"""
import logging
import sys
from pathlib import Path
from typing import Optional
def setup_logging(
log_file: str = "logs/app.log",
level: str = "INFO",
log_format: Optional[str] = None,
) -> logging.Logger:
"""
Setup application logging.
Args:
log_file: Path to log file
level: Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
log_format: Custom log format string
Returns:
Configured logger instance
"""
# Create logs directory if it doesn't exist
log_path = Path(log_file)
log_path.parent.mkdir(parents=True, exist_ok=True)
# Default format if none provided
if log_format is None:
log_format = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
# Convert level string to logging constant
numeric_level = getattr(logging, level.upper(), logging.INFO)
# Configure root logger
root_logger = logging.getLogger()
root_logger.setLevel(numeric_level)
# Remove existing handlers
root_logger.handlers.clear()
# Console handler
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setLevel(numeric_level)
console_formatter = logging.Formatter(log_format)
console_handler.setFormatter(console_formatter)
root_logger.addHandler(console_handler)
# File handler
file_handler = logging.FileHandler(log_file)
file_handler.setLevel(numeric_level)
file_formatter = logging.Formatter(log_format)
file_handler.setFormatter(file_formatter)
root_logger.addHandler(file_handler)
# Log initial message
root_logger.info("Logging initialized")
return root_logger
def get_logger(name: str) -> logging.Logger:
"""
Get a logger instance for a specific module.
Args:
name: Logger name (typically __name__)
Returns:
Logger instance
"""
return logging.getLogger(name)
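
A minimal sketch of the intended use: call `setup_logging` once at application start-up, then request named loggers per module:

```python
from src.utils.logger import setup_logging, get_logger

setup_logging(log_file="logs/app.log", level="DEBUG")   # console + file handlers on the root logger

logger = get_logger(__name__)
logger.info("Application started")
logger.debug("Verbose details go to both handlers")
```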

src/utils/train_ultralytics_float.py (new file)

@@ -0,0 +1,561 @@
"""
Custom YOLO training with on-the-fly float32 conversion for 16-bit grayscale images.
This module provides a custom dataset class and training function that:
1. Load 16-bit TIFF images directly with tifffile (no PIL/cv2)
2. Convert them to float32 [0, 1] on the fly (no data loss)
3. Replicate grayscale to 3-channel RGB in memory
4. Use a custom training loop to bypass Ultralytics' dataset infrastructure
5. Require no disk caching
"""
import numpy as np
import tifffile
import torch
from torch.utils.data import Dataset, DataLoader
from pathlib import Path
from typing import Optional, Dict, Any, List, Tuple
from ultralytics import YOLO
import yaml
import time
from src.utils.logger import get_logger
logger = get_logger(__name__)
class Float32YOLODataset(Dataset):
"""
Custom PyTorch dataset for YOLO that loads 16-bit grayscale TIFFs as float32 RGB.
This dataset:
- Loads with tifffile (not PIL/cv2)
- Converts uint16 → float32 [0-1] (preserves full dynamic range)
- Replicates grayscale to 3 channels
- Returns torch tensors in (C, H, W) format
"""
def __init__(self, images_dir: str, labels_dir: str, img_size: int = 640):
"""
Initialize dataset.
Args:
images_dir: Directory containing images
labels_dir: Directory containing YOLO label files (.txt)
img_size: Target image size (for reference, actual resizing done by model)
"""
self.images_dir = Path(images_dir)
self.labels_dir = Path(labels_dir)
self.img_size = img_size
# Find all image files
extensions = {".tif", ".tiff", ".png", ".jpg", ".jpeg", ".bmp"}
self.image_paths = sorted(
[
p
for p in self.images_dir.rglob("*")
if p.is_file() and p.suffix.lower() in extensions
]
)
if not self.image_paths:
raise ValueError(f"No images found in {images_dir}")
logger.info(
f"Float32YOLODataset initialized with {len(self.image_paths)} images from {images_dir}"
)
def __len__(self):
return len(self.image_paths)
def _read_image(self, img_path: Path) -> np.ndarray:
"""
Read image and convert to float32 [0-1] RGB.
Returns:
numpy array, shape (H, W, 3), dtype float32, range [0, 1]
"""
# Load image with tifffile
img = tifffile.imread(str(img_path))
# Convert to float32 and normalize to [0, 1] based on the source dtype
# (a max()-based heuristic would mis-scale 8-bit inputs)
if img.dtype == np.uint16:
img = img.astype(np.float32) / 65535.0
elif img.dtype == np.uint8:
img = img.astype(np.float32) / 255.0
else:
img = img.astype(np.float32)
# Ensure [0, 1] range
img = np.clip(img, 0.0, 1.0)
# Convert grayscale to RGB
if img.ndim == 2:
# H,W → H,W,3
img = np.repeat(img[..., None], 3, axis=2)
elif img.ndim == 3 and img.shape[2] == 1:
# H,W,1 → H,W,3
img = np.repeat(img, 3, axis=2)
return img # float32 (H, W, 3) in [0, 1]
def _parse_label(self, label_path: Path) -> List[np.ndarray]:
"""
Parse YOLO label file with variable-length rows (segmentation polygons).
Returns:
List of numpy arrays, one per annotation
"""
if not label_path.exists():
return []
labels = []
try:
with open(label_path, "r") as f:
for line in f:
line = line.strip()
if not line:
continue
# Parse space-separated values
values = line.split()
if len(values) >= 5: # At minimum: class_id x y w h
labels.append(
np.array([float(v) for v in values], dtype=np.float32)
)
except Exception as e:
logger.warning(f"Error parsing label {label_path}: {e}")
return []
return labels
def __getitem__(self, idx: int) -> Tuple[torch.Tensor, List[np.ndarray], str]:
"""
Get a single training sample.
Returns:
Tuple of (image_tensor, labels, filename)
- image_tensor: shape (3, H, W), dtype float32, range [0, 1]
- labels: list of numpy arrays with YOLO format labels (variable length for segmentation)
- filename: image filename
"""
img_path = self.image_paths[idx]
label_path = self.labels_dir / f"{img_path.stem}.txt"
# Load image as float32 RGB
img = self._read_image(img_path)
# Convert to tensor: (H, W, 3) → (3, H, W)
img_tensor = torch.from_numpy(img).permute(2, 0, 1).contiguous()
# Load labels (list of variable-length arrays for segmentation)
labels = self._parse_label(label_path)
return img_tensor, labels, img_path.name
def collate_fn(
batch: List[Tuple[torch.Tensor, List[np.ndarray], str]],
) -> Tuple[torch.Tensor, List[List[np.ndarray]], List[str]]:
"""
Collate function for DataLoader.
Args:
batch: List of (img_tensor, labels_list, filename) tuples
where labels_list is a list of variable-length numpy arrays
Returns:
Tuple of (stacked_images, list_of_labels_lists, list_of_filenames)
"""
imgs = [b[0] for b in batch]
labels = [b[1] for b in batch] # Each element is a list of arrays
names = [b[2] for b in batch]
# Stack images - requires same H,W
# For different sizes, implement letterbox/resize in dataset
imgs_batch = torch.stack(imgs, dim=0)
return imgs_batch, labels, names
def train_with_float32_loader(
model_path: str,
data_yaml: str,
epochs: int = 100,
imgsz: int = 640,
batch: int = 16,
patience: int = 50,
save_dir: str = "data/models",
name: str = "custom_model",
callbacks: Optional[Dict] = None,
**kwargs,
) -> Dict[str, Any]:
"""
Train YOLO model with custom Float32 dataset for 16-bit TIFF support.
Uses a custom training loop to bypass Ultralytics' dataset pipeline,
avoiding channel conversion issues.
Args:
model_path: Path to base model weights (.pt file)
data_yaml: Path to dataset YAML configuration
epochs: Number of training epochs
imgsz: Input image size
batch: Batch size
patience: Early stopping patience
save_dir: Directory to save trained model
name: Name for the training run
callbacks: Optional callback dictionary (for progress reporting)
**kwargs: Additional training arguments (lr0, freeze, device, etc.)
Returns:
Dict with training results including model paths and metrics
"""
try:
logger.info(f"Starting Float32 custom training: {name}")
logger.info(
f"Data: {data_yaml}, Epochs: {epochs}, Batch: {batch}, ImgSz: {imgsz}"
)
# Parse data.yaml to get dataset paths
with open(data_yaml, "r") as f:
data_config = yaml.safe_load(f)
dataset_root = Path(data_config.get("path", Path(data_yaml).parent))
train_images = dataset_root / data_config.get("train", "train/images")
val_images = dataset_root / data_config.get("val", "val/images")
# Infer label directories
train_labels = train_images.parent / "labels"
val_labels = val_images.parent / "labels"
logger.info(f"Train images: {train_images}")
logger.info(f"Train labels: {train_labels}")
logger.info(f"Val images: {val_images}")
logger.info(f"Val labels: {val_labels}")
# Create datasets
train_dataset = Float32YOLODataset(
str(train_images), str(train_labels), img_size=imgsz
)
val_dataset = Float32YOLODataset(
str(val_images), str(val_labels), img_size=imgsz
)
# Create data loaders
train_loader = DataLoader(
train_dataset,
batch_size=batch,
shuffle=True,
num_workers=4,
pin_memory=True,
collate_fn=collate_fn,
)
val_loader = DataLoader(
val_dataset,
batch_size=batch,
shuffle=False,
num_workers=2,
pin_memory=True,
collate_fn=collate_fn,
)
# Load model
logger.info(f"Loading model from {model_path}")
ul_model = YOLO(model_path)
# Get PyTorch model
pt_model, loss_fn = _get_pytorch_model(ul_model)
# Setup device
device = kwargs.get("device", "cuda" if torch.cuda.is_available() else "cpu")
# Configure model args for loss function
from types import SimpleNamespace
# Required args for segmentation loss
required_args = {
"overlap_mask": True,
"mask_ratio": 4,
"task": "segment",
"single_cls": False,
"box": 7.5,
"cls": 0.5,
"dfl": 1.5,
}
if not hasattr(pt_model, "args"):
# No args - create SimpleNamespace
pt_model.args = SimpleNamespace(**required_args)
elif isinstance(pt_model.args, dict):
# Args is dict - MUST convert to SimpleNamespace for attribute access
# The loss function uses model.args.overlap_mask (attribute access)
merged = {**pt_model.args, **required_args}
pt_model.args = SimpleNamespace(**merged)
logger.info(
"Converted model.args from dict to SimpleNamespace for loss function compatibility"
)
else:
# Args is SimpleNamespace or other - set attributes
for key, value in required_args.items():
if not hasattr(pt_model.args, key):
setattr(pt_model.args, key, value)
pt_model.to(device)
pt_model.train()
logger.info(f"Training on device: {device}")
logger.info(f"PyTorch model type: {type(pt_model)}")
logger.info(f"Model args configured for segmentation loss")
# Setup optimizer
lr0 = kwargs.get("lr0", 0.01)
optimizer = torch.optim.AdamW(pt_model.parameters(), lr=lr0)
# Training loop
save_path = Path(save_dir) / name
save_path.mkdir(parents=True, exist_ok=True)
weights_dir = save_path / "weights"
weights_dir.mkdir(exist_ok=True)
best_loss = float("inf")
patience_counter = 0
for epoch in range(epochs):
epoch_start = time.time()
running_loss = 0.0
num_batches = 0
logger.info(f"Epoch {epoch+1}/{epochs} starting...")
for batch_idx, (imgs, labels_list, names) in enumerate(train_loader):
imgs = imgs.to(device) # (B, 3, H, W) float32
optimizer.zero_grad()
# Forward pass
try:
preds = pt_model(imgs)
except Exception as e:
# Try with labels
preds = pt_model(imgs, labels_list)
# Compute loss
# For Ultralytics models, the easiest approach is to construct a batch dict
# and call the model in training mode which returns preds + loss
batch_dict = {
"img": imgs, # Already on device
"batch_idx": (
torch.cat(
[
torch.full((len(lab),), i, dtype=torch.long)
for i, lab in enumerate(labels_list)
]
).to(device)
if any(len(lab) > 0 for lab in labels_list)
else torch.tensor([], dtype=torch.long, device=device)
),
"cls": (
torch.cat(
[
torch.from_numpy(lab[:, 0:1])
for lab in labels_list
if len(lab) > 0
]
).to(device)
if any(len(lab) > 0 for lab in labels_list)
else torch.tensor([], dtype=torch.float32, device=device)
),
"bboxes": (
torch.cat(
[
torch.from_numpy(lab[:, 1:5])
for lab in labels_list
if len(lab) > 0
]
).to(device)
if any(len(lab) > 0 for lab in labels_list)
else torch.tensor([], dtype=torch.float32, device=device)
),
"ori_shape": (imgs.shape[2], imgs.shape[3]), # H, W
"resized_shape": (imgs.shape[2], imgs.shape[3]),
}
# Add masks if segmentation labels exist
if any(len(lab) > 5 for lab in labels_list if len(lab) > 0):
masks = []
for lab in labels_list:
if len(lab) > 0 and lab.shape[1] > 5:
# Has segmentation points
masks.append(torch.from_numpy(lab[:, 5:]))
if masks:
batch_dict["masks"] = masks
# Call model loss (it will compute loss internally)
try:
loss_output = pt_model.loss(batch_dict, preds)
if isinstance(loss_output, (tuple, list)):
loss = loss_output[0]
else:
loss = loss_output
except Exception as e:
logger.error(f"Model loss computation failed: {e}")
# Last resort: maybe preds is already a dict with 'loss'
if isinstance(preds, dict) and "loss" in preds:
loss = preds["loss"]
else:
raise RuntimeError(f"Cannot compute loss: {e}")
# Backward pass
loss.backward()
optimizer.step()
running_loss += loss.item()
num_batches += 1
# Report progress via callback
if callbacks and "on_fit_epoch_end" in callbacks:
# Create a mock trainer object for callback
class MockTrainer:
def __init__(self, epoch):
self.epoch = epoch
self.loss_items = [loss.item()]
callbacks["on_fit_epoch_end"](MockTrainer(epoch))
epoch_loss = running_loss / max(1, num_batches)
epoch_time = time.time() - epoch_start
logger.info(
f"Epoch {epoch+1}/{epochs} completed. Avg Loss: {epoch_loss:.4f}, Time: {epoch_time:.1f}s"
)
# Save checkpoint
ckpt_path = weights_dir / f"epoch{epoch+1}.pt"
torch.save(
{
"epoch": epoch + 1,
"model_state_dict": pt_model.state_dict(),
"optimizer_state_dict": optimizer.state_dict(),
"loss": epoch_loss,
},
ckpt_path,
)
# Save as last.pt
last_path = weights_dir / "last.pt"
torch.save(pt_model.state_dict(), last_path)
# Check for best model
if epoch_loss < best_loss:
best_loss = epoch_loss
patience_counter = 0
best_path = weights_dir / "best.pt"
torch.save(pt_model.state_dict(), best_path)
logger.info(f"New best model saved: {best_path}")
else:
patience_counter += 1
# Early stopping
if patience_counter >= patience:
logger.info(f"Early stopping triggered after {epoch+1} epochs")
break
logger.info("Training completed successfully")
# Format results
return {
"success": True,
"final_epoch": epoch + 1,
"metrics": {
"final_loss": epoch_loss,
"best_loss": best_loss,
},
"best_model_path": str(weights_dir / "best.pt"),
"last_model_path": str(weights_dir / "last.pt"),
"save_dir": str(save_path),
}
except Exception as e:
logger.error(f"Error during Float32 training: {e}")
import traceback
logger.error(traceback.format_exc())
raise
def _get_pytorch_model(ul_model: YOLO) -> Tuple[torch.nn.Module, Optional[callable]]:
"""
Extract PyTorch model and loss function from Ultralytics YOLO wrapper.
Args:
ul_model: Ultralytics YOLO model wrapper
Returns:
Tuple of (pytorch_model, loss_function)
"""
# Try to get the underlying PyTorch model
candidates = []
# Direct model attribute
if hasattr(ul_model, "model"):
candidates.append(ul_model.model)
# Sometimes nested
if hasattr(ul_model, "model") and hasattr(ul_model.model, "model"):
candidates.append(ul_model.model.model)
# The wrapper itself
if isinstance(ul_model, torch.nn.Module):
candidates.append(ul_model)
# Find a valid model
pt_model = None
loss_fn = None
for candidate in candidates:
if candidate is None or not isinstance(candidate, torch.nn.Module):
continue
pt_model = candidate
# Try to find loss function
if hasattr(candidate, "loss") and callable(getattr(candidate, "loss")):
loss_fn = getattr(candidate, "loss")
elif hasattr(candidate, "compute_loss") and callable(
getattr(candidate, "compute_loss")
):
loss_fn = getattr(candidate, "compute_loss")
break
if pt_model is None:
raise RuntimeError("Could not extract PyTorch model from Ultralytics wrapper")
logger.info(f"Extracted PyTorch model: {type(pt_model)}")
logger.info(
f"Loss function: {type(loss_fn) if loss_fn else 'None (will attempt fallback)'}"
)
return pt_model, loss_fn
# Compatibility function (kept for backwards compatibility)
def train_float32(model: YOLO, data_yaml: str, **train_kwargs) -> Any:
"""
Train YOLO model with Float32YOLODataset (alternative API).
Args:
model: Initialized YOLO model instance
data_yaml: Path to dataset YAML
**train_kwargs: Training parameters
Returns:
Training results dict
"""
return train_with_float32_loader(
model_path=(
model.model_path if hasattr(model, "model_path") else "yolov8s-seg.pt"
),
data_yaml=data_yaml,
**train_kwargs,
)
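
A hedged usage sketch for the training entry point; the dataset path and run name are placeholders, and the base weights mirror the `yolov8s-seg` model used by the project:

```python
from src.utils.train_ultralytics_float import train_with_float32_loader

results = train_with_float32_loader(
    model_path="yolov8s-seg.pt",                        # base segmentation weights
    data_yaml="data/datasets/organelles/data.yaml",     # hypothetical dataset config
    epochs=50,
    imgsz=640,
    batch=8,
    patience=20,
    save_dir="data/models",
    name="organelles_float32",
    lr0=0.001,                                          # forwarded via **kwargs
    device="cuda",
)
print(results["best_model_path"], results["metrics"]["best_loss"])
```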

tests/__init__.py (new, empty file)

tests/show_yolo_seg.py (new file)

@@ -0,0 +1,182 @@
#!/usr/bin/env python3
"""
show_yolo_seg.py
Usage:
python show_yolo_seg.py /path/to/image.jpg /path/to/labels.txt
Supports:
- Segmentation polygons: "class x1 y1 x2 y2 ... xn yn"
- YOLO bbox lines as fallback: "class x_center y_center width height"
Coordinates can be normalized [0..1] or absolute pixels (auto-detected).
"""
import sys
import cv2
import numpy as np
import matplotlib.pyplot as plt
import argparse
from pathlib import Path
import random
def parse_label_line(line):
parts = line.strip().split()
if not parts:
return None
cls = int(float(parts[0]))
coords = [float(x) for x in parts[1:]]
return cls, coords
def coords_are_normalized(coords):
# If every coordinate is between 0 and 1 (inclusive-ish), assume normalized
if not coords:
return False
return max(coords) <= 1.001
def yolo_bbox_to_xyxy(coords, img_w, img_h):
# coords: [xc, yc, w, h] normalized or absolute
xc, yc, w, h = coords[:4]
if max(coords) <= 1.001:
xc *= img_w
yc *= img_h
w *= img_w
h *= img_h
x1 = int(round(xc - w / 2))
y1 = int(round(yc - h / 2))
x2 = int(round(xc + w / 2))
y2 = int(round(yc + h / 2))
return x1, y1, x2, y2
def poly_to_pts(coords, img_w, img_h):
# coords: [x1 y1 x2 y2 ...] either normalized or absolute
if coords_are_normalized(coords):
coords = [
coords[i] * (img_w if i % 2 == 0 else img_h) for i in range(len(coords))
]
pts = np.array(coords, dtype=np.int32).reshape(-1, 2)
return pts
def random_color_for_class(cls):
random.seed(cls) # deterministic per class
return tuple(int(x) for x in np.array([random.randint(0, 255) for _ in range(3)]))
def draw_annotations(img, labels, alpha=0.4, draw_bbox_for_poly=True):
# img: BGR numpy array
overlay = img.copy()
h, w = img.shape[:2]
for cls, coords in labels:
if not coords:
continue
# polygon case (>=6 coordinates)
if len(coords) >= 6:
pts = poly_to_pts(coords, w, h)
color = random_color_for_class(cls)
# fill on overlay
cv2.fillPoly(overlay, [pts], color)
# outline on base image
cv2.polylines(img, [pts], isClosed=True, color=color, thickness=2)
# put class text at first point
x, y = int(pts[0, 0]), int(pts[0, 1]) - 6
cv2.putText(
img,
str(cls),
(x, max(6, y)),
cv2.FONT_HERSHEY_SIMPLEX,
0.6,
(255, 255, 255),
2,
cv2.LINE_AA,
)
if draw_bbox_for_poly:
x, y, w_box, h_box = cv2.boundingRect(pts)
cv2.rectangle(img, (x, y), (x + w_box, y + h_box), color, 1)
# YOLO bbox case (4 coords)
elif len(coords) == 4:
x1, y1, x2, y2 = yolo_bbox_to_xyxy(coords, w, h)
color = random_color_for_class(cls)
cv2.rectangle(img, (x1, y1), (x2, y2), color, 2)
cv2.putText(
img,
str(cls),
(x1, max(6, y1 - 4)),
cv2.FONT_HERSHEY_SIMPLEX,
0.6,
(255, 255, 255),
2,
cv2.LINE_AA,
)
else:
# Unknown / invalid format, skip
continue
# blend overlay for filled polygons
cv2.addWeighted(overlay, alpha, img, 1 - alpha, 0, img)
return img
def load_labels_file(label_path):
labels = []
with open(label_path, "r") as f:
for raw in f:
line = raw.strip()
if not line:
continue
parsed = parse_label_line(line)
if parsed:
labels.append(parsed)
return labels
def main():
parser = argparse.ArgumentParser(
description="Show YOLO segmentation / polygon annotations"
)
parser.add_argument("image", type=str, help="Path to image file")
parser.add_argument("labels", type=str, help="Path to YOLO label file (polygons)")
parser.add_argument(
"--alpha", type=float, default=0.4, help="Polygon fill alpha (0..1)"
)
parser.add_argument(
"--no-bbox", action="store_true", help="Don't draw bounding boxes for polygons"
)
args = parser.parse_args()
img_path = Path(args.image)
lbl_path = Path(args.labels)
if not img_path.exists():
print("Image not found:", img_path)
sys.exit(1)
if not lbl_path.exists():
print("Label file not found:", lbl_path)
sys.exit(1)
img = cv2.imread(str(img_path), cv2.IMREAD_COLOR)
if img is None:
print("Could not load image:", img_path)
sys.exit(1)
labels = load_labels_file(str(lbl_path))
if not labels:
print("No labels parsed from", lbl_path)
# continue and just show image
out = draw_annotations(
img.copy(), labels, alpha=args.alpha, draw_bbox_for_poly=(not args.no_bbox)
)
# Convert BGR -> RGB for matplotlib display
out_rgb = cv2.cvtColor(out, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(10, 10 * out.shape[0] / out.shape[1]))
plt.imshow(out_rgb)
plt.axis("off")
plt.title(f"{img_path.name} ({lbl_path.name})")
plt.show()
if __name__ == "__main__":
main()
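
An illustrative label file and invocation for the viewer (paths and values are made up); the first line is a four-vertex segmentation polygon, the second a plain YOLO bounding box:

```python
from pathlib import Path

Path("data/labels/cell_plate_01.txt").write_text(
    "0 0.12 0.30 0.45 0.28 0.47 0.61 0.15 0.64\n"   # class 0, polygon (x1 y1 ... x4 y4)
    "1 0.50 0.50 0.20 0.25\n"                        # class 1, bbox (xc yc w h)
)
# Then run:
#   python tests/show_yolo_seg.py data/images/cell_plate_01.tif data/labels/cell_plate_01.txt --alpha 0.5
```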


@@ -0,0 +1,109 @@
#!/usr/bin/env python3
"""
Test script for 16-bit TIFF loading and normalization.
"""
import numpy as np
import tifffile
from pathlib import Path
import tempfile
import sys
import os
# Add parent directory to path to import modules
sys.path.insert(0, str(Path(__file__).parent.parent))
from src.utils.image import Image
def create_test_16bit_tiff(output_path: str) -> str:
"""Create a test 16-bit grayscale TIFF file.
Args:
output_path: Path where to save the test TIFF
Returns:
Path to the created TIFF file
"""
# Create a 16-bit grayscale test image (100x100)
# With values ranging from 0 to 65535 (full 16-bit range)
height, width = 100, 100
# Create a gradient pattern
test_data = np.zeros((height, width), dtype=np.uint16)
for i in range(height):
for j in range(width):
# Create a diagonal gradient
test_data[i, j] = int((i + j) / (height + width - 2) * 65535)
# Save as TIFF
tifffile.imwrite(output_path, test_data)
print(f"Created test 16-bit TIFF: {output_path}")
print(f" Shape: {test_data.shape}")
print(f" Dtype: {test_data.dtype}")
print(f" Min value: {test_data.min()}")
print(f" Max value: {test_data.max()}")
return output_path
def test_image_loading():
"""Test loading 16-bit TIFF with the Image class."""
print("\n=== Testing Image Loading ===")
# Create temporary test file
with tempfile.NamedTemporaryFile(suffix=".tif", delete=False) as tmp:
test_path = tmp.name
try:
# Create test image
create_test_16bit_tiff(test_path)
# Load with Image class
print("\nLoading with Image class...")
img = Image(test_path)
print(f"Successfully loaded image:")
print(f" Width: {img.width}")
print(f" Height: {img.height}")
print(f" Channels: {img.channels}")
print(f" Dtype: {img.dtype}")
print(f" Format: {img.format}")
# Test normalization
print("\nTesting normalization to float32 [0-1]...")
normalized = img.to_normalized_float32()
print(f"Normalized image:")
print(f" Shape: {normalized.shape}")
print(f" Dtype: {normalized.dtype}")
print(f" Min value: {normalized.min():.6f}")
print(f" Max value: {normalized.max():.6f}")
print(f" Mean value: {normalized.mean():.6f}")
# Verify normalization
assert normalized.dtype == np.float32, "Dtype should be float32"
assert (
0.0 <= normalized.min() <= normalized.max() <= 1.0
), "Values should be in [0, 1]"
print("\n✓ All tests passed!")
return True
except Exception as e:
print(f"\n✗ Test failed with error: {e}")
import traceback
traceback.print_exc()
return False
finally:
# Cleanup
if os.path.exists(test_path):
os.remove(test_path)
print(f"\nCleaned up test file: {test_path}")
if __name__ == "__main__":
success = test_image_loading()
sys.exit(0 if success else 1)


@@ -0,0 +1,211 @@
"""
Test script for Float32 on-the-fly loading for 16-bit TIFFs.
This test verifies that:
1. Float32YOLODataset can load 16-bit TIFF files
2. Images are converted to float32 [0-1] in memory
3. Grayscale is replicated to 3 channels (RGB)
4. No disk caching is used
5. Full 16-bit precision is preserved
"""
import tempfile
import numpy as np
import tifffile
import torch  # needed at module level for the dtype and uniqueness checks below
from pathlib import Path
import yaml
def create_test_dataset():
"""Create a minimal test dataset with 16-bit TIFF images."""
temp_dir = Path(tempfile.mkdtemp())
dataset_dir = temp_dir / "test_dataset"
# Create directory structure
train_images = dataset_dir / "train" / "images"
train_labels = dataset_dir / "train" / "labels"
train_images.mkdir(parents=True, exist_ok=True)
train_labels.mkdir(parents=True, exist_ok=True)
# Create a 16-bit TIFF test image
img_16bit = np.random.randint(0, 65536, (100, 100), dtype=np.uint16)
img_path = train_images / "test_image.tif"
tifffile.imwrite(str(img_path), img_16bit)
# Create a dummy label file
label_path = train_labels / "test_image.txt"
with open(label_path, "w") as f:
f.write("0 0.5 0.5 0.2 0.2\n") # class_id x_center y_center width height
# Create data.yaml
data_yaml = {
"path": str(dataset_dir),
"train": "train/images",
"val": "train/images", # Use same for val in test
"names": {0: "object"},
"nc": 1,
}
yaml_path = dataset_dir / "data.yaml"
with open(yaml_path, "w") as f:
yaml.safe_dump(data_yaml, f)
print(f"✓ Created test dataset at: {dataset_dir}")
print(f" - Image: {img_path} (shape={img_16bit.shape}, dtype={img_16bit.dtype})")
print(f" - Min value: {img_16bit.min()}, Max value: {img_16bit.max()}")
print(f" - data.yaml: {yaml_path}")
return dataset_dir, img_path, img_16bit
def test_float32_dataset():
"""Test the Float32YOLODataset class directly."""
print("\n=== Testing Float32YOLODataset ===\n")
try:
from src.utils.train_ultralytics_float import Float32YOLODataset
print("✓ Successfully imported Float32YOLODataset")
except ImportError as e:
print(f"✗ Failed to import Float32YOLODataset: {e}")
return False
# Create test dataset
dataset_dir, img_path, original_img = create_test_dataset()
try:
# Initialize the dataset
print("\nInitializing Float32YOLODataset...")
dataset = Float32YOLODataset(
images_dir=str(dataset_dir / "train" / "images"),
labels_dir=str(dataset_dir / "train" / "labels"),
img_size=640,
)
print(f"✓ Float32YOLODataset initialized with {len(dataset)} images")
# Get an item
if len(dataset) > 0:
print("\nGetting first item...")
img_tensor, labels, filename = dataset[0]
print(f"✓ Item retrieved successfully")
print(f" - Image tensor shape: {img_tensor.shape}")
print(f" - Image tensor dtype: {img_tensor.dtype}")
print(f" - Value range: [{img_tensor.min():.6f}, {img_tensor.max():.6f}]")
print(f" - Filename: {filename}")
print(f" - Labels: {len(labels)} annotations")
if labels:
print(
f" - First label shape: {labels[0].shape if len(labels) > 0 else 'N/A'}"
)
# Verify it's float32
if img_tensor.dtype == torch.float32:
print("✓ Correct dtype: float32")
else:
print(f"✗ Wrong dtype: {img_tensor.dtype} (expected float32)")
return False
# Verify it's 3-channel in correct format (C, H, W)
if len(img_tensor.shape) == 3 and img_tensor.shape[0] == 3:
print(
f"✓ Correct format: (C, H, W) = {img_tensor.shape} with 3 channels"
)
else:
print(f"✗ Wrong shape: {img_tensor.shape} (expected (3, H, W))")
return False
# Verify it's in [0, 1] range
if 0.0 <= img_tensor.min() and img_tensor.max() <= 1.0:
print("✓ Values in correct range: [0, 1]")
else:
print(
f"✗ Values out of range: [{img_tensor.min()}, {img_tensor.max()}]"
)
return False
# Verify precision (should have many unique values)
unique_values = len(torch.unique(img_tensor))
print(f" - Unique values: {unique_values}")
if unique_values > 256:
print(f"✓ High precision maintained ({unique_values} > 256 levels)")
else:
print(f"⚠ Low precision: only {unique_values} unique values")
print("\n✓ All Float32YOLODataset tests passed!")
return True
else:
print("✗ No items in dataset")
return False
except Exception as e:
print(f"✗ Error during testing: {e}")
import traceback
traceback.print_exc()
return False
def test_integration():
"""Test integration with train_with_float32_loader."""
print("\n=== Testing Integration with train_with_float32_loader ===\n")
# Create test dataset
dataset_dir, img_path, original_img = create_test_dataset()
data_yaml = dataset_dir / "data.yaml"
print(f"\nTest dataset ready at: {data_yaml}")
print("\nTo test full training, run:")
print(f" from src.utils.train_ultralytics_float import train_with_float32_loader")
print(f" results = train_with_float32_loader(")
print(f" model_path='yolov8n-seg.pt',")
print(f" data_yaml='{data_yaml}',")
print(f" epochs=1,")
print(f" batch=1,")
print(f" imgsz=640")
print(f" )")
print("\nThis will use custom training loop with Float32YOLODataset")
return True
def main():
"""Run all tests."""
print("=" * 70)
print("Float32 Training Loader Test Suite")
print("=" * 70)
results = []
# Test 1: Float32YOLODataset
results.append(("Float32YOLODataset", test_float32_dataset()))
# Test 2: Integration check
results.append(("Integration Check", test_integration()))
# Summary
print("\n" + "=" * 70)
print("Test Summary")
print("=" * 70)
for test_name, passed in results:
status = "✓ PASSED" if passed else "✗ FAILED"
print(f"{status}: {test_name}")
all_passed = all(passed for _, passed in results)
print("=" * 70)
if all_passed:
print("✓ All tests passed!")
else:
print("✗ Some tests failed")
print("=" * 70)
return all_passed
if __name__ == "__main__":
import sys
import torch # Make torch available
success = main()
sys.exit(0 if success else 1)

tests/test_image.py (new file)

@@ -0,0 +1,145 @@
"""
Tests for the Image class.
"""
import pytest
import numpy as np
from pathlib import Path
from src.utils.image import Image, ImageLoadError
class TestImage:
"""Test cases for the Image class."""
def test_load_nonexistent_file(self):
"""Test loading a non-existent file raises ImageLoadError."""
with pytest.raises(ImageLoadError):
Image("nonexistent_file.jpg")
def test_load_unsupported_format(self, tmp_path):
"""Test loading an unsupported format raises ImageLoadError."""
# Create a dummy file with unsupported extension
test_file = tmp_path / "test.txt"
test_file.write_text("not an image")
with pytest.raises(ImageLoadError):
Image(test_file)
def test_supported_extensions(self):
"""Test that supported extensions are correctly defined."""
expected_extensions = [".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp"]
assert Image.SUPPORTED_EXTENSIONS == expected_extensions
def test_image_properties(self, tmp_path):
"""Test image properties after loading."""
# Create a simple test image using numpy and cv2
import cv2
test_img = np.zeros((100, 200, 3), dtype=np.uint8)
test_img[:, :] = [255, 0, 0] # Blue in BGR
test_file = tmp_path / "test.jpg"
cv2.imwrite(str(test_file), test_img)
# Load the image
img = Image(test_file)
# Check properties
assert img.width == 200
assert img.height == 100
assert img.channels == 3
assert img.format == "jpg"
assert img.shape == (100, 200, 3)
assert img.size_bytes > 0
assert img.is_color()
assert not img.is_grayscale()
def test_get_rgb(self, tmp_path):
"""Test RGB conversion."""
import cv2
# Create BGR image
test_img = np.zeros((50, 50, 3), dtype=np.uint8)
test_img[:, :] = [255, 0, 0] # Blue in BGR
test_file = tmp_path / "test_rgb.png"
cv2.imwrite(str(test_file), test_img)
img = Image(test_file)
rgb_data = img.get_rgb()
# RGB should have red channel at 255
assert rgb_data[0, 0, 0] == 0 # R
assert rgb_data[0, 0, 1] == 0 # G
assert rgb_data[0, 0, 2] == 255 # B (was BGR blue)
def test_get_grayscale(self, tmp_path):
"""Test grayscale conversion."""
import cv2
test_img = np.zeros((50, 50, 3), dtype=np.uint8)
test_img[:, :] = [128, 128, 128]
test_file = tmp_path / "test_gray.png"
cv2.imwrite(str(test_file), test_img)
img = Image(test_file)
gray_data = img.get_grayscale()
assert len(gray_data.shape) == 2 # Should be 2D
assert gray_data.shape == (50, 50)
def test_copy(self, tmp_path):
"""Test copying image data."""
import cv2
test_img = np.zeros((50, 50, 3), dtype=np.uint8)
test_file = tmp_path / "test_copy.png"
cv2.imwrite(str(test_file), test_img)
img = Image(test_file)
copied = img.copy()
# Modify copy
copied[0, 0] = [255, 255, 255]
# Original should be unchanged
assert not np.array_equal(img.data[0, 0], copied[0, 0])
def test_resize(self, tmp_path):
"""Test image resizing."""
import cv2
test_img = np.zeros((100, 100, 3), dtype=np.uint8)
test_file = tmp_path / "test_resize.png"
cv2.imwrite(str(test_file), test_img)
img = Image(test_file)
resized = img.resize(50, 50)
assert resized.shape == (50, 50, 3)
# Original should be unchanged
assert img.width == 100
assert img.height == 100
def test_str_repr(self, tmp_path):
"""Test string representation."""
import cv2
test_img = np.zeros((100, 200, 3), dtype=np.uint8)
test_file = tmp_path / "test_str.jpg"
cv2.imwrite(str(test_file), test_img)
img = Image(test_file)
str_repr = str(img)
assert "test_str.jpg" in str_repr
assert "100x200x3" in str_repr
assert "jpg" in str_repr
repr_str = repr(img)
assert "Image" in repr_str
assert "test_str.jpg" in repr_str

File diff suppressed because it is too large.

@@ -0,0 +1,142 @@
#!/usr/bin/env python3
"""
Test script for training dataset preparation with 16-bit TIFFs.
"""
import numpy as np
import tifffile
from pathlib import Path
import tempfile
import sys
import os
import shutil
# Add parent directory to path to import modules
sys.path.insert(0, str(Path(__file__).parent.parent))
from src.utils.image import Image
def test_float32_3ch_conversion():
"""Test conversion of 16-bit TIFF to 16-bit RGB PNG."""
print("\n=== Testing 16-bit RGB PNG Conversion ===")
# Create temporary directory structure
with tempfile.TemporaryDirectory() as tmpdir:
tmpdir = Path(tmpdir)
src_dir = tmpdir / "original"
dst_dir = tmpdir / "converted"
src_dir.mkdir()
dst_dir.mkdir()
# Create test 16-bit TIFF
test_data = np.zeros((100, 100), dtype=np.uint16)
for i in range(100):
for j in range(100):
test_data[i, j] = int((i + j) / 198 * 65535)
test_file = src_dir / "test_16bit.tif"
tifffile.imwrite(test_file, test_data)
print(f"Created test 16-bit TIFF: {test_file}")
print(f" Shape: {test_data.shape}")
print(f" Dtype: {test_data.dtype}")
print(f" Range: [{test_data.min()}, {test_data.max()}]")
# Simulate the conversion process (matching training_tab.py)
print("\nConverting to 16-bit RGB PNG using PIL merge...")
img_obj = Image(test_file)
from PIL import Image as PILImage
# Get uint16 data
uint16_data = img_obj.data
# Use PIL's merge method with 'I;16' channels (proper way for 16-bit RGB)
if len(uint16_data.shape) == 2:
# Grayscale - replicate to RGB
r_img = PILImage.fromarray(uint16_data, mode="I;16")
g_img = PILImage.fromarray(uint16_data, mode="I;16")
b_img = PILImage.fromarray(uint16_data, mode="I;16")
else:
r_img = PILImage.fromarray(uint16_data[:, :, 0], mode="I;16")
g_img = PILImage.fromarray(
(
uint16_data[:, :, 1]
if uint16_data.shape[2] > 1
else uint16_data[:, :, 0]
),
mode="I;16",
)
b_img = PILImage.fromarray(
(
uint16_data[:, :, 2]
if uint16_data.shape[2] > 2
else uint16_data[:, :, 0]
),
mode="I;16",
)
# Merge channels into RGB
rgb_img = PILImage.merge("RGB", (r_img, g_img, b_img))
# Save as PNG
output_file = dst_dir / "test_16bit_rgb.png"
rgb_img.save(output_file)
print(f"Saved 16-bit RGB PNG: {output_file}")
print(f" PIL mode after merge: {rgb_img.mode}")
# Verify the output - Load with OpenCV (as YOLO does)
import cv2
loaded = cv2.imread(str(output_file), cv2.IMREAD_UNCHANGED)
print(f"\nVerifying output (loaded with OpenCV):")
print(f" Shape: {loaded.shape}")
print(f" Dtype: {loaded.dtype}")
print(f" Channels: {loaded.shape[2] if len(loaded.shape) == 3 else 1}")
print(f" Range: [{loaded.min()}, {loaded.max()}]")
print(f" Unique values: {len(np.unique(loaded[:,:,0]))}")
# Assertions
assert loaded.dtype == np.uint16, f"Expected uint16, got {loaded.dtype}"
assert loaded.shape[2] == 3, f"Expected 3 channels, got {loaded.shape[2]}"
assert (
loaded.min() >= 0 and loaded.max() <= 65535
), f"Expected [0,65535] range, got [{loaded.min()}, {loaded.max()}]"
# Verify all channels are identical (replicated grayscale)
assert np.array_equal(
loaded[:, :, 0], loaded[:, :, 1]
), "Channel 0 and 1 should be identical"
assert np.array_equal(
loaded[:, :, 0], loaded[:, :, 2]
), "Channel 0 and 2 should be identical"
# Verify no data loss
unique_vals = len(np.unique(loaded[:, :, 0]))
print(f"\n Precision check:")
print(f" Unique values in channel: {unique_vals}")
print(f" Source unique values: {len(np.unique(test_data))}")
assert unique_vals == len(
np.unique(test_data)
), f"Expected {len(np.unique(test_data))} unique values, got {unique_vals}"
print("\n✓ All conversion tests passed!")
print(" - uint16 dtype preserved")
print(" - 3 channels created")
print(" - Range [0-65535] maintained")
print(" - No precision loss from conversion")
print(" - Channels properly replicated")
return True
if __name__ == "__main__":
try:
success = test_float32_3ch_conversion()
sys.exit(0 if success else 1)
except Exception as e:
print(f"\n✗ Test failed with error: {e}")
import traceback
traceback.print_exc()
sys.exit(1)


@@ -0,0 +1,150 @@
#!/usr/bin/env python3
"""
Test script for YOLO preprocessing of 16-bit TIFF images with float32 passthrough.
Verifies that no uint8 conversion occurs and data is preserved.
"""
import numpy as np
import tifffile
from pathlib import Path
import tempfile
import sys
import os
# Add parent directory to path to import modules
sys.path.insert(0, str(Path(__file__).parent.parent))
from src.model.yolo_wrapper import YOLOWrapper
def create_test_16bit_tiff(output_path: str) -> str:
"""Create a test 16-bit grayscale TIFF file.
Args:
output_path: Path where to save the test TIFF
Returns:
Path to the created TIFF file
"""
# Create a 16-bit grayscale test image (200x200)
# With specific values to test precision preservation
height, width = 200, 200
# Create a gradient pattern with the full 16-bit range
test_data = np.zeros((height, width), dtype=np.uint16)
for i in range(height):
for j in range(width):
# Create a diagonal gradient using full 16-bit range
test_data[i, j] = int((i + j) / (height + width - 2) * 65535)
# Save as TIFF
tifffile.imwrite(output_path, test_data)
print(f"Created test 16-bit TIFF: {output_path}")
print(f" Shape: {test_data.shape}")
print(f" Dtype: {test_data.dtype}")
print(f" Min value: {test_data.min()}")
print(f" Max value: {test_data.max()}")
print(
f" Sample values: {test_data[50, 50]}, {test_data[100, 100]}, {test_data[150, 150]}"
)
return output_path
def test_float32_passthrough():
"""Test that 16-bit TIFF preprocessing passes float32 directly without uint8 conversion."""
print("\n=== Testing Float32 Passthrough (NO uint8) ===")
# Create temporary test file
with tempfile.NamedTemporaryFile(suffix=".tif", delete=False) as tmp:
test_path = tmp.name
try:
# Create test image
create_test_16bit_tiff(test_path)
# Create YOLOWrapper instance
print("\nTesting YOLOWrapper._prepare_source() for float32 passthrough...")
wrapper = YOLOWrapper()
# Call _prepare_source to preprocess the image
prepared_source, cleanup_path = wrapper._prepare_source(test_path)
print(f"\nPreprocessing result:")
print(f" Original path: {test_path}")
print(f" Prepared source type: {type(prepared_source)}")
# Verify it returns a numpy array (not a file path)
if isinstance(prepared_source, np.ndarray):
print(
f"\n✓ SUCCESS: Prepared source is a numpy array (float32 passthrough)"
)
print(f" Shape: {prepared_source.shape}")
print(f" Dtype: {prepared_source.dtype}")
print(f" Min value: {prepared_source.min():.6f}")
print(f" Max value: {prepared_source.max():.6f}")
print(f" Mean value: {prepared_source.mean():.6f}")
# Verify it's float32 in [0, 1] range
assert (
prepared_source.dtype == np.float32
), f"Expected float32, got {prepared_source.dtype}"
assert (
0.0 <= prepared_source.min() <= prepared_source.max() <= 1.0
), f"Expected values in [0, 1], got [{prepared_source.min()}, {prepared_source.max()}]"
# Verify it has 3 channels (RGB)
assert (
prepared_source.shape[2] == 3
), f"Expected 3 channels (RGB), got {prepared_source.shape[2]}"
# Verify no quantization to 256 levels (would happen with uint8 conversion)
unique_values = len(np.unique(prepared_source))
print(f" Unique values: {unique_values}")
# With float32, we should have much more than 256 unique values
if unique_values > 256:
print(f"\n✓ SUCCESS: Data has {unique_values} unique values (> 256)")
print(f" This confirms NO uint8 quantization occurred!")
else:
print(f"\n✗ WARNING: Data has only {unique_values} unique values")
print(f" This might indicate uint8 quantization happened")
# Sample some values to show precision
print(f"\n Sample normalized values:")
print(f" [50, 50]: {prepared_source[50, 50, 0]:.8f}")
print(f" [100, 100]: {prepared_source[100, 100, 0]:.8f}")
print(f" [150, 150]: {prepared_source[150, 150, 0]:.8f}")
# No cleanup needed since we returned array directly
assert (
cleanup_path is None
), "Cleanup path should be None for float32 pass through"
print("\n✓ All float32 passthrough tests passed!")
return True
else:
print(f"\n✗ FAILED: Prepared source is a file path: {prepared_source}")
print(f" This means data was saved to disk, not passed as float32 array")
if cleanup_path and os.path.exists(cleanup_path):
os.remove(cleanup_path)
return False
except Exception as e:
print(f"\n✗ Test failed with error: {e}")
import traceback
traceback.print_exc()
return False
finally:
# Cleanup
if os.path.exists(test_path):
os.remove(test_path)
print(f"\nCleaned up test file: {test_path}")
if __name__ == "__main__":
success = test_float32_passthrough()
sys.exit(0 if success else 1)


@@ -0,0 +1,126 @@
#!/usr/bin/env python3
"""
Test script for YOLO preprocessing of 16-bit TIFF images.
"""
import numpy as np
import tifffile
from pathlib import Path
import tempfile
import sys
import os
# Add parent directory to path to import modules
sys.path.insert(0, str(Path(__file__).parent.parent))
from src.model.yolo_wrapper import YOLOWrapper
from src.utils.image import Image
from PIL import Image as PILImage
def create_test_16bit_tiff(output_path: str) -> str:
"""Create a test 16-bit grayscale TIFF file.
Args:
output_path: Path where to save the test TIFF
Returns:
Path to the created TIFF file
"""
# Create a 16-bit grayscale test image (200x200)
# With values ranging from 0 to 65535 (full 16-bit range)
height, width = 200, 200
# Create a gradient pattern
test_data = np.zeros((height, width), dtype=np.uint16)
for i in range(height):
for j in range(width):
# Create a diagonal gradient
test_data[i, j] = int((i + j) / (height + width - 2) * 65535)
# Save as TIFF
tifffile.imwrite(output_path, test_data)
print(f"Created test 16-bit TIFF: {output_path}")
print(f" Shape: {test_data.shape}")
print(f" Dtype: {test_data.dtype}")
print(f" Min value: {test_data.min()}")
print(f" Max value: {test_data.max()}")
return output_path
def test_yolo_preprocessing():
"""Test YOLO preprocessing of 16-bit TIFF images."""
print("\n=== Testing YOLO Preprocessing of 16-bit TIFF ===")
# Create temporary test file
with tempfile.NamedTemporaryFile(suffix=".tif", delete=False) as tmp:
test_path = tmp.name
try:
# Create test image
create_test_16bit_tiff(test_path)
# Create YOLOWrapper instance (no actual model loading needed for this test)
print("\nTesting YOLOWrapper._prepare_source()...")
wrapper = YOLOWrapper()
# Call _prepare_source to preprocess the image
prepared_path, cleanup_path = wrapper._prepare_source(test_path)
print(f"\nPreprocessing complete:")
print(f" Original path: {test_path}")
print(f" Prepared path: {prepared_path}")
print(f" Cleanup path: {cleanup_path}")
# Verify the prepared image exists
assert os.path.exists(prepared_path), "Prepared image should exist"
# Load the prepared image and verify it's uint8 RGB
prepared_img = PILImage.open(prepared_path)
print(f"\nPrepared image properties:")
print(f" Mode: {prepared_img.mode}")
print(f" Size: {prepared_img.size}")
print(f" Format: {prepared_img.format}")
# Convert to numpy to check values
img_array = np.array(prepared_img)
print(f" Shape: {img_array.shape}")
print(f" Dtype: {img_array.dtype}")
print(f" Min value: {img_array.min()}")
print(f" Max value: {img_array.max()}")
print(f" Mean value: {img_array.mean():.2f}")
# Verify it's RGB uint8
assert prepared_img.mode == "RGB", "Prepared image should be RGB"
assert img_array.dtype == np.uint8, "Prepared image should be uint8"
assert img_array.shape[2] == 3, "Prepared image should have 3 channels"
assert (
0 <= img_array.min() <= img_array.max() <= 255
), "Values should be in [0, 255]"
# Cleanup prepared file if needed
if cleanup_path and os.path.exists(cleanup_path):
os.remove(cleanup_path)
print(f"\nCleaned up prepared image: {cleanup_path}")
print("\n✓ All YOLO preprocessing tests passed!")
return True
except Exception as e:
print(f"\n✗ Test failed with error: {e}")
import traceback
traceback.print_exc()
return False
finally:
# Cleanup
if os.path.exists(test_path):
os.remove(test_path)
print(f"Cleaned up test file: {test_path}")
if __name__ == "__main__":
success = test_yolo_preprocessing()
sys.exit(0 if success else 1)