# Reports & Comparison Results
GoDiffy generates detailed reports for every comparison, helping you quickly identify and understand visual changes in your applications.
## Understanding Comparison Results

### Result Overview
Each comparison produces:
- Overall Similarity Score: 0-100% indicating how similar the images are
- Pass/Fail Status: Based on your threshold setting
- Individual Image Results: Per-image comparison details
- Visual Diff Images: Highlighted differences
- Statistical Analysis: Detailed metrics
### Similarity Score Explained
The similarity score represents how visually similar two images are:
| Score Range | Meaning | Typical Cause |
|---|---|---|
| 95-100% | Nearly identical | Minor rendering differences, anti-aliasing |
| 85-95% | Very similar | Small content changes, color shifts |
| 70-85% | Moderately similar | Layout changes, new elements |
| 50-70% | Somewhat similar | Significant changes, restructuring |
| 0-50% | Very different | Major redesign, completely different content |
Most teams use 95% as a threshold for catching meaningful changes while ignoring minor rendering variations.
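In code, the pass/fail decision is just a comparison of the score against the threshold. A minimal sketch in Python (the function name and defaults are ours, mirroring the 95% guidance above, not a GoDiffy API):

```python
def passes(similarity_pct: float, threshold_pct: float = 95.0) -> bool:
    """Gate a comparison: pass when the similarity meets the threshold."""
    return similarity_pct >= threshold_pct

print(passes(96.5))  # True  -- minor rendering noise is tolerated
print(passes(92.1))  # False -- flagged for manual review
```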
## Reading Visual Diffs

### Diff Image Components

GoDiffy generates three types of images for each comparison:

1. **Base Image (Original)**: your baseline or reference image, typically from your main branch or production.
2. **Compare Image (New)**: the new version you're testing, typically from a feature branch or staging environment.
3. **Diff Image (Differences)**: a visual representation showing:
   - Red highlights: changed pixels
   - Grayscale background: unchanged areas
   - Intensity: darker red = more significant change
### Example Diff Analysis

```
Base Image:         Compare Image:      Diff Image:
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Header    │     │   Header    │     │   Header    │
│ ┌─────────┐ │     │ ┌─────────┐ │     │ ┌─────────┐ │
│ │  Logo   │ │     │ │New Logo │ │     │ │ 🔴🔴🔴  │ │  ← Logo changed
│ └─────────┘ │     │ └─────────┘ │     │ └─────────┘ │
│   Content   │     │   Content   │     │   Content   │
│   Footer    │     │   Footer    │     │   Footer    │
└─────────────┘     └─────────────┘     └─────────────┘
```
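If you want to reproduce this kind of overlay locally, the sketch below approximates it with Pillow and NumPy: unchanged areas stay grayscale while changed pixels blend toward red in proportion to the change magnitude. This illustrates the idea only; it is not GoDiffy's actual rendering code.

```python
import numpy as np
from PIL import Image

def make_diff_image(base_path: str, compare_path: str, out_path: str) -> None:
    """Render changed pixels in red over a grayscale copy of the base image."""
    # Assumes both screenshots have identical dimensions.
    base = np.asarray(Image.open(base_path).convert("RGB"), dtype=np.int16)
    comp = np.asarray(Image.open(compare_path).convert("RGB"), dtype=np.int16)

    # Per-pixel change magnitude, normalized to 0..1.
    delta = np.abs(base - comp).mean(axis=2) / 255.0

    # Grayscale background for unchanged areas.
    gray = np.asarray(Image.open(base_path).convert("L"))
    diff = np.stack([gray, gray, gray], axis=2).astype(np.float64)

    # Blend changed pixels toward red, stronger changes getting more red.
    changed = delta > 0.05              # small tolerance for rendering noise
    alpha = delta[..., None]            # (H, W, 1) blend weights
    red = np.array([220.0, 30.0, 30.0])
    diff[changed] = (1 - alpha[changed]) * diff[changed] + alpha[changed] * red

    Image.fromarray(diff.astype(np.uint8)).save(out_path)
```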
## Comparison Algorithms
GoDiffy supports three comparison algorithms, each suited for different use cases:
### SSIM (Structural Similarity Index)
Best for: General web application testing
How it works:
- Analyzes luminance, contrast, and structure
- Tolerates minor rendering differences
- Focuses on perceptual changes
Use when:
- Testing web applications with dynamic content
- You want to ignore minor anti-aliasing differences
- You care about meaningful visual changes
Example:

```
Score: 96.5%
Status: ✅ Pass (threshold: 95%)
Changes: Minor font rendering difference
```
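GoDiffy's internal SSIM implementation isn't shown here, but you can reproduce the metric locally with scikit-image to sanity-check a score (paths are placeholders; both images must have the same dimensions):

```python
from skimage.io import imread
from skimage.metrics import structural_similarity

base = imread("baseline/homepage.png")[..., :3]  # assumes RGB(A); drops alpha
comp = imread("feature/homepage.png")[..., :3]

# channel_axis=2 tells scikit-image the last axis holds the color channels.
score = structural_similarity(base, comp, channel_axis=2)
print(f"SSIM: {score * 100:.1f}%")  # e.g. SSIM: 96.5%
```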
### MSE (Mean Squared Error)
Best for: Pixel-perfect requirements
How it works:
- Compares every pixel exactly
- Highly sensitive to any change
- No tolerance for variations
Use when:
- Testing design systems or component libraries
- You need exact pixel matching
- Working with static, controlled content
Example:

```
Score: 89.2%
Status: ❌ Fail (threshold: 95%)
Changes: 1px border width difference detected
```
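MSE itself is a raw error value, so turning it into a 0-100% score requires a normalization choice. The sketch below divides by the worst possible per-pixel error; that mapping is our assumption for illustration, not GoDiffy's documented formula:

```python
import numpy as np
from PIL import Image

def mse_score(base_path: str, compare_path: str) -> float:
    """Return a 0-100 similarity score derived from mean squared error."""
    a = np.asarray(Image.open(base_path).convert("RGB"), dtype=np.float64)
    b = np.asarray(Image.open(compare_path).convert("RGB"), dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    # Normalize by the worst possible squared error (255^2), then invert.
    return 100.0 * (1.0 - mse / 255.0**2)
```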
### Structural Analysis
Best for: Layout-focused testing
How it works:
- Focuses on element positioning and structure
- Less sensitive to color/content changes
- Detects layout shifts and reflows
Use when:
- Testing responsive layouts
- Detecting layout regressions
- Monitoring element positioning
Example:

```
Score: 92.1%
Status: ❌ Fail (threshold: 95%)
Changes: Button moved 5px down
```
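GoDiffy's structural algorithm isn't public, so as a rough stand-in for the idea, this sketch compares Canny edge maps: element boundaries dominate the score while colors and fills are largely ignored. Treat it as a hypothetical illustration only.

```python
import numpy as np
from skimage import color, feature, io

def layout_similarity(base_path: str, compare_path: str) -> float:
    """Score edge-map agreement as a crude proxy for layout structure."""
    # Assumes same-size RGB(A) screenshots; any alpha channel is dropped.
    base = color.rgb2gray(io.imread(base_path)[..., :3])
    comp = color.rgb2gray(io.imread(compare_path)[..., :3])
    edges_base = feature.canny(base)
    edges_comp = feature.canny(comp)
    # Fraction of pixels whose edge/non-edge classification agrees.
    return 100.0 * float(np.mean(edges_base == edges_comp))
```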
## Viewing Reports

### In the Dashboard

1. Navigate to the Reports page
2. Browse saved reports by date, site, or branch
3. Click a report to view its details
4. Explore the individual comparisons within the report
### Report Details Page

Each report shows:

#### Header Information
- Report ID: Unique identifier
- Created Date: When the comparison ran
- Site Name: Which site was tested
- Branch Comparison: Base vs Compare branches
- Overall Status: Pass/Fail
#### Summary Statistics

```
Total Images: 24
Passed: 22 (91.7%)
Failed: 2 (8.3%)
Average Similarity: 97.3%
```
#### Individual Results Table
| Image | Base | Compare | Similarity | Status | Actions |
|---|---|---|---|---|---|
| homepage.png | ✓ | ✓ | 98.5% | ✅ Pass | View Diff |
| checkout.png | ✓ | ✓ | 92.1% | ❌ Fail | View Diff |
| profile.png | ✓ | ✓ | 99.2% | ✅ Pass | View Diff |
## Interpreting Results

### Common Scenarios

#### ✅ All Tests Pass (95%+ similarity)
What it means:
- No significant visual changes detected
- Safe to merge/deploy
Action:
- Review the report for peace of mind
- Proceed with confidence
#### ⚠️ Some Tests Fail (Below threshold)
What it means:
- Visual changes detected in specific areas
- Requires manual review
Action:
- Review diff images to understand changes
- Determine if changes are intentional:
  - ✅ Expected: update the baseline and approve the changes
  - ❌ Unexpected: fix the regression
- Update tests if needed
#### ❌ Many Tests Fail
What it means:
- Significant visual changes across multiple screens
- Major redesign or potential issue
Action:
- Verify the comparison is correct (right branches?)
- Check for systematic issues:
  - CSS changes affecting multiple pages
  - Font loading issues
  - Responsive breakpoint changes
- Decide on next steps:
  - If intentional: bulk-update baselines
  - If unintentional: investigate and fix
### False Positives
Sometimes you'll see failures for acceptable changes:
#### Dynamic Content

Problem: Timestamps, user-specific data, random elements

Solution:
- Mask dynamic regions before comparison (see the sketch after this list)
- Use data fixtures for consistent content
- Exclude specific elements from screenshots
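One way to mask dynamic regions before comparison is to paint them a flat color in both the base and compare screenshots so they can never produce a diff. The coordinates below are made up for illustration:

```python
from PIL import Image, ImageDraw

# (left, top, right, bottom) boxes to neutralize -- hypothetical coordinates.
DYNAMIC_REGIONS = [
    (860, 20, 1180, 60),   # e.g. a timestamp in the header
    (40, 400, 600, 520),   # e.g. a "recently viewed" carousel
]

def mask_dynamic_regions(path: str, out_path: str) -> None:
    """Fill known-dynamic regions with flat gray so they never diff."""
    img = Image.open(path).convert("RGB")
    draw = ImageDraw.Draw(img)
    for box in DYNAMIC_REGIONS:
        draw.rectangle(box, fill=(128, 128, 128))
    img.save(out_path)

# Apply the identical mask to both sides of the comparison.
mask_dynamic_regions("baseline/homepage.png", "masked/base.png")
mask_dynamic_regions("feature/homepage.png", "masked/compare.png")
```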
#### Font Rendering

Problem: Different OS/browser font rendering

Solution:
- Use web fonts consistently
- Run tests in containerized environments
- Increase threshold slightly (e.g., 94% instead of 95%)
#### Animation Timing

Problem: Screenshots captured mid-animation

Solution:
- Wait for animations to complete
- Disable animations in test environment
- Use consistent timing in screenshot capture
## Managing Reports

### Saving Reports

Reports are automatically saved when:
- ✅ The comparison completes successfully
- ✅ You're on a Pro or Enterprise plan

❌ On the Free plan, reports are temporary and expire after 24 hours.
### Downloading Reports

JSON Format:

```json
{
  "report_id": "rpt_abc123",
  "created_at": "2024-01-15T10:30:00Z",
  "site_id": "site_xyz",
  "base_folder": "/my-app/main/abc123",
  "compare_folder": "/my-app/feature/def456",
  "results": [
    {
      "image_name": "homepage.png",
      "similarity": 98.5,
      "status": "pass",
      "algorithm": "ssim"
    }
  ]
}
```
Use cases:
- Integration with external tools
- Custom reporting dashboards
- Long-term archival
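Because a downloaded report is plain JSON, wiring it into external tools is straightforward. For example, a minimal script that lists the failed comparisons from the schema above (the filename is a placeholder):

```python
import json

with open("rpt_abc123.json") as f:
    report = json.load(f)

failed = [r for r in report["results"] if r["status"] != "pass"]
print(f"{len(failed)} of {len(report['results'])} comparisons failed")
for r in failed:
    print(f"  {r['image_name']}: {r['similarity']}% ({r['algorithm']})")
```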
### Deleting Reports

1. Go to the Reports page
2. Find the report you want to delete
3. Click the trash icon
4. Confirm the deletion

**Warning:** Deleted reports cannot be recovered. Download important reports before deleting.
## GitHub Actions Integration

### Automatic Report Comments

When using GitHub Actions, GoDiffy automatically posts comparison results as PR comments:

```markdown
## 🎨 Visual Regression Test Results

**Status:** ❌ 2 changes detected

### Summary

- Total Images: 24
- Passed: 22 (91.7%)
- Failed: 2 (8.3%)
- Average Similarity: 97.3%

### Failed Comparisons

| Image | Similarity | Diff |
|-------|-----------|------|
| checkout.png | 92.1% | [View Diff](https://app.godiffy.com/...) |
| profile.png | 89.5% | [View Diff](https://app.godiffy.com/...) |

[View Full Report](https://app.godiffy.com/reports/rpt_abc123)
```
### Status Checks

GoDiffy can block PR merges based on comparison results:

```yaml
# .github/workflows/visual-testing.yml
- name: Visual Regression Testing
  uses: godiffy/godiffy-action@v1
  with:
    api-key: ${{ secrets.GODIFFY_API_KEY }}
    site-id: ${{ secrets.GODIFFY_SITE_ID }}
    threshold: '0.95'
    fail-on-changes: true # Fail the workflow if changes are detected
```
## Advanced Analysis

### Trend Analysis
Track visual stability over time:
- Comparison History: See how similarity scores change
- Failure Patterns: Identify frequently failing screens
- Baseline Drift: Detect gradual visual changes
### Batch Comparisons

Compare multiple folders at once:

```bash
# Compare all feature branches against main
curl -X POST https://api.godiffy.com/api/batch-compare \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "site_id": "site_xyz",
    "base_folder": "/my-app/main/latest",
    "compare_folders": [
      "/my-app/feature-a/latest",
      "/my-app/feature-b/latest",
      "/my-app/feature-c/latest"
    ]
  }'
```
### Region-Based Analysis
Focus on specific areas of your screenshots:
- Define regions of interest (ROI)
- Compare only critical UI elements
- Ignore dynamic content areas
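If you pre-process screenshots yourself, you can approximate region-based analysis by cropping both images to the region of interest before scoring. A sketch with Pillow and scikit-image (coordinates and paths are illustrative):

```python
import numpy as np
from PIL import Image
from skimage.metrics import structural_similarity

# Region of interest as (left, top, right, bottom) -- hypothetical values.
CHECKOUT_BUTTON = (500, 640, 780, 720)

def region_similarity(base_path: str, compare_path: str, roi) -> float:
    """Compare only the cropped region of interest, ignoring the rest."""
    base = np.asarray(Image.open(base_path).convert("RGB").crop(roi))
    comp = np.asarray(Image.open(compare_path).convert("RGB").crop(roi))
    return structural_similarity(base, comp, channel_axis=2) * 100.0

print(f"{region_similarity('base.png', 'compare.png', CHECKOUT_BUTTON):.1f}%")
```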
## Best Practices

### 1. Consistent Screenshot Capture
Do:
- Use the same viewport size
- Wait for page load completion
- Disable animations
- Use consistent test data
Don't:
- Capture during animations
- Use random or time-based data
- Vary viewport sizes
- Include dynamic ads or content
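A capture routine that follows the lists above might look like this with Playwright (URL, output path, and viewport are placeholders; `reduced_motion` asks pages to suppress CSS animations but won't stop JavaScript-driven ones):

```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    # A fixed viewport keeps captures comparable across runs.
    page = browser.new_page(viewport={"width": 1280, "height": 800})
    # Hint the page to honor reduced motion, cutting most CSS animations.
    page.emulate_media(reduced_motion="reduce")
    page.goto("https://staging.example.com/checkout", wait_until="networkidle")
    page.screenshot(path="screenshots/checkout.png", full_page=True)
    browser.close()
```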
### 2. Meaningful Baselines
Update baselines when:
- ✅ Intentional design changes
- ✅ New features added
- ✅ Bug fixes that change visuals
Don't update baselines for:
- ❌ Unintentional changes
- ❌ Regressions
- ❌ Failed tests you haven't reviewed
### 3. Threshold Selection

Start conservative:
- Begin with a 95% threshold
- Adjust based on your false-positive rate
- Use different thresholds for different pages if needed (see the sketch after this section)
Consider:
- Content type (static vs dynamic)
- Acceptable change tolerance
- Team's review capacity
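Per-page thresholds are easy to keep in your own test harness; the lookup table below is a hypothetical local convention, not a GoDiffy setting:

```python
# Stricter thresholds for static pages, looser for content-heavy ones.
THRESHOLDS = {
    "default": 95.0,
    "homepage.png": 97.0,   # static hero section, tight tolerance
    "dashboard.png": 92.0,  # dynamic charts, looser tolerance
}

def threshold_for(image_name: str) -> float:
    return THRESHOLDS.get(image_name, THRESHOLDS["default"])
```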
## Troubleshooting
Need help interpreting results or resolving failed comparisons? Jump to Troubleshooting → Reports & comparisons for consolidated guidance on missing images, unexpected failures, and PR comment issues.
## Next Steps
- Set Up GitHub Actions - Automate comparisons
- Manage Sites - Organize your tests
- API Reference - Programmatic access
- Pricing Plans - Upgrade for more features
## Need Help?
- Contact Support - We're here to help
- API Documentation - Full API reference
- Community Forum - Ask questions