Reports & Comparison Results

GoDiffy generates detailed reports for every comparison, helping you quickly identify and understand visual changes in your applications.

Understanding Comparison Results

Result Overview

Each comparison produces:

  • Overall Similarity Score: 0-100% indicating how similar the images are
  • Pass/Fail Status: Based on your threshold setting
  • Individual Image Results: Per-image comparison details
  • Visual Diff Images: Highlighted differences
  • Statistical Analysis: Detailed metrics

Similarity Score Explained

The similarity score represents how visually similar two images are:

Score Range    Meaning               Typical Cause
95-100%        Nearly identical      Minor rendering differences, anti-aliasing
85-95%         Very similar          Small content changes, color shifts
70-85%         Moderately similar    Layout changes, new elements
50-70%         Somewhat similar      Significant changes, restructuring
0-50%          Very different        Major redesign, completely different content

Setting Thresholds

Most teams use 95% as a threshold for catching meaningful changes while ignoring minor rendering variations.
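
For intuition, the pass/fail decision is simply the reported similarity checked against that threshold. A minimal sketch, illustrative only and not GoDiffy's internal code:

# Illustrative only: how a similarity score maps to pass/fail for a threshold.
def passes(similarity_pct: float, threshold_pct: float = 95.0) -> bool:
    """True when the comparison meets or exceeds the configured threshold."""
    return similarity_pct >= threshold_pct

print(passes(96.5))        # True  -> anti-aliasing noise is tolerated
print(passes(92.1))        # False -> a real content or layout change needs review
print(passes(99.0, 99.5))  # False -> stricter threshold for pixel-critical pages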

Reading Visual Diffs

Diff Image Components

GoDiffy generates three types of images for each comparison:

1. Base Image (Original)

Your baseline or reference image—typically from your main branch or production.

2. Compare Image (New)

The new version you're testing—typically from a feature branch or staging environment.

3. Diff Image (Differences)

A visual representation showing:

  • Red highlights: Changed pixels
  • Grayscale background: Unchanged areas
  • Intensity: Darker red = more significant change
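
To make the color coding concrete, here is a minimal sketch of how such an overlay could be produced with Pillow and NumPy. It illustrates the idea only, not GoDiffy's actual rendering pipeline, and it assumes both screenshots have identical dimensions.

# Sketch only (not GoDiffy's implementation): changed pixels rendered in red
# over a grayscale copy of the base image; assumes both images are the same size.
import numpy as np
from PIL import Image

def make_diff_overlay(base_path: str, compare_path: str, out_path: str, tol: int = 16) -> None:
    base = Image.open(base_path).convert("RGB")
    compare = Image.open(compare_path).convert("RGB")

    a = np.asarray(base, dtype=np.int16)
    b = np.asarray(compare, dtype=np.int16)
    delta = np.abs(a - b).max(axis=2)        # per-pixel change magnitude
    changed = delta > tol                    # ignore tiny rendering noise

    gray = np.asarray(base.convert("L"))     # unchanged areas stay grayscale
    out = np.stack([gray, gray, gray], axis=2).astype(np.uint8)

    # "Darker red = more significant change": larger deltas get a deeper red.
    red = np.clip(255 - delta[changed], 96, 255).astype(np.uint8)
    out[changed] = 0
    out[changed, 0] = red

    Image.fromarray(out).save(out_path)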

Example Diff Analysis

Base Image:          Compare Image:       Diff Image:
┌─────────────┐      ┌─────────────┐      ┌─────────────┐
│ Header      │      │ Header      │      │ Header      │
│ ┌─────────┐ │      │ ┌─────────┐ │      │ ┌─────────┐ │
│ │ Logo    │ │      │ │ New Logo│ │      │ │ 🔴🔴🔴  │ │  ← Logo changed
│ └─────────┘ │      │ └─────────┘ │      │ └─────────┘ │
│ Content     │      │ Content     │      │ Content     │
│ Footer      │      │ Footer      │      │ Footer      │
└─────────────┘      └─────────────┘      └─────────────┘

Comparison Algorithms

GoDiffy supports three comparison algorithms, each suited for different use cases:

SSIM (Structural Similarity Index)

Best for: General web application testing

How it works:

  • Analyzes luminance, contrast, and structure
  • Tolerates minor rendering differences
  • Focuses on perceptual changes

Use when:

  • Testing web applications with dynamic content
  • You want to ignore minor anti-aliasing differences
  • You care about meaningful visual changes

Example:

Score: 96.5%
Status: ✅ Pass (threshold: 95%)
Changes: Minor font rendering difference
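
The exact SSIM parameters GoDiffy uses aren't documented here, but you can roughly approximate a score locally with scikit-image to sanity-check a result. A sketch, with placeholder file paths and assuming both images share the same dimensions:

# Rough local SSIM approximation (not guaranteed to match GoDiffy's settings).
import numpy as np
from PIL import Image
from skimage.metrics import structural_similarity

base = np.asarray(Image.open("base/homepage.png").convert("L"))
compare = np.asarray(Image.open("compare/homepage.png").convert("L"))

score = structural_similarity(base, compare, data_range=255)  # 0.0 .. 1.0
similarity_pct = score * 100

print(f"Score: {similarity_pct:.1f}%")
print("Status:", "✅ Pass" if similarity_pct >= 95 else "❌ Fail (threshold: 95%)")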

MSE (Mean Squared Error)

Best for: Pixel-perfect requirements

How it works:

  • Compares every pixel exactly
  • Highly sensitive to any change
  • No tolerance for variations

Use when:

  • Testing design systems or component libraries
  • You need exact pixel matching
  • Working with static, controlled content

Example:

Score: 89.2%
Status: ❌ Fail (threshold: 95%)
Changes: 1px border width difference detected
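
Because MSE penalizes every differing pixel, even a 1px border change can drag the score below the threshold. A minimal sketch of the metric itself; the conversion from MSE to a 0-100% score below is an assumption for illustration, not GoDiffy's exact normalization:

# MSE over RGB pixels; the percentage conversion is illustrative only.
import numpy as np
from PIL import Image

base = np.asarray(Image.open("base/button.png").convert("RGB"), dtype=np.float64)
compare = np.asarray(Image.open("compare/button.png").convert("RGB"), dtype=np.float64)

mse = np.mean((base - compare) ** 2)           # 0 means pixel-identical
similarity_pct = (1 - mse / 255.0 ** 2) * 100  # one possible 0-100% mapping

print(f"MSE: {mse:.2f} -> similarity: {similarity_pct:.1f}%")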

Structural Analysis

Best for: Layout-focused testing

How it works:

  • Focuses on element positioning and structure
  • Less sensitive to color/content changes
  • Detects layout shifts and reflows

Use when:

  • Testing responsive layouts
  • Detecting layout regressions
  • Monitoring element positioning

Example:

Score: 92.1%
Status: ❌ Fail (threshold: 95%)
Changes: Button moved 5px down
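
GoDiffy doesn't document the internals of this mode here, but one common way to approximate layout-only comparison is to compare edge maps of the two screenshots, which discards most color and text detail while preserving element positions. A hedged sketch, not GoDiffy's algorithm:

# Illustrative approximation of layout-focused comparison: compare Sobel edge
# maps so element positions matter more than color or content changes.
import numpy as np
from PIL import Image
from skimage.filters import sobel
from skimage.metrics import structural_similarity

base = np.asarray(Image.open("base/pricing.png").convert("L"), dtype=np.float64) / 255.0
compare = np.asarray(Image.open("compare/pricing.png").convert("L"), dtype=np.float64) / 255.0

edges_base = sobel(base)        # highlights element boundaries
edges_compare = sobel(compare)

score = structural_similarity(edges_base, edges_compare, data_range=1.0)
print(f"Layout similarity: {score * 100:.1f}%")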

Viewing Reports

In the Dashboard

  1. Navigate to Reports page
  2. Browse saved reports by date, site, or branch
  3. Click a report to view details
  4. Explore individual comparisons within the report

Report Details Page

Each report shows:

Header Information

  • Report ID: Unique identifier
  • Created Date: When the comparison ran
  • Site Name: Which site was tested
  • Branch Comparison: Base vs Compare branches
  • Overall Status: Pass/Fail

Summary Statistics

Total Images: 24
Passed: 22 (91.7%)
Failed: 2 (8.3%)
Average Similarity: 97.3%

Individual Results Table

Image           Base          Compare       Similarity    Status     Actions
homepage.png    (thumbnail)   (thumbnail)   98.5%         ✅ Pass    View Diff
checkout.png    (thumbnail)   (thumbnail)   92.1%         ❌ Fail    View Diff
profile.png     (thumbnail)   (thumbnail)   99.2%         ✅ Pass    View Diff

Interpreting Results

Common Scenarios

✅ All Tests Pass (95%+ similarity)

What it means:

  • No significant visual changes detected
  • Safe to merge/deploy

Action:

  • Review the report for peace of mind
  • Proceed with confidence

⚠️ Some Tests Fail (Below threshold)

What it means:

  • Visual changes detected in specific areas
  • Requires manual review

Action:

  1. Review diff images to understand changes
  2. Determine if changes are intentional:
    • Expected: Update baseline, approve changes
    • Unexpected: Fix the regression
  3. Update tests if needed

❌ Many Tests Fail

What it means:

  • Significant visual changes across multiple screens
  • Major redesign or potential issue

Action:

  1. Verify the comparison is correct (right branches?)
  2. Check for systematic issues:
    • CSS changes affecting multiple pages
    • Font loading issues
    • Responsive breakpoint changes
  3. Decide on next steps:
    • If intentional: Bulk update baselines
    • If unintentional: Investigate and fix

False Positives

Sometimes you'll see failures for acceptable changes:

Dynamic Content

Problem: Timestamps, user-specific data, random elements

Solution:

  • Mask dynamic regions before comparison (see the sketch after this list)
  • Use data fixtures for consistent content
  • Exclude specific elements from screenshots
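
A minimal masking sketch with Pillow, run on both the base and compare screenshots before they are compared; the coordinates are hypothetical and should come from your own layout:

# Paint over known-dynamic regions (e.g. a timestamp banner) in BOTH images
# before comparison so they can never produce a diff. Coordinates are examples.
from PIL import Image, ImageDraw

MASKS = [(980, 20, 1260, 60)]   # (left, top, right, bottom) boxes to hide

def mask_dynamic_regions(path: str, out_path: str) -> None:
    img = Image.open(path).convert("RGB")
    draw = ImageDraw.Draw(img)
    for box in MASKS:
        draw.rectangle(box, fill=(0, 0, 0))   # identical fill in base and compare
    img.save(out_path)

mask_dynamic_regions("screenshots/homepage.png", "screenshots/homepage.masked.png")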

Font Rendering

Problem: Different OS/browser font rendering

Solution:

  • Use web fonts consistently
  • Run tests in containerized environments
  • Lower the threshold slightly (e.g., 94% instead of 95%)

Animation Timing

Problem: Screenshots captured mid-animation

Solution:

  • Wait for animations to complete
  • Disable animations in the test environment (see the sketch after this list)
  • Use consistent timing in screenshot capture
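
One way to do this, assuming a Playwright-based capture step (adapt it to whatever you use), is to inject CSS that freezes animations and transitions just before the screenshot. The URL and viewport below are placeholders:

# Freeze animations/transitions before capturing (assumes Playwright for Python).
from playwright.sync_api import sync_playwright

FREEZE_CSS = (
    "*, *::before, *::after {"
    " animation: none !important;"
    " transition: none !important;"
    " caret-color: transparent !important; }"
)

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page(viewport={"width": 1280, "height": 720})  # fixed viewport
    page.goto("https://staging.example.com/checkout", wait_until="networkidle")
    page.add_style_tag(content=FREEZE_CSS)
    page.screenshot(path="screenshots/checkout.png", full_page=True)
    browser.close()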

Managing Reports

Saving Reports

Reports are saved automatically when:

  • ✅ The comparison completes successfully
  • ✅ You're on a Pro or Enterprise plan
  • ❌ On the Free plan, reports are temporary and expire after 24 hours

Downloading Reports

JSON Format:

{
  "report_id": "rpt_abc123",
  "created_at": "2024-01-15T10:30:00Z",
  "site_id": "site_xyz",
  "base_folder": "/my-app/main/abc123",
  "compare_folder": "/my-app/feature/def456",
  "results": [
    {
      "image_name": "homepage.png",
      "similarity": 98.5,
      "status": "pass",
      "algorithm": "ssim"
    }
  ]
}

Use cases:

  • Integration with external tools
  • Custom reporting dashboards
  • Long-term archival
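
For example, a small script can pull the failed comparisons out of a downloaded report to feed a custom dashboard or CI summary (the filename is a placeholder; the fields match the JSON format shown above):

# List failed comparisons from a downloaded GoDiffy report.
import json

with open("rpt_abc123.json") as f:
    report = json.load(f)

failed = [r for r in report["results"] if r["status"] != "pass"]
print(f"Report {report['report_id']}: {len(failed)} failed of {len(report['results'])}")
for r in failed:
    print(f"  {r['image_name']}: {r['similarity']:.1f}% ({r['algorithm']})")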

Deleting Reports

  1. Go to Reports page
  2. Find the report to delete
  3. Click trash icon
  4. Confirm deletion

⚠️ Warning: Deleted reports cannot be recovered. Download important reports before deleting.

GitHub Actions Integration

Automatic Report Comments

When using GitHub Actions, GoDiffy automatically posts comparison results as PR comments:

## 🎨 Visual Regression Test Results

**Status:** ❌ 2 changes detected

### Summary
- Total Images: 24
- Passed: 22 (91.7%)
- Failed: 2 (8.3%)
- Average Similarity: 97.3%

### Failed Comparisons
| Image | Similarity | Diff |
|-------|-----------|------|
| checkout.png | 92.1% | [View Diff](https://app.godiffy.com/...) |
| profile.png | 89.5% | [View Diff](https://app.godiffy.com/...) |

[View Full Report](https://app.godiffy.com/reports/rpt_abc123)

Status Checks

GoDiffy can block PR merges based on results:

# .github/workflows/visual-testing.yml
- name: Visual Regression Testing
  uses: godiffy/godiffy-action@v1
  with:
    api-key: ${{ secrets.GODIFFY_API_KEY }}
    site-id: ${{ secrets.GODIFFY_SITE_ID }}
    threshold: '0.95'
    fail-on-changes: true  # Fail the workflow if changes detected

Advanced Analysis

Trend Analysis

Track visual stability over time:

  • Comparison History: See how similarity scores change
  • Failure Patterns: Identify frequently failing screens
  • Baseline Drift: Detect gradual visual changes

Batch Comparisons

Compare multiple folders at once:

# Compare all feature branches against main
curl -X POST https://api.godiffy.com/api/batch-compare \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "site_id": "site_xyz",
    "base_folder": "/my-app/main/latest",
    "compare_folders": [
      "/my-app/feature-a/latest",
      "/my-app/feature-b/latest",
      "/my-app/feature-c/latest"
    ]
  }'

Region-Based Analysis

Focus on specific areas of your screenshots:

  • Define regions of interest (ROI)
  • Compare only critical UI elements
  • Ignore dynamic content areas
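
A minimal cropping sketch: compare only a critical region (here, a hypothetical header area) by cutting it out of both screenshots before scoring them. The box coordinates are placeholders:

# Crop the same region of interest from base and compare before scoring.
from PIL import Image

HEADER_ROI = (0, 0, 1280, 120)   # (left, top, right, bottom), example values

def crop_roi(path: str, out_path: str, box=HEADER_ROI) -> None:
    Image.open(path).convert("RGB").crop(box).save(out_path)

crop_roi("base/homepage.png", "base/homepage.header.png")
crop_roi("compare/homepage.png", "compare/homepage.header.png")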

Best Practices

1. Consistent Screenshot Capture

Do:

  • Use the same viewport size
  • Wait for page load completion
  • Disable animations
  • Use consistent test data

Don't:

  • Capture during animations
  • Use random or time-based data
  • Vary viewport sizes
  • Include dynamic ads or content

2. Meaningful Baselines

Update baselines when:

  • ✅ Intentional design changes
  • ✅ New features added
  • ✅ Bug fixes that change visuals

Don't update baselines for:

  • ❌ Unintentional changes
  • ❌ Regressions
  • ❌ Failed tests you haven't reviewed

3. Threshold Selection

Start conservative:

  • Begin with 95% threshold
  • Adjust based on your false positive rate
  • Different thresholds for different pages if needed

Consider:

  • Content type (static vs dynamic)
  • Acceptable change tolerance
  • Team's review capacity

Troubleshooting

Need help interpreting results or resolving failed comparisons? Jump to Troubleshooting → Reports & comparisons for consolidated guidance on missing images, unexpected failures, and PR comment issues.
