SailDiff

Introduction

SailDiff is a tool for running automated statistical tests on Sailfish performance data. It compares sets of measurements, either between test runs or between methods, and reports whether the differences are statistically significant.

SailDiff operates in two main modes:

  1. Historical Comparisons: Compare current test runs against previously saved tracking data
  2. Method Comparisons: Compare multiple methods within a single test run using the [SailfishComparison] attribute

When enabled, SailDiff will produce various measurements describing the differences between test runs or methods. Results are presented via multiple output formats:

  • Test Output Window: Real-time results during test execution
  • Consolidated Markdown: Session-based markdown files with comprehensive comparison data
  • Consolidated CSV: Session-based CSV files with structured comparison data for analysis

Method Comparisons

For real-time method comparisons within a single test run, see the Method Comparisons documentation. This feature allows you to compare multiple algorithms or implementations automatically using the [SailfishComparison("GroupName")] attribute.

Method comparisons generate:

  • N×N comparison matrices: Every method compared against every other method in the same group
  • Statistical significance testing: P-values and confidence intervals
  • Performance ratios: Clear "X times faster/slower" descriptions
  • Consolidated outputs: Both markdown and CSV formats available with [WriteToMarkdown] and [WriteToCsv] attributes
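
As a rough sketch, a comparison group might look like the class below. This is illustrative only: it assumes the standard [Sailfish] class attribute and [SailfishMethod] method attribute used elsewhere in the library, and places [WriteToMarkdown] and [WriteToCsv] on the class; adapt the names and attribute placement to your own setup.

using System;
using System.Linq;
using Sailfish.Attributes;

[WriteToMarkdown]
[WriteToCsv]
[Sailfish]
public class SortingComparisons
{
    private readonly int[] data = Enumerable.Range(0, 10_000).Reverse().ToArray();

    // Both methods share the "Sorting" comparison group, so SailDiff compares
    // them against each other and reports p-values and performance ratios.
    [SailfishComparison("Sorting")]
    [SailfishMethod]
    public void ArraySort() => Array.Sort((int[])data.Clone());

    [SailfishComparison("Sorting")]
    [SailfishMethod]
    public void LinqOrderBy() => _ = data.OrderBy(x => x).ToArray();
}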

Enabling / Configuring SailDiff

If you are using Sailfish as a test project, you can create a .sailfish.json file in the root of your test project (next to your .csproj file). This file holds various configuration settings. When the file is found, SailDiff runs automatically. Any compatible setting that is omitted falls back to a sensible default.

Example .sailfish.json

{
  "SailfishSettings": {
    "DisableOverheadEstimation": false,
    "NumWarmupIterationsOverride": 1,
    "SampleSizeOverride": 30
  },
  "SailDiffSettings": {
    "TestType": "TTest",
    "Alpha": 0.005,
    "Disabled": false
  },
  "ScaleFishSettings": {},
  "GlobalSettings": {
    "UseOutlierDetection": true,
    "ResultsDirectory": "SailfishIDETestOutput",
    "DisableEverything": false,
    "Round": 5
  }
}

SailDiffSettings

TestType

Description: Specifies which statistical test to run. One of:

  • TwoSampleWilcoxonSignedRankTest
  • WilcoxonRankSumTest
  • KolmogorovSmirnovTest
  • TTest (Default)

Alpha

Description: The significance threshold used for change detection (also known as the p-value threshold).

Default: 0.005

Disabled

Description: Disables SailDiff.

Default: false

Example IDE Output

Statistical Test
----------------
Test Used: TTest
PVal Threshold: 0.005
PValue: 0.0528963431
Change: No Change (reason: 0.0528963431 > 0.005)
| | Before (ms) | After (ms) |
| --- | --- | --- |
| Mean | 61.7671 | 55.0063 |
| Median | 62.3821 | 56.1542 |
| Sample Size | 30 | 30 |

Markdown

| Display Name | MeanBefore (N=7) | MeanAfter (N=7) | MedianBefore | MedianAfter | PValue | Change Description |
| --- | --- | --- | --- | --- | --- | --- |
| Example.Test() | 190.78 ms | 191.35 ms | 187.689 ms | 186.9367 ms | 0.89023 | No Change |

The mean and median are both presented alongside a PValue and a change description. The PValue returned from the statistical test is compared to the user-set threshold (Alpha) to determine the change description.
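
The decision itself is a simple threshold comparison. The following sketch is illustrative only; the "No Change" label and the PValue-versus-Alpha comparison come from the output above, while the method name and the labels for significant results are assumptions.

// Illustrative sketch of how a change description could be derived.
// DescribeChange and the "Improved"/"Regressed" labels are hypothetical.
static string DescribeChange(double pValue, double alpha, double meanBefore, double meanAfter)
{
    if (pValue > alpha)
        return "No Change"; // difference is not statistically significant

    return meanAfter < meanBefore
        ? "Improved"   // significantly faster after the change
        : "Regressed"; // significantly slower after the change
}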

Library

You may use the RunSettingsBuilder to configure SailDiff before running.

var sailDiffSettings = new SailDiffSettings(
    alpha: 0.001,
    round: 3,
    useOutlierDetection: true,
    testType: TestType.TTest,
    maxDegreeOfParallelism: 4,
    disableOrdering: false);

var runSettings = RunSettingsBuilder
    .CreateBuilder()
    .WithSailDiff(sailDiffSettings)
    .Build();

Customizing the SailDiff inputs

By default, Sailfish will look for the most recent file in the default tracking directory when you execute a test run via a console app.

The flow of the analysis is:

  1. Program Execution
  2. TestCaseCompletedNotification
  3. TestRunCompletedNotification
  4. BeforeAndAfterFileLocationRequest
  5. ReadInBeforeAndAfterDataRequest
  6. SailDiff

This flow shows that there are two points at which you can manipulate the data inputs:

  • IRequestHandler<BeforeAndAfterFileLocationRequest, BeforeAndAfterFileLocationResponse>
  • IRequestHandler<ReadInBeforeAndAfterDataRequest, ReadInBeforeAndAfterDataResponse>

Reading Tracking Data from a Custom Location

internal class SailfishBeforeAndAfterFileLocationHandler
    : IRequestHandler<BeforeAndAfterFileLocationRequest, BeforeAndAfterFileLocationResponse>
{
    private readonly IRunSettings runSettings;
    private readonly ITrackingFileDirectoryReader trackingFileDirectoryReader;

    public SailfishBeforeAndAfterFileLocationHandler(
        IRunSettings runSettings,
        ITrackingFileDirectoryReader trackingFileDirectoryReader)
    {
        this.runSettings = runSettings;
        this.trackingFileDirectoryReader = trackingFileDirectoryReader;
    }

    public Task<BeforeAndAfterFileLocationResponse> Handle(
        BeforeAndAfterFileLocationRequest request,
        CancellationToken cancellationToken)
    {
        var trackingFiles = trackingFileDirectoryReader
            .FindTrackingFilesInDirectoryOrderedByLastModified(
                runSettings.GetRunSettingsTrackingDirectoryPath(),
                ascending: false);

        // Consider reading file locations from a:
        // - database
        // - cloud storage container
        // - cloud log processing tool
        // - network drive
        // - local directory
        return Task.FromResult(new BeforeAndAfterFileLocationResponse(
            new List<string> { trackingFiles.BeforeFilePath }.Where(x => !string.IsNullOrEmpty(x)),
            new List<string> { trackingFiles.AfterFilePath }.Where(x => !string.IsNullOrEmpty(x))));
    }
}

Reading Tracking Data that you wish to aggregate prior to testing

internal class SailfishReadInBeforeAndAfterDataHandler
    : IRequestHandler<ReadInBeforeAndAfterDataRequest, ReadInBeforeAndAfterDataResponse>
{
    public async Task<ReadInBeforeAndAfterDataResponse> Handle(
        ReadInBeforeAndAfterDataRequest request,
        CancellationToken cancellationToken)
    {
        // Load and aggregate your 'before' and 'after' data here; the variables
        // below are placeholders for whatever your own loading logic produces.
        // When you return the data, you are also required to provide an
        // IEnumerable<string> that represents the files that were used.
        return new ReadInBeforeAndAfterDataResponse(
            new TestData(dataSourcesBefore, beforeData),
            new TestData(dataSourcesAfter, afterData));
    }
}

If you inspect the TestData source code, you will find that it takes an IEnumerable of test IDs, which lets you keep track of which processed files were used in the statistical test.

SailDiff will automatically aggregate data when multiple files are provided.

Which SailDiff Test should I use?

When customizing the SailDiffSettings TestType (either via .sailfish.json or the RunSettingsBuilder), you have four options to choose from.

You can follow this rule of thumb when choosing:

if (your test makes requests over a network):
    use one of:
      - TwoSampleWilcoxonSignedRankTest
      - WilcoxonRankSumTest
      - KolmogorovSmirnovTest
else:
    use:
      - TTest
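
For example, a suite whose tests call a remote API could be pointed at one of the non-parametric tests through the same SailDiffSettings constructor shown earlier. This sketch assumes the remaining constructor arguments are optional and keep their defaults:

var sailDiffSettings = new SailDiffSettings(testType: TestType.WilcoxonRankSumTest);

var runSettings = RunSettingsBuilder
    .CreateBuilder()
    .WithSailDiff(sailDiffSettings)
    .Build();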