flutter/dev/benchmarks/microbenchmarks/lib/stocks/build_bench.dart
John McDole b755641559
Address frame policy benchmark flakes (#155130)
Recently the microbenchmarks were flaky, but the root cause is an older bug. It turns out `LiveTestWidgetsFlutterBindingFramePolicy` defaults to `fadePointers`, which carries this fun note:

> This can result in additional frames being pumped beyond those that
> the test itself requests, which can cause differences in behavior

Both `text_intrinsic_bench` and `build_bench` use a similar pattern:
* Load stocks app
* Open the menu
* Switch to the `benchmark` frame policy (sketched below)
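
That last step looks like this (a condensed sketch of the setup code; the full version is in the file below):

```dart
// Prevent the binding from scheduling its own frames (e.g. pointer
// fade-out) so they can't land inside a timed region.
final LiveTestWidgetsFlutterBinding binding =
    TestWidgetsFlutterBinding.ensureInitialized() as LiveTestWidgetsFlutterBinding;
binding.framePolicy = LiveTestWidgetsFlutterBindingFramePolicy.benchmark;
```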

What happens, rarely, is that `LiveTestWidgetsFlutterBinding.pumpBenchmark()` calls `handleBeginFrame` and `handleDrawFrame` asynchronously. `handleDrawFrame` juggles a tri-state boolean (`null`, `false`, `true`), and that boolean is only reset to `null` when `handleDrawFrame` is called back to back, say, by an extra frame that was scheduled.
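
The first fix below replaces that flag with an enum. Reduced to a hypothetical sketch (these names are illustrative, not the framework's actual code), the difference in readability is:

```dart
// Before: one nullable bool silently encodes three states.
bool? _doDrawThisFrame; // null = idle, true = draw this frame, false = skip

// After: an enum names each state, so a back-to-back handleDrawFrame
// (from an extra scheduled frame) is an explicit, checkable case
// rather than an ambiguous null.
enum _FramePhase { idle, beginFrameRan, drawFrameRan }

_FramePhase _phase = _FramePhase.idle;
```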

1. Switch the tri-state boolean to an enum; it's easier to read.
2. Remove asserts that compile away in benchmarks (`--profile`).
3. Use `Error.throwWithStackTrace` to preserve stack traces (sketched after this list).
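
For (3), `Error.throwWithStackTrace` matters when an error is propagated outside the `catch` block that caught it, where `rethrow` is no longer available. A minimal sketch with illustrative names:

```dart
Future<void> pumpGuarded(Future<void> Function() pump) async {
  Object? caughtError;
  StackTrace? caughtStack;
  try {
    await pump();
  } catch (error, stackTrace) {
    caughtError = error;
    caughtStack = stackTrace;
  }
  // ... cleanup that must run even on failure ...
  if (caughtError != null) {
    // `throw caughtError;` would start a brand-new stack trace here;
    // throwWithStackTrace re-throws with the one captured at the failure.
    Error.throwWithStackTrace(caughtError, caughtStack!);
  }
}
```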

I've run this test on devicelab hardware for hundreds of runs and have not hit a failure yet.

Fixes #150542
Fixes #150543 - throw stack!
2024-09-12 23:19:15 +00:00

// Copyright 2014 The Flutter Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

import 'package:flutter/material.dart';
import 'package:flutter_test/flutter_test.dart';
import 'package:stocks/main.dart' as stocks;
import 'package:stocks/stock_data.dart' as stock_data;

import '../common.dart';

const Duration kBenchmarkTime = Duration(seconds: 15);

Future<List<double>> runBuildBenchmark() async {
  assert(false, "Don't run benchmarks in debug mode! Use 'flutter run --release'.");
  stock_data.StockData.actuallyFetchData = false;

  // We control the framePolicy below to prevent us from scheduling frames in
  // the engine, so that the engine does not interfere with our timings.
  final LiveTestWidgetsFlutterBinding binding =
      TestWidgetsFlutterBinding.ensureInitialized() as LiveTestWidgetsFlutterBinding;

  final Stopwatch watch = Stopwatch();
  int iterations = 0;
  final List<double> values = <double>[];

  await benchmarkWidgets((WidgetTester tester) async {
    stocks.main();
    await tester.pump(); // Start startup animation
    await tester.pump(const Duration(seconds: 1)); // Complete startup animation
    await tester.tapAt(const Offset(20.0, 40.0)); // Open drawer
    await tester.pump(); // Start drawer animation
    await tester.pumpAndSettle(const Duration(seconds: 1)); // Complete drawer animation

    final Element appState = tester.element(find.byType(stocks.StocksApp));
    binding.framePolicy = LiveTestWidgetsFlutterBindingFramePolicy.benchmark;

    Duration elapsed = Duration.zero;
    while (elapsed < kBenchmarkTime) {
      watch.reset();
      watch.start();
      appState.markNeedsBuild();
      // We don't use tester.pump() because we're trying to drive it in an
      // artificially high load to find out how much CPU each frame takes.
      // This differs from normal benchmarks which might look at how many
      // frames are missed, etc.
      // We use Timer.run to ensure there's a microtask flush in between
      // the two calls below.
      await tester.pumpBenchmark(Duration(milliseconds: iterations * 16));
      watch.stop();
      iterations += 1;
      elapsed += Duration(microseconds: watch.elapsedMicroseconds);
      values.add(watch.elapsedMicroseconds.toDouble());
    }
  });

  return values;
}

Future<void> execute() async {
  final BenchmarkResultPrinter printer = BenchmarkResultPrinter();
  printer.addResultStatistics(
    description: 'Stock build',
    values: await runBuildBenchmark(),
    unit: 'µs per iteration',
    name: 'stock_build_iteration',
  );
  printer.printToStdout();
}