Passing Build Analysis is required to merge into the runtime repo.
To resolve failures, do the following, in order:
In case of failure, any PR on the runtime will have a failed GitHub check - PR Build Analysis - which has a summary of all failures, including a list of matching known issues as well as any regressions introduced to the build or the tests. This tab should be your first stop for analyzing the PR failures.
This check tries to bubble up as much useful information as possible about all failures for any given PR and the pipelines it runs. It tracks both build and test failures and provides quick links to the build/test legs, the logs, and other supplemental information that Azure DevOps may provide. The idea is to minimize the number of links to follow and to surface well-known issues that have already been identified. It also adds a link to the Helix Artifacts tab of a failed test, as it often contains more detailed logs of the execution or a dump that’s been collected at fault time.
Validation may fail for several reasons, and for each one we have a different recommended action:
- Use `Re-run failed checks`.
- Add a `/azp run runtime` comment to the PR.
- Amend your commit with `--amend --no-edit` and force push to your branch.
- Add a `@dotnet-policy-service rerun` comment to the PR.

An issue that has not been reported before will look like this in the `Build Analysis` check tab:
You can use the console log, any potential attached dumps in the artifacts section, or any other piece of information printed to help you decide if it’s a regression caused by the change. Similarly, for runtime tests we will try to print the crashing stacks to aid in the investigation.
If you have considered all the diagnostic artifacts and determined the failure is definitely not caused by changes in your PR, please do this:
- Identify a string from the logs that uniquely identifies the issue, for example `The system cannot open the device or file specified. : 'NuGet-Migrations'` for issue https://github.com/dotnet/runtime/issues/80619.
- Click `Report repository issue`. This will prepopulate an issue with the appropriate tags and with a body similar to:
## Build Information
Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=242380
Build error leg or test failing: Build / linux-arm64 Release AllSubsets_Mono_Minijit_RuntimeTests minijit / Build Tests
Pull request: https://github.com/dotnet/runtime/pull/84716
<!-- Error message template -->
## Error Message
Fill the error message using [known issues guidance](https://github.com/dotnet/arcade/blob/main/Documentation/Projects/Build%20Analysis/KnownIssues.md#how-to-fill-out-a-known-issue-error-section).
```json
{
"ErrorMessage": "",
"BuildRetry": false,
"ErrorPattern": "",
"ExcludeConsoleLog": false
}
```
It already contains most of the essential information, but it is very important that you fill out the JSON blob.
- Add to the `ErrorMessage` field the string that you found uniquely identifies the issue. In case you need to use a regex, use the `ErrorPattern` field instead. This is limited to a single-line, non-backtracking regex, as described in the arcade known issues documentation. This regex also needs to be appropriately escaped. Check that documentation for a good guide on proper regex and JSON escaping.
- `ExcludeConsoleLog` describes if the execution logs should be considered on top of the individual test results. For most cases, this should be set to `true`, as the failure will happen within a single test. Setting it to `false` will mean all failures within an xUnit set of tests will also get attributed to this particular error, since there’s one log describing all the problems. Due to limitations in Known Issues around rate limiting and xUnit resiliency, setting `ExcludeConsoleLog=false` is necessary in two scenarios:
Once the issue is open, feel free to rerun the `Build Analysis` check. If everything was filed correctly, the issue should be recognized as known, and you are ready to merge once all unrelated issues are marked as known. However, there are some known limitations to the system, as previously described. Additionally, the system only looks at the error message and stack trace fields of an Azure DevOps test result, and the console log in the Helix queue.
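The escaping requirement for `ErrorPattern` is easy to get wrong by hand, so here is a minimal Python sketch of one way to produce the blob. The helper name `build_known_issue_blob` and the sample text are hypothetical and only illustrate the two layers of escaping (regex first, then JSON). Python's `re.escape` is used as a stand-in; the regex flavor that Build Analysis uses may treat some escapes differently, so check the result against the arcade guidance.

```python
import json
import re


def build_known_issue_blob(literal_text, use_regex=False, exclude_console_log=False):
    """Return the known-issue JSON blob as a string.

    literal_text is the log text that uniquely identifies the failure.
    With use_regex=True it is regex-escaped and stored in ErrorPattern;
    otherwise it is stored verbatim in ErrorMessage.
    """
    blob = {
        "ErrorMessage": "" if use_regex else literal_text,
        "BuildRetry": False,
        # re.escape neutralizes regex metacharacters such as '.', '(' and ')'.
        "ErrorPattern": re.escape(literal_text) if use_regex else "",
        "ExcludeConsoleLog": exclude_console_log,
    }
    # json.dumps adds the JSON-level escaping (backslashes, double quotes).
    return json.dumps(blob, indent=2)


if __name__ == "__main__":
    # Made-up failure text, for illustration only.
    print(build_known_issue_blob('Timeout (5 s) in "job.a"', use_regex=True))
```

Starting from the literal text and escaping it mechanically keeps the pattern specific; any wildcards can then be introduced deliberately, one at a time.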
The `Build Analysis` requests are sent to a queue. In certain scenarios, this queue can have many items to process and it can take a while for the status to be updated. If you do not see the status getting updated, be patient and wait at least 10 minutes before investigating further.
If rerunning the check doesn’t pick up the known issue and you feel it should, feel free to tag @dotnet/runtime-infrastructure to ask the infrastructure team for help.
After you do this, if the failure is occurring frequently as per the data captured in the recently opened issue, please disable the failing test(s) with the corresponding tracking issue link in a follow-up Pull Request.
- Add the `disabled-test` label to the tracking issue and remove the blocking tags.
- Add an `[ActiveIssue(link)]` attribute on the test method. You can narrow the disabling down to runtime variant, flavor, and platform. For an example see `File_AppendAllLinesAsync_Encoded`.
- For tests under `src/tests`, please edit `issues.targets`. There are several groups for different types of disable (mono vs. coreclr, different platforms, different scenarios). Add the folder containing the test and the issue, mimicking any of the samples in the file.

There are plenty of intermittent failures that won’t manifest again on a retry. Therefore these steps should be followed for every iteration of the PR build, e.g. before retrying/rebuilding.
To unconditionally bypass the build analysis check (turn it green), you can add a comment to your PR with the following text:
/ba-g <reason>
For more information, see https://github.com/dotnet/arcade/blob/main/Documentation/Projects/Build%20Analysis/EscapeMechanismforBuildAnalysis.md
```json
{
"ErrorPattern": "The system cannot open the device or file specified. : ('|')NuGet-Migrations('|')",
"BuildRetry": false,
"ExcludeConsoleLog": false
}
```
This is a case where the issue is tied to the machine the work item lands on. Everything would fail in that test group, so setting `ExcludeConsoleLog` to `false` isn’t harmful, and the string is specific to the issue. Proper usage of this provides useful insight, such as an accurate count of the impact of the issue, without blocking other devs.
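Before filing, it can help to dry-run a pattern locally against a saved log. The sketch below is only an approximation under stated assumptions: it uses Python's backtracking `re` engine rather than whatever engine Build Analysis runs, it models the single-line limit by testing one line at a time, and the sample log is made up. The pattern is the one from the JSON example above.

```python
import re

# Pattern copied from the ErrorPattern field of the example above.
ERROR_PATTERN = (
    r"The system cannot open the device or file specified. : "
    r"('|')NuGet-Migrations('|')"
)


def matching_lines(pattern, log_text):
    """Return the log lines the pattern matches, testing one line at a time."""
    compiled = re.compile(pattern)
    return [line for line in log_text.splitlines() if compiled.search(line)]


# Hypothetical console log, for illustration only.
SAMPLE_LOG = """\
step one finished
The system cannot open the device or file specified. : 'NuGet-Migrations'
step two skipped
"""

if __name__ == "__main__":
    for line in matching_lines(ERROR_PATTERN, SAMPLE_LOG):
        print(line)
```

If the pattern also matches lines from unrelated failures in the same log, it is probably too broad to use as a known-issue signature.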