[analyzer] Fix zext assertion failure in loop unrolling #121203

shenjunjiekoda · 2024-12-27T11:02:22Z

The current implementation of APInt extension in the code can trigger an assertion failure when the zext function is called with a target width smaller than the current bit width. For example:

if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
    InitNum = InitNum.zext(BoundNum.getBitWidth());
    BoundNum = BoundNum.zext(InitNum.getBitWidth());
}

This logic does not guarantee that the zext target width is always greater than or equal to the current bit width, leading to potential crashes.

Expected Behavior:

Ensure InitNum and BoundNum are extended to the maximum of their respective widths.
Prevent assertion failures by enforcing correct zext usage.

Fixes #121201

llvmbot · 2024-12-27T11:02:58Z

@llvm/pr-subscribers-clang

Author: JOSTAR (shenjunjiekoda)

Changes

The current implementation of APInt extension in the code can trigger an assertion failure when the zext function is called with a target width smaller than the current bit width. For example:

if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
    InitNum = InitNum.zext(BoundNum.getBitWidth());
    BoundNum = BoundNum.zext(InitNum.getBitWidth());
}

This logic does not guarantee that the zext target width is always greater than or equal to the current bit width, leading to potential crashes.

Expected Behavior:

Ensure InitNum and BoundNum are extended to the maximum of their respective widths.
Prevent assertion failures by enforcing correct zext usage.

Depend on ##121201

Full diff: https://github.com/llvm/llvm-project/pull/121203.diff

1 Files Affected:

(modified) clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp (+6-4)

diff --git a/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp b/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp
index 96f5d7c44baf89..e3b27e22712b58 100644
--- a/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp
+++ b/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp
@@ -283,10 +283,12 @@ static bool shouldCompletelyUnroll(const Stmt *LoopStmt, ASTContext &ASTCtx,
   llvm::APInt InitNum =
       Matches[0].getNodeAs<IntegerLiteral>("initNum")->getValue();
   auto CondOp = Matches[0].getNodeAs<BinaryOperator>("conditionOperator");
-  if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
-    InitNum = InitNum.zext(BoundNum.getBitWidth());
-    BoundNum = BoundNum.zext(InitNum.getBitWidth());
-  }
+  unsigned MaxWidth = std::max(InitNum.getBitWidth(), BoundNum.getBitWidth());
+
+  if (InitNum.getBitWidth() != MaxWidth)
+      InitNum = InitNum.zext(MaxWidth);
+  if (BoundNum.getBitWidth() != MaxWidth)
+      BoundNum = BoundNum.zext(MaxWidth);
 
   if (CondOp->getOpcode() == BO_GE || CondOp->getOpcode() == BO_LE)
     maxStep = (BoundNum - InitNum + 1).abs().getZExtValue();

llvmbot · 2024-12-27T11:02:58Z

@llvm/pr-subscribers-clang-static-analyzer-1

Author: JOSTAR (shenjunjiekoda)

Changes

The current implementation of APInt extension in the code can trigger an assertion failure when the zext function is called with a target width smaller than the current bit width. For example:

if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
    InitNum = InitNum.zext(BoundNum.getBitWidth());
    BoundNum = BoundNum.zext(InitNum.getBitWidth());
}

This logic does not guarantee that the zext target width is always greater than or equal to the current bit width, leading to potential crashes.

Expected Behavior:

Ensure InitNum and BoundNum are extended to the maximum of their respective widths.
Prevent assertion failures by enforcing correct zext usage.

Depend on ##121201

Full diff: https://github.com/llvm/llvm-project/pull/121203.diff

1 Files Affected:

(modified) clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp (+6-4)

diff --git a/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp b/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp
index 96f5d7c44baf89..e3b27e22712b58 100644
--- a/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp
+++ b/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp
@@ -283,10 +283,12 @@ static bool shouldCompletelyUnroll(const Stmt *LoopStmt, ASTContext &ASTCtx,
   llvm::APInt InitNum =
       Matches[0].getNodeAs<IntegerLiteral>("initNum")->getValue();
   auto CondOp = Matches[0].getNodeAs<BinaryOperator>("conditionOperator");
-  if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
-    InitNum = InitNum.zext(BoundNum.getBitWidth());
-    BoundNum = BoundNum.zext(InitNum.getBitWidth());
-  }
+  unsigned MaxWidth = std::max(InitNum.getBitWidth(), BoundNum.getBitWidth());
+
+  if (InitNum.getBitWidth() != MaxWidth)
+      InitNum = InitNum.zext(MaxWidth);
+  if (BoundNum.getBitWidth() != MaxWidth)
+      BoundNum = BoundNum.zext(MaxWidth);
 
   if (CondOp->getOpcode() == BO_GE || CondOp->getOpcode() == BO_LE)
     maxStep = (BoundNum - InitNum + 1).abs().getZExtValue();

github-actions · 2024-12-27T11:05:46Z

✅ With the latest revision this PR passed the C/C++ code formatter.

steakhal · 2024-12-27T11:08:03Z

Hey, could you add a test for this PR that would crash on main, but wouldn't with this patch?

steakhal · 2024-12-27T14:34:42Z

Could you please add a RUN line to the test so that the test actually gets invoked by the check-clang-analysis build target, thus get tested in CI, and a // no-crash comment at the line where previously crashed during interpretation?
I'd also prefer a clang-formated test file if possible.
Are you sure the test case is minimal and couldn't be minimized further now that you know why it crashes?

shenjunjiekoda · 2024-12-27T14:54:59Z

Could you please add a RUN line to the test so that the test actually gets invoked by the check-clang-analysis build target, thus get tested in CI, and a // no-crash comment at the line where previously crashed during interpretation? I'd also prefer a clang-formated test file if possible. Are you sure the test case is minimal and couldn't be minimized further now that you know why it crashes?

The crash occurred due to a failed assertion in the zext method of APInt. The zext function requires the following condition to be met:

// Zero extend to a new width.
APInt APInt::zext(unsigned width) const {
  assert(width >= BitWidth && "Invalid APInt ZeroExtend request");
  // ...
}

However, the original logic in clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp used an inequality check (!=) to determine if the widths were mismatched. This could lead to a scenario where one of the zext calls in the if block triggers the assertion failure internally:

static bool shouldCompletelyUnroll(const Stmt *LoopStmt, ASTContext &ASTCtx,
                                   ExplodedNode *Pred, unsigned &maxStep) {

  // ...
  if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
    InitNum = InitNum.zext(BoundNum.getBitWidth());
    BoundNum = BoundNum.zext(InitNum.getBitWidth());
  }

For the test case, I used the cvise tool to simplify the test/std-test.cc file from the libfmt repo while ensuring it remained free of compilation errors. This test case appears to be the minimal version that cvise could produce.

steakhal

Thanks for the nice reproducer!
The test looks a bit verbose to my taste, but it's okay as-is.
I had some deeper thoughts of the fix inline to settle before we could merge this.

Thanks again for working on this issue!

steakhal · 2024-12-27T16:13:53Z

clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp

-    InitNum = InitNum.zext(BoundNum.getBitWidth());
-    BoundNum = BoundNum.zext(InitNum.getBitWidth());
-  }
+  unsigned MaxWidth = std::max(InitNum.getBitWidth(), BoundNum.getBitWidth());


Have you checked if there is a utility achieving this for us? Like the APSIntType type? (IDK, I rarely ever use it)
I also wonder if we actually need to operate at an APSInt level, maybe we could just convert both of these into int64 and do the math on those.

Is zero extension is actually correct in semantics? What if the InitNum was negative, léike -1, then shouldn't we use sign extension?

Thank you for your thoughtful feedback!

In Clang's AST, IntegerLiteral represents unsigned integer values. To represent a literal like -1, the AST typically uses a combination of UnaryOperator(-) and IntegerLiteral.

Regarding zero extension (zext), it is semantically correct here because APInt (the return type of IntegerLiteral::getValue()) inherently represents unsigned values.

I considered whether tools like APSIntType could simplify this logic. However, since this operation deals specifically with unsigned integers (APInt), using APSIntType would introduce unnecessary complexity for signed semantics, which is not required here. The current use of APInt ensures precision and correctness.

On whether we could switch to int64, I think converting to int64 might not simplify the code too much.

Let me know if further clarification or adjustments are needed!

llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Dec 27, 2024

steakhal self-requested a review December 27, 2024 11:08

[analyzer] Fix zext assertion failure in loop unrolling

2cee5fc

shenjunjiekoda force-pushed the fix/sa-loopunrolling branch from 3bcf5c8 to 2cee5fc Compare December 27, 2024 14:09

add testcase

a4d3a24

shenjunjiekoda added 3 commits December 27, 2024 14:35

fix testcase

ee331f6

fix testcase

1e7f333

fix testcase

d8cc2b9

fix testcase

9ae83de

steakhal reviewed Dec 27, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[analyzer] Fix zext assertion failure in loop unrolling #121203

[analyzer] Fix zext assertion failure in loop unrolling #121203

shenjunjiekoda commented Dec 27, 2024 •

edited by steakhal

Loading

llvmbot commented Dec 27, 2024

llvmbot commented Dec 27, 2024

github-actions bot commented Dec 27, 2024 •

edited

Loading

steakhal commented Dec 27, 2024

steakhal commented Dec 27, 2024 •

edited

Loading

shenjunjiekoda commented Dec 27, 2024 •

edited

Loading

steakhal left a comment

steakhal Dec 27, 2024

shenjunjiekoda Dec 28, 2024

[analyzer] Fix zext assertion failure in loop unrolling #121203

Are you sure you want to change the base?

[analyzer] Fix zext assertion failure in loop unrolling #121203

Conversation

shenjunjiekoda commented Dec 27, 2024 • edited by steakhal Loading

llvmbot commented Dec 27, 2024

llvmbot commented Dec 27, 2024

github-actions bot commented Dec 27, 2024 • edited Loading

steakhal commented Dec 27, 2024

steakhal commented Dec 27, 2024 • edited Loading

shenjunjiekoda commented Dec 27, 2024 • edited Loading

steakhal left a comment

Choose a reason for hiding this comment

steakhal Dec 27, 2024

Choose a reason for hiding this comment

shenjunjiekoda Dec 28, 2024

Choose a reason for hiding this comment

shenjunjiekoda commented Dec 27, 2024 •

edited by steakhal

Loading

github-actions bot commented Dec 27, 2024 •

edited

Loading

steakhal commented Dec 27, 2024 •

edited

Loading

shenjunjiekoda commented Dec 27, 2024 •

edited

Loading