[analyzer] Fix zext assertion failure in loop unrolling #121203
base: main
@llvm/pr-subscribers-clang
Author: JOSTAR (shenjunjiekoda)
Changes
The current implementation of APInt extension in the code can trigger an assertion failure when the zext function is called with a target width smaller than the current bit width. For example:
if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
  InitNum = InitNum.zext(BoundNum.getBitWidth());
  BoundNum = BoundNum.zext(InitNum.getBitWidth());
}
This logic does not guarantee that the zext target width is always greater than or equal to the current bit width, leading to potential crashes.
Expected Behavior:
Depends on #121201
Full diff: https://github.com/llvm/llvm-project/pull/121203.diff
1 Files Affected:
diff --git a/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp b/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp
index 96f5d7c44baf89..e3b27e22712b58 100644
--- a/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp
+++ b/clang/lib/StaticAnalyzer/Core/LoopUnrolling.cpp
@@ -283,10 +283,12 @@ static bool shouldCompletelyUnroll(const Stmt *LoopStmt, ASTContext &ASTCtx,
   llvm::APInt InitNum =
       Matches[0].getNodeAs<IntegerLiteral>("initNum")->getValue();
   auto CondOp = Matches[0].getNodeAs<BinaryOperator>("conditionOperator");
-  if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
-    InitNum = InitNum.zext(BoundNum.getBitWidth());
-    BoundNum = BoundNum.zext(InitNum.getBitWidth());
-  }
+  unsigned MaxWidth = std::max(InitNum.getBitWidth(), BoundNum.getBitWidth());
+
+  if (InitNum.getBitWidth() != MaxWidth)
+    InitNum = InitNum.zext(MaxWidth);
+  if (BoundNum.getBitWidth() != MaxWidth)
+    BoundNum = BoundNum.zext(MaxWidth);
   if (CondOp->getOpcode() == BO_GE || CondOp->getOpcode() == BO_LE)
     maxStep = (BoundNum - InitNum + 1).abs().getZExtValue();
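The width normalization in the diff can be tried outside of LLVM with a small model. The sketch below is only an illustration: WidthInt and normalizeWidths are hypothetical names that do not exist in LLVM or in this patch, and the struct stands in for llvm::APInt by storing a value next to an explicit bit width (assumed to be at most 64, with the value already fitting in that width).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::APInt: a value plus an explicit bit width.
// Assumption for this sketch: BitWidth is in [1, 64] and Value fits in it.
struct WidthInt {
  std::uint64_t Value;
  unsigned BitWidth;

  // Mirrors the precondition of a zero extension: widening (or same width) only.
  WidthInt zext(unsigned Width) const {
    assert(Width >= BitWidth && "Invalid zero-extend request");
    return WidthInt{Value, Width}; // the upper bits are zero by assumption
  }
};

// Bring both operands to the wider of the two widths, following the same
// steps as the patched code: compute the maximum once, then extend whichever
// operand is narrower.
inline void normalizeWidths(WidthInt &Init, WidthInt &Bound) {
  unsigned MaxWidth = std::max(Init.BitWidth, Bound.BitWidth);
  if (Init.BitWidth != MaxWidth)
    Init = Init.zext(MaxWidth);
  if (Bound.BitWidth != MaxWidth)
    Bound = Bound.zext(MaxWidth);
}
```

Because the target width is computed before either operand changes, neither zext call in this model can be asked to narrow a value, whichever operand starts out wider.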
✅ With the latest revision this PR passed the C/C++ code formatter.
Hey, could you add a test for this PR that would crash on main, but wouldn't with this patch?
3bcf5c8 to 2cee5fc
Could you please add a RUN line to the test so that the test actually gets invoked by the
The crash occurred due to a failed assertion in the following code:
// Zero extend to a new width.
APInt APInt::zext(unsigned width) const {
  assert(width >= BitWidth && "Invalid APInt ZeroExtend request");
  // ...
}
However, the original logic in shouldCompletelyUnroll was:
static bool shouldCompletelyUnroll(const Stmt *LoopStmt, ASTContext &ASTCtx,
                                   ExplodedNode *Pred, unsigned &maxStep) {
  // ...
  if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
    InitNum = InitNum.zext(BoundNum.getBitWidth());
    BoundNum = BoundNum.zext(InitNum.getBitWidth());
  }
For the test case, I used the
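To see which width combinations reach the assertion, it may help to trace the two pre-patch zext calls on widths alone. The function below is a hypothetical helper written for this illustration (oldLogicWouldAssert is not an LLVM API); it assumes, as the quoted assertion suggests, that a same-width request is accepted and only a narrowing request fails.

```cpp
#include <cassert>

// Returns true if the pre-patch sequence would issue a narrowing zext request,
// i.e. one that violates "width >= BitWidth".
inline bool oldLogicWouldAssert(unsigned InitWidth, unsigned BoundWidth) {
  if (InitWidth == BoundWidth)
    return false; // the if-branch is skipped entirely

  // First call: InitNum.zext(BoundNum.getBitWidth()).
  // This narrows when InitNum is the wider operand.
  if (BoundWidth < InitWidth)
    return true;

  // Otherwise InitNum is widened to BoundNum's width, so the second call,
  // BoundNum.zext(InitNum.getBitWidth()), is a same-width request.
  return false;
}
```

Under these assumptions, oldLogicWouldAssert(64, 32) is true while oldLogicWouldAssert(32, 64) is false, so in this model only the case where InitNum is wider than BoundNum reaches the failing request.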
Thanks for the nice reproducer!
The test looks a bit verbose to my taste, but it's okay as-is.
I left some deeper thoughts on the fix inline that we should settle before merging this.
Thanks again for working on this issue!
    InitNum = InitNum.zext(BoundNum.getBitWidth());
    BoundNum = BoundNum.zext(InitNum.getBitWidth());
  }
  unsigned MaxWidth = std::max(InitNum.getBitWidth(), BoundNum.getBitWidth());
Have you checked if there is a utility achieving this for us? Like the APSIntType type? (IDK, I rarely ever use it)
I also wonder if we actually need to operate at an APSInt level, maybe we could just convert both of these into int64 and do the math on those.
Is zero extension actually correct semantically? What if the InitNum was negative, like -1, then shouldn't we use sign extension?
Thank you for your thoughtful feedback!
In Clang's AST, IntegerLiteral represents unsigned integer values. To represent a literal like -1, the AST typically uses a combination of UnaryOperator(-) and IntegerLiteral.
Regarding zero extension (zext), it is semantically correct here because APInt (the return type of IntegerLiteral::getValue()) inherently represents unsigned values.
I considered whether tools like APSIntType could simplify this logic. However, since this operation deals specifically with unsigned integers (APInt), using APSIntType would introduce unnecessary complexity for signed semantics, which is not required here. The current use of APInt ensures precision and correctness.
On whether we could switch to int64, I think converting to int64 might not simplify the code too much.
Let me know if further clarification or adjustments are needed!
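The zero-versus-sign-extension question in this thread comes down to what happens to the top bit when a value is widened. The sketch below uses plain fixed-width integers rather than APInt, and the helper names zeroExtend32 and signExtend32 are made up for this illustration. It assumes two's-complement conversion from unsigned to signed, which C++20 guarantees and earlier standards leave implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Zero extension: widen an unsigned 32-bit pattern; the new upper bits are 0.
inline std::uint64_t zeroExtend32(std::uint32_t Bits) {
  return static_cast<std::uint64_t>(Bits);
}

// Sign extension: reinterpret the same pattern as signed, then widen;
// the new upper bits copy the original top bit.
// Assumes two's-complement unsigned-to-signed conversion (guaranteed in C++20).
inline std::int64_t signExtend32(std::uint32_t Bits) {
  return static_cast<std::int64_t>(static_cast<std::int32_t>(Bits));
}
```

For a pattern whose top bit is clear, such as 7, both helpers give 7. For the all-ones pattern 0xFFFFFFFF, zero extension gives 4294967295 and sign extension gives -1. The two extensions therefore only disagree when the top bit of the narrower value is set, which is the case the reviewer's question is probing.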
The current implementation of APInt extension in the code can trigger an assertion failure when the zext function is called with a target width smaller than the current bit width. For example:
if (InitNum.getBitWidth() != BoundNum.getBitWidth()) {
  InitNum = InitNum.zext(BoundNum.getBitWidth());
  BoundNum = BoundNum.zext(InitNum.getBitWidth());
}
This logic does not guarantee that the zext target width is always greater than or equal to the current bit width, leading to potential crashes.
Expected Behavior:
zext usage.
Fixes #121201