-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop using PtrArray
s
#454
Stop using PtrArray
s
#454
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #454 +/- ##
=======================================
Coverage 70.84% 70.84%
=======================================
Files 66 66
Lines 3807 3807
=======================================
Hits 2697 2697
Misses 1110 1110
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Note that julia> using Polyester
julia> function foo!(x)
@batch for i in eachindex(x)
if i == 1
@show typeof(x)
end
x[i] = sin(i)
end
return sum(x)
end
foo! (generic function with 1 method)
julia> foo!(rand(5))
typeof(x) = StrideArraysCore.PtrArray{Float64, 1, (1,), Tuple{Int64}, Tuple{Nothing}, Tuple{Static.StaticInt{1}}} |
The current conclusion of #453 is that its an ARM Julia problem. |
So what do we do now? Ignore #453 and this PR because it's only relevant to macOS AMR? I'll try to find a MWE to report it at StrideArrays.jl, but I don't have time for this now. |
Yes I would ignore this for now. We should probably look at this again when 1.11 comes around in a month. |
Some of the issues probably also were the fixed bugs with the unsafe bitsets. |
@efaulhaber ~monthly reminder can we close this? |
Currently not needed! |
The main reason for this PR is the weird heisenbug in #453.
Threaded loops yield different results than non-threaded loops, even though each loop operation works on a different entry of
dv
. This problem disappears when using plain JuliaArray
s instead ofPtrArray
s fordv
.I haven't yet managed to reduce #453 to a MWE, so I can't report this in StrideArrays.jl, so I suggest we use plain Julia arrays instead (for now at least).
Also, with plain Julia arrays, we don't run into problems like JuliaSIMD/StrideArrays.jl#88.
Since the reason we used
PtrArray
s in the first place is to avoid allocations, this PR is highly performance relevant.As expected, we have some more allocations:
main:
This PR:
Here are some benchmarks with the dam break 2D example:
Apple M2 Pro
1 thread main:
1 thread this PR:
6 threads main:
6 threads this PR:
So we do have a little more allocations, but the performance difference in full simulations is actually negligible.
I'd also like to hear @sloede's or @ranocha's feedback here, since
PtrArray
s are also used in Trixi.Here are some full timer outputs:
Ryzen Threadripper 3990X
1 thread main:
1 thread this PR:
24 threads main:
24 threads this PR:
Apple M1 Pro
6 threads main:
6 threads this PR: