Here is a simple piece of code:

```julia
using JLD2, Lux, LuxCUDA

@load "./NN/luxmodel.jld2" model ps st
ps = ps |> gpu_device()
input = CUDA.ones(Float64, 9, 1024*256)
@time for i = 1:10
    Lux.apply(model, input, ps, st)
end
```

I just found that running the model only once takes very little time, but running it 10 times inside a for loop takes far longer than expected. I think this problem might be related to GC. How can I avoid it?
Answered by FR13ndSDP, Dec 17, 2023:
Just found out that I should use `CUDA.@time` instead of `@time`. Converting the input to `Float32` and running the model with `Lux.apply(model, cu(input), ps, st)` is also much faster.
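For reference, here is a minimal sketch of the adjusted benchmark, assuming the same `./NN/luxmodel.jld2` checkpoint and variable names as in the question. Hoisting `cu` out of the loop is a small tweak of my own, not part of the original reply:

```julia
using JLD2, Lux, LuxCUDA

@load "./NN/luxmodel.jld2" model ps st
ps = ps |> gpu_device()

# Float32 instead of Float64: it matches the default element type of the
# Lux parameters and is much faster on most GPUs.
input = ones(Float32, 9, 1024 * 256)

# `cu` copies the array to the GPU; hoisting it out of the loop avoids
# paying the host-to-device transfer on every iteration.
gpu_input = cu(input)

# `CUDA.@time` synchronizes the GPU before and after the timed block, so
# it reports real execution time and GPU allocations. Plain `@time` can
# return while kernels are still running, since launches are asynchronous.
CUDA.@time for i in 1:10
    Lux.apply(model, gpu_input, ps, st)
end
```

That asynchrony is also a plausible explanation for the confusing numbers in the question: without synchronization, `@time` on a single call mostly measures launch overhead, so comparing it against a timed loop is not an apples-to-apples comparison.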
Answer selected by avik-pal.