The performance with GPU backend #118
Comments
Hi, @lilux618. libgrape-lite is not yet ready for the graph500 benchmark. BFS admits some algorithm-specific optimizations that libgrape-lite does not apply, because they are hard to generalize and other algorithms would gain little from them. For example, BFS is idempotent, so its race condition is benign: we could avoid atomic operations in top-down BFS and terminate early in bottom-up BFS. Other applications, however, cannot update without atomic operations, and we do not want to keep a "special code path" only for specific algorithms in a programmable framework.

Furthermore, our example BFS code neither applies direction optimization nor uses a bitmap to avoid redundant memory accesses, which may explain why libgrape-lite on GPU is slower than the entries on the graph500 list. libgrape-lite only recently added GPU support, and we are still working on it. It would be great if you were willing to contribute code to help us improve BFS performance: you can modify the GPU parallel_engine and implement a fully optimized version of BFS.
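To make the two missing optimizations concrete, here is a minimal sequential C++ sketch of direction-optimizing BFS with a visited bitmap. It is not libgrape-lite code and not GPU code: `CsrGraph`, `Bitmap`, `HybridBfs`, and the `switch_divisor` heuristic are hypothetical names chosen for illustration, and the switch rule is a simplified frontier-size test rather than the edge-count heuristic used in tuned implementations. It assumes an undirected graph stored in CSR form.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Undirected graph in CSR form: the neighbors of v are
// adj[offsets[v]] .. adj[offsets[v + 1] - 1].
struct CsrGraph {
  std::vector<std::size_t> offsets;
  std::vector<std::uint32_t> adj;
  std::size_t num_vertices() const { return offsets.size() - 1; }
};

// One bit per vertex, so a membership test touches 1/32 of the memory
// that a 32-bit depth array would.
class Bitmap {
 public:
  explicit Bitmap(std::size_t n) : words_((n + 63) / 64, 0) {}
  bool test(std::size_t i) const { return (words_[i >> 6] >> (i & 63)) & 1ULL; }
  void set(std::size_t i) { words_[i >> 6] |= (1ULL << (i & 63)); }

 private:
  std::vector<std::uint64_t> words_;
};

// Returns the BFS depth of every vertex, or -1 if it is unreachable.
// Assumed heuristic: go bottom-up once the frontier holds at least
// n / switch_divisor vertices.
std::vector<int> HybridBfs(const CsrGraph& g, std::uint32_t source,
                           std::size_t switch_divisor = 20) {
  const std::size_t n = g.num_vertices();
  assert(source < n);
  std::vector<int> depth(n, -1);
  Bitmap visited(n);
  std::vector<std::uint32_t> frontier{source};
  depth[source] = 0;
  visited.set(source);

  for (int level = 1; !frontier.empty(); ++level) {
    std::vector<std::uint32_t> next;
    if (frontier.size() * switch_divisor < n) {
      // Top-down: each frontier vertex pushes to its unvisited neighbors.
      // In a parallel version two threads may both claim v here; since both
      // would write the same depth, BFS tolerates that race.
      for (std::uint32_t u : frontier) {
        for (std::size_t e = g.offsets[u]; e < g.offsets[u + 1]; ++e) {
          const std::uint32_t v = g.adj[e];
          if (!visited.test(v)) {
            visited.set(v);
            depth[v] = level;
            next.push_back(v);
          }
        }
      }
    } else {
      // Bottom-up: each unvisited vertex looks for any parent in the frontier
      // and stops scanning at the first hit (early termination).
      Bitmap in_frontier(n);
      for (std::uint32_t u : frontier) in_frontier.set(u);
      for (std::uint32_t v = 0; v < n; ++v) {
        if (visited.test(v)) continue;
        for (std::size_t e = g.offsets[v]; e < g.offsets[v + 1]; ++e) {
          if (in_frontier.test(g.adj[e])) {
            visited.set(v);
            depth[v] = level;
            next.push_back(v);
            break;
          }
        }
      }
    }
    frontier.swap(next);
  }
  return depth;
}
```

Calling `HybridBfs(g, 0)` on a CSR graph returns one depth per vertex. A GPU port would distribute the two inner loops across threads; the sketch only shows where the bitmap and the direction switch fit in.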
Thank you! I am new to libgrape-lite and am learning how to use and modify it. For example, the BFS code currently runs from a single source specified on the command line; how can I change this at the source-code level? Furthermore, how do I merge other code with this program?
The BFS source is passed in via gflags. Specifically, the source is set here.
To integrate another project with libgrape-lite, you can refer to the CreateAndQuery functions. FYI: libgrape-lite also supports multi-GPU. The computation and communication follow the PIE model.
Do you have questions or need support? Please describe.
I ran libgrape-lite on an A100 with the graph500-26 dataset using BFS. The results are as follows:
load graph: 1080.76 sec
load application: 0.341124 sec
run algorithm: 0.23278 sec
print output: 57.175 sec
It is not faster than on the CPU. What's wrong with it? Is this OK? According to the graph500 benchmark list (https://graph500.org/?page_id=12), GPU performance for BFS on graph500-26 can be as high as 319.061 GTEPS, while in libgrape-lite the result works out to 2^26 × 16 / 0.23 ≈ 4.66 GTEPS. Is that too small?
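For reference, the GTEPS estimate above can be reproduced with a few lines of C++. This is only a sketch: `Graph500Edges` and `Gteps` are hypothetical helper names, and it assumes the nominal Graph500 edge count (edge factor 16 times 2^scale vertices), which may differ from the number of edges actually stored in the dataset file after deduplication.

```cpp
#include <cassert>
#include <cstdint>

// Nominal Graph500 edge count: edge_factor * 2^scale.
// Assumption: edge factor 16, as in the Graph500 benchmark definition.
std::uint64_t Graph500Edges(unsigned scale, std::uint64_t edge_factor = 16) {
  assert(scale < 60);  // keep the shift within 64 bits
  return edge_factor << scale;
}

// Giga traversed edges per second for one BFS run.
double Gteps(std::uint64_t edges, double seconds) {
  assert(seconds > 0.0);
  return static_cast<double>(edges) / seconds / 1e9;
}
```

With `Graph500Edges(26)` and the rounded 0.23 s this gives roughly 4.67 GTEPS; using the full reported 0.23278 s gives roughly 4.61 GTEPS. Either way it covers only the "run algorithm" phase and excludes graph loading and output.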