Ft llama opt #762

Open · wants to merge 60 commits into base: main

Commits (60)
bf93524
fix readme
dypshong Sep 11, 2023
6bbba86
add lamma template
dypshong Sep 11, 2023
a0276fb
rename varaible
dypshong Sep 12, 2023
a590e94
dump
dypshong Sep 12, 2023
a763d18
add examples
dypshong Sep 12, 2023
2cb06f1
llama......
dypshong Sep 13, 2023
ca0a25a
remove gpt dependency
dypshong Sep 13, 2023
662d3b6
fix loadModel to load llama & fix invokeGeneralLLaMALayerNorm to invo…
dypshong Sep 15, 2023
dbe0657
remove debug code and bug fix
dypshong Sep 15, 2023
29c7b69
only contextdecoder is necessary
dypshong Sep 15, 2023
1494d2f
dump
dypshong Sep 15, 2023
81dc94a
dump
dypshong Sep 16, 2023
d5b2c12
for junsik
dypshong Sep 16, 2023
8ec39b5
first success
dypshong Sep 16, 2023
4434e65
remove debugging code print
dypshong Sep 16, 2023
95a7efe
remove debugging code
dypshong Sep 16, 2023
0a0015d
LLaMA Constructor fix
dypshong Sep 18, 2023
6ed3747
llama-opt
dypshong Sep 18, 2023
321bc73
buf fix
dypshong Sep 18, 2023
837e9d7
dump
dypshong Sep 18, 2023
56c3325
add gemm_cofing.in
dypshong Sep 18, 2023
4a0a9d7
remove backup file trace
dypshong Sep 19, 2023
e63b85b
remove gemm_config.in
dypshong Sep 19, 2023
3a10308
dumdump
dypshong Sep 20, 2023
be29883
test done
dypshong Sep 22, 2023
d403986
debug code
dypshong Sep 23, 2023
db6efdd
dump
dypshong Sep 24, 2023
6e09959
dmpdmp
dypshong Sep 24, 2023
cf8087a
dp
dypshong Sep 24, 2023
13478f4
dmp
dypshong Sep 25, 2023
df743e0
no cache version
dypshong Sep 25, 2023
220aec0
no-cache version bug fix
dypshong Sep 25, 2023
4fb06e7
cache version
dypshong Sep 25, 2023
5f10b4e
Merge pull request #1 from shongshong2/llama-cache
dypshong Sep 25, 2023
857d956
remove logging
dypshong Sep 25, 2023
2609e97
Merge branch 'llama-cache' into main
dypshong Sep 25, 2023
3074afa
remove README
dypshong Sep 26, 2023
d472912
Merge branch 'main' of https://github.com/shongshong2/FasterTransform…
dypshong Sep 26, 2023
8092007
overlap
dypshong Sep 26, 2023
1187340
overlapping versino
dypshong Sep 27, 2023
949c4e7
start_pos for each sample
dypshong Sep 29, 2023
083f3bb
get back start_pos
dypshong Sep 29, 2023
f08ada9
debug
dypshong Sep 29, 2023
e5d92df
chkpt
dypshong Sep 29, 2023
433f2c9
ckpt
dypshong Sep 29, 2023
b63b496
ckpt
dypshong Sep 30, 2023
6ee6105
08:42
dypshong Sep 30, 2023
1955508
# input check bug fix
dypshong Sep 30, 2023
57dded4
code rf
dypshong Sep 30, 2023
d3a83aa
add macro
dypshong Sep 30, 2023
1365462
07_03
dypshong Sep 30, 2023
305540f
add multiple devent
dypshong Sep 30, 2023
294a6fc
ft_llama-06_48
dypshong Oct 1, 2023
a7e7089
ref
dypshong Oct 1, 2023
b8f34e0
Merge pull request #2 from shongshong2/ft_llama-06_48
dypshong Oct 1, 2023
5515d83
remove mpi requirement
dypshong Oct 1, 2023
9b38f6e
Merge pull request #3 from shongshong2/ft-llama-opt
dypshong Oct 1, 2023
ce8c72a
add mpi_cxx
dypshong Oct 1, 2023
48b35f7
ref
dypshong Oct 1, 2023
baca61b
final
dypshong Oct 2, 2023
Files changed
5 changes: 4 additions & 1 deletion .gitignore
@@ -15,4 +15,7 @@ __pycache__/
 **/.ipynb_checkpoints/

 /3rdparty/NeMo/
-/3rdparty/apex/
+/3rdparty/apex/
+20B_checkpoints/
+compile_commands.json
+model/
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -418,7 +418,7 @@ add_library(transformer-shared SHARED

 if (BUILD_MULTI_GPU)
     target_link_libraries(transformer-shared PUBLIC
-        -lmpi
+        -lmpi -lmpi_cxx
         ${NCCL_LIBRARIES}
     )
 endif()
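Commit ce8c72a adds -lmpi_cxx alongside -lmpi. With OpenMPI, libmpi_cxx carries the (deprecated) MPI C++ bindings, so the extra flag is presumably there because some translation unit now references them. A minimal illustration of the kind of code that needs it; this is an assumption for context, not code from the PR:

// Illustrative only: MPI::Init and MPI::COMM_WORLD come from OpenMPI's C++
// bindings, whose symbols live in libmpi_cxx rather than libmpi, so linking
// such code needs -lmpi -lmpi_cxx.
#include <mpi.h>

int main(int argc, char** argv)
{
    MPI::Init(argc, argv);                        // C++ binding, resolved from libmpi_cxx
    const int rank = MPI::COMM_WORLD.Get_rank();  // C++ binding as well
    (void)rank;
    MPI::Finalize();
    return 0;
}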
417 changes: 417 additions & 0 deletions FasterTransformerReadME.md

Large diffs are not rendered by default.

418 changes: 8 additions & 410 deletions README.md

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions examples/cpp/CMakeLists.txt
@@ -27,6 +27,7 @@ add_subdirectory(wenet)
 add_subdirectory(gptj)
 add_subdirectory(gptneox)
 add_subdirectory(multi_gpu_gpt)
+#add_subdirectory(llama)

 if(ENABLE_FP8)
 add_subdirectory(gpt_fp8)
22 changes: 22 additions & 0 deletions examples/cpp/llama/CMakeLists.txt
@@ -0,0 +1,22 @@
# Copyright (c) 2019-2023, NVIDIA CORPORATION. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

add_library(llama_example_utils STATIC llama_example_utils.cc)
target_link_libraries(llama_example_utils PUBLIC -lcublas -lcublasLt -lcudart
nvtx_utils mpi_utils nccl_utils)

add_executable(llama_example llama_example.cc)
target_link_libraries(llama_example PUBLIC -lcublas -lcublasLt -lcudart
LLaMA mpi_utils nccl_utils nvtx_utils
llama_example_utils word_list)
2 changes: 2 additions & 0 deletions examples/cpp/llama/bad_words.csv
@@ -0,0 +1,2 @@
7768,3908
1,2
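The two rows appear to follow the bad-words layout used by FasterTransformer's other example front ends: the first row holds the flattened token ids of all banned sequences, the second row the cumulative end offset of each sequence, so 7768,3908 / 1,2 encodes two single-token bad words. A small parsing sketch for context; the file path and helper names are illustrative, not taken from the PR:

// Sketch only: read the two-row bad-words CSV into a flat id list plus offsets.
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

static std::vector<int> read_csv_row(std::istream& in)
{
    std::string line, cell;
    std::vector<int> row;
    if (std::getline(in, line)) {
        std::stringstream ss(line);
        while (std::getline(ss, cell, ',')) {
            row.push_back(std::stoi(cell));
        }
    }
    return row;
}

int main()
{
    std::ifstream file("examples/cpp/llama/bad_words.csv");
    std::vector<int> token_ids = read_csv_row(file);  // {7768, 3908}
    std::vector<int> offsets   = read_csv_row(file);  // {1, 2}: word 0 = ids[0,1), word 1 = ids[1,2)
    // The two rows are typically concatenated into one [2, n] int tensor
    // before being handed to the decoding kernels.
    return 0;
}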
21 changes: 21 additions & 0 deletions examples/cpp/llama/llama_config.ini
@@ -0,0 +1,21 @@
[ft_instance_hyperparameter]
model_name=llama_33B
model_dir=../models/llama
data_type=fp16
pipeline_para_size=4


[request]
request_batch_size=32
start_pos=2

[llama_33B]
head_num=52
size_per_head=128
vocab_size=32000
decoder_layers=60
rotary_embedding=128
multiple_of=256
max_seq_len=1024
padding_id=0
random_seed=0
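For reference, the [llama_33B] section is selected through model_name, and head_num * size_per_head gives the hidden size (52 * 128 = 6656, matching LLaMA-33B with its 60 layers). A hedged sketch of how the example might read this file, assuming the INIReader helper bundled under 3rdparty/; the path and variable names below are illustrative, not taken from llama_example.cc:

// Sketch only: load llama_config.ini and pull out the hyperparameters.
#include <cstdio>
#include <string>

#include "3rdparty/INIReader.h"  // assumed: the INI parser already vendored in the repo

int main()
{
    INIReader reader("../examples/cpp/llama/llama_config.ini");
    if (reader.ParseError() < 0) {
        std::printf("Cannot load llama_config.ini\n");
        return -1;
    }

    const std::string model_name = reader.Get("ft_instance_hyperparameter", "model_name", "llama_33B");
    const std::string data_type  = reader.Get("ft_instance_hyperparameter", "data_type", "fp16");
    const int pipeline_para_size = reader.GetInteger("ft_instance_hyperparameter", "pipeline_para_size", 1);

    const int request_batch_size = reader.GetInteger("request", "request_batch_size", 1);
    const int start_pos          = reader.GetInteger("request", "start_pos", 0);

    // Model-specific section, looked up via model_name ([llama_33B] here).
    const int head_num       = reader.GetInteger(model_name, "head_num", 0);
    const int size_per_head  = reader.GetInteger(model_name, "size_per_head", 0);
    const int decoder_layers = reader.GetInteger(model_name, "decoder_layers", 0);
    const int vocab_size     = reader.GetInteger(model_name, "vocab_size", 0);

    std::printf("%s: %d layers, hidden %d, vocab %d, batch %d, pp %d, start_pos %d, dtype %s\n",
                model_name.c_str(), decoder_layers, head_num * size_per_head, vocab_size,
                request_batch_size, pipeline_para_size, start_pos, data_type.c_str());
    return 0;
}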