Enhance wasm with checkpoint and restore support (#2333) #3289

Open: wants to merge 19 commits into base dev/checkpoint_and_restore

victoryang00

  • Add wasm_runtime_checkpoint/wasm_runtime_restore API
  • Support checkpointing and debugging in AOT and Classic Interpreter modes, triggered through an OS signal; tested on Windows/macOS/Linux, aarch64/x64
  • Statically instrument the AOT code to add the checkpoint and restore switches
  • Add an extra library subfolder for implementing checkpoint-restore (ckpt-restore)
  • Include an extra dependency on yalantinglib

This is a one-commit PR that adds the functionality for now. It still needs more code review and testing, for example to better organize the third-party yalantinglib library and to test the functionality in Classic Interpreter mode and AOT mode.
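
To make the signal-driven checkpoint idea concrete, here is a minimal toy sketch. It is not the code from this PR and does not use the real wasm_runtime_checkpoint/wasm_runtime_restore signatures, which are not shown in this conversation. All names (DemoState, demo_run, demo_restore, demo_on_signal) are hypothetical, and the "snapshot" is just a byte copy of a two-field struct.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy state; a real runtime's state is far larger. */
typedef struct DemoState {
    uint32_t pc;  /* next "instruction" to execute */
    uint64_t acc; /* running result */
} DemoState;

/* Set from the signal handler, polled by the execution loop. */
static volatile sig_atomic_t g_checkpoint_requested = 0;

static void
demo_on_signal(int signo)
{
    (void)signo;
    g_checkpoint_requested = 1;
}

/* Run until end_pc. Returns true when finished, or false when a
 * checkpoint was requested and the state was copied into snapshot. */
static bool
demo_run(DemoState *st, uint32_t end_pc, uint8_t *snapshot)
{
    assert(st != NULL && snapshot != NULL);
    while (st->pc < end_pc) {
        if (g_checkpoint_requested) {
            g_checkpoint_requested = 0;
            memcpy(snapshot, st, sizeof(*st));
            return false;
        }
        st->acc += st->pc;
        st->pc++;
    }
    return true;
}

/* Rebuild a state from a snapshot taken by demo_run. */
static void
demo_restore(DemoState *st, const uint8_t *snapshot)
{
    assert(st != NULL && snapshot != NULL);
    memcpy(st, snapshot, sizeof(*st));
}
```

The handler only sets a flag, and the loop does the actual copying at a safe point between steps; that keeps the handler async-signal-safe in this toy. Whether the PR follows the same split is an assumption.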

@victoryang00 victoryang00 force-pushed the main branch 3 times, most recently from 3fba687 to 5defce3 Compare April 8, 2024 02:24
@victoryang00 victoryang00 changed the base branch from main to dev/checkpoint_and_restore April 8, 2024 16:55
@victoryang00 victoryang00 changed the title Enhance wasm with checkpoint and restore support (#2333) [WIP] Enhance wasm with checkpoint and restore support (#2333) Apr 8, 2024
@victoryang00 victoryang00 force-pushed the main branch 2 times, most recently from e851973 to c3a1363 Compare April 9, 2024 21:07
@victoryang00 victoryang00 force-pushed the main branch 17 times, most recently from 0d74ae0 to 861d6a3 Compare May 2, 2024 00:19
@tamaroning

Hi @victoryang00 !
What is the current status of this pr?
I am interested in the implementation and interfaces of this feature.
(There is no intention to rush you.)

@victoryang00
Author

Hi @victoryang00 ! What is the current status of this pr? I am interested in the implementation and interfaces of this feature. (There is no intention to rush you.)

You can now use https://github.com/Multi-V-VM/MVVM first, this PR requires a lot more work to be accepted.

@tamaroning

@victoryang00
Thank you!

@victoryang00 victoryang00 force-pushed the main branch 2 times, most recently from 892d1ca to 2da0ccf Compare May 26, 2024 18:37
Contributor

Don't add this file, it has been moved into .github/scripts/codeql_buildscript.sh.

Contributor

Same as above, this file has been moved to .github/scripts/codeql_fail_on_error.py

README.md Outdated
@@ -41,6 +41,7 @@ WebAssembly Micro Runtime (WAMR) is a lightweight standalone WebAssembly (Wasm)
- [Berkeley/Posix Socket support](./doc/socket_api.md), ref to [document](./doc/socket_api.md) and [sample](./samples/socket-api)
- [Multi-tier JIT](./product-mini#linux) and [Running mode control](https://bytecodealliance.github.io/wamr.dev/blog/introduction-to-wamr-running-modes/)
- Language bindings: [Go](./language-bindings/go/README.md), [Python](./language-bindings/python/README.md), [Rust](./language-bindings/rust/README.md)
- Language bindings: [Go](./language-bindings/go/README.md), [Python](./language-bindings/python/README.md), [Rust](./language-bindings/rust/README.md)
Contributor

Duplicated line. Did you mean to add the checkpoint and restore feature here?

Comment on lines 78 to 82
bh_static_assert(offsetof(AOTFrame, ip_offset) == sizeof(uintptr_t) * 4);
bh_static_assert(offsetof(AOTFrame, sp) == sizeof(uintptr_t) * 5);
bh_static_assert(offsetof(AOTFrame, frame_ref) == sizeof(uintptr_t) * 6);
bh_static_assert(offsetof(AOTFrame, lp) == sizeof(uintptr_t) * 7);

Contributor

Duplicated code, had better remove these lines.
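
For readers unfamiliar with these checks, the sketch below shows the general technique of pinning a struct layout at compile time. DemoFrame is a hypothetical stand-in, not the real AOTFrame definition, and its field list is an assumption chosen so that the slot numbers match the asserts quoted above. It uses standard C11 static_assert rather than the project's bh_static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a frame struct; NOT the real AOTFrame.
 * Assumes pointers and uintptr_t have the same size, which holds on
 * mainstream 32-bit and 64-bit targets. */
typedef struct DemoFrame {
    struct DemoFrame *prev_frame; /* slot 0 */
    uintptr_t func_index;         /* slot 1 */
    uintptr_t time_started;       /* slot 2 */
    uintptr_t reserved;           /* slot 3 */
    uintptr_t ip_offset;          /* slot 4 */
    uint32_t *sp;                 /* slot 5 */
    uint8_t *frame_ref;           /* slot 6 */
    uint32_t lp[1];               /* slot 7 */
} DemoFrame;

/* Compilation fails if a field moves, so code that hard-codes these
 * offsets (for example generated code) cannot silently go stale. */
static_assert(offsetof(DemoFrame, ip_offset) == sizeof(uintptr_t) * 4,
              "ip_offset must be slot 4");
static_assert(offsetof(DemoFrame, sp) == sizeof(uintptr_t) * 5,
              "sp must be slot 5");
static_assert(offsetof(DemoFrame, frame_ref) == sizeof(uintptr_t) * 6,
              "frame_ref must be slot 6");
static_assert(offsetof(DemoFrame, lp) == sizeof(uintptr_t) * 7,
              "lp must be slot 7");

/* Convert a byte offset into a pointer-sized slot number. */
static size_t
demo_frame_slot_of(size_t byte_offset)
{
    return byte_offset / sizeof(uintptr_t);
}
```

Since each assert is evaluated once per translation unit, repeating the same four lines adds nothing, which is the point of the review comment.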

#if WASM_ENABLE_SHARED_MEMORY != 0
/* Currently we have only one memory instance */
bool is_shared_memory =
module->memories[0].memory_flags & 0x02 ? true : false;
Contributor

Had better change 0x02 to SHARED_MEMORY_FLAG
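
A small sketch of the suggested change, assuming SHARED_MEMORY_FLAG has the value 0x02 as the quoted line implies. The helper demo_memory_is_shared is hypothetical, and the macro is defined locally here only so the example is self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value, taken from the magic number in the quoted code. */
#define SHARED_MEMORY_FLAG 0x02

/* Hypothetical helper: test the shared bit of a memory's flags. */
static bool
demo_memory_is_shared(uint32_t memory_flags)
{
    return (memory_flags & SHARED_MEMORY_FLAG) != 0;
}
```

Comparing against zero also removes the need for the `? true : false` ternary in the original expression.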

exec_env, exec_env->restore_call_chain, exec_env->handle);
if (((AOTFrame *)exec_env->cur_frame)->func_index != func_index) {
LOG_DEBUG("NOT MATCH!!!\n");
exit(1);
Contributor

It would be better not to exit here. Can we just use bh_assert(((AOTFrame *)exec_env->cur_frame)->func_index == func_index)?

frame->sp = frame->lp + max_local_cell_num;
#if WASM_ENABLE_GC != 0
frame->frame_ref = frame->sp + max_stack_cell_num;
#endif
Contributor

Duplicated code as below, how about:

#if WASM_ENABLE_GC != 0 || WASM_ENABLE_CHECKPOINT_RESTORE != 0
    frame->sp = frame->lp + max_local_cell_num;
    frame->frame_ref = (uint8 *)(frame->sp + max_stack_cell_num);
#endif
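
The pointer arithmetic in this suggestion can be sketched in isolation as follows. DemoFrame and demo_frame_init are hypothetical and only mirror the three fields used above: locals start at lp, the operand stack starts right after the locals, and the reference flags start right after the operand stack.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame with only the three pointers discussed above. */
typedef struct DemoFrame {
    uint32_t *lp;       /* start of the local cells */
    uint32_t *sp;       /* start of the operand stack cells */
    uint8_t *frame_ref; /* start of the per-cell reference flags */
} DemoFrame;

/* Lay out [locals | operand stack | ref flags] in one cell buffer. */
static void
demo_frame_init(DemoFrame *frame, uint32_t *cells,
                uint32_t max_local_cell_num, uint32_t max_stack_cell_num)
{
    assert(frame != NULL && cells != NULL);
    frame->lp = cells;
    frame->sp = frame->lp + max_local_cell_num;
    frame->frame_ref = (uint8_t *)(frame->sp + max_stack_cell_num);
}
```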

core/iwasm/common/wasm_exec_env.h (resolved)
core/iwasm/common/wasm_runtime_common.c (resolved)
* Andrew Quinn
*
* Copyright 2024 Regents of the University of California
* UC Santa Cruz Sluglab.
Contributor

Could you add the license info to the source files in this folder? And do you agree to release them under this project's license, or to add a line SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception like other files?

@wenyongh
Contributor

wenyongh commented Jun 7, 2024

@victoryang00 I only read several files of this PR and added some comments, and will look into more files in the future. BTW, I merged main into dev/checkpoint_and_restore (#3511), could you rebase the code again? Many thanks.

victoryang00 and others added 18 commits June 7, 2024 21:09
- Add wasm_runtime_checkpoint/wasm_runtime_restore API
- Support AOT and Classic Interpreter mode checkpoint and debug through OS signal, tested on windows/mac/linux aarch64/x64
- Static instrument the AOT to have the checkpoint and restore switches
- Add sub extra library folder for implementing the ckpt-restore
- Include extra dependency of yalantinglib

Co-authored-by: Aibo Hu <[email protected]>
Co-authored-by: kikispace <[email protected]>
Co-authored-by: Brian Zhao <[email protected]>
Signed-off-by: victoryang00 <[email protected]>
Signed-off-by: victoryang00 <[email protected]>
Signed-off-by: victoryang00 <[email protected]>
bytecodealliance#3190)

Also, print the function name on argument mismatch.

Signed-off-by: victoryang00 <[email protected]>
- Address values in call stack dump are relative to file beginning
- If running under fast-interp mode, address values are relative to
  every pre-compiled function beginning, which is not compatible
  with addr2line

Signed-off-by: victoryang00 <[email protected]>
…alliance#3206)

Update the `addr2line` script so that:
- line info is printed in a more convenient format, e.g.
```
0: c
        at wasm-micro-runtime/test-tools/addr2line/trap.c:5:1
1: b
        at wasm-micro-runtime/test-tools/addr2line/trap.c:11:12
2: a
        at wasm-micro-runtime/test-tools/addr2line/trap.c:17:12
```
similar to how Rust prints stack traces when there's a panic. In an IDE, the user
can conveniently click on the printed path and be redirected to the file line.
- a new `--no-addr` argument can be provided to the script

It can be used in fast interpreter mode (that is not supported by the script otherwise)
or with older wamr versions (where the stack trace only had the function index info
and not the function address). In that case, `wasm-objdump` is used to get the function
name from the index and `llvm-dwarfdump` to obtain the location info (where the line
refers to the start of the function).

Signed-off-by: victoryang00 <[email protected]>
…ytecodealliance#3209)

- (Export)wasm_runtime_module_malloc
  - wasm_module_malloc
    - wasm_module_malloc_internal
  - aot_module_malloc
    - aot_module_malloc_internal
- wasm_runtime_module_realloc
  - wasm_module_realloc
    - wasm_module_realloc_internal
  - aot_module_realloc
    - aot_module_realloc_internal
- (Export)wasm_runtime_module_free
  - wasm_module_free
    - wasm_module_free_internal
  - aot_module_free
    - aot_module_free_internal
- (Export)wasm_runtime_module_dup_data
  - wasm_module_dup_data
  - aot_module_dup_data
- (Export)wasm_runtime_validate_app_addr
- (Export)wasm_runtime_validate_app_str_addr
- (Export)wasm_runtime_validate_native_addr
- (Export)wasm_runtime_addr_app_to_native
- (Export)wasm_runtime_addr_native_to_app
- (Export)wasm_runtime_get_app_addr_range
- aot_set_aux_stack
- aot_get_aux_stack
- wasm_set_aux_stack
- wasm_get_aux_stack
- aot_check_app_addr_and_convert, wasm_check_app_addr_and_convert
  and jit_check_app_addr_and_convert
- wasm_exec_env_set_aux_stack
- wasm_exec_env_get_aux_stack
- wasm_cluster_create_thread
- wasm_cluster_allocate_aux_stack
- wasm_cluster_free_aux_stack

- WASMModule and AOTModule
  - field aux_data_end, aux_heap_base and aux_stack_bottom
- WASMExecEnv
  - field aux_stack_boundary and aux_stack_bottom
- AOTCompData
  - field aux_data_end, aux_heap_base and aux_stack_bottom
- WASMMemoryInstance(AOTMemoryInstance)
  - field memory_data_size and change __padding to is_memory64
- WASMModuleInstMemConsumption
  - field total_size and memories_size
- WASMDebugExecutionMemory
  - field start_offset and current_pos
- WASMCluster
  - field stack_tops

- libc-builtin
- libc-emcc
- libc-uvwasi
- libc-wasi
- Python and Go Language Embedding
- Interpreter Debug engine
- Multi-thread: lib-pthread, wasi-threads and thread manager

Signed-off-by: victoryang00 <[email protected]>
…ytecodealliance#3232)

Adding the N from "aot_func#N" to the import function count gives the correct
wasm function index.
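
As a worked illustration of that rule, the hypothetical helper below parses the N out of an "aot_func#N" symbol name and adds the import function count. The function name and signature are assumptions for illustration, not part of the addr2line script or the runtime.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: map "aot_func#N" to a wasm function index.
 * Returns false if the name does not have the expected form. */
static bool
demo_aot_name_to_wasm_func_index(const char *name,
                                 uint32_t import_func_count,
                                 uint32_t *out_index)
{
    unsigned int n = 0;

    if (sscanf(name, "aot_func#%u", &n) != 1)
        return false;

    /* Imported functions occupy the first indices of the function
     * index space, so defined functions start after them. */
    *out_index = import_func_count + (uint32_t)n;
    return true;
}
```

For example, with 3 imported functions, "aot_func#5" maps to wasm function index 8.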

Signed-off-by: victoryang00 <[email protected]>
Add CodeQL Workflow for Code Security Analysis

This pull request introduces a CodeQL workflow to enhance the security analysis of our repository.
CodeQL is a powerful static analysis tool that helps identify and mitigate security vulnerabilities in
our codebase. By integrating this workflow into our GitHub Actions, we can proactively identify
and address potential issues before they become security threats.

We added a new CodeQL workflow file (.github/workflows/codeql.yml) that
- Runs nightly; running on every pull request to the main branch may be considered in the future.
- Excludes queries with a high false positive rate or low-severity findings.
- Does not display results for third-party code, focusing only on our own codebase.

Testing:
To validate the functionality of this workflow, we have run several test scans on the codebase and
reviewed the results. The workflow successfully compiles the project, identifies issues, and provides
actionable insights while reducing noise by excluding certain queries and third-party code.

Deployment:
Once this pull request is merged, the CodeQL workflow will be active and automatically run on
every push and pull request to the main branch. To view the results of these code scans, please
follow these steps:
1. Under the repository name, click on the Security tab.
2. In the left sidebar, click Code scanning alerts.

Additional Information:
- You can further customize the workflow to adapt to your specific needs by modifying the workflow file.
- For more information on CodeQL and how to interpret its results, refer to the GitHub documentation
and the CodeQL documentation.

Signed-off-by: Brian <[email protected]>
Signed-off-by: victoryang00 <[email protected]>
…liance#3246)

Enhance CodeQL Code Security Analysis:
- Add more compilation combinations to build iwasm with different kinds of features
- Disable the run on PR creation and keep the nightly run, since the whole run takes a very long time;
   how to restore the run on PR creation will be checked in the future

Signed-off-by: victoryang00 <[email protected]>
Add a new cmake flag (cache variable) `WAMR_BUILD_MEMORY64` to enable
the memory64 feature. It can only be enabled on 64-bit platforms/targets and
can only use software boundary checks. When it is enabled, both
i32 and i64 linear memory types are supported. The main modifications are:

- wasm loader & mini-loader: loading and bytecode validating process
- wasm runtime: memory instantiating process
- classic-interpreter: wasm code executing process
- Support memory64 memory in related runtime APIs
- Modify main function type check when it's memory64 wasm file
- Modify `wasm_runtime_invoke_native` and `wasm_runtime_invoke_native_raw` to
  handle registered native function pointer argument when memory64 is enabled
- memory64 classic-interpreter spec test in `test_wamr.sh` and in CI

Currently, it supports memory64 wasm files that use core spec
(including bulk memory proposal) opcodes and threads opcodes.

ps.
bytecodealliance#3091
bytecodealliance#3240
bytecodealliance#3260

Signed-off-by: victoryang00 <[email protected]>
…+ methods (bytecodealliance#3190)" (bytecodealliance#3281)

This reverts commit 0e8d949.

Because it doesn't make much sense anymore after we disabled debug info
processing on C++ functions in:
"aot debug: process lldb_function_to_function_dbi only for C".

Signed-off-by: victoryang00 <[email protected]>
…alliance#3265)

- Add new API wasm_runtime_load_ex() in wasm_export.h
  and wasm_module_new_ex in wasm_c_api.h
- Put aot_create_perf_map() into a separate file aot_perf_map.c
- In perf.map, function names include user specified module name
- Enhance the script to help flamegraph generations

Signed-off-by: victoryang00 <[email protected]>
…ce#3282)

When thread manager is enabled, the aux stack of exec_env may be allocated
by wasm_cluster_allocate_aux_stack or disabled by setting aux_stack_bottom
as UINTPTR_MAX directly. For the latter, no need to free it.
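
The rule described above can be sketched as a guard around the free call. Everything here is hypothetical (DemoExecEnv, demo_free_aux_stack, the call counter); the only detail taken from the commit message is that UINTPTR_MAX marks an aux stack that was disabled rather than allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical exec env holding only the field discussed above. */
typedef struct DemoExecEnv {
    uintptr_t aux_stack_bottom; /* UINTPTR_MAX means "disabled" */
} DemoExecEnv;

/* Counts calls so the behavior can be observed in this sketch. */
static int g_demo_free_calls = 0;

/* Stand-in for the real deallocation routine. */
static void
demo_free_aux_stack(uintptr_t bottom)
{
    (void)bottom;
    g_demo_free_calls++;
}

/* Free the aux stack only if it was actually allocated. */
static void
demo_exec_env_release_aux_stack(DemoExecEnv *env)
{
    assert(env != NULL);
    if (env->aux_stack_bottom != UINTPTR_MAX) {
        demo_free_aux_stack(env->aux_stack_bottom);
        env->aux_stack_bottom = UINTPTR_MAX;
    }
}
```

Resetting the field after freeing is a design choice of this sketch that makes a second release call harmless; the commit message does not say whether the runtime does the same.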

And fix an issue when parsing the `--gc-heap-size=n` argument for iwasm, and
fix a variable shadowed warning in fast-jit.

Signed-off-by: victoryang00 <[email protected]>
…ance#3327)

And enable code format check for wasm_export.h and add '\n' in os_printf
in sgx platform source files.

Signed-off-by: victoryang00 <[email protected]>
…3355)

- Add a few APIs (bytecodealliance#3325)
   ```c
   wasm_runtime_detect_native_stack_overflow_size
   wasm_runtime_detect_native_stack_overflow
   ```
- Adapt the runtime to use them
- Adapt samples/native-stack-overflow to use them
- Add a few missing overflow checks in the interpreters
- Build and run the sample on the CI

Signed-off-by: victoryang00 <[email protected]>
- Add wasm_runtime_checkpoint/wasm_runtime_restore API
- Support AOT and Classic Interpreter mode checkpoint and debug through OS signal, tested on windows/mac/linux aarch64/x64
- Static instrument the AOT to have the checkpoint and restore switches
- Add sub extra library folder for implementing the ckpt-restore
- Include extra dependency of yalantinglib

Co-authored-by: Aibo Hu <[email protected]>
Co-authored-by: kikispace <[email protected]>
Co-authored-by: Brian Zhao <[email protected]>
Signed-off-by: victoryang00 <[email protected]>
@victoryang00 victoryang00 force-pushed the main branch 2 times, most recently from bc4aef3 to 4a5b269 Compare June 7, 2024 13:49
@victoryang00 victoryang00 force-pushed the main branch 2 times, most recently from 3b9e1b6 to c1616ba Compare July 19, 2024 05:51
@victoryang00
Author

Ready for merge. Still has problems with interpreter mode SIGINT file snapshots and AOT compilation.

@victoryang00 victoryang00 changed the title [WIP] Enhance wasm with checkpoint and restore support (#2333) Enhance wasm with checkpoint and restore support (#2333) Jul 29, 2024