Paddle version: paddlepaddle_gpu-3.0.0.dev20241223-cp310-cp310-linux_x86_64.whl
The sample code in section 3.1 Data Parallelism of the documentation runs fine.
The sample code in section 3.4 3D Hybrid Parallelism fails when run directly. Error message:
for step, inputs in enumerate(dataloader):
File "/usr/local/lib/python3.10/dist-packages/paddle/distributed/auto_parallel/api.py", line 3309, in __next__
batch_data = next(self.iter)
AttributeError: 'ShardDataloader' object has no attribute 'iter'
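The AttributeError above is the shape you get when an iterator wrapper only creates its underlying iterator inside `__iter__`, so calling `__next__` before `__iter__` finds no `iter` attribute. A minimal pure-Python sketch of that pattern (the class and attribute names here are illustrative stand-ins, not Paddle's actual `ShardDataloader` implementation):

```python
# Sketch of the failure pattern behind
# "AttributeError: ... object has no attribute 'iter'".
class LazyLoader:
    def __init__(self, data):
        self.data = data  # underlying dataset

    def __iter__(self):
        # The wrapped iterator is only created here, lazily.
        self.iter = iter(self.data)
        return self

    def __next__(self):
        # Raises AttributeError if __iter__ was never called first.
        return next(self.iter)

loader = LazyLoader([1, 2, 3])

# Calling __next__ directly, without iter(), reproduces the error shape:
try:
    loader.__next__()
except AttributeError as e:
    print("AttributeError:", e)

# Iterating normally (which calls __iter__ first) works:
print(list(loader))  # -> [1, 2, 3]
```

This is consistent with the fix working once the dataloader is consumed through the normal `for ... in ...` protocol, which invokes `__iter__` before any `__next__` call.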
Thanks for your help, it runs now. However, the test example in section 4.5 Dynamic-to-Static Training still fails:
That section says to add a snippet of code, but after I add it to the 4.4 3D Hybrid Parallelism code, it errors out.
Error log:
INFO 2024-12-26 10:00:20,273 helper.py:274] start to build program for mode = train.
C++ Traceback (most recent call last):
0 paddle::pybind::static_api_mean(_object*, _object*, _object*)
1 CallStackRecorder::AttachToOps()
2 CallStackRecorder::GetOpCallstackInfo()
Error Message Summary:
FatalError: `Segmentation fault` is detected by the operating system.
[TimeInfo: *** Aborted at 1735207220 (unix time) try "date -d @1735207220" if you are using GNU date ***]
[SignalInfo: *** SIGSEGV (@0x0) received by PID 37717 (TID 0x7fa67a0a1740) from PID 0 ***]
# Code as follows
opt = dist.shard_optimizer(opt)
# Code added as described in the documentation
dist_model = dist.to_static(
    model, dataloader, paddle.mean, opt
)
dist_model.train()
for step, inputs in enumerate(dataloader()):
    data = inputs
    loss = dist_model(data)
    print(step, loss)
exit()
# Original dynamic-graph training loop (not reached after exit()):
for step, inputs in enumerate(dataloader()):
    data = inputs[0]
    logits = model(data)
    loss = paddle.mean(logits)
    loss.backward()
    opt.step()
    opt.clear_grad()
https://www.paddlepaddle.org.cn/documentation/docs/zh/guides/paddle_v3_features/auto_parallel_cn.html#sanzidongbingxinghefenbushicelve