LoRA fine-tuning of Qwen3-coder-30B for agent capability on NPU is extremely slow #6455

@buer103

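LoRA fine-tuning Qwen3-coder-30B for agent capability on a single node with 8 910B2 cards is extremely slow. Why is that? Device memory usage is also only moderate.
Launch command: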
ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
NPROC_PER_NODE=8 \
swift sft \
    --model /Qwen3-Coder-30B-A3B-Instruct/ \
    --model_type qwen3_moe \
    --template qwen3 \
    --train_type lora \
    --lora_rank 8 \
    --torch_dtype bfloat16 \
    --dataset /data.jsonl \
    --num_train_epochs 2 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 2 \
    --learning_rate 1e-5 \
    --lr_scheduler_type cosine \
    --warmup_ratio 0.05 \
    --max_length 8192 \
    --truncation_strategy left \
    --padding_side left \
    --output_dir /Qwen3-Coder-30B-A3B-Instruct-LoRA \
    --save_steps 8 \
    --eval_steps 8 \
    --save_total_limit 16 \
    --save_strategy steps \
    --save_only_model true \
    --dataloader_num_workers 4 \
    --dataset_num_proc 16 \
    --report_to tensorboard \
    --logging_steps 1 \
    --deepspeed zero3
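
For reference, the batch arithmetic implied by these flags can be worked out as below. This is only a rough sketch: it assumes all 8 devices act as data-parallel ranks under ZeRO-3 and that no sample exceeds `max_length` tokens. The variable names are illustrative and are not part of ms-swift.

```python
# Values copied from the launch command above.
nproc_per_node = 8                 # NPROC_PER_NODE
per_device_train_batch_size = 1    # --per_device_train_batch_size
gradient_accumulation_steps = 2    # --gradient_accumulation_steps
max_length = 8192                  # --max_length
save_steps = 8                     # --save_steps (same value as --eval_steps)

# Assumption: pure data parallelism, one rank per NPU.
global_batch = (
    nproc_per_node * per_device_train_batch_size * gradient_accumulation_steps
)

# Upper bound on tokens consumed per optimizer step.
max_tokens_per_step = global_batch * max_length

# Training samples seen between two consecutive checkpoints.
samples_between_checkpoints = global_batch * save_steps

print(global_batch, max_tokens_per_step, samples_between_checkpoints)
```

Under these assumptions each optimizer step covers 16 samples (at most 131,072 tokens), and a checkpoint is written every 128 training samples. If an evaluation set is configured, an evaluation pass would run at the same interval.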

NPU usage:

+------------------------------------------------------------------------------------------------+
| npu-smi 23.0.6                   Version: 23.0.6                                               |
+---------------------------+---------------+----------------------------------------------------+
| NPU   Name                | Health        | Power(W)    Temp(C)           Hugepages-Usage(page)|
| Chip                      | Bus-Id        | AICore(%)   Memory-Usage(MB)  HBM-Usage(MB)        |
+===========================+===============+====================================================+
| 0     910B2               | OK            | 103.5       49                0    / 0             |
| 0                         | 0000:C1:00.0  | 0           0    / 0          23434/ 65536         |
+===========================+===============+====================================================+
| 1     910B2               | OK            | 100.6       52                0    / 0             |
| 0                         | 0000:01:00.0  | 0           0    / 0          26618/ 65536         |
+===========================+===============+====================================================+
| 2     910B2               | OK            | 104.7       52                0    / 0             |
| 0                         | 0000:C2:00.0  | 0           0    / 0          22957/ 65536         |
+===========================+===============+====================================================+
| 3     910B2               | OK            | 108.6       51                0    / 0             |
| 0                         | 0000:02:00.0  | 0           0    / 0          20591/ 65536         |
+===========================+===============+====================================================+
| 4     910B2               | OK            | 106.2       49                0    / 0             |
| 0                         | 0000:81:00.0  | 0           0    / 0          23811/ 65536         |
+===========================+===============+====================================================+
| 5     910B2               | OK            | 107.0       52                0    / 0             |
| 0                         | 0000:41:00.0  | 0           0    / 0          18246/ 65536         |
+===========================+===============+====================================================+
| 6     910B2               | OK            | 101.8       50                0    / 0             |
| 0                         | 0000:82:00.0  | 0           0    / 0          21878/ 65536         |
+===========================+===============+====================================================+
| 7     910B2               | OK            | 104.7       52                0    / 0             |
| 0                         | 0000:42:00.0  | 0           0    / 0          23330/ 65536         |
+===========================+===============+====================================================+
+---------------------------+---------------+----------------------------------------------------+
| NPU     Chip              | Process id    | Process name             | Process memory(MB)      |
+===========================+===============+====================================================+
| 0       0                 | 1062078       | pt_main_thread           | 20117                   |
+===========================+===============+====================================================+
| 1       0                 | 1062079       | pt_main_thread           | 23301                   |
+===========================+===============+====================================================+
| 2       0                 | 1062080       | pt_main_thread           | 19641                   |
+===========================+===============+====================================================+
| 3       0                 | 1062081       | pt_main_thread           | 17273                   |
+===========================+===============+====================================================+
| 4       0                 | 1062082       | pt_main_thread           | 20491                   |
+===========================+===============+====================================================+
| 5       0                 | 1062083       | pt_main_thread           | 14931                   |
+===========================+===============+====================================================+
| 6       0                 | 1062084       | pt_main_thread           | 18561                   |
+===========================+===============+====================================================+
| 7       0                 | 1062085       | pt_main_thread           | 20013                   |
+===========================+===============+====================================================+
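
To put a number on "only moderate", the HBM figures from the snapshot above can be summarized as follows. This is a small sketch over the eight `HBM-Usage(MB)` values exactly as reported; the variable names are illustrative, and a single `npu-smi` snapshot may not reflect peak usage during a training step.

```python
# HBM-Usage(MB) per NPU, copied from the npu-smi output above.
hbm_used_mb = [23434, 26618, 22957, 20591, 23811, 18246, 21878, 23330]
hbm_total_mb = 65536  # capacity reported for each 910B2

# Average and peak utilization across the 8 devices.
mean_used_mb = sum(hbm_used_mb) / len(hbm_used_mb)
mean_fraction = mean_used_mb / hbm_total_mb
peak_fraction = max(hbm_used_mb) / hbm_total_mb

print(f"mean {mean_used_mb:.0f} MB ({mean_fraction:.1%}), peak {peak_fraction:.1%}")
```

On average that is about 22.6 GB per card, roughly a third of the 64 GB available. The same snapshot shows `AICore(%)` at 0 on every device.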
