
Conversation

@wtmlon wtmlon commented Nov 26, 2024

PR types

PR changes

Description

Support capping the number of times a compressed (quantized) checkpoint can be resumed.

paddle-bot bot commented Nov 26, 2024

Thanks for your contribution!

codecov bot commented Nov 26, 2024

Codecov Report

Attention: Patch coverage is 11.42857% with 31 lines in your changes missing coverage. Please review.

Project coverage is 53.10%. Comparing base (8fd33a9) to head (12d84c6).
Report is 228 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...p/trainer/unified_checkpoint/unified_checkpoint.py | 5.00% | 19 Missing ⚠️ |
| ...lp/quantization/unified_checkpoint_quantization.py | 15.38% | 11 Missing ⚠️ |
| paddlenlp/transformers/model_utils.py | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9494      +/-   ##
===========================================
+ Coverage    52.91%   53.10%   +0.19%     
===========================================
  Files          688      694       +6     
  Lines       109331   110989    +1658     
===========================================
+ Hits         57848    58940    +1092     
- Misses       51483    52049     +566     

☔ View full report in Codecov by Sentry.


# Quantization times exceed the limit; turn off the quantization strategy.
if quant_ckpt_resume_times > MAX_QUANTIZATION_TIMES:
    ckpt_quant_stage = "O0"
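A minimal sketch of the cap logic this snippet implements. The names `MAX_QUANTIZATION_TIMES` and `quant_ckpt_resume_times` come from the snippet above; the index-dict layout and the cap value are assumptions for illustration, not the exact PaddleNLP implementation.

```python
# Assumed cap on quantized-checkpoint save/resume cycles (illustrative value).
MAX_QUANTIZATION_TIMES = 5

def resolve_quant_stage(index: dict) -> str:
    """Read the quant stage and resume count from a saved optimizer index
    (hypothetical layout) and disable quantization once the cap is exceeded."""
    ckpt_quant_stage = index.get("ckpt_quant_stage", "O0")
    quant_ckpt_resume_times = index.get("quant_ckpt_resume_times", 0)
    if quant_ckpt_resume_times > MAX_QUANTIZATION_TIMES:
        # Quantization times exceed the limit; turn off the quantization strategy.
        ckpt_quant_stage = "O0"
    return ckpt_quant_stage
```

Returning `"O0"` (no quantization) past the cap prevents repeated quantize/dequantize cycles from accumulating error across many resumes.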
@DesmonDay (Contributor) commented Nov 26, 2024

Putting this switch change here doesn't feel right. MAX_QUANTIZATION_TIMES mainly limits how many times you save a compressed checkpoint, so the change to ckpt_quant_stage should be synced into the save logic; changing it here in the load path has no effect, does it?

Contributor

One remaining question here: this is the optimizer-loading logic, so ckpt_quant_stage = "O0" should not be set by outside changes; it should instead be read directly from the index saved with the checkpoint.
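The reviewer's suggestion can be sketched as follows: on load, recover the quant stage from the index file written at save time rather than mutating it externally. The file name `optimizer.safetensors.index.json` is an assumption for illustration; the real index file name may differ.

```python
import json
import os

def load_quant_stage(ckpt_dir: str) -> str:
    """Read ckpt_quant_stage from the optimizer index saved with the
    checkpoint (hypothetical file name), defaulting to "O0"."""
    index_path = os.path.join(ckpt_dir, "optimizer.safetensors.index.json")
    if not os.path.exists(index_path):
        # No index file means the checkpoint was not saved quantized.
        return "O0"
    with open(index_path) as f:
        index = json.load(f)
    return index.get("ckpt_quant_stage", "O0")
```

This keeps the load path a pure function of what was actually written at save time, which is the invariant the reviewer is asking for.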

  # save opt index json if checkpoint quantization is on.
- if self.args.ckpt_quant_stage != "O0":
+ if self.args.ckpt_quant_stage != "O0" and "quant_reach_limit" not in infohub:
      sharded_optim_index = {"ckpt_quant_stage": self.args.ckpt_quant_stage}
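A minimal sketch of this save-side guard. It assumes `infohub` is a process-wide dict-like registry and that the `"quant_reach_limit"` key marks that the save cap was hit, as the diff suggests; both are taken from the snippet, the rest is illustrative.

```python
# Stand-in for PaddleNLP's global infohub registry (assumption: dict-like).
infohub = {}

def build_optim_index(ckpt_quant_stage: str) -> dict:
    """Build the optimizer index entry, recording the quant stage only when
    quantization is on and the save cap has not been reached."""
    sharded_optim_index = {}
    # save opt index json if checkpoint quantization is on
    # and the quantization-times cap has not been reached.
    if ckpt_quant_stage != "O0" and "quant_reach_limit" not in infohub:
        sharded_optim_index["ckpt_quant_stage"] = ckpt_quant_stage
    return sharded_optim_index
```

Because later resumes read the stage back from this index, omitting the key once the cap is hit is what actually turns quantization off for subsequent runs.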


  is_sync=is_sync_save,
  state_dict_type="optimizer_weight",
- ckpt_quant_stage=self.args.ckpt_quant_stage,
+ ckpt_quant_stage=self.args.ckpt_quant_stage if "quant_reach_limit" not in infohub else "O0",


  is_sync=is_sync_save,
  state_dict_type="optimizer_weight",
- ckpt_quant_stage=self.args.ckpt_quant_stage,
+ ckpt_quant_stage=self.args.ckpt_quant_stage if "quant_reach_limit" not in infohub else "O0",
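The same conditional expression appears at both async-save call sites above; it can be factored into a small helper. This is a hedged sketch of that pattern, assuming `infohub` behaves like a dict:

```python
def effective_quant_stage(configured_stage: str, infohub: dict) -> str:
    """Return the configured quant stage, falling back to "O0" (quantization
    off) once the "quant_reach_limit" flag has been set in infohub."""
    return configured_stage if "quant_reach_limit" not in infohub else "O0"
```

Computing the effective stage in one place avoids the duplicated ternary drifting out of sync between the two call sites.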


@DesmonDay (Contributor) left a comment

LGTM

@wawltor wawltor merged commit 2985f90 into PaddlePaddle:develop Nov 29, 2024
9 of 12 checks passed
wtmlon added a commit to wtmlon/PaddleNLP that referenced this pull request Nov 29, 2024
* support quant ckpt limit strategy

* bug fix

* bug fix

* fix bug

* add log, fix bug
Conflicts:
	paddlenlp/utils/env.py
