
convert-hf-to-ggml.py offload_weight() too many parameters #36


Description

@dmeziere

I tried to follow the Quick start.

$ git clone https://github.com/bigcode-project/starcoder.cpp
Cloning into 'starcoder.cpp'...
remote: Enumerating objects: 110, done.
remote: Counting objects: 100% (110/110), done.
remote: Compressing objects: 100% (78/78), done.
remote: Total 110 (delta 34), reused 93 (delta 25), pack-reused 0 (from 0)
Receiving objects: 100% (110/110), 7.28 MiB | 28.89 MiB/s, done.
Resolving deltas: 100% (34/34), done.
$ cd starcoder.cpp/
$ python convert-hf-to-ggml.py bigcode/gpt_bigcode-santacoder
Loading model:  bigcode/gpt_bigcode-santacoder
tokenizer_config.json: 100%| 159/159 [00:00<00:00, 324kB/s]
tokenizer.json: 100%| 2.08M/2.08M [00:00<00:00, 6.11MB/s]
special_tokens_map.json: 100%| 138/138 [00:00<00:00, 387kB/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
config.json: 100%| 812/812 [00:00<00:00, 2.52MB/s]
model.safetensors: 100%| 2.25G/2.25G [00:43<00:00, 52.3MB/s]
Traceback (most recent call last):
  File "/mnt/hdd/Data/ia/starcoder.cpp/convert-hf-to-ggml.py", line 58, in <module>
    model = AutoModelForCausalLM.from_pretrained(model_name, config=config, torch_dtype=torch.float16 if use_f16 else torch.float32, low_cpu_mem_usage=True, trust_remote_code=True, offload_state_dict=True)
  File "/home/dmeziere/.local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/home/dmeziere/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3507, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/dmeziere/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3932, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/home/dmeziere/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 798, in _load_state_dict_into_meta_model
    state_dict_index = offload_weight(param, param_name, model, state_dict_folder, state_dict_index)
TypeError: offload_weight() takes from 3 to 4 positional arguments but 5 were given
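For reference, the failure is an arity mismatch: the installed accelerate's offload_weight() accepts at most four positional arguments, while this transformers version passes five. A minimal sketch of the mismatch (the stub signature below is an assumption reconstructed from the error message, not the real accelerate source):

```python
# Stub with the arity the error message reports: 3 to 4 positional arguments.
# (Assumed shape; the real function lives in the accelerate package.)
def offload_weight(weight, weight_name, offload_folder, index=None):
    return index

try:
    # transformers' _load_state_dict_into_meta_model passes five arguments:
    offload_weight("param", "param_name", "model", "state_dict_folder", {})
except TypeError as e:
    print(e)  # offload_weight() takes from 3 to 4 positional arguments but 5 were given
```

Upgrading transformers and accelerate together, so the two libraries are at versions that were released to work with each other, is the usual remedy for this class of TypeError.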

Then I tried make; it printed a lot of warnings, but it seems to have compiled.

$ ./quantize models/bigcode/gpt_bigcode-santacoder-ggml.bin models/bigcode/gpt_bigcode-santacoder-ggml-q4_1.bin 3
starcoder_model_quantize: loading model from 'models/bigcode/gpt_bigcode-santacoder-ggml.bin'
starcoder_model_quantize: failed to open 'models/bigcode/gpt_bigcode-santacoder-ggml.bin' for reading
main: failed to quantize model from 'models/bigcode/gpt_bigcode-santacoder-ggml.bin'

The model file is missing, most likely because convert-hf-to-ggml.py failed with the error above.
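Until the conversion succeeds, quantize will keep failing on the missing file. A small guard (paths taken from the commands above) makes that dependency explicit:

```shell
# Only quantize if the converter actually produced the ggml file.
f=models/bigcode/gpt_bigcode-santacoder-ggml.bin
if [ -f "$f" ]; then
  ./quantize "$f" "${f%.bin}-q4_1.bin" 3
else
  echo "missing: $f (re-run convert-hf-to-ggml.py first)"
fi
```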
