I tried to follow the Quick start.
$ git clone https://github.com/bigcode-project/starcoder.cpp
Cloning into 'starcoder.cpp'...
remote: Enumerating objects: 110, done.
remote: Counting objects: 100% (110/110), done.
remote: Compressing objects: 100% (78/78), done.
remote: Total 110 (delta 34), reused 93 (delta 25), pack-reused 0 (from 0)
Receiving objects: 100% (110/110), 7.28 MiB | 28.89 MiB/s, done.
Resolving deltas: 100% (34/34), done.
$ cd starcoder.cpp/
$ python convert-hf-to-ggml.py bigcode/gpt_bigcode-santacoder
Loading model: bigcode/gpt_bigcode-santacoder
tokenizer_config.json: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 159/159 [00:00<00:00, 324kB/s]
tokenizer.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2.08M/2.08M [00:00<00:00, 6.11MB/s]
special_tokens_map.json: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 138/138 [00:00<00:00, 387kB/s]
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 812/812 [00:00<00:00, 2.52MB/s]
model.safetensors: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2.25G/2.25G [00:43<00:00, 52.3MB/s]
Traceback (most recent call last):
File "/mnt/hdd/Data/ia/starcoder.cpp/convert-hf-to-ggml.py", line 58, in <module>
model = AutoModelForCausalLM.from_pretrained(model_name, config=config, torch_dtype=torch.float16 if use_f16 else torch.float32, low_cpu_mem_usage=True, trust_remote_code=True, offload_state_dict=True)
File "/home/dmeziere/.local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
return model_class.from_pretrained(
File "/home/dmeziere/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3507, in from_pretrained
) = cls._load_pretrained_model(
File "/home/dmeziere/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3932, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/home/dmeziere/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 798, in _load_state_dict_into_meta_model
state_dict_index = offload_weight(param, param_name, model, state_dict_folder, state_dict_index)
TypeError: offload_weight() takes from 3 to 4 positional arguments but 5 were given
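This TypeError usually points to a version mismatch: `offload_weight` is a helper that `transformers` pulls in from `accelerate`, and its signature has changed across releases, so an installed `accelerate` that accepts 3 to 4 positional arguments breaks a `transformers` that still calls it with 5. A minimal stand-in illustrating the mismatch (the signature below is inferred from the error text, not copied from `accelerate`):

```python
import inspect

# Hypothetical stand-in for the installed accelerate helper: it accepts
# 3 to 4 positional arguments, matching the error message above.
def offload_weight(weight, weight_name, offload_folder, index=None):
    index = dict(index or {})
    index[weight_name] = {"offload_folder": offload_folder}
    return index

# Reproduce the failing 5-argument call from modeling_utils.py:
try:
    offload_weight("param", "wte.weight", "model", "state_dict_folder", {})
except TypeError as exc:
    print(exc)  # takes from 3 to 4 positional arguments but 5 were given

# The 4-argument form matches the installed signature and succeeds:
state_dict_index = offload_weight("param", "wte.weight", "state_dict_folder", {})
print(sorted(state_dict_index))
```

Upgrading `transformers` and `accelerate` together so the two agree on the signature would be the usual way out.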
Then I tried make; it also produced a lot of warnings, but I think it compiled.
$ ./quantize models/bigcode/gpt_bigcode-santacoder-ggml.bin models/bigcode/gpt_bigcode-santacoder-ggml-q4_1.bin 3
starcoder_model_quantize: loading model from 'models/bigcode/gpt_bigcode-santacoder-ggml.bin'
starcoder_model_quantize: failed to open 'models/bigcode/gpt_bigcode-santacoder-ggml.bin' for reading
main: failed to quantize model from 'models/bigcode/gpt_bigcode-santacoder-ggml.bin'
The model file seems to be missing, probably because of the errors from convert-hf-to-ggml.py.
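Since the quantize failure is just a downstream symptom, a quick pre-flight check that the conversion actually produced a non-empty GGML file can separate the two problems (a minimal sketch; the helper name and message are mine, not from the repo):

```python
from pathlib import Path

def ggml_model_ready(path):
    """Return True if the converted GGML file exists and is non-empty.

    Run this before ./quantize so a missing output from
    convert-hf-to-ggml.py is reported with a clearer message than
    'failed to open ... for reading'.
    """
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0

model = "models/bigcode/gpt_bigcode-santacoder-ggml.bin"
if not ggml_model_ready(model):
    print(f"{model} is missing or empty; rerun convert-hf-to-ggml.py first")
```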