GPT-NeoX has only minimal inference support #3293
Comments
Even if you add dummy scores and token types in the conversion script, it fails here: Line 2288 in bc9d3e3
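The dummy-metadata workaround mentioned above can be sketched as follows. GGUF vocabularies carry per-token score and token-type arrays alongside the token strings; a converter whose tokenizer only provides the strings can pad the other two arrays with placeholders. This is a hypothetical illustration in plain Python (the function name and the `TOKEN_TYPE_NORMAL` value are assumptions, not the actual conversion-script code):

```python
# Hypothetical sketch of padding per-token metadata for a GGUF vocabulary.
# The token-type constant mirrors the NORMAL type used by the gguf-py
# package; its numeric value (1) is an assumption here.
TOKEN_TYPE_NORMAL = 1

def pad_vocab_metadata(tokens):
    """Return (tokens, scores, toktypes) with dummy scores and types."""
    scores = [0.0] * len(tokens)              # no real merge scores available
    toktypes = [TOKEN_TYPE_NORMAL] * len(tokens)
    return tokens, scores, toktypes

tokens, scores, toktypes = pad_vocab_metadata(["<|endoftext|>", "hello", "world"])
```

As the comment above notes, padding like this satisfies the writer but can still trip validation further down the pipeline, which is the failure being reported.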
Was GPT-NeoX ever even implemented in GGUF?
Yes, example inference code exists here: https://github.com/ggerganov/llama.cpp/blob/master/examples/gptneox-wip/gptneox-main.cpp
Oh, it has a separate implementation, so I can't currently use it with any third-party software that uses the llama.cpp API. Edit: this file is not listed in either of the build scripts, and it doesn't seem to have GPU acceleration. It seems like that could be improved.
Yeah, there's just a PoC implementation. We should add it in.
The situation now is that we have code in this repository to "successfully" convert and quantize a GPT-NeoX model, but no way to run the resulting models.
I'd still like to bring this up again:
I believe you can run this using https://github.com/ggerganov/ggml/blob/master/examples/gpt-neox/main.cpp |
Oh, I already compiled that example for testing. It seems to expect the old ggml .bin files, which can be created using the convert program in the same example directory. It doesn't run the GGUF files that are built using the
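The old ggml `.bin` files and GGUF files can be told apart by their first four bytes, which is why the example above rejects GGUF output. A minimal format sniffer is sketched below; the magic values are taken from llama.cpp's historical loaders and should be treated as assumptions rather than a definitive list:

```python
# First-four-byte magics as they appear on disk in little-endian files
# (values assumed from llama.cpp's historical loaders):
#   b"GGUF"                      -> current GGUF format
#   b"lmgg" / b"fmgg" / b"tjgg"  -> legacy ggml / ggmf / ggjt .bin formats
MAGICS = {
    b"GGUF": "gguf",
    b"lmgg": "ggml (legacy, unversioned)",
    b"fmgg": "ggmf (legacy, versioned)",
    b"tjgg": "ggjt (legacy, mmap-able)",
}

def sniff_model_format(data: bytes) -> str:
    """Classify a model file by its 4-byte magic; 'unknown' otherwise."""
    return MAGICS.get(bytes(data[:4]), "unknown")
```

A loader built against only one of these headers will, as described above, simply refuse files in the other family.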
It's much easier to add new arches to |
This issue was closed because it has been inactive for 14 days since being marked as stale. |
I'm still interested in this. |
This issue was closed because it has been inactive for 14 days since being marked as stale. |