Replies: 2 comments
-
To my knowledge, the inference API does not support adapter models. You might need to merge the LoRA adapter into the base model and push the merged weights instead:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# base_model_name_or_path, lora_model_name_or_path, and model_name are
# placeholders for your base model, your adapter repo, and the target repo id.
model = AutoModelForCausalLM.from_pretrained(
    base_model_name_or_path, device_map='auto')
model = PeftModel.from_pretrained(
    model,
    lora_model_name_or_path,
    device_map='auto'
)
model = model.merge_and_unload()  # Needs peft>=0.3.0
model.push_to_hub(model_name)
# For the inference API to work, we need to push the tokenizer too.
tokenizer = AutoTokenizer.from_pretrained(base_model_name_or_path)
tokenizer.push_to_hub(model_name)
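Once the merged model is on the Hub, here is a minimal sketch of querying it through the hosted Inference API, assuming a recent huggingface_hub; "your-username/model_name" and the token are hypothetical placeholders:

from huggingface_hub import InferenceClient

# Hypothetical repo id and token; substitute your own.
client = InferenceClient(model="your-username/model_name", token="hf_...")
print(client.text_generation("Tell me about llamas.", max_new_tokens=50))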
-
This is very helpful. Thank you!
-
After successfully training, how would I use the LLaMA-based model in HuggingFace? I pushed the contents of the lora_models folder, which I uniquely labeled, but it is apparently missing the base model needed for the inference API to work.