LLM Fine-Tuning

Supervised Fine-tuning (SFT) with Unsloth (Recommended; see the SFT sketch below)

Direct Preference Optimization (DPO) (Not Recommended; see the DPO sketch below)
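
A minimal sketch of the Unsloth SFT path, assuming a 4-bit Mistral base and a dataset whose "text" column already holds fully formatted prompts. The model id, data file, and hyperparameters are illustrative placeholders, and dataset_text_field/max_seq_length match older TRL SFTTrainer signatures (newer releases take them via SFTConfig):

    # Minimal SFT sketch with Unsloth + TRL; model id, data file, and
    # hyperparameters are illustrative placeholders, not tested settings.
    from datasets import load_dataset
    from transformers import TrainingArguments
    from trl import SFTTrainer
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/mistral-7b-instruct-v0.2-bnb-4bit",  # assumed model id
        max_seq_length=2048,
        load_in_4bit=True,  # QLoRA-style 4-bit base weights
    )

    # Attach LoRA adapters; rank/alpha/targets are common defaults, not tuned values.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )

    # Assumes a JSONL file whose "text" column holds the formatted prompts.
    dataset = load_dataset("json", data_files="train.jsonl", split="train")

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        dataset_text_field="text",
        max_seq_length=2048,
        args=TrainingArguments(
            output_dir="sft-out",
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,
            num_train_epochs=1,
            learning_rate=2e-4,
            fp16=True,
        ),
    )
    trainer.train()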
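
For comparison, a DPO sketch against TRL's classic DPOTrainer signature (newer releases move beta into DPOConfig). It reuses model/tokenizer from the SFT sketch above; the preference file name is a placeholder:

    # Minimal DPO sketch with TRL; assumes `model` and `tokenizer` from the
    # SFT sketch above and a preference dataset.
    from datasets import load_dataset
    from transformers import TrainingArguments
    from trl import DPOTrainer

    # Assumes JSONL rows with "prompt", "chosen", and "rejected" fields.
    pref_dataset = load_dataset("json", data_files="prefs.jsonl", split="train")

    dpo_trainer = DPOTrainer(
        model=model,
        ref_model=None,  # with LoRA adapters, TRL derives the frozen reference model
        beta=0.1,        # KL-penalty strength; common default, not a tuned value
        train_dataset=pref_dataset,
        tokenizer=tokenizer,
        args=TrainingArguments(
            output_dir="dpo-out",
            per_device_train_batch_size=1,
            learning_rate=5e-6,
        ),
    )
    dpo_trainer.train()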

Brain dump (WIP)

A Beginner’s Guide to Fine-Tuning Mistral 7B Instruct Model

The fixed notebook is Mistral_7B_qLora_Finetuning.ipynb, but the prompt formatting is still in doubt (see the format sketch after the list below).

  • Article: https://adithyask.medium.com/a-beginners-guide-to-fine-tuning-mistral-7b-instruct-model-0f39647b20fe

  • Source: https://github.com/adithya-s-k/CompanionLLM

  • ⚠️ This notebook needs pad_token_id=2 added to the merged_model.generate() call in "Test the merged model":

    outputs = merged_model.generate(input_ids=input_ids, pad_token_id=2,
                                    max_new_tokens=100, do_sample=True,
                                    top_p=0.9, temperature=0.5)
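
For context, 2 is the eos_token_id in Mistral's tokenizer, so passing it as pad_token_id reuses EOS for padding and silences the missing-pad-token warning during generation. A fuller version of the test call might look like this (the prompt string is illustrative; merged_model and tokenizer come from the notebook):

    prompt = "[INST] Tell me about yourself. [/INST]"  # illustrative prompt
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(merged_model.device)
    outputs = merged_model.generate(
        input_ids=input_ids,
        pad_token_id=2,  # Mistral's eos_token_id doubles as the pad id
        max_new_tokens=100,
        do_sample=True,
        top_p=0.9,
        temperature=0.5,
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))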
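
On the prompt-formatting doubt: Mistral 7B Instruct expects user turns wrapped as <s>[INST] ... [/INST], which the tokenizer's chat template produces automatically. A quick way to check what the notebook should be feeding the model (model id assumed):

    # Print the exact string the chat template builds for one user turn.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
    messages = [{"role": "user", "content": "Hello, who are you?"}]
    print(tokenizer.apply_chat_template(messages, tokenize=False))
    # Expected, roughly: "<s>[INST] Hello, who are you? [/INST]"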