LLM Fine-Tuning

Supervised Fine-tuning (SFT) with Unsloth (Recommended; see the SFT sketch below)

Direct Preference Optimization (DPO) (Not Recommended; see the DPO sketch below)
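
A minimal sketch of the Unsloth SFT path, assuming a 4-bit Mistral base and a dataset whose "text" column already holds fully formatted prompts. The model id, data file, and hyperparameters are illustrative placeholders, and dataset_text_field/max_seq_length match older TRL SFTTrainer signatures (newer releases take them via SFTConfig):

    # Minimal SFT sketch with Unsloth + TRL; model id, data file, and
    # hyperparameters are illustrative placeholders, not tested settings.
    from datasets import load_dataset
    from transformers import TrainingArguments
    from trl import SFTTrainer
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/mistral-7b-instruct-v0.2-bnb-4bit",  # assumed model id
        max_seq_length=2048,
        load_in_4bit=True,  # QLoRA-style 4-bit base weights
    )

    # Attach LoRA adapters; rank/alpha/targets are common defaults, not tuned values.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )

    # Assumes a JSONL file whose "text" column holds the formatted prompts.
    dataset = load_dataset("json", data_files="train.jsonl", split="train")

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        dataset_text_field="text",
        max_seq_length=2048,
        args=TrainingArguments(
            output_dir="sft-out",
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,
            num_train_epochs=1,
            learning_rate=2e-4,
            fp16=True,
        ),
    )
    trainer.train()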
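
For comparison, a DPO sketch against TRL's classic DPOTrainer signature (newer releases move beta into DPOConfig). It reuses model/tokenizer from the SFT sketch above; the preference file name is a placeholder:

    # Minimal DPO sketch with TRL; assumes `model` and `tokenizer` from the
    # SFT sketch above and a preference dataset.
    from datasets import load_dataset
    from transformers import TrainingArguments
    from trl import DPOTrainer

    # Assumes JSONL rows with "prompt", "chosen", and "rejected" fields.
    pref_dataset = load_dataset("json", data_files="prefs.jsonl", split="train")

    dpo_trainer = DPOTrainer(
        model=model,
        ref_model=None,  # with LoRA adapters, TRL derives the frozen reference model
        beta=0.1,        # KL-penalty strength; common default, not a tuned value
        train_dataset=pref_dataset,
        tokenizer=tokenizer,
        args=TrainingArguments(
            output_dir="dpo-out",
            per_device_train_batch_size=1,
            learning_rate=5e-6,
        ),
    )
    dpo_trainer.train()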

Brain dump (WIP)

A Beginner’s Guide to Fine-Tuning Mistral 7B Instruct Model

The fixed notebook is Mistral_7B_qLora_Finetuning.ipynb, but the prompt formatting is still in doubt (see the format sketch after the list below).

  • Article: https://adithyask.medium.com/a-beginners-guide-to-fine-tuning-mistral-7b-instruct-model-0f39647b20fe

  • Source: https://github.com/adithya-s-k/CompanionLLM

  • ⚠️ This notebook needs pad_token_id=2 added to the merged_model.generate() call in "Test the merged model":

    outputs = merged_model.generate(input_ids=input_ids, pad_token_id=2,
                                    max_new_tokens=100, do_sample=True,
                                    top_p=0.9, temperature=0.5)
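
For context, 2 is the eos_token_id in Mistral's tokenizer, so passing it as pad_token_id reuses EOS for padding and silences the missing-pad-token warning during generation. A fuller version of the test call might look like this (the prompt string is illustrative; merged_model and tokenizer come from the notebook):

    prompt = "[INST] Tell me about yourself. [/INST]"  # illustrative prompt
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(merged_model.device)
    outputs = merged_model.generate(
        input_ids=input_ids,
        pad_token_id=2,  # Mistral's eos_token_id doubles as the pad id
        max_new_tokens=100,
        do_sample=True,
        top_p=0.9,
        temperature=0.5,
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))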
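
On the prompt-formatting doubt: Mistral 7B Instruct expects user turns wrapped as <s>[INST] ... [/INST], which the tokenizer's chat template produces automatically. A quick way to check what the notebook should be feeding the model (model id assumed):

    # Print the exact string the chat template builds for one user turn.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
    messages = [{"role": "user", "content": "Hello, who are you?"}]
    print(tokenizer.apply_chat_template(messages, tokenize=False))
    # Expected, roughly: "<s>[INST] Hello, who are you? [/INST]"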