🤗 HuggingFace - Candle
- https://github.com/huggingface/candle
- https://github.com/nogibjj/candle-cookbook
A minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use.
Examples
Examples include in-browser (WASM) demos and command-line models:
- yolo: pose estimation and object recognition.
- whisper: speech to text (speech recognition).
- LLaMA2: text generation.
- LLaMA and LLaMA-v2: general LLM.
- Falcon: general LLM.
- StarCoder: LLM specialized to code generation.
- Quantized LLaMA: quantized version of the LLaMA model using the same quantization techniques as llama.cpp.
- Stable Diffusion: text to image generative model, support for the 1.5, 2.1, and SDXL 1.0 versions.
- segment-anything: image segmentation model with prompt.
- Whisper: speech recognition model.
- Bert: useful for sentence embeddings.
- DINOv2: computer vision model trained with self-supervision (usable for ImageNet classification, depth estimation, and segmentation).
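Beyond the prebuilt examples, candle's core tensor API is compact. A minimal sketch of creating two tensors and multiplying them on the CPU, mirroring the crate's getting-started snippet (assumes `candle-core` has been added as a dependency with `cargo add candle-core`):

```rust
use candle_core::{Device, Tensor};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Run on the CPU; with the `cuda` feature enabled, Device::new_cuda(0)
    // would target the GPU instead.
    let device = Device::Cpu;

    // Two random tensors: 2x3 and 3x4.
    let a = Tensor::randn(0f32, 1., (2, 3), &device)?;
    let b = Tensor::randn(0f32, 1., (3, 4), &device)?;

    // Matrix multiply -> 2x4 result.
    let c = a.matmul(&b)?;
    println!("{c}");
    Ok(())
}
```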
Setup
llama2-c on Windows 11 WSL2 Ubuntu, RTX4090
# Choose example llama2-c
cd candle-wasm-examples/llama2-c
# Setup raw
wget https://huggingface.co/spaces/lmz/candle-llama2/resolve/main/model.bin
wget https://huggingface.co/spaces/lmz/candle-llama2/resolve/main/tokenizer.json
# Setup tools
cargo install --locked trunk
cargo install --locked wasm-bindgen-cli
sudo apt install libssl-dev
# Build
. ./build-lib.sh
# Serve
trunk serve --release --port 8081
open http://127.0.0.1:8081
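Before building, it can be worth confirming that the two files fetched by `wget` above actually exist and are non-empty; a truncated download is a common cause of a silently broken demo. A small helper sketch (`check_assets` is hypothetical, not part of candle):

```shell
# Sanity-check that downloaded assets exist and are non-empty.
check_assets() {
  for f in "$@"; do
    if [ -s "$f" ]; then
      echo "ok: $f"
    else
      echo "missing or empty: $f"
    fi
  done
}

check_assets model.bin tokenizer.json
```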
Mistral on Windows 11 WSL2 Ubuntu, RTX4090
# Update & upgrade
sudo apt update && sudo apt upgrade
# Remove previous NVIDIA installation
sudo apt autoremove --purge 'nvidia*'  # quote the glob so the shell does not expand it against local files
# Setup cuda ref: https://gist.github.com/denguir/b21aa66ae7fb1089655dd9de8351a202
sudo wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo wget https://developer.download.nvidia.com/compute/cuda/12.4.0/local_installers/cuda-repo-wsl-ubuntu-12-4-local_12.4.0-1_amd64.deb
sudo dpkg -i cuda-repo-wsl-ubuntu-12-4-local_12.4.0-1_amd64.deb
sudo cp /var/cuda-repo-wsl-ubuntu-12-4-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-4
sudo apt-get -y install cuda-nvcc-12-4
# Note: on WSL2 the GPU driver is supplied by the Windows host. Do not install
# Linux NVIDIA drivers (e.g. ubuntu-drivers autoinstall, nvidia-driver-*) or the
# distro's nvidia-cuda-toolkit package here; they can conflict with the CUDA
# 12.4 toolkit installed above.
# Check NVIDIA Drivers
nvidia-smi
# Check CUDA
nvcc --version
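If a script needs just the CUDA release number rather than the full `nvcc --version` banner, it can be extracted with a small sed filter; `cuda_release` here is a hypothetical helper, and the sample line it matches looks like `Cuda compilation tools, release 12.4, V12.4.131`:

```shell
# Print only the CUDA release number from `nvcc --version` output.
cuda_release() {
  sed -n 's/^Cuda compilation tools, release \([0-9.]*\),.*/\1/p'
}

# Guarded so the snippet degrades gracefully when nvcc is not installed.
command -v nvcc >/dev/null && nvcc --version | cuda_release || echo "nvcc not on PATH"
```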
# Setup cuDNN
sudo apt install libcudnn8
sudo apt install libcudnn8-dev
# Check cuDNN
/sbin/ldconfig -N -v $(sed 's/:/ /' <<< $LD_LIBRARY_PATH) 2>/dev/null | grep libcudnn
# Source
echo 'export CUDA_HOME=/usr/local/cuda-12.4' >> ~/.bashrc
echo 'export PATH=/usr/local/cuda-12.4/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-12.4/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
sudo ln -fs /usr/local/cuda-12.4/bin/nvcc /usr/bin/nvcc
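After sourcing `~/.bashrc`, the exported variables can be verified from the current shell before moving on; `check_cuda_env` is a hypothetical helper, not part of any toolkit:

```shell
# Confirm the CUDA environment set in ~/.bashrc is active in this shell.
check_cuda_env() {
  if [ -n "${CUDA_HOME:-}" ]; then
    echo "CUDA_HOME=$CUDA_HOME"
  else
    echo "CUDA_HOME is unset"
  fi
  case ":$PATH:" in
    *:"${CUDA_HOME:-/usr/local/cuda-12.4}/bin":*) echo "PATH includes CUDA bin" ;;
    *) echo "PATH is missing CUDA bin" ;;
  esac
}

check_cuda_env
```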
# Setup raw
wget https://huggingface.co/lmz/candle-mistral/resolve/main/pytorch_model-00001-of-00002.safetensors
wget https://huggingface.co/lmz/candle-mistral/resolve/main/pytorch_model-00002-of-00002.safetensors
wget https://huggingface.co/lmz/candle-mistral/resolve/main/tokenizer.json
# Or macOS
curl -LO https://huggingface.co/lmz/candle-mistral/resolve/main/pytorch_model-00001-of-00002.safetensors
curl -LO https://huggingface.co/lmz/candle-mistral/resolve/main/pytorch_model-00002-of-00002.safetensors
curl -LO https://huggingface.co/lmz/candle-mistral/resolve/main/tokenizer.json
# Run
cargo run --example mistral --release --features cuda,cudnn -- --prompt "Write helloworld code in Rust" --weight-files=pytorch_model-00001-of-00002.safetensors,pytorch_model-00002-of-00002.safetensors --tokenizer-file=tokenizer.json --sample-len 150
Whisper v3
cargo run --example whisper --release -- --model=large-v3