Install llama.cpp, run GGUF models with llama-cli, and serve OpenAI-compatible APIs with llama-server. Covers key flags, examples, and tuning tips, plus a short command cheatsheet.
#Cheatsheet #GGUF #AI #LLM #DevOps #OpenAI #API #SelfHosting #CUDA #Prometheus #llama.cpp