Show HN: Made a batching LLM API for a project. Mistral 200 tk/s on RTX 3090 https://ift.tt/W3NAIRx
I was running into an issue with a vLLM bug that affected multi...
Manish Pethev - December 27, 2023