Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for accelerating ML workloads.
Operating LLMs in production
A high-throughput and memory-efficient inference and serving engine for LLMs
SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
A high-performance serving framework for ML models, offering dynamic batching and multi-stage pipelines to fully utilize your compute resources
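Dynamic batching, mentioned above, groups requests that arrive close together into a single model call to raise throughput. A stdlib-only sketch of the idea (function and parameter names are illustrative, not this framework's API):

```python
import queue

def dynamic_batcher(requests, max_batch=4, timeout=0.01):
    """Drain a queue into batches: take whatever is available up to
    max_batch items, waiting at most `timeout` seconds for stragglers."""
    q = queue.Queue()
    for r in requests:
        q.put(r)
    batches = []
    while not q.empty():
        batch = [q.get()]            # block only for the first item
        while len(batch) < max_batch:
            try:
                batch.append(q.get(timeout=timeout))
            except queue.Empty:      # no straggler arrived in time
                break
        batches.append(batch)        # one model call per batch
    return batches

print(dynamic_batcher(list(range(10)), max_batch=4))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Real serving engines do this continuously on a background thread and cap batch size by memory and latency budgets; the trade-off is a small added wait per request in exchange for much higher GPU utilization.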
Ray Aviary - evaluate multiple LLMs easily
A suite of hands-on training materials showing how to scale CV, NLP, and time-series forecasting workloads with Ray.
Deploy and Scale LLM-based applications
Ray and Anyscale for UC Berkeley AI Hackathon!
Hinglish Chatbot powered by Azure Cognitive Services, Google Translate, and OpenAI