Running large language models like OPT-175B/GPT-3 on a single GPU. Up to 100x faster than other offloading systems.
Language: Python · Stars: 1.5k · Forks: 64