Model Serving Examples

Natural Language Processing (NLP)

LLaMA model deployment with TorchServe

LLaMA is a large language model released by Meta AI for research purposes. This example demonstrates how to deploy the LLaMA model with TorchServe on Kubeflow on vSphere to provide a prediction service.

Specifically, this example covers the following use case:

  • Large Language Model deployment with TorchServe

To get more details on this example, visit its page.
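
As a rough illustration of what the resulting prediction service looks like from a client, the sketch below builds a request against TorchServe's inference API, which serves predictions at `/predictions/{model_name}`. The host `localhost:8080`, the model name `llama`, and the plain-text prompt body are assumptions made for illustration; the actual address, registered model name, and input format depend on how the example's handler and cluster are configured.

```python
import urllib.request


def build_torchserve_request(host, model_name, prompt):
    """Build (but do not send) a POST request for TorchServe's inference API.

    host and model_name are placeholders; sending the prompt as plain text
    is an assumption about the model handler, not a documented contract.
    """
    url = f"http://{host}/predictions/{model_name}"
    return urllib.request.Request(
        url,
        data=prompt.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical values; replace with the real endpoint and model name.
    req = build_torchserve_request("localhost:8080", "llama", "What is Kubeflow?")
    print(req.get_method(), req.full_url)
    # print(send(req))  # uncomment once a TorchServe endpoint is reachable
```

Separating request construction from sending makes it possible to inspect the URL and payload before any endpoint is reachable.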

BLOOM model deployment with KServe

BLOOM is a large language model released by BigScience. It was trained on 46 human languages and 13 programming languages. This example demonstrates how to deploy the BLOOM model with KServe on Kubeflow on vSphere to provide a prediction service.

Specifically, this example covers the following use case:

  • Large Language Model deployment with KServe

To get more details on this example, visit this page.
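
For orientation, the sketch below shows how a client might form a prediction request using the KServe v1 inference protocol, which accepts a JSON body of the form `{"instances": [...]}` at `/v1/models/{model_name}:predict`. The ingress address, service hostname, model name `bloom`, and the use of a bare prompt string as an instance are all assumptions for illustration; whether the deployed predictor uses the v1 protocol and this input shape depends on the example's configuration. Authentication, if the cluster requires it, is not shown.

```python
import json
import urllib.request


def build_kserve_v1_request(ingress, service_hostname, model_name, instances):
    """Build (but do not send) a KServe v1 protocol predict request.

    ingress, service_hostname, and model_name are placeholders. The Host
    header is set so that the ingress gateway can route the request to
    the intended inference service.
    """
    url = f"http://{ingress}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Host": service_hostname,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send(request, timeout=60):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical values; replace with the real ingress and service hostname.
    req = build_kserve_v1_request(
        "10.0.0.1:80", "bloom.example.com", "bloom", ["Hello, my name is"]
    )
    print(req.get_method(), req.full_url)
    # print(send(req))  # uncomment once the inference service is reachable
```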