hf_model

Store a HuggingFace model to remote storage. This function downloads a model from the HuggingFace Hub and prefetches it to remote storage, supporting optional sharding using vLLM for large models.

    def hf_model(
        repo: str,
        raw_data_path: str | None = None,
        artifact_name: str | None = None,
        architecture: str | None = None,
        task: str = "auto",
        modality: tuple[str, ...] = ("text",),
        serial_format: str | None = None,
        model_type: str | None = None,
        short_description: str | None = None,
        shard_config: ShardConfig | None = None,
        hf_token_key: str = "HF_TOKEN",
        resources: Resources = Resources(cpu="2", memory="8Gi", disk="50Gi"),
        force: int = 0
    ) -> Run

Parameters

| Name | Type | Description |
|---|---|---|
| `repo` | `str` | The HuggingFace repository ID (e.g., 'meta-llama/Llama-2-7b-hf'). |
| `raw_data_path` | `str \| None` = None | |
| `artifact_name` | `str \| None` = None | |
| `architecture` | `str \| None` = None | |
| `task` | `str` = "auto" | Model task (e.g., 'generate', 'classify', 'embed'). |
| `modality` | `tuple[str, ...]` = ("text",) | Modalities supported by the model. |
| `serial_format` | `str \| None` = None | |
| `model_type` | `str \| None` = None | |
| `short_description` | `str \| None` = None | |
| `shard_config` | `ShardConfig \| None` = None | |
| `hf_token_key` | `str` = "HF_TOKEN" | Name of the secret containing the HuggingFace token. |
| `resources` | `Resources` = Resources(cpu="2", memory="8Gi", disk="50Gi") | Compute resources (CPU, memory, disk) for the prefetch task. |
| `force` | `int` = 0 | Force re-prefetch. Increment to force a new prefetch. |

Returns

| Type | Description |
|---|---|
| `Run` | A Run object representing the prefetch task execution. |
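
Example

The call shape can be illustrated with a minimal sketch. This page does not show the import path for `hf_model`, `Resources`, or `Run`, so the three definitions below are hypothetical local stand-ins that mirror only part of the documented signature. The stub records its arguments instead of launching a prefetch, so the snippet runs without the SDK installed. With the real SDK you would import the real names and drop the stand-ins.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


# Hypothetical stand-in for the SDK's Resources type (assumption, not the real class).
@dataclass(frozen=True)
class Resources:
    cpu: str = "2"
    memory: str = "8Gi"
    disk: str = "50Gi"


# Hypothetical stand-in for the SDK's Run type; here it only holds the call arguments.
@dataclass
class Run:
    params: Dict[str, Any]


def hf_model(
    repo: str,
    task: str = "auto",
    modality: Tuple[str, ...] = ("text",),
    hf_token_key: str = "HF_TOKEN",
    resources: Resources = Resources(),
    force: int = 0,
) -> Run:
    """Stand-in stub: records the arguments rather than starting a prefetch task."""
    return Run(
        params={
            "repo": repo,
            "task": task,
            "modality": modality,
            "hf_token_key": hf_token_key,
            "resources": resources,
            "force": force,
        }
    )


# Prefetch a text-generation model with more disk than the default,
# and bump `force` to request a fresh prefetch.
run = hf_model(
    repo="meta-llama/Llama-2-7b-hf",
    task="generate",
    resources=Resources(cpu="4", memory="16Gi", disk="100Gi"),
    force=1,
)
print(run.params["repo"], run.params["resources"].disk)
```

Parameters left out of the call, such as `modality` and `hf_token_key`, keep the defaults listed in the table above.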