hf_model

Store a HuggingFace model in remote storage. The function downloads the model from the HuggingFace Hub and writes it to remote storage, optionally sharding large models with vLLM.

def hf_model(
    repo: str,
    raw_data_path: str | None = None,
    artifact_name: str | None = None,
    architecture: str | None = None,
    task: str = "auto",
    modality: tuple[str, ...] = ("text",),
    serial_format: str | None = None,
    model_type: str | None = None,
    short_description: str | None = None,
    shard_config: ShardConfig | None = None,
    hf_token_key: str = "HF_TOKEN",
    resources: Resources = Resources(cpu="2", memory="8Gi", disk="50Gi"),
    force: int = 0
) -> Run

Parameters

Name	Type	Description
repo	`str`	The HuggingFace repository ID (e.g., 'meta-llama/Llama-2-7b-hf').
raw_data_path	`str | None` = None
artifact_name	`str | None` = None
architecture	`str | None` = None
task	`str` = "auto"	Model task (e.g., 'generate', 'classify', 'embed').
modality	`tuple[str, ...]` = ("text",)	Modalities supported by the model.
serial_format	`str | None` = None
model_type	`str | None` = None
short_description	`str | None` = None
shard_config	`ShardConfig | None` = None
hf_token_key	`str` = "HF_TOKEN"	Name of the secret containing the HuggingFace token.
resources	`Resources` = Resources(cpu="2", memory="8Gi", disk="50Gi")	Compute resources (CPU, memory, disk) requested for the prefetch task.
force	`int` = 0	Force re-prefetch. Increment to force a new prefetch.

Returns

Type	Description
`Run`	A Run object representing the prefetch task execution.
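
The `force` parameter is an integer that is incremented, not a boolean that is toggled. One plausible reading is that the value takes part in a cache key, so a new value yields a new key and therefore a new prefetch. This page does not describe the caching mechanism, so the sketch below is only an illustration under that assumption. `prefetch_cache_key` is a hypothetical helper and not part of the library.

```python
from __future__ import annotations

import hashlib
import json


def prefetch_cache_key(repo: str, shard_config: dict | None = None, force: int = 0) -> str:
    """Hypothetical cache key: a digest of the inputs that define a prefetch."""
    # Serialize the inputs deterministically so equal inputs give equal keys.
    payload = json.dumps(
        {"repo": repo, "shard_config": shard_config, "force": force},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


repo = "meta-llama/Llama-2-7b-hf"
first = prefetch_cache_key(repo, force=0)
repeat = prefetch_cache_key(repo, force=0)
bumped = prefetch_cache_key(repo, force=1)

# Same inputs give the same key, so a stored model would be reused.
print("unchanged force reuses key:", first == repeat)
# Incrementing force changes the key, so a fresh prefetch would run.
print("incremented force changes key:", first != bumped)
```

Under this reading, calling `hf_model` again with the same `repo` and the same `force` would reuse the earlier result, while `force=1`, then `force=2`, and so on would each trigger a new download.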
