hf_model
Store a HuggingFace model to remote storage. This function downloads a model from the HuggingFace Hub and prefetches it to remote storage, supporting optional sharding using vLLM for large models.
def hf_model(
repo: str,
raw_data_path: str | None = None,
artifact_name: str | None = None,
architecture: str | None = None,
task: str = "auto",
modality: tuple[str, ...] = ("text",),
serial_format: str | None = None,
model_type: str | None = None,
short_description: str | None = None,
shard_config: ShardConfig | None = None,
hf_token_key: str = "HF_TOKEN",
resources: Resources = Resources(cpu="2", memory="8Gi", disk="50Gi"),
force: int = 0
) -> Run
Parameters
| Name | Type | Description |
|---|---|---|
| repo | str | The HuggingFace repository ID (e.g., 'meta-llama/Llama-2-7b-hf'). |
| raw_data_path | `str \| None` = None | |
| artifact_name | `str \| None` = None | |
| architecture | `str \| None` = None | |
| task | str = "auto" | Model task (e.g., 'generate', 'classify', 'embed'). |
| modality | tuple[str, ...] = ("text",) | Modalities supported by the model. |
| serial_format | `str \| None` = None | |
| model_type | `str \| None` = None | |
| short_description | `str \| None` = None | |
| shard_config | `ShardConfig \| None` = None | Optional configuration for sharding the model with vLLM. |
| hf_token_key | str = "HF_TOKEN" | Name of the secret containing the HuggingFace token. |
| resources | Resources = Resources(cpu="2", memory="8Gi", disk="50Gi") | Compute resources (CPU, memory, disk) requested for the prefetch task. |
| force | int = 0 | Force re-prefetch. Increment to force a new prefetch. |
Returns
| Type | Description |
|---|---|
| Run | A Run object representing the prefetch task execution. |
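
A call using the documented parameters might look like the sketch below. This section does not state the import path for `hf_model`, so the function is passed in as an argument instead of being imported. The wrapper name `prefetch_llama` is hypothetical and exists only for illustration.

```python
from typing import Any, Callable


def prefetch_llama(hf_model: Callable[..., Any], force: int = 0) -> Any:
    """Call an hf_model-style function with arguments from the table above.

    `hf_model` is injected by the caller because its import path is not
    given in this section; `prefetch_llama` itself is a hypothetical helper.
    """
    return hf_model(
        repo="meta-llama/Llama-2-7b-hf",  # repository ID from the parameter table
        task="generate",                  # one of the example task values
        modality=("text",),               # documented default
        hf_token_key="HF_TOKEN",          # name of the secret holding the token
        force=force,                      # increment to force a new prefetch
    )
```

Passing a larger `force` value than on the previous call (for example `force=1` after an earlier `force=0`) requests a fresh prefetch, per the parameter description. The returned value is whatever `hf_model` returns, documented above as a `Run`.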