VLLMShardArgs

Arguments for sharding a model using vLLM.

Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| tensor_parallel_size | int = 1 | Number of tensor parallel workers. |
| dtype | str = "auto" | Data type for model weights. |
| trust_remote_code | bool = True | Whether to trust remote code from HuggingFace. |
| max_model_len | `int \| None` = None | |
| file_pattern | `str \| None` = DEFAULT_SHARD_PATTERN | |
| max_file_size | int = 5 * 1024**3 | Maximum size for each sharded file. |
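
The `max_file_size` default is written as an expression rather than a number. The plain-Python sketch below, which does not use the library, evaluates it under the assumption that the value is measured in bytes. The `estimate_min_shards` helper is hypothetical and gives only a lower bound on the file count; a real sharder may produce more files, since individual tensors do not pack perfectly into the limit.

```python
# Default from the attribute table: 5 * 1024**3 (assumed to be bytes).
MAX_FILE_SIZE = 5 * 1024**3


def estimate_min_shards(total_bytes: int, max_file_size: int = MAX_FILE_SIZE) -> int:
    """Hypothetical helper: fewest files that could hold total_bytes."""
    if total_bytes < 0:
        raise ValueError("total_bytes must be non-negative")
    if max_file_size <= 0:
        raise ValueError("max_file_size must be positive")
    # Integer ceiling division avoids float rounding on large sizes.
    return -(-total_bytes // max_file_size)


print(MAX_FILE_SIZE)             # 5368709120
print(MAX_FILE_SIZE // 1024**3)  # 5 (GiB)
print(estimate_min_shards(13 * 1024**3))  # 3
```

Under these assumptions, 13 GiB of weights would need at least three files with the default limit.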

Methods


get_vllm_args()

@classmethod
def get_vllm_args(
    model_path: str
) -> dict[str, Any]

Get arguments dict for vLLM LLM constructor.

Parameters

| Name | Type | Description |
| --- | --- | --- |
| model_path | str | The path to the model to be loaded by vLLM. |

Returns

| Type | Description |
| --- | --- |
| dict[str, Any] | The dictionary of arguments suitable for the vLLM LLM constructor. |
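
As a rough illustration of what the returned dictionary might look like, the sketch below uses a hypothetical stand-in class, `ShardArgsSketch`, that mirrors some of the documented attributes. It is not the library's implementation. The keys it emits are an assumption, chosen because `model`, `tensor_parallel_size`, `dtype`, and `trust_remote_code` are keyword arguments that vLLM's `LLM` constructor accepts. Dropping fields whose default is `None` is likewise an assumption.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class ShardArgsSketch:
    """Hypothetical stand-in; NOT the real VLLMShardArgs."""

    tensor_parallel_size: int = 1
    dtype: str = "auto"
    trust_remote_code: bool = True
    max_model_len: Optional[int] = None

    @classmethod
    def get_vllm_args(cls, model_path: str) -> dict[str, Any]:
        # Assumption: start with the model path, then add class defaults.
        args: dict[str, Any] = {"model": model_path}
        for field in fields(cls):
            # Assumption: skip unset (None) values so vLLM applies its own default.
            if field.default is not None:
                args[field.name] = field.default
        return args


# "/models/example" is a placeholder path used only for illustration.
args = ShardArgsSketch.get_vllm_args("/models/example")
print(args)

# With vLLM installed, a dictionary like this could be unpacked into the constructor:
# from vllm import LLM
# llm = LLM(**args)
```

Returning a plain dictionary keeps the constructor call to a single `LLM(**args)` and makes the arguments easy to inspect or log beforehand.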