A generic file class representing a file with a specified format. It provides both async and sync interfaces for file operations; methods without the `_sync` suffix are async. Instantiate the class through one of its class methods, and use the constructor only to create references to existing remote objects. The generic type T represents the format of the file.
Attributes
| Attribute | Type | Description |
|---|---|---|
| path | str | The path to the file (can be local or remote) |
| name | Optional[str] = None | Optional name for the file (defaults to basename of path) |
| format | str = "" | The format of the file. |
| hash | Optional[str] = None | The hash value of the file, used for cache key computation. |
| hash_method | Optional[HashMethod] = None | The method used to compute the hash of the file, which can be a HashMethod object or a string representing a precomputed cache key. |
Methods
pre_init()
@classmethod
def pre_init(
data: Any
)
Internal: Pydantic validator to set default name from path. Not intended for direct use.
Parameters
| Name | Type | Description |
|---|---|---|
| data | Any | The data dictionary being initialized. |
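The defaulting rule described above (name falls back to the basename of path) can be illustrated with a small standard-library sketch. `FileRef` is a hypothetical stand-in used only for illustration; it is not the library's class and does not use Pydantic.

```python
import posixpath
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileRef:
    """Hypothetical stand-in for File; it only models the name default."""

    path: str
    name: Optional[str] = None

    def __post_init__(self) -> None:
        # Fall back to the last path component when no name is given.
        if self.name is None:
            self.name = posixpath.basename(self.path)


ref = FileRef(path="s3://bucket/reports/data.csv")
print(ref.name)
```

An explicitly supplied name is left untouched, so `FileRef(path="/tmp/a.txt", name="custom")` keeps `custom`.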
lazy_uploader()
@classmethod
def lazy_uploader() -> Callable[[], Coroutine[Any, Any, tuple[str | None, str]]] | None
Returns the lazy uploader callable, which is used to upload local files to remote storage when in remote mode.
Returns
| Type | Description |
|---|---|
| `Callable[[], Coroutine[Any, Any, tuple[str \| None, str]]] \| None` | The lazy uploader callable, or None if no uploader is set. |
lazy_uploader()
@classmethod
def lazy_uploader(
lazy_uploader: Callable[[], Coroutine[Any, Any, tuple[str | None, str]]] | None
)
Sets the lazy uploader callable, which is used to upload local files to remote storage when in remote mode.
Parameters
| Name | Type | Description |
|---|---|---|
| lazy_uploader | `Callable[[], Coroutine[Any, Any, tuple[str \| None, str]]] \| None` | The callable used to upload local files to remote storage when in remote mode. |
schema_match()
@classmethod
def schema_match(
incoming: dict
)
Internal: Check if incoming schema matches File schema. Not intended for direct use.
Parameters
| Name | Type | Description |
|---|---|---|
| incoming | dict | The incoming schema dictionary to compare against the File schema. |
new_remote()
@classmethod
def new_remote(
file_name: Optional[str] = None,
hash_method: Optional[HashMethod | str] = None
) -> [File](file.md?sid=flyte_io__file_file)[T]
Create a new File reference for a remote file that will be written to. Use this when you want to create a new file and write to it directly without creating a local file first.
Parameters
| Name | Type | Description |
|---|---|---|
| file_name | Optional[str] = None | Optional string specifying a remote file name. If not set, a generated file name will be returned. |
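| hash_method | `Optional[HashMethod \| str]` = None | The method used to compute the hash of the file: either a HashMethod object or a string representing a precomputed cache key. |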
Returns
| Type | Description |
|---|---|
| [File](file.md?sid=flyte_io__file_file)[T] | A new File instance with a generated remote path |
named_remote()
@classmethod
def named_remote(
name: str
) -> [File](file.md?sid=flyte_io__file_file)[T]
Create a File reference whose remote path is derived deterministically from name. Unlike new_remote, which generates a random path on every call, this method produces the same path for the same name within a given task execution. This makes it safe across retries: the first attempt uploads to the path and subsequent retries resolve to the identical location without re-uploading.
Parameters
| Name | Type | Description |
|---|---|---|
| name | str | Plain filename (e.g., "data.csv"). Must not contain path separators. |
Returns
| Type | Description |
|---|---|
| [File](file.md?sid=flyte_io__file_file)[T] | A File instance whose path is stable across retries. |
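The difference between a random path and a deterministic one can be sketched with the standard library. Everything here is an assumption made for illustration: the `prefix` and `execution_id` parameters, the helper names, and the hashing scheme are hypothetical and do not describe how the library actually derives paths.

```python
import hashlib
import uuid


def random_remote_path(prefix: str) -> str:
    # Models new_remote-style behavior: a fresh path on every call.
    return f"{prefix}/{uuid.uuid4().hex}"


def stable_remote_path(prefix: str, execution_id: str, name: str) -> str:
    # Reject anything that is not a plain filename.
    if "/" in name or "\\" in name:
        raise ValueError("name must not contain path separators")
    # The same (execution_id, name) pair always hashes to the same directory.
    digest = hashlib.sha256(f"{execution_id}:{name}".encode()).hexdigest()[:16]
    return f"{prefix}/{digest}/{name}"


first_attempt = stable_remote_path("s3://bucket", "run-1", "data.csv")
retry_attempt = stable_remote_path("s3://bucket", "run-1", "data.csv")
print(first_attempt == retry_attempt)
```

Because the retry resolves to the same location as the first attempt, a retried task can find the earlier upload instead of writing a second copy.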
from_existing_remote()
@classmethod
def from_existing_remote(
remote_path: str,
file_cache_key: Optional[str] = None
) -> [File](file.md?sid=flyte_io__file_file)[T]
Create a File reference from an existing remote file. Use this when you want to reference a file that already exists in remote storage without uploading it.
Parameters
| Name | Type | Description |
|---|---|---|
| remote_path | str | The remote path to the existing file |
| file_cache_key | Optional[str] = None | Optional hash value to use for cache key computation. If not specified, the cache key will be computed based on the file's attributes (path, name, format). |
Returns
| Type | Description |
|---|---|
| [File](file.md?sid=flyte_io__file_file)[T] | A new File instance pointing to the existing remote file |
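The fallback described for file_cache_key (an explicit key wins, otherwise the key comes from path, name, and format) might look like the following sketch. The `cache_key` helper and the SHA-256 scheme are assumptions for illustration only; the library's real computation may differ.

```python
import hashlib
from typing import Optional


def cache_key(
    path: str,
    name: str,
    format: str = "",
    file_cache_key: Optional[str] = None,
) -> str:
    # An explicit key wins, so callers control cache invalidation.
    if file_cache_key is not None:
        return file_cache_key
    # Otherwise derive a key from the reference's attributes.
    payload = "\x00".join((path, name, format))
    return hashlib.sha256(payload.encode()).hexdigest()


explicit = cache_key("s3://bucket/data.csv", "data.csv", file_cache_key="v1")
derived = cache_key("s3://bucket/data.csv", "data.csv")
print(explicit, derived)
```

One consequence of the attribute-based fallback is that overwriting the remote file in place does not change the key; passing a content-derived file_cache_key avoids that.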
open()
@classmethod
def open(
mode: str = "rb",
block_size: Optional[int] = None,
cache_type: str = "readahead",
cache_options: Optional[dict] = None,
compression: Optional[str] = None,
**kwargs: Any
) -> AsyncGenerator[Union[AsyncWritableFile, AsyncReadableFile, "HashingWriter"], None]
Asynchronously open the file and return a file-like object. Use this method in async tasks to read from or write to files directly.
Parameters
| Name | Type | Description |
|---|---|---|
| mode | str = "rb" | The mode to open the file in (default: 'rb'). Common modes: 'rb' (read binary), 'wb' (write binary), 'rt' (read text), 'wt' (write text) |
| block_size | Optional[int] = None | Size of blocks for reading in bytes. Useful for streaming large files. |
| cache_type | str = "readahead" | Caching mechanism to use ('readahead', 'mmap', 'bytes', 'none') |
| cache_options | Optional[dict] = None | Dictionary of options for the cache |
| compression | Optional[str] = None | Compression format or None for auto-detection |
| **kwargs | Any | Additional arguments passed to fsspec's open method |
Returns
| Type | Description |
|---|---|
| AsyncGenerator[Union[AsyncWritableFile, AsyncReadableFile, "HashingWriter"], None] | An async file-like object that can be used with async read/write operations |
exists()
@classmethod
def exists() -> bool
Asynchronously check if the file exists.
Returns
| Type | Description |
|---|---|
| bool | True if the file exists, False otherwise |
exists_sync()
@classmethod
def exists_sync() -> bool
Synchronously check if the file exists. Use this in non-async tasks or when you need synchronous file existence checking.
Returns
| Type | Description |
|---|---|
| bool | True if the file exists, False otherwise |
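The async and sync variants follow the same pairing used throughout this class. The sketch below models that pairing with standard-library calls on a local path; the free functions are hypothetical stand-ins, not the library's methods, and they assume Python 3.9 or later for `asyncio.to_thread`.

```python
import asyncio
import os
import tempfile


async def exists(path: str) -> bool:
    # Run the blocking check in a worker thread so the event loop stays free.
    return await asyncio.to_thread(os.path.exists, path)


def exists_sync(path: str) -> bool:
    # Plain blocking call for non-async code.
    return os.path.exists(path)


with tempfile.NamedTemporaryFile() as tmp:
    found_async = asyncio.run(exists(tmp.name))
    found_sync = exists_sync(tmp.name)

# The temporary file is deleted when the with block exits.
found_after_close = exists_sync(tmp.name)
```

Inside an async task the awaitable form avoids blocking other coroutines; in a regular function the `_sync` form avoids having to manage an event loop.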
open_sync()
@classmethod
def open_sync(
mode: str = "rb",
block_size: Optional[int] = None,
cache_type: str = "readahead",
cache_options: Optional[dict] = None,
compression: Optional[str] = None,
**kwargs: Any
) -> Generator[IO[Any], None, None]
Synchronously open the file and return a file-like object. Use this method in non-async tasks to read from or write to files directly.
Parameters
| Name | Type | Description |
|---|---|---|
| mode | str = "rb" | The mode to open the file in (default: 'rb'). Common modes: 'rb' (read binary), 'wb' (write binary), 'rt' (read text), 'wt' (write text) |
| block_size | Optional[int] = None | Size of blocks for reading in bytes. Useful for streaming large files. |
| cache_type | str = "readahead" | Caching mechanism to use ('readahead', 'mmap', 'bytes', 'none') |
| cache_options | Optional[dict] = None | Dictionary of options for the cache |
| compression | Optional[str] = None | Compression format or None for auto-detection |
| **kwargs | Any | Additional arguments passed to fsspec's open method |
Returns
| Type | Description |
|---|---|
| Generator[IO[Any], None, None] | A file-like object that can be used with standard read/write operations |
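Once a file-like object is open, large files are usually consumed in fixed-size chunks rather than in one read. The sketch below shows that pattern on an in-memory stream; `read_in_blocks` is a hypothetical helper, and its `block_size` argument only mirrors the idea behind the parameter above without reproducing fsspec's internal buffering.

```python
import io
from typing import BinaryIO, Iterator


def read_in_blocks(fh: BinaryIO, block_size: int = 4) -> Iterator[bytes]:
    # Yield fixed-size chunks until the stream is exhausted.
    while True:
        chunk = fh.read(block_size)
        if not chunk:
            break
        yield chunk


blocks = list(read_in_blocks(io.BytesIO(b"abcdefghij"), block_size=4))
print(blocks)  # [b'abcd', b'efgh', b'ij']
```

Memory use stays bounded by the chunk size, which is the main reason to stream instead of calling `read()` once.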
download()
@classmethod
def download(
local_path: Optional[Union[str, Path]] = None
) -> str
Asynchronously download the file to a local path. Use this when you need to download a remote file to your local filesystem for processing.
Parameters
| Name | Type | Description |
|---|---|---|
| local_path | Optional[Union[str, Path]] = None | The local path to download the file to. If None, a temporary directory will be used and a path will be generated. |
Returns
| Type | Description |
|---|---|
| str | The absolute path to the downloaded file |
download_sync()
@classmethod
def download_sync(
local_path: Optional[Union[str, Path]] = None
) -> str
Synchronously download the file to a local path. Use this in non-async tasks when you need to download a remote file to your local filesystem.
Parameters
| Name | Type | Description |
|---|---|---|
| local_path | Optional[Union[str, Path]] = None | The local path to download the file to. If None, a temporary directory will be used and a path will be generated. |
Returns
| Type | Description |
|---|---|
| str | The absolute path to the downloaded file |
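The local_path handling described above (use the given target, or generate one in a temporary directory, and return an absolute path) can be sketched as follows. This free function is a hypothetical model: it takes an explicit `source` argument and uses a local copy as a stand-in for the remote transfer.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Optional, Union


def download_sync(source: str, local_path: Optional[Union[str, Path]] = None) -> str:
    # With no target, generate one inside a fresh temporary directory.
    if local_path is None:
        local_path = os.path.join(tempfile.mkdtemp(), os.path.basename(source))
    target = os.path.abspath(local_path)
    # Make sure the destination directory exists before copying.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.copyfile(source, target)  # Stand-in for the remote transfer.
    return target
```

Returning the absolute path means the caller can pass the result straight to other tools regardless of the current working directory.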
from_local_sync()
@classmethod
def from_local_sync(
local_path: Union[str, Path],
remote_destination: Optional[str] = None,
hash_method: Optional[HashMethod | str] = None
) -> [File](file.md?sid=flyte_io__file_file)[T]
Synchronously create a new File object from a local file by uploading it to remote storage. Use this in non-async tasks when you have a local file that needs to be uploaded to remote storage.
Parameters
| Name | Type | Description |
|---|---|---|
| local_path | Union[str, Path] | Path to the local file |
| remote_destination | Optional[str] = None | Optional remote path to store the file. If None, a path will be automatically generated. |
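| hash_method | `Optional[HashMethod \| str]` = None | The method used to compute the hash of the file: either a HashMethod object or a string representing a precomputed cache key. |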
Returns
| Type | Description |
|---|---|
| [File](file.md?sid=flyte_io__file_file)[T] | A new File instance pointing to the uploaded remote file |
from_local()
@classmethod
def from_local(
local_path: Union[str, Path],
remote_destination: Optional[str] = None,
hash_method: Optional[HashMethod | str] = None
) -> [File](file.md?sid=flyte_io__file_file)[T]
Asynchronously create a new File object from a local file by uploading it to remote storage. Use this in async tasks when you have a local file that needs to be uploaded to remote storage.
Parameters
| Name | Type | Description |
|---|---|---|
| local_path | Union[str, Path] | Path to the local file |
| remote_destination | Optional[str] = None | Optional remote path to store the file. If None, a path will be automatically generated. |
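| hash_method | `Optional[HashMethod \| str]` = None | The method used to compute the hash of the file: either a HashMethod object or a string representing a precomputed cache key. |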
Returns
| Type | Description |
|---|---|
| [File](file.md?sid=flyte_io__file_file)[T] | A new File instance pointing to the uploaded remote file |
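The two accepted forms of hash_method (a hashing object or a precomputed string) suggest a simple dispatch, sketched below. The `HashFn` alias is a hypothetical stand-in for a HashMethod object, whose real interface is not described here, and `resolve_hash` is an illustrative helper rather than a library function.

```python
import hashlib
from typing import Callable, Optional, Union

HashFn = Callable[[bytes], str]  # Hypothetical stand-in for a HashMethod object.


def resolve_hash(
    data: bytes, hash_method: Optional[Union[HashFn, str]] = None
) -> Optional[str]:
    if hash_method is None:
        return None  # No content hash requested.
    if isinstance(hash_method, str):
        return hash_method  # Precomputed cache key, used verbatim.
    return hash_method(data)  # Compute the hash from the file contents.


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


precomputed = resolve_hash(b"abc", "v1")
computed = resolve_hash(b"abc", sha256_hex)
print(precomputed, computed)
```

A precomputed string is useful when the caller already knows a version identifier for the data and wants to skip hashing a large file.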