File

A generic file class representing a file with a specified format. It provides both async and sync interfaces for file operations: methods without the `_sync` suffix are async. Instantiate the class through one of its class methods; use the constructor only to create references to existing remote objects. The generic type `T` represents the format of the file.

Attributes

| Attribute | Type | Description |
|---|---|---|
| path | `str` | The path to the file (can be local or remote) |
| name | `Optional[str]` = None | Optional name for the file (defaults to basename of path) |
| format | `str` = "" | The format of the file. |
| hash | `Optional[str]` = None | The hash value of the file, used for cache key computation. |
| hash_method | `Optional[HashMethod]` = None | The method used to compute the hash of the file, which can be a HashMethod object or a string representing a precomputed cache key. |
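
The `name` default in the table above can be pictured with a small stand-in class. This is a minimal sketch built on a standard-library dataclass, not the library's implementation (this reference states that `File` handles the default in a Pydantic validator); `FileSketch` is a hypothetical name used only for illustration.

```python
import posixpath
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileSketch:
    """Hypothetical stand-in that mirrors three of the documented attributes."""

    path: str
    name: Optional[str] = None
    format: str = ""

    def __post_init__(self) -> None:
        # Fall back to the last path component when no name is given.
        if self.name is None:
            self.name = posixpath.basename(self.path)


remote = FileSketch(path="s3://bucket/data/report.csv")
explicit = FileSketch(path="/tmp/report.csv", name="q3-report")
print(remote.name)    # report.csv
print(explicit.name)  # q3-report
```

An explicitly passed `name` is kept as-is; only a missing name is derived from the path.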

Methods


pre_init()

@classmethod
def pre_init(
data: Any
)

Internal: Pydantic validator to set default name from path. Not intended for direct use.

Parameters

| Name | Type | Description |
|---|---|---|
| data | `Any` | The data dictionary being initialized. |

lazy_uploader()

@classmethod
def lazy_uploader() -> Callable[[], Coroutine[Any, Any, tuple[str | None, str]]] | None

Returns the lazy uploader callable, which is used to upload local files to remote storage when in remote mode.

Returns

| Type | Description |
|---|---|
| `Callable[[], Coroutine[Any, Any, tuple[str \| None, str]]] \| None` | |

lazy_uploader()

@classmethod
def lazy_uploader(
lazy_uploader: Callable[[], Coroutine[Any, Any, tuple[str | None, str]]] | None
)

Sets the lazy uploader callable, which is used to upload local files to remote storage when in remote mode.

Parameters

| Name | Type | Description |
|---|---|---|
| lazy_uploader | `Callable[[], Coroutine[Any, Any, tuple[str \| None, str]]] \| None` | |
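
The uploader type is a zero-argument callable that returns a coroutine, so the upload work is deferred until something awaits it. The sketch below shows that shape with the standard library only. The helper `make_lazy_uploader`, the `memory://` path, and the reading of the tuple as (optional hash, remote path) are assumptions for illustration; this reference does not spell out what the two tuple elements mean.

```python
import asyncio
from typing import Any, Callable, Coroutine, Optional

# Equivalent to the documented type, written with Optional for older Pythons.
LazyUploader = Callable[[], Coroutine[Any, Any, tuple[Optional[str], str]]]


def make_lazy_uploader(local_path: str, calls: list[str]) -> LazyUploader:
    """Return a callable that defers a simulated upload until it is awaited."""

    async def upload() -> tuple[Optional[str], str]:
        calls.append(local_path)  # record that the upload actually ran
        await asyncio.sleep(0)    # stand-in for real network I/O
        # Assumed tuple layout: (optional hash, remote path).
        return None, f"memory://uploads/{local_path.lstrip('/')}"

    return upload


calls: list[str] = []
uploader = make_lazy_uploader("/tmp/data.csv", calls)
nothing_uploaded_yet = (calls == [])  # creating the uploader does no work
result = asyncio.run(uploader())      # the upload happens only here
```

Creating the callable is cheap; the side effect occurs only when the returned coroutine is run.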

schema_match()

@classmethod
def schema_match(
incoming: dict
)

Internal: Check if incoming schema matches File schema. Not intended for direct use.

Parameters

| Name | Type | Description |
|---|---|---|
| incoming | `dict` | The incoming schema dictionary to compare against the File schema. |

new_remote()

@classmethod
def new_remote(
file_name: Optional[str] = None,
hash_method: Optional[HashMethod | str] = None
) -> [File](file.md?sid=flyte_io__file_file)[T]

Create a new File reference for a remote file that will be written to. Use this when you want to create a new file and write to it directly without creating a local file first.

Parameters

| Name | Type | Description |
|---|---|---|
| file_name | `Optional[str]` = None | Optional string specifying a remote file name. If not set, a generated file name will be returned. |
| hash_method | `Optional[HashMethod \| str]` = None | |

Returns

| Type | Description |
|---|---|
| [File](file.md?sid=flyte_io__file_file)[T] | A new File instance with a generated remote path |

named_remote()

@classmethod
def named_remote(
name: str
) -> [File](file.md?sid=flyte_io__file_file)[T]

Create a File reference whose remote path is derived deterministically from name. Unlike new_remote, which generates a random path on every call, this method produces the same path for the same name within a given task execution. This makes it safe across retries: the first attempt uploads to the path and subsequent retries resolve to the identical location without re-uploading.

Parameters

| Name | Type | Description |
|---|---|---|
| name | `str` | Plain filename (e.g., "data.csv"). Must not contain path separators. |

Returns

| Type | Description |
|---|---|
| [File](file.md?sid=flyte_io__file_file)[T] | A File instance whose path is stable across retries. |
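
The difference between a random path and a deterministic one can be sketched without the library. The functions below are hypothetical, and hashing an execution identifier together with the name is an assumption: this reference says only that the path is derived deterministically from `name` within a task execution, not how.

```python
import hashlib
import uuid


def random_remote_path(prefix: str, file_name: str) -> str:
    """A fresh path on every call, in the spirit of new_remote."""
    return f"{prefix}/{uuid.uuid4().hex}/{file_name}"


def deterministic_remote_path(prefix: str, execution_id: str, name: str) -> str:
    """The same path for the same (execution_id, name) pair."""
    if "/" in name or "\\" in name:
        raise ValueError("name must be a plain filename without path separators")
    # Assumed derivation: hash the execution id and name into a stable directory.
    digest = hashlib.sha256(f"{execution_id}:{name}".encode("utf-8")).hexdigest()[:16]
    return f"{prefix}/{digest}/{name}"


first_attempt = deterministic_remote_path("s3://bucket", "exec-123", "data.csv")
retry = deterministic_remote_path("s3://bucket", "exec-123", "data.csv")
other_execution = deterministic_remote_path("s3://bucket", "exec-456", "data.csv")
```

Here `first_attempt` and `retry` are identical, which is the property that makes a retry resolve to the location the first attempt wrote, while `other_execution` lands elsewhere.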

from_existing_remote()

@classmethod
def from_existing_remote(
remote_path: str,
file_cache_key: Optional[str] = None
) -> [File](file.md?sid=flyte_io__file_file)[T]

Create a File reference from an existing remote file. Use this when you want to reference a file that already exists in remote storage without uploading it.

Parameters

| Name | Type | Description |
|---|---|---|
| remote_path | `str` | The remote path to the existing file |
| file_cache_key | `Optional[str]` = None | Optional hash value to use for cache key computation. If not specified, the cache key will be computed based on the file's attributes (path, name, format). |

Returns

| Type | Description |
|---|---|
| [File](file.md?sid=flyte_io__file_file)[T] | A new File instance pointing to the existing remote file |
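
The fallback described for `file_cache_key` can be illustrated as follows. This is a sketch under stated assumptions, not the library's algorithm: the function `cache_key` is hypothetical, and the choice of SHA-256 over the joined attributes is made up for the example.

```python
import hashlib
from typing import Optional


def cache_key(path: str, name: str, fmt: str, file_cache_key: Optional[str] = None) -> str:
    """Prefer an explicit key; otherwise derive one from the file's attributes."""
    if file_cache_key is not None:
        return file_cache_key
    # Assumed derivation: hash path, name and format joined with a separator
    # that cannot appear in any of them.
    payload = "\x00".join((path, name, fmt))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


explicit_key = cache_key("s3://bucket/a.csv", "a.csv", "csv", file_cache_key="v1")
derived_key = cache_key("s3://bucket/a.csv", "a.csv", "csv")
moved_key = cache_key("s3://bucket/moved/a.csv", "a.csv", "csv")
```

With an attribute-based key, moving the same bytes to a different path changes the key, so supplying a content-based `file_cache_key` is the way to keep cache hits stable in that situation.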

open()

@classmethod
def open(
mode: str = "rb",
block_size: Optional[int] = None,
cache_type: str = "readahead",
cache_options: Optional[dict] = None,
compression: Optional[str] = None,
**kwargs: Any
) -> AsyncGenerator[Union[AsyncWritableFile, AsyncReadableFile, "HashingWriter"], None]

Asynchronously open the file and return a file-like object. Use this method in async tasks to read from or write to files directly.

Parameters

| Name | Type | Description |
|---|---|---|
| mode | `str` = "rb" | The mode to open the file in (default: 'rb'). Common modes: 'rb' (read binary), 'wb' (write binary), 'rt' (read text), 'wt' (write text) |
| block_size | `Optional[int]` = None | Size of blocks for reading in bytes. Useful for streaming large files. |
| cache_type | `str` = "readahead" | Caching mechanism to use ('readahead', 'mmap', 'bytes', 'none') |
| cache_options | `Optional[dict]` = None | Dictionary of options for the cache |
| compression | `Optional[str]` = None | Compression format or None for auto-detection |
| \*\*kwargs | `Any` | Additional arguments passed to fsspec's open method |

Returns

| Type | Description |
|---|---|
| `AsyncGenerator[Union[AsyncWritableFile, AsyncReadableFile, "HashingWriter"], None]` | An async file-like object that can be used with async read/write operations |
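
An `AsyncGenerator[..., None]` return annotation is the shape commonly carried by functions wrapped in `contextlib.asynccontextmanager`, which suggests the method is meant to be used with `async with`. That is an inference from the signature, not something this reference states. The sketch below shows the pattern with the standard library only; `AsyncReader` and `open_async` are hypothetical names and operate on a local file.

```python
import asyncio
import os
import tempfile
from contextlib import asynccontextmanager
from typing import IO, AsyncGenerator


class AsyncReader:
    """Hypothetical wrapper exposing an awaitable read() over a blocking file."""

    def __init__(self, handle: IO[bytes]) -> None:
        self._handle = handle

    async def read(self, size: int = -1) -> bytes:
        # Run the blocking read in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self._handle.read, size)


@asynccontextmanager
async def open_async(path: str) -> AsyncGenerator[AsyncReader, None]:
    handle = open(path, "rb")
    try:
        yield AsyncReader(handle)
    finally:
        handle.close()  # always release the handle, even on error


async def main() -> bytes:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        with open(path, "wb") as f:
            f.write(b"hello")
        async with open_async(path) as reader:
            return await reader.read()


content = asyncio.run(main())
```

The context manager guarantees cleanup when the `async with` block exits, which is the main reason to expose an open call this way.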

exists()

@classmethod
def exists() -> bool

Asynchronously check if the file exists.

Returns

| Type | Description |
|---|---|
| `bool` | True if the file exists, False otherwise |

exists_sync()

@classmethod
def exists_sync() -> bool

Synchronously check if the file exists. Use this in non-async tasks or when you need synchronous file existence checking.

Returns

| Type | Description |
|---|---|
| `bool` | True if the file exists, False otherwise |
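
The async and `_sync` pairing used throughout this class can be pictured with a local stand-in. This is a sketch of the naming pattern only; `LocalFileSketch` is hypothetical, and wrapping the coroutine with `asyncio.run` is one simple way to build a sync variant, not necessarily what the library does.

```python
import asyncio
import os
import tempfile


class LocalFileSketch:
    """Hypothetical class with an async method and a sync counterpart."""

    def __init__(self, path: str) -> None:
        self.path = path

    async def exists(self) -> bool:
        # Offload the blocking filesystem call to a worker thread.
        return await asyncio.to_thread(os.path.exists, self.path)

    def exists_sync(self) -> bool:
        # asyncio.run starts a fresh event loop, so this variant must not be
        # called from code that is already running inside an event loop.
        return asyncio.run(self.exists())


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "data.txt")
    with open(target, "w") as f:
        f.write("x")
    present = LocalFileSketch(target).exists_sync()

# The temporary directory has been removed, so the file is gone.
missing = LocalFileSketch(target).exists_sync()
```

In async code the plain method is awaited directly (`await f.exists()`); the `_sync` variant is for callers that have no event loop.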

open_sync()

@classmethod
def open_sync(
mode: str = "rb",
block_size: Optional[int] = None,
cache_type: str = "readahead",
cache_options: Optional[dict] = None,
compression: Optional[str] = None,
**kwargs: Any
) -> Generator[IO[Any], None, None]

Synchronously open the file and return a file-like object. Use this method in non-async tasks to read from or write to files directly.

Parameters

| Name | Type | Description |
|---|---|---|
| mode | `str` = "rb" | The mode to open the file in (default: 'rb'). Common modes: 'rb' (read binary), 'wb' (write binary), 'rt' (read text), 'wt' (write text) |
| block_size | `Optional[int]` = None | Size of blocks for reading in bytes. Useful for streaming large files. |
| cache_type | `str` = "readahead" | Caching mechanism to use ('readahead', 'mmap', 'bytes', 'none') |
| cache_options | `Optional[dict]` = None | Dictionary of options for the cache |
| compression | `Optional[str]` = None | Compression format or None for auto-detection |
| \*\*kwargs | `Any` | Additional arguments passed to fsspec's open method |

Returns

| Type | Description |
|---|---|
| `Generator[IO[Any], None, None]` | A file-like object that can be used with standard read/write operations |
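
Once a file-like object is open, large files are usually consumed in fixed-size chunks rather than with a single `read()`. The sketch below shows that consumer-side loop on an in-memory stream; it illustrates chunked reading in general and is not a description of how the `block_size` parameter is applied internally. `iter_blocks` is a hypothetical helper.

```python
import io
from typing import IO, Iterator


def iter_blocks(fh: IO[bytes], block_size: int = 4) -> Iterator[bytes]:
    """Yield successive chunks of at most block_size bytes until EOF."""
    while True:
        block = fh.read(block_size)
        if not block:  # an empty bytes object signals end of file
            return
        yield block


blocks = list(iter_blocks(io.BytesIO(b"abcdefghij"), block_size=4))
print(blocks)  # [b'abcd', b'efgh', b'ij']
```

The same loop works on any object with a standard `read(size)` method, so memory use stays bounded by the chunk size.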

download()

@classmethod
def download(
local_path: Optional[Union[str, Path]] = None
) -> str

Asynchronously download the file to a local path. Use this when you need to download a remote file to your local filesystem for processing.

Parameters

| Name | Type | Description |
|---|---|---|
| local_path | `Optional[Union[str, Path]]` = None | The local path to download the file to. If None, a temporary directory will be used and a path will be generated. |

Returns

| Type | Description |
|---|---|
| `str` | The absolute path to the downloaded file |
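
The `local_path` rule (use the given path, otherwise generate one in a temporary directory, and return an absolute path) can be sketched as a small resolver. `resolve_download_target` is hypothetical, and reusing the remote file's basename for the generated path is an assumption; this reference does not say how the generated name is chosen.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional, Union


def resolve_download_target(
    remote_path: str, local_path: Optional[Union[str, Path]] = None
) -> str:
    """Pick the local destination for a download and return it as an absolute path."""
    if local_path is None:
        # No destination given: create a temporary directory and place the
        # file inside it (assumed naming: the remote path's last component).
        tmp_dir = tempfile.mkdtemp()
        local_path = os.path.join(tmp_dir, os.path.basename(remote_path))
    return os.path.abspath(local_path)


generated = resolve_download_target("s3://bucket/data/report.csv")
explicit = resolve_download_target("s3://bucket/data/report.csv", Path("out/report.csv"))
```

Returning an absolute path means the caller can hand the result to other tools without depending on the current working directory.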

download_sync()

@classmethod
def download_sync(
local_path: Optional[Union[str, Path]] = None
) -> str

Synchronously download the file to a local path. Use this in non-async tasks when you need to download a remote file to your local filesystem.

Parameters

| Name | Type | Description |
|---|---|---|
| local_path | `Optional[Union[str, Path]]` = None | The local path to download the file to. If None, a temporary directory will be used and a path will be generated. |

Returns

| Type | Description |
|---|---|
| `str` | The absolute path to the downloaded file |

from_local_sync()

@classmethod
def from_local_sync(
local_path: Union[str, Path],
remote_destination: Optional[str] = None,
hash_method: Optional[HashMethod | str] = None
) -> [File](file.md?sid=flyte_io__file_file)[T]

Synchronously create a new File object from a local file by uploading it to remote storage. Use this in non-async tasks when you have a local file that needs to be uploaded to remote storage.

Parameters

| Name | Type | Description |
|---|---|---|
| local_path | `Union[str, Path]` | Path to the local file |
| remote_destination | `Optional[str]` = None | Optional remote path to store the file. If None, a path will be automatically generated. |
| hash_method | `Optional[HashMethod \| str]` = None | |

Returns

| Type | Description |
|---|---|
| [File](file.md?sid=flyte_io__file_file)[T] | A new File instance pointing to the uploaded remote file |

from_local()

@classmethod
def from_local(
local_path: Union[str, Path],
remote_destination: Optional[str] = None,
hash_method: Optional[HashMethod | str] = None
) -> [File](file.md?sid=flyte_io__file_file)[T]

Asynchronously create a new File object from a local file by uploading it to remote storage. Use this in async tasks when you have a local file that needs to be uploaded to remote storage.

Parameters

| Name | Type | Description |
|---|---|---|
| local_path | `Union[str, Path]` | Path to the local file |
| remote_destination | `Optional[str]` = None | Optional remote path to store the file. If None, a path will be automatically generated. |
| hash_method | `Optional[HashMethod \| str]` = None | |

Returns

| Type | Description |
|---|---|
| [File](file.md?sid=flyte_io__file_file)[T] | A new File instance pointing to the uploaded remote file |
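
The Attributes table describes `hash_method` as either a HashMethod object or a string holding a precomputed cache key. The sketch below illustrates that two-way choice for a local file. The `HashMethod` interface is not shown in this reference, so `Sha256Method` with its `update` and `result` methods is a hypothetical stand-in, and `resolve_hash` is an invented helper.

```python
import hashlib
import os
import tempfile
from typing import Optional, Union


class Sha256Method:
    """Hypothetical hash method; the real HashMethod interface may differ."""

    def __init__(self) -> None:
        self._hasher = hashlib.sha256()

    def update(self, data: bytes) -> None:
        self._hasher.update(data)

    def result(self) -> str:
        return self._hasher.hexdigest()


def resolve_hash(
    local_path: str, hash_method: Optional[Union[Sha256Method, str]] = None
) -> Optional[str]:
    """Return a precomputed key unchanged, or hash the file contents in blocks."""
    if hash_method is None:
        return None
    if isinstance(hash_method, str):
        return hash_method  # precomputed cache key, used as-is
    with open(local_path, "rb") as fh:
        # Feed the file through the hasher in 8 KiB blocks to bound memory use.
        for block in iter(lambda: fh.read(8192), b""):
            hash_method.update(block)
    return hash_method.result()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as f:
        f.write(b"hello")
    computed = resolve_hash(path, Sha256Method())
    precomputed = resolve_hash(path, "my-cache-key-v1")
    unhashed = resolve_hash(path)
```

A precomputed string skips reading the file entirely, which matters when the key is already known from an upstream step.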