class documentation

class FileReader:

Known subclasses: pyalma.local.LocalFileReader, pyalma.ssh.SshClient

Constructor: FileReader()

Abstract base class for reading and managing different file types, both locally and remotely.

This class is designed to be subclassed by specific implementations (e.g., local or SSH).
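The subclassing pattern can be sketched as follows. This is a minimal illustration, not the pyalma source: `BaseReader` and `DirReader` are hypothetical stand-ins, and only the behaviour documented on this page (a `remote` flag, and a `listdir` that raises `NotImplementedError` in the base class) is assumed.

```python
import os
import tempfile


class BaseReader:
    """Hypothetical stand-in for an abstract reader base class."""

    def __init__(self):
        self.remote = False  # flag indicating remote operation

    def is_remote(self):
        return self.remote

    def listdir(self, path):
        # Subclasses are expected to override this method.
        raise NotImplementedError("listdir must be implemented by a subclass")


class DirReader(BaseReader):
    """Hypothetical local implementation that overrides listdir."""

    def listdir(self, path):
        return sorted(os.listdir(path))


def demo():
    # Create one file in a scratch directory and list it via the subclass.
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "a.txt"), "w").close()
        return DirReader().listdir(tmp)
```

Calling `listdir` on the base class raises `NotImplementedError`, while the subclass returns the directory entries.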

Method __del__ Destructor to clean up temporary files if clean_on_destruction is enabled.
Method __init__ Initializes the FileReader with default configurations.
Method clean_tmp_files Deletes a temporary file if it exists.
Method decode_content_by_type Decodes content based on file type, returning a DataFrame or raw string.
Method download_remote_file Abstract method. Downloads a file from a remote location.
Method get_file_extension Extracts the file extension from a given path.
Method get_file_size Placeholder to get file size.
Method is_remote Checks if the reader is set to remote mode.
Method isfile Checks if a given path exists and is a file.
Method listdir Abstract method. Should be implemented by subclasses to list directory contents.
Method load_h5ad_file Placeholder for remote/local logic in loading H5AD files.
Method read_file Unified file reader for local or remote paths.
Method read_file_into_df Reads a file into a DataFrame, for local and remote paths.
Method read_h5ad Reads an H5AD file using `anndatareader`.
Method read_vcf_file_into_df Reads a VCF (Variant Call Format) file using `pysam`.
Method set_clean_on_dest Sets the `clean_on_destruction` flag.
Method write_to_remote_file Abstract method. Writes data to a remote file.
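Instance Variable clean_on_destruction Determines whether to delete files on object destruction.
Instance Variable files_to_clean Tracks temporary files that may need deletion.
Instance Variable remote Flag indicating remote operation.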
Method _is_auto_dataframe_type Undocumented
Method _is_binary_type Undocumented
Method _is_text_type Undocumented
def __del__(self):
overridden in pyalma.ssh.SshClient

Destructor to clean up temporary files if clean_on_destruction is enabled.

def __init__(self):

Initializes the FileReader with default configurations.

  • `files_to_clean`: Tracks temporary files that may need deletion.
  • `remote`: Flag indicating remote operation.
  • `clean_on_destruction`: Determines whether to delete files on object destruction.
def clean_tmp_files(self, path):

Deletes a temporary file if it exists.

:param path: Path to the file to delete. :type path: str
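How `files_to_clean`, `clean_on_destruction`, `clean_tmp_files`, and `__del__` might fit together is sketched below. `TmpFileTracker` and `make_tmp_file` are hypothetical names, the default value of `clean_on_destruction` is an assumption, and the real class may behave differently.

```python
import os
import tempfile


class TmpFileTracker:
    """Hypothetical stand-in illustrating the cleanup attributes."""

    def __init__(self):
        self.files_to_clean = []          # temporary files that may need deletion
        self.clean_on_destruction = True  # assumed default; not stated in the docs

    def set_clean_on_dest(self, value):
        self.clean_on_destruction = bool(value)

    def clean_tmp_files(self, path):
        # Delete the file only if it still exists.
        if os.path.isfile(path):
            os.remove(path)

    def __del__(self):
        # Remove every tracked file when cleanup on destruction is enabled.
        if self.clean_on_destruction:
            for path in self.files_to_clean:
                self.clean_tmp_files(path)


def make_tmp_file():
    """Create an empty temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    return path
```

Because `clean_tmp_files` checks for existence first, calling it twice on the same path is harmless in this sketch.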

def decode_content_by_type(self, content, type, as_dataframe=False, as_binary=False, **kwargs):

Decodes content based on file type, returning a DataFrame or raw string.

:param content: Raw content (str, bytes, or file path). :param type: File type (a file extension or one of the generic types: pdf, image, text, csv, zip). :param kwargs: Extra arguments for `pandas.read_csv`.

:return: Decoded content (DataFrame, str, or bytes).
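The dispatch on file type can be illustrated with a dependency-free sketch. `decode_content` is a hypothetical function, not the pyalma implementation, and `csv.DictReader` stands in for `pandas.read_csv` so the example runs with the standard library only; it returns a list of row dictionaries instead of a DataFrame.

```python
import csv
import io


def decode_content(content, type, as_binary=False, **kwargs):
    """Hypothetical dispatch on file type; not the pyalma implementation."""
    if as_binary:
        # Raw bytes requested: encode text if needed, otherwise pass through.
        return content.encode("utf-8") if isinstance(content, str) else content
    # Normalise to text before any type-specific parsing.
    text = content.decode("utf-8") if isinstance(content, bytes) else content
    if type == "csv":
        # csv.DictReader stands in for pandas.read_csv; kwargs are forwarded.
        return list(csv.DictReader(io.StringIO(text), **kwargs))
    return text
```

Extra keyword arguments such as `delimiter=";"` are passed through to the parser, mirroring how `kwargs` is documented to reach `pandas.read_csv`.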

def download_remote_file(self, remote_path, local_path):
overridden in pyalma.ssh.SshClient

Abstract method. Downloads a file from a remote location.

:raises NotImplementedError: Always.

def get_file_extension(self, file_path):

Extracts the file extension from a given path.

:param file_path: Full path to the file. :type file_path: str

:return: File extension without the dot. :rtype: str
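One way to obtain an extension without the dot is `os.path.splitext`. The function below is a sketch under that assumption; the source does not show how `get_file_extension` is actually implemented (for example, whether it lowercases the result or handles compound extensions).

```python
import os


def get_file_extension(file_path):
    """Return the extension of file_path without the leading dot."""
    _, ext = os.path.splitext(file_path)
    return ext.lstrip(".")
```

With this sketch, `get_file_extension("/data/sample.csv")` returns `"csv"`, and a path such as `"archive.tar.gz"` yields only the last suffix, `"gz"`.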

def get_file_size(self, path):
overridden in pyalma.ssh.SshClient

Placeholder to get file size.

:param path: Path to the file. :type path: str

:return: File size (to be implemented). :rtype: None

def is_remote(self):

Checks if the reader is set to remote mode.

:return: True if remote, False otherwise. :rtype: bool

def isfile(self, path):
overridden in pyalma.ssh.SshClient

Checks if a given path exists and is a file.

:param path: Path to check. :type path: str

:return: True if path exists and is a file. :rtype: bool

def listdir(self, path):

Abstract method. Should be implemented by subclasses to list directory contents.

:raises NotImplementedError: Always.

def load_h5ad_file(self, path, local_path):
overridden in pyalma.ssh.SshClient

Placeholder for remote/local logic in loading H5AD files.

:param path: Path to the file. :param local_path: Local destination path. :type path: str :type local_path: str

:return: Path to use. :rtype: str

def read_file(self, path, type=None, as_dataframe=False, as_binary=False, **kwargs):

Unified file reader for local or remote paths.

:param path: File path. :param type: Optional file type override (generic types: pdf, image, text, csv, zip). :param as_dataframe: Whether to parse into a DataFrame. :param as_binary: Force raw binary return.
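The source does not show how `read_file` handles remote paths. One plausible flow, sketched here under stated assumptions, is to download the file to a temporary location, track it in `files_to_clean`, and then read it locally. `SketchReader` is hypothetical, and its `download_remote_file` merely copies a local file to simulate a transfer.

```python
import os
import shutil
import tempfile


class SketchReader:
    """Hypothetical reader; 'remote' files are simulated by copying locally."""

    def __init__(self, remote=False):
        self.remote = remote
        self.files_to_clean = []

    def download_remote_file(self, remote_path, local_path):
        # Stand-in for a real transfer (e.g. over SSH).
        shutil.copyfile(remote_path, local_path)

    def read_file(self, path, as_binary=False):
        local_path = path
        if self.remote:
            # Fetch into a temporary file and remember it for later cleanup.
            fd, local_path = tempfile.mkstemp()
            os.close(fd)
            self.download_remote_file(path, local_path)
            self.files_to_clean.append(local_path)
        with open(local_path, "rb") as handle:
            data = handle.read()
        return data if as_binary else data.decode("utf-8")
```

In this sketch the caller sees the same result for local and remote readers; only the remote reader leaves an entry in `files_to_clean`.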

def read_file_into_df(self, path, type=None, as_binary=False, **kwargs):

Reads a file into a DataFrame, for local and remote paths.

:param path: File path. :param type: Optional file type override (generic types: pdf, image, text, csv, zip). :param as_dataframe: Force parsing into a DataFrame. :param as_binary: Force raw binary return.

def read_h5ad(self, path):

Reads an H5AD file using `anndatareader`.

:param path: Path to the `.h5ad` file. :type path: str

:return: Loaded `AnnData` object. :rtype: AnnData

def read_vcf_file_into_df(self, path):

Reads a VCF (Variant Call Format) file using `pysam`.

:param path: Path to the VCF file. :type path: str

:return: VCF file as a `pysam.VariantFile` object. :rtype: pysam.VariantFile

def set_clean_on_dest(self, value):

Sets the `clean_on_destruction` flag.

:param value: Enable or disable automatic cleanup. :type value: bool

def write_to_remote_file(self, data, remote_path, file_format='csv'):
overridden in pyalma.ssh.SshClient

Abstract method. Writes data to a remote file.

:param data: Data to write. :param remote_path: Path on remote server. :param file_format: File format, default is 'csv'. :raises NotImplementedError: Always.

clean_on_destruction

Determines whether to delete files on object destruction.

files_to_clean: list

Tracks temporary files that may need deletion.

remote: bool
overridden in pyalma.ssh.SshClient

Flag indicating remote operation.

def _is_auto_dataframe_type(self, type):

Undocumented

def _is_binary_type(self, type):

Undocumented

def _is_text_type(self, type):

Undocumented