deeprvat.utils

Module Contents

Functions

fdrcorrect_df

Apply False Discovery Rate (FDR) correction to p-values in a DataFrame.

bfcorrect_df

Apply Bonferroni correction to p-values in a DataFrame.

pval_correction

Apply p-value correction to a DataFrame.

suggest_hparams

Suggest hyperparameters using Optuna’s suggest methods.

compute_se

Compute standard error.

standardize_series

Standardize a pandas Series.

my_quantile_transform

Gaussian quantile transform for values in a pandas Series.

standardize_series_with_params

Standardize a pandas Series using provided standard deviation and mean.

calculate_mean_std

Calculate mean and standard deviation of a pandas Series.

safe_merge

Safely merge two pandas DataFrames.

resolve_path_with_env

Resolve a path with environment variables.

copy_with_env

Copy a file or directory to a destination with environment variables.

load_or_init

Load a pickled file or initialize an object.

remove_prefix

Remove a prefix from a string.

suggest_batch_size

Suggest a batch size for a tensor based on available GPU memory.

Data

logger

API

deeprvat.utils.logger = 'getLogger(...)'
deeprvat.utils.fdrcorrect_df(group: pandas.DataFrame, alpha: float) → pandas.DataFrame

Apply False Discovery Rate (FDR) correction to p-values in a DataFrame.

Parameters:
  • group (pd.DataFrame) – DataFrame containing a “pval” column.

  • alpha (float) – Significance level.

Returns:

Original DataFrame with additional columns “significant” and “pval_corrected”.

Return type:

pd.DataFrame
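The documentation above does not say which FDR procedure is applied. As a rough illustration only, the sketch below assumes the Benjamini-Hochberg procedure and reproduces the documented output columns. The function name `bh_fdr_sketch` and the example data are hypothetical and are not part of deeprvat.

```python
import numpy as np
import pandas as pd


def bh_fdr_sketch(group: pd.DataFrame, alpha: float) -> pd.DataFrame:
    """Benjamini-Hochberg adjustment of the "pval" column (illustrative only)."""
    out = group.copy()
    pvals = out["pval"].to_numpy(dtype=float)
    n = len(pvals)
    order = np.argsort(pvals)  # positions of p-values in ascending order
    scaled = pvals[order] * n / np.arange(1, n + 1)  # p_(i) * n / i
    # Enforce monotonicity from the largest p-value downwards, then cap at 1.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1].clip(max=1.0)
    adjusted = np.empty(n)
    adjusted[order] = adjusted_sorted  # restore the original row order
    out["pval_corrected"] = adjusted
    out["significant"] = adjusted <= alpha
    return out


# Hypothetical input: one row per tested gene.
df = pd.DataFrame({"gene": ["A", "B", "C", "D"], "pval": [0.01, 0.04, 0.03, 0.20]})
result = bh_fdr_sketch(df, alpha=0.05)
```

With this input only gene A remains significant at alpha = 0.05, because its adjusted p-value is 0.04 while the others are above 0.05.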

deeprvat.utils.bfcorrect_df(group: pandas.DataFrame, alpha: float) → pandas.DataFrame

Apply Bonferroni correction to p-values in a DataFrame.

Parameters:
  • group (pd.DataFrame) – DataFrame containing a “pval” column.

  • alpha (float) – Significance level.

Returns:

Original DataFrame with additional columns “significant” and “pval_corrected”.

Return type:

pd.DataFrame
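For comparison, a Bonferroni correction multiplies each p-value by the number of tests and caps the result at 1. The sketch below assumes that the number of tests equals the number of rows in the DataFrame, which the documentation does not state explicitly. `bonferroni_sketch` is a hypothetical name, not the library function.

```python
import pandas as pd


def bonferroni_sketch(group: pd.DataFrame, alpha: float) -> pd.DataFrame:
    """Bonferroni adjustment of the "pval" column (illustrative only)."""
    out = group.copy()
    n_tests = len(out)  # assumption: one test per row
    out["pval_corrected"] = (out["pval"] * n_tests).clip(upper=1.0)
    out["significant"] = out["pval_corrected"] <= alpha
    return out


df = pd.DataFrame({"pval": [0.001, 0.02, 0.5]})
result = bonferroni_sketch(df, alpha=0.05)
```

Here the corrected p-values are 0.003, 0.06, and 1.0, so only the first row stays significant at alpha = 0.05.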

deeprvat.utils.pval_correction(group: pandas.DataFrame, alpha: float, correction_type: str = 'FDR')

Apply p-value correction to a DataFrame.

Parameters:
  • group (pd.DataFrame) – DataFrame containing a column named “pval” with p-values to correct.

  • alpha (float) – Significance level.

  • correction_type (str) – Type of p-value correction. Options are ‘FDR’ (default) and ‘Bonferroni’.

Returns:

Original DataFrame with additional columns “significant” and “pval_corrected”.

Return type:

pd.DataFrame

deeprvat.utils.suggest_hparams(config: Dict, trial: optuna.trial.Trial, basename: str = '') → Dict

Suggest hyperparameters using Optuna’s suggest methods.

Parameters:
  • config (Dict) – Configuration dictionary with hyperparameter specifications.

  • trial (optuna.trial.Trial) – Optuna trial instance.

  • basename (str) – Base name for hyperparameter suggestions.

Returns:

Updated configuration with suggested hyperparameters.

Return type:

Dict

deeprvat.utils.compute_se(errors: numpy.ndarray) → float

Compute standard error.

Parameters:

errors (np.ndarray) – Array of errors.

Returns:

Standard error.

Return type:

float
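The documentation does not define which standard error is meant. A common reading is the standard error of the mean: the sample standard deviation divided by the square root of the sample size. The sketch below assumes that definition (with `ddof=1`); `standard_error_sketch` is a hypothetical name.

```python
import numpy as np


def standard_error_sketch(errors: np.ndarray) -> float:
    """Standard error of the mean, assuming the sample standard deviation (ddof=1)."""
    errors = np.asarray(errors, dtype=float)
    return float(np.std(errors, ddof=1) / np.sqrt(errors.size))


se = standard_error_sketch(np.array([1.0, 2.0, 3.0, 4.0]))
```

This sketch expects at least two values; with a single value the sample standard deviation is undefined.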

deeprvat.utils.standardize_series(x: pandas.Series) → pandas.Series

Standardize a pandas Series.

Parameters:

x (pd.Series) – Input Series.

Returns:

Standardized Series.

Return type:

pd.Series
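Standardizing usually means subtracting the mean and dividing by the standard deviation. The sketch below assumes that convention and uses the pandas default of a sample standard deviation (`ddof=1`); whether the library uses the sample or population form is not stated here. `standardize_sketch` is a hypothetical name.

```python
import pandas as pd


def standardize_sketch(x: pd.Series) -> pd.Series:
    """Z-score a Series: subtract the mean, divide by the standard deviation."""
    return (x - x.mean()) / x.std()  # pandas std() defaults to ddof=1


x = pd.Series([2.0, 4.0, 6.0])
z = standardize_sketch(x)  # mean is 4.0, sample std is 2.0
```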

deeprvat.utils.my_quantile_transform(x, seed=1)

Gaussian quantile transform for values in a pandas Series.

Parameters:
  • x (pd.Series) – Input pandas Series.

  • seed (int) – Random seed.

Returns:

Transformed Series.

Return type:

pd.Series

Note

NaN values are preserved in the output.
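To illustrate what a Gaussian quantile transform does, the sketch below uses a rank-based inverse normal transform built from pandas and the standard library. This is a swapped-in, deterministic technique chosen for clarity; it takes no seed and is not necessarily the method the library uses. `rank_inverse_normal_sketch` is a hypothetical name.

```python
from statistics import NormalDist

import pandas as pd


def rank_inverse_normal_sketch(x: pd.Series) -> pd.Series:
    """Map values to standard-normal quantiles by rank, leaving NaN untouched."""
    ranks = x.rank(method="average")  # NaN entries receive NaN ranks
    n = x.notna().sum()  # number of non-missing values
    quantiles = (ranks - 0.5) / n  # strictly inside (0, 1)
    normal = NormalDist()  # standard normal: mean 0, std 1
    return quantiles.map(normal.inv_cdf, na_action="ignore")


x = pd.Series([3.0, float("nan"), 1.0, 2.0])
t = rank_inverse_normal_sketch(x)
```

The middle value (2.0) maps to 0, the smallest and largest values map to symmetric negative and positive quantiles, and the NaN entry stays NaN.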

deeprvat.utils.standardize_series_with_params(x: pandas.Series, std, mean) → pandas.Series

Standardize a pandas Series using provided standard deviation and mean.

Parameters:
  • x (pd.Series) – Input Series.

  • std – Standard deviation to use for standardization.

  • mean – Mean to use for standardization.

Returns:

Standardized Series.

Return type:

pd.Series

deeprvat.utils.calculate_mean_std(x: pandas.Series, ignore_zero=True) → pandas.Series

Calculate mean and standard deviation of a pandas Series.

Parameters:
  • x (pd.Series) – Input Series.

  • ignore_zero (bool) – Whether to ignore zero values in calculations, defaults to True.

Returns:

Tuple of standard deviation and mean.

Return type:

Tuple[float, float]
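A typical use of these two functions together is to compute the statistics on one dataset and reuse them to scale another. The sketch below mimics that workflow with hypothetical stand-ins (`mean_std_sketch`, `standardize_with_params_sketch`). It assumes that ignoring zeros means treating them as missing, and it follows the documented return order of standard deviation first, mean second.

```python
import numpy as np
import pandas as pd


def mean_std_sketch(x: pd.Series, ignore_zero: bool = True):
    """Return (std, mean); zeros are treated as missing when ignore_zero is True."""
    values = x.replace(0, np.nan) if ignore_zero else x
    return values.std(), values.mean()  # order matches the documented tuple


def standardize_with_params_sketch(x: pd.Series, std, mean) -> pd.Series:
    """Scale a Series with externally supplied statistics."""
    return (x - mean) / std


train = pd.Series([0.0, 1.0, 2.0, 3.0])
test = pd.Series([2.0, 4.0])

std, mean = mean_std_sketch(train)  # zero dropped: statistics of 1, 2, 3
scaled = standardize_with_params_sketch(test, std, mean)
```

Because the tuple is ordered as (std, mean), unpacking it as `mean, std` would silently swap the two values.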

deeprvat.utils.safe_merge(left: pandas.DataFrame, right: pandas.DataFrame, validate: str = '1:1', equal_row_nums: bool = False)

Safely merge two pandas DataFrames.

Parameters:
  • left (pd.DataFrame) – Left DataFrame.

  • right (pd.DataFrame) – Right DataFrame.

  • validate (str) – Expected merge cardinality to validate, such as ‘1:1’; defaults to ‘1:1’.

  • equal_row_nums (bool) – Whether to check if the row numbers are equal, defaults to False.

Raises:
  • ValueError – If ‘equal_row_nums’ is True and the left and right DataFrames have different numbers of rows.

  • RuntimeError – If the merged DataFrame has a different number of rows than the left DataFrame.

Returns:

Merged DataFrame.

Return type:

pd.DataFrame
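The checks described above can be sketched with plain pandas. The version below assumes the merge joins on the columns the two DataFrames share and that `validate` is forwarded to `pandas.merge`; neither detail is confirmed by this page. `safe_merge_sketch` is a hypothetical name.

```python
import pandas as pd


def safe_merge_sketch(
    left: pd.DataFrame,
    right: pd.DataFrame,
    validate: str = "1:1",
    equal_row_nums: bool = False,
) -> pd.DataFrame:
    """Merge two DataFrames and fail loudly if rows are gained or lost."""
    if equal_row_nums and len(left) != len(right):
        raise ValueError("left and right have different numbers of rows")
    # Inner join on shared column names; pandas checks the key cardinality.
    merged = pd.merge(left, right, validate=validate)
    if len(merged) != len(left):
        raise RuntimeError("merge changed the number of rows relative to left")
    return merged


left = pd.DataFrame({"id": [1, 2, 3], "a": [10, 20, 30]})
right = pd.DataFrame({"id": [1, 2, 3], "b": ["x", "y", "z"]})
merged = safe_merge_sketch(left, right, equal_row_nums=True)
```

In this sketch, a `right` frame that lacks one of the ids would shrink the inner join and trigger the `RuntimeError`, while duplicated ids would be rejected by the pandas `validate` check.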

deeprvat.utils.resolve_path_with_env(path: str) → str

Resolve a path with environment variables.

Parameters:

path (str) – Input path.

Returns:

Resolved path.

Return type:

str
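As a minimal sketch of what resolving a path with environment variables can look like, the example below uses `os.path.expandvars` from the standard library. The real function may do more or less than this; `resolve_path_sketch` and the variable `DEMO_DATA_DIR` are hypothetical.

```python
import os


def resolve_path_sketch(path: str) -> str:
    """Expand $VAR and ${VAR} references using the current environment."""
    return os.path.expandvars(path)


os.environ["DEMO_DATA_DIR"] = "/tmp/demo"  # hypothetical variable for the demo
resolved = resolve_path_sketch("$DEMO_DATA_DIR/phenotypes.parquet")
```

Note that `os.path.expandvars` leaves references to undefined variables unchanged rather than raising an error.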

deeprvat.utils.copy_with_env(path: str, destination: str) → str

Copy a file or directory to a destination with environment variables.

Parameters:
  • path (str) – Input path (file or directory).

  • destination (str) – Destination path.

Returns:

Resulting destination path.

Return type:

str

deeprvat.utils.load_or_init(pickle_file: str, init_fn: Callable) → Any

Load a pickled file or initialize an object.

Parameters:
  • pickle_file (str) – Pickle file path.

  • init_fn (Callable) – Initialization function.

Returns:

Loaded or initialized object.

Return type:

Any
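The load-or-initialize pattern can be sketched as follows. The sketch assumes that `init_fn` takes no arguments and that nothing is written back to disk when the file is missing; the documentation does not specify either point. `load_or_init_sketch` is a hypothetical name.

```python
import os
import pickle
import tempfile
from typing import Any, Callable


def load_or_init_sketch(pickle_file: str, init_fn: Callable[[], Any]) -> Any:
    """Unpickle the file if it exists, otherwise build a fresh object."""
    if os.path.isfile(pickle_file):
        with open(pickle_file, "rb") as f:
            return pickle.load(f)
    return init_fn()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "state.pkl")
    first = load_or_init_sketch(path, dict)  # file missing: init_fn() is called
    with open(path, "wb") as f:
        pickle.dump({"epoch": 3}, f)
    second = load_or_init_sketch(path, dict)  # file present: contents are loaded
```

As with any use of pickle, only load files from sources you trust.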

deeprvat.utils.remove_prefix(string, prefix)

Remove a prefix from a string.

Parameters:
  • string (str) – Input string.

  • prefix (str) – Prefix to remove.

Returns:

String without the specified prefix.

Return type:

str
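A minimal sketch of prefix removal is shown below. It assumes the string is returned unchanged when it does not start with the prefix, which this page does not state. `remove_prefix_sketch` and the example strings are hypothetical.

```python
def remove_prefix_sketch(string: str, prefix: str) -> str:
    """Drop the prefix if present; otherwise return the string unchanged."""
    if prefix and string.startswith(prefix):
        return string[len(prefix):]
    return string


stripped = remove_prefix_sketch("model.layer1.weight", "model.")
untouched = remove_prefix_sketch("layer1.weight", "model.")
```

On Python 3.9 and later, the built-in `str.removeprefix` provides the same behavior as this sketch.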

deeprvat.utils.suggest_batch_size(tensor_shape: Iterable[int], example: Dict[str, Any] = {'batch_size': 16384, 'tensor_shape': (20, 125, 38), 'max_mem_bytes': 22890098688}, buffer_bytes: int = 2500000000)

Suggest a batch size for a tensor based on available GPU memory.

Parameters:
  • tensor_shape (Iterable[int]) – Shape of the tensor.

  • example (Dict[str, Any]) – Reference configuration with the keys ‘batch_size’, ‘tensor_shape’, and ‘max_mem_bytes’, as in the default shown in the signature.

  • buffer_bytes (int) – Number of bytes of memory to hold back as a buffer; defaults to 2500000000.

Returns:

Suggested batch size for the given tensor shape and GPU memory.

Return type:

int
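One plausible way to derive a batch size from a reference configuration is linear scaling: if a known batch size and tensor shape fit in a known amount of memory, the new batch size scales with usable memory and inversely with the number of elements per sample. The sketch below assumes exactly that and takes the GPU memory as an explicit, hypothetical `gpu_mem_bytes` argument instead of querying a device. It is not the library's implementation.

```python
import math
from typing import Any, Dict, Iterable, Optional

# Reference values copied from the default shown in the signature above.
EXAMPLE = {
    "batch_size": 16384,
    "tensor_shape": (20, 125, 38),
    "max_mem_bytes": 22_890_098_688,
}


def batch_size_sketch(
    tensor_shape: Iterable[int],
    gpu_mem_bytes: int,  # hypothetical: supplied by the caller, not queried
    example: Optional[Dict[str, Any]] = None,
    buffer_bytes: int = 2_500_000_000,
) -> int:
    """Scale the reference batch size linearly with memory and sample size."""
    example = example or EXAMPLE
    usable = gpu_mem_bytes - buffer_bytes
    if usable <= 0:
        raise ValueError("buffer_bytes leaves no usable memory")
    # Integer arithmetic avoids rounding errors near exact multiples.
    numerator = usable * example["batch_size"] * math.prod(example["tensor_shape"])
    denominator = example["max_mem_bytes"] * math.prod(tensor_shape)
    return max(1, numerator // denominator)


# Usable memory equal to the reference memory, half as many elements per sample.
total_mem = 22_890_098_688 + 2_500_000_000
suggested = batch_size_sketch((10, 125, 38), gpu_mem_bytes=total_mem)
```

Under these assumptions, halving the number of elements per sample doubles the suggested batch size, and the same shape with the same usable memory reproduces the reference batch size of 16384.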