deeprvat.data.rare
Module Contents
API
- deeprvat.data.rare.logger = 'getLogger(...)'
- class deeprvat.data.rare.PaddedAnnotations(base_dataset, annotations: List[str], thresholds: Dict[str, str] = None, gene_file: Optional[str] = None, genes_to_keep: Optional[Set[str]] = None, pad_value: Union[float, int, str] = 0.0, verbose: bool = False, low_memory: bool = False, skip_embedding: bool = False)
Initialization
- embed(idx: int, variant_ids: numpy.ndarray, genotype: numpy.ndarray) -> List[List[torch.Tensor]]
Returns: List[List[torch.Tensor]]
One outer list element per gene; each inner list holds the annotations for that gene's variants, one element per variant present in the gene for this sample.
- collate_fn(batch: List[List[List[numpy.ndarray]]], device: torch.device = torch.device('cpu')) -> torch.Tensor
Returns: torch.Tensor
Dimensions of tensor: samples x genes x annotations x variants. Last dimension is padded to fit all variants.
- setup_annotations(rare_variant_ids: pandas.Series, thresholds: Optional[Dict[str, str]], gene_file: Optional[str], genes_to_keep: Optional[Set[str]] = None)
- apply_thresholds(thresholds: Optional[Dict[str, str]])
- remap_group_ids()
- setup_metadata()
- get_metadata() -> Dict[str, numpy.ndarray]
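The nested output of PaddedAnnotations.embed and the padded tensor described for PaddedAnnotations.collate_fn can be illustrated with a small sketch. This is not the library's implementation: `pad_batch` is a hypothetical helper, it uses NumPy only, and it assumes each variant is a 1-D array of annotation values and that every sample lists the same genes in the same order.

```python
import numpy as np


def pad_batch(batch, pad_value=0.0):
    """Hypothetical helper, not part of deeprvat.

    batch[s][g] is a list of 1-D arrays, one per variant, each holding
    that variant's annotation values. The result has shape
    (samples, genes, annotations, max_variants).
    """
    n_samples = len(batch)
    n_genes = len(batch[0]) if batch else 0
    # Largest number of variants found in any (sample, gene) pair.
    max_variants = max(
        (len(gene) for sample in batch for gene in sample), default=0
    )
    # Number of annotations, read from the first variant found (assumption:
    # all variants carry the same number of annotations).
    n_annotations = next(
        (len(v) for sample in batch for gene in sample for v in gene), 0
    )
    out = np.full(
        (n_samples, n_genes, n_annotations, max_variants),
        pad_value,
        dtype=float,
    )
    for s, sample in enumerate(batch):
        for g, gene in enumerate(sample):
            if gene:
                # Stack variants as columns: (annotations, n_variants).
                out[s, g, :, : len(gene)] = np.stack(gene, axis=1)
    return out


# Two samples, two genes, two annotations per variant.
example_batch = [
    [[np.array([0.1, 0.2]), np.array([0.3, 0.4])], []],
    [
        [np.array([0.5, 0.6])],
        [np.array([0.7, 0.8]), np.array([0.9, 1.0]), np.array([1.1, 1.2])],
    ],
]
padded = pad_batch(example_batch)
print(padded.shape)
```

Positions beyond a gene's own variant count keep `pad_value`, which is what "padded to fit all variants" means for the last dimension. Converting the result with `torch.from_numpy` would give a tensor of the same layout.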
- class deeprvat.data.rare.SparseGenotype(base_dataset, annotations: List[str], thresholds: Dict[str, str] = None, gene_file: Optional[str] = None, genes_to_keep: Optional[Set[str]] = None, verbose: bool = False, low_memory: bool = False)
Initialization
- embed(idx: int, variant_ids: numpy.ndarray, genotype: numpy.ndarray) -> scipy.sparse.coo_matrix
Returns: scipy.sparse.coo_matrix
Sparse matrix for this sample, as given by the signature. The list-of-tensors return description of PaddedAnnotations.embed does not apply to this return type.
- collate_fn(batch: List[scipy.sparse.coo_matrix]) -> scipy.sparse.coo_matrix
- setup_annotations(rare_variant_ids: pandas.Series, thresholds: Optional[Dict[str, str]], gene_file: Optional[str], genes_to_keep: Optional[Set[str]] = None)
- apply_thresholds(thresholds: Optional[Dict[str, str]])
- remap_group_ids()
- setup_metadata()
- get_metadata() -> Dict[str, numpy.ndarray]
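SparseGenotype.collate_fn takes a list of coo_matrix objects and returns a single coo_matrix. How the library combines them is not documented here. The sketch below shows one plausible approach, under the assumption that each sample is a 1 x n_variants row and that the batch stacks those rows. `stack_sparse_rows` is a hypothetical name, not a deeprvat function.

```python
from scipy.sparse import coo_matrix, vstack


def stack_sparse_rows(rows):
    """Hypothetical helper: stack per-sample sparse rows into one matrix.

    Assumes every element of `rows` has the same number of columns.
    """
    return vstack(rows, format="coo")


# Two samples over six variant positions (illustrative values only).
row0 = coo_matrix(([1, 2], ([0, 0], [1, 4])), shape=(1, 6))
row1 = coo_matrix(([1], ([0], [3])), shape=(1, 6))

sparse_batch = stack_sparse_rows([row0, row1])
print(sparse_batch.shape, sparse_batch.nnz)
```

Keeping the batch in COO form avoids materialising a dense samples x variants array, which matters when most genotype entries are zero.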