AnnDataQuery

Quick Start

This guide offers a brief overview of the library's functionality.

Install the library

Install the released package with pip, or install the developer version from GitHub:

# Install via pip
pip install adata_query
# Install the developer version via GitHub
git clone https://github.com/mvinyard/AnnDataQuery.git; cd ./AnnDataQuery;
pip install -e .

AnnData

This package operates downstream of data loading. It assumes a typical adata object, that is, an AnnData instance created with the AnnData package or Scanpy.

import anndata

h5ad_path = "/path/to/your/adata.h5ad"

adata = anndata.read_h5ad(h5ad_path)

Once you have some data, you are ready to interface with adata_query.

adata_query.fetch

This is probably the most useful function in the library; it relies on the two functions described below. In short, it takes a string key and returns the matrix stored under that key in adata. You can also fetch in grouped fashion, based on a pandas groupby over a cell annotation.

import adata_query

key = "X_pca" # stored in adata.obsm

data = adata_query.fetch(adata = adata, key = key)

To fetch the same key grouped by a cell annotation:

import adata_query

key = "X_pca" # stored in adata.obsm
groupby = "cluster" # cell annotation in adata.obs

data = adata_query.fetch(
    adata = adata,
    key = key,
    groupby = groupby,
)

In the grouped example, the returned data is of type List rather than a single matrix.
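
To make the grouping idea concrete, here is a minimal sketch of splitting a per-cell matrix by an annotation column with pandas and NumPy. It is not the library's implementation: `split_by_group` is a hypothetical helper, and it returns a dict keyed by group label purely so the result is easy to inspect.

```python
import numpy as np
import pandas as pd


def split_by_group(matrix, obs, groupby):
    """Return {group label: rows of `matrix` belonging to that group}.

    Hypothetical helper for illustration; not part of adata_query.
    """
    # GroupBy.indices maps each group label to positional row indices.
    groups = obs.groupby(groupby).indices
    return {label: matrix[idx] for label, idx in groups.items()}


# Toy stand-ins for adata.obsm["X_pca"] and adata.obs.
X_pca = np.arange(12, dtype=float).reshape(6, 2)
obs = pd.DataFrame(
    {"cluster": ["a", "b", "a", "c", "b", "a"]},
    index=[f"cell_{i}" for i in range(6)],
)

grouped = split_by_group(X_pca, obs, "cluster")
for label, sub_matrix in grouped.items():
    print(label, sub_matrix.shape)
```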

adata_query.format_data

These functions may seem trivial, but they add flexibility to more complex workflows.

For some data stored as np.ndarray, the default call returns an np.ndarray:

import adata_query

data = adata_query.format_data(data) # returns np.ndarray

To convert the same data to a torch.Tensor on the CPU:

import adata_query

data = adata_query.format_data(data, torch = True, device = "cpu") # torch.Tensor on cpu

To convert it to a torch.Tensor on the GPU, if one is available:

import adata_query

data = adata_query.format_data(data, torch = True) # torch.Tensor on gpu, if available

# torch.Tensor can also be explicitly assigned to a specific device
data = adata_query.format_data(data, torch = True, device = "cuda:0")

# Apple Silicon also works and will be automatically detected
data = adata_query.format_data(data, torch = True, device = "mps:0")
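
Automatic device detection of this kind usually comes down to a short priority check. The sketch below shows one plausible ordering (CUDA first, then Apple Silicon's MPS backend, then CPU). The function name `pick_device` and the ordering are assumptions for illustration; the library's actual logic may differ.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device string from availability flags.

    Hypothetical helper; the priority order is an assumption.
    In practice the flags would come from PyTorch, for example
    torch.cuda.is_available() and torch.backends.mps.is_available().
    """
    if cuda_available:
        return "cuda:0"
    if mps_available:
        return "mps:0"
    return "cpu"


# Flags are passed in explicitly so the logic can be read and checked
# on a machine without PyTorch or an accelerator.
print(pick_device(cuda_available=False, mps_available=True))
```

Keeping the decision in a pure function like this makes the fallback behavior easy to check without special hardware.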

adata_query.locate

I don't anticipate this function being widely used beyond its role in adata_query.fetch. Given a key, it returns the name of the adata attribute in which that key is stored.

import adata_query

key = "X_pca"

attr_key = adata_query.locate(adata, key = key) # attr_key = "obsm"
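
The lookup can be pictured as a search over the mapping-like attributes of adata. The following is a rough sketch of that idea on a stand-in object, not the library's code: `locate_key`, the list of attributes searched, and the error handling are all assumptions.

```python
from types import SimpleNamespace


def locate_key(adata_like, key, attrs=("obsm", "varm", "layers", "uns")):
    """Return the name of the attribute whose mapping contains `key`.

    Hypothetical helper; the searched attributes are an assumption.
    """
    matches = [attr for attr in attrs if key in getattr(adata_like, attr, {})]
    if not matches:
        raise KeyError(f"{key!r} not found in any of {attrs}")
    if len(matches) > 1:
        raise ValueError(f"{key!r} is ambiguous; found in {matches}")
    return matches[0]


# Stand-in for an AnnData object, using plain dicts for each attribute.
toy = SimpleNamespace(
    obsm={"X_pca": [[0.1, 0.2], [0.3, 0.4]]},
    varm={},
    layers={"counts": [[1, 0], [0, 2]]},
    uns={},
)

print(locate_key(toy, "X_pca"))
print(locate_key(toy, "counts"))
```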

Example notebook

Try some examples in Google Colab:


