Quick Start

This guide offers a brief overview of the library's functionality.

Install the library

Install adata_query from PyPI with pip, or install the development version from GitHub:

# Install via pip
pip install adata_query

# Install the developer version via GitHub
git clone https://github.com/mvinyard/AnnDataQuery.git; cd ./AnnDataQuery;
pip install -e .

AnnData

This package works on data that has already been loaded. It assumes a typical AnnData object (adata), as created with the AnnData package or Scanpy.

import anndata

h5ad_path = "/path/to/your/adata.h5ad"

adata = anndata.read_h5ad(h5ad_path)

Once you have some data, you are ready to interface with adata_query.

`adata_query.fetch`

This is probably the most useful function in the library, and it relies on the two functions described further down. In short, it takes a string key and returns the matching matrix from adata. It can also return the data in grouped fashion, split by a column of adata.obs using the pandas groupby operation.

import adata_query

key = "X_pca" # stored in adata.obsm

data = adata_query.fetch(adata = adata, key = key)

import adata_query

key = "X_pca" # stored in adata.obsm
groupby = "cluster" # cell annotation in adata.obs

data = adata_query.fetch(
    adata = adata,
    key = key,
    groupby = groupby,
)

In this grouped example, the returned data is a Python list.
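
To make the grouped case concrete, the sketch below reproduces the idea with plain NumPy and pandas: the rows of a matrix are split according to a column of annotations, giving one array per group. It only illustrates the groupby concept, under the assumption that groups come back sorted by label. It does not show the internals of adata_query.fetch, and the names obs and X_pca are stand-ins for adata.obs and adata.obsm["X_pca"].

```python
import numpy as np
import pandas as pd

# Stand-ins for adata.obs and adata.obsm["X_pca"]
obs = pd.DataFrame({"cluster": ["b", "a", "b", "a", "c"]})
X_pca = np.arange(10, dtype=float).reshape(5, 2)

# groupby(...).indices maps each group label to its integer row positions;
# sorting the items orders the groups by label
grouped = [
    X_pca[row_idx]
    for _, row_idx in sorted(obs.groupby("cluster").indices.items())
]

print(type(grouped).__name__, [g.shape for g in grouped])
```

Each element of grouped holds the rows belonging to one cluster, so downstream code can loop over groups without re-indexing the full matrix.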

`adata_query.format_data`

This function may seem trivial, but it becomes useful for adding flexibility to more complex workflows.

For data stored as an np.ndarray, the default call returns an np.ndarray:

import adata_query

data = adata_query.format_data(data) # returns np.ndarray

The same data can be returned as a torch.Tensor on a chosen device, here the CPU:

import adata_query

data = adata_query.format_data(data, torch = True, device = "cpu") # torch.Tensor on cpu

If no device is given, the tensor is placed on a GPU when one is available:

import adata_query

data = adata_query.format_data(data, torch = True) # torch.Tensor on gpu, if available

# torch.Tensor can also be explicitly assigned to a specific device
data = adata_query.format_data(data, torch = True, device = "cuda:0")

# Apple Silicon also works and will be automatically detected
data = adata_query.format_data(data, torch = True, device = "mps:0")
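
The device comments above suggest a fallback order when no device is given. A minimal sketch of such auto-detection follows, assuming the order CUDA, then Apple's MPS backend, then CPU. The helper name pick_device is hypothetical, and adata_query may resolve devices differently.

```python
def pick_device() -> str:
    """Return "cuda:0", "mps:0", or "cpu" (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed, fall back to the CPU label
        return "cpu"

    if torch.cuda.is_available():
        return "cuda:0"

    # torch.backends.mps is absent in older PyTorch releases
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps:0"

    return "cpu"


device = pick_device()
print(device)
```

The returned string can be passed wherever a device argument is expected, which keeps the same script usable on machines with and without a GPU.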

`adata_query.locate`

This function returns the name of the adata attribute in which a given key is stored. I don't anticipate it being widely used beyond its role in adata_query.fetch.

import adata_query

key = "X_pca"

attr_key = adata_query.locate(adata, key = key) # attr_key = "obsm"
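
Conceptually, locating a key amounts to checking each keyed container on the AnnData object for that key. The sketch below shows the idea on a stand-in object built from the standard library. The function locate_key, the list of attributes searched, and the error behaviour are assumptions for illustration, not the actual implementation of adata_query.locate.

```python
from types import SimpleNamespace

# Stand-in for an AnnData object: each attribute holds a keyed container
fake_adata = SimpleNamespace(
    obs={"cluster": ["a", "b"]},
    obsm={"X_pca": [[0.1, 0.2], [0.3, 0.4]]},
    layers={},
    uns={},
)


def locate_key(adata, key, attrs=("obs", "obsm", "layers", "uns")):
    """Return the name of the single attribute that contains `key`."""
    hits = [attr for attr in attrs if key in getattr(adata, attr, {})]
    if len(hits) != 1:
        # Zero hits (missing key) or several hits (ambiguous key)
        raise KeyError(f"{key!r} found in {len(hits)} attributes: {hits}")
    return hits[0]


print(locate_key(fake_adata, "X_pca"))  # obsm
```

Raising on ambiguous keys is a design choice in this sketch: it avoids silently picking one container when the same name appears in several.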

Example notebook

Try some examples in Google Colab:

Overview of common use-cases
