--- title: "Bioactivity API" author: "Center for Computational Toxicology and Exposure" output: prettydoc::html_pretty: theme: architect toc: yes toc_depth: 5 vignette: > %\VignetteIndexEntry{Bioactivity} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{css, echo=FALSE} .scroll-200 { max-height: 200px; overflow-y: auto; } .noticebox { padding: 1em; background: lightgray; color: blue; border: 2px solid black; border-radius: 10px; } ``` ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(httptest) start_vignette("4") ``` ```{r setup, echo=FALSE, message=FALSE, warning=FALSE} if (!library(ccdR, logical.return = TRUE)){ devtools::load_all() } ``` ```{r setup-print, echo = FALSE} # Redefining the knit_print method to truncate character values to 25 characters # in each column and to truncate the columns in the print call to prevent # wrapping tables with several columns. #library(ccdR) knit_print.data.table = function(x, ...) { y <- data.table::copy(x) y <- y[, lapply(.SD, function(t){ if (is.character(t)){ t <- strtrim(t, 25) } return(t) })] print(y, trunc.cols = TRUE) } registerS3method( "knit_print", "data.table", knit_print.data.table, envir = asNamespace("knitr") ) # helper function for printing printFormattedTable <- function(res_dt, widen = c()) { DT::datatable(res_dt, style = 'bootstrap', class = 'table-bordered table-condensed', rownames = FALSE, options = list(scrollX = TRUE, autoWidth = TRUE, dom = 't', columnDefs = list(list(width = '1250px', targets = widen)))) } ``` # Introduction In this vignette, [CTX Bioactivity API](https://api-ccte.epa.gov/docs/bioactivity.html) will be explored. ::: {.noticebox data-latex=""} **NOTE:** Please see the introductory vignette for an overview of the *ccdR* package and initial set up instruction with API key storage. ::: Data provided by the API's Bioactivity endpoints are sourced from ToxCast's invitrodb. US EPA's Toxicity Forecaster (ToxCast) program makes *in vitro* medium- and high-throughput screening assay data publicly available for prioritization and hazard characterization of thousands of chemicals. The [ToxCast pipeline (tcpl)](https://CRAN.R-project.org/package=tcpl) is an R package that manages, curve-fits, plots, and stores ToxCast data to populate its linked MySQL database, invitrodb . These assays comprise Tier 2-3 of the new Computational Toxicology Blueprint, and employ automated chemical screening technologies, to evaluate the effects of chemical exposure on living cells and biological macromolecules, such as proteins (Thomas et al., 2019). More information on the ToxCast program can be found at . This flexible analysis pipeline is capable of efficiently processing and storing large volumes of data. The diverse data, received in heterogeneous formats from numerous vendors, are transformed to a standard computable format and loaded into the invitrodb database by vendor-specific R scripts. Once data is loaded into the database, ToxCast utilizes generalized processing functions provided in this package to process, normalize, model, qualify, and visualize the data.
![Figure 1: Conceptual overview of the ToxCast Pipeline functionality](Pictures/Fig1_tcpl_overview.png)
The Bioactivity API endpoints are organized into two different resources, "Assay" and "Data". "Assay" resource endpoints provide assay metadata for specific or all invitrodb 'aeids' (assay endpoint ids). These include annotations from invitrodb's assay, assay_component, assay_component_endpoint, assay_list, assay_source, and gene tables, all returned in a by-aeid format. "Data" resource endpoints are split into summary data (by 'aeid') and bioactivity data by 'm4id' (i.e. both 'aeid' and 'spid'). The summary endpoint returns the number of active hits and total multi- and single-concentration chemicals tested for specific 'aeids'. The other endpoints return chemical information, level 3 concentration-response values, level 4 fit parameters, level 5 hit parameters, and level 6 flags for individual chemicals tested for given 'AEIDs', 'm4ids', 'SPIDs', or 'DTXSIDs'. Regular ToxCast/invitrodb users may find it easier to use [tcpl](https://CRAN.R-project.org/package=tcpl), which has integrated ccdR's bioactivity functions to make the API data retrievable in a familiar format. See the [tcpl](https://CRAN.R-project.org/package=tcpl) vignette regarding data retrieval via API for more information. # Functions Several ccdR functions are used to access the CTX Bioactivity API data. ## Bioactivity Assay Resource Specific assays may be searched as well as all available assays that have data using two different functions. ### Get annotation by aeid `get_annotation_by_aeid()` retrieves annotation for a specific assay endpoint id (aeid). ```{r ccdR annotation by aeid, message=FALSE} assay <- get_annotation_by_aeid(AEID = "891") ``` ```{r, echo=FALSE} printFormattedTable(assay, c(4, 18, 33, 51)) # printed using custom formatted table ``` `get_annotation_by_aeid_batch()` retrieves annotation for a list (or vector) of assay endpoint ids (aeids). ```{r ccdR annotation by aeid batch, message=FALSE} assays <- get_annotation_by_aeid_batch(AEID = c(759,700,891)) # return is in list form by aeid, convert to table for output assays <- data.table::rbindlist(assays) ``` ```{r ccdR-all-assays, message=FALSE, eval=FALSE} printFormattedTable(assays, c(4, 18, 19, 33, 51)) # printed using custom formatted table ``` ### Get all assay annotations `get_all_assays()` retrieves all annotations for all assays available. ```{r ccdR all assays, message=FALSE, eval=FALSE} all_assays <- get_all_assays() ``` ## Bioactivity Data Resource There are several resources for retrieving bioactivity data associated with a variety of identifier types (e.g., DTXSID, aeid) that are available to the user. ### Get summary data `get_bioactivity_summary()` retrieves a summary of the number of active hits compared to the total number tested for both multiple and single concentration by aeid. ```{r ccdR summary by aeid, message=FALSE} summary <- get_bioactivity_summary(AEID = "891") ``` ```{r, echo=FALSE} printFormattedTable(summary) # printed using custom formatted table ``` `get_bioactivity_summary_batch()` retrieves a summary for a list (or vector) of assay endpoint ids (aeids). ```{r ccdR summary by aeid batch, message=FALSE} summary <- get_bioactivity_summary_batch(AEID = c(759,700,891)) summary <- data.table::rbindlist(summary) ``` ```{r, echo=FALSE} printFormattedTable(summary) # printed using custom formatted table ``` ### Get data `get_bioactivity_details()` can retrieve all available multiple concentration data by assay endpoint id (aeid), sample id (spid), Level 4 ID (m4id), or chemical DTXSID. Returned is chemical information, level 3 concentration-response values, level 4 fit parameters, level 5 hit parameters, and level 6 flags for individual chemicals tested. An example for each request parameter is provided below: ```{r ccdR data by spid, message=FALSE, results = FALSE} # By spid spid_data <- get_bioactivity_details(SPID = 'TP0000904H05') ``` ```{r, echo=FALSE} printFormattedTable(head(spid_data), c(ncol(spid_data)-2)) # printed using custom formatted table ``` ```{r ccdR data by m4id, message=FALSE, results = FALSE} # By m4id m4id_data <- get_bioactivity_details(m4id = 739695) ``` ```{r, echo=FALSE} printFormattedTable(m4id_data, c(ncol(m4id_data) - 2)) # printed using custom formatted table ``` ```{r ccdR data by dtxsid, message=FALSE, results = FALSE} # By DTXSID dtxsid_data <- get_bioactivity_details(DTXSID = "DTXSID30944145") ``` ```{r, echo=FALSE} printFormattedTable(dtxsid_data, c(ncol(dtxsid_data)-2)) # printed using custom formatted table ``` ```{r ccdR data by aeid, message=FALSE, results = FALSE} # By aeid aeid_data <- get_bioactivity_details(AEID = 704) ``` ```{r, echo=FALSE} printFormattedTable(head(aeid_data), c(ncol(aeid_data)-2)) # printed using custom formatted table ``` Similar to the other `_batch` functions, `get_bioactivity_details_batch()` retrieves data for a list (or vector) of assay endpoint ids (aeid), sample ids (spid), Level 4 IDs (m4id), or chemical DTXSIDs. ```{r ccdR data by aeid batch, message=FALSE, eval=FALSE} aeid_data_batch <- get_bioactivity_details_batch(AEID = c(759,700,891)) aeid_data_batch <- data.table::rbindlist(aeid_data_batch, fill = TRUE) ``` # Conclusion In this vignette, a variety of functions that access different types of data found in the `Bioactivity` endpoints of the CTX APIs were listed. We encourage the reader to explore the data accessible through these endpoints work with it to get a better understanding of what data is available. Additionally, experienced ToxCast/invitrodb users may find it easier to continue to use [tcpl](https://CRAN.R-project.org/package=tcpl), which has integrated ccdR's bioactivity functions to make the API data retrievable in a familiar format. ```{r breakdown, echo = FALSE, results = 'hide'} # This chunk will be hidden in the final product. It serves to undo defining the # custom print function to prevent unexpected behavior after this module during # the final knitting process knit_print.data.table = knitr::normal_print registerS3method( "knit_print", "data.table", knit_print.data.table, envir = asNamespace("knitr") ) ``` ```{r, include=FALSE} end_vignette() ```