Software Application Profile: ShinyDataSHIELD—an R Shiny application to perform federated non-disclosive data analysis in multicohort studies
Paper Details
Published: 2022/10/27
Journal: International Journal of Epidemiology
DOI: https://doi.org/10.1093/ije/dyac201
Motivation
DataSHIELD is an open-source software infrastructure enabling the analysis of data distributed across multiple databases (federated data) without leaking individuals’ information (non-disclosive). It has applications in many scientific domains, ranging from biosciences to social sciences and including high-throughput genomic studies. R is the language used to interact with (and build) DataSHIELD. This creates difficulties for researchers who do not have experience writing R code or lack the time to learn how to use the DataSHIELD functions. To help new researchers use the DataSHIELD infrastructure and to improve the user-friendliness for experienced researchers, we present ShinyDataSHIELD.
Implementation
ShinyDataSHIELD is a web application with an R backend that serves as a graphical user interface (GUI) to the DataSHIELD infrastructure.
General features
The version of the application presented here includes modules to perform: (i) exploratory analysis through descriptive summary statistics and graphical representations (scatter plots, histograms, heatmaps and boxplots); (ii) statistical modelling (generalized linear fixed and mixed-effects models, survival analysis through Cox regression); (iii) genome-wide association studies (GWAS); and (iv) omic analysis (transcriptomics, epigenomics and multi-omic integration).
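The federated, non-disclosive principle behind the descriptive statistics in (i) can be illustrated with a toy sketch. It is written in Python for brevity and is not the DataSHIELD R API. The names `SiteSummary`, `summarize_locally` and `pooled_mean`, the threshold `MIN_RECORDS` and the cohort values are all hypothetical. The sketch only assumes that each site returns aggregates rather than individual records.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Hypothetical disclosure threshold: a site refuses to answer when it holds
# fewer records than this. The value is an assumption for illustration.
MIN_RECORDS = 3


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate returned by one site; no individual-level values leave the site."""
    total: float
    count: int


def summarize_locally(values: Sequence[float]) -> Optional[SiteSummary]:
    """Runs inside a site: reduce the local records to a sum and a count."""
    if len(values) < MIN_RECORDS:
        return None  # too few records, so even the aggregate could be disclosive
    return SiteSummary(total=float(sum(values)), count=len(values))


def pooled_mean(summaries: List[Optional[SiteSummary]]) -> float:
    """Runs at the analyst's side: combine per-site aggregates into one mean."""
    usable = [s for s in summaries if s is not None]
    if not usable:
        raise ValueError("no site returned a usable summary")
    return sum(s.total for s in usable) / sum(s.count for s in usable)


if __name__ == "__main__":
    # Made-up values standing in for three cohorts held on separate servers.
    cohort_a = [5.1, 4.8, 5.6, 5.0]
    cohort_b = [6.2, 5.9, 6.1]
    cohort_c = [4.0]  # below the threshold, so this site declines to answer
    replies = [summarize_locally(c) for c in (cohort_a, cohort_b, cohort_c)]
    print(pooled_mean(replies))
```

The analyst-side function sees only one sum and one count per site, so the pooled mean is obtained without any cohort sharing its raw data.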
Availability
ShinyDataSHIELD is publicly hosted online [https://datashield-demo.obiba.org/]. The source code and user guide are deposited on Zenodo (DOI 10.5281/zenodo.6500323) and are freely available to non-commercial users under the ‘Commons Clause’ License Condition v1.0. Docker images are also available [https://hub.docker.com/r/brgelab/shiny-data-shield].