Welcome to ShinySC’s documentation

Applied Bioinformatics Laboratory@CGBS

ShinySC is a desktop application designed for analyzing scRNA-seq data and automatically annotating cell types.

💡

ShinySC

Analyzes scRNA-seq data with ease

Integrates Seurat, visual analytics, and annotation algorithms (e.g., ScType, scCATCH, SingleR, and GPTCelltype)

Available for Windows, Mac, and Linux as a desktop or web application

💡

ShinySC Desktop Edition for Windows (Download)

💡

ShinySC Web Edition (ShinySC web)

Table of Contents

1. Installing ShinySC

Windows

Download ShinySC for Windows (link)

Unzip the file

Install using ShinySC_ver.exe

Mac & Linux

Install Docker Desktop (link)

Pull the shinysc image

Run a new container

Wait until the service is ready, then access ShinySC via the link

ShinySC graphical interface

Stop, restart, and delete the ShinySC container

Web (No installation required)

We also offer a web version of ShinySC using Docker and ShinyProxy. Due to network limitations, the file upload size is capped at 400 MB. For larger files or greater security, we recommend the desktop application.

ShinySC web

2. Implementation

2-1 The framework of ShinySC

ShinySC consists of nine modules:
1. Data Upload: supports a wide range of input formats.
2. Quality Control: filters cells and genes based on customizable thresholds.
3. Feature Selection and Dimension Reduction: identifies highly variable genes and applies PCA, UMAP, or t-SNE for visualization.
4. Clustering: uses graph-based algorithms with adjustable resolution settings and visual outputs.
5. Gene Marker Identification: detects cluster-specific markers.
6. Automatic Cell Type Annotation: integrates reference-based, marker-based, and GPT-4-based strategies for label assignment.
7. Batch Correction: removes batch effects for integrated analysis across datasets.
8. Differential Gene Expression: compares cell clusters or experimental conditions.
9. Trajectory Inference: reconstructs dynamic developmental or differentiation pathways.

2-2 Input

Supported formats

No.	Platform/Software	Data format
1	10X Genomics (Cell Ranger prior to v3.0)	MEX format: `.zip` containing barcodes.tsv, genes.tsv, matrix.mtx
2	10X Genomics (Cell Ranger v7.0 and later)	MEX format: `.tar.gz` containing barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz
3	10X Genomics (Cell Ranger v3.0 and later)	HDF5 format: `.h5`
4	Seurat object	`.rds`
5	Scanpy	Annotated Data Array: `.h5ad`
6	BD Rhapsody	`.csv`
7	CellView	`.Rds`
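
As a rough illustration of how an uploaded file could be routed by type, the sketch below maps a file name to one of the formats in the table. The function name `guess_input_format` and the returned labels are hypothetical and are not taken from ShinySC. The readers mentioned in the comments (`Seurat::Read10X()`, `Seurat::Read10X_h5()`, `readRDS()`, `read.csv()`) are the usual R functions for these formats, but whether ShinySC calls them in this way is an assumption.

```r
# Hypothetical helper: classify an uploaded file by its extension.
guess_input_format <- function(path) {
  name <- tolower(basename(path))  # makes ".Rds" and ".rds" equivalent
  if (grepl("\\.tar\\.gz$", name)) return("10x MEX (gzipped)")  # extract, then Seurat::Read10X()
  if (grepl("\\.zip$", name))      return("10x MEX (legacy)")   # unzip, then Seurat::Read10X()
  if (grepl("\\.h5ad$", name))     return("AnnData h5ad")
  if (grepl("\\.h5$", name))       return("10x HDF5")           # Seurat::Read10X_h5()
  if (grepl("\\.rds$", name))      return("R object (Seurat or CellView)")  # readRDS()
  if (grepl("\\.csv$", name))      return("BD Rhapsody csv")    # read.csv()
  "unsupported"
}

guess_input_format("filtered_feature_bc_matrix.h5")
```

Matching on the extension alone cannot tell a Seurat `.rds` from a CellView `.Rds`; the object would have to be inspected after loading.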

2-3 Quality Control

Performing QC involves filtering out low-quality cells based on the number of genes detected (Gene Count), the number of UMIs (Count Depth), and the percentage of mitochondrial gene expression (Mito. gene expression). Visualization tools like violin plots help determine appropriate filtering thresholds.

In addition to violin plots, ShinySC provides histograms that show the distribution of each QC metric across cells, which helps identify outliers and set filtering thresholds. Each filter draws dynamic dashed lines that mark the current filtering range, and cells outside the range are faded to grey for clarity.

2-4 Feature Selection

Once QC is completed, the data undergoes normalization, variable feature identification, and scaling for subsequent analyses. ShinySC offers two normalization methods: LogNormalize and SCTransform.
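
To make the LogNormalize option concrete, here is a minimal base-R sketch of the transformation as Seurat documents it: each cell's counts are divided by that cell's total, multiplied by a scale factor (10,000 by default in Seurat), and passed through log(1 + x). The toy matrix and the function name `log_normalize` are illustrative assumptions. SCTransform uses a different, model-based approach and is not sketched here.

```r
# Toy count matrix: genes in rows, cells in columns.
counts <- matrix(c(1, 0, 3,
                   4, 2, 0),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("cell1", "cell2")))

log_normalize <- function(counts, scale_factor = 1e4) {
  totals <- colSums(counts)
  # Divide each column (cell) by its total, scale, then apply log(1 + x).
  log1p(sweep(counts, 2, totals, "/") * scale_factor)
}

norm <- log_normalize(counts)
```

Zero counts stay at zero after the transform, so the sparsity of the matrix is preserved.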

Feature selection involves choosing the most variable genes based on expression, retaining meaningful biological information, and excluding random noise. This preserves significant structure and improves computational efficiency for further analysis.

The selected variable genes are then passed to PCA, which reduces dimensionality, captures the main sources of variation, and makes subsequent analysis more efficient.

💡

Determination of optimal clustering resolution

Seurat's FindClusters function lets users rerun clustering at different values of the resolution parameter, but Seurat has no built-in visualization for exploring how clusters change across resolutions. ShinySC fills this gap by incorporating Clustree, which visualizes cluster evolution across resolutions ① and helps users select an appropriate resolution for the subsequent clustering step ②③.
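
The idea of comparing resolutions can be sketched in base R. Seurat stores one metadata column per tested resolution (for the RNA assay these are typically named `RNA_snn_res.<value>`), and a simple first check is how many clusters each resolution yields. The toy metadata and the helper name `clusters_per_resolution` are assumptions for illustration; this is not how ShinySC or Clustree is implemented.

```r
# Toy metadata: one column of cluster labels per tested resolution.
meta <- data.frame(
  RNA_snn_res.0.2 = c(0, 0, 0, 1, 1, 1),
  RNA_snn_res.0.4 = c(0, 0, 1, 2, 2, 2),
  RNA_snn_res.0.6 = c(0, 0, 1, 2, 2, 2),
  RNA_snn_res.1.0 = c(0, 1, 2, 3, 3, 4)
)

clusters_per_resolution <- function(meta, prefix = "RNA_snn_res.") {
  cols <- grep(prefix, names(meta), fixed = TRUE, value = TRUE)
  n <- vapply(meta[cols], function(x) length(unique(x)), integer(1))
  names(n) <- sub(prefix, "", cols, fixed = TRUE)
  n
}

n_clusters <- clusters_per_resolution(meta)

# With Seurat and clustree installed, the equivalent calls on a real object are:
#   obj <- Seurat::FindClusters(obj, resolution = c(0.2, 0.4, 0.6, 1.0))
#   clustree::clustree(obj, prefix = "RNA_snn_res.")
```

In this toy input, resolutions 0.4 and 0.6 give the same number of clusters, which is the kind of plateau a Clustree plot makes visible.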

2-5 Clustering

After normalization, scaling, feature selection, PCA, and clustering, the cells are visualized with UMAP and t-SNE, two non-linear dimensionality reduction techniques that project complex gene expression patterns in scRNA-seq data into two dimensions. These plots help researchers understand cellular composition and heterogeneity. ShinySC offers both UMAP and t-SNE visualizations with interactive controls for toggling labels and adjusting label size, plot width, and plot height.

2-6 Find Gene Markers

FindAllMarkers is a function in the Seurat package designed to identify marker genes that distinguish each cluster in single-cell RNA-seq data by performing differential expression testing and identifying genes significantly expressed in one cluster compared to all others.

ShinySC provides a GUI for FindAllMarkers, enabling users to select a differential expression test from a drop-down menu ①, toggle positive-only markers on or off ②, set thresholds for minimum cell expression fraction and log fold change ③④, and use a slider to select the top N markers per cluster ⑤.

An interactive table ⑥ with filtering and sorting functionalities displays all gene markers identified by Seurat's FindAllMarkers⑨. A dedicated table ⑦ presents the top N gene markers for each cluster. These markers are selectively filtered based on the percentage difference ④ in gene expression between clusters and ranked by average log fold change (avg_logFC). This table is dynamically connected to a UMAP visualization ⑧, showing marker gene expression levels in each cell cluster.
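
The top-N selection described above can be sketched in base R on a table shaped like the output of FindAllMarkers. The toy values, the helper name `top_markers`, and the 0.25 cutoff are illustrative assumptions. Recent Seurat versions name the fold-change column `avg_log2FC` rather than `avg_logFC`, which is assumed here.

```r
# Toy table mimicking the columns returned by Seurat's FindAllMarkers().
markers <- data.frame(
  cluster    = c("0", "0", "0", "1", "1"),
  gene       = c("CD3D", "IL7R", "ACTB", "MS4A1", "CD79A"),
  avg_log2FC = c(2.1, 1.4, 0.3, 3.0, 2.5),
  pct.1      = c(0.90, 0.70, 0.99, 0.85, 0.80),
  pct.2      = c(0.20, 0.30, 0.98, 0.05, 0.10),
  stringsAsFactors = FALSE
)

top_markers <- function(markers, n = 2, min_pct_diff = 0.25) {
  # pct.diff: fraction of expressing cells inside minus outside the cluster.
  markers$pct.diff <- markers$pct.1 - markers$pct.2
  kept <- markers[markers$pct.diff >= min_pct_diff, ]
  kept <- kept[order(kept$cluster, -kept$avg_log2FC), ]
  # Keep the first n rows of each cluster after sorting by fold change.
  rank_in_cluster <- ave(seq_len(nrow(kept)), kept$cluster, FUN = seq_along)
  kept[rank_in_cluster <= n, ]
}

top <- top_markers(markers, n = 1)
```

Here ACTB is dropped because it is expressed in almost every cell of both groups, even though it is highly expressed in cluster 0.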

In addition to the interactive table, ShinySC features dot matrices ⑩ that offer a clear and intuitive visual representation of potential gene markers across different clusters.

Users can customize the k-value ⑪, adjust column and row label sizes, rotate label text, and modify the display's width and height to further enhance the visualization of expression patterns and gene markers ⑫.

2-7 Automatic cell-type annotation

Manual cell-type annotation is time-consuming, error-prone, and dependent on researcher expertise. In contrast, automatic annotation can process thousands of cells in minutes, delivering consistent, reproducible results. ShinySC integrates tools like ScType, scCATCH, SingleR and GPTCelltype, streamlining the annotation process. This allows researchers to obtain accurate, data-driven insights and focus on higher-level analysis and biological discovery, bypassing the tedious task of manual annotation.

2-7-1 Marker-Based Annotation

💡

ScType uses a curated list of marker genes to identify and annotate cell types, with 3,980 markers for 194 human cell types across 17 tissues and 4,212 markers for mouse cell types.

ShinySC provides a drop-down menu ① for choosing the tissue type that matches the sample. In addition to the display functions of the ScType command-line R package ②③, ShinySC offers tables ④ and charts ⑤ that summarize cell-type-specific marker genes and their distribution, complementing ScType's capabilities.

💡

scCATCH automates cell-type identification by detecting cluster marker genes and annotating them with evidence-based scores using the tissue-specific cell taxonomy reference database, CellMatch. CellMatch provides comprehensive data on 353 cell types and 686 subtypes across 184 tissue types, supported by 2,096 references from human and mouse studies. Notably, it also allows users to select different combinations of tissues for annotation.

While scCATCH allows selection of tissue combinations for annotation, it lacks an interface for selecting gene markers from 353 cell types across 184 tissue types. To enhance usability, ShinySC uses cascade filters ①② for streamlined gene marker selection and provides an interactive table categorizing reference-supported markers ③.

ShinySC features a radio button ④ to choose between Seurat's FindAllMarkers and scCATCH's algorithm for identifying highly expressed gene markers. The identified cell-type-related markers ⑤ are displayed with supported references (PubMed IDs)⑥, highlighted in UMAP visualizations, and their expression levels shown across cell types ⑦.

2-7-2 Reference-based annotation

💡

SingleR is a reference-based tool for single-cell RNA-seq analysis, supporting the Human Primary Cell Atlas, Blueprint/ENCODE, Mouse RNA-Seq, and the Immunological Genome Project, as well as custom references. It primarily handles human and mouse data, using correlation-based methods to efficiently assign cell identities.

ShinySC streamlines SingleR analysis with a drop-down menu for selecting human or mouse datasets along with their corresponding references. Linking each organism to its valid references minimizes manual matching, reduces errors, and makes cell-type annotation faster and more accessible for users of all experience levels.

SingleR Reference Datasets

Reference Dataset	Organism	Source	Coverage	Description	Access in R
HumanPrimaryCellAtlasData (HPCA)	Human	Microarray data of purified human primary cells	Broad set of immune & non-immune primary cell types	General annotation across major cell categories (T cells, B cells, epithelial, fibroblasts, etc.)	`library(celldex); hpca <- HumanPrimaryCellAtlasData()`
BlueprintEncodeData	Human	RNA-seq from the BLUEPRINT & ENCODE consortia	Primarily blood & immune cells, plus some non-immune cell types	Studies focused on human blood & immunology	`library(celldex); blueprint <- BlueprintEncodeData()`
DatabaseImmuneCellExpressionData (DICE)	Human	DICE database (microarray or RNA-seq) of purified human immune populations	Many subtypes of T cells, B cells, monocytes, dendritic cells, etc.	Projects needing finer resolution of immune subtypes	`library(celldex); dice <- DatabaseImmuneCellExpressionData()`
MonacoImmuneData	Human	RNA-seq data of sorted immune populations (Monaco Lab)	Granular coverage of multiple immune cell types	Detailed separation of subtle immune subsets	`library(celldex); monaco <- MonacoImmuneData()`
NovershternHematopoieticData	Human	Microarray data from Novershtern et al. (hematopoiesis)	Mature hematopoietic lineages, progenitors, etc.	Hematopoiesis-related scRNA-seq studies	`library(celldex); novershtern <- NovershternHematopoieticData()`
MouseRNAseqData	Mouse	Bulk RNA-seq from multiple mouse tissues/cell types	Wide range of mouse tissues & cell types	General mouse single-cell annotation	`library(celldex); mouse_ref <- MouseRNAseqData()`
ImmGenData	Mouse	Immunological Genome Project (ImmGen), focusing on mouse immune cells	Detailed coverage of mouse immune cell populations	In-depth annotation of mouse immune subsets	`library(celldex); immgen <- ImmGenData()`
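
The organism-to-reference pairing in the table can be expressed as a small lookup, sketched below. The names `singler_refs` and `check_reference` are hypothetical and only illustrate the kind of validation a drop-down menu performs. The commented `SingleR()` call uses the package's documented `test`, `ref`, and `labels` arguments, with `sce` standing in for the user's own data.

```r
# Organism-to-reference lookup taken from the table above.
singler_refs <- list(
  Human = c("HumanPrimaryCellAtlasData", "BlueprintEncodeData",
            "DatabaseImmuneCellExpressionData", "MonacoImmuneData",
            "NovershternHematopoieticData"),
  Mouse = c("MouseRNAseqData", "ImmGenData")
)

check_reference <- function(organism, reference) {
  choices <- singler_refs[[organism]]
  if (is.null(choices)) stop("Unknown organism: ", organism)
  if (!reference %in% choices) {
    stop(reference, " is not a ", organism, " reference")
  }
  invisible(reference)
}

check_reference("Human", "NovershternHematopoieticData")

# With celldex and SingleR installed, the annotation call would look like:
#   ref  <- celldex::NovershternHematopoieticData()
#   pred <- SingleR::SingleR(test = sce, ref = ref, labels = ref$label.main)
# where `sce` is a SingleCellExperiment or expression matrix (assumed here).
```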

2-7-3 GPT-based annotation

💡

GPTCelltype, powered by GPT-4, offers cost-efficient cell type annotation, eliminates the need to manually gather gene marker reference datasets, and has the potential to integrate with single-cell analysis pipelines like Seurat.

Despite GPT-4's capability to accurately annotate cell types using marker gene information, the standard single-cell RNA-seq workflow for identifying these markers is missing in the GPTCelltype package. ShinySC ensures completeness by providing the full workflow up to the FindAllMarkers step ①②, making the entire process seamless and comprehensive.

After identifying gene markers for each cluster ②, users can click the "Annotate with GPT-4o" button ③ to perform cell type annotation. Note that GPTCelltype requires an OpenAI API key ④ for this function ⑤⑥⑦. We recommend using the desktop version of ShinySC and advise against sharing your API key or uploading it to public spaces to ensure security.
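
For users who also work with GPTCelltype from an R session, one way to follow the advice above is to keep the key in an environment variable instead of typing it into scripts. The sketch assumes the variable name `OPENAI_API_KEY`, which GPTCelltype's README uses; the helper names are hypothetical, and how ShinySC handles the key internally is not described here.

```r
# Read the key from the environment instead of hard-coding it in a script.
get_openai_key <- function(env_var = "OPENAI_API_KEY") {
  key <- Sys.getenv(env_var, unset = "")
  if (!nzchar(key)) stop("Set ", env_var, " before running GPT-based annotation")
  key
}

# Mask a key for display in logs or a UI: keep only the last 4 characters.
mask_key <- function(key) {
  n <- nchar(key)
  if (n <= 4) return(strrep("*", n))
  paste0(strrep("*", n - 4), substr(key, n - 3, n))
}

mask_key("sk-example-1234")
```

Storing the key in a user-level `.Renviron` file keeps it out of project folders that might be shared or committed to version control.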

💡

To register for an OpenAI API key, follow these steps:
1. Sign Up or Log In:
  - Visit the OpenAI website.
  - Click on "Sign Up" to create a new account or "Log In" if you already have an account.
1. Navigate to API Section:
  - After logging in, go to the API section. This is usually found in the dashboard or in the account settings menu.
1. Create a New API Key:
  - Look for an option to create a new API key. This might be labeled as "New API Key," "Generate API Key," or similar.
  - Click on this option and follow the prompts to generate a new key.
1. Copy and Secure Your API Key:
  - Once generated, copy the API key and store it securely. Do not share this key publicly or with unauthorized individuals.
1. Add Billing Information:
  - Ensure that your account has the necessary billing information set up. The API usage might require a credit card or other payment method to be linked to your account.
1. Use Your API Key:
  - Use this API key in your applications, such as ShinySC, to access GPT-4 capabilities.
For detailed instructions, you can refer to OpenAI's official documentation or support resources.

2-7-4 Summary of Cell-Type Annotations Across Annotation Packages

ShinySC provides a way to compare cell-type annotations from different methods, such as ScType, scCATCH, SingleR and GPTCelltype in a single platform. By integrating diverse approaches, it helps users cross-check classifications and get a clearer picture of cell identities ①. This makes it a useful tool for ensuring more reliable single-cell analysis.

ShinySC’s interactive summary table aggregates annotation results ② (e.g., SingleR, scCATCH) and includes an editable “Custom_Label” column ③. Double‐click to rename clusters, instantly updating UMAP plots. This streamlines label reconciliation and exporting final annotations.

2-8 Batch Effect Correction

💡

Technical variations (batch effects) across scRNA-seq experiments can mask real biological signals. Correcting these artifacts (“integration”) ensures cells from different datasets remain comparable and prevents downstream analyses from reflecting experimental bias instead of true cell differences.

The Batch Effect Correction module is accessed by clicking its icon in the left-side toolbar ①. Checking the “Use Demo Files” box ② automatically loads example PBMC datasets, enabling a quick test of the multi-dataset integration workflow. After uploading your own files or selecting these demos, clicking “Start Analysis with Uploaded Files” or “Start Analysis with Demo Files” ③ initiates processing and displays both unintegrated and integrated UMAP plots. ShinySC integrates datasets using Seurat’s FindIntegrationAnchors() and IntegrateData(), reducing batch effects and aligning cell populations, as visualized in the integrated UMAP ④. Finally, click “Download Integrated Data” ⑤ to obtain the batch-corrected Seurat object (.rds), ready for further exploration in R or other tools.
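
For readers who want to reproduce the integration step in an R session, the two Seurat functions named above are typically chained as in the sketch below. The wrapper name `integrate_batches`, the per-dataset preprocessing, and the defaults (2,000 features, 30 dimensions, CCA) are assumptions modeled on Seurat's anchor-based integration workflow, not ShinySC's exact settings. The function is only defined here; calling it requires Seurat and a list of Seurat objects.

```r
# Sketch of anchor-based integration; `object_list` is a list of Seurat objects.
integrate_batches <- function(object_list, nfeatures = 2000, dims = 1:30) {
  if (!requireNamespace("Seurat", quietly = TRUE)) stop("Seurat is required")

  # Normalize each dataset and find its variable features independently.
  object_list <- lapply(object_list, function(x) {
    x <- Seurat::NormalizeData(x, verbose = FALSE)
    Seurat::FindVariableFeatures(x, selection.method = "vst",
                                 nfeatures = nfeatures, verbose = FALSE)
  })

  # Choose features that are variable across datasets, then find anchors.
  features <- Seurat::SelectIntegrationFeatures(object.list = object_list,
                                                nfeatures = nfeatures)
  anchors <- Seurat::FindIntegrationAnchors(object.list = object_list,
                                            anchor.features = features,
                                            reduction = "cca", dims = dims)

  # Returns a Seurat object with an "integrated" assay.
  Seurat::IntegrateData(anchorset = anchors, dims = dims)
}
```

The returned object can be saved with `saveRDS()`, which matches the `.rds` download offered by the module.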

2-9 Packages incorporated in ShinySC

Package Name	Description	Link
shiny	Web Application Framework for R	https://cran.r-project.org/package=shiny
bs4Dash	Bootstrap 4 'Dashboard' Theme for 'shiny'	https://cran.r-project.org/package=bs4Dash
shinyWidgets	Custom Inputs Widgets for Shiny	https://cran.r-project.org/package=shinyWidgets
shinyjs	Easily Improve the User Experience in Shiny Apps	https://cran.r-project.org/package=shinyjs
DT	A Wrapper of the jQuery 'DataTables' Library	https://cran.r-project.org/package=DT
data.table	Extension of data.frame	https://cran.r-project.org/package=data.table
Seurat	Tools for Single Cell Genomics	https://cran.r-project.org/package=Seurat
SeuratDisk	Interfaces for HDF5-Based Single Cell Storage Formats	https://github.com/mojaveazure/seurat-disk
patchwork	The Composer of Plots	https://cran.r-project.org/package=patchwork
plotly	Create Interactive Web Graphics via 'plotly.js'	https://cran.r-project.org/package=plotly
dplyr	A Grammar of Data Manipulation	https://cran.r-project.org/package=dplyr
waiter	Show Loading Screens with Progress Bars for Shiny	https://cran.r-project.org/package=waiter
clustree	Visualise Clusterings at Different Resolutions	https://cran.r-project.org/package=clustree
dittoSeq	Visualization and Analysis Tools for Single-Cell RNA-Seq Data	https://bioconductor.org/packages/release/bioc/html/dittoSeq.html
HGNChelper	Gene Symbol Conversion and Correction	https://cran.r-project.org/package=HGNChelper
openxlsx	Read, Write and Edit XLSX Files	https://cran.r-project.org/package=openxlsx
shinyscreenshot	Capture Screenshots of Shiny Applications	https://cran.r-project.org/package=shinyscreenshot
scCATCH	Single-Cell Cluster-based Annotation Toolkit for Cellular Heterogeneity	https://cran.r-project.org/package=scCATCH
stringr	Simple, Consistent Wrappers for Common String Operations	https://cran.r-project.org/package=stringr
ggraph	An Implementation of Grammar of Graphics for Graphs and Networks	https://cran.r-project.org/package=ggraph
igraph	Network Analysis and Visualization	https://cran.r-project.org/package=igraph
tidyverse	Easily Install and Load the 'tidyverse'	https://cran.r-project.org/package=tidyverse
data.tree	Creating and Modifying Tree Structures	https://cran.r-project.org/package=data.tree
scCustomize	Customization of Seurat Objects	https://cran.r-project.org/package=scCustomize
pixiedust	Tables so Beautifully Fine-Tuned You Will Believe It's Magic	https://cran.r-project.org/package=pixiedust
shinyalert	Easily Create Shiny Alerts	https://cran.r-project.org/package=shinyalert
GPTCelltype	GPT-based Cell Type Annotation	https://github.com/Winnie09/GPTCelltype
SingleR	Reference-based Cell Type Annotation	https://www.bioconductor.org/packages/release/bioc/html/SingleR.html
Slingshot	Trajectory Inference	https://github.com/kstreet13/slingshot

2-10 Benchmarking and Performance Assessment

The benchmark dataset, 320k human PBMCs profiled with GEM-X Flex using a 16-plex sub-pooling strategy, was obtained from 10x Genomics. We analyzed subsets of 200k, 100k, 80k, 40k, 20k, 10k, 8k, 6k, 4k, and 2k cells to evaluate performance across input sizes.

Benchmarking was conducted on an AMD Ryzen 7 1700 eight-core processor (3000 MHz) with 64 GB of memory.

Functional Components	200k cells	100k cells	80k cells	40k cells	20k cells	10k cells	8k cells	6k cells	4k cells	2k cells
Preprocessing (QC, Normalize)	1.38 min	0.74 min	0.62 min	0.36 min	0.23 min	0.15 min	0.12 min	0.09 min	0.09 min	0.07 min
Find highly variable genes	1.35 min	0.66 min	0.53 min	0.27 min	0.14 min	0.08 min	0.06 min	0.05 min	0.04 min	0.03 min
Regressing	16.24 min	8.11 min	6.51 min	3.30 min	1.67 min	0.87 min	0.70 min	0.52 min	0.35 min	0.19 min
PCA	1.17 min	0.54 min	0.42 min	0.20 min	0.09 min	0.05 min	0.04 min	0.04 min	0.03 min	0.03 min
Clustering	7.85 min	2.20 min	1.66 min	0.52 min	0.22 min	0.08 min	0.07 min	0.05 min	0.03 min	0.02 min
UMAP	5.04 min	2.28 min	1.71 min	0.80 min	0.41 min	0.33 min	0.27 min	0.22 min	0.16 min	0.11 min
FindAllMarkers	37.91 min	15.64 min	12.79 min	5.84 min	2.59 min	1.33 min	1.10 min	0.87 min	0.50 min	0.25 min
Cell type annotation (ScType)	1.50 min	0.73 min	0.56 min	0.28 min	0.15 min	0.08 min	0.07 min	0.06 min	0.05 min	0.03 min
Cell type annotation (SingleR)	6.49 min	3.24 min	2.59 min	1.30 min	0.66 min	0.34 min	0.27 min	0.21 min	0.14 min	0.08 min
Cell type annotation (GPT)	0.05 min	0.06 min	0.04 min	0.04 min	0.03 min	0.04 min	0.03 min	0.04 min	0.03 min	0.04 min
Cell type annotation (scCATCH)	896.73 min	390.73 min	328.13 min	148.64 min	67.28 min	30.85 min	27.18 min	19.76 min	12.14 min	6.09 min
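
One way to read the table is to normalize runtimes by input size. The base-R sketch below copies three rows from the table and computes minutes per 1,000 cells and the fold increase in runtime between the 2k and 200k subsets. The variable names are illustrative, and the calculation is a simple summary rather than part of the benchmark itself.

```r
cells_k <- c(200, 100, 80, 40, 20, 10, 8, 6, 4, 2)  # thousands of cells

# Runtimes in minutes, copied from the table above.
minutes <- list(
  FindAllMarkers = c(37.91, 15.64, 12.79, 5.84, 2.59, 1.33, 1.10, 0.87, 0.50, 0.25),
  scCATCH        = c(896.73, 390.73, 328.13, 148.64, 67.28, 30.85, 27.18, 19.76, 12.14, 6.09),
  SingleR        = c(6.49, 3.24, 2.59, 1.30, 0.66, 0.34, 0.27, 0.21, 0.14, 0.08)
)

# Minutes per 1,000 cells at each input size.
per_1k <- lapply(minutes, function(m) round(m / cells_k, 3))

# Fold increase in runtime from the 2k to the 200k subset (100x more cells).
fold_2k_to_200k <- vapply(minutes, function(m) m[1] / m[length(m)], numeric(1))
```

For these rows the fold increase is about 152 for FindAllMarkers, 147 for scCATCH, and 81 for SingleR, so on this hardware runtime grows roughly in proportion to cell number, somewhat faster than linearly for the first two steps.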

3. Examples of Use

To demonstrate the functionalities of ShinySC, two publicly available single-cell RNA-seq datasets were analyzed.

💡

Example 1 utilized the PBMC3k dataset from 10x Genomics (2,700 peripheral blood mononuclear cells from a healthy donor), which includes major immune cell types such as T cells, B cells, NK cells, and monocytes. This dataset was used to showcase essential analytical steps including quality control, normalization, clustering, and automatic cell type annotation.

💡

Example 2 employed the dataset from Kang et al. (GEO: GSE96583), comprising approximately 15,000 PBMCs exposed to either control or interferon-β stimulation. This dataset highlighted ShinySC’s capabilities in batch correction and condition-specific transcriptional analysis, illustrating both shared and cell-type–specific responses to interferon stimulation.

3-1 Example 1 - Core Workflow Demonstration

3-1-1 Data upload

For convenient testing of ShinySC with a demonstration dataset, users can visit the web version of ShinySC. Navigate to the "Data Upload" panel ①, select the "Use Demo" checkbox ②, and click "Load Demo" under Example 1 ③. Then, launch the Quality Control process ④.

3-1-2 Quality control

Using the interactive scroll bars, users can easily exclude cells with gene counts above 2,500 ① or below 200 ②, and those with mitochondrial content exceeding 5% ③, in accordance with the Seurat - Guided Clustering Tutorial.

ShinySC provides histograms and violin plots for count depth, gene numbers, and mitochondrial gene fractions per cell, essential for quality control and identifying low-quality cells in single-cell RNA sequencing.
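
The thresholds used in this example (more than 200 and fewer than 2,500 detected genes, less than 5% mitochondrial content) can be written as a simple filter. The sketch below applies them to a toy table in base R; the data and the helper name `passes_qc` are illustrative, and the column names follow Seurat's usual metadata naming.

```r
# Toy per-cell QC metrics; column names follow Seurat's metadata conventions.
qc <- data.frame(
  cell         = c("c1", "c2", "c3", "c4"),
  nFeature_RNA = c(150, 1200, 3000, 900),
  percent.mt   = c(2.0, 3.5, 1.0, 8.0)
)

passes_qc <- function(qc, min_genes = 200, max_genes = 2500, max_mt = 5) {
  qc$nFeature_RNA > min_genes &
    qc$nFeature_RNA < max_genes &
    qc$percent.mt < max_mt
}

qc$keep <- passes_qc(qc)

# On a Seurat object the same filter is usually written as:
#   subset(obj, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)
```

Only `c2` passes: `c1` has too few genes, `c3` too many, and `c4` exceeds the mitochondrial cutoff.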

3-1-3 Feature selection

After removing unwanted cells, the next step is data normalization. ShinySC offers the LogNormalize method by default, which scales and log-transforms feature expressions. Alternatively, users can also select the SCTransform method via a convenient drop-down menu.

ShinySC selects the 2,000 most variably expressed genes by default for downstream PCA analysis, highlighting key biological signals and enhancing the efficiency and insightfulness of subsequent analyses.

All the processes mentioned above will be automatically performed after clicking the "STEP 3 (Feature Selection)" button.

3-1-4 Clustering

💡

Determining the optimal resolution is crucial for achieving ideal granularity in cell cluster results. ShinySC incorporates the Clustree method to visualize cluster evolution across different resolutions ① simultaneously. As shown in the figure below, resolutions between 0.4 and 0.6 produce consistent clustering results ②, aligning with the optimal resolution ③ suggested by the Seurat guided tutorial.

After selecting the optimal resolution from the Clustree plot, proceed to STEP4 (Clustering) ④ to initiate cell clustering. The clusters are displayed with both t-SNE ⑤ and UMAP ⑥, which preserve local distances and aid understanding of cellular composition and heterogeneity. The resulting clusters are consistent with the Seurat tutorial. Interactive features ⑦ for toggling labels and adjusting label size, plot width, and height facilitate the creation of publication-ready figures.

3-1-5 Automatic cell-type annotation & Find Gene Markers

💡

Manual cell-type annotation is laborious and error-prone. ShinySC automates cell-type annotation with ScType, scCATCH, SingleR and GPTCelltype, delivering consistent results in minutes. This enables researchers to focus on advanced analysis and discovery.

🛠

ScType annotation

First, select the ScType algorithm ①. Then, choose the appropriate tissue type ② for peripheral blood mononuclear cells (PBMCs). Finally, click the “Annotate with ScType” button ③ to annotate the cell clusters ④.

🛠

scCATCH annotation

Select scCATCH via the radio button ①, then choose Blood-associated tissue types through the cascade filters ②, including blood, peripheral blood, plasma, serum, umbilical cord blood, and venous blood. The corresponding markers will be displayed in an interactive table for annotating cell clusters ③.

Users can choose between scCATCH's algorithm (Slow) ④ and Seurat's FindAllMarkers (Fast) for identifying highly expressed gene markers and matching these potential markers with known tissue-specific cell markers ③⑥.

🛠

SingleR annotation

To annotate a PBMC dataset using ShinySC, select SingleR through the radio button ①, then choose "(Human) NovershternHematopoieticData" as the reference dataset ②, which contains hematopoietic lineage data for PBMC annotation. Click the "Run" button ③ to start the process, and the UMAP plot will display the annotated cell types. Using SingleR, PBMC subsets such as Basophils, B cells, CD4+ T cells, CD8+ T cells, NK cells, Monocytes, Dendritic cells, Granulocytes, and HSCs (Hematopoietic Stem Cells) are identified ④.

🛠

GPTCelltype annotation

Unlike ScType and scCATCH, which feature built-in gene marker identification and scoring algorithms, GPTCelltype lacks these capabilities and instead depends on Seurat's FindAllMarkers function. ShinySC addresses this limitation by seamlessly integrating FindAllMarkers with GPTCelltype, allowing users to initiate the process with a simple click of the Run button ①.

ShinySC includes a switch button ② for users to identify differentially expressed gene markers as positive only or both positive and negative using Seurat's FindAllMarkers function ③.

Resulting gene markers ④ can be further filtered by adjusted p-value, average log fold change, and pct.diff, where pct.diff represents the difference in gene expression percentage between the cluster and all other cells ⑤.

A dot plot is utilized to confirm the expression patterns of marker genes, whereas UMAP is employed to assess the specificity of these gene markers across cell clusters ⑥.

Once gene markers for individual clusters are identified, users can click the Annotate with GPT-4o button ⑦ to perform automatic cell type annotation using GPTCelltype.

After entering the OpenAI API key ⑧, GPTCelltype automatically annotates cell types ⑨ using marker gene information from the preceding gene marker analysis workflow ⑩.

3-1-6 Summary of Cell-Type Annotation

💡

After executing a cell type annotation method, users can click the "Summary of Cell-Type Annotations" button ① to access a comprehensive overview of annotations across various annotation packages ③. To maintain the most up-to-date result of the annotated Seurat object, users should click the "Update" button ②.

Automated cell labeling methods offer convenience and organization but heavily depend on recognized marker genes for each cell type. These methods efficiently categorize primary cell types like B cells, Natural Killer cells, Dendritic cells, and Platelets. However, distinguishing subtypes such as Regulatory T cells, Helper T cells, Cytotoxic T cells, and Memory T cells can be challenging. Conflicting labels within a cluster indicate the presence of different cell subtypes. To resolve these conflicts, a majority-rule approach prioritizing consensus labels or using more general labels can be employed. When uncertainty arises, experts should adjust initial clustering parameters and utilize additional features to improve cluster labels.
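
The majority-rule idea mentioned above can be sketched in base R. The toy labels and the helper name `majority_label` are illustrative assumptions; ShinySC itself leaves the final decision to the user through the editable label column.

```r
# Toy per-cluster labels from four methods (values are illustrative only).
labels <- data.frame(
  cluster     = c("0", "1", "2"),
  ScType      = c("T cell", "B cell", "NK cell"),
  scCATCH     = c("T cell", "B cell", "T cell"),
  SingleR     = c("T cell", "B cell", "NK cell"),
  GPTCelltype = c("CD4 T cell", "B cell", "Monocyte"),
  stringsAsFactors = FALSE
)

majority_label <- function(x) {
  counts <- table(x)
  winners <- names(counts)[counts == max(counts)]
  # Report a tie (or no agreement) explicitly instead of picking arbitrarily.
  if (length(winners) > 1) "unresolved" else winners
}

methods <- c("ScType", "scCATCH", "SingleR", "GPTCelltype")
labels$consensus <- apply(labels[methods], 1, majority_label)
```

Exact string matching treats "T cell" and "CD4 T cell" as different labels, so harmonizing label vocabularies across methods before voting is usually necessary.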

The "Custom_Label" functionality allows users to modify cell-type annotations, as demonstrated in the figure above. These labels are aligned with annotations from the Seurat tutorial, ensuring consistency and facilitating direct comparisons across different annotation algorithms.

3-2 Example 2 - Batch Correction and Condition-Specific Analysis

To demonstrate functionality, users may enable the "Use Demo Datasets" option ① ② to load PBMC data from control and interferon-β–stimulated conditions. Integration is performed ③ via Seurat’s CCA-based FindIntegrationAnchors() and IntegrateData() functions within the Batch Correction module.

UMAP plots before ① and after ② integration enable visual assessment of cell population alignment.

For a side-by-side comparison, users can activate the “Split by orig.ident” option ⑤ in the plot settings panel ③. This separates the UMAP visualization by sample condition (IMMUNE_CTRL and IMMUNE_STIM), allowing direct comparison of cell distributions. Ensure that the “Group by” field ④ is set to seurat_annotations to display cell-type identities. Enabling “Show Labels” further assists in verifying alignment of shared cell types across conditions.

The integrated Seurat object is downloadable in .rds format for downstream differential expression analysis.

The integrated Seurat object was uploaded via the “Data Upload” module ① by selecting Seurat R object (.rds) ② as the input format and Homo sapiens as the organism. Since the dataset had been preprocessed, the Quality Control step was skipped ③, allowing the analysis to proceed directly to feature selection ④.

As informed by the clustree visualization, a resolution of 0.3 stably reproduced 14 distinct clusters, aligning with the results from the official Seurat tutorial.

The analysis followed the same workflow outlined in Example 1 and concluded with cell-type annotation.

In ShinySC, condition-specific differential expression analysis is performed through the Differential Expression module ①. Users begin by loading the integrated Seurat object ②, then specify the relevant columns for cell type annotations (e.g., seurat_annotations, ③) and experimental conditions (e.g., stim, ④) via dropdown menus. The target cell type or cluster for comparison (e.g., CD14 Mono) is selected, followed by execution of the analysis using the Run DEG Analysis button ⑤.

Differentially expressed genes were identified using Seurat’s FindMarkers() function and presented in an interactive, searchable table. Users can search for a gene of interest (e.g., CD3D) using the search box ①. By clicking on a specific gene entry ②, corresponding expression patterns can be visualized instantly through UMAP, dot plot, or violin plot views. This integration enables rapid and intuitive comparison of gene expression across cell types or conditions.
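
A condition-specific comparison within one cell type is commonly set up in Seurat by combining the cell-type and condition columns into a single identity and passing the two groups to FindMarkers(). The sketch below follows that pattern. The function names and the condition values `"STIM"` and `"CTRL"` are assumptions; the column names `seurat_annotations` and `stim` come from the example above. The Seurat part is defined but not run, since it needs Seurat and an annotated object.

```r
# Base-R part: build "<cell type>_<condition>" labels for every cell.
make_group_labels <- function(cell_type, condition) {
  paste(cell_type, condition, sep = "_")
}

groups <- make_group_labels(c("CD14 Mono", "CD14 Mono", "B"),
                            c("STIM", "CTRL", "STIM"))

# Seurat part: compare one cell type between two conditions.
run_condition_deg <- function(obj, cell_type = "CD14 Mono",
                              type_col = "seurat_annotations",
                              cond_col = "stim",
                              case = "STIM", control = "CTRL") {
  obj$celltype.cond <- make_group_labels(obj[[type_col]][, 1],
                                         obj[[cond_col]][, 1])
  Seurat::Idents(obj) <- "celltype.cond"
  Seurat::FindMarkers(obj,
                      ident.1 = paste(cell_type, case, sep = "_"),
                      ident.2 = paste(cell_type, control, sep = "_"))
}
```

Positive fold changes in the result indicate higher expression in the `case` group for the chosen cell type.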

The results aligned with Kang et al., showing stable expression of canonical markers (e.g., CD3D, GNLY) across conditions, while interferon-stimulated genes (IFI6, ISG15) were broadly upregulated. CD14 decreased in monocytes, and CXCL10 increased in monocytes and B cells, highlighting the value of integrated analysis in revealing both shared and cell-type–specific responses to interferon.