R Major Troubleshooting [21-40]
Recommended Article : 【R STUDIO】 R Studio Table of Contents
21. Error: could not find function “% >%”
⑴ (package) Solution
install.packages("magrittr")
install.packages("dplyr")
library(magrittr)
library(dplyr)
22. Error in is.empty(.) : could not find function “is.empty”
⑴ (package) Solution
install.packages('rapportools')
library(rapportools)
23. Error in fill_palette(palette = “npg”) : could not find function “fill_palette”
⑴ (package) Solution
install.packages('ggpubr')
library(ggpubr)
24. Error: package or namespace load failed for ‘Seurat’ in dyn.load(file, DLLpath = DLLpath, …): unable to load shared object ‘/opt/conda/lib/R/library/igraph/libs/igraph.so’: libglpk.so.40: cannot open shared object file: No such file or directory
⑴ (package) Cause : This problem occurs when setting up RStudio server on Linux servers like Ubuntu and trying to download the Seurat package.
⑵ (package) Solution
## Linux Ubuntu
sudo apt update
sudo apt install libglpk40
## RStudio
remove.packages('rlang')
install.packages('rlang')
install.packages('lifecycle')
⑶ Detailed Explanation
①
sudo apt update
: To resolve “E: Unable to locate package” issue (ref)
②
sudo apt install libglpk40
: Install libglpk.so.40 since it’s missing (ref)
③
remove.packages('rlang'), install.packages('rlang'), install.packages('lifecycle')
: To resolve the “Error: package or namespace load failed for ‘Seurat’…” issue (ref)
25. The previous R session was abnormally terminated due to an unexpected crash. You may have lost workspace data as a result of this crash. RStudio may not have restored the previously active project as a precaution. You may switch back to it using the Projects menu.
⑴ (system) Cause : Insufficient resources
26. Error in if (all(cur_arg < 0)) { : missing value where TRUE/FALSE needed
⑴ (grammar) Problem Scenario : After performing SaveH5Seurat(object, filename = "myData.h5Seurat")
, when trying to use Convert("myData.h5Seurat", dest = "h5ad")
, this error message appears.
⑵ (grammar) Cause : This error occurs when attempting to access the matrix with rowname NM_001001130.3 in object@assays$RNA@data
and object@assays$RNA@scale.data
, and trying to input a matrix with the same rowname forcefully for a Seurat object that has rownames like NM-001001130.3.
⑶ (grammar) Solution : Use gsub("_", "-", str)
to replace NM_001001130.3 with NM-001001130.3.
27. Error: package or namespace load failed for ‘hdf5r’ in dyn.load(file, DLLpath = DLLpath, …): unable to load shared object ‘/opt/conda/lib/R/library/hdf5r/libs/hdf5r.so’: libhdf5_serial_hl.so.100: cannot open shared object file: No such file or directory
⑴ (package) Solution
sudo apt-get update
sudo apt-get install libhdf5-dev
sudo apt-get install libhdf5-serial-dev
28. Error in data %>% gather(CellType, Proportion, -SampleID) : could not find function “%>%”
⑴ (package) Solution 1
# If you don't have tidyverse installed, install it first:
# install.packages("tidyverse")
# Then, load it
library(tidyverse)
⑵ (package) Solution 2 (ref)
library(tidyr)
29. Error in .subscript.2ary(x, i, j, drop = TRUE) : subscript out of bounds
⑴ (package) Situation : br.sp_subset <- subset(br.sp, features = top_genes[valid_genes])
⑵ (package) Solution 1. : Downgrade Seurat from 5 to 4.
devtools::install_github("satijalab/seurat", ref = "v4.3.0")
⑶ (package) Solution 2. Syntax change in Seurat v5 for subset
# Seurat v4
br.sp_subset <- subset(br.sp, features = top_genes)
# Seurat v5
br.sp[["RNAsub"]] <- subset(br.sp[["RNA"]], features = top_genes)
DefaultAssay(br.sp) <- "RNAsub"
30. Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth = TRUE, : You’re computing too large a percentage of total singular values, use a standard svd instead. Error in irlba::irlba(Matrix::t(preproc_res), nv = min(num_dim, min(dim(FM)) - ): function ‘as_cholmod_sparse’ not provided by package ‘Matrix’
⑴ (package) Problem situation : cds1 <- preprocess_cds(cds1, method='LSI')
⑵ (package) Solution : Redefine preprocess_cds (ref). Other necessary functions are defined in ref1, ref2, ref3, ref4.
preprocess_cds <- function(cds,
method = c('PCA', "LSI"),
num_dim = 50,
norm_method = c("log", "size_only", "none"),
use_genes = NULL,
pseudo_count = NULL,
scaling = TRUE,
verbose = FALSE,
build_nn_index = FALSE,
nn_control = list()) {
assertthat::assert_that(
tryCatch(expr = ifelse(match.arg(method) == "",TRUE, TRUE),
error = function(e) FALSE),
msg = "method must be one of 'PCA' or 'LSI'")
method <- match.arg(method)
assertthat::assert_that(
tryCatch(expr = ifelse(match.arg(norm_method) == "",TRUE, TRUE),
error = function(e) FALSE),
msg = "norm_method must be one of 'log', 'size_only' or 'none'")
norm_method <- match.arg(norm_method)
assertthat::assert_that(assertthat::is.count(num_dim))
if(!is.null(use_genes)) {
assertthat::assert_that(is.character(use_genes))
assertthat::assert_that(all(use_genes %in% row.names(rowData(cds))),
msg = paste("use_genes must be NULL, or all must",
"be present in the row.names of rowData(cds)"))
}
assertthat::assert_that(!is.null(size_factors(cds)),
msg = paste("You must call estimate_size_factors before calling",
"preprocess_cds."))
assertthat::assert_that(sum(is.na(size_factors(cds))) == 0,
msg = paste("One or more cells has a size factor of",
"NA."))
if(build_nn_index) {
nn_control <- set_nn_control(mode=1,
nn_control=nn_control,
nn_control_default=get_global_variable('nn_control_annoy_cosine'),
nn_index=NULL,
k=NULL,
verbose=verbose)
}
#ensure results from RNG sensitive algorithms are the same on all calls
set.seed(2016)
FM <- normalize_expr_data(cds, norm_method, pseudo_count)
if (nrow(FM) == 0) {
stop("all rows have standard deviation zero")
}
if (!is.null(use_genes)) {
FM <- FM[use_genes, ]
}
fm_rowsums = Matrix::rowSums(FM)
FM <- FM[is.finite(fm_rowsums) & fm_rowsums != 0, ]
#
# Notes:
# o the functions save_transform_models/load_transform_models
# expect that the reduce_dim_aux slot consists of a S4Vectors::SimpleList
# that stores information about methods with the elements
# reduce_dim_aux[[method]][['model']] for the transform elements
# reduce_dim_aux[[method]][[nn_method]] for the nn index
# and depends on the elements within model and nn_method.
#
if(method == 'PCA') {
cds <- initialize_reduce_dim_metadata(cds, 'PCA')
cds <- initialize_reduce_dim_model_identity(cds, 'PCA')
if (verbose) message("Remove noise by PCA ...")
# Initialize variables
preproc_res <- NULL
rotation_matrix <- NULL
sdev_values <- NULL
# Determine the dimension of the feature matrix
dim_FM <- min(dim(FM)) - 1
# Use irlba if the number of dimensions is less than 20% of the matrix dimension, else use svd
if (num_dim <= 0.2 * dim_FM) {
irlba_res <- irlba::irlba(Matrix::t(FM), n = min(num_dim, dim_FM), center = scaling, scale. = scaling)
preproc_res <- irlba_res$x
irlba_rotation <- irlba_res$rotation
rotation_matrix <- irlba_res$rotation
sdev_values <- irlba_res$sdev
} else {
svd_res <- svd(Matrix::t(FM))
irlba_res = svd_res
preproc_res <- svd_res$u[, 1:num_dim] %*% diag(svd_res$d[1:num_dim])
irlba_rotation <- svd_res$v[, 1:num_dim]
rotation_matrix <- svd_res$v[, 1:num_dim]
sdev_values <- svd_res$d[1:num_dim]
}
row.names(preproc_res) <- colnames(cds)
SingleCellExperiment::reducedDims(cds)[[method]] <- as.matrix(preproc_res)
row.names(rotation_matrix) <- rownames(FM)
# we need svd_v downstream so
# calculate gene_loadings in cluster_cells.R
cds@reduce_dim_aux[['PCA']][['model']][['num_dim']] <- num_dim
cds@reduce_dim_aux[['PCA']][['model']][['norm_method']] <- norm_method
cds@reduce_dim_aux[['PCA']][['model']][['use_genes']] <- use_genes
cds@reduce_dim_aux[['PCA']][['model']][['pseudo_count']] <- pseudo_count
cds@reduce_dim_aux[['PCA']][['model']][['svd_v']] <- irlba_rotation
cds@reduce_dim_aux[['PCA']][['model']][['svd_sdev']] <- irlba_res$sdev
cds@reduce_dim_aux[['PCA']][['model']][['svd_center']] <- irlba_res$center
cds@reduce_dim_aux[['PCA']][['model']][['svd_scale']] <- irlba_res$svd_scale
# Note that prop_var_expl is the fraction of variance explained by the retained
# PCs, not the fraction of total variance.
cds@reduce_dim_aux[['PCA']][['model']][['prop_var_expl']] <- irlba_res$sdev^2 / sum(irlba_res$sdev^2)
matrix_id <- get_unique_id(SingleCellExperiment::reducedDims(cds)[['PCA']])
counts_identity <- get_counts_identity(cds)
cds <- set_reduce_dim_matrix_identity(cds, 'PCA',
'matrix:PCA',
matrix_id,
counts_identity[['matrix_type']],
counts_identity[['matrix_id']],
'matrix:PCA',
matrix_id)
cds <- set_reduce_dim_model_identity(cds, 'PCA',
'matrix:PCA',
matrix_id,
'none',
'none')
if( build_nn_index ) {
nn_index <- make_nn_index(subject_matrix=SingleCellExperiment::reducedDims(cds)[[method]], nn_control=nn_control, verbose=verbose)
cds <- set_cds_nn_index(cds=cds, reduction_method=method, nn_index=nn_index, verbose=verbose)
}
else
cds <- clear_cds_nn_index(cds=cds, reduction_method=method, nn_method='all')
} else if(method == "LSI") {
cds <- initialize_reduce_dim_metadata(cds, 'LSI')
cds <- initialize_reduce_dim_model_identity(cds, 'LSI')
# preproc_res <- tfidf(FM)
tfidf_res <- tfidf(FM)
preproc_res <- tfidf_res[['tf_idf_counts']]
num_col <- ncol(preproc_res)
irlba_res <- irlba::irlba(Matrix::t(preproc_res),
nv = min(num_dim,min(dim(FM)) - 1))
preproc_res <- irlba_res$u %*% diag(irlba_res$d)
row.names(preproc_res) <- colnames(cds)
SingleCellExperiment::reducedDims(cds)[[method]] <- as.matrix(preproc_res)
irlba_rotation = irlba_res$v
row.names(irlba_rotation) = rownames(FM)
cds@reduce_dim_aux[['LSI']][['model']][['num_dim']] <- num_dim
cds@reduce_dim_aux[['LSI']][['model']][['norm_method']] <- norm_method
cds@reduce_dim_aux[['LSI']][['model']][['use_genes']] <- use_genes
cds@reduce_dim_aux[['LSI']][['model']][['pseudo_count']] <- pseudo_count
cds@reduce_dim_aux[['LSI']][['model']][['log_scale_tf']] <- tfidf_res[['log_scale_tf']]
cds@reduce_dim_aux[['LSI']][['model']][['frequencies']] <- tfidf_res[['frequencies']]
cds@reduce_dim_aux[['LSI']][['model']][['scale_factor']] <- tfidf_res[['scale_factor']]
cds@reduce_dim_aux[['LSI']][['model']][['col_sums']] <- tfidf_res[['col_sums']]
cds@reduce_dim_aux[['LSI']][['model']][['row_sums']] <- tfidf_res[['row_sums']]
cds@reduce_dim_aux[['LSI']][['model']][['num_cols']] <- tfidf_res[['num_cols']]
cds@reduce_dim_aux[['LSI']][['model']][['svd_v']] <- irlba_rotation
cds@reduce_dim_aux[['LSI']][['model']][['svd_sdev']] <- irlba_res$d/sqrt(max(1, num_col - 1))
# we need svd_v downstream so
# calculate gene_loadings in cluster_cells.R
matrix_id <- get_unique_id(SingleCellExperiment::reducedDims(cds)[['LSI']])
counts_identity <- get_counts_identity(cds)
cds <- set_reduce_dim_matrix_identity(cds, 'LSI',
'matrix:LSI',
matrix_id,
counts_identity[['matrix_type']],
counts_identity[['matrix_id']],
'matrix:LSI',
matrix_id)
cds <- set_reduce_dim_model_identity(cds, 'LSI',
'matrix:LSI',
matrix_id,
'none',
'none')
if( build_nn_index ) {
nn_index <- make_nn_index(subject_matrix=SingleCellExperiment::reducedDims(cds)[[method]], nn_control=nn_control, verbose=verbose)
cds <- set_cds_nn_index(cds=cds, reduction_method=method, nn_index=nn_index, verbose=verbose)
}
else
cds <- clear_cds_nn_index(cds=cds, reduction_method=method, nn_method='all')
}
if(!is.null(cds@reduce_dim_aux[['Aligned']]) && !is.null(cds@reduce_dim_aux[['Aligned']][['model']][['beta']])) {
cds@reduce_dim_aux[['Aligned']][['model']][['beta']] <- NULL
}
cds
}
# Helper function to normalize the expression data prior to dimensionality
# reduction
normalize_expr_data <- function(cds,
norm_method = c("log", "size_only", "none"),
pseudo_count = NULL) {
norm_method <- match.arg(norm_method)
FM <- SingleCellExperiment::counts(cds)
# If we're going to be using log, and the user hasn't given us a
# pseudocount set it to 1 by default.
if (is.null(pseudo_count)){
if(norm_method == "log")
pseudo_count <- 1
else
pseudo_count <- 0
}
if (norm_method == "log") {
# If we are using log, normalize by size factor before log-transforming
FM <- Matrix::t(Matrix::t(FM)/size_factors(cds))
if (pseudo_count != 1 || is_sparse_matrix(SingleCellExperiment::counts(cds)) == FALSE){
FM <- FM + pseudo_count
FM <- log2(FM)
} else {
FM@x = log2(FM@x + 1)
}
} else if (norm_method == "size_only") {
FM <- Matrix::t(Matrix::t(FM)/size_factors(cds))
FM <- FM + pseudo_count
}
return (FM)
}
# Andrew's tfidf
tfidf <- function(count_matrix, frequencies=TRUE, log_scale_tf=TRUE,
scale_factor=100000, block_size=2000e6) {
# Use either raw counts or divide by total counts in each cell
if (frequencies) {
# "term frequency" method
col_sums <- Matrix::colSums(count_matrix)
tf <- Matrix::t(Matrix::t(count_matrix) / col_sums)
} else {
# "raw count" method
col_sums <- NA
tf <- count_matrix
}
# Either TF method can optionally be log scaled
if (log_scale_tf) {
if (frequencies) {
tf@x = log1p(tf@x * scale_factor)
} else {
tf@x = log1p(tf@x * 1)
}
}
# IDF w/ "inverse document frequency smooth" method
num_cols <- ncol(count_matrix)
row_sums <- Matrix::rowSums(count_matrix > 0)
idf = log(1 + num_cols / row_sums)
# Try to just to the multiplication and fall back on delayed array
# TODO hopefully this actually falls back and not get jobs killed in SGE
tf_idf_counts = tryCatch({
tf_idf_counts = tf * idf
tf_idf_counts
}, error = function(e) {
print(paste("TF*IDF multiplication too large for in-memory, falling back",
"on DelayedArray."))
options(DelayedArray.block.size=block_size)
DelayedArray:::set_verbose_block_processing(TRUE)
tf = DelayedArray::DelayedArray(tf)
idf = as.matrix(idf)
tf_idf_counts = tf * idf
tf_idf_counts
})
rownames(tf_idf_counts) = rownames(count_matrix)
colnames(tf_idf_counts) = colnames(count_matrix)
tf_idf_counts = methods::as(tf_idf_counts, "sparseMatrix")
return(list(tf_idf_counts=tf_idf_counts, frequencies=frequencies, log_scale_tf=log_scale_tf, scale_factor=scale_factor, col_sums=col_sums, row_sums=row_sums, num_cols=num_cols))
}
① You can simply call the above functions like this.
source("https://github.com/JB243/nate9389/blob/main/RStudio/preprocess_cds_and_as_cholmod_sparse.R?raw=true")
31. Warning: No layers found matching search pattern provided Error in FetchData.Assay5(object = object[[DefaultAssay(object = object)]], : layer “data” is not found in the object
⑴ (Grammar) Cause: For Seurat version 5, separate normalization is required to prevent errors in SpatialFeaturePlot
.
⑵ (Grammar) Solution Method:
# before
library(Seurat)
br.sp = Load10X_Spatial('~/Downloads/sample', slice= 'slice1')
SpatialFeaturePlot(br.sp, "Slc2a1") # error occurs
# after
library(Seurat)
br.sp = Load10X_Spatial('~/Downloads/sample', slice= 'slice1')
br.sp <- SCTransform(br.sp, assay = "Spatial", verbose = FALSE, variable.features.n = 1000)
SpatialFeaturePlot(br.sp, "Slc2a1")
32. Error in fill_alpha(data$fill %||% “black”, data$alpha): could not find function “fill_alpha”
⑴ (package) Problem: The error occurred when conducting SpatialDimPlot(tnbc.merge, label = TRUE, label.size = 3)
.
⑵ (package) Solution: Update ggplot2 3.4.4
to ggplot2 3.5.0
.
33. data layers are not joined. Please run JoinLayers
⑴ (grammar) Problem: FindAllMarkers(tnbc.merge, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
⑵ (grammar) Solution: After executing tnbc.merge = JoinLayers(tnbc.merge)
, run the above code.
34. Using github PAT from envvar GITHUB_PAT. Use gitcreds::gitcreds_set()
and unset GITHUB_PAT in .Renviron (or elsewhere) if you want to use the more secure git credential store instead. Error: Failed to install ‘unknown package’ from GitHub: HTTP error 401. Bad credentials Rate limit remaining: 59/60 Rate limit reset at: 2024-05-30 17:44:43 UTC
⑴ (grammar) Solution: If there is something assigned when you execute Sys.getenv("GITHUB_PAT")
, run Sys.unsetenv("GITHUB_PAT")
to delete that environment setting. (ref)
35. Error in if (tools::file_ext(filename) == “parquet”) { : the condition has length > 1
⑴ (Package) Problem: An error occurs when using Seurat::Load10X_Spatial.
⑵ (Package) Solution: Downgrade Seurat from version 5.1.0
to version 5.0.1
.
36. Issues installing rjson and RcppParallel when setting up gpsFISH on an RStudio server
⑴ (Package) Solution
# 1. Installing rjson
remotes::install_version("rjson", version = "0.2.21", repos = "http://cran.us.r-project.org")
# 2. Installing RcppParallel (Terminal)
R CMD INSTALL --use-vanilla /data/RcppParallel_5.1.9.tar.gz
# 3. Installing gpsFISH
devtools::install_github("kharchenkolab/gpsFISH")
Input : June 9, 2023, 15:38
Modified : December 16, 2023, 12:29