Title: | Similarity Identification in Gene Expression |
---|---|
Description: | Provides a classification framework to use expression patterns of pathways as features to identify similarity between biological samples. It provides a new measure for quantifying similarity between expression patterns of pathways. |
Authors: | Seyed Ali Madani Tonekaboni [aut], Gangesh Beri [aut], Janosch Ortmann [aut], Benjamin Haibe-Kains [aut, cre] |
Maintainer: | Benjamin Haibe-Kains <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.1.0 |
Built: | 2024-11-12 06:13:24 UTC |
Source: | https://github.com/cran/SIGN |
BubbleSort is a function for calculating bubble sort correlation between two vectors
BubbleSort(Vec1, Vec2)
Vec1 |
Vector of values of 1st feature across samples |
Vec2 |
Vector of values of 2nd feature across samples |
Bubble sort similarity between the two vectors
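One way to read "bubble sort correlation" is as the number of adjacent swaps bubble sort needs to bring one vector into the rank order of the other. The base R sketch below illustrates that idea only; the helper name `bubble_sort_similarity` and the scaling to the range 0 to 1 are assumptions, not the package's implementation.

```r
# Hypothetical helper, not part of SIGN: similarity from the number of
# adjacent swaps bubble sort needs to sort Vec2 after ordering it by Vec1.
bubble_sort_similarity <- function(Vec1, Vec2) {
  stopifnot(length(Vec1) == length(Vec2), length(Vec1) > 1)
  x <- Vec2[order(Vec1)]          # Vec2 arranged by increasing Vec1
  n <- length(x)
  swaps <- 0
  for (i in seq_len(n - 1)) {
    for (j in seq_len(n - i)) {
      if (x[j] > x[j + 1]) {      # out-of-order neighbours: swap and count
        tmp <- x[j]
        x[j] <- x[j + 1]
        x[j + 1] <- tmp
        swaps <- swaps + 1
      }
    }
  }
  max_swaps <- n * (n - 1) / 2    # swaps needed for a fully reversed vector
  1 - swaps / max_swaps           # 1 = same order, 0 = reversed order (assumed scaling)
}

bubble_sort_similarity(c(1, 2, 3, 4), c(10, 20, 30, 40))  # identical ordering
bubble_sort_similarity(c(1, 2, 3, 4), c(40, 30, 20, 10))  # reversed ordering
```

Under this scaling, identically ordered vectors score 1 and fully reversed vectors score 0.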
EventRenaming is a function for changing annotation of censored samples to 0 and dead samples to 1 for survival analysis
EventRenaming(EventVec, Censored_Annot)
EventVec |
Status vector for all of the samples (patients), including both samples that underwent an event and censored samples |
Censored_Annot |
Index of samples censored in the dataset |
Vector of events including 0 for censoring and 1 for death
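The recoding described above can be sketched in base R as follows. The helper `recode_events` is hypothetical, and it assumes that the censoring argument holds the positions of censored samples, as the argument description states.

```r
# Hypothetical helper, not part of SIGN: recode a status vector to 0/1.
# Assumes `censored_idx` holds the positions of the censored samples.
recode_events <- function(EventVec, censored_idx) {
  events <- rep(1, length(EventVec))  # default: event (death) observed
  events[censored_idx] <- 0           # censored samples become 0
  events
}

status <- c("dead", "alive", "dead", "alive")
recode_events(status, which(status == "alive"))
```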
ExpPhen_Matching is a function for matching samples between expression matrices and metadata matrix (clinical feature matrix)
ExpPhen_Matching(ExpMat, MetaMat, SamID_Meta)
ExpMat |
Matrix of expression of genes (samples in columns and genes in rows) |
MetaMat |
Matrix of clinical features (samples in columns) |
SamID_Meta |
Sample ID in MetaMat |
List of expression matrix and metadata of the clinical information after matching patient IDs between the expression and clinical information matrices
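Matching of this kind usually reduces to intersecting sample IDs and reordering both inputs. The sketch below is a base R illustration with a hypothetical helper `match_samples`; it assumes both matrices store samples in columns and that `sample_ids` gives the sample ID of each metadata column.

```r
# Hypothetical helper, not part of SIGN: keep samples present in both inputs.
# Assumes samples are in columns and `sample_ids` labels the metadata columns.
match_samples <- function(ExpMat, MetaMat, sample_ids) {
  common <- intersect(colnames(ExpMat), sample_ids)
  list(
    Expression = ExpMat[, common, drop = FALSE],
    Metadata   = MetaMat[, match(common, sample_ids), drop = FALSE]
  )
}

exp_mat <- matrix(1:6, nrow = 2,
                  dimnames = list(c("G1", "G2"), c("S1", "S2", "S3")))
meta_mat <- matrix(c("60", "F", "45", "M"), nrow = 2,
                   dimnames = list(c("Age", "Sex"), c("S3", "S1")))
matched <- match_samples(exp_mat, meta_mat, colnames(meta_mat))
colnames(matched$Expression)  # samples shared by both inputs
colnames(matched$Metadata)    # same samples, same order
```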
ExpPhen_Subdividing is a function for grouping samples based on a clinical feature available in metadata matrix (clinical feature matrix)
ExpPhen_Subdividing(ExpMeta_List, SubDiv_ID)
ExpMeta_List |
List containing expression matrix and metadata matrix |
SubDiv_ID |
Index of the target clinical feature in metadata matrix for samples grouping |
List of expression and clinical information of patients grouped based on the specified clinical feature
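Grouping samples by a clinical feature can be sketched with `split()`. The helper `subdivide_samples` and the list element names `Expression` and `Metadata` are assumptions made for illustration; the package's actual list layout is not documented here.

```r
# Hypothetical helper, not part of SIGN: split samples by one clinical feature.
# Assumes `Expression` is genes x samples, `Metadata` is features x samples,
# and `SubDiv_ID` is a row index into Metadata.
subdivide_samples <- function(ExpMeta_List, SubDiv_ID) {
  groups <- split(seq_len(ncol(ExpMeta_List$Expression)),
                  ExpMeta_List$Metadata[SubDiv_ID, ])
  lapply(groups, function(idx) list(
    Expression = ExpMeta_List$Expression[, idx, drop = FALSE],
    Metadata   = ExpMeta_List$Metadata[, idx, drop = FALSE]
  ))
}

exp_meta <- list(
  Expression = matrix(1:8, nrow = 2,
                      dimnames = list(c("G1", "G2"), paste0("S", 1:4))),
  Metadata   = matrix(c("A", "B", "A", "B"), nrow = 1,
                      dimnames = list("Subtype", paste0("S", 1:4)))
)
by_subtype <- subdivide_samples(exp_meta, 1)
names(by_subtype)                  # one entry per level of the feature
colnames(by_subtype$A$Expression)  # samples assigned to group "A"
```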
ExpPheno_Categorize is a function for grouping samples based on their survival into 3 groups: poor, good, and intermediate
ExpPheno_Categorize(ExpMeta_List, Time_ID, Event_ID, Mad_Factor, MinNum_ExClass, Expression_Log2 = FALSE)
ExpMeta_List |
List containing expression matrix and metadata matrix |
Time_ID |
Index of time to death in metadata matrix |
Event_ID |
Index of event in metadata matrix |
Mad_Factor |
Threshold of mad in time to death values to determine poor survival group |
MinNum_ExClass |
Minimum number of samples that must be kept in the poor and good groups (if the number of samples is lower than this threshold, more samples will be added in order of survival) |
Expression_Log2 |
Parameter for transforming gene expression values to logarithmic scale (log2(expression value + 1)) |
List of expression matrices, and time to event as well as event for the patients within each category of poor, intermediate or good survival
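The exact cutoff rule is not spelled out above, so the following is only a rough sketch of how a MAD-based threshold could separate survival groups. The helper `categorize_survival` and its rule (median minus or plus `Mad_Factor` times the MAD of survival time) are assumptions, and the `MinNum_ExClass` top-up step is omitted. The last two lines show the documented `log2(expression value + 1)` transform.

```r
# Hypothetical helper, not part of SIGN. The cutoff rule below
# (median -/+ Mad_Factor * MAD of survival time) is an assumption.
categorize_survival <- function(time, Mad_Factor) {
  centre <- median(time)
  spread <- mad(time)
  group <- rep("intermediate", length(time))
  group[time < centre - Mad_Factor * spread] <- "poor"
  group[time > centre + Mad_Factor * spread] <- "good"
  group
}

time <- c(2, 10, 11, 12, 13, 14, 40)
categorize_survival(time, Mad_Factor = 2)

# Transform described for Expression_Log2: log2(expression value + 1)
expr <- matrix(c(0, 1, 3, 7), nrow = 2)
log2(expr + 1)
```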
GeneMatching is a function to remove uncommon genes between a list of expression matrices
GeneMatching(ExpList)
ExpList |
List of expression matrices |
List of expression matrices restricted to the common genes between them
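Restricting several matrices to their shared genes is a repeated set intersection. A minimal base R sketch, assuming genes are stored as row names (the helper `match_genes` is hypothetical):

```r
# Hypothetical helper, not part of SIGN: keep only genes shared by all matrices.
# Assumes genes are stored as row names.
match_genes <- function(ExpList) {
  common <- Reduce(intersect, lapply(ExpList, rownames))
  lapply(ExpList, function(m) m[common, , drop = FALSE])
}

exp_list <- list(
  A = matrix(1:6, nrow = 3,
             dimnames = list(c("TP53", "EGFR", "MYC"), c("S1", "S2"))),
  B = matrix(1:4, nrow = 2,
             dimnames = list(c("MYC", "TP53"), c("S3", "S4")))
)
matched_genes <- match_genes(exp_list)
lapply(matched_genes, rownames)  # both restricted to the shared genes
```

Reordering each matrix by the same `common` vector also aligns the gene order across matrices.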
Genes_SimCal is a function to calculate similarity between a set of samples and 2 reference groups of samples
Genes_SimCal(ExpMat_Test, ExpMat_Ref1, ExpMat_Ref2, RefIDs, TestClassIter, SampleIter)
ExpMat_Test |
Expression matrix for the test samples for which SIGN will identify the similarity with the 2 reference datasets |
ExpMat_Ref1 |
Expression matrix for the 1st reference set of samples |
ExpMat_Ref2 |
Expression matrix for the 2nd reference set of samples |
RefIDs |
Annotations corresponding to the 2 expression matrices (1st and 2nd names are associated with the 1st and 2nd expression matrix, respectively) |
TestClassIter |
Index to be matched with RefIDs for removal of test samples from reference expression matrices |
SampleIter |
Index of samples in the test expression matrix that exist in reference expression matrix 1 or 2 |
Vector of similarity between the target samples and the 2 reference sets
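The role of `RefIDs`, `TestClassIter`, and `SampleIter` can be illustrated with a leave-one-out comparison: when the test sample belongs to a reference class, it is dropped from that reference before the comparison. The sketch below uses Pearson correlation with the reference centroid purely as a stand-in similarity; the helper `sample_vs_references` is hypothetical, and it assumes the test matrix is identical to the reference matrix of the same class so that `SampleIter` indexes the same column in both.

```r
# Hypothetical helper, not part of SIGN. Centroid correlation is an
# assumed similarity measure chosen for illustration only.
sample_vs_references <- function(ExpMat_Test, ExpMat_Ref1, ExpMat_Ref2,
                                 RefIDs, TestClassIter, SampleIter) {
  target <- ExpMat_Test[, SampleIter]
  refs <- list(ExpMat_Ref1, ExpMat_Ref2)
  sims <- vapply(seq_along(refs), function(k) {
    ref <- refs[[k]]
    if (RefIDs[k] == TestClassIter) {
      # the target sample belongs to this reference: leave it out
      ref <- ref[, -SampleIter, drop = FALSE]
    }
    cor(target, rowMeans(ref))  # Pearson correlation with the centroid
  }, numeric(1))
  setNames(sims, RefIDs)
}

set.seed(1)
good <- matrix(rnorm(50, mean = 5), nrow = 10)
poor <- matrix(rnorm(50, mean = 5), nrow = 10)
sims <- sample_vs_references(good, good, poor, RefIDs = c("good", "poor"),
                             TestClassIter = "good", SampleIter = 2)
sims
```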
GSVA_Calculation is a function for calculating correlation between expression level of pathways between 2 groups using GSVA
GSVA_Calculation(ExpMat1, ExpMat2, GeneVec, GeneSets, Name = "SampleComparison")
ExpMat1 |
Expression matrix of genes in the 1st group of samples |
ExpMat2 |
Expression matrix of genes in the 2nd group of samples |
GeneVec |
Name of genes in the same order as considered in ExpMat1 and ExpMat2 |
GeneSets |
List of genes within pathways |
Name |
Name used for naming the columns of output matrix of correlation between the 2 groups |
Similarity of the pathway between the two expression matrices based on Pearson correlation, bubble sort, and Wilcoxon paired rank test using GSVA enrichment scores of pathways
Pathway_Grouping is a function to make a pathway list from files containing genes within each pathway
Pathway_Grouping(PathwayDir, Pattern)
PathwayDir |
Path of directory including the files of pathways |
Pattern |
Pattern used to select the files of pathway genes from PathwayDir |
List of genes within the pathway
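Building a pathway list from a directory can be sketched with `list.files()` and `readLines()`. The file layout is not documented above, so the sketch assumes one plain-text file per pathway with one gene identifier per line; the helper `read_pathways` and the demo file names are hypothetical.

```r
# Hypothetical helper, not part of SIGN. Assumes one plain-text file per
# pathway with one gene identifier per line.
read_pathways <- function(PathwayDir, Pattern) {
  files <- list.files(PathwayDir, pattern = Pattern, full.names = TRUE)
  pathways <- lapply(files, readLines)
  names(pathways) <- tools::file_path_sans_ext(basename(files))
  pathways
}

# Demo with temporary files
demo_dir <- file.path(tempdir(), "pathways_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("TP53", "MDM2"), file.path(demo_dir, "p53_signalling.txt"))
writeLines(c("EGFR", "KRAS", "BRAF"), file.path(demo_dir, "mapk_signalling.txt"))
pathway_list <- read_pathways(demo_dir, "\\.txt$")
pathway_list
```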
Pathway_similarity is a function for calculating correlation between expression level of pathways between 2 groups using all the available approaches in SIGN
Pathway_similarity(ExpMat1, ExpMat2, GeneVec, GeneSets, Name)
ExpMat1 |
Expression matrix of genes in the 1st group of samples |
ExpMat2 |
Expression matrix of genes in the 2nd group of samples |
GeneVec |
Name of genes in the same order as considered in ExpMat1 and ExpMat2 |
GeneSets |
List of genes within pathways |
Name |
Name used for naming the columns of output matrix of correlation between the 2 groups |
Similarity of the pathway between the two expression matrices using Pearson correlation, bubble sort, and Wilcoxon paired rank test
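Two of the measures named above, Pearson correlation and the Wilcoxon paired rank test, are available in base R through `cor()` and `wilcox.test()`. The sketch below applies them per pathway to gene-wise mean expression in each group; that summarisation step and the helper `pathway_similarity_sketch` are assumptions for illustration, and the bubble sort measure is left out.

```r
# Hypothetical helper, not part of SIGN. Comparing gene-wise mean
# expression per pathway is an assumption made for illustration.
pathway_similarity_sketch <- function(ExpMat1, ExpMat2, GeneVec, GeneSets) {
  t(vapply(GeneSets, function(genes) {
    rows <- which(GeneVec %in% genes)          # pathway genes present in the data
    m1 <- rowMeans(ExpMat1[rows, , drop = FALSE])
    m2 <- rowMeans(ExpMat2[rows, , drop = FALSE])
    c(Pearson = cor(m1, m2),
      Wilcoxon_p = wilcox.test(m1, m2, paired = TRUE)$p.value)
  }, numeric(2)))
}

set.seed(42)
genes <- paste0("G", 1:20)
mat1 <- matrix(runif(200, 0, 10), nrow = 20)
mat2 <- matrix(runif(200, 0, 10), nrow = 20)
sets <- list(PathwayA = genes[1:10], PathwayB = genes[8:20])
sim_table <- pathway_similarity_sketch(mat1, mat2, genes, sets)
sim_table  # one row per pathway, one column per measure
```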
SIGN_Aggregate is a function to reshape the list of pathway scoring, time to death, and event and return a summary list
SIGN_Aggregate(ScoreList, TimeList, EventList)
ScoreList |
List of similarity scores identified using different methodologies |
TimeList |
List of time to event (death) for different groups of patients |
EventList |
List of event vectors (death or censored) for different groups of patients |
List of scores identified for each sample as well as time to death and event of that sample
SIGN_Ensemble_SimCal is a function for generating a list of similarities based on different pathway quantification methods and similarity measures
SIGN_Ensemble_SimCal(ExpList, RefClassID, TestClassID, GeneID, PathwaySets)
ExpList |
List of expression matrices for different groups of samples used in the centroid classification scheme |
RefClassID |
Names of the matrices in the ExpList |
TestClassID |
ID of a matrix in ExpList to be used as test set |
GeneID |
Parameter to determine if gene annotations are provided as Symbols or EntrezIDs |
PathwaySets |
List of pathways containing gene annotations for each pathway |
List of similarities identified in both gene and pathway level
Similarities_Wrapper is a wrapper to identify similarities between the expression of genes in the target sample and the reference expression matrix
Similarities_Wrapper(ExpMat_Test, ExpMat_Ref, GeneVec, PathwaySet, RefID, TestClassIter, SampleIter)
ExpMat_Test |
Expression matrix of test samples |
ExpMat_Ref |
Expression matrix of reference samples |
GeneVec |
Vector of gene names |
PathwaySet |
List of pathways containing gene annotations for each pathway |
RefID |
Class of the reference set |
TestClassIter |
Class of the test set (if it is the same as the reference set, the target test sample will be removed from the reference set) |
SampleIter |
Target test sample in ExpMat_Test to be used for comparison with ExpMat_Ref |
List of similarities between the target sample and the expression matrix of reference samples
SimSummary_2Class is a function for calculating similarity between two sets of samples
SimSummary_2Class(SimMat1, SimMat2)
SimMat1 |
Matrix of similarity of the target samples with the 1st reference matrix |
SimMat2 |
Matrix of similarity of the target samples with the 2nd reference matrix |
Matrix of similarities of samples
Survival_Stats is a function for building a Cox model using all the features and a separate Cox model for each feature
Survival_Stats(ScoreMat, TimeVec, EventVec)
ScoreMat |
Matrix of feature values used for survival prediction |
TimeVec |
Vector of time to death of samples (patients) |
EventVec |
Vector of events for the samples (patients) as being dead or censored |
A list containing the summary of a Cox model using all of the features and of separate Cox models for each feature
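Fitting one multivariable Cox model plus one model per feature can be sketched with the `survival` package, which is distributed with R as a recommended package. The helper `cox_summaries`, the output element names, and the simulated data are assumptions for illustration, not the package's implementation.

```r
library(survival)  # recommended package distributed with R

# Hypothetical helper, not part of SIGN: one Cox model with all features
# plus one Cox model per feature. Assumes ScoreMat has column names.
cox_summaries <- function(ScoreMat, TimeVec, EventVec) {
  features <- colnames(ScoreMat)
  fit_one <- function(cols) {
    dat <- data.frame(ScoreMat[, cols, drop = FALSE],
                      surv_time = TimeVec, surv_event = EventVec)
    # `.` expands to every column except those used in Surv()
    summary(coxph(Surv(surv_time, surv_event) ~ ., data = dat))
  }
  list(AllFeatures = fit_one(features),
       PerFeature  = setNames(lapply(features, fit_one), features))
}

# Simulated data for demonstration only
set.seed(7)
scores <- matrix(rnorm(120), ncol = 3,
                 dimnames = list(NULL, c("F1", "F2", "F3")))
time   <- rexp(40, rate = 0.1)
event  <- rbinom(40, 1, 0.7)
stats  <- cox_summaries(scores, time, event)
names(stats$PerFeature)
```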
SurvivalStat_PostProcess is a function to extract summary statistics of the built Cox model
SurvivalStat_PostProcess(StatList)
StatList |
Summary lists of the cox models built using all the |
A list including Cindex, Cindex_std and LogTest_pval
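For orientation, a `summary.coxph` object from the `survival` package exposes a concordance estimate with its standard error and a likelihood ratio test. The sketch below pulls those three values into a list using the field names documented above; the mapping of `Cindex`, `Cindex_std`, and `LogTest_pval` onto these components is an assumption, and the data are simulated.

```r
library(survival)  # recommended package distributed with R

# Simulated data for demonstration only
set.seed(7)
dat <- data.frame(score = rnorm(40),
                  time  = rexp(40, rate = 0.1),
                  event = rbinom(40, 1, 0.7))
fit_summary <- summary(coxph(Surv(time, event) ~ score, data = dat))

# Field names mirror the documented return value; the mapping to
# summary.coxph components is an assumption.
cox_stats <- list(
  Cindex       = unname(fit_summary$concordance[1]),        # concordance index
  Cindex_std   = unname(fit_summary$concordance[2]),        # its standard error
  LogTest_pval = unname(fit_summary$logtest["pvalue"])      # likelihood ratio test
)
cox_stats
```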
TSC is a function to calculate transcriptional similarity coefficient between two biological pathways
TSC(PathwayExp1, PathwayExp2)
PathwayExp1 |
Expression matrix of genes within the chosen pathway in the 1st set of samples |
PathwayExp2 |
Expression matrix of genes within the chosen pathway in the 2nd set of samples |
Transcriptional similarity coefficient
Pathway1_ExpMat <- matrix(runif(100,0,10), ncol = 10)
Pathway2_ExpMat <- matrix(runif(100,0,10), ncol = 10)
TSC(Pathway1_ExpMat, Pathway2_ExpMat)