Mens X Machina

Our software

Recent News






Projects



CAUSAL PATH

Next Generation Causal Analysis inspired by the induction of biological pathways from cytometry data

Details
HUNT

Lorem ipsum d

Details
STATEGRA

Statistical methods and tools for the integrative analysis of omics data

Details
EPILOGEAS – ARISTEIA II

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod

Details

Our Team

Theo. Giakoumakis

Ph.D

Ph.D. student,
Department of Computer Science, University of Crete, Hellas

J. Xanthopoulos

Msc

Postgraduate student, Department of Computer Science, University of Crete, Hellas

Konstantina Biza

Msc

Postgraduate student, Department of Computer Science, University of Crete, Hellas

Stefanos Fafalios

Msc

Postgraduate student, Department of Computer Science, University of Crete, Hellas

Myrto Krana

Msc

Postgraduate student, Department of Computer Science, University of Crete, Hellas

Ioulia Karagiannaki

Msc

Postgraduate student, Department of Computer Science, University of Crete, Hellas

Publications

2019

  • [DOI] E. Ewing, L. Kular, S. J. Fernandes, N. Karathanasis, V. Lagani, S. Ruhrmann, I. Tsamardinos, J. Tegner, F. Piehl, D. Gomez-Cabrero, and M. Jagodic, “Combining evidence from four immune cell types identifies DNA methylation patterns that implicate functionally distinct pathways during Multiple Sclerosis progression,” EBioMedicine, 2019.
    [Summary]

    Abstract Background Multiple Sclerosis (MS) is a chronic inflammatory disease and a leading cause of progressive neurological disability among young adults. DNA methylation, which intersects genes and environment to control cellular functions on a molecular level, may provide insights into MS pathogenesis. Methods We measured DNA methylation in CD4+ T cells (n = 31), CD8+ T cells (n = 28), CD14+ monocytes (n = 35) and CD19+ B cells (n = 27) from relapsing-remitting (RRMS), secondary progressive (SPMS) patients and healthy controls (HC) using Infinium HumanMethylation450 arrays. Monocyte (n = 25) and whole blood (n = 275) cohorts were used for validations. Findings B cells from MS patients displayed most significant differentially methylated positions (DMPs), followed by monocytes, while only few DMPs were detected in T cells. We implemented a non-parametric combination framework (omicsNPC) to increase discovery power by combining evidence from all four cell types. Identified shared DMPs co-localized at MS risk loci and clustered into distinct groups. Functional exploration of changes discriminating RRMS and SPMS from HC implicated lymphocyte signaling, T cell activation and migration. SPMS-specific changes, on the other hand, implicated myeloid cell functions and metabolism. Interestingly, neuronal and neurodegenerative genes and pathways were also specifically enriched in the SPMS cluster. Interpretation We utilized a statistical framework (omicsNPC) that combines multiple layers of evidence to identify DNA methylation changes that provide new insights into MS pathogenesis in general, and disease progression, in particular. Fund This work was supported by the Swedish Research Council, Stockholm County Council, AstraZeneca, European Research Council, Karolinska Institutet and Margaretha af Ugglas Foundation.

  • [DOI] I. Ferreirós-Vidal, T. Carroll, T. Zhang, V. Lagani, R. N. Ramirez, E. Ing-Simmons, A. Garcia, L. Cooper, Z. Liang, G. Papoutsoglou, G. Dharmalingam, Y. Guo, S. Tarazona, S. J. Fernandes, P. Noori, G. Silberberg, A. G. Fisher, I. Tsamardinos, A. Mortazavi, B. Lenhard, A. Conesa, J. Tegner, M. Merkenschlager, and D. Gomez-Cabrero, “Feedforward regulation of Myc coordinates lineage-specific with housekeeping gene expression during B cell progenitor cell differentiation,” PLOS Biology, vol. 17, iss. 4, pp. 1-28, 2019.
    [Summary]

    Author summary The human body is made from billions of cells comprizing many specialized cell types. All of these cells ultimately come from a single fertilized oocyte in a process that has two key features: proliferation, which expands cell numbers, and differentiation, which diversifies cell types. Here, we have examined the transition from proliferation to differentiation using B lymphocytes as an example. We find that the transition from proliferation to differentiation involves changes in the expression of genes, which can be categorized into cell-type–specific genes and broadly expressed “housekeeping” genes. The expression of many housekeeping genes is controlled by the gene regulatory factor Myc, whereas the expression of many B lymphocyte–specific genes is controlled by the Ikaros family of gene regulatory proteins. Myc is repressed by Ikaros, which means that changes in housekeeping and tissue-specific gene expression are coordinated during the transition from proliferation to differentiation.

  • [DOI] Y. Pantazis and I. Tsamardinos, “A unified approach for sparse dynamical system inference from temporal measurements,” , 2019.
    [Summary]

    Temporal variations in biological systems and more generally in natural sciences are typically modeled as a set of ordinary, partial or stochastic differential or difference equations. Algorithms for learning the structure and the parameters of a dynamical system are distinguished based on whether time is discrete or continuous, observations are time-series or time-course and whether the system is deterministic or stochastic, however, there is no approach able to handle the various types of dynamical systems simultaneously.In this paper, we present a unified approach to infer both the structure and the parameters of non-linear dynamical systems of any type under the restriction of being linear with respect to the unknown parameters. Our approach, which is named Unified Sparse Dynamics Learning (USDL), constitutes of two steps. First, an atemporal system of equations is derived through the application of the weak formulation. Then, assuming a sparse representation for the dynamical system, we show that the inference problem can be expressed as a sparse signal recovery problem, allowing the application of an extensive body of algorithms and theoretical results. Results on simulated data demonstrate the efficacy and superiority of the USDL algorithm under multiple interventions and/or stochasticity. Additionally, USDL’s accuracy significantly correlates with theoretical metrics such as the exact recovery coefficient. On real single-cell data, the proposed approach is able to induce high-confidence subgraphs of the signaling pathway.Source code is available at Bioinformatics online. USDL algorithm has been also integrated in SCENERY (http://scenery.csd.uoc.gr/); an online tool for single-cell mass cytometry analytics.Supplementary data are available at Bioinformatics online.

  • [PDF] “Forward-Backward Selection with Early Dropping,” Journal of Machine Learning Research, vol. 20, pp. 1-39, 2019.
    [Summary]

    Forward-backward selection is one of the most basic and commonly-used feature selection algorithms available. It is also general and conceptually applicable to many different types of data. In this paper, we propose a heuristic that significantly improves its running time, while preserving predictive performance. The idea is to temporarily discard the variables that are conditionally independent with the outcome given the selected variable set. Depending on how those variables are reconsidered and reintroduced, this heuristic gives rise to a family of algorithms with increasingly stronger theoretical guarantees. In distributions that can be faithfully represented by Bayesian networks or maximal ancestral graphs, members of this algorithmic family are able to correctly identify the Markov blanket in the sample limit. In experiments we show that the proposed heuristic increases computational efficiency by about 1-2 orders of magnitude, while selecting fewer or the same number of variables and retaining predictive performance. Furthermore, we show that the proposed algorithm and feature selection with LASSO perform similarly when restricted to select the same number of variables, making the proposed algorithm an attractive alternative for problems where no (efficient) algorithm for LASSO exists

2018

  • [DOI] K. Tsirlis, V. Lagani, S. Triantafillou, and I. Tsamardinos, “On scoring Maximal Ancestral Graphs with the Max\textendashMin Hill Climbing algorithm,” International Journal of Approximate Reasoning, vol. 102, pp. 74-85, 2018.
    [Summary]

    We consider the problem of causal structure learning in presence of latent confounders. We propose a hybrid method, MAG Max–Min Hill-Climbing (M3HC) that takes as input a data set of continuous variables, assumed to follow a multivariate Gaussian distribution, and outputs the best fitting maximal ancestral graph. M3HC builds upon a previously proposed method, namely GSMAG, by introducing a constraint-based first phase that greatly reduces the space of structures to investigate. On a large scale experimentation we show that the proposed algorithm greatly improves on GSMAG in all comparisons, and over a set of known networks from the literature it compares positively against FCI and cFCI as well as competitively against GFCI, three well known constraint-based approaches for causal-network reconstruction in presence of latent confounders.

  • [DOI] M. Tsagris, “Bayesian Network Learning with the PC Algorithm: An Improved and Correct Variation,” Applied Artificial Intelligence , vol. 33, iss. 2, pp. 101-123, 2018.
    [Summary]

    PC is a prototypical constraint-based algorithm for learning Bayesian networks, a special case of directed acyclic graphs. An existing variant of it, in the R package pcalg, was developed to make the skeleton phase order independent. In return, it has notably increased execution time. In this paper, we clarify that the PC algorithm the skeleton phase of PC is indeed order independent. The modification we propose outperforms pcalg’s variant of the PC in terms of returning correct networks of better quality as is less prone to errors and in some cases it is a lot more computationally cheaper. In addition, we show that pcalg’s variant does not return valid acyclic graphs.

  • [DOI] I. Tsamardinos, E. Greasidou, and G. Borboudakis, “Bootstrapping the out-of-sample predictions for efficient and accurate cross-validation,” Machine Learning, vol. 107, iss. 12, pp. 1895-1922, 2018.
    [Summary]

    Cross-Validation (CV), and out-of-sample performance-estimation protocols in general, are often employed both for (a) selecting the optimal combination of algorithms and values of hyper-parameters (called a configuration) for producing the final predictive model, and (b) estimating the predictive performance of the final model. However, the cross-validated performance of the best configuration is optimistically biased. We present an efficient bootstrap method that corrects for the bias, called Bootstrap Bias Corrected CV (BBC-CV). BBC-CV’s main idea is to bootstrap the whole process of selecting the best-performing configuration on the out-of-sample predictions of each configuration, without additional training of models. In comparison to the alternatives, namely the nested cross-validation (Varma and Simon in BMC Bioinform 7(1):91, 2006) and a method by Tibshirani and Tibshirani (Ann Appl Stat 822–829, 2009), BBC-CV is computationally more efficient, has smaller variance and bias, and is applicable to any metric of performance (accuracy, AUC, concordance index, mean squared error). Subsequently, we employ again the idea of bootstrapping the out-of-sample predictions to speed up the CV process. Specifically, using a bootstrap-based statistical criterion we stop training of models on new folds of inferior (with high probability) configurations. We name the method Bootstrap Bias Corrected with Dropping CV (BBCD-CV) that is both efficient and provides accurate performance estimates.

  • [DOI] I. Tsamardinos, G. Borboudakis, P. Katsogridakis, P. Pratikakis, and V. Christophides, “A greedy feature selection algorithm for Big Data of high dimensionality,” Machine Learning, 2018.
    [Summary]

    We present the Parallel, Forward–Backward with Pruning (PFBP) algorithm for feature selection (FS) for Big Data of high dimensionality. PFBP partitions the data matrix both in terms of rows as well as columns. By employing the concepts of p-values of conditional independence tests and meta-analysis techniques, PFBP relies only on computations local to a partition while minimizing communication costs, thus massively parallelizing computations. Similar techniques for combining local computations are also employed to create the final predictive model. PFBP employs asymptotically sound heuristics to make early, approximate decisions, such as Early Dropping of features from consideration in subsequent iterations, Early Stopping of consideration of features within the same iteration, or Early Return of the winner in each iteration. PFBP provides asymptotic guarantees of optimality for data distributions faithfully representable by a causal network (Bayesian network or maximal ancestral graph). Empirical analysis confirms a super-linear speedup of the algorithm with increasing sample size, linear scalability with respect to the number of features and processing cores. An extensive comparative evaluation also demonstrates the effectiveness of PFBP against other algorithms in its class. The heuristics presented are general and could potentially be employed to other greedy-type of FS algorithms. An application on simulated Single Nucleotide Polymorphism (SNP) data with 500K samples is provided as a use case.

  • [DOI] M. Adamou, G. Antoniou, E. Greasidou, V. Lagani, P. Charonyktakis, I. Tsamardinos, and M. Doyle, “Toward Automatic Risk Assessment to Support Suicide Prevention,” Crisis, pp. 1-8, 2018.
    [Summary]

    Background: Suicide has been considered an important public health issue for years and is one of the main causes of death worldwide. Despite prevention strategies being applied, the rate of suicide has not changed substantially over the past decades. Suicide risk has proven extremely difficult to assess for medical specialists, and traditional methodologies deployed have been ineffective. Advances in machine learning make it possible to attempt to predict suicide with the analysis of relevant data aiming to inform clinical practice. Aims: We aimed to (a) test our artificial intelligence based, referral-centric methodology in the context of the National Health Service (NHS), (b) determine whether statistically relevant results can be derived from data related to previous suicides, and (c) develop ideas for various exploitation strategies. Method: The analysis used data of patients who died by suicide in the period 2013–2016 including both structured data and free-text medical notes, necessitating the deployment of state-of-the-art machine learning and text mining methods. Limitations: Sample size is a limiting factor for this study, along with the absence of non-suicide cases. Specific analytical solutions were adopted for addressing both issues. Results and Conclusion: The results of this pilot study indicate that machine learning shows promise for predicting within a specified period which people are most at risk of taking their own life at the time of referral to a mental health service.

  • [DOI] K. Lakiotaki, N. Vorniotakis, M. Tsagris, G. Georgakopoulos, and I. Tsamardinos, “BioDataome: a collection of uniformly preprocessed and automatically annotated datasets for data-driven biology,” Database, iss. bay011, 2018.
    [Summary]

    Biotechnology revolution generates a plethora of omics data with an exponential growth pace. Therefore, biological data mining demands automatic, ‘high quality’ curation efforts to organize biomedical knowledge into online databases. BioDataome is a database of uniformly preprocessed and disease-annotated omics data with the aim to promote and accelerate the reuse of public data. We followed the same preprocessing pipeline for each biological mart (microarray gene expression, RNA-Seq gene expression and DNA methylation) to produce ready for downstream analysis datasets and automatically annotated them with disease-ontology terms. We also designate datasets that share common samples and automatically discover control samples in case-control studies. Currently, BioDataome includes ∼5600 datasets, ∼260 000 samples spanning ∼500 diseases and can be easily used in large-scale massive experiments and meta-analysis. All datasets are publicly available for querying and downloading via BioDataome web application. We demonstrate BioDataome’s utility by presenting exploratory data analysis examples. We have also developed BioDataome R package found in: https://github.com/mensxmachina/BioDataome/. Database URL: http://dataome.mensxmachina.org/

  • [DOI] M. Markaki, I. Tsamardinos, A. Langhammer, V. Lagani, K. Hveem, and O. D. Røe, “A Validated Clinical Risk Prediction Model for Lung Cancer in Smokers of All Ages and Exposure Types: A HUNT Study.,” EBioMedicine, 2018.
    [Summary]

    Lung cancer causes >1·6 million deaths annually, with early diagnosis being paramount to effective treatment. Here we present a validated risk assessment model for lung cancer screening. The prospective HUNT2 population study in Norway examined 65,237 people aged >20years in 1995-97. After a median of 15·2years, 583 lung cancer cases had been diagnosed; 552 (94·7%) ever-smokers and 31 (5·3%) never-smokers. We performed multivariable analyses of 36 candidate risk predictors, using multiple imputation of missing data and backwards feature selection with Cox regression. The resulting model was validated in an independent Norwegian prospective dataset of 45,341 ever-smokers, in which 675 lung cancers had been diagnosed after a median follow-up of 11·6years. Our final HUNT Lung Cancer Model included age, pack-years, smoking intensity, years since smoking cessation, body mass index, daily cough, and hours of daily indoors exposure to smoke. External validation showed a 0·879 concordance index (95% CI 0·866-0·891) with an area under the curve of 0·87 (95% CI 0·85-0·89) within 6years. Only 22% of ever-smokers would need screening to identify 81·85% of all lung cancers within 6years. Our model of seven variables is simple, accurate, and useful for screening selection.

  • M. Panagopoulou, M. Karaglani, I. Balgkouranidou, V. Vasilakakis, E. Biziota, T. Koukaki, E. Karamitrousis, E. Nena, I. Tsamardinos, G. Kolios, E. Lianidou, S. Kakolyris, and E. Chatzaki, “Circulating cell free DNA in Breast cancer: size profiling, levels and methylation patterns lead to prognostic and predictive classifiers,” (to appear) Oncogene , 2018.
    [Summary]

    Blood circulating cell-free DNA (ccfDNA) is a suggested biosource of valuable clinical information for cancer, meeting the need for a minimally-invasive advancement in the route of precision medicine. In this paper, we evaluated the prognostic and predictive potential of ccfDNA parameters in early and advanced breast cancer. Groups consisted of 150 and 16 breast cancer patients under adjuvant and neoadjuvant therapy respectively, 34 patients with metastatic disease and 35 healthy volunteers. Direct quantification of ccfDNA in plasma revealed elevated concentrations correlated to the incidence of death, shorter PFS, and non-response to pharmacotherapy in the metastatic but not in the other groups. The methylation status of a panel of cancer-related genes chosen based on previous expression and epigenetic data (KLK10, SOX17, WNT5A, MSH2, GATA3) was assessed by quantitative methylation-specific PCR. All but the GATA3 gene was more frequently methylated in all the patient groups than in healthy individuals (all p < 0.05). The methylation of WNT5A was statistically significantly correlated to greater tumor size and poor prognosis characteristics and in advanced stage disease with shorter OS. In the metastatic group, also SOX17 methylation was significantly correlated to the incidence of death, shorter PFS, and OS. KLK10 methylation was significantly correlated to unfavorable clinicopathological characteristics and relapse, whereas in the adjuvant group to shorter DFI. Methylation of at least 3 or 4 genes was significantly correlated to shorter OS and no pharmacotherapy response, respectively. Classification analysis by a fully automated, machine learning software produced a single-parametric linear model using ccfDNA plasma concentration values, with great discriminating power to predict response to chemotherapy (AUC 0.803, 95% CI [0.606, 1.000]) in the metastatic group. Two more multi-parametric signatures were produced for the metastatic group, predicting survival and disease outcome. Finally, a multiple logistic regression model was constructed, discriminating between patient groups and healthy individuals. Overall, ccfDNA emerged as a highly potent predictive classifier in metastatic breast cancer. Upon prospective clinical evaluation, all the signatures produced could aid accurate prognosis.

  • [DOI] M. Tsagris, V. Lagani, and I. Tsamardinos, ” Feature selection for high-dimensional temporal data,” BMC Bioinformatics, iss. 1, 2018.
    [Summary]

    Feature selection is commonly employed for identifying collectively-predictive biomarkers and biosignatures; it facilitates the construction of small statistical models that are easier to verify, visualize, and comprehend while providing insight to the human expert. In this work, we extend established constrained-based, feature-selection methods to high-dimensional “omics” temporal data, where the number of measurements is orders of magnitude larger than the sample size. The extension required the development of conditional independence tests for temporal and/or static variables conditioned on a set of temporal variables. The algorithm is able to return multiple, equivalent solution subsets of variables, scale to tens of thousands of features, and outperform or be on par with existing methods depending on the analysis task specifics. The use of this algorithm is suggested for variable selection with high-dimensional temporal data.

  • [DOI] M. Tsagris, G. Borboudakis, V. Lagani, and I. Tsamardinos, “Constraint-based causal discovery with mixed data,” International Journal of Data Science and Analytics, 2018.
    [Summary]

    We address the problem of constraint-based causal discovery with mixed data types, such as (but not limited to) continuous, binary, multinomial and or-dinal variables. We use likelihood-ratio tests based on appropriate regression models, and show how to derive symmetric conditional independence tests. Such tests can then be directly used by existing constraint-based methods with mixed data, such as the PC and FCI algorithms for learning Bayesian networks and maximal ancestral graphs respectively. In experiments on simu-lated Bayesian networks, we employ the PC algorithm with different conditional independence tests for mixed data, and show that the proposed approach outperforms alternatives in terms of learning accuracy.

  • [DOI] M. Adamou, G. Antoniou, E. Greassidou, V. Lagani, P. Charonyktakis, and I. Tsamardinos, “Mining Free-Text Medical Notes for Suicide Risk Assessment.” 2018.
    [Summary]

    Suicide has been considered as an important public health issue for a very long time, and is one of the main causes of death worldwide. Despite suicide prevention strategies being applied, the rate of suicide has not changed substantially over the past decades. Advances in machine learning make it possible to attempt to predict suicide based on the analysis of relevant data to inform clinical practice. This paper reports on findings from the analysis of data of patients who died by suicide in the period 2013-2016 and made use of both structured data and free-text medical notes. We focus on examining various text-mining approaches to support risk assessment. The results show that using advance machine learning and text-mining techniques, it is possible to predict within a specified period which people are most at risk of taking their own life at the time of referral to a mental health service.

2017

  • [DOI] V. Lagani, G. Athineou, A. Farcomeni, M. Tsagris, and I. Tsamardinos, “Feature Selection with the R Package MXM: Discovering Statistically Equivalent Feature Subsets,” Journal of Statistical Software, vol. 80, iss. 7, 2017.
    [Summary]

    The statistically equivalent signature (SES) algorithm is a method for feature selection inspired by the principles of constraint-based learning of Bayesian networks. Most of the currently available feature selection methods return only a single subset of features, supposedly the one with the highest predictive power. We argue that in several domains multiple subsets can achieve close to maximal predictive accuracy, and that arbitrarily providing only one has several drawbacks. The SES method attempts to identify multiple, predictive feature subsets whose performances are statistically equivalent. In that respect the SES algorithm subsumes and extends previous feature selection algorithms, like the max-min parent children algorithm. The SES algorithm is implemented in an homonym function included in the R package MXM, standing for mens ex machina, meaning ‘mind from the machine’ in Latin. The MXM implementation of SES handles several data analysis tasks, namely classification, regression and survival analysis. In this paper we present the SES algorithm, its implementation, and provide examples of use of the SES function in R. Furthermore, we analyze three publicly available data sets to illustrate the equivalence of the signatures retrieved by SES and to contrast SES against the state-of-the-art feature selection method LASSO. Our results provide initial evidence that the two methods perform comparably well in terms of predictive accuracy and that multiple, equally predictive signatures are actually present in real world data.

  • [DOI] G. Orfanoudaki, M. Markaki, K. Chatzi, I. Tsamardinos, and A. Economou, “MatureP: prediction of secreted proteins with exclusive information from their mature regions,” Scientific Reports, vol. 7, iss. 1, p. 3263–, 2017.
    [Summary]

    More than a third of the cellular proteome is non-cytoplasmic. Most secretory proteins use the Sec system for export and are targeted to membranes using signal peptides and mature domains. To specifically analyze bacterial mature domain features, we developed MatureP, a classifier that predicts secretory sequences through features exclusively computed from their mature domains. MatureP was trained using Just Add Data Bio, an automated machine learning tool. Mature domains are predicted efficiently with ~92% success, as measured by the Area Under the Receiver Operating Characteristic Curve (AUC). Predictions were validated using experimental datasets of mutated secretory proteins. The features selected by MatureP reveal prominent differences in amino acid content between secreted and cytoplasmic proteins. Amino-terminal mature domain sequences have enhanced disorder, more hydroxyl and polar residues and less hydrophobics. Cytoplasmic proteins have prominent amino-terminal hydrophobic stretches and charged regions downstream. Presumably, secretory mature domains comprise a distinct protein class. They balance properties that promote the necessary flexibility required for the maintenance of non-folded states during targeting and secretion with the ability of post-secretion folding. These findings provide novel insight in protein trafficking, sorting and folding mechanisms and may benefit protein secretion biotechnology.

  • [DOI] G. Papoutsoglou, G. Athineou, V. Lagani, I. Xanthopoulos, A. Schmidt, S. éliás, J. Tegnér, and I. Tsamardinos, “SCENERY: a web application for (causal) network reconstruction from cytometry data,” Nucleic Acids Research, vol. 37, p. D412–D416, 2017.
    [Summary]

    Flow and mass cytometry technologies can probe proteins as biological markers in thousands of individual cells simultaneously, providing unprecedented opportunities for reconstructing networks of protein interactions through machine learning algorithms. The network reconstruction (NR) problem has been well-studied by the machine learning community. However, the potentials of available methods remain largely unknown to the cytometry community, mainly due to their intrinsic complexity and the lack of comprehensive, powerful and easy-to-use NR software implementations specific for cytometry data. To bridge this gap, we present Single CEll NEtwork Reconstruction sYstem (SCENERY), a web server featuring several standard and advanced cytometry data analysis methods coupled with NR algorithms in a user-friendly, on-line environment. In SCENERY, users may upload their data and set their own study design. The server offers several data analysis options categorized into three classes of methods: data (pre)processing, statistical analysis and NR. The server also provides interactive visualization and download of results as ready-to-publish images or multimedia reports. Its core is modular and based on the widely-used and robust R platform allowing power users to extend its functionalities by submitting their own NR methods. SCENERY is available at scenery.csd.uoc.gr or http://mensxmachina.org/en/software/.

  • [DOI] K. Siomos, E. Papadaki, I. Tsamardinos, K. Kerkentzes, M. Koygioylis, and C. Trakatelli, “Prothrombotic and Endothelial Inflammatory Markers in Greek Patients with Type 2 Diabetes Compared to Non-Diabetics,” Endocrinology & Metabolic Syndrome, iss. 1, 2017.
    [Summary]

    Objective: To evaluate specific factors of coagulation and endothelial inflammatory markers namely, thrombomodulin, soluble receptor of the protein C (sEPCR), factor VIII, plasminogen activator inhibitor 1, Von Willebrandt factor, fibrinogen, fibrinogen dimers (d-dimers), high sensitivity C-reactive protein and homocysteine in a subset of Greek subjects with and without Type 2 (T2) Diabetes. Design: 84 subjects, of which 44 patients with T2 diabetes, were included in the randomized comparative prospective cross sectional study. The subjects were split into a Τ2 diabetics group and a group of healthy controls of similar age, anthropometric profiles and similar gender distribution. Results: A total of 47 variables and biomarkers together with indicators for metabolic profiles, clinical history, as well as detailed anthropometric profiles and traditional risk factors, were evaluated. Dipeptidyl peptidase-4 (DPP4), Insulin, use of Sulfonylurea, high HBA1c and glucose levels, were clearly statistically differentiated in the two groups, while no other biomarkers including the new potential indicators were found to be different. High values of thrombomodulin and homocysteine were correlated with a rise in creatinine and thus seem to affect renal function in the diabetic patients group while in the non-diabetics group the correlations are different with sEPCR having a relative strong negative correlation in renal function as measured with The Modification of Diet in Renal Disease, in agreement with the latest international findings. Conclusions: The presence of T2 diabetes in conjunction with age clearly correlates with problems in renal function, thrombomodulin and homocysteine could serve as indicators for renal damage in diabetics but not in healthy individuals. sEPCR on the other hand could be a potential generic indicator for renal damage. Thrombomodulin and sEPCR as prothombotic agents, did not show any indication that they can be utilised as markers for the prevention and/or treatment of thrombotic complications in diabetic patients.

  • [DOI] S. Triantafillou, V. Lagani, C. Heinze-Deml, A. Schmidt, J. Tegner, and I. Tsamardinos, “Predicting Causal Relationships from Biological Data: Applying Automated Casual Discovery on Mass Cytometry Data of Human Immune Cells,” Triantafillou S, Lagani V, Heinze-Deml C, Schmidt A, Tegner J, Tsamardinos I. Predicting Causal Relationships from Biological Data: Applying Automated Causal Discovery on Mass Cytometry Data of Human Immune Cells. Scientific Reports. 2017;7:12724. doi:10., 2017.
    [Summary]

    Learning the causal relationships that define a molecular system allows us to predict how the system will respond to different interventions. Distinguishing causality from mere association typically requires randomized experiments. Methods for automated causal discovery from limited experiments exist, but have so far rarely been tested in systems biology applications. In this work, we apply state-of-the art causal discovery methods on a large collection of public mass cytometry data sets, measuring intra-cellular signaling proteins of the human immune system and their response to several perturbations. We show how different experimental conditions can be used to facilitate causal discovery, and apply two fundamental methods that produce context-specific causal predictions. Causal predictions were reproducible across independent data sets from two different studies, but often disagree with the KEGG pathway databases. Within this context, we discuss the caveats we need to overcome for automated causal discovery to become a part of the routine data analysis in systems biology.

Read more

About Us

Mens Ex Machina, Mind from the Machine or “Ο από Μηχανής Νους” paraphrases the latin expression Deus Ex Machina, God from the Machine. The name was suggested by Lucy Sofiadou, Prof. Tsamardinos’ wife.

We are a research group, founded in October 2006, led by Associate Professor Ioannis Tsamardinos, interested in Artificial Intelligence, Machine Learning, and Biomedical Informatics and affiliated with the Computer Science Department of University of Crete. The group’s aims are to progress science and disseminate knowledge via educational activities and computer tools. Our group is involved in

Research:

Theoretical, algorithmic, and applied research in all of the above areas; we are also involved in interdisciplinary collaborations with biologists, physicians and practitioners from other fields.

Education:

Educational activities, such as teaching university courses, tutorials, summers schools, as well as supervising undergraduate dissertations, masters projects, and Ph.D. theses.

Systems and Software:

Implementation of tools, systems, and code libraries to aid the dissemination of the research results.Funding is provided from through the University of Crete, often originating from European and International research grants.

Current research activities include but not limited to the following:

        • Causal discovery methods and the induction of causal models from observational studies. Specifically, we have recently introduced the problem of Integrative Causal Analysis (INCA).

        • Feature selection (a.k.a. variable selection) for classification and regression.

        • Induction of graphical models, such as Bayesian Networks from data.

        • Analysis of biomedical data and applications of AI and Machine Learning methods to induce new biomedical knowledge.

        • Activity recognition in Ambient Intelligent environments.

       

Ioannis Tsamardinos
Associate Professor, Department of Computer Science, University of Crete