Programme(s) to which this project applies: |
☑ MPhil/PhD | ☒ MRes[Med] | ☒ URIS |
The project will first produce descriptive statistics of prevalent anti-cancer drug prescriptions for patients after a diagnosis of hepatocellular carcinoma (HCC). Network analysis will be used to show multi-drug prescription patterns in major patient subgroups. Both statistical inference and causal machine learning models will be used to evaluate the individualized treatment effectiveness of anti-cancer drugs.
Traditional machine learning focuses on predicting outcomes by minimizing the discrepancy between predicted and true values. In contrast, the objective of causal machine learning is to estimate the causal effects of interventions or treatments, specifically by evaluating the incremental change in outcome that results from a particular decision. Causal machine learning can provide personalized estimates of treatment effects for subpopulations, or even for individual patients, which is essential for tailoring treatments to each patient. A growing number of methods have been applied to estimating treatment effects. For example, one study assessed the postoperative results of patients undergoing laparoscopic liver resection (LLR) and open liver resection (OLR), using propensity score matching and inverse probability weighting (IPW) to confirm the effect of parenchyma-sparing LLR for lesions in the right posterior segments on surgical outcomes. Yoon et al. proposed a novel method, GANITE, which uses a generative adversarial network (GAN) to estimate individualized treatment effects. Another study investigated the efficacy of β-blockers in patients with heart failure and reduced left ventricular ejection fraction (LVEF) using a variational autoencoder and hierarchical clustering, applied to pooled data from nine randomized trials. The results identified patient subgroups in sinus rhythm with suboptimal efficacy from β-blockers and, importantly, a subgroup of patients with atrial fibrillation in whom β-blockers significantly reduced the risk of death despite an overall neutral response.
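In potential-outcomes notation, the individualized (conditional average) treatment effect discussed above is commonly written as below. The symbols Y(1), Y(0), X and τ are standard notation introduced here for illustration and do not come from the project description.

```latex
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]
```

Here Y(1) and Y(0) denote a patient's potential outcomes with and without the treatment, and X denotes the patient's covariates. Only one of the two potential outcomes is observed for each patient, so τ(x) has to be estimated under assumptions such as no unmeasured confounding.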
Methods:
Causal Machine Learning
Traditional machine learning primarily seeks to find relationships in data and to predict outcomes. For example, it may show that a certain lifestyle is associated with the risk of a disease; however, such an association does not imply causation. Causal machine learning pays more attention to causality: it models an intervention (a decision) in order to understand how a change in one variable directly affects another. Causal learning is commonly used to predict therapeutic outcomes such as toxicity and efficacy, which aids drug safety evaluation[44]. Several mainstream machine learning packages for causal inference are available, such as CausalML, DoWhy, EconML, and CausalNex. Between them, these packages provide a variety of causal inference methods, including propensity score matching and double machine learning.
Propensity score matching
Propensity score methods can be used in observational studies to reduce confounding, a major source of bias. The propensity score is the probability of receiving a treatment given the observed covariates. Propensity score matching (PSM) is a statistical matching technique that pairs treated and untreated subjects with similar propensity scores, so that the factors that predict receiving the treatment are taken into account when evaluating the impact of a policy, treatment, or other intervention.
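The matching idea described above can be sketched as follows. This is a minimal illustration on simulated data, assuming a single confounder, a hand-written logistic regression for the propensity score, and 1:1 nearest-neighbour matching with replacement. All function names and the data-generating process are hypothetical; this is not the project's actual pipeline.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, true_effect, rng):
    """Simulated cohort (an assumption): x confounds treatment t and outcome y."""
    xs, ts, ys = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        t = 1 if rng.random() < sigmoid(x) else 0  # higher x -> more likely treated
        y = true_effect * t + 3.0 * x + rng.gauss(0.0, 1.0)
        xs.append(x)
        ts.append(t)
        ys.append(y)
    return xs, ts, ys


def fit_propensity(xs, ts, lr=0.5, epochs=300):
    """Fit P(t=1 | x) = sigmoid(a + b*x) by batch gradient descent."""
    a = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for x, t in zip(xs, ts):
            err = sigmoid(a + b * x) - t
            grad_a += err
            grad_b += err * x
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return [sigmoid(a + b * x) for x in xs]


def matched_att(ps, ts, ys):
    """Average treatment effect on the treated from 1:1 matching with replacement."""
    controls = [i for i, t in enumerate(ts) if t == 0]
    diffs = []
    for i, t in enumerate(ts):
        if t == 1:
            # Closest untreated subject on the propensity score.
            j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
            diffs.append(ys[i] - ys[j])
    return sum(diffs) / len(diffs)


def naive_difference(ts, ys):
    """Unadjusted difference in mean outcome, biased by confounding."""
    treated = [y for t, y in zip(ts, ys) if t == 1]
    control = [y for t, y in zip(ts, ys) if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    rng = random.Random(0)
    xs, ts, ys = simulate(1000, true_effect=2.0, rng=rng)
    ps = fit_propensity(xs, ts)
    print("naive difference:", naive_difference(ts, ys))
    print("matched ATT:", matched_att(ps, ts, ys))
```

In this simulation the unadjusted difference overstates the effect because treated subjects have higher x on average, while the matched estimate should land much closer to the simulated effect of 2.0. In practice a caliper and covariate balance checks would normally be added.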
Double machine learning
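Double machine learning (DML) is a statistical method that combines causal inference and machine learning to estimate treatment effects. Machine learning models are used to estimate the nuisance components (the outcome model and the treatment or propensity score model), and these estimates are combined through an orthogonalized score with cross-fitting, so that errors in the nuisance models have only a second-order impact on the estimated treatment effect. When a doubly robust score is used, the estimate remains consistent if either the propensity score model or the outcome model is correctly specified. This characteristic enhances reliability, especially when facing model misspecification or incomplete data.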
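A minimal sketch of the cross-fitted, partialling-out form of double machine learning is given below, assuming a partially linear model with a continuous treatment and one covariate. A brute-force k-nearest-neighbour regressor stands in for whatever learner would be used in practice (for example, a random forest). The function names and the simulated data are hypothetical.

```python
import heapq
import math
import random


def simulate_plr(n, theta, rng):
    """Simulated data (an assumption): y = theta*t + g(x) + noise, t = m(x) + noise."""
    x, t, y = [], [], []
    for _ in range(n):
        xi = rng.uniform(-2.0, 2.0)
        ti = math.sin(xi) + rng.gauss(0.0, 1.0)
        yi = theta * ti + 2.0 * math.sin(xi) + xi + rng.gauss(0.0, 1.0)
        x.append(xi)
        t.append(ti)
        y.append(yi)
    return x, t, y


def knn_predict(train_x, train_v, query_x, k):
    """Mean target value of the k training points closest to query_x."""
    nearest = heapq.nsmallest(
        k, range(len(train_x)), key=lambda j: abs(train_x[j] - query_x)
    )
    return sum(train_v[j] for j in nearest) / len(nearest)


def dml_plr(x, t, y, n_folds=2, k=15):
    """Cross-fitted partialling-out estimate of theta."""
    n = len(x)
    res_t = [0.0] * n
    res_y = [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        tx = [x[i] for i in train]
        tt = [t[i] for i in train]
        ty = [y[i] for i in train]
        for i in test:
            # Nuisance models are fitted on the other fold only (cross-fitting).
            res_t[i] = t[i] - knn_predict(tx, tt, x[i], k)
            res_y[i] = y[i] - knn_predict(tx, ty, x[i], k)
    # Final stage: regress outcome residuals on treatment residuals.
    num = sum(a * b for a, b in zip(res_t, res_y))
    den = sum(a * a for a in res_t)
    return num / den


def ols_slope(t, y):
    """Unadjusted regression slope of y on t, biased by confounding through x."""
    mt = sum(t) / len(t)
    my = sum(y) / len(y)
    num = sum((a - mt) * (b - my) for a, b in zip(t, y))
    den = sum((a - mt) ** 2 for a in t)
    return num / den


if __name__ == "__main__":
    rng = random.Random(1)
    x, t, y = simulate_plr(600, theta=1.5, rng=rng)
    print("naive slope:", ols_slope(t, y))
    print("DML estimate:", dml_plr(x, t, y))
```

Fitting the nuisance models on one fold and computing residuals on the other is what keeps overfitting in the learners from leaking into the effect estimate. Packages such as EconML and DoWhy, mentioned above, provide maintained implementations of this family of estimators.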
Variational AutoEncoder
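The Variational AutoEncoder (VAE) is a generative model that employs an encoder-decoder neural network architecture. Unlike a classical AutoEncoder (AE), the VAE encoder does not output a latent representation directly. Instead, it outputs a mean vector and a standard deviation vector that parameterize a distribution over the latent space, and this distribution is regularized towards a predefined prior (typically a standard normal distribution). A latent vector sampled from this distribution serves as the input to the decoder. Sampling uses the reparameterization trick, which expresses the sample as the mean plus the standard deviation multiplied by random noise, so that the model can still be trained by backpropagation. New, diverse samples can then be generated by decoding latent vectors drawn from the prior.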
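The sampling step and the regularization term described above can be written down in a few lines. The sketch below assumes a diagonal Gaussian latent distribution parameterized by a mean and a log-variance (a common convention, not something stated in the project description). It covers only these two pieces, not a full encoder-decoder network, and the function names are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I), where sigma = exp(log_var / 2)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(sigma^2)) to the standard normal prior."""
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    mu = [1.0, -2.0]       # hypothetical encoder output: latent means
    log_var = [0.0, 0.0]   # hypothetical encoder output: log-variances
    print("sampled z:", reparameterize(mu, log_var, rng))
    print("KL term:", kl_to_standard_normal(mu, log_var))
```

Because the randomness enters only through eps, the sample is a deterministic function of the encoder outputs, which is what allows gradients to flow back through the sampling step in a deep learning framework.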
Professor JD Zhou, Department of Family Medicine and Primary Care
Prof. Zhou was jointly appointed by the School of Public Health at the University of Hong Kong (HKU) Li Ka Shing Faculty of Medicine in 2024. Prof. Zhou received his post-doctoral training as a Medical Statistician at the Nuffield Department of Clinical Medicine, University of Oxford, United Kingdom. He earned his Ph.D. in Data Science from the School of Data Science, City University of Hong Kong, Hong Kong SAR, China. Before joining HKU, Prof. Zhou worked as an Assistant Professor at Warwick Medical School, University of Warwick, United Kingdom.
Prof. Zhou has research interests in big data analytics, medical statistics, primary care and preventive health, case-control and cohort studies, predictive and decision analytics, pharmacoepidemiology and aetiology of chronic diseases (including cardiovascular diseases, diabetes mellitus, etc. for family medicine patients). He also has study interests in causal treatment effects analysis (double machine learning), social epidemiology and adverse online events identification with text mining, high-dimensional data representation and latent cluster pattern learning in large-scale health datasets. Recently, he has been conducting machine learning analytics for illness trajectories and palliative care, especially progression pattern analysis/visualization, end-stage risk assessment following chronic diseases, and non-invasive cancer screening.
Key Research Areas
Medical statistics, case-control study designs, population cohort studies, causal effects analysis (propensity score matching, inverse probability weighting, Bayesian modeling, double machine learning), and cost-effectiveness for pharmacotherapy in family medicine patients.
Health big data analytics for primary care of communicable and non-communicable diseases including liver injury, cardiovascular diseases, diabetes, hypertension, and renal failure.
Illness trajectories and palliative care by identifying and visualizing disease progression patterns, conducting severe risk assessment, and developing non-invasive cancer screening tools.
Multi-antibiotic resistance prediction from patient clinical characteristics, blood culture and sensitivity tests using machine learning.
Key Publications
Comparisons of New-Onset Prostate Cancer in Type 2 Diabetes Mellitus Exposed to SGLT2I, DPP4I and GLP1: A Population-Based Cohort Study. Journal of the National Comprehensive Cancer Network. 2024. (Accepted and to appear).
New-Onset Syncope in Diabetic Patients Treated with Sodium-Glucose Cotransporter-2 Inhibitors Versus Dipeptidyl Peptidase-4 Inhibitors: A Chinese Population-Based Cohort Study. European Heart Journal - Cardiovascular Pharmacotherapy. 2023. Doi: 10.1093/ehjcvp/pvad086.
Clinical Characteristics, Risk Factors and Outcomes of Cancer Patients With COVID-19: A Population-Based Study. Cancer Medicine, 2023, 12(1): 287-296. Doi: 10.1002/cam4.4888.
The Association Between Neutrophil-Lymphocyte Ratio and Variability with New-Onset Dementia: A Population-Based Cohort Study. Journal of Alzheimer’s Disease, 2023, 94 (2): 547-557. Doi: 10.3233/JAD-220111.
Healthcare Big Data in Hong Kong: Development and Implementation of Artificial Intelligence-Enhanced Predictive Models for Risk Stratification. Current Problems in Cardiology, 2023, 49(1-B): 102168. Doi: 10.1016/j.cpcardiol.2023.102168.
Risk of New-Onset Prostate Cancer for Metformin Versus Sulfonylurea Use in Type 2 Diabetes Mellitus: A Propensity Score–Matched Study. Journal of the National Comprehensive Cancer Network, 2022, 20(6), 674-682. Doi: 10.6004/jnccn.2022.7010.
High Visit-to-Visit Cholesterol Variability Predicts Heart Failure and Adverse Cardiovascular Events: A Population-Based Cohort Study, European Journal of Preventive Cardiology, 2022, 29(14), e323-e325. Doi: 10.1093/eurjpc/zwac097.
Gender-Specific Clinical Risk Scores Incorporating Blood Pressure Variability for Predicting Incident Dementia. Journal of the American Medical Informatics Association, 2022, 29(2): 335-347. Doi: 10.1136/gutjnl-2020-323668.
Development of A Multivariable Prediction Model for Severe COVID-19 Disease: A Population-Based Study from Hong Kong. NPJ Digital Medicine, 2021, 4(1): 66, Doi: 10.1038/s41746-021-00433-4.
Proton Pump Inhibitor or Famotidine Use and Severe COVID-19 Disease: A Propensity Score-Matched Territory-Wide Study, Gut, 2020, Doi: 10.1136/gutjnl-2020-323668.
For more information or to express interest in this project, please email the supervisor or the specified contact point in the project description. Interested candidates are advised to enclose with their email:
Information on the research programme, funding support and admission documentation can be found online at the Research Postgraduate Admissions website. General admission enquiries should be directed to rpgmed@hku.hk.
HKUMed MBBS students interested in the Master of Research in Medicine (MRes[Med]) programme may visit the programme website for more information.
HKUMed UG students interested in the Undergraduate Research Internship Scheme (URIS) may visit the scheme’s website for more information.