Medical Anomaly Detection

The goal of this project is to automatically detect anomalies (e.g. diseases) in medical diagnoses based on patients' lab test results. We aim to automate the production of health reports: on the one hand discovering anomalies in the diagnosis, and on the other hand generating health check diagnosis reports automatically. Such a system can then help inform medical practitioners about irregularities in the health check results.

The motivation for this project is that it is time-consuming for humans, even professionals, to filter out anomalies in diagnoses. In practice, the diagnosis text is generated "semi-automatically" after a patient has taken the health test:

  1. (i) Doctors select the corresponding diagnosis segments based on the lab test results from dropdown lists;
  2. (ii) If a suitable segment is not found in the pre-existing lists, doctors write the diagnosis themselves.

The project can be conducted following this protocol:
  1. (i) Separate the data into healthy (normal) and unhealthy (anomalous) sets
  2. (ii) Train an unsupervised anomaly detection algorithm (see [1,2]), e.g. autoencoder or clustering-based, on the normal set, under the assumption that normal cases form the majority and anomalies only a tiny fraction
  3. (iii) In testing, the algorithm differentiates between healthy patients and patients with diseases
  4. (iv) (possibly) Uncover anomalies missed in the diagnoses
  5. (v) Automate anomaly detection in the hospital
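
Steps (i) to (iii) of the protocol can be sketched as follows. This is only a minimal illustration and not the project's method: instead of an autoencoder or a clustering model, it uses a simple per-feature z-score baseline fitted on the normal set. The two lab values, the synthetic data, the 0.99 quantile threshold, and all function names are assumptions made for the example.

```python
import random
import statistics


def fit_normal_profile(normal_rows):
    """Estimate per-feature mean and standard deviation from the normal set."""
    columns = list(zip(*normal_rows))
    means = [statistics.fmean(col) for col in columns]
    # Fall back to 1.0 when a feature has zero spread, to avoid dividing by zero.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def anomaly_score(row, means, stds):
    """Largest absolute z-score across lab values; higher means more unusual."""
    return max(abs(x - m) / s for x, m, s in zip(row, means, stds))


def choose_threshold(normal_rows, means, stds, quantile=0.99):
    """Pick the score below which roughly `quantile` of the normal set falls."""
    scores = sorted(anomaly_score(r, means, stds) for r in normal_rows)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


# Step (i): a synthetic stand-in for the healthy (normal) set.
# Each row holds two hypothetical lab values; real data would replace this.
rng = random.Random(0)
normal_set = [[rng.gauss(5.0, 0.5), rng.gauss(140.0, 10.0)] for _ in range(500)]

# Step (ii): fit the baseline on the normal set only.
means, stds = fit_normal_profile(normal_set)
threshold = choose_threshold(normal_set, means, stds)


def is_anomalous(row):
    """Step (iii): flag a patient whose score exceeds the learned threshold."""
    return anomaly_score(row, means, stds) > threshold


typical_patient = [5.1, 138.0]   # close to the normal profile
unusual_patient = [9.5, 139.0]   # first lab value far outside the normal range
print(is_anomalous(typical_patient), is_anomalous(unusual_patient))
```

In a real setting the scoring function would be swapped for the reconstruction error of an autoencoder or the distance to the nearest cluster centre, while the surrounding fit, threshold, and flag structure could stay the same.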

Contact for more information:


  1. [1] Introduction to Anomaly Detection
  2. [2] Aytekin, C., Ni, X., Cricri, F., & Aksu, E. (2018). Clustering and Unsupervised Anomaly Detection with L2 Normalized Deep Auto-Encoder Representations. arXiv preprint.