NiChart Logo - Image by Gerd Altmann from Pixabay

NiChart Reference Dataset

The NiChart Reference Dataset is a large and diverse collection of MRI images from multiple studies. It was created as part of the iSTAGING project to develop a system for identifying imaging biomarkers of aging and neurodegenerative diseases. The dataset includes multi-modal MRI data together with carefully curated demographic, clinical, and cognitive variables from participants with a variety of health conditions. Within NiChart, the reference dataset serves two purposes: training machine learning models, and building reference distributions of imaging measures and signatures. These distributions make it possible to compare NiChart values computed from user data against normative or disease-related reference values.
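As a rough illustration of what comparing a value to a reference distribution can look like, the sketch below places a single measurement within a set of reference values using a z-score and a mid-rank percentile. This is not NiChart's actual method: the function name `compare_to_reference`, the choice of statistics, and the numbers are all assumptions made up for this example.

```python
from bisect import bisect_left, bisect_right
from statistics import mean, stdev


def compare_to_reference(value, reference_values):
    """Return (z_score, percentile) of `value` within `reference_values`.

    Hypothetical helper for illustration only; not part of NiChart.
    """
    ref = sorted(reference_values)
    if len(ref) < 2:
        raise ValueError("need at least two reference values")

    mu = mean(ref)
    sigma = stdev(ref)  # sample standard deviation
    z_score = (value - mu) / sigma if sigma > 0 else float("nan")

    # Mid-rank percentile: reference values strictly below `value`,
    # plus half of the values exactly equal to it.
    below = bisect_left(ref, value)
    ties = bisect_right(ref, value) - below
    percentile = 100.0 * (below + 0.5 * ties) / len(ref)
    return z_score, percentile


# Made-up reference values (e.g. a regional volume), not NiChart data.
reference = [3.1, 3.4, 3.6, 3.8, 3.9, 4.0, 4.1, 4.3, 4.5, 4.8]
z, pct = compare_to_reference(3.2, reference)
print(f"z-score: {z:.2f}, percentile: {pct:.1f}")
```

In practice a reference distribution would typically be conditioned on covariates such as age and sex before a comparison like this is made. The sketch leaves that out to keep the idea visible.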

The following table is incomplete in this version and will be updated as soon as possible.

Table 1. Overview of studies that are part of the NiChart Reference Dataset






Demographics and Clinical Variables

The reference dataset includes a large number of samples from people of different ethnic groups, with a focus on older adults. This diversity is important because it makes it possible to train machine learning models that are more accurate for people of diverse backgrounds.

Figure 1. Reference dataset demographics

The reference dataset contains data from individuals with various neurodegenerative diseases. These disease subgroups were used to train machine learning models tailored to each disease and to calculate disease-specific reference distributions.

Figure 2. Examples of disease subgroups in the reference dataset