Recently, rather than predicting categorical variables as in classification, several studies have begun to estimate continuous clinical variables from brain images. Instead of classifying a subject into binary or multiple pre-determined categories or stages of the disease, regression focuses on estimating continuous values that may help to assess a patient's disease progression. The most commonly used cognitive measures are the Alzheimer's Disease Assessment Scale cognitive total score (ADAS), the Mini Mental State Exam score (MMSE) and the Rey Auditory Verbal Learning Test (RAVLT). Regression analyses are commonly used to predict cognitive scores from imaging measures. The relationship between these cognitive measures and structural changes observed with MRI has previously been studied with regression models, and the results demonstrated that a relationship exists between baseline MRI features and cognitive measures (Wan et al., 2014, Stonnington et al., 2010). For example, Wan et al. proposed an elegant regression model called CORNLIN that employs a sparse Bayesian learning algorithm to predict multiple cognitive scores based on 98 structural MRI regions of interest (ROIs) for Alzheimer's disease patients. The polynomial model used in CORNLIN can detect either a nonlinear or a linear relationship between brain structure and cognitive decline (Wan et al., 2014). Stonnington et al. adopted relevance vector regression, a sparse kernel method formulated in a Bayesian framework, to predict four sets of cognitive scores using MRI voxel-based morphometry measures (Stonnington et al., 2010). One of the biggest challenges in predicting cognitive outcomes from MRI is the high dimensionality of the data, which degrades computational performance and can lead to incorrect estimation and identification of the relevant predictors.
Sparse methods have attracted a great deal of research effort in the neuroimaging field because their sparsity-inducing property reduces the high dimensionality and identifies the relevant biomarkers. Ye et al. applied sparse logistic regression with stability selection to ADNI (Alzheimer's Disease Neuroimaging Initiative) data for robust feature selection (Ye et al., 2012); they successfully predicted the conversion from MCI to probable AD and identified a small subset of bio-signatures. Recently, multi-task learning (MTL) based feature learning methods with sparsity-inducing norms have been widely studied to select a discriminative feature subset from MRI features by incorporating the inherent correlations among multiple clinical cognitive measures (Zhou et al., 2013, Wang et al., 2011, Zhang and Shen, 2012). For example, ℓ2,1-norm regularization penalizes each row of the parameter matrix as a whole and enforces sparsity among the rows, so it is able to select the most discriminative features. Wang et al. (2011) and Zhang and Shen (2012) employed multi-task feature learning strategies to select biomarkers that could predict multiple clinical scores. Specifically, Wang et al. (2011) considered that some important features are correlated with only a subset of tasks, added an ℓ1-norm regularizer to impose sparsity among all elements, and proposed combining the ℓ2,1-norm and ℓ1-norm regularizations to select features; Zhang and Shen (2012) proposed multi-task learning with the ℓ2,1-norm to select a common subset of relevant features for multiple variables from each modality, assuming that related tasks share a common relevant feature subset. The main limitation of these popular learning models is that they assume a linear relationship between the MRI features and the cognitive outcomes.
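To make the row-sparsity effect of the ℓ2,1-norm concrete, the following is a minimal sketch rather than the implementation of any cited work. It assumes a parameter matrix W with one row per feature and one column per task (cognitive score); the function names and the toy matrix are hypothetical. It computes the ℓ2,1-norm and applies its proximal operator (row-wise group soft-thresholding), which is one standard way such a penalty is handled in an optimizer.

```python
import numpy as np


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (features x tasks)."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def prox_l21(W, t):
    """Proximal operator of t * ||W||_{2,1}: shrink each row as a whole.

    Rows whose Euclidean norm is at most t are set to zero for all tasks,
    which is how the penalty discards a feature jointly across tasks.
    """
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already zero.
    scale = np.maximum(0.0, 1.0 - t / np.maximum(row_norms, 1e-12))
    return W * scale


# Toy example (assumed values): 3 features, 2 tasks.
W = np.array([[3.0, 4.0],   # strong feature, row norm 5
              [0.1, 0.2],   # weak feature, row norm below the threshold
              [0.0, 0.0]])  # feature that is already unused
W_shrunk = prox_l21(W, t=1.0)
print(l21_norm(W))
print(W_shrunk)
```

In this toy case the weak second row is removed for both tasks at once, while the strong first row is only shrunk. An additional element-wise ℓ1 term, as in the combined regularization described above, would further zero individual entries inside the surviving rows.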
To model more complicated but more flexible relationships between the MRI features and the cognitive outcomes, Zhang and Shen developed a multi-modal support vector regression (SVR) that fuses the features selected from all modalities (Zhang and Shen, 2012). Kernel methods have been studied to model cognitive scores as nonlinear functions of neuroimaging measures. Recently, many kernel-based classification and regression methods with faster optimization or stronger generalization performance have been proposed and investigated through theoretical analysis and experimental evaluation (Gu and Sheng, 2016, Gu et al., 2015). Suk et al. proposed a new sparse multi-task learning method with ℓ2,1-norm regularization (Suk et al., 2016). Unlike conventional multi-task learning methods, which treat all features equally, it uses the optimal regression coefficients learned in the lower hierarchy as context information to weight features adaptively. Most existing studies focus on inferring cognitive outcomes from a single time point of data (cross-sectional analysis). In contrast, Zhou et al. formulated the prediction problem as a multi-task regression problem by treating the prediction at each time point as a task, and proposed a convex formulation with fused sparse group Lasso. The formulation allows the simultaneous selection of a common set of biomarkers at all time points with the ℓ2,1-norm and of a specific set of biomarkers at different time points with the ℓ1-norm, while incorporating temporal smoothness through the fused Lasso penalty (Zhou et al., 2013).
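The structure of a fused sparse group Lasso penalty can be illustrated with a short sketch. This is not the formulation of Zhou et al. (2013) itself, only a plausible reading under stated assumptions: W holds one row per feature and one column per time point, a row-wise group (ℓ2,1) term selects features shared across all time points, an element-wise ℓ1 term selects time-point-specific features, and a fused term penalizes differences between adjacent time points. The function name and the weights lam_l1, lam_fused and lam_group are hypothetical.

```python
import numpy as np


def fused_sparse_group_lasso_penalty(W, lam_l1, lam_fused, lam_group):
    """Evaluate an assumed three-term penalty on W (features x time points)."""
    W = np.asarray(W, dtype=float)
    # Element-wise sparsity: lets a feature be active at some time points only.
    l1_term = np.abs(W).sum()
    # Temporal smoothness: differences between adjacent time points (columns).
    fused_term = np.abs(np.diff(W, axis=1)).sum()
    # Group sparsity: one Euclidean norm per feature row, shared over time.
    group_term = np.linalg.norm(W, axis=1).sum()
    return float(lam_l1 * l1_term + lam_fused * fused_term + lam_group * group_term)


# Two single-feature coefficient profiles with the same magnitudes (toy values).
smooth = np.array([[1.0, 1.0, 1.0]])
oscillating = np.array([[1.0, -1.0, 1.0]])
print(fused_sparse_group_lasso_penalty(smooth, 1.0, 1.0, 1.0))
print(fused_sparse_group_lasso_penalty(oscillating, 1.0, 1.0, 1.0))
```

The two profiles have identical ℓ1 and group terms, so the higher penalty on the oscillating one comes entirely from the fused term, which is what favors coefficients that change smoothly over time.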