Interactive and comprehensive tool for combining two diagnostic tests.

Diagnostic tests are widely used to distinguish diseases from one another and to reach a correct diagnosis for the patient, so they play a crucial role in clinical decision-making. Beyond their essential role in medical diagnosis, these tests also help plan appropriate treatment while reducing treatment costs. The diagnostic accuracy and reliability of a test are taken into account before it is made widely available. For some conditions, more than one diagnostic test may be available, and some may even replace existing methods because they perform better. Many studies have obtained superior diagnostic performance by combining multiple diagnostic tests instead of relying on a single one. The dtComb package provides diverse combination methods, data standardization, and resampling methods for combining two diagnostic tests.

This web-tool application implements the combination approaches available in the dtComb package. It allows users to upload their own data (Data Upload tab), build models (Analysis tab), view the results (Graphs tab), and make new predictions (Predict tab) from the model they built. More detailed information about the combination methods and approaches offered through this web-tool and the dtComb package can be found in the package's paper. All source code is available on GitHub.

If you use this tool for your research please cite: ???

Usage of the web-tool:

Data upload

Load your data set in *.txt file format using this tab.

  • Rows must represent observations and columns must represent variables.

  • The first row must be a header containing the variable names.


Combination Approach

Use this tab to combine diagnostic tests. dtComb supports 142 combination methods for combining diagnostic tests.

  • First, choose one of the 4 main combination approaches. There are 8 combination methods under the linear combination approach, 7 under the non-linear combination approach, 14 mathematical operators, and 113 machine-learning algorithms.

  • After selecting a combination approach, choose one of the methods available under it.

Linear Combination Methods
  • Scoring

    A binary logistic regression model is used. For a more straightforward interpretation, the slope (coefficient) values are rounded to a user-specified number of digits, and the combination score is computed from the rounded values.
  • Su & Liu's

    Su and Liu's combination score is obtained by using Fisher's discriminant function under the assumption of a multivariate normal distribution model and proportional covariance matrices.
  • Logistic regression

    A binary logistic regression model is fitted using the maximum-likelihood method.
  • Min-Max

    This method linearly combines the minimum and maximum values of the markers by finding a parameter λ that maximizes the corresponding Mann-Whitney statistic.
  • Pepe & Thompson's

    Uses the same binary logistic regression model. The combination score is obtained by taking the ratio of the slope values to calculate the α parameter.
  • Pepe, Cai & Langton's

    The Pepe, Cai, and Langton combination score is obtained by using the AUC as the parameter of a logistic regression model.
  • Minimax

    The Minimax method is an extension of Su & Liu's method.
  • Todor & Saplacan's

    Todor and Saplacan's method uses trigonometric functions to calculate the combination score. The combination score is obtained from the θ value that optimizes the corresponding AUC.
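As a concrete illustration of how one of these linear rules works, here is a minimal numpy sketch of the Min-Max idea (an illustrative re-implementation, not dtComb's own code): the score is max(x1, x2) + λ·min(x1, x2), with λ chosen on a grid to maximize the Mann-Whitney (AUC) statistic.

```python
import numpy as np

def auc_mann_whitney(scores, labels):
    """AUC computed as the normalized Mann-Whitney U statistic."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # fraction of (diseased, healthy) pairs where the diseased subject scores higher
    return np.mean(pos[:, None] > neg[None, :])

def min_max_combine(x1, x2, labels, lambdas=np.linspace(0, 1, 101)):
    """Min-Max combination: max(x1, x2) + lambda * min(x1, x2),
    with lambda chosen on a grid to maximize the Mann-Whitney AUC."""
    hi = np.maximum(x1, x2)
    lo = np.minimum(x1, x2)
    best = max(lambdas, key=lambda lam: auc_mann_whitney(hi + lam * lo, labels))
    return hi + best * lo, best

# toy data: 50 diseased (label 1) and 50 healthy (label 0) subjects
rng = np.random.default_rng(0)
labels = np.array([1] * 50 + [0] * 50)
x1 = rng.normal(loc=1.0 * labels, scale=1.0)   # marker 1, higher in diseased
x2 = rng.normal(loc=0.5 * labels, scale=1.0)   # marker 2, weaker signal
score, lam = min_max_combine(x1, x2, labels)
```

By construction the combined score can never do worse (on the training data) than the maximum of the two markers alone, since λ = 0 is in the grid.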
Nonlinear Combination Methods
  • Polynomial Regression

    This method builds a logistic regression model on the polynomial feature space created from the two markers and returns the probability of a positive event for each observation. The degrees of the fitted polynomials are taken from the user.
  • Ridge Regression

    Ridge regression is a penalized method used to estimate the coefficients of highly correlated variables, in this case the polynomial feature space created from the two biomarkers. The implementation uses the glmnet library with two functions: cv.glmnet() to run a cross-validation model that determines the tuning parameter λ, and glmnet() to fit the model with the selected tuning parameter. The degrees of the fitted polynomials are taken from the user.
  • Lasso Regression

    Lasso regression is also a penalized method, with the difference that it shrinks the coefficients of some features to exactly 0, which also makes it useful for feature elimination. The implementation is similar to ridge regression: cross-validation for parameter selection and the model fit are carried out with the glmnet library. The degrees of the fitted polynomials are taken from the user.
  • Elastic-Net Regression

    Elastic Net regression combines the penalties of ridge and lasso regression to get the best of both models. The model again includes a tuning parameter λ as well as a mixing parameter α, taken from the user, which takes a value between 0 (ridge) and 1 (lasso) and determines the weights of the ridge and lasso loss functions. The degrees of the fitted polynomials are taken from the user.
In the nonlinear polynomial, ridge, and lasso regression methods, an interaction that may exist between the two diagnostic tests can be included in the model. To do so, the Include interaction option must be set to TRUE.
  • Splines

    The second non-linear approach to combining biomarkers applies regression models to the dataset using functions derived from piecewise polynomials. Splines are implemented with the degrees of freedom and the degrees of the fitted polynomials taken from the user. The splines library is used to build piecewise logistic regression models with basis splines.
  • Smoothing Splines and Natural Cubic Splines

    In addition to the basic spline structure, Generalized Additive Models are applied with natural cubic splines and smoothing splines using the gam library in R. They are implemented with the degrees of freedom taken from the user.
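The polynomial feature space that the regression methods above share can be sketched as follows. `poly_features` is a hypothetical helper, not part of dtComb, showing how two markers expand into powers up to a chosen degree plus an optional interaction term; a logistic model is then fitted on the expanded matrix.

```python
import numpy as np

def poly_features(x1, x2, degree=2, interaction=True):
    """Expand two markers into a polynomial feature space:
    columns x1^1..x1^degree, x2^1..x2^degree, and optionally x1*x2."""
    cols = [x1 ** d for d in range(1, degree + 1)]
    cols += [x2 ** d for d in range(1, degree + 1)]
    if interaction:
        cols.append(x1 * x2)       # the "Include interaction" term
    return np.column_stack(cols)

x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([0.5, 1.0, 1.5])
X = poly_features(x1, x2, degree=2, interaction=True)
```

For degree 2 with interaction, each observation becomes the row (x1, x1², x2, x2², x1·x2), which is the design matrix a penalized or unpenalized logistic regression would then be fitted on.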
Mathematical Operators
  • Arithmetic Operators:

    The Add, Subtract, Multiply, and Divide methods represent the basic arithmetic operators.
  • Distance Measures:

    The distance measures included in the package are Euclidean, Manhattan, Chebyshev, Kulczynski_d, Lorentzian, Avg, Taneja, and Kumar-Johnson.
  • Exponential functions

    In these methods, one of the two diagnostic tests is taken as the base and the other as the exponent; they are indicated by the names baseinexp (marker1^marker2) and expinbase (marker2^marker1).
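A few of these operator-style scores can be sketched in numpy as follows (illustrative only; dtComb's implementations may differ in detail):

```python
import numpy as np

def combine(x1, x2, method):
    """A handful of operator-style combination scores for two markers."""
    if method == "add":
        return x1 + x2
    if method == "multiply":
        return x1 * x2
    if method == "euclidean":
        # distance-measure style score: distance of (x1, x2) from the origin
        return np.sqrt(x1 ** 2 + x2 ** 2)
    if method == "chebyshev":
        return np.maximum(np.abs(x1), np.abs(x2))
    if method == "baseinexp":
        return x1 ** x2            # marker 1 as base, marker 2 as exponent
    raise ValueError(method)

x1 = np.array([3.0, 1.0])
x2 = np.array([4.0, 2.0])
```

Whatever the operator, the result is a single combination score per observation, which is then evaluated with the same ROC machinery as the other approaches.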
Machine-Learning Algorithms
  • The 113 machine-learning algorithms available in the caret library, which is used to train classification models and make predictions with them, can be used within the scope of Machine-Learning Algorithms. IMPORTANT: See available-model for further information about these methods. All resampling and preprocessing methods included in the caret package can also be used in dtComb when machine-learning algorithms are selected.


Resampling methods

  • Cross-validation is performed with a user-specified number of folds.

  • Repeated cross-validation is performed with a user-specified number of repeats and number of folds.

  • Bootstrap is performed with a user-specified number of resampling iterations.
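The index bookkeeping behind these resampling schemes can be sketched as follows (hypothetical helpers, not dtComb's code): cross-validation partitions the observations into folds, while the bootstrap draws whole-size samples with replacement.

```python
import numpy as np

def kfold_indices(n, k, seed=0):
    """Shuffle n observation indices and split them into k roughly equal folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    return np.array_split(idx, k)

def bootstrap_samples(n, n_boot, seed=0):
    """Draw n_boot resamples of size n with replacement."""
    rng = np.random.default_rng(seed)
    return [rng.integers(0, n, size=n) for _ in range(n_boot)]

folds = kfold_indices(100, 5)        # 5-fold cross-validation split
boots = bootstrap_samples(100, 25)   # 25 bootstrap resamples
```

Repeated cross-validation simply reruns the fold split with different seeds and averages the resulting performance estimates.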


Standardization methods

  • Range: Standardization to a range between 0 and 1.

  • z-Score: Standardization using z scores, with mean equal to 0 and standard deviation equal to 1.

  • t-Score: Standardization using T scores; values usually range between 20 and 80.

  • Mean: Standardization so that the sample mean equals 1.

  • Deviance: Standardization so that the sample standard deviation equals 1.
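The five standardizations can be sketched in numpy as follows (illustrative; dtComb's own implementation may differ in detail, e.g. in its choice of sample vs. population standard deviation):

```python
import numpy as np

def standardize(x, method):
    """The five standardizations described above, sketched with numpy."""
    if method == "range":      # map values onto [0, 1]
        return (x - x.min()) / (x.max() - x.min())
    if method == "zScore":     # mean 0, standard deviation 1
        return (x - x.mean()) / x.std(ddof=1)
    if method == "tScore":     # T = 10 * z + 50, mostly within 20..80
        return 10 * (x - x.mean()) / x.std(ddof=1) + 50
    if method == "mean":       # sample mean becomes 1
        return x / x.mean()
    if method == "deviance":   # sample standard deviation becomes 1
        return x / x.std(ddof=1)
    raise ValueError(method)

x = np.array([2.0, 4.0, 6.0, 8.0])
```

Applying the same transformation (with the training-set statistics) to new observations is what makes predictions from a saved model comparable.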


There are options under the Advanced checkbox, such as the cutoff method, direction, and confidence level.

There are 34 methods available to determine the optimum cutoff. The cutoff methods can be found in the OptimalCutpoints R package.
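As one example of such a criterion, the Youden index picks the threshold that maximizes sensitivity + specificity - 1. A minimal sketch (not OptimalCutpoints' code, which offers dozens of other criteria):

```python
import numpy as np

def youden_cutoff(scores, labels):
    """Pick the cutpoint maximizing the Youden index J = sens + spec - 1,
    scanning every observed score value as a candidate threshold."""
    best_c, best_j = None, -np.inf
    for c in np.unique(scores):
        pred = scores >= c                 # classify as positive at/above c
        sens = np.mean(pred[labels == 1])  # true positive rate
        spec = np.mean(~pred[labels == 0]) # true negative rate
        j = sens + spec - 1
        if j > best_j:
            best_c, best_j = c, j
    return best_c, best_j

scores = np.array([0.1, 0.2, 0.4, 0.6, 0.8, 0.9])
labels = np.array([0,   0,   0,   1,   1,   1])
cut, j = youden_cutoff(scores, labels)
```

On this toy data the classes separate perfectly, so the chosen cutpoint achieves J = 1; real markers overlap and yield smaller values.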


Outputs

When the analysis is complete, the Download Model button appears. Using this button, the user can download the trained model to make predictions later.

ROC Curve

An ROC curve appears when the analysis is complete. Here you can see the ROC curves of the generated combination score, Marker 1, and Marker 2.



AUC Table

Under the AUC Table subtab, you can instantly get the area under the curve (AUC) value along with its standard error, confidence interval, and statistical significance.



ROC Coordinates

The false positive and true positive rate coordinates of each ROC curve can be found under the ROC Coordinates subtab for each marker.



Multiple Comparisons Table

The Multiple Comparisons Table subtab can be used to make pairwise statistical comparisons between the ROC curves of the two markers and the combination score.



Cut points

  • The user can see the cut-off method selected in the Advanced section, the criterion, and the cut-off point determined by this method.



Performance Measures

  • The confusion matrix and performance measures computed with the cutoff method chosen by the user can be viewed.



Plots

  • Kernel density graphs, individual-value graphs, and sensitivity & specificity graphs for the combination score, Marker 1, and Marker 2 can be viewed in the relevant subtabs.



Predict

  • The user can make predictions by loading test data into the model that has been trained. Another option is to make predictions with the same model by reloading a previously downloaded model.