Calculates the property MY_2D_QSAR. See Output for other properties that can be computed.
The model was built using the Least-Squares method.
The document contains these additional sections of information:
Regression Statistics
Model Coefficients and Variables
Training Data Information
Excluded Variable Information
Model Construction Parameters
This table contains statistics from the regression modeling procedure.
Statistic | Value |
---|---|
N | 70 |
r | 0.977 |
r2 | 0.954 |
r2 (adjusted) | 1.211 |
r2 (prediction) | 0.174 |
RMS residual error | 0.1843 |
q2 (cross-validation) | 0.202 |
RMS residual error (cross-validation) | 1.145 |
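An adjusted r2 above 1 looks like an error, but it is what the usual adjustment formula gives when a model has more fitted terms than samples. The sketch below assumes the standard formula 1 - (1 - r2)(N - 1)/(N - p - 1) and takes p = 84, the number of non-constant rows in the coefficient table of this report. Whether the modeling software computes the statistic in exactly this way is an assumption.

```python
def adjusted_r2(r2: float, n_samples: int, n_terms: int) -> float:
    """Standard adjusted r2; n_terms excludes the constant term."""
    dof = n_samples - n_terms - 1  # residual degrees of freedom
    if dof == 0:
        raise ValueError("adjusted r2 is undefined when N - p - 1 == 0")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# N = 70 and r2 = 0.954 are taken from the statistics table.
# p = 84 is an assumption: the count of non-constant coefficient rows.
value = adjusted_r2(r2=0.954, n_samples=70, n_terms=84)
print(round(value, 3))
```

With 70 - 84 - 1 = -15 residual degrees of freedom the correction term changes sign, so the result is about 1.212, in line with the reported 1.211 once rounding of r2 is taken into account. A model with more terms than samples can fit its training data very closely, which is consistent with the gap between r2 and the cross-validated q2.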
This table contains the coefficients and associated variables for the equation.
Coefficient | Variable |
---|---|
0.263585 | Constant |
0.139511 | ALogP |
-0.151553 | Count<ECFP_6:-591526139> |
0.254562 | Count<ECFP_6:-1910270391> |
0.255009 | Count<ECFP_6:-1100000244> |
0.255548 | Count<ECFP_6:-1074141656> |
0.257191 | Count<ECFP_6:1559650422> |
0.25646 | Count<ECFP_6:642810091> |
0.257106 | Count<ECFP_6:-182236392> |
0.257097 | Count<ECFP_6:-606302475> |
0.256129 | Count<ECFP_6:1572579716> |
1.11729 | Count<ECFP_6:-992506539> |
-0.493043 | Count<ECFP_6:734603939> |
3.02386 | Count<ECFP_6:-797085356> |
0.256944 | Count<ECFP_6:2099970318> |
0.258393 | Count<ECFP_6:770157610> |
0.257053 | Count<ECFP_6:-2024255407> |
0.257212 | Count<ECFP_6:1996767644> |
0.25823 | Count<ECFP_6:-786013480> |
0.258704 | Count<ECFP_6:1997021792> |
0.257812 | Count<ECFP_6:-175146122> |
0.259305 | Count<ECFP_6:1298504034> |
0.259698 | Count<ECFP_6:858184972> |
0.261082 | Count<ECFP_6:-932108170> |
-0.000900798 | Count<ECFP_6:-1332781180> |
0.0869928 | Count<ECFP_6:-757679000> |
0.331836 | Count<ECFP_6:-1102925512> |
1.15026 | Count<ECFP_6:-177264675> |
-1.71094 | Count<ECFP_6:859796174> |
1.1353 | Count<ECFP_6:-1686976258> |
0.260358 | Count<ECFP_6:-1302110264> |
0.261407 | Count<ECFP_6:1095683433> |
0.263643 | Count<ECFP_6:-768126022> |
0.261617 | Count<ECFP_6:2007300961> |
0.261366 | Count<ECFP_6:397284699> |
0.260751 | Count<ECFP_6:1451403962> |
0.262348 | Count<ECFP_6:2071685859> |
-0.620883 | Count<ECFP_6:-1952026932> |
0.464346 | Count<ECFP_6:-658363709> |
-1.78009 | Count<ECFP_6:-1506130950> |
1.28495 | Count<ECFP_6:1670941296> |
0.261055 | Count<ECFP_6:1155958977> |
0.260224 | Count<ECFP_6:-952707428> |
0.260105 | Count<ECFP_6:-1278685991> |
0.259641 | Count<ECFP_6:1079175434> |
0.259554 | Count<ECFP_6:-2135040425> |
0.162945 | Count<ECFP_6:-1897341097> |
-0.774082 | Count<ECFP_6:-167460056> |
0.772336 | Count<ECFP_6:-1059365320> |
0.943147 | Count<ECFP_6:-572965350> |
0.926212 | Count<ECFP_6:-1867561664> |
-0.962577 | Count<ECFP_6:-1683911134> |
-1.47467 | Count<ECFP_6:-178525456> |
0.101045 | Count<ECFP_6:-292555972> |
1.49489 | Count<ECFP_6:-666950485> |
0.366783 | Count<ECFP_6:1564392544> |
0.00835034 | Count<ECFP_6:1571214559> |
-0.0527057 | Count<ECFP_6:-2019199918> |
-0.807647 | Count<ECFP_6:-1487746661> |
-1.03346 | Count<ECFP_6:292958156> |
1.75669 | Count<ECFP_6:-756348342> |
0.479547 | Count<ECFP_6:-103562730> |
-1.4798 | Count<ECFP_6:-1950934120> |
0.181543 | Count<ECFP_6:-857146788> |
0.035885 | Count<ECFP_6:-408473190> |
-0.627152 | Count<ECFP_6:864909220> |
0.770691 | Count<ECFP_6:-740847217> |
0.910334 | Count<ECFP_6:408216150> |
0.988691 | Count<ECFP_6:1595399376> |
0.262223 | Count<ECFP_6:515773057> |
1.45669 | Count<ECFP_6:78036066> |
-2.06101 | Count<ECFP_6:-1884411803> |
0.504692 | Count<ECFP_6:1021725999> |
0.716968 | Count<ECFP_6:-665999307> |
0.215824 | Count<ECFP_6:661073749> |
-0.379654 | Count<ECFP_6:864518973> |
0.257071 | Count<ECFP_6:191790798> |
-0.694789 | Count<ECFP_6:1338334141> |
-0.0174989 | Molecular_Weight |
0.117988 | Num_AromaticRings |
0.811412 | Num_H_Acceptors |
0.653521 | Num_H_Donors |
-0.389538 | Num_Rings |
0.0348633 | Num_RotatableBonds |
-11.4967 | Molecular_FractionalPolarSurfaceArea |
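As a rough illustration of how such a coefficient table could be applied, the sketch below evaluates a linear equation of the form constant + sum(coefficient × descriptor value). It assumes the coefficients apply to raw, unscaled descriptor values, which the report does not state explicitly. The function name `predict`, the four-term subset, and the descriptor values for the example molecule are all hypothetical.

```python
from typing import Mapping


def predict(intercept: float,
            coefficients: Mapping[str, float],
            descriptors: Mapping[str, float]) -> float:
    """Evaluate intercept + sum(coefficient * descriptor value).

    Descriptors missing from the input are treated as 0, which suits
    fingerprint feature counts for features absent from a molecule.
    """
    return intercept + sum(
        coef * descriptors.get(name, 0.0)
        for name, coef in coefficients.items()
    )


# Subset of the table above; the full model has 84 non-constant terms.
INTERCEPT = 0.263585
COEFFICIENTS = {
    "ALogP": 0.139511,
    "Count<ECFP_6:-797085356>": 3.02386,
    "Num_H_Donors": 0.653521,
    "Molecular_FractionalPolarSurfaceArea": -11.4967,
}

# Hypothetical descriptor values for one molecule (not from the training set).
molecule = {
    "ALogP": 2.0,
    "Count<ECFP_6:-797085356>": 1,
    "Num_H_Donors": 3,
    "Molecular_FractionalPolarSurfaceArea": 0.3,
}

partial = predict(INTERCEPT, COEFFICIENTS, molecule)
print(f"partial sum over 4 of 84 terms: {partial:.4f}")
```

Because only four terms are included, the printed number is a partial sum and not a prediction of the modeled property; a real evaluation would need all 84 coefficients and the corresponding descriptor values.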
The data used to train the model consisted of 70 samples. The following are the statistics for the dependent (Y) and independent (X) variables. (The first row shows statistics for the Y variable. All other rows are for X variables.)
Variable | Min | Max | Mean | Std. Dev. |
---|---|---|---|---|
pki-trypsin | 3.854 | 7.699 | 5.9975 | 0.86078 |
ALogP | -2.386 | 6.525 | 2.654 | 1.3647 |
ECFP_6 | N/A | N/A | N/A | N/A |
Molecular_Weight | 355.46 | 676.91 | 512.58 | 65.768 |
Num_AromaticRings | 1 | 4 | 2.8571 | 0.72281 |
Num_H_Acceptors | 4 | 7 | 5.1143 | 0.8871 |
Num_H_Donors | 3 | 4 | 3.3571 | 0.47916 |
Num_Rings | 2 | 6 | 3.8429 | 0.87236 |
Num_RotatableBonds | 5 | 14 | 8.4429 | 1.6443 |
Molecular_FractionalPolarSurfaceArea | 0.218 | 0.403 | 0.28823 | 0.036476 |
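The minimum and maximum values in this table can serve as a simple sanity check before applying the model to a new molecule. The sketch below flags descriptors that fall outside the training range. It is a plain range check written for illustration, not the applicability-domain method used by the modeling software, and the helper name `out_of_range` and the example molecule are hypothetical.

```python
# Min/max values copied from the training statistics table
# (numeric descriptors only; ECFP_6 has no range listed).
TRAINING_RANGES = {
    "ALogP": (-2.386, 6.525),
    "Molecular_Weight": (355.46, 676.91),
    "Num_AromaticRings": (1, 4),
    "Num_H_Acceptors": (4, 7),
    "Num_H_Donors": (3, 4),
    "Num_Rings": (2, 6),
    "Num_RotatableBonds": (5, 14),
    "Molecular_FractionalPolarSurfaceArea": (0.218, 0.403),
}


def out_of_range(descriptors):
    """Return {name: value} for descriptors outside the training min/max."""
    flagged = {}
    for name, (low, high) in TRAINING_RANGES.items():
        value = descriptors.get(name)
        if value is not None and not (low <= value <= high):
            flagged[name] = value
    return flagged


# Hypothetical molecule: lighter and with fewer H-bond donors than
# anything in the training set.
candidate = {"ALogP": 3.1, "Molecular_Weight": 300.0, "Num_H_Donors": 1}
flagged = out_of_range(candidate)
print(flagged)
```

A flagged descriptor means the linear equation is being extrapolated for that variable. The training set is narrow in several respects (for example, Num_H_Donors only takes the values 3 and 4), so such flags are likely for structurally different molecules.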
The following table shows statistics for the independent (X) training data variables that were excluded from the model for any of the following reasons:
(1) The variable was constant or was a string when a number was expected. [Unexpected string variables appear as constants with values of 0.]
(2) The variable contained too few nonzero values (fewer than 8, as specified by the MinSamplesPerVariable parameter). Fingerprint features excluded for this reason are not listed below.
(3) The variable was correlated with another variable (correlation coefficient greater in magnitude than 0.9, as specified by the Max Correlation parameter).
The Reason column indicates the reason that the variable was excluded.
Variable | Min | Max | Mean | Std. Dev. | Reason |
---|---|---|---|---|---|
Count<ECFP_6:670515721> | 0 | 1 | 0.84286 | 0.36656 | Correlated with other variables |
Count<ECFP_6:960161451> | 0 | 1 | 0.24286 | 0.43191 | Correlated with other variables |
Count<ECFP_6:20550775> | 0 | 1 | 0.24286 | 0.43191 | Correlated with other variables |
Count<ECFP_6:1658067901> | 0 | 1 | 0.84286 | 0.36656 | Correlated with other variables |
Count<ECFP_6:-1016680330> | 0 | 1 | 0.24286 | 0.43191 | Correlated with other variables |
Count<ECFP_6:2102150379> | 0 | 1 | 0.98571 | 0.11952 | Correlated with other variables |
Count<ECFP_6:-675671408> | 0 | 1 | 0.24286 | 0.43191 | Correlated with other variables |
Count<ECFP_6:571867147> | 0 | 1 | 0.18571 | 0.39168 | Correlated with other variables |
Count<ECFP_6:1574959513> | 0 | 1 | 0.24286 | 0.43191 | Correlated with other variables |
Count<ECFP_6:-978131182> | 0 | 1 | 0.71429 | 0.45502 | Correlated with other variables |
Count<ECFP_6:1454306807> | 0 | 1 | 0.18571 | 0.39168 | Correlated with other variables |
Count<ECFP_6:796830164> | 0 | 1 | 0.18571 | 0.39168 | Correlated with other variables |
Count<ECFP_6:117107367> | 0 | 1 | 0.24286 | 0.43191 | Correlated with other variables |
Count<ECFP_6:944467641> | 0 | 1 | 0.72857 | 0.44791 | Correlated with other variables |
Count<ECFP_6:1336540477> | 0 | 1 | 0.74286 | 0.44021 | Correlated with other variables |
Count<ECFP_6:-1331450522> | 0 | 1 | 0.52857 | 0.50279 | Correlated with other variables |
Count<ECFP_6:1449212896> | 0 | 1 | 0.71429 | 0.45502 | Correlated with other variables |
Count<ECFP_6:-102666057> | 0 | 1 | 0.11429 | 0.32046 | Correlated with other variables |
Count<ECFP_6:-1756464860> | 0 | 1 | 0.74286 | 0.44021 | Correlated with other variables |
Count<ECFP_6:-755462605> | 0 | 1 | 0.64286 | 0.48262 | Correlated with other variables |
Count<ECFP_6:717474525> | 0 | 1 | 0.7 | 0.46157 | Correlated with other variables |
Count<ECFP_6:-1150899835> | 0 | 1 | 0.64286 | 0.48262 | Correlated with other variables |
Count<ECFP_6:1146720904> | 0 | 1 | 0.62857 | 0.48668 | Correlated with other variables |
Count<ECFP_6:710652510> | 0 | 1 | 0.64286 | 0.48262 | Correlated with other variables |
Count<ECFP_6:-1289586824> | 0 | 1 | 0.62857 | 0.48668 | Correlated with other variables |
Count<ECFP_6:1698998511> | 0 | 1 | 0.11429 | 0.32046 | Correlated with other variables |
Count<ECFP_6:-1811366813> | 0 | 1 | 0.62857 | 0.48668 | Correlated with other variables |
Count<ECFP_6:1515192889> | 0 | 1 | 0.74286 | 0.44021 | Correlated with other variables |
Count<ECFP_6:-1650219925> | 0 | 1 | 0.11429 | 0.32046 | Correlated with other variables |
Count<ECFP_6:-281505363> | 0 | 1 | 0.12857 | 0.33714 | Correlated with other variables |
Count<ECFP_6:-1173882748> | 0 | 1 | 0.64286 | 0.48262 | Correlated with other variables |
Count<ECFP_6:1637591468> | 0 | 1 | 0.64286 | 0.48262 | Correlated with other variables |
Count<ECFP_6:2006518499> | 0 | 1 | 0.64286 | 0.48262 | Correlated with other variables |
Count<ECFP_6:-81428579> | 0 | 1 | 0.68571 | 0.46758 | Correlated with other variables |
Count<ECFP_6:1233434266> | 0 | 1 | 0.64286 | 0.48262 | Correlated with other variables |
Count<ECFP_6:-1364467941> | 0 | 1 | 0.62857 | 0.48668 | Correlated with other variables |
Count<ECFP_6:-594723798> | 0 | 1 | 0.62857 | 0.48668 | Correlated with other variables |
Count<ECFP_6:2146640915> | 0 | 1 | 0.62857 | 0.48668 | Correlated with other variables |
Count<ECFP_6:1929265201> | 0 | 1 | 0.64286 | 0.48262 | Correlated with other variables |
Count<ECFP_6:325895898> | 0 | 1 | 0.62857 | 0.48668 | Correlated with other variables |
Count<ECFP_6:865482986> | 0 | 1 | 0.15714 | 0.36656 | Correlated with other variables |
Count<ECFP_6:-1505292865> | 0 | 1 | 0.15714 | 0.36656 | Correlated with other variables |
Count<ECFP_6:-376546800> | 0 | 1 | 0.45714 | 0.50176 | Correlated with other variables |
Count<ECFP_6:601995614> | 0 | 1 | 0.15714 | 0.36656 | Correlated with other variables |
Count<ECFP_6:-954757448> | 0 | 1 | 0.11429 | 0.32046 | Correlated with other variables |
Count<ECFP_6:-1625362884> | 0 | 1 | 0.11429 | 0.32046 | Correlated with other variables |
Count<ECFP_6:833921154> | 0 | 1 | 0.12857 | 0.33714 | Correlated with other variables |
Count<ECFP_6:-174624245> | 0 | 1 | 0.11429 | 0.32046 | Correlated with other variables |
Count<ECFP_6:-1490910266> | 0 | 1 | 0.11429 | 0.32046 | Correlated with other variables |
Count<ECFP_6:-454715551> | 0 | 1 | 0.11429 | 0.32046 | Correlated with other variables |
Count<ECFP_6:2025485523> | 0 | 1 | 0.11429 | 0.32046 | Correlated with other variables |
Count<ECFP_6:-19155222> | 0 | 1 | 0.14286 | 0.35245 | Correlated with other variables |
Count<ECFP_6:-1078835860> | 0 | 1 | 0.14286 | 0.35245 | Correlated with other variables |
Count<ECFP_6:1386744051> | 0 | 1 | 0.15714 | 0.36656 | Correlated with other variables |
Count<ECFP_6:-1719301700> | 0 | 1 | 0.14286 | 0.35245 | Correlated with other variables |
Count<ECFP_6:864287155> | 0 | 1 | 0.11429 | 0.32046 | Correlated with other variables |
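The three exclusion rules described for this table can be sketched as a single filtering pass. This is a minimal illustration under stated assumptions, not the software's documented algorithm: it assumes variables are examined in order and compared only against variables already kept, and that the SqrtEstimate setting means the integer part of the square root of the sample count (which gives 8 for 70 samples, matching the threshold quoted in the report). The function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def filter_variables(columns, max_correlation=0.9, min_nonzero=None):
    """Split variables into kept names and {excluded name: reason}."""
    if not columns:
        return [], {}
    n_samples = len(next(iter(columns.values())))
    if min_nonzero is None:
        # Assumption: SqrtEstimate is floor(sqrt(number of samples)).
        min_nonzero = math.isqrt(n_samples)
    kept, excluded = [], {}
    for name, values in columns.items():
        if len(set(values)) == 1:
            excluded[name] = "Constant"
        elif sum(1 for v in values if v != 0) < min_nonzero:
            excluded[name] = "Too few nonzero values"
        elif any(abs(pearson(values, columns[k])) > max_correlation
                 for k in kept):
            excluded[name] = "Correlated with other variables"
        else:
            kept.append(name)
    return kept, excluded


# Toy data with 9 samples, so the nonzero threshold is isqrt(9) = 3.
toy = {
    "a": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "b": [2, 4, 6, 8, 10, 12, 14, 16, 18.5],  # nearly proportional to "a"
    "c": [0, 0, 0, 0, 0, 0, 0, 1, 1],          # only two nonzero values
    "d": [5] * 9,                              # constant
    "e": [1, 0, 1, 0, 1, 0, 1, 0, 1],          # uncorrelated with "a"
}
kept, excluded = filter_variables(toy)
print(kept)
print(excluded)
```

In an order-dependent filter like this one, which member of a correlated pair survives depends on the order in which variables are visited, so a different ordering could keep a different but equivalent set of fingerprint features.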
The following parameter values were specified by the learner component. Some items are internal parameters not exposed by the component. In the course of building the model, certain values may have been adjusted from the values shown below.
Parameter | Value |
---|---|
LearnedPropertyName | MY_2D_QSAR |
Name | pki-trypsin |
UseProperties | UserSet |
PredefinedSet | Estate_Keys |
UserSet | ALogP,ECFP_6,Molecular_Weight,Num_AromaticRings,Num_H_Acceptors,Num_H_Donors,Num_Rings,Num_RotatableBonds,Molecular_FractionalPolarSurfaceArea |
IgnoreProperties | |
InitialModelFrom | Least-Squares |
Weight Property | |
kNN Options | |
Number of Nearest Neighbors | 20 |
Dynamic Smoothing Factor | 0.5 |
Number of XV Groups | 11 |
Additional Options | |
NumberOfComponents | 20 |
MinSamplesPerVariable | SqrtEstimate |
Decorrelation Method | Pearson |
Max Correlation | 0.90 |
Learn Options | Perform OPS Analysis, Track Fingerprint Features |
Indicator Baseline | Most Common Value |
Numeric Distance Function | Euclidean |
Numeric Scaling | Mean-Center and Scale, Scale by Number of Dimensions |
Fingerprint Distance Function | Tanimoto |
Model Domain Fingerprint | FCFP_2 |
Additional Properties | |
TopLevelComment | Add Protocol Comment Here |
Destination Folder | 16606/LearnedProperties |
Max OPS Fingerprint Bits | 1000 |
Create Proxy Component | False |