You are working with the text-only light edition of "H.Lohninger: Teach/Me Data Analysis, Springer-Verlag, Berlin-New York-Tokyo, 1999. ISBN 3-540-14743-8".
Henry's Constant
See also: MLR, PCR, ANN, variable selection
The data set HENRYSEM contains various descriptors and properties of 157 substances, among them the logarithm of Henry's constant, the boiling point, and the melting point. The following variables are available:
ln(H)     logarithm of Henry's constant
melt.p.   melting point (deg. Celsius)
boil.p.   boiling point (deg. Celsius)
DENS20    density at 20 deg. Celsius
nD20      refractive index at 20 deg. Celsius
Hv(LB)    enthalpy of evaporation
compact   topological index indicating the compactness of a molecule
rad       topological radius
dia       topological diameter
nvz       number of branches in the molecule
Randic    Randic index
RdOz      modified Randic index
NMethyl   number of methyl groups in the molecule
TJ        topological index J (defined by Balaban)
C         number of carbon atoms
H         number of hydrogen atoms
O         number of oxygen atoms
N         number of nitrogen atoms
SumH      number of hetero (non-H, non-C) atoms in the molecule
MWgt      molecular weight
LOIX      topological index reflecting electronegativities
Use this data to model Henry's constant from the molecular descriptors. Try to compare
several methods, i.e. MLR (in combination with forward selection of variables),
PCR, and ANN (RBF networks). Which of the models is "best"?
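One possible way to set up such a comparison outside the course software is sketched below in Python with NumPy. This is only an illustration under stated assumptions, not the procedure prescribed by the exercise: the HENRYSEM file is not read here, so a synthetic stand-in with 157 rows is generated instead, and all function names (fit_mlr, cv_rmse, forward_selection, fit_pcr) are hypothetical. Models are ranked by k-fold cross-validated RMSE, which is one reasonable meaning of "best"; the RBF network is left out to keep the sketch short.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares fit with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def cv_rmse(X, y, fit, predict, k=5, seed=0):
    """k-fold cross-validated root mean squared error."""
    idx = np.random.default_rng(seed).permutation(len(y))
    sq_err = 0.0
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        model = fit(X[train], y[train])
        sq_err += np.sum((predict(model, X[test]) - y[test]) ** 2)
    return np.sqrt(sq_err / len(y))

def forward_selection(X, y, max_vars=5):
    """Greedily add the variable that lowers the CV-RMSE the most."""
    selected, best = [], np.inf
    while len(selected) < max_vars:
        scores = {j: cv_rmse(X[:, selected + [j]], y, fit_mlr, predict_mlr)
                  for j in range(X.shape[1]) if j not in selected}
        j, score = min(scores.items(), key=lambda kv: kv[1])
        if score >= best:          # stop when no candidate improves the model
            break
        selected.append(j)
        best = score
    return selected, best

def fit_pcr(X, y, n_comp):
    """Principal component regression on standardized descriptors."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    std[std == 0] = 1.0            # guard against constant columns
    Z = (X - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_comp].T              # loadings of the first n_comp components
    return mean, std, V, fit_mlr(Z @ V, y)

def predict_pcr(model, X):
    mean, std, V, coef = model
    return predict_mlr(coef, ((X - mean) / std) @ V)

# Synthetic stand-in for HENRYSEM (assumption): 157 substances, 8 descriptors,
# target depends on descriptors 0 and 3 plus noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(157, 8))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.3, size=157)

selected, rmse_mlr = forward_selection(X, y)
rmse_pcr = cv_rmse(X, y, lambda A, b: fit_pcr(A, b, n_comp=3), predict_pcr)
print("MLR + forward selection:", selected, rmse_mlr)
print("PCR with 3 components:  ", rmse_pcr)
```

With the real data, X would hold the descriptor columns and y the ln(H) column; the same cv_rmse function could then score any further model type on identical folds.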
What about modeling the boiling points and the melting points with this data set?
Can you explain any difference in how well boiling points and melting points
can be modeled?
Last Update: 2005-Jul-16