CCL: What do cheminformaticists do with inconsistently measured data?


I'm not sure whether hydrocarbons are your actual goal or just an example, but if you're really interested in their densities and would like to correct them to a common reference temperature, you could start with the standard methods devoted to this issue. For instance, back when I worked for some oil companies, we used the ASTM D1298 standard method (along with the standard tables and equations specified by the method) to measure the densities of petroleum products, which had to be corrected to 15 °C. In short, we applied a correction factor per °C, and that factor differed by kind of petroleum product (saturated, unsaturated or aromatic).
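To make the class-based idea concrete, here is a minimal Python sketch of a linear correction with one factor per product class. The function name, the dictionary and especially the factor values are hypothetical placeholders of mine; they are not taken from ASTM D1298 or its tables, which should be consulted for real work.

```python
# Hypothetical per-class correction factors in g/(cm^3 * degC).
# Placeholders for illustration only; NOT values from ASTM D1298 or its tables.
CORRECTION_PER_DEGC = {
    "saturated": 0.0007,
    "unsaturated": 0.0007,
    "aromatic": 0.0008,
}


def correct_to_reference(density, measured_temp_c, product_class,
                         reference_temp_c=15.0):
    """Linearly shift a density measured at measured_temp_c to reference_temp_c."""
    factor = CORRECTION_PER_DEGC[product_class]
    # Density falls as temperature rises, so a sample measured above the
    # reference temperature receives a positive correction.
    return density + factor * (measured_temp_c - reference_temp_c)


# A density of 0.7500 g/cm^3 measured at 20 degC, corrected to 15 degC.
corrected = correct_to_reference(0.7500, 20.0, "saturated")
```

A sample measured exactly at the reference temperature comes back unchanged, which is a quick sanity check on the sign convention.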

That's an engineering-oriented approach. A more science-based approach would use the thermal expansion coefficient of each individual substance (the ASTM methods use average corrections for a whole class of substances) to correct each density to the reference temperature.
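As a rough sketch of the per-substance route, assuming the volumetric thermal expansion coefficient beta is approximately constant over the small temperature interval, the volume scales as V(T2) = V(T1) * (1 + beta * (T2 - T1)), so the density is divided by the same factor. The function name and the beta value below are placeholders of mine, not measured data; a real study would look up beta for each compound.

```python
def correct_density(density, measured_temp_c, target_temp_c, beta_per_k):
    """Convert a density from measured_temp_c to target_temp_c.

    Uses rho(T2) = rho(T1) / (1 + beta * (T2 - T1)), which assumes that the
    volumetric thermal expansion coefficient beta (in 1/K) is constant
    between the two temperatures.
    """
    return density / (1.0 + beta_per_k * (target_temp_c - measured_temp_c))


# Placeholder expansion coefficient in 1/K, for illustration only;
# it is not a measured value for any particular alkane.
BETA_EXAMPLE = 1.0e-3

# A density of 0.7000 g/cm^3 measured at 20 degC, converted to 25 degC.
rho_25 = correct_density(0.7000, 20.0, 25.0, BETA_EXAMPLE)
```

Under this approximation the correction is a division by a substance-specific factor rather than by a single constant for all compounds.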

I hope it helps.



On Fri, Apr 21, 2017 at 7:15 PM, David Shobe <owner-chemistry[]> wrote:
Please excuse crossposting.

For example, if one is doing a QSPR (quantitative structure-property relationship) study of densities of alkanes, and encounters the problem that some densities are measured at 20°C and others at 25°C, how should one handle the inconsistency of measurement conditions? Note that the difference in density for the same alkane between 20°C and 25°C might be significant in comparison to the difference in density between two isomeric alkanes at the same temperature. Is it legitimate to try to correct/standardize the 20°C densities to 25°C densities by subtracting some constant from the 20°C densities or dividing them by some constant? And if so, how does one determine that constant? Are there other approaches one can use?

--David Shobe


Prof. Dr. André Farias de Moura
Department of Chemistry
Federal University of São Carlos
São Carlos - Brazil
phone: +55-16-3351-8090