CCL: hydrogen bond



 Sent to CCL by: "Mezei, Mihaly" [mihaly.mezei^^mssm.edu]
 Greetings,
 I think that the most problematic aspect of using an AI bot to answer a
 scientific query is not necessarily the accuracy of the answer (although it IS
 important, of course) since AI is expected to improve in time and the accuracy
 could thus improve. What I find most problematic is the lack of reference to the
 source of the information. Even if the bot may be able to point to the web page
 a particular sentence is based on, it is less likely to be a peer-reviewed
 article (not that peer reviewing guarantees accuracy but still ...).
 I am also wondering about another issue related to AI bots: as time goes on,
 the internet will contain (or, can we say, be contaminated with?) a lot of pages
 generated by such bots, so training on internet pages runs into the danger of
 confirming inaccuracies.
 Mihaly Mezei
 Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai
 Voice:  (212) 659-5475   Fax: (212) 849-2456
 WWW (MSSM home): http://icahn.mssm.edu/profiles/mihaly-mezei
 WWW (Lab home - software, publications): https://mezeim01.dmz.hpc.mssm.edu
 WWW (Department): http://www.mssm.edu/departments-and-institutes/pharmacology-and-systems-therapeutics
 ________________________________________
 From: owner-chemistry+mihaly.mezei==mssm.edu(_)ccl.net
 <owner-chemistry+mihaly.mezei==mssm.edu(_)ccl.net> on behalf of Laurence
 Cuffe cuffe[*]mac.com <owner-chemistry(_)ccl.net>
 Sent: Thursday, May 4, 2023 4:50 PM
 To: Mezei, Mihaly
 Subject: CCL: hydrogen bond
 There are two aspects to this. The first is to look at the tool in its
 present form. It's been trained on a huge corpus of text from the internet, and
 the result is a little like grading a random high school chemistry text. The
 answers you get are unlikely to be either nuanced or particularly deep.
 This comes into the class of “Lies told to children”, an educational term for
 the simplifications which educators make to convey models at a level matching
 the students' knowledge and abilities.
 The prospect of training such a system on a more specific corpus of texts such
 as scientific papers is interesting, though it also presents problems. A lot of
 such text is behind paywalls, and we have a second issue of how to incorporate
 the idea of scientific progress, at a trivial level in terms of changes in
 nomenclature, and at a less trivial level in terms of the advance of science and
 the development of theory.
 The second aspect is more of a housekeeping one. While ChatGPT will respond to a
 very large number of queries, getting an effective response which maximises the
 value of the model in answering your query is more complex, and has just opened
 up as a field called prompt engineering. OpenAI has a good
 starting point on this here:
 https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api
 I offer this as a former computational chemist, who now teaches AI.
 Best
 Laurence Cuffe
 On 4 May 2023, at 01:28, Alan Shusterman alan[-]reed.edu
 <owner-chemistry,,ccl.net> wrote:
 ChatGPT's answer contains some correct information, but it is not
 "fine". It is incomplete and it can lead many readers to incorrect
 conclusions.
 First, ChatGPT provides an incomplete description of methylamine, "one
 nitrogen atom and one methyl group". That phrase adds up to N-CH3. 2 H
 atoms are missing.
 Second, most hydrogen bonds connect a hydrogen bond donor and a hydrogen bond
 acceptor (see Wikipedia, https://en.wikipedia.org/wiki/Hydrogen_bond).
 ChatGPT fails to describe either of these participants properly. It only
 mentions one property of the nitrogen (lone pair), and it does not describe any
 required properties of hydrogen (see below).
 Nitrogen properties. ChatGPT mentions lone pairs. Unfortunately, lone pairs are
 not a sufficient basis for an answer. F in HF has 3 lone pairs but is a weak
 hydrogen bond acceptor, whereas F in F anion (4 lone pairs) is a much stronger
 hydrogen bond acceptor. But don't be misled because Ne (also 4 lone pairs) is
 not a hydrogen bond acceptor. We cannot simply rely on lone pairs.
 Hydrogen properties. ChatGPT mentions "hydrogen". That is correct.
 Hydrogens are always found in hydrogen bond donors, but not every hydrogen atom
 forms hydrogen bonds. Usually the hydrogen must carry a partial positive charge,
 and this is usually achieved by H being bonded to a more electronegative atom.
 ChatGPT does not even specify which hydrogen participates in the hydrogen bond.
 While it is tempting to assume that it's implied because the hydrogen bond
 occurs between methylamine and pyrazine, that doesn't do the job because
 ChatGPT's description of methylamine is incomplete. It states only that
 "the hydrogen atom in methylamine can form a hydrogen bond", but there
 are 5 H in methylamine. ChatGPT seems to think there are only 3H because it says
 methylamine contains "one nitrogen atom and one methyl group".
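The hydrogen count above can be checked with a minimal sketch. It assumes a hand-written connectivity table for methylamine (CH3-NH2) with made-up atom labels, and it uses a simplified rule of thumb (a hydrogen bonded to N, O, or F is a candidate donor hydrogen) that is an illustrative assumption rather than a complete criterion for hydrogen bonding.

```python
# Minimal sketch: count hydrogens in methylamine (CH3-NH2) and flag those
# bonded to an electronegative atom. Atom labels and the N/O/F rule of thumb
# are illustrative assumptions, not taken from the thread.
ELEMENTS = {
    "C1": "C", "N1": "N",
    "H1": "H", "H2": "H", "H3": "H",  # methyl hydrogens
    "H4": "H", "H5": "H",              # amine hydrogens
}
BONDS = [
    ("C1", "N1"),
    ("C1", "H1"), ("C1", "H2"), ("C1", "H3"),
    ("N1", "H4"), ("N1", "H5"),
]
ELECTRONEGATIVE = {"N", "O", "F"}  # simplified set of donor heavy atoms (assumption)


def all_hydrogens(elements):
    """Return the labels of every hydrogen atom, sorted."""
    return sorted(label for label, el in elements.items() if el == "H")


def donor_hydrogens(elements, bonds):
    """Return hydrogens whose bonded partner is in ELECTRONEGATIVE."""
    donors = []
    for a, b in bonds:
        # Check the bond in both directions, since either end may be the hydrogen.
        for h, partner in ((a, b), (b, a)):
            if elements[h] == "H" and elements[partner] in ELECTRONEGATIVE:
                donors.append(h)
    return sorted(donors)


hydrogens = all_hydrogens(ELEMENTS)
donors = donor_hydrogens(ELEMENTS, BONDS)
print(len(hydrogens), "hydrogens in total:", hydrogens)
print(len(donors), "candidate donor hydrogens:", donors)
```

Under these assumptions the sketch finds 5 hydrogens, of which only the 2 on nitrogen are candidate donors, which is why "the hydrogen atom in methylamine" is ambiguous.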
 Here's another ChatGPT example that reached me a few weeks ago that has no
 bearing on this thread's conversation around H bonding.  A chemistry teacher and
 friend (who will remain anonymous) asked ChatGPT to "explain the NMR
 spectrum of ethyl acetate" and sent me ChatGPT's answer:
 Ethyl acetate is a common organic compound with the chemical formula
 CH3COOCH2CH3. Its nuclear magnetic resonance (NMR) spectrum displays
 characteristic peaks corresponding to the different types of protons present in
 the molecule.
 In the proton NMR spectrum of ethyl acetate, there are four types of protons:
 the two methyl group protons (CH3), the methylene group proton adjacent to the
 carbonyl (C=O) group (CH2), and the carbonyl group proton (C=O). The chemical
 shifts and splitting patterns of these protons are as follows:
     • The two methyl group protons appear as a singlet peak at around
 1.25 ppm (parts per million), which is typical of alkyl methyl protons.
     • The methylene group proton adjacent to the carbonyl group appears
 as a triplet peak at around 4.1 ppm, with a coupling constant of about 7 Hz.
 This triplet splitting arises from the coupling of the methylene proton with the
 two equivalent methyl group protons.
     • The carbonyl group proton appears as a singlet peak at around 2.05
 ppm, which is typical of carbonyl protons.
 Overall, the NMR spectrum of ethyl acetate displays four peaks, each
 corresponding to a different type of proton in the molecule. By analyzing the
 chemical shifts and splitting patterns of these peaks, it is possible to
 identify the different types of protons in ethyl acetate and obtain information
 about the molecular structure and bonding.
 The so-called explanation (much like the previous one regarding methylamine and
 pyrazine) is a mixed bag. There is good and there is bad. And even when there is
 good, ChatGPT uses it incorrectly.
 Example of good info: a correct formula for ethyl acetate.
 Some examples of bad inferences:
 - "carbonyl group proton" does not exist in this molecule
 - there are six, not "two" methyl group protons in the formula
 - the formula shows two methylene group protonS, not a "methylene group
 proton"
 - the methyl groups are inequivalent and produce signals at different chemical
 shifts whereas ChatGPT says there is one signal that is due to "two methyl
 group protons" and describes them both as "alkyl methyl protons"
 (actually, one is an alkyl methyl, the other is an acyl methyl)
 - many errors in the coupling patterns and explanations of coupling
 And a final conclusion that is highly misleading:
 - "the NMR spectrum of ethyl acetate displays four peaks"
 No. There are actually 3 types of protons, and they produce these signal
 patterns: a singlet (1 peak), a triplet (3 peaks) and a quartet (4 peaks) for a
 total of 8 peaks.
 ChatGPT's conclusion (4 peaks) doesn't even agree with its own analysis. It
 lists 3 types of protons, and identifies them as producing two singlets + one
 triplet -> 3 signals or 5 peaks.
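The peak count above follows from the first-order n + 1 rule, which can be sketched in a few lines. The environment table below is written by hand for ethyl acetate (CH3-CO-O-CH2-CH3) and the names are hypothetical; the sketch assumes ideal first-order coupling to equivalent neighbouring protons and ignores chemical shifts and intensities.

```python
# Minimal sketch of the first-order n + 1 rule applied to ethyl acetate.
# The table is hand-written for illustration, not derived from a structure file.
ENVIRONMENTS = [
    # (label, number of protons, equivalent protons on the adjacent carbon)
    ("acyl CH3", 3, 0),   # no H on the neighbouring carbonyl carbon
    ("O-CH2", 2, 3),      # coupled to the ethyl CH3
    ("ethyl CH3", 3, 2),  # coupled to the O-CH2
]

MULTIPLET_NAMES = {1: "singlet", 2: "doublet", 3: "triplet", 4: "quartet"}


def lines_in_multiplet(n_neighbours):
    """n equivalent neighbouring protons split a signal into n + 1 lines."""
    return n_neighbours + 1


def summarize(environments):
    """Return (label, protons, multiplet name, number of lines) per signal."""
    rows = []
    for label, n_protons, n_neighbours in environments:
        n_lines = lines_in_multiplet(n_neighbours)
        rows.append((label, n_protons, MULTIPLET_NAMES[n_lines], n_lines))
    return rows


summary = summarize(ENVIRONMENTS)
total_signals = len(summary)
total_lines = sum(row[3] for row in summary)

for label, n_protons, name, n_lines in summary:
    print(f"{label}: {n_protons} H, {name} ({n_lines} lines)")
print(f"{total_signals} signals, {total_lines} lines in total")
```

With these inputs the sketch gives a singlet, a quartet and a triplet, that is 3 signals and 8 lines, matching the count given above.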
 Obviously, this is a much worse example of ChatGPT's abilities than the previous
 one, but I think they have much in common. Tread carefully.
 Alan
 On Wed, May 3, 2023 at 1:12 PM Kshatresh Dutta Dubey
 kshatresh_+_gmail.com
 <owner-chemistry**ccl.net> wrote:
 I am pasting the answer from ChatGPT, which seems fine to me:
 "Yes, it is possible for a hydrogen bond to form between pyrazine and
 methylamine.
 Pyrazine is a six-membered aromatic heterocycle containing two nitrogen atoms in
 its ring structure, while methylamine is a simple amine molecule with one
 nitrogen atom and one methyl group.
 In the case of hydrogen bonding, the hydrogen atom in methylamine can form a
 hydrogen bond with one of the nitrogen atoms in pyrazine. This can occur because
 nitrogen has a lone pair of electrons, which can form a hydrogen bond with
 hydrogen."
 Hope it helps.
 KDD
 --
 Dr. Partha Sarathi Sengupta
 Associate Professor
 Vivekananda Mahavidyalaya, Burdwan
 --
 Alan Shusterman
 Professor Emeritus
 Chemistry Department
 Reed College
 3203 SE Woodstock Blvd
 Portland, OR 97202-8199
 http://blogs.reed.edu/alan/
 "Patience, persistence, and a sense of humor." Dave Barrett
 (1956-2017, Reed College '79)