From owner-chemistry {*at*} ccl.net Fri May 5 23:49:00 2023
From: "Mezei, Mihaly mihaly.mezei]=[mssm.edu"
To: CCL
Subject: CCL: hydrogen bond
Message-Id: <-54914-230505142055-17821-gNptjfO0p7SLTJECOEeF/Q|a|server.ccl.net>
Date: Fri, 5 May 2023 18:20:40 +0000

Sent to CCL by: "Mezei, Mihaly" [mihaly.mezei^^mssm.edu]

Greetings,

I think that the most problematic aspect of using an AI bot to answer a scientific query is not necessarily the accuracy of the answer (although it IS important, of course), since AI is expected to improve with time and the accuracy could thus improve. What I find most problematic is the lack of reference to the source of the information. Even if the bot is able to point to the web page a particular sentence is based on, that page is less likely to be a peer-reviewed article (not that peer review guarantees accuracy, but still ...).

I am also wondering about another issue related to AI bots: as time goes on, the internet will contain (or, can we say, be contaminated with?) a lot of pages generated by such bots, so training on internet pages runs the risk of confirming inaccuracies.

Mihaly Mezei
Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai
Voice: (212) 659-5475 Fax: (212) 849-2456
WWW (MSSM home): http://icahn.mssm.edu/profiles/mihaly-mezei
WWW (Lab home - software, publications): https://mezeim01.dmz.hpc.mssm.edu
WWW (Department): http://www.mssm.edu/departments-and-institutes/pharmacology-and-systems-therapeutics
________________________________________
From: owner-chemistry+mihaly.mezei==mssm.edu(_)ccl.net on behalf of Laurence Cuffe cuffe[*]mac.com
Sent: Thursday, May 4, 2023 4:50 PM
To: Mezei, Mihaly
Subject: CCL: hydrogen bond

There are two aspects to this. One is to look at the tool in its present form.
It's been trained on a huge corpus of text from the internet, and the result is a little like grading a random high school chemistry text. The answers you get are unlikely to be either nuanced or particularly deep. This falls into the class of “Lies told to children”, an educational term for the simplifications which educators make to convey models at a level matching the students' knowledge and abilities. The prospect of training such a system on a more specific corpus of texts, such as scientific papers, is interesting, though it also presents problems. A lot of such text is behind paywalls, and we have a second issue of how to incorporate the idea of scientific progress, at a trivial level in terms of changes in nomenclature, and at a less trivial level in terms of the advance of science and the development of theory.

The second aspect is more of a housekeeping one. While ChatGPT will respond to a very large number of queries, getting an effective response which maximises the value of the model in answering your query is more complex, and has just opened up as a field called prompt engineering. OpenAI has a good starting point on this here: https://help.openai.com/en/articles/6654000-best-practices-for-prompt-engineering-with-openai-api

I offer this as a former computational chemist, who now teaches AI.

Best
Laurence Cuffe

On 4 May 2023, at 01:28, Alan Shusterman alan[-]reed.edu wrote:

ChatGPT's answer contains some correct information, but it is not "fine". It is incomplete and it can lead many readers to incorrect conclusions.

First, ChatGPT provides an incomplete description of methylamine, "one nitrogen atom and one methyl group". That phrase adds up to N-CH3. 2 H atoms are missing.

Second, most hydrogen bonds connect a hydrogen bond donor and a hydrogen bond acceptor. See Wikipedia, https://en.wikipedia.org/wiki/Hydrogen_bond
ChatGPT fails to describe either of these participants properly.
It only mentions one property of the nitrogen (lone pair), and it does not describe any required properties of hydrogen (see below).

Nitrogen properties. ChatGPT mentions lone pairs. Unfortunately, lone pairs are not a sufficient basis for an answer. F in HF has 3 lone pairs but is a weak hydrogen bond acceptor, whereas F in F anion (4 lone pairs) is a much stronger hydrogen bond acceptor. But don't be misled because Ne (also 4 lone pairs) is not a hydrogen bond acceptor. We cannot simply rely on lone pairs.

Hydrogen properties. ChatGPT mentions "hydrogen". That is correct. Hydrogens are always found in hydrogen bond donors, but not every hydrogen atom forms hydrogen bonds. Usually the hydrogen must carry a partial positive charge, and this is usually achieved by H being bonded to a more electronegative atom. ChatGPT does not even specify which hydrogen participates in the hydrogen bond. While it is tempting to assume that it's implied because the hydrogen bond occurs between methylamine and pyrazine, that doesn't do the job because ChatGPT's description of methylamine is incomplete. It states only that "the hydrogen atom in methylamine can form a hydrogen bond", but there are 5 H in methylamine. ChatGPT seems to think there are only 3 H because it says methylamine contains "one nitrogen atom and one methyl group".

Here's another ChatGPT example that reached me a few weeks ago that has no bearing on this thread's conversation around H bonding. A chemistry teacher and friend (who will remain anonymous) asked ChatGPT to "explain the NMR spectrum of ethyl acetate" and sent me ChatGPT's answer:

Ethyl acetate is a common organic compound with the chemical formula CH3COOCH2CH3. Its nuclear magnetic resonance (NMR) spectrum displays characteristic peaks corresponding to the different types of protons present in the molecule.
In the proton NMR spectrum of ethyl acetate, there are four types of protons: the two methyl group protons (CH3), the methylene group proton adjacent to the carbonyl (C=O) group (CH2), and the carbonyl group proton (C=O). The chemical shifts and splitting patterns of these protons are as follows:

• The two methyl group protons appear as a singlet peak at around 1.25 ppm (parts per million), which is typical of alkyl methyl protons.
• The methylene group proton adjacent to the carbonyl group appears as a triplet peak at around 4.1 ppm, with a coupling constant of about 7 Hz. This triplet splitting arises from the coupling of the methylene proton with the two equivalent methyl group protons.
• The carbonyl group proton appears as a singlet peak at around 2.05 ppm, which is typical of carbonyl protons.

Overall, the NMR spectrum of ethyl acetate displays four peaks, each corresponding to a different type of proton in the molecule. By analyzing the chemical shifts and splitting patterns of these peaks, it is possible to identify the different types of protons in ethyl acetate and obtain information about the molecular structure and bonding.

The so-called explanation (much like the previous one regarding methylamine and pyrazine) is a mixed bag. There is good and there is bad. And even when there is good, ChatGPT uses it incorrectly.

Example of good info: a correct formula for ethyl acetate.
Some examples of bad inferences:

- "carbonyl group proton" does not exist in this molecule
- there are six, not "two" methyl group protons in the formula
- the formula shows two methylene group protonS, not a "methylene group proton"
- the methyl groups are inequivalent and produce signals at different chemical shifts, whereas ChatGPT says there is one signal that is due to "two methyl group protons" and describes them both as "alkyl methyl protons" (actually, one is an alkyl methyl, the other is an acyl methyl)
- many errors in the coupling patterns and explanations of coupling

And a final conclusion that is highly misleading:

- "the NMR spectrum of ethyl acetate displays four peaks"

No. There are actually 3 types of protons, and they produce these signal patterns: a singlet (1 peak), a triplet (3 peaks) and a quartet (4 peaks) for a total of 8 peaks. ChatGPT's conclusion (4 peaks) doesn't even agree with its own analysis. It lists 3 types of protons, and identifies them as producing two singlets + one triplet -> 3 signals or 5 peaks.

Obviously, this is a much worse example of ChatGPT's abilities than the previous one, but I think they have much in common. Tread carefully.

Alan

On Wed, May 3, 2023 at 1:12 PM Kshatresh Dutta Dubey kshatresh_+_gmail.com wrote:

I am pasting the answer from ChatGPT, which seems fine to me:

"Yes, it is possible for a hydrogen bond to form between pyrazine and methylamine. Pyrazine is a six-membered aromatic heterocycle containing two nitrogen atoms in its ring structure, while methylamine is a simple amine molecule with one nitrogen atom and one methyl group. In the case of hydrogen bonding, the hydrogen atom in methylamine can form a hydrogen bond with one of the nitrogen atoms in pyrazine. This can occur because nitrogen has a lone pair of electrons, which can form a hydrogen bond with hydrogen."

Hope it helps.
KDD

--
Dr. Partha Sarathi Sengupta
Associate Professor
Vivekananda Mahavidyalaya, Burdwan

--
Alan Shusterman
Professor Emeritus
Chemistry Department
Reed College
3203 SE Woodstock Blvd
Portland, OR 97202-8199
http://blogs.reed.edu/alan/
"Patience, persistence, and a sense of humor." Dave Barrett (1956-2017, Reed College '79)