Date: Thu, 03 Aug 1995 10:15:01 -0800 (PST)
From: GUENTER GRETHE
Subject: ACS Chicago - CINF Abstracts - 29 pages document
To: chminf-l;at;iuvbm.ucs.indiana.edu
Cc: chemind-l <-at-> derwent.co.uk, chemistry <-at-> ccl.net

These abstracts are posted on multiple lists. I apologize for any inconvenience.

With an excellent program awaiting us in Chicago, can the next ACS meeting be far away? For the Spring meeting in New Orleans the CINF division has scheduled the following symposia:

    Neural Networks in Chemistry
    AI-Based ("Smart") Techniques for End-User Searching
    Chemical Information Handling for Combinatorial Libraries
    Managing Information in Databases of Three-Dimensional Structures
    Information Needs of Regulated Chemical Research
    Utilization of Information in Databases of Biologically Active Compounds

In addition to scheduling these topic-oriented symposia, the division would also like to invite its members to showcase the diverse and interesting field of chemical information in the technology age by presenting papers in general oral or poster sessions.
Abstracts for New Orleans are due November 15th. Any inquiries or requests for ACS abstract forms should be sent to:

    Guenter Grethe
    MDL Information Systems, Inc.
    14600 Catalina Street
    San Leandro CA 94577
    voice: 510-895-1313
    fax: 510-614-3652
    e-mail: guenter -x- at -x- mdli.com

ABSTRACTS, CINF PROGRAM - FALL ACS MEETING IN CHICAGO

SUNDAY MORNING, AUGUST 20, 1995
Collaborative Electronic Notebooks - Legal, Regulatory, Social and Technical Issues
R. Lysakowski, Presiding

1. 9:00 AM
ELECTRONIC LAB NOTEBOOKS: SHOULD WE, CAN WE, WILL WE? Raymond E. Dessy, Virginia Polytechnic Institute & State University, Chemistry Dept. 0212, Blacksburg, VA 24061.
Should We?: Let's explore what Electronic Lab Notebooks (ELNs) can do to accelerate the product development cycle and to improve the corporate memory of facts and information. We'll also explore how they can improve the memory of corporate decision-making processes, essential in a period when employee turnover is so rapid. Can We?: What technical infrastructures must be in place and well supported if ELNs are to succeed? What software growth areas need to mature? Will We?: For ELNs to succeed there must be a strong groundswell of support and use by the user community, as well as a simultaneous and highly visible commitment from a technically aware senior management. Without these synergistic forces, the workplace changes will simply not occur. Conservative, strongly reactionary users will object to the sharing that is implicit in a fully implemented system. WE are the greatest weakness in this exciting new lab tool. The presentation will draw from anonymous examples of success and failure.

2. 9:30 AM
BASELINE REQUIREMENTS FOR ACCEPTABILITY, USABILITY, & TECHNOLOGY FOR COLLABORATIVE ELECTRONIC NOTEBOOKS. R. Lysakowski, Optimize Technologies, Sudbury, MA 01776.
Collaborative Electronic Laboratory Notebooks (CENBs) are at the intersection of law, technology, business, sociology and regulatory practices. Acceptability means different things to people in these different domains. Group software systems can be, and have been, designed without paying close attention to all of these factors. However, in today's business climate, those that fail to take them all into account can die stillborn, never getting a chance to correct inherent mistakes, or wither until they die on the vine because they are never taken seriously by the market. For few tools have people longed so much, or engaged in so much controversy, much of it inflated by ignorance of some key fundamentals. Yet the promises of productivity enhancements and workplace transformation are now being fulfilled. As component technologies and interface standards arrive that permit the required degrees of integration, vendors are building their applications and systems with them, and people are learning to use these new combinations within newly designed work processes. This talk will address the fundamentals, issues of acceptability, details of usability and testing for it, and issues of fit by component technologies and full systems. It will be based on results of major studies of requirements and recent evaluations of technology fit to those requirements.

3. 10:00 AM
ADMISSIBILITY AND CREDIBILITY OF ELECTRONIC RECORDS. Richard D. Rochford, Jr. and Lisa Dolak, Nixon, Hargrave, Devans and Doyle, Clinton Square, Box 1051, Rochester, NY 14603.
Scientists, quite naturally, often prefer to record research by computer instead of by pen and ink in laboratory notebooks. But are electronic records admissible to prove, for example, the date an invention was reduced to practice?
Even if admissible, are electronic records open to attack on the grounds that they are less credible than bound paper notebooks, and thus entitled to little weight? This presentation will explore the treatment of electronic records by the federal courts. Potential grounds for attacking the reliability and credibility of electronic records will be illustrated through a brief "mock trial" demonstration.

4. 10:30 AM
ESTABLISHING RELIABILITY FOR ELECTRONIC DOCUMENTS - REQUIREMENTS AND STATE OF THE ART IN NOTARY SERVICES. Scott Stornetta, Surety Technologies, Chatham, NJ 07928.
The prospect of a world in which all text, audio, picture and video documents are in digital form raises a troublesome question. How can one certify a record unimpeachably, locking it in content and time? This problem was solved by a group of cryptographers at Bellcore in the late '80s. Remarkably, it can be done without keys, without disclosure, and without the need to trust anyone. The technology is now incorporated into Surety's Digital Notary System, which solves significant problems relating to electronic records management. The particular needs of the research community and laboratory notebooks for reliable and certifiable electronic documents will be discussed, as will some background and various technical approaches for solving these problems.

5. 11:00 AM
TOWARDS A REFERENCE MODEL FOR BUSINESS ACCEPTABLE COMMUNICATION. David Bearman, Archives & Museum Informatics, 5501 Walnut St., Pittsburgh, PA 15232.
In order for electronic communications to serve as the basis for laboratory notebooks or documentation of collaboration in socially significant business transactions, they must be trustworthy and available over time.
Research on the attributes of electronic evidence at the University of Pittsburgh has led to formulation of "Functional Requirements for Recordkeeping". Further specification of these functional requirements in production rules results in formulation of a metadata encapsulated object which is, by virtue of the presence of this metadata, a record and can serve as evidence. Such metadata encapsulated objects would be viable for the conduct of business and suited to archiving in an efficient fashion. This paper proposes a framework for deriving such a Reference Model and the layers that give it modularity.

6. 11:30 AM
UPDATE ON ELECTRONIC SIGNATURES AND RECORDS. Paul Mortise, U. S. Food and Drug Administration, 7520 Standish Place, Rockville, MD 20855.
In the August 31, 1994 Federal Register (59 FR 45160), FDA published a Proposed Rule on Electronic Signatures; Electronic Records. The new rule, part 11, would establish the conditions under which FDA will consider electronic records, electronic signatures, and handwritten signatures executed to electronic records to be trustworthy, reliable, and generally equivalent to paper records and handwritten signatures executed on paper. The regulations would apply to all records in electronic form that are created, modified, or maintained pursuant to any records requirements in Chapter I of Title 21 of the Code of Federal Regulations. Intended to promote and accept new technologies while maintaining FDA's enforcement integrity, the rule is part of the Administration's reinventing government effort and, as such, has been put on a fast track toward completion. This presentation will provide the most timely information possible on the project, and will review the provisions of the proposed rule as well as some of the more significant comments made by industry respondents.

SUNDAY AFTERNOON, AUGUST 20, 1995, SECTION A
Continuation - Electronic Notebooks
R. Lysakowski, Presiding

7. 1:30 PM
GUNS, MONEY AND GROUPWARE - THE POWER BASE OF A LEARNING R&D ORGANIZATION. Sean O'Brien, Advanced Communication Systems, South Hamilton, MA 01982.
Organizations throughout time have based their power on competitive advantage. The basis for the competitive advantage has changed over the years, but the old adage, "may the best one win" has meaning to this day. In years past the winning team was often the one with the most money or the biggest guns. Knowledge has emerged as a basis for power today, especially for the R&D-based organization. Depending on your position relative to the Knowledge, you possess either a competitive advantage, or a barrier to entry that may bar your ability to succeed. Knowledge is the fundamental source of power in the R&D Organization. Yet organizations suffer from true amnesia: first they get hit on the head when people leave, then forget all but the most obvious lessons they've learned, and don't preserve the details required to repeat success with ease. Many tools and practices have developed to enable the development and capture of knowledge. A class of tool commonly called "Groupware" can capture both the process knowledge and information generated by that process. The use of "Groupware" is becoming a necessity for an organization whose power is based on Knowledge. Using groupware, though, requires more than just a purchase of software. A cultural change in the organization needs to take place. The role of knowledge in the organization and how it is shared or not shared takes on a sharp focus. A study of the motivations and requirements for using electronic notebooks, document management, and collaboration systems was undertaken.
The goals of the study were to understand what motivational, value, or ethical shifts were required to apply groupware and related technologies to the problem of R&D collaboration across organizations, and the organizational obstacles that must be overcome to bring such systems into routine operation in laboratories.

8. 2:00 PM
THE NATIONAL ENVIRONMENTAL MOLECULAR SCIENCES COLLABORATORY. Richard T. Kouzes, Pacific Northwest Laboratory, Richland, WA 99352.
A Collaboratory is an open meta-laboratory that spans multiple geographical areas with collaborators interacting via electronic means - "working together apart." Collaboratories are designed to enable close ties among scientists in a given research area, to promote collaborations involving scientists in diverse areas, to accelerate the development and dissemination of basic knowledge, and to minimize the time-lag between discovery and application. Pacific Northwest Laboratory (PNL), one of the U.S. Department of Energy's national laboratories, is developing a Collaboratory that will enable more effective environmental molecular sciences research. The testbed for this activity is the instrumentation being developed for the Environmental Molecular Sciences Laboratory (EMSL) at PNL. Creating a Collaboratory entails integrating software and hardware computing tools to produce an environment where multiple geographically separated researchers can collaborate on experimental and analysis tasks, data sharing, joint operation of computer and instrument resources, and exchange of personal expertise. The Collaboratory development status will be provided.

9. 2:30 PM
USING COLLABORATIVE TECHNOLOGIES FOR SCIENTIFIC RESEARCH - SOME PILOT STUDY EXPERIENCES. A. Scour, Information and Engineering Sciences Department, Pacific Northwest Laboratory, P.O. Box 999, Richland, WA 99352.
In the physical world scientists use a vast arsenal of methods and strategies to amplify the dynamic process of collaboration. Often these techniques go unnoticed by the participants, enabling them to focus on problem solving, gaining insight, synthesizing new information, and discovery. Computers and software that can support electronic collaboration between individuals are now emerging and beginning to be used. While functionally capable, they often fall short with respect to the social give-and-take and cognitive factors that are necessary for scientific collaborations to be successful. Today's collaborative technologies often inconveniently divert scientists' attention from the scientific task. Using examples from small pilot studies being performed at Pacific Northwest Laboratory, the talk will address some of the social and cognitive factors that should be enabled within the fabric of scientific collaborative systems to provide a collaborative environment that does not interrupt collaborators from doing science. Attention will be given to issues of providing concurrent spontaneity of communication and the evolved social practices between collaborators.

10. 3:00 PM
MANAGING AND PRESERVING KNOWLEDGE CONTAINED IN DOCUMENTS. E. P. Dion, Mobil Exploration and Producing Technical Center, P.O. Box 650232, Dallas, TX 7565.
Knowledge has historically been preserved and managed through paper documents. While computers were initially used only to automate their creation, the recent phenomenon of networked computing demands the use of electronic rather than paper-based documentation. This presents new challenges in properly preserving document-based knowledge in the exact form intended by the author.
The fundamental limitation of using computer-based tools for knowledge management is their inability to identically display compound documents across diverse computing platforms. To fully support network knowledge exchange, documents must be viewable as originally created, including layout, fonts, and graphics of any size or complexity. The emergence of portable documents (represented by such products as Acrobat, Common Ground, Envoy, and Replica) serves as the basis for universal workgroup communication. As such, they augment the search and delivery capabilities of other workgroup tools by presenting identical document views.

11. 3:30 PM
THE CHANGING SCIENTIFIC WORK PRODUCT. Howard M. Kanare, Chemical Services, Construction Technology Laboratories, Inc., 5420 Old Orchard Rd, Skokie, IL 60077-1030.
Scientific records, especially laboratory notes, are used in different ways by people with different work functions such as research, product development, production, analysis and testing, management, teaching, and consulting. Work specialties, such as environmental, polymer, or inorganic chemistry, have much in common in terms of information needs when viewed within work functions. The shift to computer-based laboratory notes should cause scientists and software developers to think of specific needs within work function levels. Is the scientific work product changing fundamentally? Or are electronic, virtual notebooks and collaborative systems just the latest in a progression of record-keeping tools for scientists?

12. 4:00 PM
COLLABORATIVE VIRTUAL WORKSPACES. Peter Spellman, MITRE Corporation, Bedford, MA 01730.
As organizations become more geographically and temporally dispersed, computer-supported collaborative environments (to be differentiated from vertical collaborative applications) become more important as vehicles to support intra- and inter-organizational coordination and collaboration.
This talk presents a next-generation collaborative Virtual Workspace (under development at MITRE, utilizing current Internet technologies) to support location-transparent, location-independent, multimedia collaborations within contexts provided by a flexible spatial metaphor.

SUNDAY AFTERNOON, AUGUST 20, 1995, SECTION B
Symposium in Honor of Gerald Vander Stouw
G. Grethe, Presiding

13. 2:30 PM
THE CAS CHEMICAL REGISTRY SYSTEM: PAST, PRESENT, AND FUTURE. CAS REGISTRY FILE. William Fisanick, Wladyslaw V. Metanomski, Robert E. Stobaugh, Chemical Abstracts Service, 2540 Olentangy River Road, P. O. Box 3012, Columbus, OH 43210.
The CAS Chemical Registry System is a computer-based system that has been used for the past 30 years to uniquely identify chemical substances on the basis of their molecular structure. The Registry database currently contains representations and related information for over 13 million substances. CAS Registry Numbers are used as substance identifiers in CAS databases as well as in a variety of non-CAS databases, including those of many regulatory agencies. Begun originally to support the indexing of CHEMICAL ABSTRACTS (CA), the Registry System has become a worldwide inventory of chemical substances and has assisted chemists and other scientists around the world in their endeavors. The initial version of the system, introduced in 1965 and known as Registry I, was limited to fully defined organic substances. Registry II (1968) extended the machine registration to a variety of substance classes such as inorganic compounds, coordination compounds, alloys, and polymers. Registry III (1973) involved major adjustments in the Registry records to make the system more effective in support of CAS indexing operations. In 1989 CAS began a long-term effort to further improve the Registry System.
These efforts have been directed at improving the handling of certain substance classes, expanding the search, retrieval and display capabilities, and expanding the scope of Registry coverage. This paper will discuss and illustrate the evolution of the key features of the Registry System.

14. 3:00 PM
IDENTIFICATION OF SUBSTANCE NAMES IN CHEMICAL TEXTS. Nick M. Kemp and Michael F. Lynch, University of Sheffield, United Kingdom.
Much attention has been devoted to translating chemical names into other forms such as molecular formulae (Garfield) and connection tables (Vander Stouw), once they are isolated and recognized as such. Much less attention has been paid to identifying substance names in running text, which becomes an increasing priority as the complexity of substances increases and error-free processing becomes more important. The elaboration of a methodology for identifying substance names in English-language chemical texts is described, and the process, still at an early stage, is evaluated.

15. 3:30 PM
THE DIGITAL INFORMATION EXPLOSION AND ITS IMPACT ON CHEMICAL RESEARCH. B. Lawlor, Advanced Research Technologies, 1062 Lancaster Avenue, Suite 18-C, Rosemont, PA 19010.
Since the release of the IBM PC in 1981, the Information Industry has undergone major changes, with the single dominant force being the method of access to and delivery of information. As a result, the practice of chemical research has also begun to evolve. Computer literacy among chemical researchers has grown and chemical information has risen to a new level of importance. Indeed, information in digital format can be quickly accessed and delivered worldwide to laboratories, offices, and homes - facilitating the speedy development of practical applications of theoretical concepts. Chemists can now communicate quickly and more efficiently via international networks, fostering collaborative efforts.
These positive results of the digital information explosion, along with the current vulnerabilities that could inhibit its full utilization in scientific communication, will be discussed.

16. 4:00 PM
AN OVERVIEW OF CHEMISTRY RESOURCES ON THE INTERNET. G.D. Wiggins, Chemistry Literacy, Indiana University, Bloomington, IN 47405.
The shift on the Internet from access via client software that is primarily text-based (e.g., gopher) to clients that readily handle multimedia (e.g., Mosaic, Cello, Netscape) is well under way. For chemistry, the shift opens up the possibility to visualize chemical substances via the Internet. Thus, standards are evolving to enable the inclusion of chemical structures in many types of Internet files. Nevertheless, many resources with significant chemical information content, but no graphical components, are found in many places on the Internet. An overview of both types of resources will be presented, with examples.

17. 4:30 PM
USE OF MULTIMEDIA IN A HIGH SCHOOL SCIENCE INSTRUCTIONAL PROJECT. Kenneth M. Chapman and Richard A. Love, American Chemical Society, Washington, DC 20036.
SciTeKS (Science Technology: Knowledge and Skills) is a hands-on, multidisciplinary instructional project for high school students enrolled in school-to-work programs. The course of study will consist of a series of up to 20 modules focusing on applications of chemistry, biology, and the geological sciences to solve technological problems, thus offering a contextual learning environment. This "in-context" or "need-to-know" method of instruction, providing themes for introducing the science needed to solve problems presented in the modules, offers an engaging and motivational approach for students to learn applied and conceptual science. An integral part of the project design is a multimedia application, which includes the multimedia components of text, graphics, sound, video, hypertext linking, animations, and interactivity.
The content of the application incorporates an encyclopedia of core reference information; overview "scene-setting" QuickTime videos of the industrial sites described in the modules; and instructional units that present an interactive learning environment for the students. This presentation will provide an overview of the multimedia application design and the status of work-in-progress on this 3-year NSF-funded project.

18. 5:00 PM
AIDS INFORMATION IN THE 90'S: ELECTRONIC DATABASES, BULLETIN BOARDS, AND THE INTERNET. R. Bates, A. Wilson, and M. Vander Kolk, Aspen Systems Corporation, 1600 Research Blvd., Rockville, MD 20850.
Acquired Immunodeficiency Syndrome (AIDS), a relatively new disease, has generated a vast amount of information and data about its etiology, epidemiology, prevention, treatment and research. One of the many unique aspects of this disease is the proactive role that Human Immunodeficiency Virus (HIV)-infected people and AIDS patients are playing in the management and control of their illness. One manifestation of this role, a reflection of the electronic environment we live in, is the proliferation of nonprint, nonstandard, nontraditional information products and services targeted at patients, families, activists, and other audiences that are appearing on electronic bulletin boards and on the Internet. This presentation will cover the major categories of HIV/AIDS information resources, their target audiences, and how to find and use these various information resources.

MONDAY MORNING, AUGUST 21, 1995, SECTION A
Use of Chemical Information in Generating New Compound Leads
G. Grethe, Presiding

19. 9:05 AM
INFORMATION NEEDS OF A MEDICINAL CHEMIST IN A CHANGING WORLD. A personal view of 25 years of information technology. Hans H. Hausberg, E. Merck, Scientific Information Systems, Frankfurter Strasse 250, 64261 Darmstadt, Germany.
Up to the sixties the information technology to fulfill the needs of a medicinal chemist was rather limited.
Besides some functionally very restricted batch-oriented systems running on an IBM mainframe to store structures and alphanumerical data, the tools to maintain data were filing cards and paper. Starting with synthesis planning systems, programs running on minicomputers became available during the seventies to store chemical information. The eighties saw modeling tools to perform highly sophisticated QSAR studies and a growing list of in-house databases with various kinds of information. The nineties are dominated by integration technology which brings information directly to the chemist's personal desktop computer under one graphical user interface.

20. 9:35 AM
USE OF 3D PHARMACOPHORE SEARCHING IN DRUG LEAD DISCOVERY. G. W. A. Milne, M. C. Nicklaus, Shaomeng Wang, Lab. of Medicinal Chemistry, National Cancer Institute, NIH, Bethesda, MD 20892.
The National Cancer Institute's three-dimensional (3D) structural database and sample repository contains over 450,000 compounds which have been tested for anticancer activity. Samples of most of the compounds are available and a 3D database has been created from the 2D structures. This is routinely searched for pharmacophores, defined in terms of the x, y, z coordinates of the pharmacophore atoms. The retrievals from such searches can be tested immediately in the appropriate enzyme assay and the active compounds that are found are treated as drug leads. In this way, we have identified numerous leads with activity against protein kinase C, HIV protease and HIV integrase. These leads can be refined by selection of low energy conformers and adjustment of physical properties such as solubility to optimize their activity.

21. 9:55 AM
BINDING MODES OF NOVEL HIV PROTEASE INHIBITORS BY MOLECULAR MODELING. Shaomeng Wang, M. C. Nicklaus, G. W. A. Milne, Xinjian Yan, William G. Rice, NIH, National Cancer Institute, Lab. of Medicinal Chemistry, Bethesda, MD 20892.
Molecular modeling studies of the interactions between HIV protease receptor and high binding affinity inhibitors have enabled us to define the pharmacophores in HIV protease. Searches of the National Cancer Institute's three-dimensional (3D) structural database using the defined pharmacophores have led to the discovery of several classes of novel, potent, non-peptide HIV protease inhibitors. One of these newly discovered inhibitors (IC50 = 1.7 uM) was found to be capable of protecting CEM cells against HIV-1 infections at concentrations (EC50 = 12 uM) well below cytotoxic concentrations (IC50 = 53 uM). This compound was also determined to be an HIV integrase inhibitor (IC50 = 0.7 uM), suggesting that the compound could target several proteins important to viral replication and hence represents a promising lead in the development of anti-HIV agents. Molecular modeling was used to determine the binding modes of these HIV protease inhibitors on the active site, to gain insight into structure-activity relationships and to design new compounds based on these leads. The results of the molecular modeling studies, the 3D pharmacophore searching and the biological evaluations will be discussed.

22. 10:15 AM
HIV-1 INTEGRASE INHIBITORS. 3D SEARCHING AND ACTIVE SITE DOCKING. M. C. Nicklaus, Yves Pommier, Abhijit Mazumder, G. W. A. Milne, NIH, National Cancer Institute, Lab. of Medicinal Chemistry, Bethesda, MD 20892.
The National Cancer Institute's three-dimensional (3D) structural database was searched for HIV-1 integrase (IN) inhibitors with a putative pharmacophore that was derived from known IN inhibitors. Four of the 23 compounds that were so identified, obtained from the NCI sample repository, and tested in an IN assay showed inhibitory activity. They all belong to a class of compounds different from those that we have analyzed previously and that were used to derive the pharmacophore.
Exploiting the knowledge gained from these studies and from other previously known IN inhibitors, the 3D structure of the catalytically active core domain of IN, which was recently solved by x-ray crystallography, was used in the next steps of IN inhibitor development. Docking of known inhibitors in the assumed active site provides insight into the binding mechanism and binding energies, and this in turn allows quantitative modeling of the binding of novel compounds to IN in order to develop potent inhibitors of this enzyme, which is crucial to the production of HIV-1.

23. 10:35 AM
EXPLOITING DATA FROM COMBINATORIAL SYNTHESIS AND SCREENING. Steven M. Muskal, MDL Information Systems, Inc., 14600 Catalina St., San Leandro, CA 94577.
Many companies have used combinatorial chemistry and HTS to substantially reduce the time and cost associated with lead generation and optimization. However, as more compounds are produced/purchased and assayed, more robust methods of learning rules and relationships in this data will be necessary to completely realize return on the organizational investment. Computer simulated neural networks, for example, are quite capable of learning even the most complex rules necessary to predict biological activity, provided a representative set of learning examples is available and an appropriate numerical representation of chemical structure is employed. We will discuss the utility of neural networks in structure-activity learning and describe their utility as "electronic assays" capable of surveying large molecular populations. Here, once a neural network demonstrates adequate predictive performance for a given structure-activity series, it can be used to "electronically screen" compound databases, prospective libraries, and/or virtual libraries for compounds that are probably active. We will discuss these and other uses of this methodology from reagent selection to overall library assessment.

24. 11:00 AM
3D PROPERTY BASED PRECURSOR SELECTION FOR COMBINATORIAL LIBRARY CONSTRUCTION. Robert D. Brown, Mark G. Bures, Yvonne C. Martin, Patricia A. Pavlik, Abbott Laboratories, Pharmaceutical Products Division, Abbott Park, IL 60064-3500.
A procedure has been developed to select diverse sets of precursors for the construction of combinatorial libraries. A set of precursors is selected to have a given functionality through which each member will react to form part of the target molecule. All compounds with the correct functionality are first selected from the Available Chemicals Directory. A variety of exclusion criteria are then applied to eliminate those with undesirable features. Each candidate precursor is characterized by the distribution of potential pharmacophore points in 3D space. Using clique detection, these compounds are then grouped and a representative set chosen from the cliques. This procedure is designed to ensure that the largest possible range of conformational space will be sampled by a library of structures built from these sets of precursors.

25. 11:25 AM
RATIONAL SCREENING SETS: CREATION AND DIVERSITY PROFILING. Paul R. Menard, Richard A. Lewis, and Jonathan S. Mason, Rhone-Poulenc Rorer, Collegeville, PA 19426.
Clustering of structural characterizations and partitioning of calculated molecular properties have been used to create chemically diverse subsets of the corporate research compound registry for use in new leads screening, and to aid in the identification of missing diversity. This presentation will concentrate on the structure-based approach, in which Jarvis-Patrick clustering of Daylight fingerprints was used. A strategy was developed that applies clustering in a cascaded manner, so as to keep the size of large clusters and number of singletons within a reasonable range. Clustering of all suitable registry structures (~160k structures) was used to create a representative structurally diverse subset.
Clustering studies were also done on combinations of datasets taken from the other company registries (total ~390k structures) and external compound libraries (ACD, etc.). Recent results will be presented from several diversity analyses, together with results from a comparison with the diverse subsets obtained using a property-based approach (based on six non-correlated descriptors for molecular and physicochemical properties), highlighting the amount of complementarity between the two methods. 26. 11:50 AM A NOVEL METHOD FOR EVALUATING THE THREE-DIMENSIONAL DIVERSITY AND PRESENCE OF PUTATIVE COMMON PHARMACOPHORES IN COMPOUND LIBRARIES AND DATABASES OF ACTIVE STRUCTURES: PHARMACOPHORE-DERIVED QUERIES. Jonathan S. Mason, Stephen D. Pickett, Iain M. McLay, and Richard A. Lewis, Rhone-Poulenc Rorer, Collegeville, PA 19426. The current interest in combinatorial chemistry for lead generation has necessitated the development of methods for evaluating the diversity of the resultant compound libraries. Such methods also have application in selecting diverse sets of compounds for general screening from corporate databases, and in the analysis of large sets of structures to identify common pharmacophore patterns. Existing methods for the analysis of large databases rely mainly on whole molecule properties such as cLogP or two-dimensional descriptors such as topological indices and structural characterisations. This paper presents a novel methodology for calculating diversity and identifying common features based on the three-point pharmacophores expressed by a compound. The method has been implemented within the environment of the Chem-X molecular modelling package (ChemDBS-3D), using a systematic analysis of 3D distance space with three point combinations of six pharmacophoric groups distinguished by an in-house developed atom type parameterisation. Uses of this method for lead generation will be discussed. 
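Editorial illustration for abstract 26: the idea of describing a compound by the three-point pharmacophores it expresses can be sketched roughly as below. This is only a minimal sketch and not the Chem-X/ChemDBS-3D implementation. The six group labels, the distance bin edges, and the function names (distance_bin, three_point_keys, library_coverage) are all assumptions made for illustration; the abstract does not specify them.

```python
from itertools import combinations
from math import dist

# Hypothetical labels for six pharmacophoric group types (an assumption;
# the abstract does not name the groups).
GROUP_TYPES = ("donor", "acceptor", "acid", "base", "aromatic", "hydrophobe")

# Hypothetical distance bin edges in angstroms (an assumption).
BIN_EDGES = (2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 15.0)


def distance_bin(d):
    """Return the index of the first bin whose upper edge exceeds d."""
    for i, edge in enumerate(BIN_EDGES):
        if d < edge:
            return i
    return len(BIN_EDGES)


def three_point_keys(points):
    """Return the set of three-point pharmacophore keys for one conformer.

    points: list of (group_type, (x, y, z)) tuples.
    Each key pairs a vertex type with the binned length of the opposite
    side, sorted so the key does not depend on the order of the points.
    """
    keys = set()
    for (ta, pa), (tb, pb), (tc, pc) in combinations(points, 3):
        corners = (
            (ta, distance_bin(dist(pb, pc))),
            (tb, distance_bin(dist(pa, pc))),
            (tc, distance_bin(dist(pa, pb))),
        )
        keys.add(tuple(sorted(corners)))
    return keys


def library_coverage(conformers):
    """Union of keys over all conformers: a crude diversity measure."""
    covered = set()
    for points in conformers:
        covered |= three_point_keys(points)
    return covered
```

In this sketch, the number of distinct keys covered by a library serves as a simple diversity score, and keys shared by a set of active structures would point to candidate common pharmacophores. Sorting the corners makes mirror-image triangles identical, a simplification chosen here for brevity.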
MONDAY MORNING, AUGUST 21, 1995, SECTION B Polymer Information Management S. Young, Presiding 27. 9:05 AM POLYMER NOMENCLATURE AND STRUCTURE: A COMPARISON OF SYSTEMS USED BY CHEMICAL ABSTRACTS SERVICE, THE INTERNATIONAL UNION OF PURE AND APPLIED CHEMISTRY, MDL INFORMATION SYSTEMS, INC., AND DUPONT. Edward S. Wilks, E. I. Dupont de Nemours and Company, Corporate Information Science, Wilmington, DE 19880. Polymer nomenclature styles and structural representational systems described, recommended, or used by Chemical Abstracts Service, the International Union of Pure and Applied Chemistry, MDL Information Systems, Inc., and Dupont are compared and contrasted. Structure-based versus source-based nomenclature and structural representations are discussed. The paper covers regular single-strand organic polymers, irregular single-strand organic polymers (a large group that includes alternating and other periodic; block; comb and graft; crosslinked; dendritic, hyperbranched, hypercrosslinked, and star; and posttreated), stereochemistry in polymers, regular and quasi single strand inorganic and coordination polymers, regular double-strand (ladder and spiro) organic polymers, and siloxanes. Nomenclature styles and structural representations of end groups are included. 28. 9:35 AM DERWENT'S ENHANCED POLYMER INDEXING SYSTEM. J. A. Briggs, W. G. Town, D. Walter, Derwent Information Ltd., London, WC2B 5DF, England. In 1993 Derwent introduced the Enhanced Polymer Indexing system to index all polymer related patents included in the Derwent World Patents Index online database. This paper will discuss the main features of the system including the use of linking levels to represent the complexity of polymer information with greater specificity. In addition, we will describe the PILOT software which helps users create polymer search strategies offline. 
A new version of the PILOT software is being developed to make a complex system easier to use by creating online search strategies which will search all versions of the Plasdoc code and the new Polymer indexing. 29. 10:05 AM POLYMER INFORMATION ON THE INTERNET. Ann D. Bolek, University of Akron, Science-Technology Library, Akron, OH 44325-3907. Until fairly recently, polymer information on the Internet has been limited mostly to listservs (POLYED-L, POLYMER, POLYMERP) and one Usenet newsgroup (sci.polymers). With the advent of the World Wide Web, many academic institutions, companies, laboratories, institutes, and other organizations are creating their own home pages, and the amount of polymer information has proliferated. Two organizations have created excellent home pages linking one to additional information. The "Poly-Links" web server is located at URL http://www.polymers.com and is sponsored by Phoenix Polymers, a commercial polymer compounder, Page Plumbers Co., a Web Design company, and others. The Polymer Resource Network web server is located at URL http://www.polysort.com; the Polymer Resource Network has created a comprehensive database of polymer companies, Polysort, and provides information retrieval and research services. The type of polymer information available on the Internet will be discussed. 30. 10:35 AM THE POLYMER INDUSTRY ADVISORY COUNCIL - WORKING TOWARD SOLUTIONS IN POLYMER INFORMATION MANAGEMENT. Anne Rogers, Dow Chemical Company, Chemical Registry, Library and Information Sciences, Midland, MI 48667. The Polymer Industry Advisory Council (PIAC) was officially formed in February, 1993. The function of this group is to provide a forum and mechanism for the exchange of information among users of MDL Information Systems, Inc.'s software products and with MDL with the objective of improving the technology of polymer information systems. PIAC consists of representatives from several major multinational companies in the polymer industry. 
This paper will report on the group's progress to date, including our identification of the major issues related to polymer information management, and will focus on the development of a relational data model for the storage and retrieval of polymer structures, syntheses and properties. 31. 11:05 AM ENHANCED ACCESS TO POLYMER INFORMATION FROM CAS. G. Kenneth Ostrum, Sylvia J. Teague, Chemical Abstracts Service, Columbus, OH 43210. Polymers include materials whose properties may be well known but whose structure is not so well defined or is very complex. For more than a quarter of a century, CAS has been providing access to a full range of polymer information. However, some polymers represent more of a challenge than others to users in accessing information. Over the past two and a half years, CAS has provided improved access to polymer information in three areas: polymer class terms were added for generic searching in the Registry File; siloxanes that could previously be searched only as text terms in the bibliographic CA file are now registered and structure-searchable; and structure-based access points are being provided for polymer esters and ethers that previously were searchable only by names. 32. 11:35 AM INTEGRATION OF IN-HOUSE CHEMICAL INFORMATION MANAGEMENT SYSTEMS WITH EXTERNAL ON-LINE INFORMATION SOURCES. Dwight H. Lillie, Larry French, Steve Young, and Harold Cade, MDL Information Systems, Inc., San Leandro, CA 94577. Access to individual on-line data sources is becoming easier (e.g. DialogLink, STN Express, MolKick, SCI Finder). However, technical limitations of software technology have prevented scientists from directly analyzing and reducing data from multiple disparate electronic sources into relevant information. Two key developments are needed to allow this generation of relevant information: virtual integration of data independent of original source, and guided analysis tools to reduce the data to relevant information. 
This talk will present recently developed technology (MDL's ISIS/Host Open Gateway) that provides the foundation to automatically integrate data from on-line sources and in-house sources (e.g. relational, text, and molecular databases). These new integrated views of disparate data sources will be presented. Additionally, the development of standardized data formats and the direct generation of relevant information by non-information specialists will be discussed. MONDAY AFTERNOON, AUGUST 21, 1995 Careers in Chemical Information B. Slutsky, Presiding 33. 1:35 PM WE'VE COME A LONG WAY: FROM PRINT TO COMPUTERS AND NETWORKING IN 550 YEARS. Lucille M. Wert, Emeritus Professor of Library Administration, University of Illinois, Urbana, IL 61801. This paper describes the methods scientists and science librarians/information specialists developed to provide access to information over the last four hundred years. During this period the informational needs of scientists have not changed. What has changed are the methods developed to access the constantly increasing amounts of information. The emphasis of this paper is on the role of chemistry librarians/information specialists in adapting and applying new technologies to improve access to information. 34. 2:00 PM A CAREER AS AN ACADEMIC CHEMISTRY LIBRARIAN. M. E. Moulton, Binghamton University (SUNY), Science Library, Binghamton, NY 13902. A career as a chemical information professional in an academic library can be challenging and rewarding. Traditionally, these jobs require degrees in both library science and chemistry. The chemistry librarian works closely with the faculty to support the education and research mission of the department. This is achieved through collection development and management, reference and online services, and instruction in the use of printed resources and information technologies. The discussion will focus on skills and training required for the job. 35. 
2:25 PM THE TECHNICAL INFORMATION CENTER IN A CHEMICAL OR PHARMACEUTICAL COMPANY. David S. Saari, Cyanamid Agricultural Products Research, Clarksville Road, Post Office Box 400, Princeton, New Jersey 08543-0400. The technical information center in a chemical or pharmaceutical company provides traditional library resources and other unique, specialized information services to meet the needs of research scientists and managers. Library resources and services include the acquisition and circulation of books, periodicals, government documents, patents, and other published information. Further, the technical information center usually provides comprehensive research services conducted by information scientists. The technical information center may have an extensive collection of patent information, and the information scientists may be responsible for conducting patent searches in partnership with inventors and patent law department staff. Additionally, the technical information center may be responsible for proprietary information, including laboratory notebooks, company reports, document indexing and/or document imaging systems. Personal information access tools, electronic publications, and other electronic information resources, including the Internet, are changing the traditional roles of the technical information center staff. In the future, technical information center employees may spend more time as tour guides and teachers and less time as providers and guardians of information. 36. 2:50 PM DATABASE PUBLISHING: A SOURCE OF NON-TRADITIONAL CAREER OPPORTUNITIES FOR CHEMISTS. B. Lawlor, Advanced Research Technologies, 1062 Lancaster Avenue, Suite 18-C, Rosemont, PA 19010. Database publishing has long been a source of non-traditional career opportunities for chemists. However, in the current environment of increased electronic publishing, the opportunities have grown. 
Positions include indexing and abstracting, marketing, sales, product development, public relations, R&D, and management, among others. The positions are not mutually exclusive and can be used as stepping stones for advancement, depending upon education and expertise. This presentation will give a summary of the current opportunities available, the background required for each, and a look at the future of electronic publishing, including opportunities for the entrepreneur. 37. 3:15 PM CAREER OPPORTUNITIES IN PATENTS. Sandra H. Smith, Warner-Lambert Company, Patent Information Services, Ann Arbor, MI 48105. A patent is a contract between the public, as represented by the Government, and an inventor. The inventor agrees to disclose the invention to the public in return for a government granted right to exclude others from making, using or selling the patented invention in the United States for the term of the patent. Individual inventors or inventors at companies or universities file patent applications with the United States Patent and Trademark Office (USPTO). Before an application is issued into a patent, the invention must be determined to be useful, new and unobvious. Patent professionals assist this process in the following ways: 1) The patent searcher, who has an appropriate technical degree, conducts a thorough patentability search to determine whether the invention is novel. 2) The patent agent evaluates the information and the search results, prepares and files the patent application and prosecutes the application through the examining and appeal procedures before the USPTO. The agent must have an appropriate technical degree and must also pass the patent bar. 3) The patent attorney, in addition to the patent agent requirements, has a law degree and has passed a state bar, which enables him or her to litigate patent matters in court. 38. 
3:40 PM SYSTEMS ANALYSIS, MANAGING INFORMATION SERVICES, CONSULTING: THE CHALLENGES AND REWARDS OF A CAREER IN CHEMICAL INFORMATION. Wendy A. Warr, Wendy Warr and Associates, 6 Berwick Court, Holmes Chapel, Cheshire CW4 7HZ, England. I hope you do not believe that a career in chemical information is only for those who did not make the grade in research, or could not find a better job. There are, unfortunately, still a few research managers around who think that "there's always the library" when they want to redeploy their "walking wounded", or problem staff. Ignore these attitudes - nothing could be further from the truth. For those who are versatile, communicative, and willing to keep up-to-date with many branches of science and technology, a career in chemical information can be both intellectually and financially rewarding and is full of interesting challenges. After more than 25 years of commercial and professional experience in diverse areas of chemical information and computational chemistry, I hope to inspire you to look upon information science as the career of choice. 39. 4:05 PM CHEMICAL INFORMATION CAREERS: QUALIFICATIONS AND COMPENSATION. G. D. Wiggins, Indiana University, Chemistry Library, Bloomington, IN 47405. The profession of chemical information offers many opportunities for employment, but there is a great range of salaries in the field. These depend on the relevant training and backgrounds of the job holders, the sector in which employed, and a number of other factors. This talk will summarize the most important factors found in a recent salary survey by the Division of Chemical Information and gleaned from other sources. MONDAY EVENING, AUGUST 21, 1995 SCI-MIX, D. S. Saari, Presiding 40. 8:00 - 10:30 PM THE PAPER OF THE CENTURY: THE FIRST RATIONAL DRUG DESIGN: P. EHRLICH, CHEM. BER. 1909, 42, 17-47. G. Lynn Carlson, Julie M. 
Streiff, Victoria McGruder, and Rebecca Milewski, University of Wisconsin-Parkside, Department of Chemistry, Kenosha, WI 53141. This paper reported the first instance of a compound synthesized in a rational attempt at drug design. Ehrlich had previously noted that some dyestuffs could be toxic to bacteria, and suggested that this was due to their -N=N- linkages. He hypothesized that compounds containing the -As=As- linkage could by analogy also be active, and proceeded to synthesize several hundred such compounds. Testing led to the commercial introduction of 3,3'-diamino-4,4'-dihydroxyarsenobenzene (arsphenamine or Salvarsan) for the treatment of syphilis. The success of this approach to the treatment of disease led directly to its application in all phases of therapeutics. A direct result of this revolution is the alleviation throughout the world of morbidity and even mortality due to infectious disease. The change has not been without its drawbacks, which will also be examined in this poster. 41. 8:00 - 10:30 PM WHAT WE NEED TODAY IS A MODERN VERSION OF EDWIN SLOSSON'S "CREATIVE CHEMISTRY." John J. Fortman, Wright State University, Department of Chemistry, Dayton, OH 45435. In 1919 the first edition of collected articles which appeared earlier in "The Independent" was published. In it Edwin Slosson clearly demonstrated to the general public what "for war and peace needs"..."this science of chemistry really means for mankind." In its fourteen chapters (a fifteenth was added in later years) it presented in a readable, interesting fashion the importance of chemistry in its application to modern life. The book was widely distributed to libraries using funds from the sale of German chemical patents received as war damages. Surely today we again need such books to awaken our nation to the value and benefits of science and the danger of a technically illiterate society. 42. 8:00 - 10:30 PM PUBLICATION OF THE CENTURY: THE DISCOVERY OF FERROCENE. KEALY, T. J. 
AND PAUSON, P. L., DUQUESNE UNIVERSITY, PITTSBURGH, PA, NATURE 1951, 168, 1039. Steven Milos, John D. Williams, John M. Carey, and Dale E. Wheeler, University of Wisconsin-Parkside, Department of Chemistry, Kenosha, WI 53141. Ferrocene was originally synthesized in 1951 by Kealy and Pauson when the reaction of cyclopentadienylmagnesium bromide with FeCl3 unexpectedly yielded dicyclopentadienyliron (ferrocene). This was the first observed complex which contained an organic molecule bonded to a transition metal through aromatic pi-bonds. This discovery led to the development of numerous transition metal complex catalysts. Today, soluble transition metal complexes are used extensively in industry to catalyze the synthesis of organic compounds. Catalysts are crucial in the preparation of pharmaceutical and polymer intermediates because of their selectivity and ability to produce pure products in high yield. Organotransition metal complexes have had a tremendous impact on society and industry, and research in the field of catalysis will continue into the next century. 43. 8:00 - 10:30 PM PUBLICATION OF THE CENTURY: THE NUCLEAR INDUCTION EXPERIMENT, BLOCH, F., HANSEN, W. W. AND PACKARD, M., STANFORD UNIVERSITY, CA, PHYSICAL REVIEW 1946, 70, 474. Dale E. Wheeler and Steven Milos, University of Wisconsin-Parkside, Department of Chemistry, Kenosha, WI 53141. Nuclear Magnetic Resonance (NMR) was first described in 1946 by the independent research groups of Bloch and Purcell who shared the 1952 Nobel Prize in Physics for this work. Since then, NMR has become one of the most important diagnostic tools for chemists, physicists, and most recently medical professionals. During the 1970s, the development of 2-D NMR provided information necessary to elucidate the structure of complex molecules. In the 1980s, the 3-D NMR applications included noninvasive CAT scans which provide information about internal structures within the human body. 
Today, advances in NMR spectrometry continue with multi-dimensional techniques and higher field instruments which ultimately will make NMR the most important tool for molecular and cellular determination. This publication is important not only because of the progress made in NMR technology during the past 50 years, but also because of the great potential NMR has well into the next century. 44. 8:00 - 10:30 PM THE CAMILLE AND HENRY DREYFUS CHEMICAL INFORMATICS PROGRAM. Robert L. Lichter, The Camille and Henry Dreyfus Foundation, Inc., 555 Madison Avenue, New York, NY 10022-3301. Rapidly escalating costs of maintaining and furnishing science library materials make provision of library services increasingly challenging. Accordingly, in 1993 the Camille and Henry Dreyfus Foundation made planning and prototype grants of $15,000 each to ten colleges and universities. The awards were designed to explore alternative means to deliver the broadest array of chemical information most effectively to the largest numbers of chemical scientists, students, and the public, and to decrease technological, physical and administrative barriers to using chemical information. Projects were anticipated to benefit chemistry library users directly, be adaptable to other institutions and libraries, show familiarity with and take advantage of current trends and research in information science, demonstrate innovative use of technology, and be adaptable to future technological developments. Collaborations among institutions were welcome. Following completion of the planning efforts, proposals for implementation grants were to be evaluated. This presentation will give highlights of the accomplishments and experiences of the institutions receiving the planning grants, and the anticipations of those that received implementation grants. TUESDAY MORNING, AUGUST 22, 1995 Skolnik Award Symposium R. Luckenbach, Presiding 45. 
9:05 AM PAST PERFECT, PRESENT PERFECT, FUTURE PERFECT...QUALITY ASSESSMENT AND QUALITY CONTROL MECHANISMS AT BEILSTEIN. R. Luckenbach, Beilstein Institute, D-60486 Frankfurt am Main, Germany. In the constantly expanding world of chemical information systems, the word Beilstein has always been regarded as synonymous with high quality, reliability and comprehensiveness. In order to preserve these important criteria, a number of quality control mechanisms are applied at all production stages involved in the creation of the Beilstein data-pool from which all Beilstein products are derived. These mechanisms include the application of both manual (intellectual) data selection processes as well as a number of sophisticated automatic checking methods to each piece of data. Consequently, the quality and reliability of all Beilstein information tools is assured. This lecture surveys and details these quality control methods and includes some representative examples demonstrating the effects of the various assessment procedures. 46. 9:35 AM CHEMICAL INFORMATION IN 3-D SPACE. J. Gasteiger, J. Sadowski, J. Schuur, P. Selzer, and V. Steinhauer, University of Erlangen, Computer-Chemie-Centrum, D-91052 Erlangen, Germany. The availability of automatic 3D structure generators such as CORINA allows the study of relationships between 3D structure and physical, chemical and biological data. Many data analysis methods such as statistical analysis or neural networks require the representation of molecular structures by a fixed number of variables irrespective of the size of a molecule. Such a representation based on a molecular transform has been developed and successfully used for the classification of dopamine D1 and D2 agonists, for arranging steroids according to their binding activity to the CBG receptor, and for the simulation of infrared spectra. 47. 10:10 AM DISCOVERING REACTION PRINCIPLES USING REACTION DATA BASES AND QUANTUM CHEMISTRY. R. 
Herges, University of Erlangen, Institut für Organische Chemie, D-91054 Erlangen, Germany. Reaction data bases contain a wealth of information in addition to the routinely used search and retrieval capabilities. General rules and reaction principles can be derived by a systematic analysis of the data. We used a hierarchical classification algorithm to scrutinize 80,000 reactions within reaction data bases. Most of the examples fall into well known reaction categories. However, with the remaining data set, we discovered a group of reactions that exhibit obvious relationships which have not been recognized so far. We were able to explain these descriptive relationships on a quantum chemical basis. The theoretical formalism can be reduced to simple rules that predict the stereochemistry of these reactions similar to the Woodward-Hoffmann rules. Moreover, a simple algorithm was derived to predict new reactions of this category. Finally, we were able to verify two of these reactions in the laboratory. 48. 10:40 AM FROM HANDBOOKS TO DATABASES ON THE NET: NEW SOLUTIONS AND OLD PROBLEMS IN INFORMATION RETRIEVAL FOR CHEMISTS. E. Zass, ETH Zuerich, Chemie-Bibliothek, CH-8092, Zuerich, Switzerland. Sources for chemical information are becoming even more powerful and varied: besides the still important printed sources, there are public databases, large in-house systems, and databases on PCs/CD-ROMS. The price to pay for this cornucopia, however, is increased complexity for users. Improved front-ends and the slow change from terminal-mainframe to client-server systems ease the burden of searching, but such means are not sufficient to make chemical information retrieval a reliable routine operation for every chemist. We need improved database quality, more goal-oriented marketing and training by producers or hosts, and problem-driven education for chemical information retrieval as an obligatory part of chemical curricula. 49. 
11:10 AM ECONOMICAL ASPECTS FOR CHEMICAL INFORMATION. W. T. Donner, Bayer AG, Central Research Services, 51368 Leverkusen, Germany. Economic aspects are becoming increasingly dominant in the chemical industry, and this extends to information. The market for information is still dominated by mechanisms developed at a time when information was considered infrastructure, necessary for any research. These mechanisms are no longer adequate if information becomes an additional factor of production. Not only the production costs of information have to be considered but also its value to the user. The information provider primarily considers the effort and cost necessary for building up and maintaining an information system (for instance, a database). The user values the system first of all by the information he finds for his particular questions. On the other hand, it is not industry alone that demands information. Scientific institutions, universities, etc. are users (and providers) of information as well. While the industrial user is prepared to pay for the information he gets, the same approach seems inappropriate for scientific non-profit organisations. This results in a complex market situation for information systems. 50. 11:40 AM CAUGHT IN A CROSSFIRE: ACADEMIC LIBRARIES AND BEILSTEIN. G. D. Wiggins, Indiana University, Chemistry Library, Bloomington, IN 47405. The introduction of the Beilstein CrossFire product in 1994 caused many academic librarians to re-examine the Beilstein offerings. 
Although a number of academic libraries had cancelled the subscription to the printed Beilstein Handbook of Organic Chemistry, client-server architecture, coupled with new cooperative initiatives among academic libraries, made it possible to obtain both the printed volumes and the database at less cost than they were previously paying for the print alone. Factors which have influenced academic libraries to subscribe to CrossFire will be examined, and the features of the product/subscription plan which academic chemists and librarians find attractive or less appealing will be described. Elements of the Beilstein CrossFire approach that have led to its adoption will be compared to the options available to academic libraries for other comparable databases to see if there are lessons to be learned for the academic marketplace. TUESDAY AFTERNOON, AUGUST 22, 1995 Skolnik Award Symposium C. Jochum, Presiding 51. 2:00 PM THE BEILSTEIN INFORMATION SYSTEM IS NOT A REACTION DATABASE, OR IS IT? Clemens Jochum, Beilstein Informationsysteme GmbH, D-60486, Frankfurt/Main, Germany. The Beilstein Information System is the world's largest collection of chemical properties of organic compounds. The Beilstein Database contains over 6.5 million compounds with associated properties, covering the literature period from 1779 to 1995. The properties covered most thoroughly are physical data and chemical behavior. Chemical behavior data contain preparations and reactions of almost all compounds in the database. For many compounds more than one preparation or reaction is contained in the database. Thus more than 10 million preparations and reactions are described. Online access to these preparations and reactions is still limited. This paper describes the current access and future plans for the development of a full Beilstein Reaction Database. 52. 2:30 PM BEILSTEIN: A COMMERCIAL APPROACH TO DISTRIBUTING CHEMICAL INFORMATION. Robert J. Massie, CAS, Administration, Columbus, OH 43210. 
Beilstein, formerly a private, not-for-profit foundation, has transformed itself over the last ten years into a vibrant private sector organization, Beilstein Information Systems, Inc. ("BIS"). BIS is an online database supplier, developer of CD-ROM and software products, and most recently a developer of in-house retrieval software. In 1993, Beilstein divorced its long time partner of more than 70 years, Springer-Verlag, and remarried immediately to Information Handling Services Group, IHS. Two new companies have been formed: Beilstein Informationsysteme GmbH, and Beilstein Information Systems, Inc. This structure will bring a fresh commercialism to the sale of Beilstein data by combining the benefits of a not for profit foundation with the commercial for-profit arm. This evolution and new structure will be discussed as a business/economic model for dissemination of chemical information. 53. 3:00 PM THE SHEFFIELD GENERIC CHEMICAL STRUCTURES PROJECT - A RETROSPECTIVE REVIEW. John D. Holliday and Michael F. Lynch, University of Sheffield, Department of Information Studies, Western Bank, Sheffield, S10 2TN, United Kingdom. The problems posed by the requirements for storage and manipulation of generic chemical structure definitions in patents, which derive in part from their potentially unlimited numbers as well as from the vagaries of linguistic and structural complexities are reviewed. The theoretical foundations devised during the project for the successful solutions of the problems are reviewed, together with progress made towards implementation of a system based on these solutions. 54. 3:30 PM WILL ELECTRONIC INFORMATION CHANGE CHEMICAL RESEARCH? H. tom Dieck, The German Chemical Society, GDCh, D-60444 Frankfurt am Main, Germany. Chemistry depends more on information about earlier work than many other sciences. The relative smallness of typical chemical experiments creates an inverse proportionality to the importance of information retrieval. 
Originality has to be confirmed for every new substance or every new step. The larger the volume of knowledge becomes the more sophisticated search and help tools are needed. (Non-)value of time, (non-)availability of equipment or money, (in-)sufficient know-how or practice to master electronic data-bases and (un-)easy access to them or to complete printed literature have preponderant influence on the actual work of chemists. Starting in late 1994 all German chemistry graduates (approx. 8,000 Ph.D. students from 1995 to 97) will take part in a special education and application program for electronic retrieval techniques, organized by the German Chemical Society, GDCh, and supported by the German Minister of Science and Technology BMFT, now BMBF. The complete incorporation of the forthcoming young scientists' generation enables us to monitor the effect of the project on fundamental research. 55. 4:00 PM CHEMISTRY ON THE INTERNET - THE ROAD TO EVERYWHERE AND NOWHERE. Stephen R. Heller, USDA, ARS, Building 005, Beltsville, MD 20705-2350. The ability to connect the information stored in computers around the world is presenting a challenge to both chemists and information providers. With software as easy to use as it is to drive a car, what will the chemist do with this new technology? Are the chemistry resources on the Internet just an electronic Potemkin village or is there real substance to the multitude of computer resources now available? This presentation will give one view of an Internet road map of the 21st century. 56. 4:30 PM VERY LARGE CHEMICAL STRUCTURE DATABASES - IMPLICATIONS IN MOLECULAR MODELING. K. Haraki and R. Venkataraghavan, American Cyanamid, Medical Research Division, Pearl River, NY 10965. Chemical lead discovery in pharmaceutical research is a key step in the identification of novel therapeutic entities. The structural variety of organic chemicals represented in the Beilstein Information System is a rich source for new ideas. 
A high speed chemical structure search engine combined with innovative computing tools makes it possible to generate hypotheses and test them in a laboratory. In our environment we have assembled a system that utilizes the Beilstein database along with proprietary information to identify structurally different potential biologically active compounds. The vast amount of information available opens up a new gateway to generate novel ideas. Some examples of this application will be discussed along with limitations. WEDNESDAY MORNING, AUGUST 23, 1995 Challenges of Large Databases: Combinatorial Libraries W. A. Warr, Presiding 57. 9:05 AM REVOLUTIONARY APPROACHES TO MANAGING LARGE DATABASES OF CHEMICAL INFORMATION. David L. Grier, Brad D. Christie, Tim M. Maffett, James G. Nourse, and Dennis H. Smith, MDL Information Systems, Inc., 14600 Catalina St., San Leandro, CA 94577. For at least two decades, we have recognized through implementation of computer programs for isomer generation that the number of possible organic structures within reasonable constraints is, for all intents and purposes, infinite. Until very recently, these findings were largely of academic interest, and the profound implications of this result have not been recognized. Now, however, the revolution taking place in high volume chemical synthesis and biological screening establishes new methodologies that are poised to take full advantage of the vast space of possible compounds. Revolutionary approaches to chemical information management are required to keep pace with these developments. We will discuss new methods for representing and searching large collections of chemical compounds in very compact and efficient ways. We will illustrate how the results can be combined efficiently with huge volumes of associated biological and other data to yield systems capable of managing the anticipated exponential growth of chemical information. 58.
9:35 AM REPRESENTATION OF COMBINATORIAL LIBRARY INFORMATION IN THE SLN STRUCTURE LANGUAGE. T. Hurst, Tripos Inc., 1699 S. Hanley Rd., St. Louis, Missouri 63144. The advent of combinatorial chemistry now presents a massive data management task. Researchers have begun to investigate methods for storing and searching the structural and screening data associated with combinatorial libraries. SLN (Sybyl Line Notation) was developed to represent chemical structures and 2D search queries, and is used as a mechanism for storing structures, as well as for communicating queries and structures between programs and between users. Without extension, SLN can be used to represent the information about the structural content in combinatorial libraries, and is a concise representation of these mixtures of thousands or millions of structures. This presentation highlights the SLN constructs which are used in representation of combinatorial libraries. 59. 10:05 AM USE OF AN EXPERIMENTAL SCREEN FRAGMENT THESAURUS IN SEARCHING THE CAS REGISTRY FILE. William Fisanick, Research, Chemical Abstracts Service, 2540 Olentangy River Road, P. O. Box 3012, Columbus, OH 43210. The CAS Chemical Registry File contains 2D structures for nearly 13 million substances. Substructure searching on this database consists of an initial screening step based on a set of structural fragment search screens followed by a time-consuming atom-by-atom search for the query structure on the candidates passing the screening step. However, with this large database the search screens that are automatically generated for queries with atom and/or bond variability can sometimes lead to candidate sets which exceed system limits for the subsequent atom-by-atom searching. This variability can result in the non-use of specific search screens. To help improve the screen efficiency in such cases, experimental software has been developed to function as a structural fragment thesaurus for the screens.
This thesaurus can be used to discover hierarchical relationships among the screen fragments and to "synthesize", via OR logic, supplemental screens of intermediate specificity to improve the screen efficiency. In addition to the existing CAS substructure screen set, the thesaurus also supports an experimental set of "virtual" screens which provides for the comprehensive generation of synthetic screens. This paper will discuss the capabilities of the thesaurus and illustrate its use in improving screen efficiency in searching a large chemical structure database. 60. 10:35 AM ANALYZING LARGE CHEMICAL DATABASES TO INCREASE THE DIVERSITY OF A CORPORATE COMPOUND COLLECTION. M. G. Bures, R. Brown, Y. C. Martin, Abbott Laboratories, D47E AP10-2, 100 Abbott Park Road, Abbott Park, IL 60064-3500. Recently, there has been a large increase in the use of high-throughput screening (HiTS) and combinatorial chemistry in the generation of new leads for drug design in the pharmaceutical industry. Recent advances in HiTS readily allow testing hundreds of thousands of compounds in dozens of assays per year. Therefore, much effort has been expended in increasing the chemical diversity and size of corporate compound collections for purposes of HiTS. One way corporate databases are being expanded is through purchase of compounds from outside sources such as commercial suppliers and academic research groups. We will present an overview of the techniques used to select diverse compounds from outside sources, including 2D/3D similarity analysis and clustering. 61. 11:05 AM CLUSTERING TECHNOLOGY TO SAMPLE THE RESULTS OF DATABASE SEARCHES. Keith Davies, Clive Briant, Roger Upton, Chemical Design, Cromwell Park, Chipping Norton, Oxon OX7 5SR, United Kingdom. Searching large databases with 3D pharmacophore queries can often lead to a large number of hits. It is not unusual for many of these hits to be quite similar and sometimes it may not be cost-effective to test all the molecules.
In such circumstances it is valuable to review the hits to visualize the dissimilarity and test only a subset of the molecules. Traditional clustering methods, such as the Jarvis-Patrick algorithm, make approximations which are not always appropriate for the results of searches which may consist of a few thousand structures with a significant common pharmacophore. This paper describes algorithms for similarity clustering and ordering sets by dissimilarity appropriate for large data sets. 62. 11:35 AM C3: CLUSTERING BASED ON COMMON CORES. D. M. Bayada, W. Cho, C. Marshall, A. P. Johnson, Department of Chemistry, University of Leeds, Leeds LS2 9JT, United Kingdom. Advances in technology now permit the rapid search of very large structure or reaction databases. However, unless the query is very tightly constrained the answer sets are frequently too large for the content to be easily assimilated by humans. The C3 program uses a fragment-based clustering approach, combined with maximum common subgraph detection, to split such answer sets into groups in which the members of each group share a significant common core substructure. This type of analysis has potential application in analysis of the results of high throughput screening as well as the navigation of large answer sets. WEDNESDAY AFTERNOON, AUGUST 23, 1995 Challenges of Large Databases: Structure and Reaction Databases W. A. Warr, Presiding 63. 2:00 PM APPLICATIONS OF MARKUSH STRUCTURE TECHNIQUES TO HANDLING COMBINATORIAL LIBRARIES. John M. Barnard and Geoff M. Downs, Barnard Chemical Information Ltd., 46 Uppergate Road, Stannington, Sheffield S6 6BX, United Kingdom. Many of the information-handling problems posed by combinatorial chemistry are not new.
Generic chemical structures, or "Markush" formulations, which can cover millions or even unlimited numbers of individual compounds, are common in the chemical patent literature, and over the past 15 years research and development work has led to the establishment of a number of operational systems. The somewhat inconsistently-used terminology in these systems is discussed, and the characteristics of Markush structures as they occur in patents and elsewhere are described. The techniques which are used to handle Markush structures in patents, and the commercial systems which have been developed, are reviewed. The extent to which these techniques are applicable to combinatorial libraries is discussed, in the context of the requirements for representation, search and diversity analysis of libraries. 64. 2:30 PM APPLICATIONS OF MARKUSH STRUCTURE TECHNIQUES TO HANDLING COMBINATORIAL LIBRARIES - A USER'S VIEWPOINT. T. Mike Harvey, Phil McHale, John Myers, Derwent North America, 1420 Spring Hill Road, Suite 525, McLean, VA 22102. A Markush chemical database is one which indexes compounds from documents such as patents, in which not only a number of specific compounds, made and characterized, are discussed, but also many more which are 'implied' within the scope of a general formula. In combinatorial chemistry, techniques are used in which many compounds may be simultaneously synthesized, identified or otherwise studied. The Markush databases and the combinatorial libraries both present the problem of having, in many cases, thousands of compounds to be indexed. Possible solutions for the user are discussed. 65. 3:00 PM SEARCHING OF LARGE DATABASES OF CHEMICAL REACTIONS. G. W. A. Milne, Lab of Medicinal Chemistry, National Cancer Institute, NIH, Bethesda, MD 20892. Millions of different organic reactions are in the literature and, to provide a useful facility, any system which claims to search reactions databases must do so exhaustively and quickly.
This dual requirement is impossible to meet with standard, generally available software and several groups have sought to overcome this deficiency by developing novel solutions to the problem. Two particular search systems, ChemReact and Cognos, will be discussed. The solution adopted by ChemReact, designed by Loew and Saller, is to digest the large reactions database to find and extract a smaller set of reactions which, taken together, are representative of all the reactions in the large database. This algorithmic distillation of a reactions database yields some interesting statistics which will be described. The resulting smaller databases are used in both PCs and larger computers with practical search systems. Cognos, written by Hendrickson and Sander, takes a different approach. Every reaction in the database is keyed and the keys are searched in memory. The search is very fast and it is therefore possible to adjust the search criteria in real time so as to "tune" the search to produce a desirable number of hits. Both these search systems can effectively search very large reactions databases and the performances they offer will be compared and contrasted. 66. 3:30 PM RETRIEVAL OF REACTION INFORMATION FROM LARGE DATABASES. Guenter Grethe, MDL Information Systems, Inc., 14600 Catalina St., San Leandro, CA 94577. Synthetic chemists are faced daily with the difficult task of finding relevant information from the existing literature for synthetic problems. Frequently, these searches must consider functional group compatibility, stereochemistry, availability of starting materials, and other requirements not easily expressed in simple structural queries. Over the last years, the rapid increase in reaction information has added another degree of difficulty to the task.
Classification based on reaction types, navigating large hitlists more effectively and viewing full synthetic schemes are only some of the tools that have been recently developed to ease the task for the end-user. We will present applications developed by MDL Information Systems, Inc. in collaboration with InfoChem and FIZ Chemie Berlin, discuss ongoing work in other groups, and take a look at future requirements. 67. 4:00 PM BEHIND THE SCENES IN SCIFINDER: THE CHALLENGES OF MATCHING SIMPLE ENGLISH PHRASES TO REFERENCES IN A LARGE DATABASE. John L. Macko, Systems Development, and Lorraine F. Normore, New Product Development, Chemical Abstracts Service, 2540 Olentangy River Road, Columbus, OH 43202-1505. The principal design requirement for the new SciFinder product from Chemical Abstracts Service was to allow novice computer searchers--people who may have never previously looked for information using an online service--to retrieve useful answers to their questions. One of the features of SciFinder, called "Explore by Research Topic", allows a question to be input as a simple English phrase or statement. (This style of input frees the users from having to learn specialized command language conventions such as parentheses, field code indicators, Boolean operators, or truncation symbols.) SciFinder then does its best to retrieve references from the large CA text files that most closely match the input phrase. This presentation describes the following challenges: (1) mapping the input phrase to the large files, (2) retrieving a reasonably-sized, useful set of candidate answers, and (3) doing all of this as quickly and efficiently as possible. Explanations of the algorithms used are given, along with the interesting, and sometimes humorous, situations in which the algorithms broke down during development, and what was done to fix them. 68. 4:30 PM SCIFINDER AND CHEMICAL STRUCTURE CONVENTIONS IN THE CAS REGISTRY FILE. Lisa M. Staggenborg, Kenneth S. Cada, Kevin P.
Cross, David A. Deacon, Lucy A. Dixon, Roberta J. Fiete, Lester D. King, Cheryl S. Scotney, Harold L. Smith, Chemical Abstracts Service, 2540 Olentangy River Rd., Columbus, OH 43202-1505. Two significant challenges posed by large chemical structure databases are (1) achieving unique registration of substances and (2) permitting retrieval of specific substances. Chemical structure conventions have long been a key to CAS's maintaining the CAS Registry File (now over 13 million structures). CAS's new SciFinder software tackles the issue of retrieval, allowing users to draw in structures without having to worry about chemical structure conventions. The expertise of CAS input staff with an in-depth understanding of chemical structure conventions was instrumental in developing algorithms in SciFinder that provide appropriate recall for tautomers and aromatic bonds (including keto-enol tautomers and common dyes), open vs. closed ring forms (common dyes and hemiacetals), salts (organic, inorganic, and organometallic), coordination compounds and charged compounds (including delocalized charges). Specific examples will be presented. THURSDAY MORNING, AUGUST 24, 1995 Information Sources for Inorganic Chemistry G. Grethe, Presiding 69. 9:05 AM ASPECTS OF THE VALUE OF PATENT INFORMATION IN THE FIELD OF INORGANIC CHEMISTRY. Mike Harvey and John Meyers, Derwent North America, 1725 Duke St., Alexandria, VA 22314. A variety of searches of patents dealing with inorganic chemistry has been made on Derwent World Patents Index and Derwent Patents Citation Index. A number of different analyses of the results have been made, in order to try to establish areas of increasing or decreasing commercial importance. Some competitive intelligence data have been compiled from both patents databases, and these will also be discussed. 70. 9:35 AM COORDINATION COMPOUND INFORMATION RETRIEVAL IN CAS FILES. Richard K. Lester, Kenneth S.
Cada, Mary Jane Janki, Chemical Abstracts Service, Columbus, OH 43210-0012. Coordination compounds in substance-based files, such as the CAS files, can be represented in various ways, such as connection tables, ring data, molecular formulas and names. This information can be manipulated to derive simple, but powerful text search terms which are useful as alternatives or complements to structural searching of coordination compounds. Other software tools for assisting in searches for coordination compounds will be discussed. 71. 10:05 AM THE GMELIN HANDBOOK OF INORGANIC CHEMISTRY: TRANSFORMATION OF AN INFORMATION SYSTEM. D. Schiöberg, R. Deplanque, E. Fluck, Gmelin Institute, D-60486 Frankfurt, Germany. The Gmelin Handbook is the authoritative work in the fields of inorganic, organometallic, and physical chemistry. Today, the current 8th edition comprises more than 700 volumes. The substance-oriented Gmelin Handbook is classified on the basis of chemical elements and their compounds. The substance data are strictly arranged by the Gmelin system of subjects. The procedure of Handbook production assisted by modern software tools is reported. Current fields such as inorganic solid state chemistry or inorganic molecular chemistry are to a large extent descriptive subjects. External specialists cooperate intensively with Gmelin editors while producing manuscripts for the Handbook. The critical evaluation of results and the systematic organization of the subject matter make the Handbook one of the most valuable research tools supplying scientific data. Recently produced volumes are discussed. 72. 10:35 AM THE GMELIN FILE - BASIS OF DIFFERENT INFORMATION PRODUCTS IN INORGANIC CHEMISTRY. G. Olbrich, R. Deplanque, E. Fluck, Gmelin Institute, D-60486 Frankfurt, Germany. The Gmelin Institute for Inorganic Chemistry not only publishes the well-known Handbook of Inorganic and Organometallic Chemistry, but, since 1991, also offers the Gmelin factual Database on STN.
After a description of the structure and the contents of the Gmelin File, different forms of publication are discussed: the online version on STN, in-house systems for large organizations, and different CD-ROM products. In order to use the Gmelin File as a source for the handbook production, a publication system has been built up within the Gmelin Institute. This production system is based on the SGML standard, which offers the possibility to combine handbook manuscripts with extracts from the database to form new products in a flexible way. First results from such a combination product are discussed. 73. 11:05 AM MAKING SENSE OF INORGANIC CHEMISTRY: A PRACTICAL APPROACH TO THE DOCUMENTATION OF INORGANIC COMPOUNDS. F. M. Macdonald, Electronic Publishing Division, Chapman & Hall, 2-6 Boundary Row, London SE1 8HN, England. Organic chemistry is perhaps the most systematically documented of all the sciences owing to its basis of small molecules whose structures are readily depicted by "shorthand" formalisation. Inorganic chemistry, however, is less amenable to description in this way. During the development of the new database, Dictionary of Inorganic Compounds, data has been organised to allow searching across the wide and diverse range of compounds. Some of the difficulties encountered and solutions employed in data handling will be discussed, the key aim being to make the data as convenient to access as possible to both specialist inorganic chemists and to non-specialists (other chemists, other scientists). Examples will cover the whole range of inorganic substances from discrete molecules and simple coordination compounds to grossly non-stoichiometric solid-state lattices. The different concerns in presenting the material in print format versus CD-ROM will be discussed. THURSDAY AFTERNOON, AUGUST 24, 1995 General Papers G. Grethe, Presiding 74. 2:00 PM CAS DATABASE CONTENT ENHANCEMENTS FOR IMPROVED ACCESS. Patricia S. Wilson, Chemical Abstracts Service, P.O.
Box 3012, Columbus, OH 43210. The CAS database, distinguished by comprehensive literature coverage and in-depth, highly specific indexing of chemical substances, is challenging to search efficiently. To meet this challenge, CAS is enhancing database content to facilitate search and retrieval, guided by user suggestions and requests. One frequently requested enhancement is for intellectually assigned, concise, predictable indicators of the roles that substances play in the original reports. Through extensive discussions with users, we are developing a new indexing feature to address this need. The plan is to assign every indexed chemical substance at least one Role based on the novelty reported; multiple Roles will be assigned when appropriate for the particular substance and study. This paper will discuss the considerations in defining a practical set of Roles with a hierarchical structure and will illustrate use of the new Roles. Other enhancements to database content aimed at improving access will also be outlined. 75. 2:25 PM CONSTRUCTING CONCEPTUAL HIERARCHY FOR SUBSTRUCTURE SEARCHING. J. An, H. Chen and Y. Fujiwara, University of Tsukuba, Institute of Information Science, Ibaraki, 305 Japan. A new approach for substructure searching is presented. In contrast to the screening approach, which creates an index of fragments, we construct a conceptual hierarchy of chemical graphs according to the substructure relationship in advance. The approach is highly efficient because subgraph isomorphism operations can be avoided by taking advantage of the conceptual hierarchy. The process of constructing the conceptual hierarchy starts from the compounds in the database instead of single atoms. The border concepts (substructures) of a node can be generated by cutting off an atom or a superatom. In the cutting strategy, some semantics of chemical structure were used to make the system efficient and rational.
The subroutines we used to derive rings and judge graph isomorphism are based on the SSSR and Sussenguth's algorithms respectively. 76. 2:50 PM GENERIC STRUCTURE REPRESENTATION OF DEA REGULATIONS OF CONTROLLED SUBSTANCES. M. Liu, G. M. Banik, S. P. Schmidt, R. D. Brown, J. M. DeLazzer, Abbott Laboratories, Pharmaceutical Products Division, Abbott Park, IL 60064. DEA controlled substance regulations (21 Code of Federal Regulations, sections 1308.02-15) are complex to follow. Some regulations consider the salts, esters, ethers or isomers (geometric, positional or stereo) of a controlled compound as controlled substances, and other regulations do not. There are a few regulations which consider certain derivatives of a controlled compound as controlled substances. This variability leads to some degree of confusion in controlled substance classification. We have investigated the use of generic structures in both MDL and Daylight formats to represent most of the DEA regulations of controlled substances. These generic structures coupled with an application program can be used as a controlled drug classification expert system.