QSAR - Canadian Bioinformatics Help Desk Newsletter -- March 4, 2004

From: ian-$-redpoll.pharmacy.ualberta.ca
Date: Thu, 4 Mar 2004 17:32:21 -0700






Bioinformatics Platform: A GenomePrairie Project

CBHD Newsletter
Issue 9 - March 4, 2004



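CONTENTS:
1) Bioinformatics Profile
2) Software Spotlight
3) What's New?
4) Upcoming Events
5) Help Desk Software Repository
6) Bioinformatics Jobs
7) CBHD Registration
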
Online version of this newsletter:
http://gchelpdesk.ualberta.ca/news/04mar04/cbhd_news_04mar04.php

Welcome to the ninth issue of the Canadian Bioinformatics Help Desk (CBHD) Newsletter. Back issues of our newsletter can be viewed at our newsletter archive site (http://gchelpdesk.ualberta.ca/news/news.php). Our circulation base has reached 1070 subscribers.

In this issue's Bioinformatics Profile, we feature an article on Faster, Higher, Stronger Sequence Database Searches: homology detection with higher scores and stronger evidence. In our Commercial Software Spotlight we feature Phoretix 1D, a 1D gel image analysis tool, distributed in Canada by United Bioinformatica Inc.

This biweekly newsletter is intended to keep Genome Canada researchers and other Help Desk users informed about new software, events, job postings, conferences, training opportunities, interviews, publications, awards, and other newsworthy items concerning bioinformatics, genomics, and proteomics. The CBHD newsletter is a mandated service of the Help Desk and we hope to provide enough useful content to keep you interested and informed.

If you know of anyone who would be interested in receiving future issues of this newsletter or contributing content to the newsletter, please email us at ian---gchelpdesk.ualberta.ca. To subscribe to or unsubscribe from this newsletter, send an email message to ian(a)gchelpdesk.ualberta.ca with the word "subscribe" or "unsubscribe" in the subject line or body of the message.

1) Bioinformatics Profile

Sequence Database Searches: Faster, Higher, Stronger!
Homology Detection: Higher Scores, Stronger Evidence

Feature article contributed by Paul Gordon

The key to a successful database analysis of DNA and protein sequences is to maximize two characteristics of the search results: sensitivity and selectivity. Improved sensitivity means that fewer true positive matches (i.e., correctly identified functional domains) are missed in the results. Improved selectivity means that fewer false positives (i.e., mistakenly identified functional domains) are reported in the results. Software such as the BLAST suite of programs relies on assumptions about the nature of the sequence similarities to take computational shortcuts, and it does this fabulously well. The results from these searches can produce clear homology candidates. But what about those results that just give you weak hits and hypothetical proteins? To detect more distant homologies, or to find matches in error-prone sequence such as ESTs, these shortcuts cannot always be taken, and the search space grows exponentially. To identify the maximum number of functional domains correctly, one must use a whole range of sequence search tools. These tools include more sensitive pairwise methods such as Smith-Waterman searches and intron-spanning GeneBLASTs, plus Hidden Markov Models such as those found in Pfam and other InterPro domain family databases. Figure 1 illustrates the relative sensitivity and selectivity of various search methods.
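
To make the two measures concrete, here is a minimal Python sketch. The article does not give formulas, so the definitions below are an assumption: sensitivity is taken as the fraction of real matches that were reported, and selectivity as the fraction of reported matches that are real. The counts are hypothetical and serve only as illustration.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real matches that the search actually reported."""
    return true_positives / (true_positives + false_negatives)


def selectivity(true_positives: int, false_positives: int) -> float:
    """Fraction of reported matches that are real (assumed definition)."""
    return true_positives / (true_positives + false_positives)


# Hypothetical search: the database holds 100 real domains; the search
# reports 90 hits, of which 80 are real and 10 are spurious.
real_domains = 100
reported_hits = 90
true_hits = 80

missed = real_domains - true_hits        # false negatives
spurious = reported_hits - true_hits     # false positives

print(f"sensitivity = {sensitivity(true_hits, missed):.2f}")
print(f"selectivity = {selectivity(true_hits, spurious):.2f}")
```

A method that reports every database entry would score perfectly on sensitivity and very poorly on selectivity, which is why both measures need to be considered together.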




Figure 1. Comparison of sensitivity and selectivity of various sequence search methods. Blue denotes a software method, red denotes a hardware accelerated method.
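
For readers unfamiliar with Smith-Waterman, one of the more sensitive pairwise methods named above, the following is a minimal Python sketch of the local alignment score calculation. The scoring values (match +2, mismatch -1, linear gap penalty -2) are arbitrary assumptions for illustration; production searches use substitution matrices and affine gap penalties, and this sketch is not the implementation run on any of the systems discussed in this article.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between a and b.

    Uses a linear gap penalty and keeps only two rows of the dynamic
    programming matrix, so memory use is proportional to len(b).
    """
    prev = [0] * (len(b) + 1)   # row 0 of the matrix is all zeros
    best = 0
    for ca in a:
        curr = [0]              # column 0 of every row is zero
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap          # gap in b
            left = curr[j - 1] + gap    # gap in a
            cell = max(0, diag, up, left)   # 0 restarts the local alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# The shared "ACGT" core is found even though the flanking bases differ.
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Every cell of the matrix is evaluated, with none of the shortcuts BLAST takes. This exhaustive calculation is what makes the method both more sensitive and far more expensive.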

These more sensitive and selective methods will generally yield higher scores (and therefore lower, more significant E-values), and produce stronger evidence in non-obvious cases than the ubiquitous BLAST. In particular, the frameshift and Hidden Markov Model methods may find matches that elude the standard software methods because of limitations inherent in BLAST.
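
The relationship between alignment scores and E-values can be sketched with the standard Karlin-Altschul formula for normalized bit scores, E = m * n * 2^(-S'), where m is the query length and n is the database length. The lengths below are hypothetical, and real search tools apply effective-length corrections that this Python sketch ignores.

```python
def expected_hits(bit_score: float, query_length: int,
                  database_length: int) -> float:
    """E-value for a normalized bit score: E = m * n * 2**(-S')."""
    return query_length * database_length * 2.0 ** (-bit_score)


query_length = 300                 # hypothetical query, in residues
database_length = 1_000_000_000    # hypothetical database, in residues

for bit_score in (30, 50, 100):
    e_value = expected_hits(bit_score, query_length, database_length)
    print(f"bit score {bit_score:>3}: E-value = {e_value:.2e}")
```

Each additional bit of score halves the number of matches expected by chance, so a higher score corresponds to a lower E-value and stronger evidence of homology.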

Hardware Solutions: Faster Results

The University of Calgary, as part of the Genome Canada Bioinformatics Platform Project, hosts specialized systems for sequence database searches of all the types described above. These systems are the Paracel® GeneMatcher/BlastMachine and the TimeLogic™ Decypher®. Full descriptions of these machines are available online. Finding these more difficult matches requires much more computation, which is why we use special systems: hardware acceleration makes the sensitive methods practical for large-scale datasets. In Figure 2, the plot of runtimes for comparable hardware and software methods illustrates the vast performance improvement achieved by the Paracel® (GeneMatcher and BlastMachine) and TimeLogic™ (Decypher®) systems for both the software and the hardware-accelerated methods.




Figure 2. Time-to-completion comparison of the original methods and the accelerated methods, run in batch mode, available at the University of Calgary. For TBLASTX the improvement is 20-fold; for the other methods it is at least 100-fold.


Each system has its particular strengths, as summarized in the table below. With this knowledge, we have the ability to maximize throughput, as well as sensitivity and selectivity, by selecting the proper methods on the proper machines for the given input data.


Sequence types         Search Method                  Machine
DNA/protein            BLAST/PSI-BLAST                BlastMachine
                       TeraBLAST                      Decypher
ESTs vs. genomic       GeneBLAST                      Decypher
                       Smith-Waterman, semi-global    GeneMatcher
ESTs vs. protein       S-W Frame                      GeneMatcher
DNA/protein vs. HMMs   HMM Search                     Decypher
ESTs vs. HMMs          HMM Frame Search               Decypher
HMMs vs. genomic       GeneWise                       GeneMatcher

Figure 3. Most appropriate sensitive search type for given data.
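
For scripted pipelines, the table above can be captured as a simple lookup. The Python sketch below is only an illustration: the dictionary and the `recommend` function are hypothetical names, not part of any University of Calgary interface, and it assumes that the TeraBLAST and Smith-Waterman rows, which list no sequence type of their own, belong to the sequence type in the row above them.

```python
from typing import Dict, List, Tuple

# Sequence types -> (search method, machine) pairs, transcribed from the table.
# Assumption: rows without a sequence type inherit the one from the row above.
SEARCH_OPTIONS: Dict[str, List[Tuple[str, str]]] = {
    "DNA/protein": [("BLAST/PSI-BLAST", "BlastMachine"),
                    ("TeraBLAST", "Decypher")],
    "ESTs vs. genomic": [("GeneBLAST", "Decypher"),
                         ("Smith-Waterman, semi-global", "GeneMatcher")],
    "ESTs vs. protein": [("S-W Frame", "GeneMatcher")],
    "DNA/protein vs. HMMs": [("HMM Search", "Decypher")],
    "ESTs vs. HMMs": [("HMM Frame Search", "Decypher")],
    "HMMs vs. genomic": [("GeneWise", "GeneMatcher")],
}


def recommend(sequence_types: str) -> List[Tuple[str, str]]:
    """Return the (method, machine) options for a kind of search, or [] if unknown."""
    return SEARCH_OPTIONS.get(sequence_types, [])


for method, machine in recommend("ESTs vs. genomic"):
    print(f"{method} on {machine}")
```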


Availability

Unique Resources

In addition to being the only facility in Canada with this combination of resources, the University of Calgary facility hosts a unique in-house HMM database based on the popular NCBI Clusters of Orthologous Groups database. This database is the largest curated HMM set we know of at over 9000 prokaryotic and eukaryotic gene models. If you're still getting "hypothetical protein" from your PSI-BLAST responses, try searching COGSHMM on the Decypher system.

Casual Usage

Web interfaces for interactive submission by the general public exist for the Decypher and BlastMachine/GeneMatcher solutions.

Batch Jobs

More in-depth coverage of how to use these resources is available at the University of Calgary web site (http://magpie.ucalgary.ca/search_resources.xhtml). Funded through Genome Canada and WestGrid grants, these machines are accessible for high-throughput usage by Canadian academic researchers. For further details, please contact the Genome Canada Bioinformatics Platform Project Principal Investigator, Dr. Christoph Sensen, csensen . ucalgary.ca.


2) Software Spotlight

Featured Commercial Software: Phoretix 1D for 1D Gel Image Analysis

Feature article contributed by Russell Trischuk, UBI Application Scientist, r.trischuk[*]usask.ca

In the rapidly changing world of molecular biology, high-throughput analysis is a fact of life. As a result, there is a need for powerful, comprehensive, and user-friendly gel analysis software. The answer to this requirement is Phoretix 1D Advanced, by Nonlinear Dynamics. Phoretix 1D is a complete 1D gel analysis package capable of analyzing any gel (DNA or protein) separated in a single dimension, including PCR-based procedures (AFLP and RAPD), RFLP, SSCP, immunoblots, protein purification, and the analysis of post-translational modifications. Phoretix 1D is a highly automated, user-friendly, accurate software package that supports a wide range of analysis applications.

This software is capable of:

[Screenshots: Multiple Lanes; Multi Tiered; Pixel Intensity; RF Lines feature; Band ID-Matching feature]


Phoretix 1D is also available in a professional package (Phoretix 1D Pro) that couples the power of Phoretix 1D with a database that allows the user to create multiple libraries enabling the analysis of large cross-gel studies and projects.

If you are involved in any 1D gel-based projects and require an accurate, reliable and user friendly tool for the analysis and management of your data then Phoretix 1D Advanced or Professional should be your package of choice. For further information, trials, etc., please contact the Canadian distributor, UBI, at 866 202 2100, or by email at info^^ubi.ca



3) What's New?



01 Mar 2004 Proteome Analyst Subcellular Localization Paper - Dr. David Wishart, Director of the CBHD, along with other researchers in the Department of Computing Science at the University of Alberta, recently co-authored a paper that appeared in the March 1 issue of Bioinformatics. If you missed our Bioinformatics Profile article on the Proteome Analyst Subcellular Localization Prediction Server, please visit http://gchelpdesk.ualberta.ca/news/22jan04/cbhd_news_22jan04.php#profile

01 Mar 2004 Chicken Genome Assembled - The first draft of the chicken genome sequence has been deposited into free public databases around the world. This assembly of genomic sequence data from the Red Jungle Fowl (Gallus gallus), ancestor of domestic chickens, represents the first avian genome to be sequenced. To read the NIH News Advisory, please visit http://www.nhgri.nih.gov/11510730

25 Feb 2004 2004 Benjamin Franklin Award in Bioinformatics - The Bioinformatics Organization, Inc. (Bioinformatics.Org) will present the 2004 Benjamin Franklin Award in Bioinformatics to Lincoln D. Stein of Cold Spring Harbor Laboratory for his "creation of a great number of open-source bioinformatics programs and for championing open-source principals in many venues..." For the Bioinformatics.Org press release, please visit http://bioinformatics.org/forums/forum.php?forum_id=2516

19 Feb 2004 Genome War - James Shreeve has written a new book, The Genome War: How Craig Venter tried to capture the code of life and save the world, chronicling the ins and outs of daily life at Celera Genomics during the race to sequence the human genome. For a good introduction to this book, check out Kevin Davies' recent Around the Bases feature article from BioIT World.

18 Feb 2004 Search for Complex Genes - This Bio-IT World article describes some of the "tricks of the trade", old and new, that are being used by researchers to track down "genes for complex diseases" (http://www.bio-itworld.com/archive/021804/genes.html).

18 Feb 2004 Continued Need for Bioinformatics - Analysts at Navigant Consulting believe that the need for bioinformatics analytical software in the Pharmaceutical and Biotechnology sectors will not go away over the next five years. Source: http://businesswire.com

16 Feb 2004 Bioinformatics Software in Academia - There is an abundance of bioinformatics software in academia. However, some projects were never completed and others contain outdated code. Many academics face a dilemma: resurrect the legacy code, start from scratch, or abandon it altogether. Here is an interesting article from The Scientist on this subject (http://www.the-scientist.com/yr2004/feb/prof2_040216.html).



4) Upcoming Events

BIOINFORMATICS TRAINING

Applied Computational Genomics Course - The next ACGC will be held in Winnipeg, Manitoba, on June 12-20, 2004. Early bird registrations must be received before May 1, 2004. For more information, see last issue's Bioinformatics Profile article or visit the course web page.

Canadian Bioinformatics Workshops: In 2004, the CBW will be offering three remaining bioinformatics workshops: 1. Developing the Tools (Deadline: March 6, 2004), 2. Proteomics (Deadline: May 22, 2004), and 3. Genomics (Deadline: June 19, 2004). These courses may count toward a Certificate in Bioinformatics. For further details, please visit http://www.bioinformatics.ca/workshops.php

BioneQ's Courses and Workshops - BioneQ offers a variety of courses and workshops in bioinformatics. Here are some of the courses and workshops that they offer: LIMS Workshop, EST Clustering Workshop, Workshop on Analysis of Expression Data, BASE Demo Installation, and Biojava Bootcamp. For further details, please visit their web site at http://www.bioneq.qc.ca

Training Program in Bioinformatics for Health Research: A bioinformatics training program, leading to a post-graduate diploma, M.Sc., or Ph.D., is "offered through a partnership between the BC Cancer Agency, Simon Fraser University and the University of British Columbia." For more information, visit http://bioinformatics.bcgsc.ca

Cold Spring Harbor Laboratory (CSHL) is offering a special 2-day course as well as summer and fall courses. The deadlines for summer and fall courses are March 15, 2004 and July 15, 2004, respectively. For more information, please visit http://meetings.cshl.org/2004/2004courses.htm


BIOINFORMATICS MEETINGS

12-16 May 2004 2004 CSHL Meeting on The Biology of the Genomes: This meeting will take place in Cold Spring Harbor on May 12-16, 2004. For further details, please visit http://meetings.cshl.org/2004/2004genome.htm

14-16 May 2004 CPI 2004, MONTREAL, CANADA: The Fourth International Conference of the Canadian Proteomics Initiative (CPI) will be held in Montreal, Canada, on May 14-16, 2004. The CPI 2004 tutorials will take place on May 17-18, 2004. The deadline for abstracts is March 15, 2004. Registration closes April 13, 2004. For more information, visit http://www.pence.ualberta.ca/CPI/index.php?home

20-22 May 2004 Biotech China 2004: "Biotech China 2004 is an international, multidisciplinary conference designed to offer critical perspectives on the current status and future of cutting-edge genomic technologies such as RNAi, systems biology, functional genomics, proteomics and microarray." The deadline for acceptance of oral presentations has been extended to March 10, 2004. For more information, please visit http://www.biotechcn.com/

31 Jul-4 Aug 2004 ISMB/ECCB 2004: "In 2004—for the first time ever—Intelligent Systems for Molecular Biology (ISMB) will be held jointly with the European Conference on Computational Biology (ECCB), in conjunction with Genes, Proteins and Computers VIII" on July 31-August 4, 2004, in Glasgow, UK. Registration opens March 1, 2004. The poster submission deadline is April 19, 2004. For a list of key dates, please visit http://www.iscb.org/ismbeccb2004/keydates.html. For further details, please visit http://www.iscb.org/ismbeccb2004/

16-20 Aug 2004 CSB2004: "The 3rd annual Computational Systems Bioinformatics conference, CSB2004, is being organized once again by the IEEE Computer Society Technical Committee on Bioinformatics under the theme - Systems Bioinformatics." This conference will be held in Stanford, California, USA. The deadline for the submission of papers is March 22, 2004. The poster submission deadline is May 17, 2004. Pre-conference tutorials will be held on August 16, 2004. Post-conference half-day workshops will be held on August 20, 2004. For more information, please visit the conference web page: http://conferences.computer.org/bioinformatics/

23 Aug 2004 ECAI 2004: The 16th European Conference on Artificial Intelligence (ECAI) will be held in Valencia, Spain. On August 23, 2004, there will be a workshop entitled, "Data Mining in Functional Genomics and Proteomics: Current Trends and Future Directions" (http://www.softwareresearch.ca/ecai-bio/index.html). The deadline for the submission of papers is March 31, 2004. For further details, please visit the conference web site at http://www.dsic.upv.es/ecai2004/



5) Help Desk Software Repository

The Help Desk software repository is where researchers may upload or download bioinformatics programs of interest. Currently the repository has 50 programs. These are freeware packages that are available for anyone to download and install on their own computer. Many of the programs in the Help Desk repository have been thoroughly tested and a number have been published as research articles. Please take advantage of this resource. Downloads are encouraged and submissions are always welcome. The repository can be found at: http://gchelpdesk.ualberta.ca/repository/.

Attention all programmers: we encourage you to submit your favourite bioinformatics software to the Help Desk Software Repository.

Please email Ian Forsythe (ian[A]gchelpdesk.ualberta.ca) if you would like to deposit software into the software repository. To deposit software now, please visit http://www.gchelpdesk.ualberta.ca/repository/SubmitRealSoftware.php

In case you missed it, last issue's Software Spotlight article highlighted parts one and two of the Programming in Perl tutorial from our Software Repository.


6) Bioinformatics Jobs

This is a resource for advertising positions in bioinformatics and computational biology. If you have a job you would like posted in this newsletter please email curators,+,bioinformatics.ca directly. Job postings will be carried for a maximum of 4 issues (8 weeks) unless the position is filled prior to that date.

Genome Canada is advertising several positions. Check out their career brochure (http://www.genomecanada.ca/GCmedia/CareerOpportunities.pdf) and their latest job postings (http://www.genomecanada.ca/GCcarriere/index.asp?l=e).





Job Title | Location | Date Posted
Tenure Stream Assistant Professor | Toronto, ON | March 3, 2004
SHARCNet Chair in Bioinformatics | London, ON | March 2, 2004
NSERC Industrial Research Chair in Biomedical Mass Spectrometry | Winnipeg, MB | March 2, 2004
Database Administrator; Chemistry Database Curators | Toronto, ON | February 26, 2004
Position in functional genomics | Quebec, PQ | February 26, 2004
Postdoctoral position | Montreal / Quebec City, PQ | February 23, 2004
BIOINFORMATICS ANALYST | Montreal (St-Laurent), PQ | February 23, 2004
BIOINFORMATICS SOFTWARE SPECIALIST | Montreal (Saint-Laurent), PQ | February 19, 2004
DNA Sequence Finisher, RAII | Vancouver, BC | February 16, 2004
Molecular Database Curators | Toronto, ON | February 5, 2004
Postdoctoral position in statistical and evolutionary bioinformatics/phylogenetics | Halifax, NS | February 4, 2004
Curators | Montreal, PQ | January 30, 2004
Application Scientists (Part Time) | Calgary, AB | January 26, 2004
Gene Expression Research Associate (plus numerous additional positions at the Genome Sciences Centre) | Vancouver, BC | January 26, 2004
Computational Biology position (Assistant/Associate Professor tenure track) | Hamilton, ON | January 24, 2004
Future positions: Bioinformatics, molecular microbiology, and genomics | Burnaby, BC | Starting in 2004-2005

Source: http://www.bioinformatics.ca/jobs, except for the Database Administrator, Chemistry Database Curators, Computational Biology tenure-track, and future positions.


7) CBHD Registration

WHY REGISTER?

Registering with the Canadian Bioinformatics Help Desk benefits both you and us.

Benefits include:



Free Subscription

To start your free subscription to this newsletter, send an email message to ian*_*gchelpdesk.ualberta.ca with the word "subscribe" in the subject line or body of the message. Please forward this newsletter to any interested colleagues or collaborators. We appreciate your comments; send your comments and feedback about this newsletter to ian~~gchelpdesk.ualberta.ca




Ian J. Forsythe, MSc
Bioinformatician
Canadian Bioinformatics Help Desk

University of Alberta
Department of Biological Sciences, CW 405

Edmonton, AB
Canada T6G 2E9
Phone: (780) 492-5969
Fax: (780) 492-9234

Email: ian#%#gchelpdesk.ualberta.ca
Website: http://gchelpdesk.ualberta.ca

The CBHD is sponsored by Genome Prairie and Genome Canada.

Received on 2004-03-04 - 21:32 GMT
