CBHD Newsletter Issue 8 - February 19, 2004 |
CONTENTS:
Welcome to the eighth issue of the Canadian Bioinformatics Help Desk (CBHD) Newsletter. Back issues of our newsletter can be viewed at our newsletter archive site (http://www.gchelpdesk.ualberta.ca/news/news.php). Our circulation base has reached 1050 subscribers.
In this issue's Bioinformatics Profile, we feature an article on the Genome Canada Bioinformatics Workshop. In this issue's Software Spotlight, we highlight Programming in Perl, a collection of scripts for learning Perl. In our Commercial Software Spotlight, we feature OptGene™, a novel Gene Optimizing tool, distributed in Canada by United Bioinformatica Inc.
This biweekly newsletter is intended to keep Genome Canada researchers and other Help Desk users informed about new software, events, job postings, conferences, training opportunities, interviews, publications, awards, and other newsworthy items concerning bioinformatics, genomics and proteomics. The CBHD newsletter is a mandated service of the Help Desk and we hope to provide enough useful content to keep you interested and informed.
If you know of anyone who would be interested in receiving future issues of this newsletter or contributing content to the newsletter, please email us at ian@gchelpdesk.ualberta.ca. To unsubscribe from this newsletter, send an email message to ian@gchelpdesk.ualberta.ca with the word "unsubscribe" in the subject line or body of the message.
1) Bioinformatics Profile |
2) Software Spotlight |
Name of Script | Function |
Part I Scripts | |
1_typeseq.pl | -This script prompts the user for a name and a DNA sequence
and then prints the information to the screen. -Demonstrates the use strict and use warnings pragmas (these are compiler options), declaring variables, basic syntax, print statement, standard input and output, and the chomp statement. |
2_seqfromfile.pl | -This script reads a DNA sequence (in FASTA format) from a
file, parses its individual components, and prints them to the screen. -Demonstrates opening and reading of files, opening and closing file handles, and the die statement. |
3_readmany.pl | -This script reads multiple DNA sequences (in FASTA format)
from a file and prints the name, length, and %GC content of each
sequence. -Demonstrates arrays, the special variables $/ and $1, the while loop, the if statement, the else statement, the next statement, the matching operator, the substitution operator, the length function for strings, the string equality operator eq, the push statement for adding elements to an array, the sort function for sorting the elements of an array, the reverse function for reversing the elements of an array, the sprintf function for formatting strings, and the scalar function for determining the number of elements in an array. |
4_translate.pl | -This script reads multiple DNA sequences (in FASTA format)
from a file and translates the DNA into protein in reading frame 1. -Demonstrates hash tables (associative arrays), the lowercase function lc, the for loop, the substring function substr, the exists function for hash tables, and the join function for combining the elements of an array into a string. |
5_revcomp.pl | -This script reads a single DNA sequence (in FASTA format)
from a file and converts the DNA sequence into its
reverse-complement form. Sequence composition statistics are printed
for the original and reverse-complement sequence. -Demonstrates various regular expressions for use with the matching and substitution operators, the translation operator (tr), and the split function for converting a string into an array. |
6_orfs_a.pl | -This script reads a single DNA sequence (in FASTA format) from a file and finds the open reading frames (ORFs) on the direct strand using pattern matching (regular expressions). The ORF ranges are then printed. -Demonstrates the pos function for monitoring the regular expression search, the elsif statement, the modulus operator, the pop function for removing and returning the last element in an array, the last statement for exiting a loop, the foreach loop, and the special variable $_. |
6_orfs_b.pl | -This script is similar to 6_orfs_a.pl, except that it uses for loops to scan the DNA sequence in each of the three reading frames on the direct strand for start and stop codons. It is approximately 30 times slower than 6_orfs_a.pl. The speed difference is probably due to the optimization of the built-in pattern matching functions, which are used in 6_orfs_a.pl. -Demonstrates the foreach loop and the special variable $_. |
7_transorfs.pl | -This script reads a single DNA sequence (in FASTA format)
from a file and then generates the protein translations of ORFs found
in any of the six reading frames. The protein ORFs are displayed in
FASTA format with an informative title. A minimum ORF length prevents
short ORFs from being displayed. -Uses portions of the early programs, with slight modifications. |
Part II Scripts | |
1_orf_finder_sequence.pl | -This script submits a sequence to NCBI's ORF Finder and writes the unparsed results to a file. -Demonstrates the Library for the WWW in Perl (LWP), the unless statement, and writing to a .html file. |
2_orf_finder_parser.pl | -This script parses the output produced by 1_orf_finder_sequence.pl, extracts the ORF information, and writes it to a text file. |
3_genscan.pl | -This script reads a single DNA sequence (in FASTA format)
from a file and sends the sequence to MIT's Genscan web server; the
predicted gene translations returned by Genscan are written to a file. -Further demonstrates and discusses the Library for the WWW in Perl. |
4_blast.pl | -This script reads multiple protein sequences (in FASTA
format) from a file and submits them to NCBI's BLAST server; the
results for each sequence are written to a file. -Demonstrates the application program interface (API) for accessing the NCBI BLAST server, how to use Perl to submit BLAST queries, and how to retrieve the results using the assigned request ID. |
5_MW.pl | -This script reads multiple protein sequences (in FASTA format) from a file, determines the amino acid usage and molecular weight of each sequence, and determines the combined amino acid usage for all the sequences. The results are written to a file. -Further demonstrates hash tables and writing to a file. |
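The scripts listed above are written in Perl and are not reproduced in this newsletter. As a rough illustration of three of the computations they describe (the %GC content from 3_readmany.pl, the reverse complement from 5_revcomp.pl, and the regular-expression ORF search on the direct strand from 6_orfs_a.pl), here is a minimal Python sketch. The function names, the use of ATG as the only start codon, the min_length parameter, and the 1-based inclusive coordinates are all assumptions made for illustration, not details taken from the scripts themselves.

```python
import re

# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# A lookahead is used so that an ORF is reported for every ATG,
# even when ORFs overlap. Assumption: ATG is the only start codon
# and TAA, TAG, TGA are the stop codons.
_ORF_PATTERN = re.compile(r"(?=(ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)))")


def gc_content(seq):
    """Return the percentage of G and C bases (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_length=0):
    """Find ORFs on the direct strand using pattern matching.

    Returns (start, end, frame) tuples, where start and end are
    1-based inclusive positions and frame is 1, 2, or 3.
    ORFs shorter than min_length bases (hypothetical filter) are skipped.
    """
    seq = seq.upper()
    orfs = []
    for match in _ORF_PATTERN.finditer(seq):
        start, end = match.start(1), match.end(1)
        if end - start >= min_length:
            orfs.append((start + 1, end, start % 3 + 1))
    return orfs


if __name__ == "__main__":
    dna = "CCATGAAATAGGG"  # made-up example sequence
    print(f"%GC: {gc_content(dna):.1f}")
    print("Reverse complement:", reverse_complement(dna))
    for start, end, frame in find_orfs(dna):
        print(f"ORF {start}..{end} in frame {frame}")
```

Note that this sketch reports a separate ORF for every in-frame ATG, so nested ORFs that share a stop codon appear more than once; the original scripts may handle this differently.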
Feature article contributed by Lindsay Moir
The production of genetically modified organisms with higher productivity, disease resistance and other desirable properties is still based on naturally occurring gene sequences. Such sequences fall short in modern biotechnology, which places increased focus on safety requirements for recombinant products while demanding higher flexibility in protein design. These gene sequences seldom meet the ever-growing demand for optimized yields in heterogeneous systems.
OptGene™ is a
novel Gene Optimizing tool that optimizes naturally occurring genes to
achieve higher productivity, at the same time giving higher flexibility
for protein design. The tool optimizes the genes using only the
sequence information and the choice of expression system. OptGene™
allows the researcher to adapt genes and their products precisely to
their specific requirements.
OptGene™ achieves optimization through
If you need a) to optimize expression from a transgenic organism (e.g. alfalfa, canola, mouse, goat), b) a management tool that can take you through the process, c) to manage all of the information, or d) to explore a large number of possible alternatives, then OptGene™ is likely in your future. For further information on OptGene™, please visit www.ubi.ca/optgene.htm. Information on other bioinformatics products that UBI offers can be found at www.ubi.ca/products.htm.
3) What's New? |
19 Feb 2004 | CBHD Bioinformatics Needs Survey Report - We recently conducted a Canada-wide bioinformatics needs survey, asking Genome Canada researchers what their bioinformatics and computational biology needs were. We wish to thank all who participated in this survey. Our report is available in PDF and DOC formats. |
16 Feb 2004 | S2K Chooses GeneLinker Products for Advanced Data Analysis - "S2K is a consortium of highly recognized researchers from across Canada funded by a $15 million grant from Genome Canada, Genome Quebec and Ontario Genomics Institute. The S2K program, which is hosted by the Université de Montréal, aims to study the functional genomics, pharmacogenomics and proteomics of the immune response regarding HIV and HCV infections, SARS, transplant rejection and rheumatoid arthritis diseases. The ultimate goal of this program is to develop a bioinformatic model to predict susceptibility and progression of the targeted diseases as well as the response to a given therapy." Source: http://www.prweb.com/releases/2004/2/prweb104338.htm |
16 Feb 2004 | Bioinformatics Network Cheered - "European Virtual Institute for Genome Annotation will have a global impact. An initiative to tackle the current fragmentation of bioinformatics research across Europe has been welcomed by scientists in both Europe and the United States." Source: http://www.biomedcentral.com/news/20040216/03/ |
14 Feb 2004 | BIOKNOPPIX Distribution - The High Performance Computing Facility at the University of Puerto Rico has released a Knoppix Linux Live CD distribution customized for the molecular biologist. Software included with BIOKNOPPIX includes EMBOSS 2.8.0, jemboss, artemis, clustal, Cn3D, ImageJ, BioPython, Rasmol, Bioperl, and Bioconductor. Source: http://bioinformatics.org/forums/forum.php?forum_id=2500 |
12 Feb 2004 | RNAsoft Release - Researchers in the Department of Computer Science at the University of British Columbia recently released online versions of RNAsoft, "software for RNA/DNA secondary structure prediction and design." RNAsoft consists of three main programs: PairFold, CombFold, and RNA Designer. For further details, see their recent publication in Nucleic Acids Research (http://nar.oupjournals.org/cgi/content/full/31/13/3416). Source: http://bioinformatics.ca/weblogs/ |
10 Feb 2004 | Venter submits whole genome shotgun assemblies to GenBank - "The sequences of the whole genome shotgun assemblies (WGSA) generated by Venter et al. at Celera Genomics and The Center for the Advancement of Genomics (TCAG) have been deposited in the GenBank database (see accession nos. AADD00000000, AADC00000000, and AADB00000000). This data release accompanies a paper in PNAS comparing the sequence from the International Human Genome Sequencing Consortium (NCBI Build 34) with the WGSA." Source: http://www.bioinformatics.ca/weblogs/log.php?wid=128 |
06 Feb 2004 | Science Issue on Mathematics in Biology - The Feb. 6 issue of Science is devoted to Mathematics in Biology. Several Genome Canada researchers co-authored a paper in this issue entitled "Global Mapping of the Yeast Genetic Interaction Network". Source: http://www.bioinformatics.ca/weblogs/log.php?wid=127 |
12-16 May 2004 | 2004 CSH Meeting on The Biology of Genomes: This meeting will take place in Cold Spring Harbor on May 12-16, 2004. For further details, please visit http://meetings.cshl.org/2004/2004genome.htm |
14-16 May 2004 | CPI '04, MONTREAL, CANADA: The Fourth International Conference of the Canadian Proteomics Initiative (CPI) will be held in Montreal, Canada, on May 14-16, 2004. The deadline for abstracts is March 15, 2004. For other key deadlines, please visit http://www.pence.ualberta.ca/CPI/index.php?keydates. For more information, visit http://www.pence.ualberta.ca/CPI/index.php?home |
20-22 May 2004 | Biotech China 2004: "Biotech China 2004 is an international, multidisciplinary conference designed to offer critical perspectives on the current status and future of cutting-edge genomic technologies such as RNAi, systems biology, functional genomics, proteomics and microarray." The deadline for acceptance of oral presentations has been extended to March 10, 2004. For more information, please visit http://www.biotechcn.com/ |
31 Jul-4 Aug 2004 | ISMB/ECCB 2004: "In 2004—for the first time ever—Intelligent Systems for Molecular Biology (ISMB) will be held jointly with the European Conference on Computational Biology (ECCB), in conjunction with Genes, Proteins and Computers VIII" on July 31-August 4, 2004, in Glasgow, UK. Registration opens March 1, 2004. For further details, please visit http://www.iscb.org/ismbeccb2004/ |
16-20 Aug 2004 | CSB 2004: "The 3rd annual Computational Systems Bioinformatics conference, CSB2004, is being organized once again by the IEEE Computer Society Technical Committee on Bioinformatics under the theme—Systems Bioinformatics." This conference will be held in Stanford, California, USA. The deadline for the submission of papers is March 22, 2004. For more details, please visit the conference web page: http://conferences.computer.org/bioinformatics/ |
23 Aug 2004 | ECAI 2004: The 16th European Conference on Artificial Intelligence (ECAI) will be held in Valencia, Spain. On August 23, 2004, there will be a workshop entitled "Data Mining in Functional Genomics and Proteomics: Current Trends and Future Directions". The deadline for the submission of papers is March 31, 2004. For further details, please visit http://www.softwareresearch.ca/ecai-bio/index.html and http://www.dsic.upv.es/ecai2004/ |
5) Help Desk Software Repository |
6) Bioinformatics Jobs |
Job Title | Location | Date Posted |
BIOINFORMATICS SOFTWARE SPECIALIST | Montreal (Saint-Laurent), PQ | February 19, 2004 |
DNA Sequence Finisher, RAII | Vancouver, BC | February 16, 2004 |
Bioinformatics position (Assistant Professor tenure track) | Toronto, ON | February 9, 2004 |
Molecular Database Curators | Toronto, ON | February 5, 2004 |
Postdoctoral position in statistical and evolutionary bioinformatics/phylogenetics | Halifax, NS | February 4, 2004 |
Curators | Montreal, PQ | January 30, 2004 |
Application Scientists (Part Time) | Calgary, AB | January 26, 2004 |
Gene Expression Research Associate (plus twelve additional positions at Genome Sciences Centre) | Vancouver, BC | January 26, 2004 |
Computational Biology position (Assistant/Associate Professor tenure track) | Hamilton, ON | January 24, 2004 |
SHARCNet Chair in Bioinformatics | London, ON | January 7, 2004 |
Future positions: Bioinformatics, molecular microbiology, and genomics | Burnaby, BC | Starting in 2004-2005 |
Source: http://www.bioinformatics.ca/jobs (except for the Bioinformatics tenure-track, Computational Biology tenure-track, and future positions)
7) CBHD Registration
WHY REGISTER?
Registering with the Canadian Bioinformatics Help Desk benefits both you and us. Benefits include:
Free Subscription
To start your free subscription to this newsletter, send an email message to ian@gchelpdesk.ualberta.ca with the word "subscribe" in the subject line or body of the message. Please forward this newsletter to any interested colleagues or collaborators. We do appreciate your comments. Send your comments and feedback about this newsletter to ian@gchelpdesk.ualberta.ca.
Bioinformatician
Canadian Bioinformatics Help Desk
University of Alberta
Department of Biological Sciences, CW 405
Edmonton, AB
Canada T6G 2E9
Fax: (780) 492-9234
Email: ian@gchelpdesk.ualberta.ca
Website: http://gchelpdesk.ualberta.ca
The CBHD is sponsored by: