UIC Online Bioinformatics Program

Biological Databases in Bioinformatics (BioE 594)


Databases are used extensively in biological and medical data storage and analysis. Common and widely used types include databases of gene and protein sequences, protein structures, protein interactions, metabolic pathways, compounds and drugs, literature, medical records, and many others. Mastering the basic principles and design methods of databases is fundamental to bioinformatics training. In this course we introduce how database design and application are used in the biomedical field. Students are not required to have prior database knowledge, but should be familiar with the basics of computer science. The basic requirements are fluency in at least one programming language (plus cursory knowledge of at least one other), familiarity with common operating systems such as Windows XP and Unix/Linux, and experience using online biological and medical databases such as NCBI's PubMed and BLAST, PFAM, SWISS-PROT, etc.

Topics covered:


  • Basic knowledge of databases
  • MySQL hands-on session: setup and implementation
  • Advanced databases
  • Popular biological and medical databases
  • Relational databases: principles and implementations
  • Building a biological data warehouse
  • Data retrieval

Syllabus:


  • Lecture 1: Popular bioinformatics databases
  • Lecture 2: Popular medical databases
  • Lecture 3: Introduction to database design principles
  • Lecture 4: Database design for biological data
  • Lecture 5: MySQL (Part I): Installation & Getting Started
  • Lecture 6: MySQL (Part II): Writing Queries in SQL
  • Lecture 7: MySQL (Part III): Advanced MySQL & Account Administration
  • Lecture 8: MySQL (Part IV): Embedded SQL & ODBC
  • Lecture 9: Relational Database Design (I): Entity-Relationship Model
  • Lecture 10: Relational Database Design (II): Relational Algebra & Normalization
  • Lecture 11: Relational Database Design (III): Normalization & Functional Dependency
  • Lecture 12: Relational Database Design (IV): Other Normal Forms & Some Design Issues
  • Lecture 13: Database Integrity
  • Lecture 14: Database Security

We will normally post 2 lectures per week (on Mondays).

Prerequisites:


  • Intermediate-level programming ability in C/C++, Java, or Perl
  • Familiarity with common operating systems such as Windows XP and Unix/Linux
  • Experience using online biological and medical databases such as NCBI's PubMed and BLAST, PFAM, SWISS-PROT, etc.

Textbook:


To be determined. In addition to lecture slides, we will post further materials as the lectures progress, including online resources, tutorials, papers, and our own written materials.

Grading:


  • Homework
    • Worth 100 points; assigned each week (8-12 homework assignments in total)
    • Will be posted on Wednesday and will be due the following Wednesday.
    • For some weeks, we may post homework early. This will not affect the due date of the homework.
    • Late homework will be accepted until the first Friday following the due date, with a penalty of 20 points per day late. Homework will not be accepted after that Friday.
  • Comprehensive project (due on the same day as the final exam)
  • Midterm and final exams
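
As an illustration of the late-homework penalty described under Grading, here is a minimal sketch in Python. It is not an official grading tool. The function name `homework_score` and the constants are hypothetical. The sketch assumes that lateness is counted in whole days, that homework due on Wednesday is therefore at most 2 days late on the Friday cutoff, and that a penalized score never drops below zero; none of these details are stated in the policy itself.

```python
from typing import Optional

LATE_PENALTY_PER_DAY = 20  # points deducted per day late (from the grading policy)
MAX_DAYS_LATE = 2          # assumption: due Wednesday, so the Friday cutoff is 2 days late


def homework_score(raw_score: int, days_late: int) -> Optional[int]:
    """Return the score after the late penalty, or None if the homework is not accepted."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > MAX_DAYS_LATE:
        return None  # past the Friday cutoff: homework is not accepted
    # Assumption: the penalized score is floored at zero.
    return max(0, raw_score - LATE_PENALTY_PER_DAY * days_late)


if __name__ == "__main__":
    # Show the outcome for a raw score of 90 submitted 0 to 3 days late.
    for days in range(4):
        print(days, homework_score(90, days))
```

Under these assumptions, a homework with a raw score of 90 handed in on Thursday would count as 70, on Friday as 50, and would not be accepted after that.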