PP703 - Agricultural Genomics: Principles and Applications

Instructors: Guo-Liang Wang and Eric Stockinger

Click below to download handouts and reference papers as PDF files

One slide/page lecture notes

Four slides/page lecture notes

Reference paper 1

Reference paper 2

Course video

Printed copies of handouts will not be provided in class. Please print out your own handouts.

Study questions

  1. Define bioinformatics. How does bioinformatics differ from computational biology?
  2. How does sequencing an insert (1–5 kb) in a standard plasmid vector (e.g. pBluescript) by primer walking differ from sequencing a 100 kb insert in a BAC vector by shotgun sequencing? How does sequencing a whole genome using a BAC-by-BAC strategy differ from sequencing a whole genome using a bottom-up shotgun strategy?
  3. What is Phred? How many errors are expected in a 1000 base-pair sequence that has a Phred value of Q20? What is the minimum acceptable Phred Q value for finished sequence deposited in GenBank? What is the typical Phred Q value of single-pass raw sequence data deposited in EST and HTGS databases? If you were relying on sequence data from an EST or HTGS database to generate molecular markers for mapping purposes but discovered that many of the markers you developed were not working as you predicted, what might be one of the reasons?
  4. Why is it important to remove vector and adapter sequences when sequencing is done by a shotgun method? What might happen during assembly if those sequences were not removed?
  5. How many bases of sequence are typically possible from one fragment using Illumina? What is a paired-end read sequence? What is the difference between de novo sequencing a genome and resequencing a reference genome?
  6. How many different nucleic acid bases comprise DNA? What are their single letter codes? Why are other codes sometimes used for DNA? What does an “N” code for in DNA? How many different amino acids typically comprise proteins?
  7. What is the basic format of a sequence in FASTA? What are some of the information fields in a GenBank sequence format?
  8. What is Entrez? What types of information can be retrieved through Entrez? What is the difference between data in a curated, annotated database and data in a non-curated, non-annotated database? Why might it be better to rely on information provided in an annotated database than in a non-annotated database?
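
For the arithmetic in question 3, the following is a minimal sketch, assuming the standard Phred definition Q = -10 log10(P), where P is the probability that a base call is wrong. The function names are hypothetical and chosen only for illustration, and the sketch assumes every base in the read has the same Q value, which real reads generally do not.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score Q into a per-base error probability."""
    # Phred definition: Q = -10 * log10(P), so P = 10 ** (-Q / 10)
    return 10 ** (-q / 10)


def expected_errors(length_bp, q):
    """Expected number of miscalled bases in a read of uniform quality Q.

    Assumption: all bases share the same Q value (a simplification).
    """
    return length_bp * phred_to_error_probability(q)


if __name__ == "__main__":
    # Compare a few common quality thresholds over a 1000 bp sequence.
    for q in (10, 20, 30, 40):
        p = phred_to_error_probability(q)
        print(f"Q{q}: error probability {p:g}, "
              f"about {expected_errors(1000, q):g} expected errors per 1000 bp")
```

Under these assumptions, Q20 corresponds to a 1 in 100 chance of error per base, and each increase of 10 in Q lowers the error probability tenfold.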