Bradford Condon PhD

Bioinformatics, Web & Mobile Development



This post is part 4 of a series on file formats, written for the 2017 UK-KBRIN Essentials of Next Generation Sequencing Workshop at the University of Kentucky.

Read the full post...

This post is part 3 of a series on file formats, written for the 2017 UK-KBRIN Essentials of Next Generation Sequencing Workshop at the University of Kentucky. The conference website is hosted here.

General Transfer Format (GTF), also known as General Feature Format (GFF) version 2, is the format used for transcripts in exercise 4, RNA-seq. For more details, please see the Ensembl guide to GFF.

Read the full post...


On April 29th, 2017, the Expanding Your Horizons conference came to Kentucky for the first time. The conference was funded by an NSF EPSCoR (National Science Foundation Experimental Program to Stimulate Competitive Research for Education and Outreach) grant on which I was a PI, along with Ellen Crocker and Susan Odom. I’m very grateful to all of the volunteers, corporate partners, parents, teachers, and students who made this conference a success.

Read the full post...

This post is part 2 of a series on file formats, written for the 2017 UK-KBRIN Essentials of Next Generation Sequencing Workshop at the University of Kentucky. The conference website is hosted here.

# FASTQ sequence format

FASTQ was originally developed by the Wellcome Trust Sanger Institute to bind together FASTA sequences with their respective quality data. It is now the standard for high-throughput sequencing output.

The format

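FASTQ is a four-line-per-sequence format: an identifier line starting with @, the raw sequence, a separator line starting with +, and a quality string with one character per base. If it looks like the raw sequence of your read takes up more than four lines, you probably have word-wrapping enabled.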

@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
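To make the four-line layout concrete, here is a minimal Python sketch that reads records like the one above and converts the quality string into numeric scores. The names `FastqRecord`, `read_fastq`, and `phred_scores` are made up for this illustration and are not part of any library. The default offset of 33 assumes Sanger-style (Phred+33) encoding, which this post does not state; older Illumina files may use a different offset.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One FASTQ entry (hypothetical helper type for this sketch)."""
    seq_id: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield one record per four lines.

    Assumes unwrapped FASTQ (exactly four lines per read) and stops
    at end of file or at the first blank line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"Malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"Sequence/quality length mismatch in {header!r}")
        # Drop the leading '@' so seq_id holds only the identifier text.
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert quality characters to scores (offset 33 is an assumption)."""
    return [ord(char) - offset for char in quality]


# The example record from this post, wrapped in an in-memory file.
example = io.StringIO(
    "@SEQ_ID\n"
    "GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT\n"
    "+\n"
    "!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65\n"
)
records = list(read_fastq(example))
first = records[0]
print(first.seq_id, len(first.sequence), phred_scores(first.quality)[:5])
```

For real data you would pass an open file instead of the `io.StringIO` object. The length check is a cheap way to catch word-wrapped or truncated files early.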

Read the full post...