ID: 7057759 (Ref.No. 310348MV)
Posted: June 14, 2019
Application Deadline: Open Until Filled
The HGSC was founded in 1996 under the leadership of Dr. Richard Gibbs and is a world leader in genomics. The fundamental interests of the HGSC are in advancing biology and genetics through improved genome technologies. As one of the three large-scale sequencing centers funded by the National Institutes of Health, the HGSC provides a unique opportunity to work on the cutting edge of genomic science in a state-of-the-art institution. Today, the HGSC employs approximately 200 staff and occupies more than 36,000 square feet on the 14th, 15th, and 16th floors of the Margaret M. and Albert B. Alkek Building. The HGSC is located on the southwest edge of downtown Houston, the fourth largest city in the U.S., in the Texas Medical Center, the world's largest medical complex. The major activity of the HGSC is high-throughput DNA sequence generation and the accompanying analysis. The HGSC is also involved in developing the next generation of DNA sequencing and bioinformatics technologies that will allow greater scientific advances in the future.
This position with the Next-Generation Sequencing Informatics (NGSI) group requires a Bioinformatics Programmer with Linux/Unix command line and coding experience. As the HGSC’s Bioinformatics Core, NGSI manages the production, maintenance, and primary analysis of all genome sequencing data at the HGSC, including Illumina HiSeq X and NovaSeq informatics. NGSI also contributes to multiple clinical, Mendelian, and large cohort sequencing studies, specifically in the areas of structural variation and at-scale genomic data science. Under the direction of a senior manager, a qualified candidate will assist with running research informatics pipelines, managing data storage and delivery, and troubleshooting routine production issues.
- Manage the generation, storage and delivery of large-sample genomic data sets
- Develop, test and deploy at-scale analysis protocols
- Deliver QC’ed data to public repositories and collaborators
- Maintain extensive project-specific documentation and best practices
- Support day-to-day NGSI production pipelines
- Participate in calls and meetings with collaborators
- Identify novel ways to improve data quality and analysis
- Provide excellent customer service to other HGSC groups and outside collaborators through the ticketing system
- Work with NGSI production team to innovate improvements to NGS pipelines
- Support the development and testing of NGSI software used to run NGS pipelines
- Design, develop, deploy, and troubleshoot novel software and code to support large-scale variant aggregation and quality control
- Bachelor's degree in Computer Science, Biological Science, or a related field.
- Two years of relevant experience.
- Master's degree in a related field
- At least 1 year of hands-on experience working on Linux or Unix-based systems from the command line
- At least 1 year of programming experience with Python (preferred) or Java
- NGS pipeline development
- NGS sequence analysis tools (e.g., BWA, Samtools, bedtools, bamUtils, Picard, GATK, vcftools, bcftools)
- Common genomics data formats (e.g., FASTQ, BAM, VCF, BED)
- Database and big data software (e.g., NoSQL, Hadoop, HBase)
- Statistical and visualization software (e.g., R, SAS)
- Familiarity with running analyses on HPC clusters (Moab, PBS, and Torque preferred)
- Familiarity with cloud computing (AWS, Google)
- Demonstrated experience in software development or testing
- Structural variation detection methods
- Expert proficiency in Unix environments