Data Engineer

Harvard Medical School

Boston, MA

ID: 7092811
Posted: January 25, 2022
Application Deadline: Open Until Filled

Job Description

Job-Specific Responsibilities

The Center for Computational Biomedicine (CCB) is a new center within the Blavatnik Institute at Harvard Medical School. Our mission is to provide cutting-edge computational capabilities, data analysis, and data integration technologies to support medical and biological research within the Medical School. Based at the Harvard Medical School Longwood Campus, we are part of a vibrant community of scientists, physicians, and engineers whose goal is to advance the boundaries of knowledge and improve patient care. The working environment combines the best features of a startup (fast pace, flexibility, flat hierarchies) with those of one of the leading medical schools (excellent benefits, outstanding opportunities for learning, great resources, name recognition).

CCB is looking for an individual to join the Data and Analytic Platforms Group, a group of engineers and scientists developing data warehousing and analytic solutions in support of epidemiology, healthcare economics, machine learning, and basic science research.

The Group works to reduce the burden on faculty by developing centrally managed and shareable data solutions to be used across research silos. We curate very large public and private healthcare utilization (insurance claims, electronic health record), multi-omics, environmental exposure, and social determinants data sets, provision access to those curated data sets, and develop analytic frameworks to accelerate reproducible academic research on top of them. Collectively these data sets contain information relating to hundreds of millions of patients.

This position reports to the Director of the CCB Data and Analytic Platforms Group. Primary responsibilities will include designing and implementing relational database architecture (schema, indexing, stored procedures, ETL processes, etc.) to warehouse multi-terabyte data sets in Microsoft SQL Server. This will include periodically evaluating various query performance metrics to ensure real-time availability to the research community and recommending modifications to the underlying database platform to resolve any identified issues. The bulk of this design work will be left up to the candidate, while a small portion will involve refactoring (or strategically deciding to abandon) existing ETL / indexing strategies. The data sets will be staged into a combination of proprietary schemas as well as the open-source i2b2 data model.

Additional opportunities will be available for the candidate to interact with individual scientific research teams to help improve their workflows.

**The below Typical Core Duties are a generalized list provided by Harvard's Job Frameworks, and may not actually reflect the job-specific responsibilities of this position.

Typical Core Duties

Oversee aspects of data management services, which may include data modeling, database and analytics platform design, database performance and optimization, and recovery/load strategy and implementation
Lead team in development and enhancements of the data user interface including data acquisition/access analysis
Monitor status of assignments; review code and document scripts and procedures
Design and implement data verification and testing methods
Identify and evaluate opportunities to improve existing subject areas and applications and determine viability for adoption
Provide technical expertise and direction in developing and supporting system level programs
Identify areas for efficiency or improvement; recommend improvements
Create new standards and procedures related to end user and interface development, including user requirements
Partner with others on technical issues and system architecture definition
May manage vendor relationships
Provide training to clients and staff
Function as subject matter expert or project lead; advise unit/school
Abide by and follow the Harvard University IT technical standards, policies, and Code of Conduct

Basic Qualifications

Minimum of seven years’ post-secondary education or relevant work experience

Additional Qualifications and Skills

Bachelor’s Degree in Computer Science or related degree preferred. At least 5 years of experience as a software systems architect, including experience developing solutions with both relational database systems and at least one of the following languages: Java, Python, R.
Master’s Degree in a related field (Computer Science / Electrical Engineering, Bioinformatics, Statistics, Data Science, etc.) preferred.
Excellent communication skills, both written and oral
Experience with Microsoft SQL Server or cloud-based data warehousing technologies
Experience designing and maintaining multi-terabyte analytic relational databases, including index and query optimization
Experience orchestrating and optimizing Extract-Transform-Load (ETL) processes for multi-terabyte data warehouses
Comfort doing basic system administration in a Linux environment
Comfort doing basic system administration in a Windows environment
Experience with relational database index optimization
Experience with containerized (Docker or Singularity) workflows/paradigms
Experience with non-relational database systems (graph, key/value, document, array data stores)
Experience with the R statistical computing platform
Experience with Java
Experience with Python
Experience with high-performance computing
Comfort independently exploring distributed computing and database technologies and generating executive reports
Experience with public cloud platforms (AWS, Azure, Google Cloud)

Additional Information

This is a 12-month term appointment with the possibility of renewal contingent on funding.

This Staff role may start as a remote position due to the COVID-19 pandemic and while restrictions are still in place. The current remote nature of this role is considered temporary and may change as the University continues to evaluate options. While we continue to monitor the evolving COVID-19 guidelines, local on-campus work may be expected for some roles. Harvard Medical School does support flexible schedules, subject to individual departments’ business needs.

Harvard requires COVID vaccination for all Harvard community members. Individuals may claim exemption from the vaccine requirement for medical or religious reasons. More information regarding the University’s COVID vaccination requirement, exemptions, and verification of vaccination status may be found at the University’s “COVID-19 Vaccine Information” webpage.

Please note that we are currently conducting a majority of interviews and onboarding remotely and virtually. We appreciate your understanding.

Harvard University offers an outstanding benefits package including:
Time Off: 3 - 4 weeks paid vacation, paid holiday break, 12 paid sick days, 12.5 paid holidays, and 3 paid personal days per year.
Medical/Dental/Vision: We offer a variety of excellent medical, dental, and vision plans; all coverage begins as of your start date.
Retirement: University-funded retirement plan with full vesting after 3 years of service.
Tuition Assistance Program: Competitive tuition assistance program, incredibly affordable classes directly at the Harvard Extension School, and discounted options through participating Harvard grad schools.
Transportation: Harvard offers a 50% discounted MBTA pass as well as additional options to assist employees in their daily commute.
Wellness options: Harvard offers programs and classes at little or no cost, including stress management, massages, nutrition, meditation, and complementary health services.
Harvard access: Athletic facilities, libraries, campus events, and many discounts throughout metro Boston.
The Harvard Medical School is not able to provide visa sponsorship for this position.

Harvard Medical School strives to cultivate an environment that promotes inclusiveness and collaboration among students, faculty and staff and to create new avenues for discussion that will advance our shared mission to improve the health of people throughout the world.