Systems Administrator IV

College Station, Texas

Texas A&M University

Job Title

Systems Administrator IV

Agency

Texas A&M University

Department

Vice President for Research

Proposed Minimum Salary

Commensurate

Job Location

College Station, Texas

Job Type

Staff

Job Description

Our Commitment

Texas A&M University is committed to enriching the learning and working environment for all visitors, students, faculty, and staff by promoting a culture that embraces inclusion, diversity, equity, and accountability. Diverse perspectives, talents, and identities are vital to accomplishing our mission and living our core values.

Who We Are

The Division of Research supports one of the largest research universities in the United States and one of only seventeen universities to hold the triple designation as a land-grant, sea-grant, and space-grant university. We are committed to a truly comprehensive university where students, researchers, and inventors bring scholarship and innovation to bear for the benefit of the community, the state, and the nation. To learn more, visit https://vpr.tamu.edu/.

What We Want

The Division of Research is seeking a Systems Administrator IV who will be responsible for maintaining a large and complex systems software development project; providing advanced technical support in developing, maintaining, installing, and using operating systems or subsystems; and providing strategic planning and technical guidance on major development projects. We seek to hire a candidate who will support our commitment to Inclusion, Diversity, Equity, and Accountability (IDEA) as stated above. The preferred candidate will have a Bachelor's degree in an applicable field, or any equivalent combination of education and experience, as well as eight years of systems administration experience. We seek to hire a team player who will promote collaboration and cooperation while working within a team. If this description sounds interesting to you, we invite you to apply to be considered for this opportunity to join our team.

What You Need To Know

Salary: Commensurate with Experience

Cover Letter/Resume: A cover letter and resume are strongly recommended. You may upload these documents on the application under CV/Resume.

COVID-19 information: Texas A&M University monitors local, state and federally mandated health guidelines to keep students, employees, prospective employees, and visitors as safe as possible. For the latest information regarding Texas A&M's COVID-19 response, please visit the University's COVID-19 website. For COVID-19 employment-related information, please visit the Division of Human Resources and Organizational Effectiveness' COVID-19 website.

Required Education and Experience:
  • Bachelor's degree in an applicable field or any equivalent combination of education and experience
  • Eight years of systems administration experience
Required Knowledge, Skills, and Abilities:
  • Knowledge of word processing and spreadsheet applications
  • Knowledge of IT architecture, application of systems theory, advanced negotiation skills, enterprise-level operations, multi-team leadership and coordination, advanced project management, sourcing, advanced vendor relations, advanced business acumen, change management, and knowledge of the IT industry
  • Ability to multi-task and work cooperatively with others
  • Knowledge of HPC architectures of cache-coherent Symmetric Multi-Processors (SMPs) and clusters and their interconnects
  • Knowledge of HPC mass storage technologies, OS organization and design, host-based and network security, and parallelization and code optimization issues
Other Requirements or Other Factors:
  • Ability to lift or move light to moderately heavy servers and storage equipment, and/or operate a lift for heavier equipment
  • Must be U.S. Person as defined by U.S. export control regulations
Preferred Education and Experience:
  • Ph.D. in relevant science or engineering discipline requiring strong analytical and computational skills
  • Advanced degree in relevant science or engineering discipline requiring strong analytical and computational skills
  • Experience with HPC usage and system administration at universities and other national HPC centers
  • Two years' experience, which can be gained concurrently with the eight years of experience in a related area
  • Four or more years of experience in developing, debugging, and administering Red Hat/CentOS/Ubuntu systems including software package management, software kernel patching, and failure recovery procedures
  • Two or more years of experience with managing HPC high speed interconnects (InfiniBand, Omni-Path, or 40+ GbE)
  • Two or more years of experience with managing storage subsystems using conventional RAID, Declustered RAID, or Erasure Coding in HPC environments
  • Two or more years of experience with managing HPC workload managers such as SLURM, LSF, PBS Pro, etc.
  • One or more years' experience in utilizing cloud resources for HPC and research computing
  • Experience in evaluating and benchmarking cluster architectures and their key subsystems, including mass storage, interconnect, and processor technology
Preferred Licenses and Certifications:
  • Linux/UNIX certifications related to systems administration
  • Certifications related to managing Spectrum Scale or Lustre based storage systems
Preferred Knowledge, Skills, and Abilities:
  • Ability to cultivate and maintain professional working relationships with people of diverse backgrounds
  • Ability to evaluate and benchmark cluster architectures and their key subsystems (e.g., mass storage, interconnect, processor technology)
  • Knowledge of scripting languages such as Bash, Python, and Perl, both for maintaining HPC systems and for use in scientific computing
  • Excellent troubleshooting skills including the ability to quickly recognize diverse failure modes and corresponding symptoms
  • Knowledge of C/C++, Fortran, CUDA, OpenCL, OpenMP, and MPI for scientific computing
  • Knowledge of configuration management tools such as Puppet, Chef, Ansible, Salt, etc.
  • Knowledge of container technologies such as Docker, Singularity, and Kubernetes
Responsibilities:
  • System Administration - Installs, tunes, applies patches, and upgrades UNIX/Linux operating systems in an HPC environment. Installs and applies firmware patches and upgrades to all components. Coordinates with vendors to manage the installation, integration, and maintenance of both critical and non-critical software and hardware. Repairs hardware and subsystems or works with vendor field engineers to resolve hardware or subsystem problems.
  • Systems Analyst - Analyzes and monitors HPC system performance. Documents and/or coordinates the development and use of software support structures for the collection and tracking of HPC system utilization statistics. Documents and/or coordinates all changes to hardware and systems software. Participates in the development and implementation, at all levels, of security measures and disaster recovery procedures.
  • Technologist - Provides technical leadership and/or oversight for enterprise-wide projects or operations. Participates in strategic planning to enhance the University's HPC infrastructure. Leads or provides technical oversight in the evaluation of new HPC technologies (e.g., processor, mass storage, interconnect, cloud resources); identifies and recommends cost-effective combinations of these as future acquisitions in order to meet the University's research missions. Assists in the procurement and purchase process of other hardware and software. Assists in the development of designs for specialized HPC computing clouds. Performs other duties as assigned.
Instructions to Applicants: Applications received by Texas A&M University must either have all job application data entered or a resume attached. Failure to provide all job application data or a complete resume could result in an invalid submission and a rejected application. We encourage all applicants to upload a resume or use a LinkedIn profile to pre-populate the online application.

In accordance with the federal contractor vaccination mandate, specific facilities at The Texas A&M System may be considered a covered contractor workplace with covered contractor employees. Therefore, successful applicants for this position may be subject to the federal mandate and will be required to be fully vaccinated against COVID-19 as a condition of employment unless an approved medical or religious accommodation is in place.

All positions are security-sensitive. Applicants are subject to a criminal history investigation, and employment is contingent upon the institution's verification of credentials and/or other information required by the institution's procedures, including the completion of the criminal history check.

Equal Opportunity/Affirmative Action/Veterans/Disability Employer committed to diversity.

Posted: 01/28/2022