Core4ce Careers

An Unwavering Force for National Security

Site Reliability Engineer



Information Technology

Remote • ID: 663-383 • Full-Time/Regular

Core4ce is seeking a Mid-Level Site Reliability Engineer for a position within our R&D organization. This individual will play a key role in ensuring the reliability, scalability, and performance of mission-critical systems. They will be responsible for building and maintaining highly available infrastructure using Amazon Web Services and ensuring smooth operations for mission-driven applications. 

The ideal candidate will thrive in a fast-paced environment, working collaboratively with development and operations teams to design, implement, and maintain resilient systems. This role includes implementing automation, improving monitoring and alerting systems, and optimizing infrastructure to support critical mission objectives. 

 

Responsibilities

·       Collaborate with members of the R&D team to ensure reliable and scalable deployments. 

·       Design and implement monitoring, logging, and alerting solutions to identify and resolve issues. 

·       Automate system administration tasks to improve efficiency and reduce operational workload. 

·       Create CloudFormation templates to provision infrastructure repeatably. 

·       Manage and optimize infrastructure in both cloud and on-premises environments. 

·       Ensure infrastructure security and compliance with organizational and governmental standards. 

·       Develop and maintain CI/CD pipelines to enable fast and reliable code deployment. 

·       Troubleshoot and resolve incidents to minimize downtime and impact on mission objectives. 

·       Provide input on architectural decisions to enhance system reliability and performance. 

 

Requirements

·       Ability to obtain a U.S. Security Clearance. 

·       3+ years of experience in site reliability engineering or system administration. 

·       3+ years of experience programming in Bash, Python, and/or Java. 

·       Java system administration or JVM tuning experience is a plus. 

·       3+ years’ experience with containerization and orchestration tools such as Docker and Kubernetes (or managed equivalents like ECS and EKS). 

·       Strong experience with AWS infrastructure and services such as EC2, S3, Lambda, and CloudFormation. 

·       AWS Associate-level certification (Developer, SysOps Administrator, or Solutions Architect) is preferred. 

·       Experience with monitoring and observability tools (e.g., ELK Stack, Prometheus, Grafana). 

·       Proficiency in setting up CI/CD pipelines using tools such as Git and GitHub Actions. 

·       Understanding of networking, security, and system optimization principles. 

·       Experience troubleshooting and resolving incidents in a production environment. 

·       Ability to work remotely on a distributed team. 

·       Experience working on a team that follows an Agile delivery process such as Scrum or Kanban. 

 

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex (including pregnancy), national origin, disability, veteran status, age, genetic information, or other legally protected status.

