Lead Site Reliability Engineer - Chase
JPMorgan Chase & Co.
Office
Dublin, Ireland
Full Time
Take on a critical role in shaping the future of a globally recognized firm, and make a direct and significant impact in a field built for top achievers in site reliability.
As a Lead Site Reliability Engineer at JPMorgan Chase within the International Consumer Bank, you hold a leadership role in your team, demonstrate strong knowledge across multiple technical domains, and advise others on the technical and business issues they face. You lead and conduct resiliency design reviews, break complex problems down into digestible work for other engineers, act as a technical lead for medium to large-sized products, and provide advice and mentoring to other engineers.
Job Responsibilities
- Demonstrates and champions site reliability culture and practices and exerts technical influence throughout your team
- Leads initiatives to improve the reliability and stability of your team’s applications and platforms using data-driven analytics to improve service levels
- Collaborates with team members to identify comprehensive service level indicators, and with stakeholders to establish reasonable service level objectives and error budgets with customers
- Demonstrates a high level of technical expertise within one or more technical domains and proactively identifies and solves technology-related bottlenecks in your areas of expertise
- Acts as the main point of contact during major incidents for your application and demonstrates the skills to identify and solve issues quickly to avoid financial losses
- Documents and shares knowledge within your organization via internal forums and communities of practice
Required qualifications, capabilities, and skills
- Demonstrate deep proficiency in reliability, scalability, performance, security, and enterprise system architecture
- Implement site reliability best practices and drive toil reduction within applications or platforms
- Program fluently in Python
- Apply advanced knowledge of software applications and technical processes, with emerging expertise in technical disciplines
- Utilize observability tools for monitoring, SLO alerting, and telemetry collection (e.g., Grafana, Dynatrace, Prometheus, Datadog, Splunk)
- Use continuous integration and delivery tools such as Jenkins, GitLab, and Terraform
- Work with Amazon DynamoDB or similar NoSQL database technologies
- Integrate DynamoDB with other AWS services and navigate the AWS ecosystem
- Manage containers and orchestration using ECS, Kubernetes, and Docker
- Troubleshoot and resolve issues involving common networking technologies
- Identify and solve problems involving complex data structures and algorithms, while self-educating and collaborating across stakeholder groups
- Experience or interest in Site Reliability Engineering (SRE) practices and principles
- Familiarity with FastAPI/GraphQL
- Familiarity with front end development frameworks, specifically React.js
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.