Site Reliability Engineer (SRE) Lead
JPMorgan Chase & Co
Chicago, IL, United States
2d ago

As a Site Reliability Engineer (SRE) you will help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems.

Much of our support and software development focuses on optimizing existing systems, building infrastructure and reducing work through automation.

You’ll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks. In this environment you’ll take the lead on relevant projects, backed by an organization that provides the mentorship you need to learn and grow.

As an SRE you’ll be focused on running better production applications and systems.

Responsibilities:

  • Design, code, test and deliver software to automate manual operational work
  • Troubleshoot priority incidents, facilitate blameless post-mortems and ensure permanent closure of incidents
  • Engage with the development team throughout the life cycle to help develop software for reliability and scale, minimizing later refactoring or changes
  • Identify application patterns and analytics in support of better service level objectives
  • Design self-healing and resiliency patterns
  • Design automated software and product upgrades, change management, and release management solutions
  • Coach or manage teams as applicable
  • Participate in the 24x7 support coverage as needed

Qualifications:

  • Bachelor’s degree or equivalent experience in a software engineering discipline
  • Expertise in designing, coding, testing, and delivering software in at least one technology stack
  • Proficiency in one or more technology domains; may be a cross-domain expert able to solve complex, mission-critical problems within a business or across the firm
  • Working knowledge of infrastructure components (e.g. routers, load balancers, cloud products, container systems, compute, storage and networks)
  • Excellent debugging and troubleshooting skills
  • 5+ years of hands-on software engineering and/or site reliability engineering experience with the following technologies: Java, UNIX, and Oracle
  • Experience implementing and/or using Git/Stash, Jenkins, JIRA, and code quality and security scanning tools
  • Experience developing monitoring and log analysis tools to manage operations
  • Exposure to AppDynamics, Splunk, and Elasticsearch / Kibana would be a plus
  • Experience designing and contributing to performance monitoring and capacity management tools
  • Experience leading a team of engineers or production management personnel
  • Knowledge of cloud-based technologies and tools such as Kubernetes, AWS, and PCF, especially for deployment, monitoring and operations
  • Proficient in making service-level changes to a system and troubleshooting its components
  • Proficient in the development of automated tools, systems and services in multiple technology domains
  • Experience in Agile development techniques, including Scrum
  • Proficient knowledge of one or more infrastructure components such as networking, cloud services, orchestration tools, containerization, compute and storage systems

As a JPMorgan Chase & Co. Site Reliability Engineer (SRE), you will combine software and systems to help us build a world-class engineering function.

Working with your team, you’ll focus on improving our production applications and systems to creatively solve operations problems.

Much of our support and software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation.