Spanish bilingual and Hispanic jobs since 1997. Diversity job fairs since 2006.
Hispanic Diversity Recruitment - best jobs for hispanic, latino & bilingual (spanish & portuguese) jobseekers
 Manager, Site Reliability Engineering - San Diego, California, United States

Job information
Posted by: Apple 
Hiring entity type: Retail 
Work authorization: Not Specified for United States
Position type: Direct Hire, Full-Time 
Compensation: ******
Benefits: See below
Relocation: Not specified 
Position functions: Computers - Programming Languages
Computers - Platforms
Computers - Networks
Computers - Software Engineer
Travel: Unspecified 
Accept candidates: from anywhere 
Languages: English - Fluent
Minimum education: See below 
Minimum years experience: See below 
Resumes accepted in: English
Cover letter: No cover letter requested
Job code: 200296002 / Latpro-3831769 
Date posted: Oct-13-2021
State, Zip: California, 92101


Manager, Site Reliability Engineering

San Diego, California, United States

Software and Services


Posted: Oct 7, 2021

Weekly Hours: 40

Role Number: 200296002

Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. The Apple News team is looking for a manager with strong experience leading world-class production environments. You will manage a global team that uses technology to automate solutions and optimize outcomes, focusing on application infrastructure in a fast-changing world of software delivery. We seek a passionate and high-energy manager of a global Site Reliability Engineering team to continue our focus on providing our customers the highest quality Apple Services experience. Our applications, including News, Stocks, and Weather, have to scale globally, remain highly available, and "just work." Here you will lead a dynamic group, with the rare and rewarding opportunity to help maintain world-class uptime of current and upcoming products that will inspire millions of Apple's customers every day. If you love engineering and running systems and infrastructure that will delight millions of customers, then this is the place for you!

Key Qualifications

  • Strong sense of ownership and integrity demonstrated through clear communication and collaboration
  • Experience managing and leading a Site Reliability Engineering team for a large-scale global 24/7 production environment
  • Proven leadership experience in an environment operating at scale and with distributed teams
  • Experience implementing and coordinating telemetry using monitoring and observability tools such as Splunk, Grafana and Prometheus
  • Drive to automate and improve manual operations through repeated iteration
  • Experience working with systems built with open source storage and search technologies including Cassandra, Kafka, Solr, Postgres and Redis is a plus
  • Experience working with cloud services such as AWS EKS, EC2, and S3 and container orchestration using Kubernetes
  • Experience with scale testing, disaster recovery, and capacity planning


  • Hire and develop a distributed team of extraordinary Site Reliability Engineers.
  • Design and maintain monitoring and alerting in production and qualification environments, with 24x7 coverage and shifts, for a myriad of applications in an agile and dynamic organization.
  • Develop and encourage strong troubleshooting capabilities that are used daily; hire and train successful engineers who take steps on their own to isolate issues and resolve root causes through investigative analysis.
  • Develop and run a thorough and reliable process for incident resolution and communication for all technical production issues.
  • Investigate new products/features and develop robust capacity plans and estimates of compute, storage, and service requirements.
  • Set up processes, write justifications, train users in sophisticated topics, write status reports, and interact with other Apple staff and management.
  • Improve the stability, security, efficiency and scalability of all production systems.

Production systems that include multiple geographically dispersed data centers and serve hundreds of millions of users present unique challenges. As SRE Manager at Apple, you'll need to solve these problems using data, collaboration, and your own expertise. SREs at Apple own the full infrastructure stack, from device driver performance debugging to content delivery network traffic management. We run a mix of open source, vendor licensed, and internally developed tools to perform functions such as system configuration management, provisioning, software deployment, logging, and monitoring. You'll learn these tools and have opportunities to improve them.

Our team is collaborative; we work closely with partner teams to deliver the best results for Apple. We strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.

Education & Experience

BS/MS in Computer Science or Equivalent


See job description


Apple requires you to fill in their on-line form which will open in a different window.
