Pune, IN

Site Reliability Engineer

Department: Engineering

About the Role

The Ad Server and RTB Production Infrastructure is pivotal to ensuring the reliability, availability, and overall excellence of our software applications. As a Site Reliability Engineer, you will be responsible for this infrastructure. Your essential duties include ensuring the seamless operation and optimal performance of large-scale distributed software applications. Your role revolves around maintaining a robust, high-performing environment, contributing to the reliability of our services, and devising solutions that support 24/7 availability. By applying your technical expertise and dedication, you help maintain a seamless experience for our users while upholding the highest standards of operational excellence. Your specific responsibilities include:

What You'll Do

  • Operational Support  
    • Be a primary point of contact for operational support of multiple large-scale distributed software applications in the Ad Server environment.  
    • Monitor application availability, promptly detect anomalies, analyze their impact, debug problems in production, and follow up on resolution by working closely with the engineering team.
    • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.  
    • Diligently work with the engineering team to expedite the resolution of incidents and ensure a swift return to normal operations.  
    • Be innovative in building dashboards, adding metrics, writing automation scripts to reduce operational toil, and streamlining processes to enhance system reliability and stability.
    • Design and construct software and systems to effectively manage the Ad Serving platform, its underlying infrastructure, and applications. 
  • On Call Availability and Support  
    • Work in shifts to provide continuous on-call support for the production systems and resolve issues on your own by using predefined handbooks.  
    • Show a sense of urgency for high-priority issues and arrange war rooms to resolve the problems.  
    • Provide timely updates on high-priority issues and hand over to the next shift when a problem needs to be worked on 24/7.
    • Conduct post-incident reviews to identify root causes, recommend preventive measures, and contribute to a culture of learning and improvement.

We'd Love for You to Have

  • Three or more years of experience in software development.
  • Ability to program in languages such as C or C++ and scripting languages such as Shell or Python.
  • Prior experience in technical engineering is a plus.
  • A proactive approach to identifying problems, performance bottlenecks, and areas for improvement.
  • Must know networking, database (MySQL), and Linux system concepts, as well as debugging and analyzing core dumps.
  • Hands-on experience with monitoring and observability tools like Grafana, Nagios, Influx, ELK, etc.  
  • Familiarity with container and orchestration tools like Docker, and incident management systems like Zenduty.
  • Excellent communication and collaboration skills, with the ability to work effectively across teams.  
  • A self-motivated, positive mindset when examining incidents.
  • Excellent interpersonal, written, and verbal communication skills.
  • B.E./B.Tech. in Computers or equivalent.

Additional Information

Return to Office: PubMatic employees around the globe have returned to our offices on a hybrid work schedule (3 days “in office” and 2 days “working remotely”) that is intended to maximize collaboration, innovation, and productivity among teams and across functions.

Benefits: Our benefits package includes the best of what leading organizations provide, such as stock options, paternity/maternity leave, healthcare insurance, and broadband reimbursement. In addition, when we’re in the office, we all benefit from a kitchen stocked with healthy snacks and drinks, catered lunches, and much more!

Diversity and Inclusion: PubMatic is proud to be an equal opportunity employer; we don’t just value diversity, we promote and celebrate it. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

About PubMatic

PubMatic is one of the world’s leading scaled digital advertising platforms, offering more transparent advertising solutions to publishers, media buyers, commerce companies and data owners, allowing them to harness the power and potential of the open internet to drive better business outcomes.

Founded in 2006 with the vision that data-driven decisioning would be the future of digital advertising, we enable content creators to run a more profitable advertising business, which in turn allows them to invest back into the multi-screen and multi-format content that consumers demand.
