Principal Production Engineer · Cloud Infrastructure Specialist · Site Reliability Engineer
More About Me
Principal Production Engineer with 15+ years of experience optimizing cloud infrastructure, automation, and developer experience at scale. Proven expertise in Kubernetes, AWS/GCP, and infrastructure-as-code, consistently delivering 99.9%+ uptime for systems serving millions of users. Specialized in cost-effective architecture design, AI/ML infrastructure enablement, and security-first automation with a track record of reducing operational friction while maintaining robust security postures.
I have led critical cloud infrastructure and production engineering projects across the financial services and e-learning sectors. My work ranges from maintaining highly available systems at scale to implementing AI/ML infrastructure. That range has strengthened my ability to balance technical excellence with business objectives and to deliver robust, cost-effective solutions that scale with organizational growth.
Comprehensive expertise in cloud infrastructure, container orchestration, and production engineering, with deep knowledge of modern DevOps practices and security-first automation.
From my early experiences in academia to my tenure in various industries, each opportunity has shaped my skills and perspective, culminating in a well-rounded professional equipped to tackle diverse challenges.
Feb 2022 - Present
Design and maintain highly available infrastructure supporting millions of learners globally, achieving 99.9%+ uptime
Maintain clusters scaling to 1,000+ nodes across multiple AWS regions
Led Elasticsearch migration, reducing operational overhead by 40% while enabling AI-powered search
Implemented SOC 2-compliant security controls, improving operational efficiency by 35%
Jun 2012 - Feb 2022
Abstracted infrastructure complexity, reducing app deployment time from days to hours
Maintained stringent security standards while minimizing developer disruption through automated scanning
Led training initiatives to establish and grow a 15-person support team in the Dublin operations hub
Authored runbooks and conducted workshops, improving mean time to resolution by 45%
May 2011 - May 2012
Managed all IT systems and infrastructure supporting company operations across Dublin and remote offices
Implemented operational improvements that enhanced customer experience and reduced ticket resolution time by 30%
Maintained on-site server infrastructure and customer-provisioned hardware with 99.5% uptime SLA
Jan 2006 - May 2011
Maintained and operated enterprise IT systems supporting 500+ employees across the EMEA region
Conducted operational analysis identifying service improvements, resulting in a 25% reduction in system downtime
Oversaw maintenance of on-site data centers and customer-provisioned infrastructure across multiple facilities
Jul 2006
Completed Bachelor's in Computer Applications with Honours classification
Active member of the Computer Networking Society (Redbrick); collaborated on inter-college network infrastructure projects
Interested in discussing cloud infrastructure, DevOps transformation, or production engineering challenges? I'd love to hear from you.