CLOUD-OPS ENGINEER

Organisation:

Singapore Academy of Law


 

YOU WILL:

  • Be responsible for the scalability, availability, performance, cost and resource efficiency, and capacity planning of the cloud infrastructure;
  • Spend less than 50% of your time working through our vendors to carry out day-to-day IT operations such as performance monitoring, attending to issues, manual intervention and service requests (infrastructure provisioning, deployment, data/system backups, patching and disaster recovery):
    • Be responsible for system administration tasks such as OS & application patching, software upgrades, backup, restore, etc. for our cloud infrastructure (AWS, Azure);
    • Drive troubleshooting, incident response and resolution, and blameless post-mortems;
    • Be responsible for maintaining services once they go live by measuring and monitoring availability, performance and overall system health;
    • Strive to streamline and secure cloud infrastructure management by proactively monitoring and protecting system boundaries, tracking application deployment and release status, and automating manual tasks.
  • Spend most of your time on development tasks such as writing Infrastructure-as-Code (IaC), continuous improvement, and driving initiatives to improve automation, scalability and reliability:
    • Develop automation code for change control, configuration management, deployment and maintenance of infrastructure and applications through a CI/CD pipeline;
    • Improve service resiliency through high levels of automation, to effectively detect, predict and prevent issues in the environment;
    • Develop and fine-tune change & incident management processes across teams;
    • Scale systems sustainably through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity;
    • Automate provisioning through IaC and Configuration-as-Code (CaC);
    • Review resources and workloads to optimise cost.

 

YOU HAVE:

  • A genuine enjoyment of cloud, systems and security management, and of relentlessly automating work.
  • Relevant cloud experience, attested by certifications in SysOps and/or DevOps (AWS preferred).
  • Three or more years of hands-on experience operating and maintaining systems running on cloud infrastructure (AWS preferred).
  • Proven experience with IaC/CaC tools (e.g., Ansible, Terraform).
  • Hands-on experience administering Unix & Windows operating systems as well as automating with shell scripts.
  • Deep appreciation of infrastructure and application monitoring, logging, alerting, release and configuration management.
  • Deep understanding of networking (e.g., HTTP, TCP/IP, routing tables, network topology, load balancers, DNS, NTP).
  • Experience with standard IT security practices (e.g., encryption, certificates, key management).

YOU WILL CATCH OUR ATTENTION IF YOU:

  • Have proven experience in Site Reliability Engineering (SRE), DevSecOps or FinOps practices and methodologies.
  • Have experience in operating containerised workloads (using Docker, Kubernetes).
  • Have experience operating internet-facing, 24/7, high-load applications (e.g., e-commerce).

 

How to apply:

Interested candidates are invited to apply here.

Only shortlisted candidates will be notified.