Disclaimer: heavily inspired by the PL infra job post
Scope for Medusa:
- Develop monitoring best practices for both the Medusa node software and the contracts
- Work with the web3 engineer to set up alerts that detect attacks on the contracts and failures on the network
- Own the deployment infrastructure (Kubernetes, Docker, etc.) used to run a node, help partners deploy, and debug deployment issues raised by the community
Potentially part-time at the beginning
Role:
- Design, develop, and maintain infrastructure in a mix of cloud-based and traditional environments to power large-scale, massively distributed, fault-tolerant services while ensuring the highest security standards.
- Work with standard tools to monitor and inspect the different deployments -- choosing and creating tooling to quickly assess health, evolution, and any adjustments needed.
- Work alongside a cross-functional team including software design & development, product management, and ecosystem engineers. Provide technical leadership, support, and best practices to stakeholders across the PL Network (inside and outside the org).
- Incorporate monitoring, alerting, and observability to support services that allow us to maintain the highest standards of security, reliability and uptime.
- Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and developer velocity.
Ideal Expectations:
- Have experience deploying production-grade infrastructure in an automated, reliable, and portable manner using Continuous Integration & Continuous Deployment tools such as GitHub Actions, CircleCI, TravisCI, or similar.
- Comfortable with software-defined infrastructure & configuration management tools such as Terraform, Ansible, or similar.
- Experience with contemporary monitoring & metrics tools such as Prometheus, Grafana, InfluxDB, or similar.
- Experience with container orchestration technologies such as Kubernetes.
- Deep understanding of and experience with core Internet protocols (BGP, IP, TCP, DNS, TLS, HTTP), data caching in networks, and Linux system administration.
- Have experience designing, administering, and securing cloud environments.
- Place high value on documentation and on sharing knowledge through effective written and verbal communication with internal and external stakeholders.