Site Reliability Engineer (Cloud Operations)

Location: Anywhere (US) – Eastern Time Zone Preferred

At Network to Code, our dedication to pioneering network automation technologies sets us apart from the rest. We don’t just keep up with trends; we define them. Our innovative solutions revolutionize the way organizations deploy, manage, and utilize their networks.

Through a combination of managed and professional services, we implement data-driven network automation strategies grounded in NetDevOps principles. This approach enhances reliability, boosts efficiency, fortifies security, and slashes costs for our clients.

As proud sponsors of Nautobot, the premier open source Network Source of Truth and Automation platform, we’re not just contributing to the industry; we’re leading it. Our efforts haven’t gone unnoticed. We’ve been honored as an Inc. Best Workplace and featured in the prestigious Inc. 5000 list. Additionally, our groundbreaking work has earned recognition in multiple Gartner reports, solidifying our position as trailblazers in the field of network automation.

As a Site Reliability Engineer (SRE) on the Nautobot Cloud Operations team, you will help deliver and maintain our managed Nautobot SaaS offering. Your primary focus will be operating, supporting, and evolving customer environments in AWS—especially EKS, EC2, and related services—while ensuring uptime, performance, and security. You will also handle occasional escalations for legacy customers running on AKS or on-premises deployments.

This role combines operational excellence with a mindset for continuous improvement. You will work across infrastructure, CI/CD pipelines, and observability tooling, applying DevOps best practices to deliver a reliable, scalable, and secure platform for our customers.

A day in the life

Operate and support Nautobot Cloud deployments in AWS, including EKS, EC2, RDS, and associated services.
Help resolve escalated issues for customers running on other Kubernetes platforms, including AKS or on-premises deployments, as needed.
Deploy and update Nautobot instances using Helm charts, Kubernetes manifests, and automation workflows.
Implement improvements to CI/CD pipelines (GitHub Actions, Terraform, Ansible) for provisioning, upgrades, and configuration management.
Maintain observability tools (Prometheus, Loki, Grafana) to ensure accurate monitoring, alerting, and logging.
Troubleshoot application and infrastructure issues across containerized environments.
Collaborate with engineers across Cloud Operations, Nautobot Core, and Nautobot Apps teams to deliver cross-functional solutions.
Contribute to documentation for operational runbooks, troubleshooting guides, and architecture diagrams.
Participate in Agile ceremonies, including standups and retrospectives.

What you bring

Passion for reliability, customer success, and operational excellence.
Ability to troubleshoot complex distributed systems and quickly identify root causes.
Strong communication skills—able to clearly convey technical concepts to both peers and customers.
A proactive mindset, looking for opportunities to improve processes and prevent issues before they occur.
Flexibility to adapt to changing priorities and technologies.

What you have

3–5 years of experience applying DevOps or SRE practices to production systems.
2+ years of experience operating workloads in AWS, with a focus on EKS, EC2, IAM, and networking.
2+ years working with Kubernetes (preferably in production) and Helm.
Experience with IaC tools such as Terraform and configuration management tools like Ansible.
Familiarity with CI/CD pipelines (GitHub Actions, Jenkins, CircleCI, etc.).
Proficiency in scripting languages such as Python or Bash.
Comfortable working in Linux-based environments.
Familiarity with monitoring, logging, and alerting solutions (Prometheus, Loki, Grafana, ELK).
Networking fundamentals (equivalent to a CCNA-level understanding) are a plus.

Our teams are led by some of the brightest minds in network automation, but we don’t just lead; we mentor, collaborate, and learn from each other every step of the way. With a global presence, we leverage technology to foster close-knit relationships, no matter where in the world we are. Virtual team events and happy hours keep us connected and inspired.

But at the heart of it all, it’s our people who make us who we are. Our culture is built on inclusivity, collaboration, and a shared vision for the future. We believe in the power of diversity and value every individual’s unique perspective. Whether you’re a seasoned pro or a rising star, if you’re passionate about shaping the future of network automation, there’s a place for you here.

Join us at Network to Code, where every voice is heard, every idea is valued, and every day brings new opportunities for growth and innovation. We’re not just building networks; we’re building a brighter future together.

Network to Code is an Equal Opportunity Employer. Network to Code does not discriminate on the basis of race, religion, color, sex, gender identity, sexual orientation, age, non-disqualifying physical or mental disability, national origin, veteran status or any other basis covered by appropriate law. All employment is decided on the basis of qualifications, merit, and business need.
