Our product is a cloud-based platform for real-time collaboration,
making it easy for individuals and teams to create, interact with, and
share content. The client applications - large touch-screen monitors
or video walls, PC and Mac web browsers, iPad and Android mobile apps
- all sync in real time, allowing multiple users to interact and see
each other's work on the same screen. Reporting to the Manager of
Cloud Operations, this position will be responsible for systems and
application service uptime in a high-availability, customer-facing,
business-critical 24/7 SaaS environment.
POSITION DUTIES & RESPONSIBILITIES:
* Application release management & configuration, upgrades/patches &
support of Unix/Linux systems - applications on ******* and Ruby on
Rails in a SaaS environment.
* Identify, diagnose, and resolve complex technology issues
quickly and efficiently in a live production environment; leverage
those events to improve current technology & processes so that such
issues do not recur.
* Work closely with the Engineering team to escalate issues for
triage and resolution.
* Routinely review tickets and diagnostics to identify trends and
* Hands-on implementation & upgrade of tools for monitoring,
trending & diagnostics.
* Audit proactive monitoring of all systems to detect and resolve
problems to ensure uninterrupted operation of all infrastructure
* Update corresponding documentation on installation process &
SKILLS AND EXPECTATIONS:
* BS degree in an IT-related technical field or equivalent
experience, defined as a minimum of 5 years working in a DevOps group.
* Linux environment experience is a MUST; SaaS environment
experience is a strong plus.
* Strong technical systems & application operations/release
management experience with a passion for troubleshooting and triage of
incidents, bringing issues to rapid resolution.
* Extensive working knowledge of as many of the following
technologies and areas as possible:
* Systems - Linux, Unix, Java & open source software
* Command of popular scripting languages to enable automation of
release processes, monitoring, trending, and alerting - ideally a
working knowledge of Python & Shell.
* Automation using Chef/Puppet/Ansible in a cloud environment
* Working knowledge of databases
* Good networking fundamentals: protocols, load balancers, VPN,
switches/routers/firewalls, LDAP, SNMP, SMTP
* Good understanding of file system technologies - build and/or
troubleshoot file system issues
* Virtualization/Cloud technologies - Strong working knowledge of
AWS with a good understanding of other technologies like OpenStack,
OpenShift, Google Cloud
* Application Servers
* Web servers/reverse proxies such as Apache, nginx and HAProxy
* Web application frameworks such as *******, Ruby on Rails, etc.
* Monitoring, trending & diagnostics tools including Nagios, Cacti,
Zenoss, Graphite, etc.
* Logging tools such as Splunk, ELK stack, etc.
* Using source code control systems such as SVN and Git (or similar)
* Work/defect tracking systems such as Pivotal/JIRA
* Wiki tools such as Confluence
* Knowledge of the use and maintenance of continuous integration and
continuous deployment systems.
* Ability to prioritize & balance activity between projects with
longer-term impact and immediate production-critical requirements,
with a customer focus.
* Must be a self-starter and require minimal guidance.
* Excellent oral and written communication skills essential.
* Ability to work in a collaborative environment essential.
* Ability to join the on-call rotation & coordinate work in
production-critical situations is essential.