Progressive Staffing - Careers

Site Reliability Engineer / DevOps - Sunnyvale, CA

Date Posted:

06-10-20 (08:24 AM)


Sunnyvale, California, United States





Job Description:

Handle all infrastructure change needs through any non-functional threads.
Cloud Adoption
Capacity Adjustments based on Stress Test runs
ECV check enablement / disablement
DR Tests
Periodic performance evaluations through stress tests and profiling of application and system characteristics
Solve system-level findings that need a solution through point 1 above
Consult with application teams on application-level solutions based on industry / Walmart standards.
Splunk dashboards
Instrumentation usage - Consume info and provide visibility across the board
Monitoring and Alerting - Share SLI / SLA / SLO baselines and trend related metrics
Stress test support
Observe and report on system characteristics across all Site-facing services. During a stress test, functionality is typically not validated; only system scalability and other system characteristics are. Observing, reporting, and taking actions such as node additions can be done through SRE.
Introducing new technology or platform through evaluations
Ex: Uber move to Selar DB
Evaluation of new tech in Azure platform
CD / CI Management
Process management
Tools that aid release checklist validations with clear gates
CD / CI pipeline monitoring and metrics
Should be comfortable and effective at reading and building metrics and graphs, diving through logs, and SSHing into nodes. Use Postman etc. to identify request/response data and call flows. Ramp up quickly on the system domain knowledge (Item, S&B, IRO, Rollups/Panama/Uber ecosystem).

Key Skills:


Experience: Senior candidate with 10+ years.