A day in the life of an SRE at LinkedIn:
http://engineering.linkedin.com/day-life/crash-course-linkedins-global-site-operations
Guide to Interviewing for Site Reliability Engineering at LinkedIn
The Role:
Site Reliability Engineers (SRE) at LinkedIn fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale.
Technical Interviews
You can expect questions that evaluate your skills in five primary areas.
1. Unix Internals
Know what happens under the hood.
Tip: Check out the book Advanced Programming in the UNIX Environment by W. Richard Stevens and Stephen A. Rago (3rd edition, 2013).
2. Network Infrastructure
Interviewers will be looking for in-depth knowledge of networking theory. Sample topics include protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing.
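To make one of these topics concrete, here is a minimal sketch of the routing decision a host makes for an outgoing IP packet: deliver directly if the destination is on the local subnet, otherwise hand the packet to the default gateway. The function names same_subnet and next_hop and the addresses are illustrative assumptions, not part of any interview question.

```python
import ipaddress


def same_subnet(local_interface: str, destination: str) -> bool:
    """Return True if destination falls inside the local interface's network.

    local_interface is an address with prefix length, e.g. "192.168.1.10/24".
    """
    iface = ipaddress.ip_interface(local_interface)
    return ipaddress.ip_address(destination) in iface.network


def next_hop(local_interface: str, destination: str, gateway: str) -> str:
    """Pick the next hop: the destination itself if on-link, else the gateway."""
    if same_subnet(local_interface, destination):
        return destination
    return gateway


if __name__ == "__main__":
    # Hypothetical host configuration used only for illustration.
    print(next_hop("192.168.1.10/24", "192.168.1.200", "192.168.1.1"))
    print(next_hop("192.168.1.10/24", "8.8.8.8", "192.168.1.1"))
```

In the on-link case the host resolves the destination's MAC address directly; in the off-link case it resolves the gateway's MAC address instead, which is a useful way to connect the IP and MAC address topics above.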
3. Systems Troubleshooting
Interviewers will be looking for a logical and structured approach to problem solving.
4. Web Architecture
Candidates should understand what it means to operate services at scale in dynamic web environments. System design questions are meant to test your problem-solving skills.
Tips: Make sure you explain your thought process. Explain and justify your assumptions. Ask qualifying questions.
5. Coding
The technical phone interview for coding will test your ability to write programs in commonly used programming languages. You will be expected to write programs in one of the following languages: Python, Perl, Ruby, or PHP.
Tips: While we're OK with other languages, the questions are time-limited, and overly verbose languages will handicap you. The tasks in these exercises cover common operational needs, with a particular focus on system administration tasks. If you are out of practice, you should brush up on file system processing, text parsing, and basic language constructs like functions, loops, and simple data structures.
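As a rough idea of the kind of text-parsing exercise worth practicing, here is a sketch that counts HTTP status codes in web server access logs. It assumes the common/combined access log layout; the regular expression, the function name count_statuses, and the sample lines are all made up for illustration and are not actual interview material.

```python
import re
from collections import Counter

# Assumed log layout (common/combined access log format):
# client ident user [timestamp] "request" status size ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) '
)


def count_statuses(lines):
    """Count occurrences of each HTTP status code, skipping malformed lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group("status")] += 1
    return counts


if __name__ == "__main__":
    # Invented sample data for illustration only.
    sample = [
        '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326',
        '10.0.0.2 - - [10/Oct/2023:13:55:37 +0000] "GET /api HTTP/1.1" 500 512',
        'this line is not a valid log entry',
        '10.0.0.1 - - [10/Oct/2023:13:55:38 +0000] "GET /a.css HTTP/1.1" 200 128',
    ]
    for status, count in count_statuses(sample).most_common():
        print(status, count)
```

The function accepts any iterable of lines, so an open file object can be passed directly and large logs are processed one line at a time rather than loaded into memory.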