The CloudFront team is seeking a highly skilled and motivated Data Scientist to lead data analysis and help build a world-class capacity planning, forecasting, and data analytics platform.
We are responsible for planning server and network capacity in advance by analyzing resource usage trends.
The team is responsible for providing actionable capacity, availability and performance metrics, causal attribution, and future predictions in a format that is easy to digest at the highest levels in the organization.
These business insights highlight capacity shortages, opportunities to optimize underutilized capacity, and performance outliers, and they support well-informed, data-based business decisions.
Using this information, we will help business leaders develop a strategy for which company-wide investments to make and how to prioritize them.
As an Amazon Data Scientist, you will be working in one of the world's largest data warehouse environments. You will help build one of the largest cross-functional databases in Amazon, translating business questions into quantitative mechanisms used by hundreds of users worldwide.
You'll research and reuse existing data sources, expose and measure the current performance of our systems, find and quantify opportunities for improvement, and dive deep into existing algorithms to explain unexpected performance or measure causal relationships.
You will collaborate with engineering, research, and business teams for future innovation. You need to be a sophisticated user of data querying tools and advanced quantitative and modeling techniques, and an expert at synthesizing and communicating insights and recommendations to audiences of varying levels of technical sophistication to drive change.
Translate business questions and concerns into specific quantitative questions that can be answered with available data using sound methodologies.
In cases where questions cannot be answered with available data, work with engineers to produce the required data
Retrieve, synthesize, and present critical data in a format that is immediately useful for answering specific questions or improving system performance
Analyze historical data to identify trends and support decision making
Apply statistical or machine learning knowledge to specific business problems and data
Improve upon existing methodologies by developing new data sources, testing model enhancements, and fine-tuning model parameters
Provide requirements to develop analytic capabilities, platforms, and pipelines
Formalize assumptions about how systems are expected to work, create statistical definitions of outliers, and develop methods to systematically identify these outliers.
Work out why such examples are outliers and determine whether any action is needed
Given anecdotes about anomalies, design strategies to quantify the overall impact of such anomalies, deep dive to explain why they happen, and identify fixes
Build decision-making models and propose solutions for business problems
Deliver written and verbal presentations that share insights and recommendations with audiences of varying levels of technical sophistication
Amazon is an Equal Opportunity-Affirmative Action Employer Minority / Female / Disability / Veteran / Gender Identity / Sexual Orientation
Master’s degree in Statistics, Applied Mathematics, Operations Research, Economics or a related quantitative field
3 years of experience with data querying languages (e.g. SQL), scripting languages (e.g. Python), or statistical / mathematical software (e.g. R, Weka, SAS, Matlab)
3 years of experience articulating business questions and using quantitative techniques to arrive at a solution using available data
Proven success in communicating with users, other technical teams, and senior management to collect requirements and describe data modeling decisions and data analysis strategy
Experience in working and delivering end-to-end projects independently
Experience providing technical leadership and mentoring other engineers on best practices in quantitative analysis
Depth and breadth in quantitative knowledge. Excellent quantitative modeling, statistical analysis, and problem-solving skills.
Sophisticated user of statistical methods and tools
Demonstrable track record of dealing well with ambiguity, prioritizing needs, and delivering results in a dynamic environment
Ability to develop experimental and analytic plans for data modeling processes, use strong baselines, and accurately determine cause-and-effect relationships
Experience processing, filtering, and presenting large quantities (millions to billions of rows) of data
Depth of knowledge in machine learning algorithms
Excellent verbal and written communication skills with the ability to effectively advocate technical solutions to research scientists, engineering teams and business audiences
Experience in server, data center and network bandwidth capacity planning is a plus