Applied Data Science with Python Programming
Data science plays an increasingly important role in the era of big data. In the current job market, there are more posted data scientist jobs than applicants. One important reason is that most data analysts can only use SAS for analysis in non-Hadoop environments and do not know how to use open source tools (such as R, Python, or Scala).
In Canada, however, open source tools are becoming more and more popular across all industries and are expected to overtake SAS for data analysis soon. For instance, the five big banks, telecom companies, and consulting firms are using Python, R, or Scala instead of SAS for big data analysis and modeling in Hadoop. SAS skills are no longer as attractive to employers as they once were; open source tools are now both required and more sought after. For details on programming language popularity, see http://www.tiobe.com/tiobe_index. According to the 2016 TIOBE index report, Python moved up three spots within the last year to claim the number 5 spot. Meanwhile, R was ranked 16th, while SAS dropped to number 21. Data scientist is a new role compared with the traditional data analyst, offering more opportunities, better prospects, and higher pay. Prepare well to capture this opportunity, and don't miss it! Visit any job search website and search for "Data Scientist" to see how much demand there is for the role.
To meet the needs of the data scientist job market, this course provides the knowledge and skills that help data analysts optimize their data science learning path and successfully transition into data scientist roles. The topics in this course come from an analysis of real requirements in data scientist job listings from the biggest tech employers. The course not only introduces you step by step to installing the Python interpreter and to data ingestion and wrangling, but also guides you end to end through developing machine learning models in Python.
The course is built around three themes designed to get you using Python for applied machine learning quickly and effectively. The three parts are as follows:
Lessons: Over two sessions, learn how data can be processed in Python (Python fundamentals), how a machine learning project maps onto Python, and the best-practice way of working through each task (advanced Python: machine learning)
Projects: Tie together all of the knowledge from the lessons by working through case-study data processing and predictive modeling problems
Recipes: Apply machine learning with a catalog of standalone Python recipes, provided as a bonus, which you can copy and paste as a starting point for your new projects
Who this course is designed for:
· Anyone without prior coding or scripting experience but with a science, engineering, or finance background who aspires to become a data scientist
· New graduates with a science, engineering, or finance background who would like to use Python to perform data science operations
· Developers and programmers who intend to expand their knowledge and learn about data manipulation and machine learning
· SAS programmers in finance, telecom, or other non-tech industries who want to transition from reporting or data cleaning into a data scientist role
After mastering what you learn in this course, you can confidently seek and perform data scientist jobs.
Part 1: Introduction to Applied Data Science with Python Course
- Inclination
- Lessons
- Projects
- Recipes
- What You Learn From This Course
- FAQ
Part 2: Python Ecosystem for Machine Learning
- Python Ecosystem Installation
- Jupyter Installation
- SciPy – NumPy, Matplotlib, Pandas, Scikit-Learn, and statsmodels
Part 3: Python Programming Fundamentals
- Variables and Data Types
- Basic Operators
- Number Type Conversion
- Mathematical, Random Number, and Trigonometric Functions; Mathematical Constants
- Working With Lists, Tuples, Strings, Sets and Dictionaries
- Working With Sequences
- Working With Collections
- Conditionals
- Loops
- Mathematical Constants
- Functions
- The globals() and locals() Functions
- Modules
- Exercises
- File I/O
- Printing to the Screen
- Reading Keyboard Input
- Opening and Closing Files
· Reading and Writing Binary Files
· Creating a New File
· Working with Directories
· Exercises
· Overview of OOP Terminology
· Class and Objects
· Creating Classes
· Creating Instance Objects
· Accessing Attributes
· All Concepts of Objects Together
· Built-In Class Attributes
· Class Inheritance
· Derive a Class from Multiple Parent Classes
· Overriding Methods
- Exercises
Part 4: IPython and Raw Python, NumPy, Pandas
· Structure of IPython And Raw Python
· Jupyter Notebook
· Raw Python
o Map
o Filter
o List Comprehensions
o Lambda Functions
· NumPy
o NumPy Array Basics
o Array Creation
o Resizing arrays
o Arrays derived from NumPy functions
o Getting an array directly from a file
o Multi-dimensional array
o Heterogeneous lists
o From lists to multidimensional arrays
o Extracting data from pandas
o Boolean Selection
o Helpful Methods and Shortcuts
o Vectorization
o Summary of NumPy
· pandas
o General pandas Concepts
o Object Creation
o Reading and Writing Data
o View Data
o Data Selection
o Missing Data
o Operations
o Concatenating objects
o Set logic on the other axes
o append
o Ignoring indexes on the concatenation axis
o Concatenating with mixed ndims
o More concatenating with group keys
o Appending rows to a DataFrame
o Database-style DataFrame joining/merging
o Merge
o Join
o Overlapping value columns
o Joining multiple DataFrame or Panel objects
o Merging Ordered Data
o Merging together values within Series or DataFrame columns
o Idioms
o Building Criteria
o Grouping
o Descriptive Statistics
o Convert DataFrame to an array
o Exercises
Part 5: Matplotlib
· Introducing the basics of matplotlib
· Curve plotting
· Using panels
· Scatterplots
· Histograms
· Bar graphs
· Image visualization
· Selected graphical examples with pandas
· Boxplots and histograms
· Scatterplots
· Parallel coordinates
Instructor: Mr. Chen