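Canada's "No. 1 data analysis instructor", Mr. Chen, presents Python, the in-demand data analysis tool used at all of the Big Five banks. His students work at major banks, telecom companies, and large enterprises. (Nov 12, 2016)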

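Did you know? All of the Big Five banks use Python.

Did you know? Python is on track to become the most in-demand data analysis tool.

The well-known Mr. Chen is launching the "Python Job-Ready Hands-On Class". His students work at major banks, telecom companies, and large enterprises.

Master it in two months and stand out from the competition. Make 2016 the year you land a high-paying job.

What are you waiting for? Reservation hotline: 416-665-1888

Time: November 12 and 19 (Saturdays), 2:30 pm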

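Advantages of Python for data processing:

1. Fast development with very little code

2. A rich set of data-processing packages: regular expressions, HTML parsing, and XML parsing are all easy to use

3. Built-in data types are cheap to use and need no extra setup

4. Frameworks are available for big data, and text-encoding issues are easy to handle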
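
As a rough illustration of point 2, the sketch below uses only the Python standard library (`re`, `xml.etree.ElementTree`, `html.parser`). The XML and HTML snippets, the course names in them, and the `LinkCollector` class are made up for this example; only the phone number is taken from this page.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Regular expressions: pull a phone number out of free text.
text = "Reservation hotline: 416-665-1888"
phone = re.search(r"\d{3}-\d{3}-\d{4}", text).group()

# XML parsing: read the text of each <course> element in a made-up document.
xml_doc = "<courses><course id='1'>Python</course><course id='2'>SAS</course></courses>"
root = ET.fromstring(xml_doc)
course_names = [course.text for course in root.findall("course")]


# HTML parsing: collect the href of every link in a made-up snippet.
class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")


collector = LinkCollector()
collector.feed('<p><a href="/courses">Courses</a> <a href="/contact">Contact</a></p>')

print(phone)
print(course_names)
print(collector.links)
```

Third-party packages offer more forgiving HTML parsing, but the standard library is enough for well-formed input like this.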

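[Instructor] Mr. Chen

- Former Director at Canada's largest telecom company; currently Senior Manager at a well-known Canadian bank

- Frequent featured speaker at SAS technical seminars, with very positive reviews

- Visiting professor at a key university in China

- One of the first Chinese professionals in Toronto to work in data analysis

- Many years of experience in government, banking, telecom, and pharmaceuticals in North America

- Knows the job market inside out: the outlook for data analysis work, resume preparation, interview skills, popular tools, and employers' hiring processes

- Many years of teaching at Victoria; since 2001 he has helped many students with no prior relevant experience or background land stable, well-paid data analysis jobs, and is highly respected by his students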

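Why do we choose Python?

Python is an object-oriented, interpreted programming language that has been under development since the early 1990s. It ships with a full-featured standard library that makes many common tasks easy. Its syntax is simple: unlike most other programming languages, which use curly braces, Python uses indentation to define blocks of statements.

Python's design philosophy is "elegant", "explicit", and "simple". The Python developers' motto is "there should be one, and preferably only one, obvious way to do it", which sets it apart from languages with a strongly individual style. When faced with several choices in designing the language, Python's developers generally reject fancy syntax in favour of syntax with little or no ambiguity. These principles are known as the "Zen of Python".

Python has garbage collection and manages memory automatically. It is often used as a scripting language for system administration and network programming, but it is equally well suited to higher-level tasks. The Python virtual machine runs on almost every operating system. Python supports several programming paradigms: imperative, object-oriented, functional, aspect-oriented, and generic programming.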
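
To make the point about multiple paradigms concrete, here is a minimal sketch that computes the same sum of squares in imperative, functional, and object-oriented style. The `SquareSummer` class and the sample numbers are invented for illustration.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Imperative style: update an accumulator inside a loop.
total_imperative = 0
for n in numbers:
    total_imperative += n * n

# Functional style: compose map and reduce without mutating any state.
total_functional = reduce(lambda acc, x: acc + x, map(lambda n: n * n, numbers), 0)


# Object-oriented style: bundle the data and the behaviour in a class.
class SquareSummer:
    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(v * v for v in self.values)


total_oop = SquareSummer(numbers).total()

print(total_imperative, total_functional, total_oop)
```

Note that the loop body and the class body are delimited purely by indentation, as described above.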

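Canada's "No. 1 data analysis instructor", Mr. Chen, presents the "Python Job-Ready Hands-On Class" for today's most in-demand data analysis tool

First class: Saturday, November 26, 6 pm; classes then run every Saturday, 6 pm to 10 pm

[Course Content]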

Session 1

Part 1: Introduction to Applied Data Science with Python Course

• Inclination

• Lessons

• Projects

• Recipes

• What You Learn From This Course

• FAQ

Part 2: Python Ecosystem for Machine Learning

• Python Ecosystem Installation

• Jupyter Installation

• SciPy – NumPy, Matplotlib, Pandas, Scikit-Learn, and statsmodels

Part 3: Python Programming Basics

• Variables and Data Types

• Basic Operators

• Number Type Conversion

• Mathematical, Random Number, Trigonometric Functions and Mathematics Constants

• Working With Lists, Tuples, Strings, Sets and Dictionary

• Working With Sequences

• Working With Collections

• Conditionals

• Loops

• Functions

• Class

• Exercises 1

Part 4: Data Ingestion and Munging

• Data Loading

a. Load CSV Files with the Python Standard Library

b. Load CSV Files with NumPy

c. Load CSV Files with Pandas

• Data Processing

a. Data Preprocessing with Pandas

a) Data Selection and Manipulation

b) Dealing with problematic data and missing values

c) Dealing with big datasets

d) Accessing other data formats

b. Data Preprocessing with NumPy

a) Creating NumPy Arrays

b) NumPy Fast Operation and Computations

• Working with Categorical and Textual Data

• Visualization

a) Introducing the basics of matplotlib

b) Selected graphical examples with pandas

c) Advanced data learning representation

• Exercises 2
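
As a taste of the Part 4 topics "Load CSV Files with the Python Standard Library" and missing-value handling, here is a minimal sketch. The sample data, the `to_float` helper, and the choice of mean imputation are assumptions made for this example, not necessarily what the course uses.

```python
import csv
import io

# Made-up sample data; an in-memory buffer stands in for a file on disk.
raw = io.StringIO(
    "name,age,balance\n"
    "Ann,34,1200.50\n"
    "Bob,,830.00\n"
    "Cy,41,\n"
)


def to_float(value):
    """Convert a CSV field to float, mapping empty fields to None."""
    return float(value) if value.strip() else None


rows = []
for record in csv.DictReader(raw):
    rows.append({
        "name": record["name"],
        "age": to_float(record["age"]),
        "balance": to_float(record["balance"]),
    })

# Mean imputation: replace missing ages with the mean of the observed ages.
known_ages = [r["age"] for r in rows if r["age"] is not None]
mean_age = sum(known_ages) / len(known_ages)
for r in rows:
    if r["age"] is None:
        r["age"] = mean_age

for r in rows:
    print(r)
```

With pandas, the same steps would typically be a `read_csv` call followed by `fillna`; the version above only shows what happens underneath.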

Session 2

Part 5: Introducing EDA

• Understand Data With Descriptive Statistics

• Understand Data With Visualization

• The Detection and Treatment of Outliers

a. Univariate outlier detection

b. EllipticEnvelope

c. OneClassSVM

• Pre-Process Data

a. Data Transforms

b. Rescale Data

c. Standardize Data

d. Normalize Data

e. Binarize Data

• Dimensionality Reduction

a. The Covariance Matrix

b. Principal Component Analysis (PCA)

c. RandomizedPCA

d. Latent Factor Analysis (LFA)

e. Linear Discriminant Analysis (LDA)

f. Latent Semantic Analysis (LSA)

g. Independent Component Analysis (ICA)

h. Kernel PCA

• Exercise 3
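
Three of the transforms listed under "Pre-Process Data" in Part 5 (rescale, standardize, binarize) can be sketched in plain Python as follows. The sample column and the threshold are invented; in practice the course's libraries (for example scikit-learn) provide ready-made versions.

```python
from statistics import mean, pstdev

values = [2.0, 4.0, 6.0, 8.0]  # made-up feature column

# Rescale: map the values linearly onto the range [0, 1].
lo, hi = min(values), max(values)
rescaled = [(v - lo) / (hi - lo) for v in values]

# Standardize: shift to zero mean and scale to unit (population) standard deviation.
mu, sigma = mean(values), pstdev(values)
standardized = [(v - mu) / sigma for v in values]

# Binarize: 1 if the value exceeds a threshold, else 0.
threshold = 5.0  # arbitrary cutoff chosen for illustration
binarized = [1 if v > threshold else 0 for v in values]

print(rescaled)
print(standardized)
print(binarized)
```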

Part 6: Feature Selection

• Univariate Selection

• Recursive Feature Elimination

• Stability and L1 Based Selection

• Feature Importance

• Exercise 4

Part 7: Resampling Methods

• Train and Test Sets.

• K-fold Cross Validation.

• Leave One Out Cross Validation.

• Repeated Random Test-Train Splits
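
The idea behind k-fold cross validation in Part 7 can be sketched without any library. The `k_fold_indices` function below is a hand-rolled illustration, not scikit-learn's `KFold`; it does not shuffle, and leave-one-out is the special case where `k` equals the number of samples.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, test in folds:
    print("train:", train, "test:", test)

# Leave-one-out: every sample is its own test fold.
leave_one_out = list(k_fold_indices(n_samples=5, k=5))
```

Each sample lands in exactly one test fold, so every observation is used for evaluation once and for training k - 1 times.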

Part 8: Algorithm Evaluation Metrics

• Classification Metrics

a. Classification Accuracy

b. Logarithmic Loss

c. Area under ROC Curve

d. Confusion Matrix

e. Classification Report

• Regression Metrics

a. Mean Absolute Error

b. Mean Squared Error

c. R-squared
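
Several of the Part 8 metrics reduce to a few lines of arithmetic. The sketch below implements classification accuracy, binary confusion-matrix counts, mean absolute error, mean squared error, and R-squared from their textbook definitions; the function names and sample values are chosen for this example only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count true/false positives/negatives; labels are assumed to be 0 or 1."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for t, p in zip(y_true, y_pred):
        if p == 1:
            counts["tp" if t == 1 else "fp"] += 1
        else:
            counts["fn" if t == 1 else "tn"] += 1
    return counts


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 minus the ratio of residual to total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up classification labels and predictions.
labels, predicted = [1, 0, 1, 1, 0], [1, 0, 0, 1, 1]
print(accuracy(labels, predicted), confusion_counts(labels, predicted))

# Made-up regression targets and estimates.
actual, estimated = [3.0, 5.0, 7.0], [2.0, 5.0, 9.0]
print(mean_absolute_error(actual, estimated),
      mean_squared_error(actual, estimated),
      r_squared(actual, estimated))
```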

Part 9: Model Techniques Selection for Classification

• Linear Machine Learning Algorithms

a. Logistic Regression

b. Linear Discriminant Analysis

• Nonlinear Machine Learning Algorithms:

a. K-Nearest Neighbors.

b. Naive Bayes.

c. Classification and Regression Trees.

d. Support Vector Machines

• Exercise 5

Part 10: Model Techniques Exploration for Regression

• Linear Machine Learning Algorithms

a. Linear Regression.

b. Ridge Regression.

c. LASSO Linear Regression.

d. Elastic Net Regression

• Nonlinear Machine Learning Algorithms:

a. K-Nearest Neighbors.

b. Classification and Regression Trees.

c. Support Vector Machines

• Exercise 6

Part 11: Champion Model Technique Selection

• How to formulate an experiment to directly compare machine learning algorithms

• A reusable template for evaluating the performance of multiple algorithms

• How to report and visualize the results when comparing algorithm performance

• Exercise 7

Part 12: Automating Machine Learning Workflows with Pipelines

• How to use pipelines to minimize data leakage.

• How to construct a data preparation and modeling pipeline.

• How to construct a feature extraction and modeling pipeline

• Exercise 8

Part 13: Ensemble Methods

• Bagging. Building multiple models from different subsamples of the training dataset

a. Bagged Decision Trees

b. Random Forest

c. Extra Trees

• Boosting. Building multiple models (typically of the same type) each of which learns to fix the prediction errors of a prior model in the sequence of models

a. AdaBoost

b. Stochastic Gradient Boosting

• Voting. Building multiple models (typically of differing types) and combining their predictions with simple statistics (like calculating the mean)

• Exercise 9
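
The voting idea from Part 13 can be sketched in a few lines: class labels are combined by majority vote, numeric predictions by their mean. The three "models" below are just hard-coded, hypothetical prediction lists, and the function names are invented for this example.

```python
from collections import Counter
from statistics import mean


def majority_vote(predictions_per_model):
    """Per sample, return the label predicted by the most models."""
    return [
        Counter(sample_votes).most_common(1)[0][0]
        for sample_votes in zip(*predictions_per_model)
    ]


def average_predictions(predictions_per_model):
    """Per sample, return the mean of the models' numeric predictions."""
    return [mean(sample_values) for sample_values in zip(*predictions_per_model)]


class_predictions = [
    ["yes", "no", "yes"],   # hypothetical model A
    ["yes", "yes", "no"],   # hypothetical model B
    ["no", "no", "yes"],    # hypothetical model C
]
voted = majority_vote(class_predictions)

numeric_predictions = [
    [1.0, 2.0],  # hypothetical model A
    [3.0, 4.0],  # hypothetical model B
]
averaged = average_predictions(numeric_predictions)

print(voted)
print(averaged)
```

With an odd number of models and two classes there are no ties; otherwise a tie-breaking rule would need to be chosen explicitly.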

Part 14: Algorithm Parameter Tuning

• The importance of algorithm parameter tuning to improve algorithm performance.

• How to use a grid search algorithm tuning strategy.

• How to use a random search algorithm tuning strategy

• Exercise 10
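
The two tuning strategies in Part 14 differ only in how candidate parameter settings are generated. In the sketch below, `score` is a toy objective standing in for a cross-validated model score, and the parameter names `learning_rate` and `depth` are hypothetical.

```python
import itertools
import random


def score(learning_rate, depth):
    """Toy objective (higher is better), peaking at learning_rate=0.1, depth=4."""
    return -((learning_rate - 0.1) ** 2) - ((depth - 4) ** 2)


param_grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 6]}

# Grid search: evaluate every combination of the listed values.
names = list(param_grid)
grid_candidates = [
    dict(zip(names, combo)) for combo in itertools.product(*param_grid.values())
]
best_grid = max(grid_candidates, key=lambda params: score(**params))

# Random search: evaluate a fixed budget of randomly sampled settings.
rng = random.Random(0)  # seeded so repeated runs sample the same candidates
random_candidates = [
    {"learning_rate": rng.uniform(0.01, 1.0), "depth": rng.randint(2, 6)}
    for _ in range(20)
]
best_random = max(random_candidates, key=lambda params: score(**params))

print("grid search best:", best_grid)
print("random search best:", best_random)
```

Grid search cost grows multiplicatively with each added parameter, while random search keeps a fixed budget, which is the usual reason to prefer it for larger search spaces.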

Part 15: Save and Load Machine Learning Models

• Finalize Your Model with pickle

• Finalize Your Model with joblib
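
Saving and reloading a model with `pickle`, as in Part 15, looks roughly like this. The "model" here is only a dictionary of hypothetical fitted coefficients and `predict` is an invented helper; a real scikit-learn estimator would be dumped and loaded the same way.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a trained model: hypothetical coefficients of a fitted line.
model = {"slope": 2.0, "intercept": 1.0}


def predict(fitted, x):
    """Apply the stored linear coefficients to a single input value."""
    return fitted["slope"] * x + fitted["intercept"]


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"

    # Save: serialize the model object to a binary file.
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # Load: restore the object in a later session (here, immediately).
    with open(model_path, "rb") as f:
        restored = pickle.load(f)

print(predict(restored, 3.0))
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.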

Part 16: Projects

• Predictive Modeling Project Template

a. Use A Structured Step-By-Step Process

b. Machine Learning Project Template in Python

c. Machine Learning Project Template Steps

d. Tips For Using The Template Well

• Project 1: The Hello World of Classification Machine Learning (multinomial target model)

• Project 2: Regression Machine Learning Case Study Project (continuous target model)

• Project 3: Binary Classification Machine Learning Case Study Project (binary target model)


Victoria Training Center (Toronto)
Reservations: 416-665-1888, Website: www.victoronto.com
Address: 200 Consumers Road, Suite 118, M2J 4R4 (third building at the southeast corner of Consumers and Sheppard, near the subway station, free parking)


Victoria Education Centre - Hotline: 416-665-1888
Toronto: 250 Consumers Road, Suite 901, Toronto, Ontario, Canada M2J 4V6
Mississauga: Unit 129, 1140 Burnhamthorpe Road West, Mississauga, Ontario L5C 4E6
Copyright © 2009-2017 Victoria Toronto Training Center. All rights reserved.
