What is the course about?
Originally developed as a general-purpose programming language, Python has become an essential part of the Data Analyst’s toolkit. Across a series of four courses, we will examine Python’s role in the Analyst’s workflow.
A data analytics project can take an almost unlimited number of forms. Use cases range from finance to healthcare, and outcomes include management reporting, fraud and risk detection, digital marketing and product development. Despite this, the stages of data analysis broadly fall into four main categories:
Importing data - The means by which structured and unstructured data from a range of sources is accessible from within a project.
Cleaning and Organising data - The preparation of data for analysis by correcting or removing anomalies and managing missing values.
Analysing and Interpreting data - The techniques needed to explore data to gain insights that lead to improved decision making.
Visualising and Presenting data - The art of presenting data graphically, both to communicate observations and as a vital part of the analysis process.
According to a recent survey in the United States, data scientists spend nearly 80% of their time collecting and organising data, with 60% of this time spent preparing datasets for analysis and interpretation. This hands-on course will teach you the techniques required both to diagnose problems in a dataset and to manage issues such as anomalous, missing and duplicate data.
This course can be studied either as a standalone course, teaching the techniques an Analyst needs to prepare data for analysis in Python, or as the second in a series of four courses that examine the analysis lifecycle.
What will we cover?
Diagnose problems in a dataset
Deal with missing values
What will I achieve?
By the end of this course you should be able to...
Use Python to:
Identify missing and anomalous data
Manage null and missing values
Manage duplicate values
Work with numeric types
Work with string types
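As a rough illustration of the kind of task these outcomes describe, the sketch below tidies a small set of records using only the Python standard library. The sample records, the field names, the `clean_row` helper and the 0 to 120 age range are all invented for this example, and the course itself may use different tools and techniques.

```python
# Hypothetical sample records; names and values are invented for illustration.
raw_rows = [
    {"name": " Alice ", "age": "34", "city": "London"},
    {"name": "bob", "age": "", "city": "Leeds"},
    {"name": "Alice", "age": "34", "city": "london"},
    {"name": "Carol", "age": "210", "city": None},
]


def clean_row(row):
    """Return a tidied copy of one record (hypothetical helper)."""
    # String types: trim whitespace and standardise capitalisation.
    name = (row["name"] or "").strip().title()
    city = (row["city"] or "").strip().title() or None
    # Numeric types: convert digit-only text to int, otherwise mark as missing.
    age_text = (row["age"] or "").strip()
    age = int(age_text) if age_text.isdigit() else None
    return {"name": name, "age": age, "city": city}


cleaned = [clean_row(row) for row in raw_rows]

# Identify records that have at least one missing value.
missing = [row for row in cleaned if None in row.values()]

# Identify anomalous ages, assuming 0 to 120 is the plausible range.
anomalous = [
    row for row in cleaned
    if row["age"] is not None and not 0 <= row["age"] <= 120
]

# Manage duplicates: keep the first occurrence of each cleaned record.
seen = set()
unique = []
for row in cleaned:
    key = (row["name"], row["age"], row["city"])
    if key not in seen:
        seen.add(key)
        unique.append(row)

print("Rows with missing values:", len(missing))
print("Rows with anomalous ages:", [row["name"] for row in anomalous])
print("Rows after removing duplicates:", len(unique))
```

Note that the two "Alice" records only count as duplicates once the stray spaces and inconsistent capitalisation have been cleaned, which is why the string clean-up runs before the duplicate check.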
What level is the course and do I need any particular skills?
You will need to have a working knowledge of Python as covered in our Introduction to Python course or equivalent experience in Python or another language.
Though it is not an entry requirement, you will get more out of this course if you have a basic understanding of the data analytics process, as covered in our Data Analytics with Python: introduction course.
You should also be able to follow spoken instructions, read written instructions and information, and discuss work with your tutor in English.
How will I be taught, and will there be any work outside the class?
There will be some theoretical underpinning to the course, but it is nearly all practical, taught through teacher demonstration and practical programming and problem-solving activities. There is no official work set outside the class, but it is a good idea to practise the skills you have learnt to reinforce classroom learning.
Are there any other costs? Is there anything I need to bring?
Computers are provided for each student with all the necessary software installed. All the software used on the course is free to download and use. Your tutor will recommend where to find this software for home use. Unfortunately, due to the range of hardware and software used by students at home, the College is unable to provide advice on installation issues.
If you wish to copy the programs you produce on the course, please bring a USB key or have access to a cloud service such as Google Drive or Dropbox. A pen and notepad for note-taking are also advised.
When I've finished, what course can I do next?
Data analytics with Python intermediate: importing data; Data analytics with Python intermediate: analysing data; Data analytics with Python intermediate: visualising data. You might also want to explore the Excel courses in data analysis, such as: Data analysis with Power BI; Excel analysing data (stage 1 & 2); Introduction to DAX: data analysis expression for Power BI.