Probability and statistics for Data Analysis

Course Dates: 07/06/21 - 28/06/21
Time: 14:00 - 17:15
Location: Online
Get a deep understanding of the underpinnings of data analysis with a focus on the most fundamental concepts and techniques.
This course will be delivered online. See the ‘What is the course about?’ section in course details for more information.
Course Code: CDA03

Mondays, daytime, 07 Jun - 28 Jun '21

Duration: 4 sessions (over 4 weeks)

What is the course about?

Probability is the art of reasoning in the presence of uncertainty (its counterpart, when no uncertainty is present, is called logic). It is a neat and elegant set of ideas that are especially applicable to games or game-like situations that include an element of uncertainty. For example, poker players need to know how likely it is that a hand of five cards, dealt at random, will contain a pair (two cards whose values match but whose suits are different).
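As a rough illustration of the kind of calculation involved, here is a minimal Python sketch that counts the five-card hands containing a pair. It assumes a standard 52-card deck and reads "a pair" as exactly one pair, so hands such as two pairs or three of a kind are excluded. The function name is ours, chosen only for this example.

```python
from math import comb


def probability_of_exactly_one_pair() -> float:
    """Chance that 5 cards dealt from a standard 52-card deck hold exactly one pair."""
    total_hands = comb(52, 5)  # every possible 5-card hand
    pair_hands = (
        comb(13, 1)    # choose the value that is paired
        * comb(4, 2)   # choose 2 of the 4 suits for that value
        * comb(12, 3)  # choose 3 different values for the other cards
        * 4 ** 3       # choose a suit for each of those 3 cards
    )
    return pair_hands / total_hands


print(f"{probability_of_exactly_one_pair():.4f}")
```

Under these assumptions the answer comes out at a little over 42%.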

Statistical measures usually aim to summarise a large amount of data – for example, the mean average salary for a person in the UK is a single number that “measures” a data set of millions of numbers. In this process, the features of large data sets can be made clearer but information is also usually lost.
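To see how a summary can lose information, consider the short Python sketch below. The two salary lists are invented for illustration and are not real data. They share the same mean even though they describe very different situations, and the median picks up the difference.

```python
from statistics import mean, median

# Two invented salary lists (in pounds); neither is real data.
even_salaries = [30_000, 30_000, 30_000, 30_000, 30_000]
skewed_salaries = [10_000, 10_000, 10_000, 10_000, 110_000]

for label, salaries in [("even", even_salaries), ("skewed", skewed_salaries)]:
    # The mean is identical for both lists; the median is not.
    print(label, "mean:", mean(salaries), "median:", median(salaries))
```

Both lists have a mean of 30,000, so the mean alone cannot tell them apart.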

This course offers a fast-paced introduction to probability and statistics. We begin with classical probability, which we discuss in relation to the topics that first motivated it: games and gambling. We will then turn to common problems with the application of probability, especially when we have statements of the form “If X then Y”. For example, as a juror in a murder trial, how impressed should you be by a DNA test that confirms, with 99.9% accuracy, that the defendant was at the crime scene (a place with which they otherwise have no connection)? In the second half of the course we look at both descriptive and inferential statistics.
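The juror's question can be sketched with Bayes' Theorem. The short Python example below is only an illustration under stated assumptions: we read "99.9% accuracy" as a 99.9% chance of a positive result when the defendant really was there and a 0.1% chance of a false positive, and we invent a prior of 1 in 10,000 that the defendant was at the scene before the test is considered. None of these figures comes from a real case, and the function name is ours.

```python
def posterior_probability(prior: float, sensitivity: float,
                          false_positive_rate: float) -> float:
    """Bayes' Theorem: P(defendant was at the scene | test is positive)."""
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical figures, chosen only for illustration.
prior = 1 / 10_000           # assumed chance of presence before the test
sensitivity = 0.999          # assumed P(positive | was at the scene)
false_positive_rate = 0.001  # assumed P(positive | was not at the scene)

posterior = posterior_probability(prior, sensitivity, false_positive_rate)
print(f"{posterior:.1%}")
```

With these made-up figures the probability that the defendant was at the scene, given the positive test, is roughly 9%, which is far lower than the 99.9% a hasty reading suggests. A different prior would give a very different answer, which is the point of the exercise.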

Throughout the course our focus will be on conceptual understanding rather than detailed methods of calculation; most of the latter are, these days, delegated to computers in any case. Knowing what to ask the computer to do remains a key human skill!

This is a live online course. You will need:
- Internet connection. The classes work best with Chrome.
- A computer with microphone and camera.
- Earphones/headphones/speakers.

What will we cover?

• Basic elements of the language of set theory
• Basic probability calculations and problem-solving
• Conditional probability and Bayes’ Theorem
• Discrete and continuous random variables
• Discrete and continuous distributions
• Descriptive statistical measures including averages, standard deviation, quartiles and perhaps a discussion of kurtosis and skewness
• Some discussion of inferential statistics, including p-values, sampling and hypothesis testing.

What will I achieve?

By the end of this course you should be able to...

• Solve probability problems and calculate probabilities under properly specified conditions
• Solve problems involving conditional probability, use Bayes’ Theorem and avoid mistakes such as the Prosecutor’s Fallacy
• Work with discrete random variables and various common distributions
• Work with continuous random variables and various common distributions
• Calculate descriptive statistical measures on a set of data and explain their meanings, uses and limitations
• Recognise some of the dangers of statistical inferences and critically evaluate inferences made by others.

What level is the course and do I need any particular skills?

This is an introductory course. Some facility with adding and multiplying fractions is helpful but all the maths we use will be introduced from scratch. You do not need to know anything at all about probability, statistics, programming or data analysis.

How will I be taught, and will there be any work outside the class?

We will use a mixture of presentation, discussion and problem-solving in class.

Are there any other costs? Is there anything I need to bring?

There are no other costs. Please bring a pen and notepad for note-taking.

When I've finished, what course can I do next?

You could progress to Data analytics with Python: introduction; Data analysis with Power BI; or Excel analysing data: stage 1 or 2.

Rich Cochrane

Rich is a programmer, writer and educator with a particular interest in creative practice. In his previous career he worked as a software developer in the City, first at a dot-com startup and later at a top-tier investment bank where he worked mostly on trading floor systems and got to play with a wide range of languages and technologies. He now teaches coding and maths-related courses full time. Besides his work at City Lit he also teaches at Central Saint Martins, the Architectural Association and the Photographers' Gallery, and is the author of two books about mathematics. His technical collaborations with artists have been shown at, among others, the Hayward Gallery, the V&A, the ICA and Camden Arts Centre. He has a BSc in Mathematics from the Open University. He also has a BA in English Literature and a PhD in philosophy (both from Cardiff). He continues to teach a little philosophy and literature, especially as they intersect with his other interests, and as a partner in Minimum Labyrinth he has brought these ideas to wider audiences in collaboration with the Museum of London, the Barbican and various private sponsors.

