# Probability Theory, Statistics and Exploratory Data Analysis

• Conditional probability and independence
• This week we discuss conditional probability and independence of events. Sometimes we use the definition of independence to find probabilities; at other times we check whether the definition holds to decide if events are independent. We discuss the important law of total probability, which lets us find the probability of an event when we know its conditional probabilities given some hypotheses and the probabilities of those hypotheses. We also discuss Bayes's rule, which lets us find the probability of a hypothesis given that some event has occurred. We demonstrate how Python can be used to calculate conditional probabilities and check the independence of events.
• Random variables
• A random variable is a value that depends on the result of some random experiment. Some natural examples of random variables come from gambling and lotteries. There are two main classes of random variables that we will consider in this course. This week we'll study discrete random variables, which take a finite or countable number of values. A discrete random variable can be described by its distribution. We'll consider various discrete distributions, introduce the notions of expected value and variance, and learn to generate and visualize discrete random variables with Python.
• Systems of random variables; properties of expectation and variance, covariance and correlation.
• Several random variables associated with the same random experiment constitute a system of random variables. To describe a system of discrete random variables, one can use the joint distribution, which takes into account all possible combinations of values that the random variables may take. We'll find some joint distributions, examine their properties, and introduce independence of random variables. Then we'll discuss how expected value and variance behave under arithmetic operations and introduce covariance and correlation as measures of dependence between random variables.
• Continuous random variables
• This week we'll study continuous random variables, which constitute an important data type in statistics and data analysis. For continuous random variables we'll define the probability density function (PDF) and the cumulative distribution function (CDF), see how they are linked, and see how sampling from a random variable can be used to approximate its PDF. We'll introduce expected value, variance, covariance and correlation for continuous random variables and discuss their properties. Finally, we'll use Python to generate independent and correlated continuous random variables.
• From random variables to statistical data. Data summarization and descriptive statistics.
• This week we'll introduce types of statistical data and discuss the models used to pass from statistical data to random variables. We'll introduce descriptive statistics of sample data, such as various measures of central tendency and statistical dispersion, and find correspondences between properties of random variables (the population) and the sample descriptive statistics, which are essential for statistical predictions. We’ll talk about visualization of statistical data and learn to work with it in Python.
• Correlations and visualizations
• This week we’ll consider correlation in statistical data and find out how it is related to the level of dependence within the data and what it means for scatter plots. We’ll consider several types of correlation suitable for different types of data and discuss the difference between correlation and causation. Finally, we’ll learn to visualize dependence between numeric variables and calculate correlation with Python.
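
As a small illustration of the last topic, the sketch below computes two kinds of correlation in plain Python. The outline does not say which correlation types the course covers, so the choice of Pearson (linear) and Spearman (rank-based) is an assumption. The function names `pearson`, `ranks` and `spearman` and the toy data are hypothetical and are not taken from the course materials.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Assign 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Toy data (made up for illustration): y grows monotonically but not linearly.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]

print("Pearson: ", pearson(xs, ys))
print("Spearman:", spearman(xs, ys))
```

On this toy data the Spearman coefficient is exactly 1 because the relationship is perfectly monotone. The Pearson coefficient falls below 1 because the relationship is not linear. In practice a library routine would likely be used in place of these hand-written functions.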