This course focuses on developing Python skills for assembling business data. It will cover some of the same material as Introduction to Accounting Data Analytics and Visualization, but in a more general-purpose programming environment (Jupyter Notebook for Python) rather than in Excel and the Visual Basic Editor. These concepts are taught within the context of one or more accounting data domains (e.g., financial statement data from EDGAR, stock data, loan data, point-of-sale data).
The first half of the course picks up where Introduction to Accounting Data Analytics and Visualization left off: using an integrated development environment to automate data analytic tasks. We discuss how to manage code and share results within Jupyter Notebook, a popular development environment for data analytics languages such as Python and R. We then review some fundamental programming skills in Python, such as mathematical operators, functions, conditional statements, and loops.
The second half of the course focuses on assembling data for machine learning purposes. We introduce students to Pandas dataframes and NumPy for structuring and manipulating data. We then analyze the data using visualizations and linear regression. Finally, we explain how to use Python for interacting with SQL data.
INTRODUCTION TO THE COURSE
-In this module, you will become familiar with the course, your instructor and your classmates, and our learning environment. This orientation module will also help you obtain the technical skills required to navigate and be successful in this course.
MODULE 1: FOUNDATIONS
-This module serves as the introduction to the course content and the course Jupyter server, where you will run your analytics scripts. First, you will read about specific examples of how analytics is being employed by accounting firms. Next, you will learn about the capabilities of the course Jupyter server, and how to create, edit, and run notebooks on the course server. After this, you will learn how to write Markdown-formatted documents, an easy way to quickly write formatted text, including descriptive text inside a course notebook.
MODULE 2: INTRODUCTION TO PYTHON
-This module focuses on the basic features in the Python programming language that underlie most data analytics programs (or scripts). First, you will read about why accounting students should learn to write computer programs. In the first lesson, you will also learn the basic concepts of the Python programming language, including how to create variables, basic data types and mathematical operators, and how to document your programs with comments. Next, you will learn about Boolean and logical operators in Python and how they can be used to control the flow of a Python program by using conditional statements. Finally, you will learn about functions and how they can simplify developing and maintaining programs. You will also learn how to create and call functions in Python.
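As a rough illustration of the concepts listed for this module (variables, mathematical and logical operators, a conditional statement, and a function), consider the short sketch below. The function name, the figures, and the 25% threshold are hypothetical and are not taken from the course materials.

```python
# Hypothetical example: names, figures, and the 25% threshold are illustrative only.

def gross_margin(revenue, cost_of_goods_sold):
    """Return gross margin as a fraction of revenue."""
    return (revenue - cost_of_goods_sold) / revenue

# Variables holding basic numeric data
revenue = 250000.0
cost_of_goods_sold = 175000.0

# Call the function and store its result
margin = gross_margin(revenue, cost_of_goods_sold)

# A conditional statement that combines two Boolean tests with a logical operator
if revenue > 0 and margin >= 0.25:
    label = "healthy"
else:
    label = "review"

print(label, round(margin, 2))
```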
MODULE 3: INTRODUCTION TO PYTHON PROGRAMMING
-In this module you will learn about working with fundamental data structures in Python: strings, tuples, lists, and dictionaries. You will also learn about how to write loops for performing repetitive tasks.
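The data structures and loops named in this module might be combined along the following lines. This is only a sketch: the account names and amounts are made-up sample data.

```python
# Hypothetical sample data: a list of (account, amount) tuples
transactions = [
    ("sales", 1200.0),
    ("rent", -800.0),
    ("sales", 450.0),
    ("supplies", -75.5),
]

# A dictionary that accumulates a running total per account
totals = {}
for account, amount in transactions:
    totals[account] = totals.get(account, 0.0) + amount

# Build a string summary, one "name: total" entry per account, sorted by name
summary = ", ".join(f"{name}: {total:.2f}" for name, total in sorted(totals.items()))
print(summary)
```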
MODULE 4: PYTHON PROGRAMMING
-In this module you will learn about creating and using modules, which are files that group related functions and other definitions. You will then learn about two of the most important modules for data analytics: NumPy and Pandas. NumPy performs numerical calculations on large data arrays. Pandas simplifies procedures for working with tabular and panel data, which it stores in structures called dataframes.
MODULE 5: DATA ANALYSIS WITH PYTHON
-This module focuses on using the Pandas dataframe to do some fundamental dataframe tasks including saving and reading dataframes, pivot table functions, filtering functions, and calculating descriptive statistics.
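The dataframe tasks listed for this module could look roughly like the sketch below, assuming Pandas is installed. The column names and values are invented for illustration, and an in-memory buffer stands in for a CSV file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "revenue": [100.0, 150.0, 80.0, 120.0],
})

# Save the dataframe as CSV text, then read it back
buffer = io.StringIO()
sales.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

# Filter rows with a Boolean condition
large = reloaded[reloaded["revenue"] > 90]

# Pivot table: regions as rows, quarters as columns, summed revenue as values
pivot = reloaded.pivot_table(
    index="region", columns="quarter", values="revenue", aggfunc="sum"
)

# Descriptive statistics (count, mean, std, min, quartiles, max)
stats = reloaded["revenue"].describe()
print(pivot)
print(stats)
```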
MODULE 6: INTRODUCTION TO VISUALIZATION IN PYTHON
-In this module you will learn some basic elements of creating data visualizations in Python. You will then learn how to use the Matplotlib and Seaborn modules to help create some of the most commonly used one- and two-dimensional data visualizations.
MODULE 7: PRODUCTION DATA ANALYTICS
-In this module you'll learn about the CRISP decision-making framework for approaching real-world problems. You'll also learn how to use linear regression to find and quantify relationships.
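To make "find and quantify relationships" concrete, here is a minimal sketch of a one-variable linear regression, assuming NumPy is installed. It uses `numpy.polyfit` for the least-squares fit, which may differ from the tools used in the course, and the advertising and sales figures are invented.

```python
import numpy as np

# Hypothetical sample data: advertising spend and resulting sales
ad_spend = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sales = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

# Least-squares fit of a straight line: sales ~ slope * ad_spend + intercept
slope, intercept = np.polyfit(ad_spend, sales, deg=1)

# R-squared: the share of variation in sales explained by the fitted line
predicted = slope * ad_spend + intercept
ss_res = np.sum((sales - predicted) ** 2)
ss_tot = np.sum((sales - sales.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(slope, intercept, r_squared)
```

The slope estimates how much sales change for each additional unit of spend, and R-squared indicates how well the line fits the data.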
MODULE 8: INTRODUCTION TO DATABASES IN PYTHON
-This module focuses on relational database management systems (RDBMS) and how to interact with them using Python.
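One way to interact with a relational database from Python is the standard-library `sqlite3` module, sketched below. The course may use a different database system; the `loans` table and its rows are hypothetical, and the database lives only in memory.

```python
import sqlite3

# Open a temporary in-memory database (nothing is written to disk)
conn = sqlite3.connect(":memory:")

# Create a hypothetical table and insert sample rows using parameterized SQL
conn.execute(
    "CREATE TABLE loans (id INTEGER PRIMARY KEY, borrower TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO loans (borrower, amount) VALUES (?, ?)",
    [("Acme", 50000.0), ("Birch", 12000.0), ("Cedar", 30000.0)],
)
conn.commit()

# Query loans above a threshold, largest first
rows = conn.execute(
    "SELECT borrower, amount FROM loans WHERE amount > ? ORDER BY amount DESC",
    (20000,),
).fetchall()
conn.close()

print(rows)
```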