Introduction to Data Analytics

Introduction to Data Analytics Week 01 Quiz Answers

Quiz 1: Graded Quiz Answers

Q1. A modern data ecosystem includes a network of continually evolving entities. It includes: 

  • Data sources, enterprise data repository, business stakeholders, and tools, applications, and infrastructure to manage data
  • Social media sources, data repositories, and APIs
  • Data sources, databases, and programming languages
  • Data providers, databases, and programming languages

Q2. Data Analysts work within the data ecosystem to:

  • Build Machine Learning or Deep Learning models
  • Develop and maintain data architectures
  • Provide business intelligence solutions by monitoring data on different business functions
  • Gather, clean, mine, and analyze data for deriving insights

Q3. When we analyze data in order to understand why an event took place, which of the four types of data analytics are we performing?

  • Prescriptive Analysis
  • Descriptive Analysis
  • Predictive Analysis
  • Diagnostic Analysis

Q4. The first step in the data analysis process is to gain an in-depth understanding of the problem and the desired outcome. What are you seeking answers to at this stage of the data analysis process?

  • Where you are and where you need to be
  • The best tools for sourcing data
  • What will be measured and how it will be measured
  • The data you need

Q5. From the provided list, select the three emerging technologies that are shaping today’s data ecosystem.

  • Cloud Computing, Internet of Things, and Dashboarding
  • Machine Language, Cloud Computing, and Internet of Things
  • Big Data, Internet of Things, and Dashboarding
  • Cloud Computing, Machine Learning, and Big Data

Quiz 2: Graded Quiz Answers


Q1. Why is proficiency in Statistics an important skill for a Data Analyst?
  • For creating project documentation 
  • For identifying patterns and correlations in data 
  • For acquiring data from multiple sources
  • For creating queries to extract required data 

Q2. Which of these is one of the soft skills required to be a successful Data Analyst?
  • Prepare reports and dashboards
  • Integrate data coming from multiple sources 
  • Filter, clean, and standardize data 
  • Work collaboratively with cross-functional teams 

Q3. Which of the data analyst functional skills helps research and interpret data, theorize, and make forecasts?
  • Probing skills 
  • Problem-solving skills 
  • Analytical skills 
  • Proficiency in Statistics

Q4. In “A day in the life of a Data Analyst”, what according to Sivaram Jaladi forms a large part of a Data Analyst’s job?
  • Generating hypotheses
  • Interacting with stakeholders 
  • Cleaning and preparing data 
  • Creating a report

Q5. In “A day in the life of a Data Analyst”, what are some of the data points that were useful in analyzing the use case? (Select all that apply)
  • Average billing amount of complainants
  • Age and education details of complainants
  • Employment history of the complainants
  • Serial number of the meters


Introduction to Data Analytics Week 02 Quiz Answers

Quiz 1: Graded Quiz Answers


Q1. In the data analyst’s ecosystem, languages are classified by type. What are shell and scripting languages most commonly used for? 

  • Automating repetitive operational tasks
  • Manipulating data 
  • Querying data 
  • Building apps 

Q2. Which of the following is an example of unstructured data? 

  • XML
  • Spreadsheets
  • Video and audio files
  • Zipped files 

Q3. Which one of these file formats is independent of software, hardware, and operating systems, and can be viewed the same way on any device?

  • Delimited text file
  • XLSX
  • PDF
  • XML

Q4. Which data source can return data in plain text, XML, HTML, or JSON among others? 

  • XML 
  • Delimited text file 
  • API
  • PDF 

Q5. According to the video “Languages for Data Professionals,” which of the programming languages supports multiple programming paradigms, such as object-oriented, imperative, functional, and procedural, making it suitable for a wide variety of use cases? 


  • PowerShell 
  • Python
  • Java
  • Unix/Linux Shell 

Quiz 2: Graded Quiz Answers


Q1. Data Marts and Data Warehouses have typically been relational, but the emergence of what technology has helped to let these be used for non-relational data?
  • SQL
  • ETL
  • Data Lake
  • NoSQL
Q2. What is one of the most significant advantages of an RDBMS? 
  • Requires source and destination tables to be identical for migrating data 
  • Enforces a limit on the length of data fields
  • Can store only structured data  
  • Is ACID-Compliant
Q3. Which one of the NoSQL database types uses a graphical model to represent and store data, and is particularly useful for visualizing, analyzing, and finding connections between different pieces of data? 
  • Document-based
  • Graph-based
  • Column-based 
  • Key value store  
Q4. Which of the data repositories serves as a pool of raw data and stores large amounts of structured, semi-structured, and unstructured data in their native formats?   
  • Data Lakes
  • Data Warehouses
  • Relational Databases
  • Data Marts
Q5. What does the attribute “Veracity” imply in the context of Big Data?
  • Scale of data
  • Diversity of the type and sources of data 
  • Accuracy and conformity of data to facts
  • The speed at which data accumulates
Q6. Apache Spark is a general-purpose data processing engine designed to extract and process Big Data for a wide range of applications. What is one of its key use cases?
  • Scalable and reliable Big Data storage
  • Fast recovery from hardware failures   
  • Consolidate data across the organization
  • Perform complex analytics in real-time 


Introduction to Data Analytics Week 03 Quiz Answers

Quiz 1: Graded Quiz Answers

Q1. What are some of the steps in the process of “Identifying Data”? (Select all that apply)

  •  Define the checkpoints 
  •  Define a plan for collecting data 
  • Determine the visualization tools that you will use
  •  Determine the information you want to collect 

Q2. What type of data refers to information obtained directly from the source?

  •  Secondary data 
  •  Primary data 
  •  Sensor data 
  •  Third-party data 

Q3. Web scraping is used to extract what type of data? 

  • Data from news sites and NoSQL databases
  • Text, videos, and images
  • Text, videos, and data from relational databases
  • Images, videos, and data from NoSQL databases

Q4. Data obtained from an organization’s internal CRM, HR, and workflow applications is classified as:

  • Secondary data
  • Third-party data
  • Primary data

Q5. Which of the provided options offers simple commands to specify what is to be retrieved from a relational database?

  • RSS Feed
  • API
  • SQL
  • Web Scraping
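
To illustrate what "simple commands to specify what is to be retrieved" can look like, here is a minimal sketch using Python's built-in sqlite3 module. The customers table and its rows are hypothetical and exist only for this example; they are not part of the course material.

```python
import sqlite3

# Hypothetical in-memory table, created only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ana", "Lisbon"), ("Ben", "Toronto"), ("Chen", "Lisbon")],
)

# The SELECT statement states what to retrieve; the database decides how.
rows = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Lisbon",)
).fetchall()
lisbon_names = [name for (name,) in rows]
print(lisbon_names)
conn.close()
```

The query names the column, the table, and a filter condition, and nothing about how the rows are stored or scanned.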


Quiz 2: Graded Quiz Answers


Q1. What does a typical data wrangling workflow include? 
  • Using mathematical techniques to identify correlations in data 
  • Recognizing patterns 
  • Validating the quality of the transformed data 
  • Predicting probabilities 
Q2. OpenRefine is an open-source tool that allows you to:
  • Transform data into a variety of formats such as TSV, CSV, XLS, XML, and JSON 
  • Enforce applicable data governance policies automatically 
  • Use add-ins such as Microsoft Power Query to identify issues and clean data 
  • Automatically detect schemas, data types, and anomalies 
Q3. What is one of the steps in a typical data cleaning workflow?
  • Building classification models  
  • Inspecting data to detect issues and errors 
  • Establishing relationships between data events  
  • Clustering data 
Q4. When you’re combining rows of data from multiple source tables into a single table, what kind of data transformation are you performing?
  • Unions 
  • Joins 
  • Normalization 
  • Denormalization 
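
The difference between a union and a join can be sketched with plain Python lists of dictionaries. This is an illustration only: the table names, columns, and values below are hypothetical.

```python
# Two hypothetical source tables that share the same columns.
sales_q1 = [
    {"order_id": 1, "customer_id": 10, "amount": 25.0},
    {"order_id": 2, "customer_id": 11, "amount": 40.0},
]
sales_q2 = [
    {"order_id": 3, "customer_id": 10, "amount": 15.0},
]

# Union: stack the rows of both tables into a single table.
all_sales = sales_q1 + sales_q2

# Join: add columns from another table by matching rows on a key.
customers = {10: "Ana", 11: "Ben"}  # hypothetical lookup table
joined = [
    {**row, "customer_name": customers[row["customer_id"]]}
    for row in all_sales
]

print(len(all_sales), "rows after the union")
print(sorted(joined[0]), "columns after the join")
```

A union changes the number of rows and keeps the columns; a join keeps the rows and widens them with columns from the other table.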

Q5. When you detect a value in your data set that is vastly different from other observations in the same data set, what would you report that as?
  • Outlier 
  • Missing value 
  • Irrelevant data 
  • Syntax error 
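
One common way to flag values that are "vastly different" is the interquartile range (IQR) rule. The course does not prescribe a method, so treat the sketch below as one possible approach; the sample values and the conventional 1.5 multiplier are assumptions.

```python
from statistics import quantiles

# Hypothetical observations; 250 is deliberately far from the rest.
values = [12, 14, 15, 15, 16, 18, 19, 250]

# Quartiles split the sorted data into four parts.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1

# Conventional fences: 1.5 * IQR below Q1 and above Q3 (an assumed threshold).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [v for v in values if v < lower_fence or v > upper_fence]
print(outliers)
```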

Introduction to Data Analytics Week 04 Quiz Answers

Quiz 1: Graded Quiz Answers

Q1. What is a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of numerical or quantitative data?

  • Algebra
  • Calculus
  • Statistics
  • Pie

Q2. Data Mining is defined as the process of:

  • Preparing raw data for analysis
  • Filtering data based on pre-defined criteria
  • Identifying errors in data
  • Extracting knowledge from data

Q3. What type of data mining operations was R specifically built to handle?

  • Classification of data 
  • Filtering
  • Calculating mean, median, and mode
  • Sorting

Q4. When you’re calculating the middle value of a data field in a data set, what are you really calculating?

  • Mean
  • Average
  • Mode
  • Median
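
The measures named in these options can be compared with Python's built-in statistics module. This is a small sketch with made-up values, chosen so that one large value pulls the mean away from the middle of the data.

```python
from statistics import mean, median, mode

# Hypothetical data field; 40 is much larger than the other values.
values = [3, 5, 5, 8, 40]

middle_value = median(values)   # middle value of the sorted data
average_value = mean(values)    # sum of the values divided by their count
most_common = mode(values)      # value that occurs most often

print(middle_value, average_value, most_common)
```

The median stays at the middle of the sorted values, while the mean shifts toward the single large value.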

Q5. What is the general tendency of a set of data to change over time called?

  • Anomaly
  • Trend
  • Variation
  • Pattern

Quiz 2: Graded Quiz Answers


Q1. “A presentation is not a data dump”. What is the one thing you would do to ensure your presentation is not a data dump?
  • Deliver the findings in a single slide
  • Not include facts and figures in the presentation
  • Not use visuals in the presentation
  • Include only the information needed to address the business problem
Q2. What is the discipline of communicating information through the use of visual elements?  
  •  Data regression 
  •  Data profiling   
  •  Data visualization 
  •  Data type conversion 
Q3. Matplotlib is a widely used Python data visualization library.
  • True
  • False
Q4. What is the goal of Data Visualization? 
  • Make collaboration easy 
  • Make information easy to comprehend, interpret, and retain 
  • Establish trust in the audience 
  • Make the presentation look attractive 
Q5. What can you do to help your audience trust you?
  • Hand them copies of the data sets you have used for analysis 
  • Make your presentation look good 
  • Share your data sources, hypotheses, and validations 
  • Share the detailed documentation of every aspect of your project so they can verify all details 

Introduction to Data Analytics Week 05 Quiz Answers

Quiz: Graded Quiz Answers


Q1. Which of the following statements describes Data Analyst Specialist Roles?
  • Analysts who advance technical, statistical, and analytical skills, over time, to expert levels 
  • Analysts who can work with Machine and Deep Learning models 
  • Analysts who specialize in specific fields like HR, Sales, and Finance
  • Analysts who specialize in data lakes and data repositories 
Q2. A Principal Data Analyst is responsible for: 
  • Being a domain specialist 
  • Being well-versed in Big Data processing tools 
  • Having expertise in all tools and technologies used in data analytics   
  • Establishing processes in the team  
Q3. Job roles such as Project Managers, Marketing Managers, and HR Managers, can achieve greater efficiency and effectiveness in their current roles by acquiring data analysis skills, and are therefore known as analytics-enabled job roles.
  • True
  • False
Q4. Which of these is essential for getting started and growing as a Data Analyst?
  • Domain specialization
  • A degree in Computer Science 
  • Love for numbers, a curious mind, and openness to learn 
  • A degree in Statistics 
Q5. What Data Analysis roles may be best suited for people with little or no technical training? 
  • Functional Analyst  
  • Data Scientist   
  • Big Data Engineer 
  • Data Analyst   

Final Assignment


Project Title: Detecting credit card fraud using data analysis


List at least 5 (five) data points that are required for the analysis and detection of credit card fraud. (3 marks) 

IP Address; User ID; Shipping Address; Transaction Value; Units Purchased

Refer to the data table below and identify 3 (three) errors/issues that could impact the accuracy of your findings. (3 marks)

1. The IP Address of johnp in row 5 (not including the first row) is missing. 

2. The IP Address of davidg in row 9 (not including the first row) is missing. 

3. The Age data of ellend is missing. 

4. The Transaction Value of johnp in row 3 (not including the first row) is missing. 

5. The Transaction Date values should be standardized to a single format.
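
Checks like the ones listed above can be automated. The sketch below is illustrative only: the assignment's data table is not reproduced in this post, so the rows, column names, and date formats are hypothetical, and the dates are assumed to be day-first. Month names are parsed with `%B`, which assumes an English locale.

```python
from datetime import datetime

# Hypothetical rows standing in for the assignment's data table.
rows = [
    {"user_id": "johnp", "ip_address": "10.0.0.1",
     "transaction_value": "120.50", "transaction_date": "3-6-20"},
    {"user_id": "johnp", "ip_address": "",
     "transaction_value": "", "transaction_date": "02 July 2020"},
]

# Assumed input formats (day first); extend this tuple for other layouts.
DATE_FORMATS = ("%d-%m-%y", "%d %B %Y")


def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def missing_fields(row):
    """Return the names of the columns that are empty in this row."""
    return [name for name, value in row.items() if not value.strip()]


for number, row in enumerate(rows, start=1):
    print(number, missing_fields(row), standardize_date(row["transaction_date"]))
```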


Refer to the data table below and identify 2 (two) anomalies or unexpected behaviors that would lead you to believe the transaction may be suspect. (2 marks)


1. John appears to have used the credit card for large transaction values and large unit counts since 3-6-20, and the shipping address changed as well. 

2. The IP Address and Shipping Address of ellend changed on 02 July 2020 for an expensive laptop.
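
The two signals used in these answers, a changed shipping address and an unusually large transaction value, can be combined into a simple rule. This is a rough sketch and not a real fraud-detection method: the transactions, the `flag_suspect` helper, and the `value_factor` threshold are all hypothetical.

```python
from statistics import mean

# Hypothetical transactions, listed in chronological order.
transactions = [
    {"user_id": "ellend", "shipping_address": "A St", "value": 30.0},
    {"user_id": "ellend", "shipping_address": "A St", "value": 45.0},
    {"user_id": "ellend", "shipping_address": "B Ave", "value": 1500.0},
]


def flag_suspect(transactions, value_factor=5.0):
    """Return indexes of transactions where the shipping address changed
    and the value exceeds value_factor times the user's earlier average."""
    history = {}
    flagged = []
    for index, tx in enumerate(transactions):
        past = history.setdefault(tx["user_id"], [])
        if past:
            address_changed = tx["shipping_address"] != past[-1]["shipping_address"]
            earlier_average = mean([p["value"] for p in past])
            value_spike = tx["value"] > value_factor * earlier_average
            if address_changed and value_spike:
                flagged.append(index)
        past.append(tx)
    return flagged


suspects = flag_suspect(transactions)
print(suspects)
```

Requiring both signals keeps the rule conservative; a real analysis would also weigh factors such as IP address changes and units purchased.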


Briefly explain your key take-away from the provided data visualization chart. (1 mark)


Based on the trend of transaction value per transaction, john and ellend look suspicious because their transaction values changed sharply after Transaction 2.


Identify the type of analysis that you are performing when you are analyzing historical credit card data to understand what a fraudulent transaction looks like.

Hint: The four types of Analytics include: Descriptive, Diagnostic, Predictive, Prescriptive. (1 mark) 


Descriptive Analytics: by examining past transactions, including summary statistics such as the mean and mode of the data points, we can find anomalies.
