Introduction to Data Analytics Week 01 Quiz Answers
Quiz 1: Graded Quiz Answers
Q1. A modern data ecosystem includes a network of continually evolving entities. It includes:
- Data sources, enterprise data repository, business stakeholders, and tools, applications, and infrastructure to manage data
- Social media sources, data repositories, and APIs
- Data sources, databases, and programming languages
- Data providers, databases, and programming languages
Q2. Data Analysts work within the data ecosystem to:
- Build Machine Learning or Deep Learning models
- Develop and maintain data architectures
- Provide business intelligence solutions by monitoring data on different business functions
- Gather, clean, mine, and analyze data for deriving insights
Q3. When we analyze data in order to understand why an event took place, which of the four types of data analytics are we performing?
- Prescriptive Analysis
- Descriptive Analysis
- Predictive Analysis
- Diagnostic Analysis
Q4. The first step in the data analysis process is to gain an in-depth understanding of the problem and the desired outcome. What are you seeking answers to at this stage of the data analysis process?
- Where you are and where you need to be
- The best tools for sourcing data
- What will be measured and how it will be measured
- The data you need
- Cloud Computing, Internet of Things, and Dashboarding
- Machine Language, Cloud Computing, and Internet of Things
- Big Data, Internet of Things, and Dashboarding
- Cloud Computing, Machine Learning, and Big Data
Quiz 2: Graded Quiz Answers
- For creating project documentation
- For identifying patterns and correlations in data
- For acquiring data from multiple sources
- For creating queries to extract required data
- Prepare reports and dashboards
- Integrate data coming from multiple sources
- Filter, clean, and standardize data
- Work collaboratively with cross-functional teams
- Probing skills
- Problem-solving skills
- Analytical skills
- Proficiency in Statistics
- Generating hypotheses
- Interacting with stakeholders
- Cleaning and preparing data
- Creating a report
- Average billing amount of complainants
- Age and education details of complainants
- Employment history of the complainants
- Serial number of the meters
Introduction to Data Analytics Week 02 Quiz Answers
Quiz 1: Graded Quiz Answers
Q1. In the data analyst’s ecosystem, languages are classified by type. What are shell and scripting languages most commonly used for?
- Automating repetitive operational tasks
- Manipulating data
- Querying data
- Building apps
Q2. Which of the following is an example of unstructured data?
- XML
- Spreadsheets
- Video and audio files
- Zipped files
Q3. Which one of these file formats is independent of software, hardware, and operating systems, and can be viewed the same way on any device?
- Delimited text file
- XLSX
- XML
Q4. Which data source can return data in plain text, XML, HTML, or JSON among others?
- XML
- Delimited text file
- API
Q5. According to the video “Languages for Data Professionals,” which of the programming languages supports multiple programming paradigms, such as object-oriented, imperative, functional, and procedural, making it suitable for a wide variety of use cases?
- PowerShell
- Python
- Java
- Unix/Linux Shell
Quiz 2: Graded Quiz Answers
- SQL
- ETL
- Data Lake
- NoSQL
- Requires source and destination tables to be identical for migrating data
- Enforces a limit on the length of data fields
- Can store only structured data
- Is ACID-Compliant
- Document-based
- Graph-based
- Column-based
- Key value store
- Data Lakes
- Data Warehouses
- Relational Databases
- Data Marts
- Scale of data
- Diversity of the type and sources of data
- Accuracy and conformity of data to facts
- The speed at which data accumulates
- Scalable and reliable Big Data storage
- Fast recovery from hardware failures
- Consolidate data across the organization
- Perform complex analytics in real-time
Introduction to Data Analytics Week 03 Quiz Answers
Quiz 1: Graded Quiz Answers
Q1. What are some of the steps in the process of “Identifying Data”? (Select all that apply)
- Define the checkpoints
- Define a plan for collecting data
- Determine the visualization tools that you will use
- Determine the information you want to collect
Q2. What type of data refers to information obtained directly from the source?
- Secondary data
- Primary data
- Sensor data
- Third-party data
Q3. Web scraping is used to extract what type of data?
- Data from news sites and NoSQL databases
- Text, videos, and images
- Text, videos, and data from relational databases
- Images, videos, and data from NoSQL databases
Q4. Data obtained from an organization’s internal CRM, HR, and workflow applications is classified as:
- Secondary data
- Third-party data
- Primary data
Q5. Which of the provided options offers simple commands to specify what is to be retrieved from a relational database?
- RSS Feed
- API
- SQL
- Web Scraping
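As a minimal sketch of how a SQL command specifies what to retrieve, the following uses Python's built-in `sqlite3` module with an in-memory database. The `customers` table, its columns, and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, made up for this example.
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Lisbon"), ("Ben", "Oslo"), ("Cy", "Lisbon")],
)

# SELECT names the columns to return, WHERE filters the rows,
# and ORDER BY fixes the order of the result.
rows = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name",
    ("Lisbon",),
).fetchall()
conn.close()

print(rows)
```

The query states only what is wanted (names of customers in one city); the database decides how to find it.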
Quiz 2: Graded Quiz Answers
- Using mathematical techniques to identify correlations in data
- Recognizing patterns
- Validating the quality of the transformed data
- Predicting probabilities
- Transform data into a variety of formats such as TSV, CSV, XLS, XML, and JSON
- Enforces applicable data governance policies automatically
- Use add-ins such as Microsoft Power Query to identify issues and clean data
- Automatically detect schemas, data types, and anomalies
- Building classification models
- Inspecting data to detect issues and errors
- Establishing relationships between data events
- Clustering data
- Unions
- Joins
- Normalization
- Denormalization
- Outlier
- Missing value
- Irrelevant data
- Syntax error
Introduction to Data Analytics Week 04 Quiz Answers
Quiz 1: Graded Quiz Answers
Q1. What is a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of numerical or quantitative data?
- Algebra
- Calculus
- Statistics
- Pie
Q2. Data Mining is defined as the process of:
- Preparing raw data for analysis
- Filtering data based on pre-defined criteria
- Identifying errors in data
- Extracting knowledge from data
Q3. What type of data mining operations was R specifically built to handle?
- Classification of data
- Filtering
- Calculating mean, median, and mode
- Sorting
Q4. When you’re calculating the middle value of a data field in a data set, what are you really calculating?
- Mean
- Average
- Mode
- Median
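As a small illustration of how these measures differ, the sketch below uses Python's standard `statistics` module. The sample values are made up for this example and do not come from the course.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [3, 7, 7, 10, 48]

mean_value = statistics.mean(values)      # sum of the values divided by their count
median_value = statistics.median(values)  # middle value once the data is sorted
mode_value = statistics.mode(values)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

Here the single large value pulls the mean upward, while the median stays at the middle of the sorted list.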
Q5. What is the general tendency of a set of data to change over time called?
- Anomaly
- Trend
- Variation
- Pattern
Quiz 2: Graded Quiz Answers
- Deliver the findings in a single slide
- Not include facts and figures in the presentation
- Not use visuals in the presentation
- Include only that information as is needed to address the business problem
- Data regression
- Data profiling
- Data visualization
- Data type conversion
- True
- False
- Make collaboration easy
- Make information easy to comprehend, interpret, and retain
- Establish trust in the audience
- Make the presentation look attractive
- Hand them copies of the data sets you have used for analysis
- Make your presentation look good
- Share your data sources, hypotheses, and validations
- Share the detailed documentation of every aspect of your project so they can verify all details
Introduction to Data Analytics Week 05 Quiz Answers
Quiz: Graded Quiz Answers
- Analysts who advance technical, statistical, and analytical skills, over time, to expert levels
- Analysts who can work with Machine and Deep Learning models
- Analysts who specialize in specific fields like HR, Sales, and Finance
- Analysts who specialize in data lakes and data repositories
- Being a domain specialist
- Being well-versed in Big Data processing tools
- Having expertise in all tools and technologies used in data analytics
- Establishing processes in the team
- True
- False
- Domain specialization
- A degree in Computer Science
- Love for numbers, a curious mind, and openness to learn
- A degree in Statistics
- Functional Analyst
- Data Scientist
- Big Data Engineer
- Data Analyst
Final Assignment
IP Address; User ID; Shipping Address; Transaction Value; Units Purchased;
Refer to the data table below and identify 3 (three) errors/issues that could impact the accuracy of your findings. (3 marks)
1. The IP Address of johnp in row 5 (not including the first row) is missing.
2. The IP Address of davidg in row 9 (not including the first row) is missing.
3. The Age data of ellend is missing.
4. The Transaction Value of johnp in row 3 (not including the first row) is missing.
5. The Transaction Date format is inconsistent and should be standardized.
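Checks like the ones listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are hypothetical (the assignment's data table is not reproduced here), the helper names `standardize_date` and `missing_fields` are invented for this example, and the two date formats are guesses. In particular, reading 3-6-20 as day-month-year is an assumption.

```python
from datetime import datetime

# Hypothetical rows; the actual assignment table is not reproduced here.
rows = [
    {"User ID": "johnp", "IP Address": "",
     "Transaction Value": "120", "Transaction Date": "3-6-20"},
    {"User ID": "ellend", "IP Address": "10.0.0.7",
     "Transaction Value": "", "Transaction Date": "02 July 2020"},
]

# Assumed input formats. "%d-%m-%y" reads 3-6-20 as 3 June 2020 (a guess).
DATE_FORMATS = ("%d-%m-%y", "%d %B %Y")


def standardize_date(text):
    """Return the date as YYYY-MM-DD, or None if no assumed format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def missing_fields(row):
    """List the column names whose value is empty or only whitespace."""
    return [name for name, value in row.items() if not value.strip()]


# One entry per row: (user, missing columns, standardized date).
report = [
    (row["User ID"], missing_fields(row), standardize_date(row["Transaction Date"]))
    for row in rows
]

for entry in report:
    print(entry)
```

Returning `None` for an unrecognized date, rather than guessing, keeps ambiguous values visible for manual review.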
Refer to the data table below and identify 2 (two) anomalies or unexpected behaviors, that would lead you to believe the transaction may be suspect. (2 marks)
1. John appears to have used the credit card for large transaction values and large unit counts since 3-6-20, and the shipping address changed as well.
2. The IP Address and Shipping Address of ellend changed on 02 July 2020 for the purchase of an expensive laptop.
Briefly explain your key take-away from the provided data visualization chart. (1 mark)
According to the trend in transaction value per transaction, john and ellend are suspect because their values changed sharply after Transaction 2.
Identify the type of analysis that you are performing when you are analyzing historical credit card data to understand what a fraudulent transaction looks like.
Hint: The four types of Analytics include: Descriptive, Diagnostic, Predictive, Prescriptive. (1 mark)
Descriptive Analytics: By looking at what happened in past transactions, including the mean and mode of the data points, we can find anomalies.
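One way to picture this kind of descriptive summary is the sketch below. It is an illustration only, not the course's method: the transaction values are made up, and the rule of flagging anything above three times the median is an arbitrary assumption chosen for the example.

```python
import statistics

# Hypothetical past transaction values for one user (not from the assignment).
past_values = [40, 55, 35, 50, 900]

# Summary statistics that describe what has happened so far.
typical = statistics.median(past_values)
average = statistics.mean(past_values)

# Assumed rule for this sketch: flag values above three times the median.
THRESHOLD_FACTOR = 3
flagged = [value for value in past_values if value > THRESHOLD_FACTOR * typical]

print(typical, average, flagged)
```

The median is used as the baseline here because a single extreme transaction shifts it far less than it shifts the mean.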