
DS 804: Final Exam

In this exam you will scrape data using R, clean it, export it to Excel, analyze it, and then visualize it using Tableau. Please read through the entire exam before you begin so that you can plan your time wisely. This is an open book exam, so feel free to use any class notes, slides and online resources. The only restriction is that you are not permitted to solicit help from, or give help to, other individuals.

 

  1. Using R for data extraction, cleansing and visualization
    1. Use R to collect/scrape all the data from the following website:

http://myisba.org/ds804/index.html

 

*Note: If you are unable to scrape the data, you may download the CSV file by clicking here and import it into R. Please note that if you opt to import rather than scrape the data, you will forfeit 5 points.
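One possible way to approach the scraping step is sketched below with the rvest package. It assumes the data on the page sits in an ordinary HTML `<table>`, which the exam text does not confirm, and `scrape_first_table` is a hypothetical helper name introduced only for this illustration.

```r
library(rvest)

# Hypothetical helper: returns the first <table> found in a page as a data frame.
# Assumption: the breach data is published as a plain HTML table.
scrape_first_table <- function(page_source) {
  page <- read_html(page_source)
  table_node <- html_element(page, "table")
  if (inherits(table_node, "xml_missing")) {
    stop("No <table> element found; inspect the page and adjust the selector.")
  }
  # html_table() uses the header row for column names and keeps them unchanged
  as.data.frame(html_table(table_node))
}

# Example call (requires network access):
# breaches <- scrape_first_table("http://myisba.org/ds804/index.html")
# str(breaches)
```

If the page turns out to use a different layout (for example several tables, or data rendered by JavaScript), the selector passed to `html_element()` would need to change accordingly.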

 

    2. Using the rename() function from dplyr, change the name of the column titled "Year of Breach" to Year.
    3. Using the rename() function from dplyr, change the name of the column titled "Description of incident" to Description.
    4. Use ggplot2 to create a chart that shows the number of breaches that occurred each year. Your plot should look like this, including the heading and axis labels.
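The renaming and charting steps above could be sketched as follows. The data frame here is a small made-up stand-in for the scraped data, and the chart title and axis labels are placeholders, because the reference plot is not reproduced in this text.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for the scraped data; the values are invented for illustration.
breaches <- data.frame(
  "Year of Breach" = c(2015, 2015, 2016, 2017, 2017, 2017),
  "Description of incident" = c("a", "b", "c", "d", "e", "f"),
  check.names = FALSE  # keep the column names with spaces as they are
)

# rename() takes new_name = old_name; backticks quote names containing spaces.
breaches <- breaches %>% rename(Year = `Year of Breach`)
breaches <- breaches %>% rename(Description = `Description of incident`)

# geom_bar() counts rows per x value, giving the number of breaches per year.
breaches_per_year <- ggplot(breaches, aes(x = factor(Year))) +
  geom_bar() +
  labs(
    title = "Placeholder title",  # assumption: replace with the required heading
    x = "Year",
    y = "Number of breaches"
  )

print(breaches_per_year)
```

Wrapping `Year` in `factor()` treats each year as a discrete category, so every year gets its own bar and axis label.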

 

    5. In R, generate a wordcloud from the "Description of incident" column. Make sure that you remove all stop words, the word "information", and the following character: â. Your wordcloud should look similar to this one:
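A minimal sketch of the wordcloud step, assuming the tm and wordcloud packages are acceptable for the exam (the text does not name a package). The `descriptions` vector is invented sample text standing in for the description column.

```r
library(tm)
library(wordcloud)

# Invented sample text standing in for the description column.
descriptions <- c(
  "Hacked server exposed customer information â",
  "Stolen laptop contained patient information",
  "Hacked database exposed customer records"
)

# Strip the stray character before building the corpus.
clean_text <- gsub("â", "", descriptions, fixed = TRUE)

corpus <- VCorpus(VectorSource(clean_text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
# Drop English stop words plus the extra word named in the task.
corpus <- tm_map(corpus, removeWords, c(stopwords("english"), "information"))
corpus <- tm_map(corpus, stripWhitespace)

# Word frequencies across all descriptions, most frequent first.
tdm <- TermDocumentMatrix(corpus)
freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)

set.seed(804)  # fixes the word placement so the plot is reproducible
wordcloud(words = names(freq), freq = freq, min.freq = 1,
          max.words = 100, random.order = FALSE)
```

Lowercasing before `removeWords` matters, because the stop word list and the word "information" are matched in lowercase.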

 


 

  2. Using MS Excel for data cleansing and analysis
    1. Import the file titled breaches. that you saved in 1.6 into Excel
    2. Save the file as breaches.xlsx (please note that you will submit this file to Canvas once you complete all the steps below).
    3. Remove all the NAs from the Total_Records column.
    4. Delete all non-US data, i.e. data from Beijing, Berlin, British Columbia, Buckinghamshire, Cheshire, Dublin, Grand Bahama, Guangdong, London, Noord Holland, Tokyo, Quebec, Ontario.
    5. For each record, add a column that shows the state population. Hint: see column AF in the following worksheet: https://myisba.org/ds804/breaches.xlsx. Note that I have also included the population data for each state under the tab population_data. Feel free to use it.
    6. Create pivot tables to answer the following question:

2.6.1.  What is the proportion of breaches for different organization types over the years? Format any values above 2% in red. Your solution should resemble the following:
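The exam asks for this work to be done in Excel, but the same cleaning steps and the pivot table can be cross-checked in R. The sketch below uses base R on a made-up data frame. Only `Total_Records` is a column name taken from the exam; `Year`, `Organization_Type`, `State` and `Population` are hypothetical names, the population figures are placeholders rather than real data, and it assumes that "remove the NAs" means dropping those rows and that proportions are taken over the grand total.

```r
# Made-up records; only Total_Records is a column name from the exam text.
breaches <- data.frame(
  Year = c(2015, 2015, 2016, 2016, 2016, 2016),
  Organization_Type = c("MED", "BSF", "MED", "EDU", "MED", "BSF"),
  State = c("Maine", "Vermont", "London", "Maine", "Vermont", "Maine"),
  Total_Records = c(100, NA, 500, 250, 75, 300)
)

# Placeholder population values, not real figures.
population <- data.frame(State = c("Maine", "Vermont"), Population = c(1, 2))

# Step 2.3: drop rows where Total_Records is NA (assumed interpretation).
breaches <- breaches[!is.na(breaches$Total_Records), ]

# Step 2.4: drop the non-US locations listed in the exam.
non_us <- c("Beijing", "Berlin", "British Columbia", "Buckinghamshire",
            "Cheshire", "Dublin", "Grand Bahama", "Guangdong", "London",
            "Noord Holland", "Tokyo", "Quebec", "Ontario")
breaches <- breaches[!(breaches$State %in% non_us), ]

# Step 2.5: attach the state population, similar to a VLOOKUP in Excel.
breaches <- merge(breaches, population, by = "State", all.x = TRUE)

# Step 2.6.1: share of breaches by organization type and year.
counts <- table(breaches$Organization_Type, breaches$Year)
shares <- prop.table(counts)  # each cell divided by the grand total
print(round(100 * shares, 1))

# Cells above the 2% threshold that the exam asks to format in red.
print(which(shares > 0.02, arr.ind = TRUE))
```

If the intended pivot table shows percentages within each year instead, `prop.table(counts, margin = 2)` would give column-wise shares.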

 

 

 

 

 

 

 

  3. Building a Dashboard using Tableau

    Use the data from your Excel file to create a dashboard. The primary audience of your dashboard is C-level business leaders in New England who are primarily interested in knowing about breaches in the following states: Maine, New Hampshire, Vermont, Massachusetts, Rhode Island, and Connecticut. The dashboard should have at least 4 charts and allow the user to filter data across the dashboard using one of the charts. At least one chart must be a donut chart, and one other chart must show the weekly breach trend over time (including a trend line). Finally, your dashboard must adhere to the Best Practices for Building Effective Dashboards that were discussed in the Lecture 7 video assigned in week 7.

    1. What interesting insights can be gleaned from the charts that you have generated?
    2. Publish your dashboard on Tableau Public

 

Deliverables

1. An MS Word document that includes the following:

        1. The R code that you generated to address questions 1.1-1.6
        2. Your answer to question 3.1
        3. A link to your Tableau dashboard

2. The cleansed MS Excel file with your pivot table

 
