DS 804: Final Exam

Computer Science Dec 07, 2021

In this exam you will scrape data using R, clean it, export it to Excel, analyze it, and then visualize it using Tableau. Please read through the entire exam before you begin so that you can plan your time wisely. This is an open-book exam, so feel free to use any class notes, slides, and online resources. The only thing you are not permitted to do is solicit help from, or give help to, other individuals.

 

  1. Using R for data extraction, cleansing and visualization
    1.1. Use R to collect/scrape all the data from the following website: http://myisba.org/ds804/index.html

 

*Note: If you are unable to scrape the data, you may instead download the CSV file by clicking here and import it into R. If you opt to import rather than scrape the data, you will forfeit 5 points.

 

    1.2. Using the rename() function from dplyr, change the name of the column titled "Year of Breach" to Year.
    1.3. Using the rename() function from dplyr, change the name of the column titled "Description of incident" to Description.
    1.4. Use ggplot2 to create a chart that shows the number of breaches that occurred each year. Your plot should match the reference chart provided with the exam, including the heading and axis labels.

 

    1.5. In R, generate a wordcloud from the "Description of incident" column. Make sure that you remove all stop words, the word "information", and the character â. Your wordcloud should look similar to the reference image provided with the exam.

 


 

  2. Using MS Excel for data cleansing and analysis
    2.1. Import the file titled breaches. that you saved in 1.6 into Excel.
    2.2. Save the file as breaches.xlsx. (Please note that you will submit this file to Canvas once you complete all the steps below.)
    2.3. Remove all the NA values from the Total_Records column.
    2.4. Delete all non-US data, i.e., data from Beijing, Berlin, British Columbia, Buckinghamshire, Cheshire, Dublin, Grand Bahama, Guangdong, London, Noord Holland, Tokyo, Quebec, and Ontario.
    2.5. For each record, add a column that shows the state population. Hint: See column AF in the following worksheet: https://myisba.org/ds804/breaches.xlsx. Note that I have also included the population data for each state under the tab population_data. Feel free to use it.
    2.6. Create pivot tables to answer the following question:

2.6.1. What is the proportion of breaches for different organization types over the years? Format any values above 2% in red. Your solution should resemble the reference pivot table provided with the exam.
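Independently of the expected Excel layout, the arithmetic behind such a pivot table can be sketched in a few lines. The snippet below is a minimal Python illustration, not the Excel method the exam requires. It assumes that "proportion" means each (year, organization type) cell's share of all breaches; the sample rows, the organization type labels, and the function names are hypothetical and not taken from the exam data.

```python
from collections import Counter

# Hypothetical sample rows: (year, organization type). Not the exam data.
rows = [
    (2019, "MED"), (2019, "MED"), (2019, "BSF"),
    (2020, "MED"), (2020, "EDU"),
    (2021, "BSF"), (2021, "BSF"), (2021, "EDU"),
]


def breach_proportions(records):
    """Return {(year, org_type): share of all breaches} for the given records."""
    counts = Counter(records)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def flag_above(proportions, threshold=0.02):
    """Return the sorted keys whose share exceeds the threshold (cells to format in red)."""
    return sorted(key for key, share in proportions.items() if share > threshold)


proportions = breach_proportions(rows)
for (year, org), share in sorted(proportions.items()):
    print(f"{year} {org}: {share:.1%}")
```

If the intended reading is instead the share within each year, the denominator would be the per-year count rather than the grand total; the flagging step stays the same.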

 

 

 

 

 

 

 

  3. Building a Dashboard using Tableau

    Use the data from your Excel file to create a dashboard. The primary audience of your dashboard is C-level business leaders in New England who want to know about breaches in the following states: Maine, New Hampshire, Vermont, Massachusetts, Rhode Island, and Connecticut. The dashboard should have at least 4 charts and allow the user to filter data across the dashboard using one of the charts. At least one chart must be a donut chart, and one other chart must show the weekly breach trend over time (including a trend line). Finally, your dashboard must adhere to the Best Practices for Building Effective Dashboards discussed in the Lecture 7 video assigned in week 7.

    3.1. What interesting insights can be gleaned from the charts that you have generated?
    3.2. Publish your dashboard on Tableau Public.

 

Deliverables

1. An MS Word document that includes the following:

        1. The R code that you generated to address questions 1.1-1.6
        2. Your answer to question 3.1
        3. A link to your Tableau dashboard

2. The cleansed MS Excel file with your pivot table

 


 

Expert Solution

Please download the final files as a zip archive using this link:

https://drive.google.com/file/d/1nAK29mO8Wnr4yLVjhPERFIrYPzD4WxiF/view?usp=sharing
