CS 418 Project 2: (only jupyter notebook needed)
This project will have you doing exploratory data analysis in IPython on a real-world dataset. The goal is to become fluent with the standard tools and techniques of exploratory data analysis by working with a dataset you already have some basic familiarity with.
This project is based on the publicly available Chicago Crash Data. You should explore the data and uncover interesting observations. You will need to submit all your results in three different formats (.ipynb, .pdf and .py). Document your code with proper comments and include the exact sequence of operations needed to produce the resulting tables and figures.
Data set
You are to download the Chicago Crash Data and perform various EDA tasks on it. You can download the data by accessing the following link and downloading the CSV file from Box.
The original data set is available on Chicago Open Data, but it includes additional attributes and fields that were removed from the version available on Box. Here’s the data set description from the original website:
Crash data shows information about each traffic crash on city streets within the City of Chicago limits and under the jurisdiction of Chicago Police Department (CPD). Data are shown as is from the electronic crash reporting system (E-Crash) at CPD, excluding any personally identifiable information. Records are added to the data portal when a crash report is finalized or when amendments are made to an existing report in E-Crash. Data from E-Crash are available for some police districts in 2015, but citywide data are not available until September 2017. About half of all crash reports, mostly minor crashes, are self-reported at the police district by the driver(s) involved and the other half are recorded at the scene by the police officer responding to the crash. Many of the crash parameters, including street condition data, weather condition, and posted speed limits, are recorded by the reporting officer based on best available information at the time, but many of these may disagree with posted information or other assessments on road conditions. If any new or updated information on a crash is received, the reporting officer may amend the crash report at a later time. A traffic crash within the city limits for which CPD is not the responding police agency, typically crashes on interstate highways, freeway ramps, and on local roads along the City boundary, are excluded from this dataset. As per Illinois statute, only crashes with a property damage value of $1,500 or more or involving bodily injury to any person(s) and that happen on a public roadway and that involve at least one moving vehicle, except bike dooring, are considered reportable crashes. However, CPD records every reported traffic crash event, regardless of the statute of limitations, and hence any formal Chicago crash dataset released by Illinois Department of Transportation may not include all the crashes listed here.
This is a large dataset with many fields. Here is a list of all attributes included:
| Column Name | Description | Type |
|---|---|---|
| CRASH-RECORD-ID | Unique identifier for the record | Number |
| CRASH-DATE | Date and time of crash as entered by the officer | Date & Time |
| POSTED-SPEED-LIMIT | Posted speed limit | Number |
| TRAFFIC-CONTROL-DEVICE | Traffic control device present at crash location | Plain Text |
| WEATHER-CONDITION | Weather condition at time of crash | Plain Text |
| LIGHTING-CONDITION | Light condition at time of crash | Plain Text |
| FIRST-CRASH-TYPE | Type of first collision in crash | Plain Text |
| TRAFFIC WAY-TYPE | Traffic way type | Plain Text |
| ROADWAY-SURFACE-COND | Road surface condition | Plain Text |
| ROAD-DEFECT | Road defects | Plain Text |
| CRASH-TYPE | A general severity classification for the crash. Can be either Injury and/or Tow Due to Crash or No Injury / Drive Away | Plain Text |
| INTERSECTION-RELATED | A field observation by the police officer whether an intersection played a role in the crash. Does not represent whether or not the crash occurred within the intersection. | Plain Text |
| NOT-RIGHT-OF-WAY | Whether the crash begun or first contact was made outside of the public right-of-way | Plain Text |
| HIT-AND-RUN | Crash did/did not involve a driver who caused the crash and fled the scene without exchanging information and/or rendering aid | Number |
| DAMAGE | A field observation of estimated damage. | Plain Text |
| DATE-POLICE-NOTIFIED | Calendar date on which police were notified of the crash | Date & Time |
| PRIM-CONTRIBUTORY-CAUSE | | Number |
| NUM-UNITS | | Number |
| INJURIES-TOTAL | Total persons sustaining fatal, incapacitating, nonincapacitating, and possible injuries | Number |
| INJURIES-FATAL | Total persons sustaining fatal injuries in the crash | Number |
| INJURIES-INCAPACITATING | Total persons sustaining incapacitating/serious injuries in the crash | Number |
| INJURIES-NON-INCAPACITATING | Total persons sustaining non-incapacitating injuries in the crash | Number |
| INJURIES-NO-INDICATION | Total persons sustaining no injuries in the crash | Number |
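As a starting point, the attribute list above can be turned into a typed pandas DataFrame. The following is only a minimal sketch: the sample rows are made up for illustration, the hyphenated column names follow the table (the actual CSV headers may differ), the timestamp format is an assumption, and `load_crashes` is a hypothetical helper name.

```python
import io

import pandas as pd

# Made-up sample rows for illustration only; the real file has more columns.
SAMPLE_CSV = """CRASH-RECORD-ID,CRASH-DATE,POSTED-SPEED-LIMIT,WEATHER-CONDITION,HIT-AND-RUN,INJURIES-TOTAL,INJURIES-FATAL
1,01/15/2019 08:30:00 AM,30,CLEAR,0,0,0
2,07/04/2020 11:45:00 PM,35,RAIN,1,2,1
3,12/24/2021 05:10:00 PM,30,SNOW,0,,0
"""


def load_crashes(source):
    """Read crash data and parse CRASH-DATE into a datetime column."""
    df = pd.read_csv(source)
    # Assumed timestamp layout; adjust the format string to match the real file.
    df["CRASH-DATE"] = pd.to_datetime(
        df["CRASH-DATE"], format="%m/%d/%Y %I:%M:%S %p"
    )
    return df


crashes = load_crashes(io.StringIO(SAMPLE_CSV))
print(crashes.dtypes)        # column types after parsing
print(crashes.isna().sum())  # missing values per column
```

With the downloaded file, the same helper could be called with the path to the CSV instead of the in-memory sample.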
This is an EDA exercise, so you need to delve into the data set, identify some useful insights, and visualize them. Here are the key points every submission must include:
- The data set needs cleaning. Decide what to do with missing values and extra attributes.
- Some attributes are more useful if you break them into several attributes. An example of this is already included in the data set, where the time, day, and month of the crash are given as separate attributes. These attributes allow you to compare crashes based on the day of the week, time, or month (season). Are there other attributes that you can break down into smaller attributes to gain more information?
- What insights can you find about the crashes and date/time? You can look into season, day of the week, day/night, lighting, weather, etc.
- Has the number of deadly crashes increased recently? Look at the data over the years. Can you identify any significant increase or decrease?
- Investigate the number and type of injuries based on the speed limit.
- Is there a relationship between hit-and-run crashes and the number of fatal injuries?
- Do intersection-related crashes result in more fatal injuries?
- Come up with at least two more interesting insights and visualize them. (Suggestions: season/weather/road condition and fatalities, hit and run, having the right-of-way, ...) You must have at least one visualization for every question/insight you investigate.
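Several of the points above reduce to a cleaning step, a date/time breakdown, and a few grouped summaries. The sketch below illustrates that pattern on a handful of made-up rows. The column names follow the attribute table, the 0/1 coding of HIT-AND-RUN and the night-time window (8 pm to 6 am) are assumptions, and filling missing values is only one possible cleaning choice, not the required one.

```python
import pandas as pd

# Made-up rows for illustration only; column names follow the attribute table.
crashes = pd.DataFrame({
    "CRASH-DATE": pd.to_datetime([
        "2018-01-15 08:30", "2018-07-04 23:45", "2019-03-02 17:10",
        "2019-12-24 02:05", "2020-06-11 12:00", "2020-10-31 21:30",
    ]),
    "POSTED-SPEED-LIMIT": [30, 35, 30, 45, 30, 35],
    "HIT-AND-RUN": [0, 1, 0, 1, 0, 0],  # assumed coding: 1 = hit and run
    "INTERSECTION-RELATED": ["Y", "N", None, "Y", "N", "Y"],
    "INJURIES-TOTAL": [0, 2, 1, None, 0, 3],
    "INJURIES-FATAL": [0, 1, 0, 1, 0, 0],
})

# Cleaning: one possible choice is to fill gaps rather than drop rows.
clean = crashes.copy()
clean["INJURIES-TOTAL"] = clean["INJURIES-TOTAL"].fillna(0)
clean["INTERSECTION-RELATED"] = clean["INTERSECTION-RELATED"].fillna("UNKNOWN")

# Break the timestamp into smaller attributes.
clean["YEAR"] = clean["CRASH-DATE"].dt.year
clean["DAY-OF-WEEK"] = clean["CRASH-DATE"].dt.day_name()
clean["HOUR"] = clean["CRASH-DATE"].dt.hour
# Assumed definition of night: 8 pm up to (not including) 6 am.
clean["IS-NIGHT"] = (clean["HOUR"] >= 20) | (clean["HOUR"] < 6)

# Number of crashes with at least one fatality, per year (zeros kept).
fatal_per_year = (clean["INJURIES-FATAL"] > 0).groupby(clean["YEAR"]).sum()

# Injury totals by posted speed limit.
injuries_by_limit = clean.groupby("POSTED-SPEED-LIMIT")[
    ["INJURIES-TOTAL", "INJURIES-FATAL"]
].sum()

# Average fatal injuries per crash, split by hit-and-run flag.
fatal_by_hit_and_run = clean.groupby("HIT-AND-RUN")["INJURIES-FATAL"].mean()

# Total fatal injuries, split by the intersection-related flag.
fatal_by_intersection = clean.groupby("INTERSECTION-RELATED")[
    "INJURIES-FATAL"
].sum()

print(fatal_per_year)
print(injuries_by_limit)
print(fatal_by_hit_and_run)
print(fatal_by_intersection)
```

Each resulting Series or DataFrame can be turned into a chart with its `.plot(kind="bar")` method (which requires matplotlib), giving one visualization per question. On six rows the numbers mean nothing; the point is the shape of the analysis.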
Rules
- This is an individual assignment. It is not a group activity.
- If you are new to Python, this project will be a lot of work. I strongly suggest you start early!
- Please include some proper explanations for your results. Do not submit a notebook with code cells only. You need to properly describe your methods and discuss/analyze your observations.
- We will look at the quality of your work for grading. Your submission should be coherent and well documented.
- You might find some online discussions and demos on this data set. It is okay if you look them up, but you must write your own code and analyze the data by yourself.
- We will run your code through MOSS software to detect copying and plagiarism.
Submission
Submit everything through Gradescope and Blackboard. As mentioned above, you will need to upload:
- The Jupyter notebook all your work is in (.ipynb file) on Blackboard
- Python code (.py file) on Blackboard
- You can zip .py and .ipynb files for Blackboard submission
- PDF version of your Jupyter notebook on Gradescope
In order to make grading easier for your TA, please use the following format for naming your files:
netid-hw2-418 {.py, .ipynb, .pdf}
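As a quick sanity check before uploading, the expected file names can be generated from a NetID. This is only an illustrative sketch: `submission_names` is a hypothetical helper, `jdoe2` is a made-up NetID, and the notebook extension is assumed to be `.ipynb` as listed in the submission items above.

```python
def submission_names(netid):
    """Return the three expected submission file names for a NetID."""
    base = f"{netid}-hw2-418"
    return [f"{base}{ext}" for ext in (".py", ".ipynb", ".pdf")]


# "jdoe2" is a made-up NetID used only for illustration.
print(submission_names("jdoe2"))
```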
Expert Solution
Please download the answer file using this link
https://drive.google.com/file/d/1Php1vV9nRcW7YIiHxTe_auatC_XDB6_a/view?usp=sharing
Please rename the database file as "Traffic_Crashes.csv".