
Project: A Tutorial

Motivation

Instead, students are asked to submit a tutorial that walks the reader through the Data Science pipeline. The subject matter of this tutorial is far less important than the ability to communicate the approach throughout and to discuss meaningfully the implications and interpretations of the final results. For the purposes of this tutorial, we will assume that ‘The Data Science Pipeline’ has the following phases:

1. Data collection/curation + parsing (if necessary)

2. Data management /representation

3. Exploratory data analysis

4. Hypothesis testing and machine learning

5. Communication of insights attained
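As a rough illustration of how the five phases above might map onto code, here is a minimal sketch using only the Python standard library. The data is a made-up toy table, and the function names (`collect_and_parse`, `represent`, `explore`, `compare`, `communicate`) are hypothetical placeholders, not part of the assignment. A real tutorial would likely use richer tools and a proper statistical test or model in phase 4.

```python
import csv
import io
import statistics

# Made-up toy data standing in for a real collected dataset (assumption).
RAW_CSV = """group,score
a,1.0
a,2.0
a,3.0
b,4.0
b,6.0
"""


def collect_and_parse(text):
    """Phase 1: parse raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def represent(rows):
    """Phase 2: organise rows as {group: [scores]} with numeric values."""
    groups = {}
    for row in rows:
        groups.setdefault(row["group"], []).append(float(row["score"]))
    return groups


def explore(groups):
    """Phase 3: compute one summary statistic (the mean) per group."""
    return {name: statistics.mean(values) for name, values in groups.items()}


def compare(summary):
    """Phase 4 (placeholder): difference of the two group means.

    A real tutorial would run a hypothesis test or fit a model here.
    """
    return summary["b"] - summary["a"]


def communicate(summary, difference):
    """Phase 5: turn the numbers into a sentence a reader can act on."""
    return (
        f"Group b's mean score ({summary['b']:.1f}) exceeds "
        f"group a's ({summary['a']:.1f}) by {difference:.1f}."
    )


rows = collect_and_parse(RAW_CSV)
groups = represent(rows)
summary = explore(groups)
difference = compare(summary)
report = communicate(summary, difference)
print(report)
```

Keeping each phase in its own small function makes it easy to interleave Markdown cells that explain why each step is taken.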

Each tutorial must be a self-contained artifact that combines Markdown and Python code within a Jupyter Notebook. This artifact should be publicly available on the web.

1 Expectations

In general we would expect a good submission to provide the following at a minimum:

  • 1500+ words of prose in English, describing the process throughout and a discussion of the insights attained
  • Approximately 150 lines of non-contrived Python
  • Well-labelled figures showing important aspects of the analysis
  • Links to external documentation and resources that would be useful in understanding the approach.

2 Examples

The following are links to final projects from past semesters. They should be seen as a rough guide to what is expected and to the variety of topics that can be pursued, not as examples of the highest-scoring submissions.

  • https://amulyavelamakanni.github.io/data-science-pipeline-tutorial/: Analysis of Freddie Mac’s Single Family Loan-Level data

  • https://andrewstehman.github.io/Joe-Flacco-Is-Elite/: Investigation into whether Joe Flacco is an ‘elite’ quarterback in the NFL
  • https://summerzzzy.github.io/: Analysis of global suicide rates
  • https://amygracecruz.github.io/: Attempt to predict dementia and Alzheimer’s

3.1 Format of your deliverable

The formatting for the majority of the deliverable is left to your discretion.

However, each submission must begin with the title of the tutorial, providing a rough idea of the topic, followed by your name (and all members of the group).

4 Assessment

Each submission will be given a rating from 1 to 10 on each of the following dimensions:

1. Motivation

2. Understanding

3. Resources

4. Prose

5. Code

6. Communication of Approach

7. Subjective Evaluation

Motivation: each tutorial should be sufficiently motivated. If there is no motivation for the analysis, why would we ‘do data science’ on this topic?

Understanding: the reader of the tutorial should walk away with some new understanding of the topic at hand. If it’s not possible for a reader to state ‘what they learned’ from reading your tutorial, then why do the analysis?

Resources: tutorials should help the reader learn a skill, but they should also provide a launching pad for the reader to further develop that skill. The tutorial should link to additional resources wherever appropriate, so that a well-motivated reader can read further on techniques that have been used in the tutorial.

Prose: it’s very easy to write the literal English for what the Python code is doing, but that’s not very useful. The prose should enhance the tutorial, adding context and insight.

Code: code should be clear and commented. Function definitions should be described and given context/motivation. If the prose helps the reader understand why, the code should be sufficient for the reader to learn how.
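One possible reading of ‘clear and commented’ is sketched below. The function `min_max_scale` is a hypothetical example chosen for illustration, not something the assignment prescribes. Its docstring states the motivation, and its comments explain decisions rather than restating the code.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1].

    Motivation: features measured in different units (e.g. dollars and
    years) are hard to compare on one plot; putting them on a common
    scale makes relative differences visible.
    """
    lowest, highest = min(values), max(values)
    spread = highest - lowest

    # If every value is identical there is no spread to scale by, so
    # return zeros instead of dividing by zero.
    if spread == 0:
        return [0.0 for _ in values]

    return [(value - lowest) / spread for value in values]


scaled = min_max_scale([2, 4, 6])
print(scaled)
```

Note that the comment on the `spread == 0` branch explains why the special case exists, which the code alone cannot tell the reader.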

Communication of Approach: every technical choice has alternatives, so why did you choose the approach taken in the tutorial? A reader should walk away with some idea of what the trade-offs may be.

Subjective Evaluation: does the tutorial seem polished and ‘publishable’, or haphazard and quickly thrown together? The tutorial should read as well put together, having undergone a few iterations of editing and refinement. This should be the easiest of the dimensions.

4.1 Grades

Once each tutorial has been rated along each dimension, the score for each dimension will be scaled according to the following rubric:

Category                   | Points Available
Motivation                 | 10
Understanding              | 10
Resources                  | 10
Prose                      | 20
Code                       | 20
Communication of Approach  | 20
Subjective Evaluation      | 10
Total Points:              | 100
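To make the scaling concrete, here is a small sketch of the arithmetic. It assumes the scaling is linear (a rating of r out of 10 earns r/10 of the points available), which the rubric does not state explicitly. The ratings in `example_ratings` are invented for illustration only.

```python
# Points available per dimension, copied from the rubric above.
POINTS_AVAILABLE = {
    "Motivation": 10,
    "Understanding": 10,
    "Resources": 10,
    "Prose": 20,
    "Code": 20,
    "Communication of Approach": 20,
    "Subjective Evaluation": 10,
}


def scale_ratings(ratings, max_rating=10):
    """Convert 1-10 ratings to points, assuming linear scaling (assumption)."""
    return {
        dimension: ratings[dimension] * available / max_rating
        for dimension, available in POINTS_AVAILABLE.items()
    }


# Invented ratings for a hypothetical submission.
example_ratings = {
    "Motivation": 9,
    "Understanding": 8,
    "Resources": 7,
    "Prose": 8,
    "Code": 9,
    "Communication of Approach": 7,
    "Subjective Evaluation": 8,
}

points = scale_ratings(example_ratings)
total = sum(points.values())
print(points)
print(f"Total: {total} / {sum(POINTS_AVAILABLE.values())}")
```

Under this assumption, Prose, Code, and Communication of Approach each count twice as much as the other dimensions, so a one-point change in any of those ratings moves the total by two points.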

 
