Part 1: Exploratory Data Analysis (EDA)
In this stage of the project, decide with your team which aspects of the dataset you'd like to explore and display.
Choose one of the datasets and conduct EDA on it. Different datasets will require different preprocessing methods. Keep in mind that you may need to:
- Join tables
- Remove or fill in missing values
- Regroup data in order to focus on particular elements
- Create new columns for analyzing the data
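If your team preprocesses the data in code before loading it into Tableau, the four steps above might look roughly like the following pandas sketch. This is only an illustration: the `orders` and `customers` tables, all column names, and the choice of the median for filling gaps are hypothetical and not taken from any of the course datasets.

```python
import pandas as pd

# Hypothetical example tables; replace them with your own dataset.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount": [25.0, None, 40.0, 15.0],
    "order_date": ["2023-01-05", "2023-01-20", "2023-02-11", "2023-02-14"],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "region": ["North", "South", "North"],
})

# 1. Join tables: attach each customer's region to their orders.
df = orders.merge(customers, on="customer_id", how="left")

# 2. Fill in missing values: here, with the median amount (one possible choice).
df["amount"] = df["amount"].fillna(df["amount"].median())

# 3. Create a new column: the month of each order as a string such as "2023-01".
df["order_month"] = pd.to_datetime(df["order_date"]).dt.to_period("M").astype(str)

# 4. Regroup the data: total amount per region and month.
summary = df.groupby(["region", "order_month"], as_index=False)["amount"].sum()

print(summary)
```

A table like `summary` can be saved with `summary.to_csv("summary.csv", index=False)` and opened in Tableau as a text file. Whether to drop or fill missing values, and with what, depends on the dataset, so treat the median here as a placeholder for your own decision.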
Part 2: Visualization
After exploring the data, decide on different ways to visualize it. Each visualization should be distinct rather than repeating what another one already shows.
Part 3: Reporting Results - Presentation
While visualizations are very useful for pointing out important features of a dataset, they are not sufficient for a full data analysis. The last stage of your work is to create a presentation of your results.
You must include:
- An introduction to the dataset
- What data do you have?
- What did you do?
- What are your major findings, briefly?
- An explanation of each visualization that describes what it shows, gives broader context, and draws conclusions
- A conclusion, indicating what further explorations might be made into the dataset
Reporting results for grading
Your team should submit a link to a public GitHub repository with all of your working materials to Arina Sitnikova. For your work to be assessed, the repository should contain:
- A Tableau workbook (or a link to the published workbook)
- Presentation file/link
- Supporting documents (if any)
Grading criteria for data work (20 points max)
- Visualization use (10 points)
  - The team uses a variety of ways to visualize the data
  - Visualizations add to the story and are an integral part of it
  - All diagrams are described with captions and summaries
- Presentation with results (10 points)
  - The introduction sufficiently describes the data
  - Each visualization description fully explains the image
  - Conclusions are drawn clearly
  - Possible next steps are indicated