I believe every project offers an opportunity to reinforce technical skills, improve analytical thinking, and refine processes. The Cyclistic case study highlighted several lessons that I’ll carry into future data analysis projects.
1. Stay Focused on the Business Objective
One of the biggest lessons I learned was to keep the business objectives visible throughout the project. It was tempting to explore interesting patterns that didn’t directly answer the stakeholder’s questions. However, regularly checking progress against the original objectives helped me avoid unnecessary work and kept the analysis aligned with business needs.
2. Continue Expanding Technical Skills
This project helped me learn new tools and features, and reinforced the importance of learning more efficient ways to work with large datasets.
Key takeaways included:
- Learning new Excel functions.
- Using filters within Pivot Tables more effectively.
- Exploring more efficient tools for summarizing and processing large datasets outside of Excel.
- Working with smaller subsets in R.
- Learning the capabilities of ggplot2, matplotlib, folium, and pandas for data processing and visualization.
3. Use Visualizations to Answer Questions and Tell the Story
Because this project used a large dataset, it was difficult to spot trends and patterns in tables of raw numbers alone. I learned that well-designed charts often communicate trends more effectively, even during the initial analysis. Pivot Tables also made it easier to identify patterns, support recommendations, and communicate findings to stakeholders once I started preparing the presentation.
4. Focus on the Largest Opportunities
An effective analysis prioritizes the areas with the greatest business impact. Rather than concentrating only on the highest or lowest values, it was important to understand where most observations fall within the dataset. Drawing on Lean Six Sigma principles, I found that distribution charts, standard deviations, and the Pareto Principle helped me identify the opportunities with the greatest impact. I did this by identifying the data that represented the top 80-90% of riders in different categories.
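As a rough illustration of the Pareto-style cutoff described above, here is a minimal Python sketch. The station names, the ride counts, and the `pareto_cutoff` helper are all hypothetical and are not taken from the Cyclistic data; the idea is simply to rank categories by volume and keep adding them until their combined share reaches the chosen threshold.

```python
def pareto_cutoff(counts, threshold=0.80):
    """Return the top categories, largest first, whose combined share of
    the total first reaches `threshold`."""
    total = sum(counts.values())
    if total == 0:
        return []
    # Rank categories from largest to smallest count.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    selected = []
    running = 0
    for name, count in ranked:
        selected.append(name)
        running += count
        if running / total >= threshold:
            break
    return selected


# Hypothetical ride counts per start station (illustrative values only).
rides_by_station = {
    "Station A": 500,
    "Station B": 300,
    "Station C": 100,
    "Station D": 60,
    "Station E": 40,
}

top_stations = pareto_cutoff(rides_by_station)
print(top_stations)
```

In this made-up example, two of the five stations account for 80% of rides, which is the kind of concentration that points to where a recommendation would have the most effect.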
5. Choose Summary Statistics Carefully
This project reinforced that statistical measures should always match the question being asked. The dataset included outliers and several different kinds of measurements, which gave me practice with:
- Determining whether the analysis requires counts or actual values.
- Being cautious when interpreting averages in datasets with significant outliers.
- Considering using the median when outliers would distort the average.
- Avoiding relying solely on the top or bottom few observations if they don’t represent the broader population.
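To show how much a single outlier can distort an average, here is a small sketch using Python's standard `statistics` module. The ride durations are invented for illustration, with one ride assumed to have been left unlocked for a full day.

```python
from statistics import mean, median

# Hypothetical ride durations in minutes; the last value is an assumed
# outlier (a bike left out for 24 hours), not a figure from the project.
ride_minutes = [8, 10, 12, 11, 9, 13, 10, 1440]

average_duration = mean(ride_minutes)   # pulled far upward by the outlier
typical_duration = median(ride_minutes) # stays close to a typical ride

print(f"mean:   {average_duration} minutes")
print(f"median: {typical_duration} minutes")
```

Here the mean comes out to roughly 189 minutes while the median is 10.5 minutes, so the median describes a typical ride far better in this example.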
6. Validate the Data
One unexpected lesson I learned was that filtered subsets and full datasets don’t always produce identical counts. This emphasized the importance of validating Pivot Tables, filters, and aggregation methods before drawing conclusions.
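One way to run this kind of check is to reconcile the filtered counts against the full row count and list whatever is left over. The sketch below is only an illustration: the records, the `rider_type` field name, and the `reconcile_counts` helper are hypothetical, and it assumes one common cause of mismatches, namely blank or unexpected category values.

```python
from collections import Counter


def reconcile_counts(rows, field, expected_values):
    """Compare the full row count with the count covered by the expected
    filter values, and report any values the filters would miss."""
    counts = Counter(row.get(field) for row in rows)
    filtered_total = sum(counts[value] for value in expected_values)
    unaccounted = {
        value: n for value, n in counts.items() if value not in expected_values
    }
    return len(rows), filtered_total, unaccounted


# Hypothetical records; one row has a blank rider type.
rides = [
    {"ride_id": 1, "rider_type": "member"},
    {"ride_id": 2, "rider_type": "casual"},
    {"ride_id": 3, "rider_type": "member"},
    {"ride_id": 4, "rider_type": ""},
    {"ride_id": 5, "rider_type": "casual"},
]

full_total, filtered_total, unaccounted = reconcile_counts(
    rides, "rider_type", {"member", "casual"}
)
print(full_total, filtered_total, unaccounted)
```

If the two totals differ, the leftover values show which rows the filters dropped, which is worth resolving before drawing any conclusions from the subsets.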
7. Work Efficiently
I learned very quickly that large datasets require more time and planning than some of the smaller datasets I worked with before. To help address this going forward, I plan to:
- Account for processing time when setting project deadlines.
- Avoid repeating analyses when I have already answered similar questions in my review.
- Organize my calculations so that shared metrics can be reused instead of recreated multiple times.
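As one possible way to reuse a shared metric, the sketch below computes ride duration once and then feeds it into two separate summaries. The field names (`started_at`, `ended_at`, `rider_type`), the sample rows, and the `with_duration` helper are assumptions made for illustration, not the project's actual workflow.

```python
from datetime import datetime
from statistics import median

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def with_duration(rows):
    """Return copies of the rows with a `duration_min` field added, so the
    shared metric is calculated once and reused afterwards."""
    enriched = []
    for row in rows:
        start = datetime.strptime(row["started_at"], TIME_FORMAT)
        end = datetime.strptime(row["ended_at"], TIME_FORMAT)
        minutes = (end - start).total_seconds() / 60
        enriched.append({**row, "duration_min": minutes})
    return enriched


# Hypothetical rides (illustrative values only).
raw_rides = [
    {"rider_type": "member", "started_at": "2024-06-01 08:00:00", "ended_at": "2024-06-01 08:10:00"},
    {"rider_type": "member", "started_at": "2024-06-01 09:00:00", "ended_at": "2024-06-01 09:20:00"},
    {"rider_type": "casual", "started_at": "2024-06-01 10:00:00", "ended_at": "2024-06-01 10:45:00"},
    {"rider_type": "casual", "started_at": "2024-06-01 11:00:00", "ended_at": "2024-06-01 11:15:00"},
]

rides = with_duration(raw_rides)  # the duration is computed here, once

# Summary 1: median duration per rider type, reusing duration_min.
median_by_type = {
    rider_type: median(r["duration_min"] for r in rides if r["rider_type"] == rider_type)
    for rider_type in {r["rider_type"] for r in rides}
}

# Summary 2: longest ride overall, reusing the same field.
longest_ride = max(r["duration_min"] for r in rides)

print(median_by_type, longest_ride)
```

Keeping the derived field in one place means every later summary works from the same definition of duration instead of recalculating it.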
Reflection
The Cyclistic project strengthened both my technical and analytical skills. Beyond learning new Excel shortcuts and visualization techniques, I developed a greater appreciation for staying business-focused, validating data carefully, selecting appropriate statistical methods, and communicating insights through clear visual storytelling.
These lessons will help me produce more efficient, reliable, and impactful analyses in future projects.