1.0 | Introduction
During my data science internship at Qulturum, an innovative healthcare improvement research institute within Jönköping County, Sweden, I had the opportunity to apply theoretical knowledge to practical projects. Working on real-world challenges allowed me to strengthen my analytical capabilities and gain valuable hands-on experience. I also had the chance to collaborate closely with experienced mentors, whose feedback helped refine my skills and approach. This blog post reflects on the core projects I undertook, the skills I developed, and the lessons I learned along the way.
2.0 | Key Projects
2.1 | Diagnostic Data Visualization and Analysis
One of my first major tasks was developing interactive network diagrams to visualize diagnosis data. Using Python with NetworkX and Bokeh, I created a network diagram displaying correlations between diagnoses. Each node’s size was scaled according to how frequently the diagnosis occurred, while connections represented relationships between diagnoses. I iterated on this project multiple times, incorporating feedback to make the visuals more intuitive and interactive. This project taught me how to simplify complex data into actionable visuals, a crucial skill for making data-driven healthcare insights accessible to a broad audience.
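A minimal sketch of the data preparation behind such a diagram, not the project code. It uses simple co-occurrence counts as a stand-in for the correlation measure, and the records, function names, and size range are all hypothetical. The resulting dictionaries could then be handed to a graph library such as NetworkX for layout and to Bokeh for rendering.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: each set holds the diagnosis codes of one patient.
records = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "C"},
    {"B"},
]

def build_network(records):
    """Return (node_freq, edge_weight) for a diagnosis co-occurrence network."""
    node_freq = Counter()
    edge_weight = Counter()
    for codes in records:
        node_freq.update(codes)
        # Sort so each unordered pair is always counted under the same key.
        for pair in combinations(sorted(codes), 2):
            edge_weight[pair] += 1
    return node_freq, edge_weight

def node_sizes(node_freq, min_size=10, max_size=40):
    """Scale frequencies linearly into a display size range (assumed values)."""
    lo, hi = min(node_freq.values()), max(node_freq.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {
        node: min_size + (freq - lo) / span * (max_size - min_size)
        for node, freq in node_freq.items()
    }

node_freq, edge_weight = build_network(records)
sizes = node_sizes(node_freq)
```

Keeping the counting separate from the drawing makes it easy to swap in a different relationship measure later without touching the plotting code.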
2.2 | Clustering and Dimensionality Reduction
With guidance, I explored and implemented various clustering algorithms, including K-Means and Spectral Clustering, alongside dimensionality reduction techniques like PCA and UMAP. These methods were applied to separate patient data into clusters and analyze them, helping to identify patterns and relationships. This task not only deepened my understanding of clustering techniques but also showed me how to apply them to discover high-impact patterns in healthcare data.
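To make the K-Means idea concrete, here is a from-scratch sketch in plain Python. It is an illustration of the algorithm only, not the library implementation a real project would normally rely on, and the points are hypothetical. Seeding with the first k points is an assumption chosen to keep the example deterministic.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain K-Means on a list of equal-length tuples.

    Initial centroids are simply the first k points, which keeps the sketch
    deterministic; production implementations use smarter seeding.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical 2-D points forming two visibly separate groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (7.9, 8.2), (0.9, 1.1), (8.1, 7.9)]
labels, centroids = kmeans(points, k=2)
```

On high-dimensional patient data, a reduction step such as PCA or UMAP would typically run before clustering or before plotting the result.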
2.3 | National Malnourishment Data Analysis
One of the more impactful projects I worked on involved analyzing malnutrition rates among the elderly across Sweden. This project required data preparation, including data cleaning and applying key database operations to merge multiple datasets, creating a comprehensive view of malnutrition prevalence at both national and regional levels. Using Python and GeoPandas, I visualized the data through bar plots and geographical maps of Sweden, highlighting malnutrition rates by county.
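The merge-and-aggregate step described above can be sketched as follows. This is a plain-Python illustration under stated assumptions: the field names, county labels, and numbers are all hypothetical, and in practice the same join would more likely be written as a pandas or GeoPandas merge.

```python
# Hypothetical inputs: none of these names or numbers come from the project.
assessments = [
    {"county_code": "01", "assessed": 200, "malnourished": 30},
    {"county_code": "02", "assessed": 100, "malnourished": 10},
    {"county_code": "03", "assessed": 50, "malnourished": 5},
]
counties = [
    {"county_code": "01", "county": "County A"},
    {"county_code": "02", "county": "County B"},
]

def merge_on_key(left, right, key):
    """Left join two lists of dicts on `key`, like a database LEFT JOIN."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]

def prevalence_by_county(rows):
    """Return {county: malnourished / assessed}, skipping unmatched rows."""
    return {
        row["county"]: row["malnourished"] / row["assessed"]
        for row in rows
        if "county" in row and row["assessed"] > 0
    }

merged = merge_on_key(assessments, counties, "county_code")
regional = prevalence_by_county(merged)

# The national figure uses every row, including the one with no county match.
national = sum(r["malnourished"] for r in merged) / sum(r["assessed"] for r in merged)
```

A left join keeps rows that have no match in the lookup table, which makes gaps in the reference data visible instead of silently shrinking the dataset.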
This analysis, conducted on a national scale, yielded results closely mirroring those observed in a county-level analysis. For further validation, a senior statistician independently reviewed and verified the findings, increasing confidence in them. These findings were used to support future revisions of the international GLIM (Global Leadership Initiative on Malnutrition) criteria. This project underscored the critical role that rigorous data preparation and analysis play in shaping essential healthcare decisions on a global scale.
2.4 | Presenting My Work
My final task was a formal presentation summarizing my internship experience and findings. Delivered to both technical and non-technical stakeholders, it gave me a chance to showcase my work and the insights I had gained. I learned the art of tailoring technical content to a diverse audience, balancing in-depth analysis with accessible explanations.
3.0 | Skills Gained
3.1 | Technical Skills
Through regular practice and feedback, I expanded my skill set in Python, mastering data visualization libraries (e.g., NetworkX, Bokeh, GeoPandas), statistical techniques, and clustering algorithms. Debugging became a key part of my workflow, as I often encountered challenges requiring patience and a methodical approach to problem-solving.
3.2 | Data Analysis and Visualization
Working on projects like diagnostic network diagrams and malnutrition mapping, I honed my skills in visualizing complex data in accessible formats. I learned to use data visualization not just as a presentation tool but as a means to identify trends and communicate findings effectively.
3.3 | Stakeholder Communication
Presenting my findings taught me to translate complex analyses into actionable insights for both technical and general audiences. Feedback from mentors helped refine this skill, as I adapted presentations to balance detail with clarity, ensuring each audience could understand the implications of the work.
4.0 | Lessons Learned
4.1 | Iterative Improvement and Feedback Integration
A key takeaway was the value of iterative work. I learned that early feedback and incremental improvements are vital in producing high-quality, user-centered results. Breaking down projects into manageable pieces helped me refine my work continuously, making each iteration more valuable.
4.2 | Adapting Techniques to Data Complexity
The internship highlighted the need to adapt analytical techniques to data complexity, especially in healthcare. By exploring and comparing different clustering and dimensionality reduction methods, I learned to choose the appropriate approach based on data characteristics, intended outcomes, and project requirements.
4.3 | Data Quality is Essential
One of the biggest lessons was the importance of data quality. The quality and granularity of data profoundly impact analysis, and thorough data preprocessing is crucial to ensure meaningful, reliable insights. This was especially clear during the malnutrition analysis project, where I tackled issues of missing and inconsistent data.
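As a small illustration of what handling missing and inconsistent data can look like, here is a hedged sketch. The field names, the cleaning rules, and the sample rows are hypothetical and are not taken from the malnutrition dataset.

```python
def clean_rows(rows):
    """Normalise county labels and set aside rows with missing values."""
    cleaned, rejected = [], []
    for row in rows:
        # Inconsistent labels: trim whitespace and unify capitalisation.
        name = (row.get("county") or "").strip().title()
        value = row.get("rate")
        if not name or value is None:
            # Keep rejected rows for inspection rather than dropping them silently.
            rejected.append(row)
            continue
        cleaned.append({"county": name, "rate": float(value)})
    return cleaned, rejected

# Hypothetical raw rows showing typical problems.
raw = [
    {"county": " county a ", "rate": "0.12"},  # stray spaces, rate stored as text
    {"county": "COUNTY A", "rate": 0.14},      # different capitalisation
    {"county": "County B", "rate": None},      # missing value
    {"county": "", "rate": 0.10},              # missing label
]
cleaned, rejected = clean_rows(raw)
```

Returning the rejected rows alongside the cleaned ones makes it possible to report how much data was excluded and why.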
5.0 | Final Thoughts
This internship provided a transformative experience in data science within the healthcare sector. The blend of practical projects, mentorship, and exposure to real-world challenges enriched my technical skills and professional confidence. Moving forward, I am excited to leverage these insights and skills in future roles, contributing to impactful data-driven projects in healthcare and beyond.