A collection of datasets and links that can be used to create a cohesive visualization for the module CS5346.

carola173/information_visualization

Information visualization

Datasets

  1. A dataset bank that contains multiple datasets along with a few sample visualization examples: https://data.fivethirtyeight.com/
  2. Domain-specific datasets on earth science, the atmosphere, cloud surface area, and related domains: https://data.nasa.gov/data_visualizations.html
  3. Spotify Play Count for Billboard's 1990 Top 100 and related datasets: https://www.kaggle.com/zacharykauz/spotify-play-count-for-billboards-1990-top-100, https://www.kaggle.com/yasserh/song-popularity-dataset, https://www.kaggle.com/nenamalikah/nft-collections-by-sales-volume, https://www.kaggle.com/ahemateja19bec1025/musicgenerationdataset, https://www.kaggle.com/ektanegi/spotifydata-19212020. One idea is to collect all the historical datasets, derive views such as a most-viewed ranking and the evolution of music over the decades, and build a web-publishable dashboard showing how Billboard's top hits changed over time [for all the music lovers!]
  4. Google Trends: https://trends.google.com/trends/explore
  5. Global fundamentals dataset for the banking and financial domain: https://data.nasdaq.com/search?query=KAUFFMAN
  6. FBI firearm background check dataset: https://github.com/BuzzFeedNews/nics-firearm-background-checks
  7. Iris Data Set http://archive.ics.uci.edu/ml/datasets/Iris
  8. HR Data Set- Visuals & Predictions https://www.kaggle.com/joshuaswords/awesome-hr-data-visualization-prediction
  9. Mall Customer Segmentation Data https://www.kaggle.com/joshuaswords/data-visualization-clustering-mall-data
  10. COVID-19 Vaccination Data Visualization: https://www.kaggle.com/joshuaswords/uk-covid-19-vaccination-progress-data-vis (UK), https://data.gov.sg/dataset/covid-19-vaccination (Singapore)
  11. Student Performance Visualization https://www.kaggle.com/joshuaswords/awesome-data-visualisation-student-results?scriptVersionId=57181038
  12. Netflix Data Visualization: https://www.kaggle.com/joshuaswords/netflix-data-visualization, https://www.kaggle.com/meetnagadia/netflix-stock-price-data-set-20022022
  13. FIFA Dataset: https://www.kaggle.com/stefanoleone992/fifa-21-complete-player-dataset. This one is specific to FIFA 21; one could combine it with historical data to produce more interesting visualizations.
  14. Airline Safety Dataset: https://github.com/fivethirtyeight/data/tree/master/airline-safety
  15. US weather history dataset: https://github.com/fivethirtyeight/data/tree/master/us-weather-history. One can also try web scraping weather data.
  16. USA Government Surveillance Planes Dataset https://github.com/BuzzFeedNews/2016-04-federal-surveillance-planes
  17. Political advertisement dataset from Meta's Facebook platform: https://www.propublica.org/datastore/dataset/political-advertisements-from-facebook
  18. Wine quality dataset: http://archive.ics.uci.edu/ml/datasets/Wine+Quality
  19. Car Evaluation dataset: http://archive.ics.uci.edu/ml/datasets/Car+Evaluation, https://www.kaggle.com/ebrahimhaquebhatti/75000-used-cars-dataset-with-specifications
  20. Video game industry dataset: https://www.statista.com/topics/868/video-games/. One can generate various interesting visualizations like the ones shown here: https://studentwork.prattsi.org/infovis/visualization/visualizing-video-game-data-2007-2016-with-tableau/ [One may need to create an account with Statista to download the dataset. If any of you find better resources, please do let me know!]
  21. Social Impact Dataset: one can use web scraping to extract data from https://www.buzzfeed.com/, https://www.data.gov/, https://www.reddit.com/r/datasets/
  22. USA Air quality dataset https://www.epa.gov/environmental-topics/air-topics
  23. Taarifa Waterpoints, an open-source platform for the crowdsourced reporting and triaging of infrastructure-related issues: https://github.com/taarifa/TaarifaWaterpoints
  24. Global climate dataset per continent https://en.tutiempo.net/climate
  25. UN dataset for various domains like Greenhouse Gas Inventory Data, World Development Indicators, etc. http://data.un.org/Explorer.aspx?d=CLINO
  26. Electricity Dataset: https://www.eia.gov/electricity/data/eia923/ (USA), https://www.singstat.gov.sg/find-data/search-by-theme/industry/energy-and-utilities/latest-data (Singapore). Sample visualization example: https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=5379&context=sis_research
  27. Electricity Consumption & Occupancy (ECO) dataset: http://www.vs.inf.ethz.ch/res/show.html?what=eco-data (Switzerland)
  28. USDA Food and Nutrition Service (FNS) SNAP retailer dataset: https://www.fns.usda.gov/snap-retailer-data
  29. Some open US city datasets: https://data.seattle.gov/, https://data.austintexas.gov/, https://datasf.org/opendata/, https://opendata.cityofnewyork.us/
  30. Road Traffic monitoring dataset: https://www.kaggle.com/shawon10/road-traffic-video-monitoring
  31. Insect Egg Evolution Dataset: https://github.com/shchurch/Insect_Egg_Evolution
  32. Dataset of TV shows, movies, documentary series, and all other forms of content available on HBO as of 2020: https://www.kaggle.com/rishidamarla/hbo-tv-shows-documentaries-movies-as-of-2020
  33. Hollywood Theatrical Market Synopsis dataset for 1995 to 2021 https://www.kaggle.com/johnharshith/hollywood-theatrical-market-synopsis-1995-to-2021
  34. Web-scraped data from Anime-Planet covering 18,000+ anime: https://www.kaggle.com/vishalmane10/anime-dataset-2022
  35. GDP of all countries from 1960 to 2020: https://www.kaggle.com/holoong9291/gdp-of-all-countries19602020
  36. Uber fare dataset: https://www.kaggle.com/yasserh/uber-fares-dataset
  37. Forest Fire dataset: https://www.kaggle.com/balavashan/forest-fire-dataset
  38. Singapore Transport dataset: https://datamall.lta.gov.sg/content/datamall/en.html. It has both static datasets (e.g., Annual Bus Population by Passenger Capacity) and dynamic datasets (e.g., Bus Routes).
  39. E-commerce-related datasets: https://imerit.net/blog/25-best-retail-sales-and-ecommerce-datasets-for-machine-learning-all-pbm/
  40. Singapore Government statistics (SingStat Table Builder), which aims to deliver insightful statistics and trusted statistical services that empower decision making: https://www.tablebuilder.singstat.gov.sg/publicfacing/selectVariables.action [cleaned dataset for use: https://docs.google.com/spreadsheets/d/1a2uZKydzbP-vTdrXrdcGmfxTFnZSXdDS65XOpjWY0SE/edit#gid=1367742117]
  41. Police Department Incident Reports from 2018 to Present: https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-2018-to-Present/wg3w-h783
  42. COVID-19 World vaccination status progress: https://www.kaggle.com/gpreda/covid-world-vaccination-progress
  43. UCS Satellite Dataset: https://www.ucsusa.org/resources/satellite-database [The Union of Concerned Scientists describes its work as independent science to solve our planet's most pressing problems. Check out the sample visualizations it has created at https://www.ucsusa.org/]
  44. E-commerce datasets: https://data.world/promptcloud/fashion-products-on-amazon-com [Amazon fashion product dataset], https://www.kaggle.com/c/shopee-product-detection-open [Shopee product dataset], etc. One can try creating a visualization dashboard that shows, predicts, or suggests which e-commerce site to buy a relevant product from [Domain: search relevance in e-commerce applications]
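
Several entries above (for example items 3 and 13) suggest combining historical data from more than one source before visualizing it. A minimal sketch of that idea follows, using only the Python standard library. The two inline CSV strings stand in for separately downloaded files, and the column names `title`, `year`, and `popularity` are assumptions made for illustration, not the actual schema of any dataset listed here.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical samples standing in for two separately downloaded CSV files.
# The column names (title, year, popularity) are assumptions for illustration.
CHART_1990S = """title,year,popularity
Song A,1990,62
Song B,1995,71
"""
CHART_2000S = """title,year,popularity
Song C,2001,80
Song D,2009,68
Song E,2010,90
"""


def read_rows(csv_text):
    """Parse CSV text into a list of dicts with numeric year and popularity."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append({
            "title": row["title"],
            "year": int(row["year"]),
            "popularity": float(row["popularity"]),
        })
    return rows


def mean_popularity_by_decade(rows):
    """Group rows by decade (e.g. 1995 -> 1990) and average the popularity."""
    by_decade = defaultdict(list)
    for row in rows:
        decade = (row["year"] // 10) * 10
        by_decade[decade].append(row["popularity"])
    return {decade: mean(values) for decade, values in sorted(by_decade.items())}


# Combine both sources into one list, then summarize per decade.
combined = read_rows(CHART_1990S) + read_rows(CHART_2000S)
summary = mean_popularity_by_decade(combined)
for decade, avg in summary.items():
    print(f"{decade}s: {avg:.1f}")
```

The per-decade summary is the kind of small table that can then be loaded into Tableau, Power BI, or a plotting library to draw a trend over time.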

Resources

  1. Basic web scraping: https://www.thepythoncode.com/article/extract-weather-data-python
  2. Data visualization tools that are worth looking into: Tableau [https://www.tableau.com/], Looker [https://looker.com/], Zoho Analytics [https://www.zoho.com/analytics/], Sisense [https://www.sisense.com/], IBM Cognos Analytics [https://www.ibm.com/au-en/products/cognos-analytics], Qlik Sense [https://www.qlik.com/us/products/qlik-sense], Domo [https://www.domo.com/business-intelligence/visualization], Microsoft Power BI [https://powerbi.microsoft.com/en-us/], Klipfolio [https://www.klipfolio.com/], SAP Analytics Cloud [https://www.sap.com/products/cloud-analytics.html]. Remember that these tools may not be free to use [some offer a trial version]. Feel free to contact the Prof/TAs about the possibility of providing licenses. A guide that might help in selecting a visualization tool: https://callminer.com/blog/data-visualization-tools-buying-guide
  3. Go-to links for understanding how to build a web-deployable dashboard: https://towardsdatascience.com/deploying-data-dashboards-automatically-reliably-and-securely-372ef802ca3c, https://topflightapps.com/ideas/how-to-create-a-dashboard-web-application/, https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/, https://help.tableau.com/current/server/en-us/web_author.htm [specifically if someone is using Tableau], https://powerbi.microsoft.com/en-us/publishtoweb/ [Power BI]
  4. Learning paths: https://www.analyticsvidhya.com/learning-paths-data-science-business-analytics-business-intelligence-big-data/tableau-learning-path/ [for Tableau], https://analyticsindiamag.com/8-best-free-resources-to-learn-tableau/ [free resources to learn Tableau], https://medium.com/javarevisited/7-best-courses-to-learn-microsoft-power-bi-for-beginners-and-experienced-developers-83695c9428dc [Power BI], https://www.youtube.com/watch?v=3u7MQz1EyPY [Power BI]
  5. Visualization tips: https://blog.hubspot.com/marketing/great-data-visualization-examples, https://resagratia.com/2020/07/the-differences-between-good-data-visualization-and-bad-data-visualization-part-1/, https://towardsdatascience.com/data-visualization-101-7-steps-for-effective-visualizations-491a17d974de
  6. Learning Tableau and getting a Coursera certificate: https://www.coursera.org/learn/analytics-tableau?specialization=excel-mysql&utm_source=gg&utm_medium=sem&utm_campaign=05-ExceltomySQL-ROW&utm_content=B2C&campaignid=6558079885&adgroupid=118127838703&device=c&keyword=&matchtype=&network=g&devicemodel=&adpostion=&creativeid=507138498627&hide_mobile_promo&gclid=CjwKCAiA866PBhAYEiwANkIneJS_RfAv4PpwU4J0_QAKLLjh_OUnNsm4n5Ggm1B2b4Jxw6U2bhjHtBoCKd0QAvD_BwE [only about 18 hours of lecture videos!]
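
As a companion to the basic web scraping link in item 1, here is a rough sketch of the core step: turning an HTML table into records that a visualization tool can load. It is not the method from the linked article; it uses only Python's standard-library `html.parser`, and the `SAMPLE_HTML` fragment with its column names is a made-up stand-in for a downloaded weather page. Always check a site's terms of use before scraping it.

```python
from html.parser import HTMLParser

# Hypothetical HTML fragment standing in for a downloaded weather page.
SAMPLE_HTML = """
<table id="forecast">
  <tr><th>Day</th><th>High (C)</th><th>Low (C)</th></tr>
  <tr><td>Mon</td><td>31</td><td>25</td></tr>
  <tr><td>Tue</td><td>30</td><td>24</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()


parser = TableParser()
parser.feed(SAMPLE_HTML)

# Treat the first row as the header and pair it with each later row.
header, *body = parser.rows
records = [dict(zip(header, row)) for row in body]
print(records)
```

For real pages, the HTML would first be fetched over HTTP, and a dedicated parsing library is usually more robust against messy markup than this hand-written parser.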
