Supporting Large-scale Social Media Data Access and Analytics for the Broader Research Community
Presentation posted on 2022-03-04, 01:06, authored by eRNZ Admin and Richard Sinnott
The Australian Digital Observatory has been funded by the Australian Research Data Commons to support large-scale access to and use of social media data by the broader research community. Examples of the social media data of interest include data from platforms such as Twitter, Instagram, Flickr, Foursquare, Reddit, YouTube and e-Gaming platforms such as Steam and Valve. This talk will cover what the user communities would require from such a platform, as well as the technical aspects of developing and delivering a big data platform that must handle very large volumes of tweets and posts.
Example case studies will be presented that reflect the initial user/community demands, including:
- finding and tracing historic information related to social media use by suicide completers (who committed suicide in 2014/2015);
- the challenges of dealing with fake news, and technical solutions that attempt to identify fake news using machine learning algorithms and the diffusion patterns of data;
- geolocating social media data where the explicit location (lat/long) of posts, i.e. the location reported by the location-based service of the user's mobile device, is not present, and the privacy concerns that this gives rise to;
- topic modelling and sentiment analysis at scale.
The talk will also cover the technical aspects of developing the platform, including the use of the National Research Cloud and technologies such as Docker and Kubernetes. It will describe how these are planned to serve the user communities in their big social media data research, i.e. where their local resources/laptops are insufficient to manage large quantities of data.
ABOUT THE AUTHOR
Professor Richard O. Sinnott is Professor of Applied Computing Systems and Director of the Melbourne eResearch Group at the University of Melbourne. He has been lead software engineer/architect on an extensive portfolio of national and international projects, with a specific focus on research domains requiring finer-grained access control (security) and those dealing with big data challenges. He has over 450 peer-reviewed publications across a range of applied computing research areas.