Class details 2018
Databases Stream (Databases in theory and practice)
This class runs for half a day (1:30-5pm) for four days (Monday to Thursday inclusive). Each day builds on the previous one. They are not repeats.
This course combines database theory with hands-on development. It will take you from complete novice (who has perhaps heard the word ‘database’ but knows nothing else!) to being confident in designing and running a database. On the theory side, it covers data modelling and database design, normalisation, data validation and referential integrity. On the practical side, you will install and configure the XAMPP software stack for developing and testing Web-enabled databases locally (i.e. on your laptop), then learn how to add, update and delete data in your database and how to write SQL queries against it. We end with a look at NoSQL / non-relational databases and compare them with the relational model. Theory and practice will proceed in parallel, and you will be hands-on from day 1.
Participants who have a sample of their own work or source data in any format (spreadsheet, plain text, etc.) should bring it for conversion to a database; otherwise we will work from a minimal set of data supplied by the tutor.
If possible, intending participants should:
- download and install XAMPP before the course from https://www.apachefriends.org/index.html (the relevant one for your operating system).
- familiarise yourself with a diagramming tool e.g. Visio or Dia
- Visio is expensive, so use it only if you already have access to a copy
- Dia is free and open source (http://dia-installer.de/); it is a very good tool but a bit counterintuitive, so allow a few hours to get used to it
- Course materials are on digclassics.info
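As a taste of the practical side, the add, update, delete and query operations described above can be sketched in a few lines. This is only an illustrative sketch: it assumes Python's built-in sqlite3 module as a stand-in for the database server that ships with XAMPP, and the author and book tables and their contents are hypothetical, not course data.

```python
import sqlite3

# In-memory SQLite database, used here as a stand-in for the XAMPP database server.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical two-table design: every book must refer to an existing author.
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE book ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " author_id INTEGER NOT NULL REFERENCES author(id))"
)

# Add data
conn.execute("INSERT INTO author (id, name) VALUES (1, 'A. Example')")
conn.executemany(
    "INSERT INTO book (title, author_id) VALUES (?, ?)",
    [("First Book", 1), ("Secnd Book", 1), ("Draft", 1)],
)

# Update data (fix a typo in a title)
conn.execute("UPDATE book SET title = 'Second Book' WHERE title = 'Secnd Book'")

# Delete data
conn.execute("DELETE FROM book WHERE title = 'Draft'")

# Query: join the two tables
rows = conn.execute(
    "SELECT author.name, book.title FROM book"
    " JOIN author ON author.id = book.author_id"
    " ORDER BY book.title"
).fetchall()

# Referential integrity: a book pointing at a missing author is rejected
try:
    conn.execute("INSERT INTO book (title, author_id) VALUES ('Orphan', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The same SQL statements carry over to other relational databases with only minor dialect differences, which is why the theory and the practice can be learned side by side.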
Digital Media Research Stream
This stream consists of a number of stand-alone half-day classes as described below. Participants may attend one or more of these as they wish.
YouTube Analysis: An Introduction to Data Ethnography
Monday afternoon 1:30pm-5pm with break – 3 hr class time
This session presents evidence from the visual cultures that surround YouTube, supporting the development of a new methodology to research the so-called ‘black boxes’. The methodology is defined as data ethnography and combines stakeholder consultation, top-level data analysis, persona construction, fine data analysis and, finally, topic or genre analysis. Data ethnography not only enables us to research what we cannot see, but also provides a useful way to understand platform interoperability.
Humanities and social science digital research methods have been interrupted by the prominence of privacy and surveillance concerns about platform interoperability that produces large quantities of personification data. The Facebook Cambridge Analytica scandal, especially the revelation of its ability to construct predictive models of its users’ behaviours, brought to public attention concerns over how platform user data is harvested, shared and manipulated by third-party providers. The global pushback against the platform providers’ use of these data resulted in platforms closing down some access to application programming interfaces (APIs) to inhibit data manipulation. However, these restrictions also affect how public benefit research is conducted, providing a useful prompt to rethink how humanities scholars, social scientists and human-computer interaction scholars research the digital.
While the datafication of our digital lives has provided us with new insights, the digital methods that enable us to research our digital selves have always been mixed to understand the field of enquiry, along with the surrounding political, cultural and economic constructs. Increased digital practices built on sophisticated calculations, for example the use of algorithmic recommendations, connected devices, internet of things, and the like, have impacted on our research environments, prompting the question, how do we research what we can’t see?
Introduction to Digital Ethnography
Tuesday morning 9am-12:30pm with break – 3 hr class time
Digital media are increasingly part of both the contexts in which we do research and the techniques and processes through which we practice ethnography. This creates a new context for practicing ethnography, whereby the digital, methodological, practical and theoretical dimensions of ethnographic research are increasingly intertwined. Doing research with, through and in an environment partially constituted by digital media has led to the development of new and innovative methods and has challenged existing conceptual and analytical categories. It has invited us not only to theorize the digital world in new ways, but also to re-think how we have understood pre-digital practices, media and environments. This session serves as an introduction to the principles of digital ethnography. Using examples from research projects carried out in different national and regional contexts, the session highlights a variety of ethnographic strategies with an eye towards the use and application of digital ethnography in different contexts.
Participants are asked to send a brief title, bio and description of their research projects to the convenor (firstname.lastname@example.org) on or before Monday 3 December and should be prepared to discuss their research during the session.
Multiplatform Issue Mapping
Tuesday afternoon 1:30pm-5pm with break – 3 hr class time
Ariadna Matamoros Fernández
This workshop introduces issue mapping as an advanced method to account for the cultural dynamics of digital media within and across platforms. The first part of this session will contextualize issue mapping within the field of Internet studies and illustrate how the method can be applied to social media analysis. The second part of the session will focus on YouTube as an object of research. We will use the YouTube Data Tools (Rieder, 2015) to gather data and explore different analytical possibilities (for example, video and channel networks, YouTube search results, and the examination of YouTube vernaculars and their impact on the platform’s algorithms). The workshop will demonstrate how the use of digital tools in combination with network and qualitative analyses can help illuminate the platform-specificity of social media controversies both from the perspective of users’ practices and of the platforms’ mediation of them. Overall, we will discuss the importance of treating platforms as active actors in the shaping of controversies rather than as transparent sources of behavioural data on social issues.
Complete the readings listed below.
Watch the YouTube tutorials for a first introduction to the YouTube Data Tools in order to get familiar with the tool.
Think about why you want to know how to use the YouTube Data Tools. What are you interested in? Do you think this tool can help you answer your research questions?
- Burgess, J., & Matamoros-Fernández, A. (2016). Mapping sociocultural controversies across digital media platforms: one week of #gamergate on Twitter, YouTube, and Tumblr. Communication Research and Practice, 2(1), 79–96.
- Gephi & Visual Network Analysis at DMI 2017 Summer School, slides: https://wiki.digitalmethods.net/pub/Dmi/SummerSchool2017/Gephi%20Tutorial%20DMI.pdf
Please install the open source software Gephi on your laptop and make sure you also have a spreadsheet program available (Excel, Google Sheets, Numbers, etc.).
YouTube Data Tools tutorials
Topic Models in R
Wednesday morning 9am-12:30pm with break – 3 hr class time
Topic models allow you to find topics and relationships among documents in large collections of texts. There is a growing number of approaches for exploring and making statistical inferences about texts. This session will provide a hands-on introduction to text analysis and topic modelling using "stm: An R package for Structural Topic Models" (https://cran.r-project.org/web/packages/stm/index.html). stm is a comprehensive and highly regarded package to prepare, model and visualise textual data. The session will be mainly practical but I will also provide a short theoretical introduction to topic models and present some applications we can currently find in the literature. In this session, we will walk the length of a standard pipeline for textual analysis. We will explore different techniques and methods to import texts and associated metadata into R from a variety of sources such as PDFs, webpages, spreadsheets and APIs. We will prepare the data and clean the text using Hadley Wickham’s “tidy” approach. We will compute document-term matrices and discuss the benefits of different weighting techniques. Finally, we will estimate, evaluate and visualise topic models to facilitate their interpretation and to communicate the results of the analysis.
You are encouraged to visit this webpage (https://digital-methods-sydney.github.io/ws-201812/) for additional resources on the session.
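The document-term matrix and weighting step mentioned above can be illustrated without any packages. The sketch below is written in Python rather than R, does not use stm, and runs on three invented sentences. The tf-idf variant shown (raw count multiplied by log(N / document frequency)) is one common weighting, assumed here purely for illustration.

```python
import math
import re
from collections import Counter

# Three invented "documents" (not workshop data)
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cats and the dogs",
]

def tokenise(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

tokens = [tokenise(d) for d in docs]
vocab = sorted({t for doc in tokens for t in doc})

# Raw-count document-term matrix: one row per document, one column per term
counts = [Counter(doc) for doc in tokens]
dtm = [[c[term] for term in vocab] for c in counts]

# Document frequency: in how many documents does each term occur?
n_docs = len(docs)
df = {term: sum(1 for c in counts if term in c) for term in vocab}

# tf-idf weighting: terms that occur in every document get weight 0
tfidf = [
    [c[term] * math.log(n_docs / df[term]) for term in vocab]
    for c in counts
]
```

Under this weighting, "the" occurs in all three documents and is weighted down to zero, while "cat" occurs in only one and keeps a positive weight. That contrast is the usual motivation for weighting raw counts.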
TRISMA and Twitter
Wednesday afternoon 1:30pm-5pm with break – 3 hr class time
This workshop will involve scraping data from Twitter’s API and analysing it by creating an influencer network and a network visualisation using Gephi and Tableau. Participants will use Python to perform some analytic tasks. Attendees are not expected to have a Python environment or the necessary packages installed on their laptops. An online Jupyter notebook will be set up for each person to access, and all functions will be executed remotely on the server side.
- Participants will need a laptop or tablet; you will have guest access to the internet
- Local installation of Gephi and Chrome internet browser
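To make the idea of an influencer network concrete, here is a small sketch in Python. It makes no Twitter API call: the tweets are invented, and the helper name edge_list_csv is hypothetical. It counts who mentions whom, ranks accounts by mentions received, and writes an edge list with Source, Target and Weight columns, a layout that Gephi's spreadsheet import accepts.

```python
import csv
import io
import re
from collections import Counter

# Invented (author, text) pairs standing in for collected tweets
tweets = [
    ("alice", "Great thread by @bob on data ethics"),
    ("carol", "Agree with @bob and @alice here"),
    ("dave", "@bob any thoughts?"),
    ("bob", "Thanks @alice!"),
]

MENTION = re.compile(r"@(\w+)")

# Directed, weighted edges: (who mentions, who is mentioned) -> count
edges = Counter()
for author, text in tweets:
    for mentioned in MENTION.findall(text):
        edges[(author, mentioned.lower())] += 1

# Weighted in-degree: how often each account is mentioned by others
in_degree = Counter()
for (source, target), weight in edges.items():
    in_degree[target] += weight

top_account, top_mentions = in_degree.most_common(1)[0]

def edge_list_csv(edge_counts):
    """Return the edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edge_counts.items()):
        writer.writerow([source, target, weight])
    return buffer.getvalue()

print(top_account, top_mentions)
print(edge_list_csv(edges))
```

In-degree is only one simple proxy for influence. Network tools such as Gephi offer further measures once the edge list is loaded.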
Digital ethnography inside Instagram
Thursday, 6 December, 9am-12:30pm with break – 3 hr class time
This session will explore how digital ethnographers are investigating visually oriented digital content through data generated by the public on the popular social photography platform, Instagram. Launched in 2010, Instagram is now among the fastest growing social media platforms, with more than one billion active users worldwide. Six-in-ten users check the app daily or more frequently. The rise of Instagram is built on a number of factors including ease of use and a tactile, uncluttered interface that prioritises visual content over text. With the rise of screen-culture, the visual has become powerful. Rich, nuanced understandings about contemporary life can be gained from scholarly explorations in this realm. In this workshop, I will use examples from recent research to highlight ways that understandings about particular groups of people and broader aspects of contemporary culture can be interpreted from this kind of data. I will share a series of methods I have been developing over the past four years in a number of research contexts. Then I will ask you to consider these methods in light of your research interests. You will then design a small study and, with my guidance, begin exploring a data set focusing on your topic and research question.
- Participants will need a laptop or tablet; you will have guest access to the internet
- Instagram on your phone or a tablet. Follow at least 10 accounts that might be of interest to you as a researcher.
- Access to Excel or Word to create a code sheet
- Research question or an area of interest from which you can create a question
Researching Automated Journalism
Thursday, 6 December, 1:30pm-5pm with break – 3 hr class time
The workshop will start with an introduction to research on the data, analytical and platform-based skills and job roles expected in 21st century newsrooms, and examples in which AI is being integrated into journalism (e-readings sent out prior to workshop). We will then check in with participants about learning goals for this session, and pre-existing knowledge, projects or workplace experiences of automated journalism. The main learning activity in this workshop is data scraping for journalists, using software called OutWit Hub v7.0 (a free version can be downloaded from https://www.outwit.com). Participants should pre-install and preview it before the workshop. In the workshop, step-by-step instructions will be given on how to use the software, along with examples of the types of data that can be accessed, and the news stories that emerge as a result. Then, with a story idea in mind, and working with a buddy, participants will experiment with data scraping and with collecting and collating data for a news story. Assistance will be available. Insights and results will be shared through whole group discussion. We will take a short break after that. In the second half of the workshop, participants will work on a similar story idea, using an open access data site that has already scraped and processed similar types of data, and made it freely available to the public. After exploring the data site, the challenge will be to identify ways journalists can make ‘expert’ use of this data. The workshop will consider excellence in data journalism, with reference to recent prizewinning examples, before a final check-in on the achievement of learning goals, workshop feedback, and networking around projects, research interests, and recommended further readings.
- Participants need to bring a laptop; you will have guest access to the internet
- Pre-install and preview the software OutWit Hub v7.0 (available for free from https://www.outwit.com).
- Involves interactive exercises
Reading Digital Images
Friday, 7 December, 9am-12:30pm with break – 3 hr class time
Chris Chesher and Fiona Andreallo
This workshop introduces some approaches to analysing everyday visual images shared through online media. Chesher and Andreallo will briefly present their recent work including analysing YouTube videos of Furbies, real estate lifestyle images, selfies and memes. Andreallo will look at images as forms of visual conversation (Andreallo, 2017). Chesher will examine how images can be used as evidence of cultural practices. In this workshop participants will explore their own sets of images through these approaches and then share findings and thoughts at the conclusion of the workshop. It is useful, though not essential, to have a set of images in hand, or in mind, to work with. Please also bring your own laptop or electronic device.
- Participants will need a laptop or tablet and guest access to the internet
- No special software is needed.
- The workshop will involve 2 self-contained interactive exercises. You can bring your own images, or we will guide you in finding images to analyse during the workshop.
Distant Reading Stream
These are a series of stand-alone classes. The participants can attend one or more as they prefer. The "Introduction to Distant Reading" class is spread over two days (the two sessions are not repeats). The other classes are half days.
Introduction to Distant Reading
Tuesday and Wednesday mornings 9-12:30. These build on each other. They are not repeats.
Dr Michael Falk (Western Sydney University)
Over two half-days, you will get a practical introduction to distant reading using Python and the Natural Language Toolkit. Distant reading is the art of extracting meaning from text without physically reading the words. With advances in software, computing power, and the availability of digital texts, it is now possible for even a modest programmer with a modest computer to perform powerful kinds of distant analysis. At the end of these two days of workshops, you will know how to import text into Python, tokenise it, search through it intelligently using regular expressions, and use a wide variety of different tools and algorithms to find and visualise different meanings in your corpus.
Python is rapidly becoming one of the most popular programming languages, because of its intuitive syntax, open-source philosophy and massive community of developers who are constantly adding new functionality. At the end of these workshops, you will have learnt how to write your own program in Python, and will know how to find Python packages that can provide the functionality you need for your project.
Please visit the workshop's repository for instructions on how to install the necessary software.
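The first steps named above (import text, tokenise it, search it with regular expressions) can be sketched as follows. This sketch assumes only Python's standard library, with a simple regular expression standing in for the Natural Language Toolkit's tokenisers, and the sample sentence is invented.

```python
import re
from collections import Counter

# Invented sample text; in practice this would be read from a file
text = (
    "The reader reads. Reading distantly, she read ten thousand "
    "novels without reading one."
)

# Tokenise: lower-case the text and keep runs of letters and apostrophes
tokens = re.findall(r"[a-z']+", text.lower())

# Count how often each token occurs
freq = Counter(tokens)

# Regular-expression search: the verb forms read, reads and reading
# (the noun "reader" is deliberately not matched by this pattern)
read_forms = [t for t in tokens if re.fullmatch(r"read(s|ing)?", t)]

print(freq.most_common(3))
print(read_forms)
```

Counting and pattern matching over tokens like this is the basis for many of the more elaborate distant-reading tools.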
Thursday morning 9-12:30.
Dr Rachel Hendery and Dr Michael Falk (Western Sydney University)
You will need to bring a laptop with an up-to-date installation of the Chrome web browser, capable of connecting to the guest wifi that we will provide you with. No other preparation is required.
Introduction to Corpus Linguistics
Friday morning 9-12:30
Monika Bednarek (University of Sydney)
This hands-on workshop will introduce participants to corpus linguistics - the computer-based analysis of text. The workshop will involve text analysis using free and easy-to-use corpus linguistic software. Participants need to bring their own laptop. The workshop does not require prior expertise in computer software or programming.
- You will need to bring a laptop, capable of connecting to the guest wifi that we will provide you with.
- Pre-install and preview the software AntConc (available for free from http://www.laurenceanthony.net/software/antconc/).
- You can bring your own dataset(s) if you want. These should consist of plain text files (.txt), ideally in UTF-8 or ASCII encoding
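One core corpus-linguistic technique is the concordance, or keyword-in-context (KWIC) display, which lists every occurrence of a word together with its neighbouring words. The sketch below shows the idea in Python. It is not how AntConc is implemented, and the function name kwic and the sample text are hypothetical.

```python
import re

def kwic(text, keyword, window=3):
    """Return (left context, keyword, right context) for each occurrence."""
    tokens = re.findall(r"\w+", text.lower())
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits

# Invented sample text
sample = (
    "The data were cleaned. Then the data were coded, "
    "and the coded data were analysed."
)

# Print an aligned concordance with two words of context on each side
for left, key, right in kwic(sample, "data", window=2):
    print(f"{left:>12} | {key} | {right}")
```

Reading down the aligned column makes recurring patterns around the keyword easy to spot, which is the point of a concordance.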
Linked Open Data Stream
This workshop runs for three full days (Tuesday, Wednesday and Thursday 9-5). Each session builds on the previous session; they are not repeats. Optional sessions will be held on Monday afternoon 1:30-5 and Friday morning 9-12:30. The Monday session will provide assistance in installing software, if needed. The Friday session will provide one-on-one assistance with individual questions or projects.
Terhi Nurmikko-Fuller (ANU)
This workshop introduces participants to the key concepts and some of the main technologies behind Linked Data and the Semantic Web. Learners will explore data captured in RDF, design ontological structures, and learn how to write SPARQL queries. No prior knowledge or experience of working with Linked Data is necessary.
There is a file with a list of readings and resources you can download and take a look at before the workshop. An optional session on the Monday afternoon is available for people who need help installing software.
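For readers new to the idea, RDF represents data as subject-predicate-object triples, and a SPARQL query matches patterns against them. The toy sketch below imitates that in plain Python. It uses no RDF library and no real SPARQL engine, and the "ex:" identifiers and the function name match are hypothetical.

```python
# Each fact is a (subject, predicate, object) triple, as in RDF.
# The "ex:" names are made-up identifiers, not a real vocabulary.
triples = {
    ("ex:Book1", "ex:author", "ex:PersonA"),
    ("ex:Book2", "ex:author", "ex:PersonA"),
    ("ex:Book3", "ex:author", "ex:PersonB"),
    ("ex:PersonA", "ex:bornIn", "ex:PlaceX"),
}

def match(pattern, store):
    """Match one triple pattern; terms starting with '?' are variables."""
    results = []
    for triple in sorted(store):
        binding = {}
        for want, have in zip(pattern, triple):
            if want.startswith("?"):
                # A variable must bind to the same value everywhere it appears
                if binding.setdefault(want, have) != have:
                    break
            elif want != have:
                break
        else:
            results.append(binding)
    return results

# Roughly analogous to the SPARQL query:
#   SELECT ?work WHERE { ?work ex:author ex:PersonA . }
works = match(("?work", "ex:author", "ex:PersonA"), triples)
print(works)
```

Real SPARQL adds joins across several patterns, filters and much more, but the underlying idea of binding variables to matching triples is the same.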
Projects and Writing Stream
This stream contains a number of stand-alone classes. Participants may attend one or more of the sessions as they prefer. Textual Scholarship and Markup, and Project Management are each full days. The other classes are half days.
Beginner's guide to LaTeX
Half day (Monday afternoon 1:30-5)
Dr Rose Faunce (ANU)
This course provides a hands-on introduction to making a well-structured document in LaTeX, with the focus on formats for academic writing in the humanities. Participants will learn how to adapt a LaTeX template to suit the individual requirements of their thesis or essay. You will also find out how to source templates for other contexts (articles, reports, translations etc.) and how to reach out to LaTeX support networks and communities for future assistance. In the final part of the course we will consider the pros and cons of using LaTeX to format and typeset your thesis instead of standard word-processing software.
- Participants will require a laptop.
- Before the workshop please install the current distribution of TeXShop suitable for your computer's operating system. It is free. If you have difficulties, Rose will be available during the lunch break to help.
For Mac OS 10.10, 10.11, 10.12 or higher, the current distribution is TeXShop 4.15. Downloading with Safari is recommended. To download, go to http://www.tug.org/mactex/mactex-download.html
For Mac OS 10.4 to 10.9, choose the applicable version to download from https://pages.uoregon.edu/koch/texshop/obtaining.html
For Windows, follow the instructions at https://miktex.org/howto/install-miktex
- Bring a backed up copy of an essay or thesis (in any format) to experiment with.
Textual scholarship and markup technology
Full day (Tuesday 9-5)
Dr Rose Faunce (ANU)
This course offers a hands-on introduction to the scholarly skill of textual transcription using web-based tools.
Initiatives involving crowdsourcing to annotate, transcribe and translate manuscripts, rare books, archival documents and images offer interactive and engaging ways to connect with and share all kinds of material. We will explore the different approaches taken in two of these projects. Participants will then carry out short transcription exercises, mark up a text using T-Pen (Transcription for Paleographical and Editorial Notation) and learn how to annotate IIIF content in Mirador Viewer.
- Participants require their own laptop.
- Access to T-Pen requires registration. Get in touch with the instructor (email@example.com) before the workshop to add your name to the registration list.
- A selection of texts will be available to transcribe or participants have the option of working on a text of their own—to do this an image of the text scanned at high resolution will be required.
Understanding and Engaging Knowledge in a ‘Social’ Context
Half day (Wednesday morning 9-12:30)
Prof. Ray Siemens (University of Victoria, Canada)
This half-day session explores the underpinnings of online, social knowledge formation and engagement from perspectives that include intellectual history, knowledge domain area expertise and academic authority, facilitation of qualitative assurance process, and beyond. Examples will be drawn from scholarly editing and areas identified by participants, largely with focus on Wikimedia’s Wikipedia and Wikibooks.
Take a look at some of the following resources, tools, projects:
- Wikibooks: Open Books for an Open World. Wikimedia Foundation, Inc.
- Wikipedia: The Free Encyclopedia. Wikimedia Foundation, Inc.
- “Annotated Bibliography of Social Knowledge Creation” [Alyssa Arbuckle, Nina Belojevic, Tracey El Hajj, Randa El Khatib, Lindsey Seatter, and Ray Siemens, with Alex Christie, Matthew Hiebert, Jon Saklofske, Jentery Sayers, Derek Siemens, Shaun Wong, and the INKE and ETCL Research Groups. 29-264 in Alyssa Arbuckle, Aaron Mauro, and Daniel Powell, eds. Social Knowledge Creation in the Humanities: Volume 1 NMRTS 7. Toronto / Tempe: ITER and the Arizona Center for Medieval and Renaissance Studies, 2017.]
- British Library Additional Manuscript 17,492 (The Devonshire MS)
- A Social Edition of the Devonshire MS (BL Add 17,492). Eds. Ray Siemens, Karin Armstrong, Barbara Bond, Constance Crompton, Terra Dickson, Johanne Paquette, Jonathan Podracky, Ingrid Weber, Cara Leitch, Melanie Chernyk, Bret D. Hirtsch, Daniel Powell, Alyssa Arbuckle, Chris Gaudet, Eric Haswell, Arianna Ciula, Daniel Starza-Smith, James Cummings, with Martin Holmes, Greg Newton, Jonathan Gibson, Paul Remley, Erik Kwakkel, and Aimie Shirkie. Wikibooks and the Devonshire MS Advisory Group, 2012. 1394 pp. <http://en.wikibooks.org/wiki/The_Devonshire_Manuscript>.
- The Devonshire MS (BL Add 17,492) of Early Tudor Poetry. Eds. Ray Siemens, Karin Armstrong, Barbara Bond, Constance Crompton, Alyssa Arbuckle, Terra Dickson, Johanne Paquette, Jonathan Podracky, Ingrid Weber, Cara Leitch, Melanie Chernyk, Bret D. Hirtsch, Daniel Powell, Chris Gaudet, Eric Haswell, Arianna Ciula, Daniel Starza-Smith, James Cummings, with Martin Holmes, Greg Newton, Jonathan Gibson, Paul Remley, Erik Kwakkel, and Aimie Shirkie. Toronto / Tempe: Iter and Medieval and Renaissance Texts and Studies, 2015. xiv+519 pp.
- The Devonshire Manuscript: A Women’s Book of Courtly Poetry. Ed. Elizabeth Heale. The Other Voice in Early Modern Europe: The Toronto Series 19. Toronto: Iter Inc. and Centre for Reformation and Renaissance Studies, 2012. xiii + 278 pp.
Open Access and Open Social Scholarship Foundations
Half day (Wednesday afternoon 1:30-5)
Dr Alyssa Arbuckle (University of Victoria, Canada)
Full day (Thursday 9-5)
Prof. Lynne Siemens (University of Victoria, Canada)
This offering will cover the basics of project management from project definition to project review upon completion, including risk assessment and mitigation, work effort modeling, software tools and related internet resources, and other topics. The course will be a combination of lectures and hands-on activities with participants' own projects.
R and Data Wrangling Stream
Introduction to R for data wrangling and visualisation
This is a series of half-day workshops that run Monday to Thursday inclusive, 1:30-5, each building on the previous one. They are not repeats. The first two are introductory and the second two are intermediate. People with some R background can join halfway through for the intermediate sessions only. People with no R background are advised to attend all sessions.
Siva Kalyan and Hedvig Skirgård
In this course you will learn how to manage your data efficiently and visualise the results of your research using the programming language R and the “tidyverse” packages. The course is intended to be accessible to beginners: we will cover the basics of R scripting and the RStudio environment thoroughly and with the aim that all students become independently capable of using R for their own needs. We will not be teaching statistical modeling, but will focus purely on data wrangling and visualisation. We will mainly use example datasets from linguistics; however, no prior knowledge of linguistics is required, and you are encouraged to bring in your own data.