Learning Analytics 101

Emerging from research into the visualisation of argument construction, the analysis of learner interactions within networks has become widely recognised in recent years as a rich and effective means of providing feedback on learner progress (Najjar, Duval and Wolpers, 2006) – facilitating personalised learning (Beck and Woolf, 2000), developing collective intelligence (De Liddo, et al., 2012), automating metadata annotation (Downes, 2004), and offering opportunities for enhanced discoverability (Siemens, 2012).

Learning Analytics (LA) is a relatively new area of research that is comparable with other fields, such as Big Data, e-science, Web analytics, linguistic analysis and Educational Data Mining (EDM). All of these fields use large collections of in-depth data to identify patterns. While EDM and LA have many similarities, EDM tends to focus on analysing metrics with the aim of building prediction models (e.g. Kizilcec, Piech and Schneider, 2013; Wen, Yang and Rosé, 2014), while LA leans towards analysing data in order to improve learning processes. Both applications of technology have the potential to disrupt, and have critical implications for, future teaching and learning practice, with far-reaching but little-understood outcomes.

The underlying assumptions of LA are based on the belief that Web-based proxies for behaviour can be used as evidence of knowledge, competence and learning. Through the collection and analysis of “trace data” (e.g. learners’ search profiles, their website selections, and how they construct, use and move information on the Web – Stadtler and Bromme, 2007; Greene, Muis and Pieschl, 2010) learning analysts explore “how students interact with information, make sense of it in their context and co-construct meaning in shared contexts” (Knight, Buckingham Shum and Littleton, 2014:10). LA methods that focus on discussion forums include identifying learners’ attention, analysing sentiment (agreement or disagreement), measuring learner activity, and mapping relationships between learners within forums (De Liddo, et al., 2011).
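
As a rough illustration of two of the forum-focused methods mentioned above (learner activity and relationships between learners), the sketch below counts posts per learner and builds a weighted reply network. The post records, field layout and function names are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter

# Hypothetical forum posts: (post_id, author, id of the post replied to, or None)
POSTS = [
    (1, "ana", None),
    (2, "ben", 1),
    (3, "cho", 1),
    (4, "ana", 2),
    (5, "ben", 4),
]


def activity(posts):
    """Count how many posts each learner has written."""
    return Counter(author for _, author, _ in posts)


def reply_network(posts):
    """Count directed reply edges as (replier, original author) pairs."""
    author_of = {post_id: author for post_id, author, _ in posts}
    edges = Counter()
    for _, author, parent in posts:
        # Skip thread starters and replies to posts missing from the data.
        if parent is not None and parent in author_of:
            edges[(author, author_of[parent])] += 1
    return edges


if __name__ == "__main__":
    print(activity(POSTS))
    print(reply_network(POSTS))
```

Counts like these only describe who posted and who replied to whom; interpreting them as evidence of learning still depends on the assumptions discussed in this post.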

Design of LA instruments is not neutral, but inevitably reflects the ideology, epistemology and pedagogical assumptions of the designers. Data are not value free; they require interpretation and are subject to “interpretative flexibility” as much as any other technological development (Collins, 1983; Hamilton and Feenberg, 2005). Historically, information and communication technology (ICT) interventions in education have been based on objectivist assumptions that learners’ ability to represent or mirror reality is key to judging evidence of knowing and learning. While objectivism still maintains a strong position in summative assessment, over the past thirty years its underlying assumptions have been challenged by a growing body of constructivist thought which holds that the key to understanding how knowledge is built is through examining the interpretive process of learning (Jonassen, 1991). The practice of Learning Analytics broadly adheres to either an objectivist perspective, which prioritises the use of trace data to make evaluations of knowledge acquisition (assessment of learning), or a constructivist position, which values the provision of feedback to facilitate improved learner self-awareness (assessment for learning).


Learning Analytics provides some evidence that awareness of peer feedback improves collaboration (Phielix, et al., 2011), and a key method for providing feedback is to draw attention to useful interaction metrics through visualisation techniques. Duval (2011) asserts that data visualisation “dashboards” can provide useful feedback mechanisms for learners and educators, which can aid their evaluation of learning resources and may lead to improved discovery of content that is better suited to their needs.

For example, Murray et al. (2013) describe a prototype dashboard which aims to support learners’ online deliberations through the use of textual analysis to identify and monitor reflection, questioning, conceptualising and peer interaction, as well as other social awareness metrics. Equipped with such a dashboard, facilitators may monitor common online forum problems such as off-topic conversation, conversation dominated by specific contributors, and high emotional content.
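
To make one of these forum problems concrete, here is a minimal sketch of how a dashboard might flag a conversation dominated by specific contributors. It is not the method used by Murray et al.; the function name and the 0.4 threshold are assumptions chosen purely for illustration.

```python
from collections import Counter


def dominant_contributors(authors, threshold=0.4):
    """Return contributors whose share of posts exceeds the threshold.

    `authors` holds one entry per post. `threshold` is an arbitrary
    illustrative cut-off, not a value taken from Murray et al. (2013).
    """
    if not authors:
        return {}
    counts = Counter(authors)
    total = len(authors)
    return {
        name: count / total
        for name, count in counts.items()
        if count / total > threshold
    }


if __name__ == "__main__":
    # Hypothetical thread: four of the six posts are by the same learner.
    thread = ["ana", "ana", "ben", "ana", "cho", "ana"]
    print(dominant_contributors(thread))
```

A real facilitator dashboard would presumably combine a signal like this with textual analysis, since a high post share on its own says nothing about the quality of the contributions.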

Duval (2011) asserts that “one of the big problems around learning analytics is the lack of clarity about what exactly should be measured” (2011:15) and suggests that “typical measurements…of time spent, number of logins, number of mouse clicks, number of accessed resources…” (2011:15) are not adequate metrics for finding out how learning is being accomplished. Visualising other data sources, including “emotion and stress analytics” (Verbert, et al., 2014:1512), may be relevant to enhance reflection and monitoring.
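
The “typical measurements” Duval lists are easy to derive from a platform’s event log, which may be part of their appeal. The sketch below computes them for a hypothetical log; the event format, the field names and the rule that a session runs from a login to the learner’s last event before the next login are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (learner, ISO timestamp, event type, resource or None)
EVENTS = [
    ("ana", "2014-11-03T09:00:00", "login", None),
    ("ana", "2014-11-03T09:05:00", "click", "video-1"),
    ("ana", "2014-11-03T09:20:00", "click", "forum"),
    ("ana", "2014-11-04T10:00:00", "login", None),
    ("ana", "2014-11-04T10:10:00", "click", "video-1"),
]


def basic_metrics(events):
    """Summarise logins, clicks, distinct resources and minutes per learner."""
    stats = defaultdict(
        lambda: {"logins": 0, "clicks": 0, "resources": set(), "minutes": 0.0}
    )
    session = {}  # learner -> (session start, latest event time)

    def close(learner):
        # Add the length of the learner's open session to their total.
        start, end = session.pop(learner)
        stats[learner]["minutes"] += (end - start).total_seconds() / 60

    # ISO timestamps in the same format sort correctly as plain strings.
    for learner, stamp, kind, resource in sorted(events, key=lambda e: e[1]):
        when = datetime.fromisoformat(stamp)
        if kind == "login":
            if learner in session:
                close(learner)
            session[learner] = (when, when)
            stats[learner]["logins"] += 1
        else:
            start, _ = session.get(learner, (when, when))
            session[learner] = (start, when)
            if kind == "click":
                stats[learner]["clicks"] += 1
            if resource is not None:
                stats[learner]["resources"].add(resource)

    for learner in list(session):
        close(learner)
    return dict(stats)


if __name__ == "__main__":
    for learner, summary in basic_metrics(EVENTS).items():
        print(learner, summary)
```

Nothing in these numbers distinguishes a learner who watched a video attentively from one who left it playing in another tab, which is the limitation Duval points to.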

Ethical Issues

Learning analytics involves common data mining techniques and as such may conflict with ethical values like privacy and individuality. Data mining makes it difficult for individuals to control how their information is presented or distributed. Van Wel and Royakkers (2004) have identified two main forms of data mining, “content and structure mining” and “usage mining”:

“Content and structure mining is a cause for concern when data published on the Web in a certain context is mined and combined with other data for use in a totally different context. Web usage mining raises privacy concerns when Web users are traced, and their actions are analysed without their knowledge.” (van Wel and Royakkers, 2004:129).


Critics have focused on a number of problems with the outcomes of analysing learning. The reliable validation of human and automatic annotation is problematic and unresolved (Rourke, et al., 2003; de Wever, et al., 2006); crude feedback mechanisms can lead to efforts to “game the system”, so that educators design learning objects to elicit positive responses regardless of the overall benefit to learners; analytics can lead learners to depend on feedback rather than on their own understanding; and the ethical implications of combining and representing data are not fully comprehended (Shum and Ferguson, 2012).


  • Beck, J. E., and Woolf, B. P. (2000). High-level Student Modeling with Machine Learning. In G. Gauthier, C. Frasson and K. VanLehn (eds.), Proceedings of 5th International Intelligent Tutoring Systems Conference, ITS 2000, 584-593. June 19-23, 2000, Montréal, Canada.
  • Collins, H. M. (1983). An Empirical Relativist Programme in the Sociology of Scientific Knowledge. In K. Knorr-Cetina and M. Mulkay (eds.), Science Observed. Perspectives on the Social Study of Science, 85-113. London: Sage Publications.
  • De Liddo, A., Buckingham Shum, S., Quinto, I., Bachler, M., and Cannavacciuolo, L. (2011). Discourse-centric Learning Analytics. In Proceedings of the 1st International Conference on Learning Analytics and Knowledge, 23–33. February 27 – March 1, 2011, Banff, Alberta.
  • De Liddo, A., Buckingham Shum, S., Convertino, G., Sándor, Á. and Klein, M. (2012). Collective intelligence as community discourse and action. In Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work Companion (CSCW ’12). ACM, New York, NY, USA, 5-6.
  • de Wever, B., Schellens, T., Valcke, M. and van Keer, H. (2006). Content Analysis Schemes to Analyze Transcripts of Online Asynchronous Discussion Groups: A Review. In Computers and Education, 46(1), 6-28.
  • Downes, S. (2004). Resource Profiles. In Journal of Interactive Media in Education, 5, 1–32. Special Issue on the Educational Semantic Web. [Online] Available at: http://www-jime.open.ac.uk/2004/5 [Accessed on 5 September 2014].
  • Duval, E. (2011). Attention please! Learning Analytics for Visualization and Recommendation. In Proceedings of LAK11: 1st International Conference on Learning Analytics and Knowledge, 9–17. February 27-March 1, 2011, Banff, Alberta.
  • Greene, J. A., Muis, K. R., and Pieschl, S. (2010). The Role of Epistemic Beliefs in Students’ Self-Regulated Learning with Computer-Based Learning Environments: Conceptual and Methodological Issues. In Educational Psychologist, 45(4), 245–257.
  • Hamilton, E., and Feenberg, A. (2005). The Technical Codes of Online Education. In Techné: Research in Philosophy and Technology, 9(1).
  • Jonassen, D. H. (1991). Objectivism Versus Constructivism: Do We Need a New Philosophical Paradigm? In Educational Technology Research and Development, 39(3), 5-14.
  • Kizilcec, R. F., Piech, C., and Schneider, E. (2013). Deconstructing Disengagement: Analyzing Learner Subpopulations in Massive Open Online Courses. In Proceedings of the Third International Conference on Learning Analytics and Knowledge, 170-179. ACM. April 08 – 12, 2013, Leuven, Belgium.
  • Knight, S., Buckingham Shum, S. and Littleton, K. (2014). Epistemology, Assessment, Pedagogy: Where Learning Meets Analytics in the Middle Space. In Journal of Learning Analytics, 1(2), 23-47.
  • Murray, T., Wing, L., Woolf, B., Wise, A., Wu, S., Clark, L. and Xu, X. (2013). A Prototype Facilitators Dashboard: Assessing and Visualizing Dialogue Quality in Online Deliberations for Education and Work. In Proceedings of International Conference on e-Learning, e-Business, Enterprise Information Systems, and e-Government (EEE’13), 34-40. July 22-25, 2013 Las Vegas Nevada.
  • Najjar, J., Duval, E., and Wolpers, M. (2006). Attention Metadata: Collection and Management. In Proceedings of WWW2006 Workshop on Logging Traces of Web Activity: The Mechanics of Data Collection, 1-4. 23-26 May, 2006, Edinburgh.
  • Phielix, C., Prins, F. J., Kirschner, P. A., Erkens, G., and Jaspers, J. (2011). Group awareness of social and cognitive performance in a CSCL environment: Effects of a peer feedback and reflection tool. In Computers in Human Behavior, 27(3), 1087-1102.
  • Shum, S. B., and Ferguson, R. (2012). Social Learning Analytics. In Educational Technology and Society, 15(3), 3-26.
  • Siemens, G. (2012). Learning analytics: envisioning a research discipline and a domain of practice. In Proceedings of the 2nd International Conference on Learning Analytics and Knowledge, 4-8. ACM. April 29 – May 02, 2012, Vancouver, BC, Canada.
  • Stadtler, M., and Bromme, R. (2007). Dealing with Multiple Documents on the WWW: The Role of Metacognition in the Formation of Documents Models. In International Journal of Computer-Supported Collaborative Learning, 2(2), 191–210.
  • van Wel, L., and Royakkers, L. (2004). Ethical Issues in Web Data Mining. In Ethics and Information Technology, 6(2), 129-140.
  • Verbert, K., Govaerts, S., Duval, E., Santos, J. L., Van Assche, F., Parra, G., and Klerkx, J. (2014). Learning dashboards: an overview and future research opportunities. In Personal and Ubiquitous Computing, 18(6), 1499-1544.
  • Wen, M., Yang, D., and Rosé, C. P. (2014). Sentiment Analysis in MOOC Discussion Forums: What does it tell us? In Proceedings of the 7th International Conference on Educational Data Mining (EDM 2014), 130-137. July 4 – 7, 2014, London, UK.

DAL MOOC – Week 1 Reflection

I have just completed the first week of the Data, Analytics and Learning MOOC on the edX platform, and as an end of week activity I’ve been asked to research learning analytics tools, add them to a table, and upload the table here. I’ve also been asked to provide a definition of Learning Analytics, share my reflections on week one in terms of a) content presented, and b) course design.

LA Tools Research

Apparently people are very interested in cleaning, modeling, analysing, and visualising data, because there are many, many tools available to do some or all of this (some of them free), and too many to list here. So here’s my Learning Analytics Tools Matrix for you to download.

My definition

Learning Analytics uses Web-based activity as a proxy for behaviour that provides evidence of knowledge, competence and learning, and facilitates the building of predictive models and the analysis of networked interactions.

For a more in-depth explanation of my understanding of LA – see my Learning Analytics 101 post.

Reflections on Week 1

The content and interface look good, the Google Hangouts are informative, and there appear to be a lot of useful data wrangling tools which I’m going to find out how to use.

But, I find the ProSolo “social competency” tool difficult to navigate, and I’ve yet to figure out how to use the Learning Progress and Credentials functions. I joined a conversation this morning which I can’t find (which is frustrating because it contained some of my reflections on the course). I like to get stuck in straight away and would have liked a simple, practical bit of analytics as a taster of what’s to come.

Finally, although the Google Hangouts are useful, a lot of time is spent housekeeping, and managing technology, which is fairly OK when live, but could easily be cut out for later distribution (which I have done myself – see my previous post).

All in all, it looks good, and I can’t wait to get stuck into week 2.

DAL MOOC – Week 1 Beginning

I’ve just started the edX Data, Analytics, and Learning MOOC (rather late – but I’m catching up) which has involved getting the hang of ‘Hangouts’. These are informal chats between experts which can take a while to get going, but inside these video conversations there are nuggets of extremely useful stuff. So in the spirit of the ‘revise/remix’ ethos of the course I’ve started to edit them into ‘bitesize’ chunks (using the free version of Lightworks). The first two are from week 1 where George Siemens (Athabasca University), Carolyn Rosé (Carnegie Mellon University), Dragan Gašević (Athabasca University) and Ryan Baker (Columbia University) give their definitions of Learning Analytics and answer the question “what do you do when you do Learning Analytics?”.

Personally, I can’t wait to get to Dr Rosé’s section as her work on Discourse Analysis sounds right up my street, but I’m also very interested in Social Network Analysis, which will be covered by Dr Gašević, and Prediction Modeling, which will be led by Dr Baker.

A range of open source tools is used on this course:
  • LightSIDE for Discourse Analysis
  • Gephi for Social Network Analysis
  • RapidMiner for Prediction Modeling

The course also encourages learners to use the social networking aggregation and learning support tool, ProSolo, provides an introduction to Tableau – “a good tool to get started with” data analysis and visualisation – and shares a load of cleaned and anonymised datasets for us to play with.

Early days

Poster Session at ILIaD Conference 2014/Sue White © 2014/CC BY 2.0

A recent highlight was the ILIaD Conference, held at the University of Southampton, where I had the opportunity to discuss my summer research project with Digital Literacies expert, author of Rethinking Learning for the Digital Age, and keynote speaker, Helen Beetham during the poster session.

These early days in my life as a PhD research student seem to have been mainly taken up with training. I undertook an excellent Demonstrator training session at the end of September, which has led to some paid work evaluating first year student presentations throughout November. I have also undertaken the very useful Public Engagement and Presenting Your Research training as well as the Successful Business Engagement programme, Finding Information to Support Your Research and the Lifelong Learning PGR Training Package.

Public Engagement/Tim O'Riordan ©2014/CC BY 2.0

Induction into the Web and Internet Science research group provided an opportunity to find out what my lab colleagues are mainly up to and to do some lightweight research into Provenance (see: WAIS Provenance Project).

I attended a seminar on the Portus MOOC (the focus of my summer dissertation project) and met with MOOC team leader and dedicated online learning exponent, Graeme Earl. It was great to hear Graeme talking about how he approached designing the course, using some of the key terms from the model I used for coding attention to learning in my project. He also raised issues about managing high numbers of learners’ comments. How do we scale MOOCs, engage larger numbers of learners and still pay attention to scaffolding? My hope is that my research will provide some support in this area.

I joined the PEGaSUs e-learning research group at the end of October. This is an interdisciplinary group of research students from diverse backgrounds who have a shared interest in online learning. Topics for discussion included teaching in emerging economies, blended learning, measuring teacher performance, communities of practice, e-portfolios, and (of course) Massive Open Online Courses. Post-meeting I set up a (closed) Facebook group to help keep the discussion going between meetings. I think PEGaSUs stands for Postgraduate E-learning Group at Southampton University (looks about right).

Finally, the British Computer Society Introduction to Badges for Learning event, run by Sam Taylor and Julian Prior at Southampton Solent University at the end of October, provided an excellent overview of this emerging method for enhancing learner engagement.

My growing to-do list is mainly taken up with the tasks involved in reviewing, editing and generally preparing my summer project for publication. This involves:

  • Finding suitable journals/conferences that may be interested in publishing my work.
  • Looking for what’s missing in my dissertation.
  • Interrogating my analysis (Is it watertight? Does it hold up under scrutiny?).
  • Reading more papers on learning analytics and content analysis.
  • Applying my analysis to other data sets (possibly the Web Science MOOC comment data).
  • Getting a paper ready for submission by the end of February 2015.