MSc Web Science – Week 21

Open Data Institute Annual Summit/ODI © 2013/CC BY-NC-SA 2.0

Next week is Web Science Research Week. I’m working on a team project: Using Linked Data to Record and Expose Linked Resources. This is the brief:

Critical to the research is the collection, curation and organisation of related work, examples and business cases that build the key evidence base upon which further work can be carried out scientifically – standing on the shoulders of giants. While an individual may be able to collate and organise references to other works, it is a much harder job to organise and expose multiple collections of resources at an organisational (and global) scale. This project asks whether linked data can help. The ODI uses several mechanisms to collect links, references and citations, each suited to the demands of its project and the individual preferences of those carrying out each project. These links and references represent a valuable resource not only to the organisation itself but also to the wider population; however, organising and exposing these links in context with the reports, projects and work they relate to is a problem as yet unsolved. With several projects now complete, links and references have accumulated in Google Docs, Pinboard and Zotero, and we want to find out whether these references can be organised, given back context (where they have been used), interlinked (related links) and exposed. We believe linked data could hold the answer to these problems.

This week’s readings:

COMP6050 – Semantic Web for Web Science

  • Daniele Nardi and Ronald J. Brachman, An Introduction to Description Logics, in Franz Baader, Diego Calvanese, Deborah L. McGuinness, Daniele Nardi and Peter F. Patel-Schneider (eds) The Description Logic Handbook: Theory, implementation and applications, Cambridge University Press, 2003, pp. 1-40
  • F. Baader and W. Nutt, Basic Description Logics, in Franz Baader, Diego Calvanese, Deborah L. McGuinness, Daniele Nardi and Peter F. Patel-Schneider (eds) The Description Logic Handbook: Theory, implementation and applications, Cambridge University Press, 2003, pp. 47-100
  • FOAF Worksheet

COMP6052 Social Networking Technology

RESM6003 – Qualitative Methods

  • Blenkinsop, L. (2009) ‘The internet: virtual space’, in S. Barber and C. Peniston-Bird (eds) History Beyond the Text: A student’s guide to approaching alternative sources, pp. 122-135
  • Dobson, M. (2009) ‘Letters’, in M. Dobson and B. Ziemann, Reading Primary Sources: The interpretation of texts from nineteenth- and twentieth-century history, pp. 57-73 [e-book]
  • Roper, M. (2009) The Secret Battle: Emotional survival in the Great War, Manchester: Manchester University Press.
  • Steven, S. (2002) ‘Making sense of letters and diaries’, History Matters: The U.S. Survey Course on the Web.
  • Background reading on archives and source interpretation:
  • Abrams, L. (2010) Oral History Theory, London: Routledge [D 16.11 ABR] (Review).
  • Jordanova, L. (2012) The Look of the Past: Visual and Material Evidence in Historical Practice, Cambridge University Press.
  • Tosh, J. (2006, 4th edn) The Pursuit of History: Aims, methods and new directions in the study of modern history, Pearson. [D 16.2 TOS] Ch. 3, ‘The Raw materials’ and Ch. 4, ‘Using the sources’ [e-book]
  • Trouillot, M-R. (2004) Silencing the Past: Power and the production of history, Boston: Beacon Press

MSc Web Science – Week 20

Girl inspector confers with worker/Library of Congress c. 1939/No known copyright restrictions 

This week’s readings (some resources require institutional login):

COMP6047 – Further Web Science

Social Movements/Global Justice Activism
Analysing Protests

COMP6048 – Interdisciplinary Thinking

  • Repko A. F. (2008) Interdisciplinary Research: Process and Theory. Sage Publications. Chapters 10, 11 and 12.

COMP6050 – Semantic Web for Web Science

COMP6052 Social Networking Technology

  • Shuen, A. (2008) Web 2.0: A Strategy Guide. O’Reilly Media, Inc. [HD 30.213 SHU] Example chapter.
  • Gladwell, M. (2001) The Tipping Point. Abacus. [HM 251 GLA]
  • Watts, D. J. (2004) Six Degrees: The Science of a Connected Age. W. W. Norton & Co. [QA 166 WAT]

RESM6003 – Qualitative Methods


Open Hypermedia and the Web

Tim Berners-Lee

Tim Berners-Lee/Silvio Tanaka ©2009/CC BY 2.0


Tim Berners-Lee, the main architect of the World Wide Web (W3), developed the system while working for CERN, the European Organisation for Nuclear Research, in the late 1980s. W3 was developed to overcome difficulties with managing information exchange via the Internet. At the time, finding data on the Internet required pre-existing knowledge gained through various time-consuming methods: the use of specialised clients, mailing lists, newsgroups, hard copies of link lists, and word of mouth.

At CERN, a large number of physicists and other staff needed to share large amounts of data and had begun to employ the Internet to do this. Although the Internet was acknowledged as a valuable means of sharing data, towards the end of the 1980s the need to develop simpler, more reliable methods encouraged the creation of new protocols using distributed hypermedia as a model.

Developments in Open Hypermedia Systems (OHSs) had gained pace throughout the 80s; a number of stand-alone systems had been prototyped and early attempts at a standardised vocabulary had been made [1]. OHSs facilitate two key features: the separation of link databases (‘linkbases’) from documents, and the provision of hypermedia functions to third-party applications, with potential accessibility within heterogeneous environments.

Two key systems, Hyper-G, developed by a team at the Technical University of Graz, Austria [1], and Microcosm, originating at the University of Southampton [5], were at the heart of pioneering approaches to hypermedia. Like W3, they were launched in 1990, but within 10 years both were outpaced by W3’s overwhelming popularity. Ease of use, the management of link integrity and content reference, and the ‘openness’ of the underlying technology all contributed to W3’s success. However, both Hyper-G’s and Microcosm’s approaches to linking media continue to have relevance for the future development of the Web.

The Dexter Hypertext Reference Model

In 1988 a group of hypertext developers met at the Dexter Inn, New Hampshire to create a terminology for interchangeable and interoperable hypertext standards. About 10 different contemporary hypertext systems were analysed and commonalities between them were described. Essentially each of the systems provided “the ability to create, manipulate, and/or examine a network of information-containing nodes interconnected by relational links.”[6]

The Dexter Model did not attempt to specify implementation protocols, but provided a vital reference model for future developments of hypertext and hypermedia. The Model identified a ‘component’ as a single presentation field which contained the basic content of a hypertext network: text, graphics, images, and/or animation. Each component was assigned a ‘Unique Identifier’ (UID), and ‘links’ that interconnected components were resolved to one or many UIDs to provide ‘link integrity’.
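The Dexter Model’s separation of components, UIDs and links can be made concrete with a short sketch. This is not code from any Dexter implementation – the class and method names are illustrative – but it shows the core idea: every component carries a Unique Identifier, and a link resolves to one or many UIDs rather than to raw content, which is what provides link integrity.

```python
from dataclasses import dataclass, field

@dataclass
class Component:
    uid: str       # Dexter 'Unique Identifier'
    content: str   # text, graphics, images and/or animation

@dataclass
class Link:
    uid: str
    ends: list = field(default_factory=list)  # one or many component UIDs

class Hypertext:
    def __init__(self):
        self.components = {}

    def add(self, comp):
        self.components[comp.uid] = comp

    def resolve(self, link):
        # Link integrity: every end must resolve to a stored component;
        # a missing UID raises an error instead of silently 'dangling'.
        return [self.components[uid] for uid in link.ends]

ht = Hypertext()
ht.add(Component("c1", "Introduction"))
ht.add(Component("c2", "Methods"))
link = Link("l1", ends=["c1", "c2"])   # an n-ary link to two components
print([c.content for c in ht.resolve(link)])
```

Because links are first-class objects resolved through UIDs, deleting a component can be detected at resolution time – the property the Web’s embedded `href` anchors famously lack.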

The World-Wide Web

By the mid-80s Berners-Lee saw the potential for extending the principle of computer-based information management across the CERN network in order to provide access to project documentation and make explicit the ‘hidden’ skills of personnel as well as the ‘true’ organisational structure. He proposed that this system should meet a number of requirements: remote access across networks, heterogeneity, and the ability to add ‘private links’ and annotations to documents. Berners-Lee’s key insights were that “Information systems start small and grow”, and that the system must be sufficiently flexible to “allow existing systems to be linked together without requiring any central control or coordination”.

His proposal also stressed the different interests of “academic hypertext research” and the practical requirements of his employer. He recognised that many CERN employees were using “primitive terminals” and were not concerned with the niceties of “advanced window styles” and interface design [2].

Towards the end of 1990, work was completed on the first iteration of W3, which included a new Hypertext Markup Language (HTML), an ‘httpd’ server, and the Web’s first browser, which included an editor function as well as a viewer. The underlying protocols were made freely available, and within a few years the technology had been used and adapted by a wide variety of Internet enthusiasts who helped to spread W3 technology to wider audiences.


Microcosm

Aimed at providing solutions to perceived problems in contemporary hypermedia systems, Microcosm was launched as an “open model for hypermedia with dynamic linking” [5] in January 1990. The Microcosm team identified that existing hypermedia systems, although useful in closed settings, did not communicate with other applications, used proprietary document formats, were not easily authored, and, as they were distributed on read-only media, did not allow users to add links and annotations.

While Microcosm used read-only media (CD-ROMs and laser-discs) to host components within an authored environment, it separated these ‘data objects’ from linkbases housed on remote servers. This local area network-based system allowed all users, authors and readers alike, to add advanced, n-ary (multi-directional) links to multiple generic objects. Microcosm was also able to process a range of documents, and its modular structure enabled it to offer a degree of interoperability with W3 browsers [7].

While recognising the significance of W3, the Microcosm team identified some weaknesses, especially in the way HTML managed links. Rather than storing links separately, W3 embedded links in documents, which meant users could not annotate or edit web documents, and documents suffered from ‘dangling’ or missing links when targets were deleted or URLs changed. In addition, HTML limited how links could be made: only a small number of tags was allowed, and only single-ended, unidirectional links could be authored. To counter these link integrity issues the Microcosm team developed the Distributed Link Service (DLS), which enabled the integration of linkbase technology into a W3 environment [3].

Using the DLS, W3 servers could access linkbases, enabling users to author generic as well as specific links. Generic link authoring allows users to create links that connect any mention of a phrase within a set of documents, and allows bi-directional links within documents.
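The distinction between generic and specific links can be sketched in a few lines. This is an illustration of the idea rather than the DLS itself – the linkbase format and destination URIs are invented – but it captures the key point: a generic link lives in a separate linkbase keyed by phrase, so it applies to any document mentioning that phrase, with no anchor embedded in the document at all.

```python
# A generic linkbase: phrase -> destination, stored apart from documents.
# The 'doc://' destinations are hypothetical identifiers for this sketch.
linkbase = {
    "hypertext": "doc://glossary#hypertext",
    "linkbase": "doc://glossary#linkbase",
}

def resolve_generic_links(text, linkbase):
    """Return (phrase, destination) pairs for every linkbase phrase
    that occurs anywhere in the text - no embedded anchors needed."""
    lowered = text.lower()
    return [(phrase, dest) for phrase, dest in linkbase.items()
            if phrase in lowered]

doc = "Microcosm stored links in a separate linkbase, not in the hypertext itself."
print(resolve_generic_links(doc, linkbase))
```

Because the linkbase is consulted at view time, the same entry decorates every document that mentions the phrase, and removing the entry removes the link everywhere at once.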


Hyper-G

Hyper-G offered a number of solutions to the linking issues identified by others working in hypermedia systems development. In a similar manner to Microcosm, Hyper-G stored links in link databases. This allowed users to attach their own links to read-only documents; multiple links could be made to documents or to anchors within text or any other media object; users could readily see what objects were linked to; and links could be followed backwards, so users could see “what links to what”. Unlike Microcosm, the system used a probabilistic flood (‘P-Flood’) algorithm to manage updates to remote documents and linkbases, ensuring link integrity and consistency by, in essence, informing links when documents had been deleted or changed.

Like W3, Hyper-G was a client-server system with its own protocol (HG-CSP) and markup language (HTF). Hyper-G browsers integrated with Internet services W3, WAIS and Gopher, supported a range of objects (text, images, audio, video and 3D environments) and integrated authoring functionality with support for collaboration.

Hyper-G was a highly advanced system that successfully applied key hypermedia principles to managing data on the Internet. As web usability expert Jakob Nielsen asserted, it offered “some sorely needed structure for the Wild Web” [8].

Why W3 Won

Despite acknowledged limitations, W3 retained its position as the de facto means of traversing the Internet, and continued to grow and spread its influence. The reasons for this are relatively straightforward.

W3 was free and relatively easy to use; anyone with a computer, a modem and a phone line could set up their own servers, build web sites and start publishing on the Internet without having to pay fees or enter into contractual relationships.

Although limited in terms of hypermedia capability, these shortcomings were not serious enough to prevent users taking advantage of its data sharing and simple linking functions. Dangling links could be ignored, as search engines allowed users to find other resources, and improved browsers allowed users to keep track of their browsing history, and backtrack through visited pages.

In contrast, Microcosm and Hyper-G were developed, in their early stages at least, as local systems. This enabled them to employ superior technology to manage complex linking operations much more effectively than W3. However, this focus led to systems that were significantly more complex to manage than W3, and presented difficulties for scaling up to the wider Internet. In addition it was not clear which parts, if any, were free for use. Both systems promoted commercial versions early in their development which had the unintended effect of stifling adoption beyond an initial core group of users.

Future directions

W3 has developed into a sophisticated system that provides many of the functions of an open hypermedia system that were lacking in its early stages of development. Attempts to integrate hypermedia systems with W3 [3],[4],[9] and to find solutions to linking and data storage issues influenced the development of the open standard Extensible Markup Language (XML) and the XPath, XPointer and XLink syntaxes. While HTML describes documents and the links between them, XML carries descriptive, structured data that can add to or replace the content of web documents. XPath, XPointer and XLink describe addressable elements, arbitrary ranges, and connections between anchors within XML documents, respectively.
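XPath-style addressing can be demonstrated with Python’s standard library, which implements a subset of XPath 1.0 in `xml.etree.ElementTree`. The document and its `id` values below are invented for illustration; the point is that elements are addressed by structural path and predicate, not by anchors embedded in the content.

```python
import xml.etree.ElementTree as ET

# A toy XML document; element names and ids are illustrative only.
xml = """
<report>
  <section id="intro"><title>Introduction</title></section>
  <section id="links"><title>Linking</title></section>
</report>
"""
root = ET.fromstring(xml)

# Address every section title by path (XPath subset supported by ElementTree)
titles = [t.text for t in root.findall(".//section/title")]
print(titles)  # ['Introduction', 'Linking']

# Address one element by attribute predicate, as XPath does with [@id=...]
intro = root.find(".//section[@id='intro']/title")
print(intro.text)  # Introduction
```

XPointer builds on this kind of addressing to identify arbitrary ranges inside a document, and XLink uses such addresses as the ends of links, including multi-ended ones.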

XML may be combined with the Resource Description Framework (RDF) and the Web Ontology Language (OWL) to store descriptive data that produce web content in more useful ways than simple HTML allows. These standards make web content machine-readable, allowing applications to interrogate data and automate many web activities that previously could only be carried out by human readers. They are seen as precursors of the ‘Semantic Web’, a development of W3 that links data points with multi-directional relationships rather than uni-directional links to documents [10].
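At the heart of RDF is a very simple data model: everything is expressed as subject–predicate–object triples, which machines can then query by pattern. The sketch below uses plain Python rather than an RDF library, and the `ex:` names are made up for illustration, but it shows why triple-structured data is machine-interrogable in a way an HTML page is not.

```python
# Statements as (subject, predicate, object) - the RDF data model in
# miniature. The 'ex:' identifiers are hypothetical, for illustration.
triples = [
    ("ex:TimBL", "ex:invented", "ex:WorldWideWeb"),
    ("ex:WorldWideWeb", "ex:usesProtocol", "ex:HTTP"),
    ("ex:TimBL", "ex:worksAt", "ex:W3C"),
]

def objects(subject, predicate):
    """Interrogate the data: find all objects matching a pattern,
    much as a SPARQL query would against a real RDF store."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(objects("ex:TimBL", "ex:invented"))  # ['ex:WorldWideWeb']
```

Because every statement is a relationship between named data points, links can be traversed in either direction and combined across sources – the multi-directional linking the Semantic Web promises.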


[1] Keith Andrews, Frank Kappe, and Hermann Maurer. The Hyper-G Network Information System. Journal of Universal Computer Science (J.UCS), pages 206–220. Springer, 1996.

[2] Tim Berners-Lee. Information Management: A Proposal. CERN, 1989.

[3] Les A Carr, David C DeRoure, Wendy Hall, and Gary J Hill. The Distributed Link Service: A Tool for Publishers, Authors and Readers. 1995.

[4] Hugh Davis, Andy Lewis, and Antoine Rizk. Ohp: A Draft Proposal for a Standard Open Hypermedia Protocol (Levels 0 and 1: Revision 1.2-13th March. 1996). In 2nd Workshop on Open Hypermedia Systems, Washington, 1996.

[5] Andrew M Fountain, Wendy Hall, Ian Heath, and Hugh C Davis. Microcosm: An Open Model for Hypermedia with Dynamic Linking. In ECHT, pages 298–311, 1990.

[6] Frank Halasz, Mayer Schwartz, Kaj Grønbæk, and Randall H Trigg. The Dexter Hypertext Reference Model. Communications of the ACM, 37(2):30–39, 1994.

[7] Wendy Hall, Hugh Davis, and Gerard Hutchings. Rethinking Hypermedia: the Microcosm Approach, Volume 67. Kluwer Academic Publishers Dordrecht, 1996.

[8] Hermann Maurer. Hyperwave – The Next Generation Web Solution, Institute for Information Processing and Computer Supported Media, Graz University of Technology, [Online: Accessed 5 December 2013].

[9] Dave E Millard, Luc Moreau, Hugh C Davis, and Siegfried Reich. Fohm: A Fundamental Open Hypertext Model for Investigating Interoperability Between Hypertext Domains. In Proceedings of the Eleventh ACM on Hypertext and Hypermedia, pages 93–102. ACM, 2000.

[10] Nigel Shadbolt, Wendy Hall, and Tim Berners-Lee. The Semantic Web Revisited. Intelligent Systems, IEEE, 21(3):96–101, 2006.

MSc Web Science – Week 19

Reading/Sam Howzit © 2012/CC BY 2.0

This week’s readings:

COMP6047 – Further Web Science

COMP6048 – Interdisciplinary Thinking

  • Repko A. F. (2008) Interdisciplinary Research: Process and Theory. Sage Publications. Chapters 3 and 4.
  • Plus 6 readings for first group work.

COMP6050 – Semantic Web for Web Science

COMP6052 Social Networking Technology

RESM6003 – Qualitative Methods

5 interactions between the Web and Education that are changing the way we learn

Using MACs in the Computer Laboratory/University of Exeter ©2008/CC BY 2.0

The way we learn and the tools we use to extend our capacity for learning have always been closely interrelated. Over 2000 years ago wax tablets enabled learners to show their working, 500 years ago the introduction of movable type made books more accessible, 150 years ago the postal system provided the infrastructure for distance education, the introduction of radio and television services established the means for widespread educational initiatives, and personal computers and portable video making equipment were widely adopted by educators in the 1970s and 80s. Since the emergence of the Web 25 years ago, both learners and educators have exploited the potential of the underlying technologies and the services developed with them to support and change the way we think about learning in many fundamental ways.

1. Technology

The educational value of the Internet was recognised at its inception, and computing science academics working in universities and colleges keenly adopted the technology to share data among themselves and with their students. However, as the number of resources hosted on networked computers increased, they tended to become ‘siloed’ and difficult to find. The invention of the Web fundamentally changed this environment and the way people interacted with the Internet. The underlying protocols that govern the way the Web works are based on linking electronic documents over disparate networks using web browser applications. By making the protocols open to everyone at no cost, the Web’s founders allowed people to build upon the technology; for example, one of the earliest adaptations introduced the search function, which enables users to discover resources far more easily than earlier technologies allowed.

In the mid-1960s Gordon Moore observed an interesting fact about computing hardware – that the number of transistors that could be fitted on a chip appeared to double every two years. Once this filtered through to computing hardware manufacturers, and as the demand for personal computers increased, this became something of a self-fulfilling prophecy – one that has led to the development of ever more sophisticated, ever smaller, less expensive computing devices. From laptop computers to smartphones to tablets to Google Glass and Radio-frequency identification (RFID) devices, this phenomenon has placed powerful, mobile computing into the hands of more than 1.5 billion people worldwide, allowing learners and educators to access significantly more information than has been available to any previous generation.

2. The Evolving Web

The early Web gave learners and educators a taste of what could be achieved in this new environment. Learners could access information that had previously been ‘hidden’ in libraries and archives, and educators were able either to convert existing instructional programmes, quizzes and exams into Web-enabled resources or to develop new assets that guided learners through a set of learning objectives. But this essentially static, ‘read-only’ Web allowed little opportunity for learner interaction, collaboration and sharing, all vital components of the learning process. This began to change with the introduction of wikis in the mid-90s.

These web applications enable users to comment on or change text on a web page written by others, and provide a platform for group collaboration and sharing. In addition to inspiring the creation of the global knowledge bank that is Wikipedia, wikis encapsulated many of the features of a ‘read, write and execute’ web – what is commonly referred to as Web 2.0.

The ability to readily create a presence on the Web via blogs, social networking, and video sharing sites has created a dynamic resource that continues to make radical changes to our learning and teaching experience. Web 2.0 applications have been embraced by learners and educators at all levels. YouTube and other video sharing sites provide a platform for user-generated how-to videos, software advice, and exemplars of arts and science disciplines (e.g. The LXD: TED Talk, Periodic Videos and Khan Academy) that inform and inspire millions of informal learners as well as students in formal education. The social networking site Facebook is used by teachers to facilitate collaborative group work (e.g. in Music Technology at Bridgend College), and a large number of user-generated resource sharing sites (e.g. Flickr, SlideShare and Storify) and cloud computing services (e.g. Google Drive, WeVideo, and Pixlr) enable learners and educators to extend their tools and resources beyond the traditional classroom.

3. Theory

The network of collaborative and productive spaces enabled by Web 2.0 has inspired an invigoration of constructivist educational theory and its application to a range of online learning spaces. Learners and educators are able to communicate, provide feedback and collaborate in order to co-create the learning process using a variety of free-to-access synchronous and asynchronous technologies.

In constructivist theory learning takes place primarily through interaction between learners, and between learners and teachers. Teachers assess the suitability of technologies in various settings and judge what are called their affordances for learning: the essential features of a technology and what its interface allows learners to do. For example, the affordances of Facebook may be the opportunities it provides to support collaboration, a shared group identity and a shared understanding of knowledge. Once teachers are familiar with these environments, they can orchestrate learning in a manner that supports learners through the process (i.e. ‘scaffolding’).

The Web has also revived interest in ‘autonomous education’, highlighted by interest in the ‘Hole in the Wall’ experiments undertaken by Professor Sugata Mitra in the late 90s. These experiments involved observing children’s use of Web-connected computers placed in open spaces in rural settings in India, and demonstrated that children were able to learn how to use the devices, to find information, and to teach others how to use the computers without any instruction or guidance.

While supporting opportunities for self-learning, the Web also provides a platform for delivering timely instruction and feedback that can shape learning outcomes using operant conditioning methods. This approach to teaching is based on behaviourist theory which claims that learning can be reinforced through the use of rewards and punishments. In Web-based learning environments this is normally applied through the use of ‘gamification’ techniques such as the awarding of virtual badges for achievement or through the provision of a visual indication of learner progress (e.g. a ‘progress bar’).

4. Pedagogy

New technologies inspire new approaches to teaching, and the Web has made a huge impact in this area. Formal education has adopted new approaches including the use of Virtual Learning Environments (VLEs), e-Portfolios, and Massive Open Online Courses (MOOCs), which support new blended learning methods. Course materials, formative assessments, lecture recordings (including video, audio and synchronised slides), and assignment information and submission form the backbone of VLEs used in most educational institutions. In addition, many institutions encourage their students to develop their own e-Portfolios – a self-edited collection of coursework, blog posts and other educational activity that reflects the student’s progress, experience and knowledge gained during their time at a university or college. These are often integrated with (although kept separate from) the more formal VLE and the institution’s Careers Service, and used as an addition to a student’s Higher Education Achievement Record.

VLEs are primarily used to support ‘bricks and mortar’ education: they are not viewed as a replacement for class-based learning but are ‘blended’ with traditional methods. MOOCs, on the other hand, appear to be heralding a paradigm shift in the delivery of formal learning. This relatively new web-based form of distance learning emerged in 2008 and has its antecedents in Open Educational Resources initiatives. MOOCs typically provide opportunities for an unlimited number of learners to experience a short college or university level module (normally around 6 weeks in length), delivered using synchronous and asynchronous tutorials, web-based video, readings and quizzes. At the end of the course learners are required to produce some form of relevant output that demonstrates their achievement, which is then assessed by their course peers or course tutors.

5. Openness

The early decision to open Web technologies to all was inspired by research-sharing practices in academia, and as the Web has developed it has been used as a platform for sharing ideas, research and teaching. Open Access to research papers, which have traditionally been published in academic journals and made available at a high premium, has the potential to transform learning and research. Making academic research available to everyone via the Web provides opportunities for wider access to learning for the poor and those living in rural areas, and improves the uptake of research outputs.

Similarly Open Educational Resource initiatives are providing opportunities for teachers to share teaching materials, allowing others to reuse and repurpose content. Issues regarding ownership of content have been overcome in many instances through the use of Creative Commons licenses – a scheme that allows content owners to clearly show how they would like others to use their material.

The increasing ubiquity of Web technologies, combined with the culture of openness promoted by the Web’s founders and the increasing availability of low-cost Web-enabled devices, is transforming opportunities for learning and teaching, and changing the way education is perceived. Despite inequalities of access – the ‘digital divide’ – and variations in ‘web literacies’, the opportunities for accessing education are greater today than ever before, largely thanks to the Web.