Resource description may be explained as the way in which the characteristic properties of content items and the relationships between them and other resources and entities are described.
In discussing resource description it is useful to distinguish between a human readable, textual description of a resource, which may be presented in a structured or semi-structured format, and metadata, the formal, standardised, machine readable representation of the characteristics of a resource. R. John Robertson, in discussing resource description1, suggested that it was "just good practice" that scholarly resources should include within the text information such as the title, a short description, enough information to identify the author and their affiliation, and the date. No one would talk to academics about this type of resource description and call it metadata.
Metadata is commonly defined etymologically as "data about data". A more formal definition is provided by NISO2:
"structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource."
While NISO explain that this can refer to information held in card catalogues etc., in modern usage the expectation is that this metadata will be machine processable and expressed in standard formats such as MARC, Dublin Core or IEEE LOM, which are frequently serialised in XML or RDF.
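As a rough illustration of what "machine processable" means here, the following sketch builds a simple Dublin Core record in XML using Python's standard library. The element names and namespace come from the Dublin Core element set; the wrapper element `record`, the function name and all of the example values are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Return an XML string with one Dublin Core element per entry in `fields`."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(record, encoding="unicode")


# Made-up values, for illustration only.
example = build_dc_record({
    "title": "Introduction to Soil Science",
    "creator": "A. N. Author",
    "date": "2012-03-01",
    "rights": "CC BY",
})
print(example)
```

Because the element names are standardised, another system can read the record back without knowing anything about the system that produced it.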
Some newer approaches, such as microformats, RDFa, and microdata, bridge the gap between human-oriented resource description and machine readable metadata in HTML pages by inserting machine readable tags into semi-structured descriptive text.
The NISO definition of metadata quoted above includes the common reasons for wanting to describe a resource, that is to assist in resource retrieval, use and management. However, use of the term "retrieval" hides what for many is metadata's single most important function: to facilitate resource discovery. In practical OER contexts, resource discovery means facilitating search on Google, as discussed in the SEO and Discoverability chapter. Resource description may facilitate the use of OERs by providing the information necessary to select an appropriate resource. This may be information such as the suitability of the resource for the topic to be studied but may also include information about the timeliness of the resource, the technical format, and licence under which it may be used.
The role of metadata for resource management is more difficult to conceptualise, but includes the ability to repurpose and re-present collections of OERs. For example, by passing information about all or part of a collection held at an institution to a central service, it can be included in a national or subject-oriented collection. Resource management may also involve the analysis of activity and outcomes at a collection or programme level, such as the OER Data Analysis and Visualisation project3 undertaken by Martin Hawksey. Both these examples of resource management require a level of metadata standardisation and machine readability so that one computer system can understand the information provided by many others.
The Heart and Pulse of OER4
At the outset of the first phase of the OER Programme, CETIS was asked to provide projects with guidance on resource description. As discussed earlier, for previous JISC programmes CETIS had provided projects with a strong steer on specific standards and application profiles to use in a number of technical domains including resource description. An important component of this work was an effort to produce an application profile of the IEEE LOM, the UK LOM Core5, tailored for use in UK Further and Higher Education. However, the decision was taken not to mandate the use of formal metadata schemas for the UK OER Programme. This was partly a result of an increasing realisation that the UK LOM Core was not achieving the intended results, and partly a response to new approaches to resource description, such as folksonomies and informal tagging, and to the use of resource sharing platforms such as flickr, YouTube and SlideShare, which did not support formal metadata schemas.
Instead CETIS used the innovative pilot nature of the UK OER Programme as an opportunity to suggest that JISC explore a new approach to resource description6. Rather than mandating a formal application profile based on a single open standard, CETIS instead identified the type of information that projects were required to record for the resources they created, without mandating how this should be done. The hope was that this would give projects considerably greater flexibility as to how they described their resources and that this would ultimately result in richer descriptions of greater value to end users. The expectation was that projects would identify what they wanted to achieve and think through the resulting resource description issues that arose. It was hoped that by encouraging this methodology, collaborative approaches to interoperable resource description would be surfaced.
The metadata guidelines for phase one of the Programme mandated that the description of all resources should include information about the title of the resource; the author, owner or contributor; an indication of the date that the resource was created or published, whichever was significant; the URL at which the resource could be found; and technical information such as format, file size etc. There was also a mandatory programme tag, ukoer, which was to be used to identify resources produced through the programme. Other information such as a description, subject classification, keywords, tags, comments and the language of the resource were recommended as desirable but were not mandated. It was the intention that this minimal set of mandatory metadata would form the basis on which projects and related services could build resource descriptions that were adequate to meet the needs of their stakeholders. Some parts of the mandatory metadata set addressed programme-level requirements; for example, the ukoer tag was designed to facilitate the identification of all the resources produced by projects funded through the programme.
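The mandatory set described above is small enough to check mechanically. The sketch below is one possible way of doing so, not something the Programme itself provided: the field names (`title`, `author`, `date`, `url`, `format`), the dictionary layout and the function name are all assumptions chosen for illustration, with `author` standing in for author, owner or contributor and `format` standing in for the technical information.

```python
# Hypothetical field names standing in for the Phase One mandatory items.
MANDATORY_FIELDS = ("title", "author", "date", "url", "format")
PROGRAMME_TAG = "ukoer"


def missing_mandatory(record):
    """Return the mandatory items that are absent or empty in `record`."""
    missing = []
    for field in MANDATORY_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    # The programme tag is compared case-insensitively.
    tags = {tag.lower() for tag in record.get("tags", [])}
    if PROGRAMME_TAG not in tags:
        missing.append("tag:" + PROGRAMME_TAG)
    return missing


# Made-up record, for illustration only.
complete = {
    "title": "Introduction to Soil Science",
    "author": "A. N. Author",
    "date": "2012-03-01",
    "url": "http://example.org/oer/soil-science",
    "format": "application/pdf",
    "tags": ["UKOER", "soil"],
}
print(missing_mandatory(complete))
print(missing_mandatory({"title": "Untitled"}))
```

A check of this kind only confirms that the minimal set is present; it says nothing about whether the description is adequate for the project's stakeholders.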
The Phase One guidelines were modified only slightly for subsequent phases of the Programme, the main variation being that licensing information was added to the mandatory metadata and technical information removed from the recommended list.
The programmes surfaced a number of issues relating to resource description.
One of the factors in deciding not to mandate a specific metadata profile across the programme was an acknowledgement that it was necessary to rethink what metadata is really required for educational resource description.
It is not difficult to find recommendations for resource characteristics or relationships that it might be useful to record for certain, often speculative, use cases (see for example the Learning Material Application Profile Scoping Study7); however, it is less easy to find evidence about how often these use cases are actually important and whether the proposed metadata approaches actually work.
The hope that projects would invest time in analysing their resource description requirements was somewhat disappointed. The metadata requirements listed above, which were intended as a minimal starting point, seemed to be interpreted by some projects as the sum total of what was required. Indicative of this failure to provide a suitable level of resource description was that in the Phase One Pilot Programme information about the licence under which a resource was released was frequently not provided. Jorum, the national JISC funded repository, followed the CETIS lead in specifying a minimal metadata set (which included some additional information such as licence and subject classification not in the basic set), but the repository did not reveal any additional metadata provided by depositors through the interface. As a result, potentially useful information was not visible to users. In another case, metadata from an OER source contained a description that was only seven letters long. It is difficult to imagine for what purpose such a description could be adequate. The best that can be said about such approaches is that at least they did not result in anyone wasting time that could have been better spent on the release of OERs.
In order to stimulate discussion about what metadata was necessary, CETIS organised a meeting8 at which various approaches to gathering data that might inform an answer were discussed. Three approaches were identified as promising: questionnaires to ascertain from users what they were interested in and how they approached resource discovery; analysis of search logs to find what characteristics were being searched for; and semantic analysis of free-text descriptions to find out what was being described. Some preliminary results were reported at the meeting; however, this remains an under-explored area.
The use of the UKOER tag to identify OERs created through the funded programme did not work quite as anticipated: the tag proved to be very popular among projects and was widely used for tagging anything related to the programme, including tweets, discussions, blogposts, images, and other project outputs. As a result, a Google search for UKOER finds more information about the programme than resources created by the projects. It does, however, have some utility in identifying resources from the programme that were shared through social web sites such as YouTube, flickr and SlideShare. Several projects became aware of this issue and used a project-specific tag to identify their OERs. This also makes it possible to identify which resources came from which projects once they have been aggregated into larger collections.
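The value of a project-specific tag alongside the programme tag can be sketched as follows. This is a minimal illustration under stated assumptions: the item layout, the function name and the project tags `projecta` and `projectb` are invented, and real aggregations would draw their items from a platform's own feeds or APIs.

```python
from collections import defaultdict

PROGRAMME_TAG = "ukoer"


def group_by_project_tag(items, project_tags):
    """Group the titles of programme-tagged items by project-specific tag."""
    groups = defaultdict(list)
    for item in items:
        tags = {tag.lower() for tag in item["tags"]}
        if PROGRAMME_TAG not in tags:
            continue  # not tagged as coming from the programme
        for tag in sorted(tags & project_tags):
            groups[tag].append(item["title"])
    return dict(groups)


# Made-up aggregated items and hypothetical project tags.
items = [
    {"title": "Soil profiles video", "tags": ["ukoer", "projecta"]},
    {"title": "Programme launch blog post", "tags": ["UKOER"]},
    {"title": "Erosion quiz", "tags": ["ukoer", "ProjectB"]},
    {"title": "Holiday photo", "tags": ["beach"]},
]
print(group_by_project_tag(items, {"projecta", "projectb"}))
```

Items that carry only the programme tag, such as the blog post in this example, drop out of the grouping, which is the effect the projects were looking for.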
It was acknowledged from the outset that the freer approach to resource description was likely to have some impact on interoperability. One aspect of this was not mandating classification schemes or controlled vocabularies for the specific resource characteristics being described. Thus no single subject classification was imposed centrally by the programme. This allowed individual projects to choose classification schemes that met their own needs. For example, institutional projects could classify according to their departments or the programmes they delivered, subject-based projects could use a classification scheme that was specific to their discipline, and projects that disseminated through specific channels such as iTunesU or Jorum could use the classification scheme used by that platform. Jorum uses JACS9 to provide a top-level classification scheme that mirrors the subjects taught in UK Higher Education, a practice that was followed by many projects.
However, some projects wanted a more restrictive approach. This may have been because they foresaw interoperability problems without mandated controlled vocabularies; in more than one case this opinion was expressed by a project team member from a library background. Alternatively, it may have been that projects preferred to use recommended vocabularies rather than having to select classification schemes themselves.
One area where standardisation of approaches has proved to be particularly difficult is in describing the educationally significant properties of a resource. As noted in the Learning Material Application Profile Scoping Study10:
"metadata for education was one of the domains where the issues were least well articulated and where solutions were least well developed."
In other words, while it is often stated that it would be useful to describe features such as the "educational level" or the "interactivity" of a resource, there remains a gap between this desire and defining exactly what is meant and how these characteristics should be described. If it is the case that these concepts are useful in assisting users in resource selection, but are too nebulous to be used by machines for filtering, then a realistic approach is to include them in the human readable resource description, that is, in free text descriptions or free text keyword fields, without attempting to encode them as machine readable metadata.
One of the effects of the previous focus on formal metadata standards seems to be that resource description is seen as a technical issue to be dealt with by experts in information and interoperability standards, distinct from the resource creation process, and hidden away from those creating the resources. A recurring symptom of this was that resource descriptions tended to be created in the platform that was used to disseminate the resource, e.g. the repository, rather than being contained within the resource itself. The fact that programmes funded the release of existing material rather than the creation of new material may have accentuated this focus on dissemination systems, rather than on the resources themselves. The problem, of course, is that once the resource is downloaded by a user, or if the user is sent directly to the resource from an external link bypassing the description page, they will miss important information such as the licence and provenance of the resource. This issue is also explored in the Licensing and Attribution chapter, but in short, this is a recipe for creating orphan resources and uncertainty among users, which may inhibit the reuse of the resource. In addition, divorcing resources from their metadata may deny the content creator or publisher the potential reputational benefit arising from having their resource reused and their authorship acknowledged.
Sensible approaches to ensure that the description of the resource stays associated with the resource include creating a template of resource description elements as a header or footer running throughout the resource; cover pages providing a short summary of the resource; and credits at the end of a resource such as an audio or video recording, or on the final slide in a stack. Images present a particular problem as they are typically non-textual and displayed as a single frame. One solution is to add the necessary information, in the form of text, as inconspicuously as possible at the edge of the image. A particularly interesting and useful application of this approach is the Xpert Media Search attribution tool11, which will search flickr for Creative Commons licensed images and automatically create a copy of the image to which licence and attribution information has been added. This tool is being developed further with funding from the JISC OER Rapid Innovation Programme.
Another approach is to embed machine readable metadata into the resource, for example as the properties of a Word or PowerPoint document, exif12 metadata in images or id313 tags in audio files. The extent to which this metadata is made visible to human users varies between applications, as does the reliability of the metadata found in the wild; arguably the two are correlated. At one end of the spectrum, the metadata found in office documents, whether proprietary or open source, is rarely displayed when viewing the documents, and has been found to be unreliable. For example, the author of the template a document is based on often appears in the "document properties" as the author of the document, and is often left uncorrected. At the other end, some metadata in images and recordings which is created automatically (e.g. time and geolocation information from cameras) or imported from trusted sources (e.g. metadata in music recordings) is usefully displayed by systems that disseminate or display/play those resources.
There are limitations, however. For example, while some social media sites will import embedded metadata, such as geolocation tags, and display this information on the page along with the resource, frequently it is not possible to amend this embedded metadata. A user may change the location of the image on the display page, but this will not change the geolocation information embedded in the image.
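To make the point about document properties concrete, the sketch below reads the title and creator from the core properties part of an Office Open XML file (.docx or .pptx), which is a ZIP package that conventionally stores this information in `docProps/core.xml` using Dublin Core element names. The function name is an assumption, and the demonstration builds a tiny in-memory package containing only that one part, which is not a complete Word document; the property values are made up.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {"dc": "http://purl.org/dc/elements/1.1/"}


def read_core_properties(package_file):
    """Return the embedded title and creator of an Office Open XML package."""
    with zipfile.ZipFile(package_file) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    return {
        "title": root.findtext("dc:title", default="", namespaces=NS),
        "creator": root.findtext("dc:creator", default="", namespaces=NS),
    }


# Minimal stand-in for the core properties part, with made-up values.
CORE_XML = (
    '<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/'
    'package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Week 3 lecture notes</dc:title>"
    "<dc:creator>Template Author</dc:creator>"
    "</cp:coreProperties>"
)

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("docProps/core.xml", CORE_XML)
buffer.seek(0)

properties = read_core_properties(buffer)
print(properties)
```

A creator value such as "Template Author" is exactly the kind of uncorrected property described above, so any value read this way is best treated as a hint to be checked rather than as reliable authorship information.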
Some future directions relevant to resource description have emerged from the UK OER Programmes that are worth highlighting.
The description of audio visual recordings at a highly granular level has always been problematic. "Shot lists", which provide a shot-by-shot description of the content of a recording and where it can be found, can be extremely useful but are very time consuming and expensive to produce. An interesting approach to providing this information is being explored by OER Rapid Innovation projects such as Spindle14, which aims to increase OER discoverability through improved keyword metadata generated by automatic speech to text transcription, and Synote15, which supports the crowdsourcing of notes, bookmarks, tags, images and text captions that are synchronised with audio visual recordings. Synote also has the ability to publish linked data.
The use of microdata within HTML documents provides a means of combining human oriented resource description with machine readable metadata. Of particular significance is schema.org16, an initiative involving Google, Yahoo, Yandex and MS Bing that aims to
"… improve the web by creating a structured data markup schema supported by major search engines. On-page markup helps search engines understand the information on web pages and provide richer search results."17
There are two aspects to schema.org: a syntax for encoding the markup that is a subset of microdata or RDFa Lite, and a shared ontology of item types and their properties. The Learning Resource Metadata Initiative is working to extend the schema ontology so that selected educationally significant characteristics may be marked up. These developments came late on in the Programme and are only at an early stage of implementation.
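The way microdata combines the two kinds of description can be sketched with a short example. The HTML fragment below is made up, using the schema.org CreativeWork type and its name, author and datePublished properties; a person reads it as an ordinary description, while a program can pick out the marked-up values. The extractor class is a deliberately simplified assumption written for this illustration, not a complete microdata parser: it ignores nested items, void elements, and values carried in attributes such as href or content.

```python
from html.parser import HTMLParser

# Made-up page fragment: human readable text with microdata attributes.
SAMPLE_HTML = """
<div itemscope itemtype="http://schema.org/CreativeWork">
  <h1 itemprop="name">Introduction to Soil Science</h1>
  <p>By <span itemprop="author">A. N. Author</span>,
     published <span itemprop="datePublished">2012-03-01</span>.</p>
</div>
"""


class ItempropExtractor(HTMLParser):
    """Collect the text content of elements that carry an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # name of the property currently being read
        self._depth = 0       # nesting depth inside that property's element

    def handle_starttag(self, tag, attrs):
        if self._current is not None:
            self._depth += 1
            return
        name = dict(attrs).get("itemprop")
        if name:
            self._current = name
            self._depth = 0
            self.properties[name] = ""

    def handle_endtag(self, tag):
        if self._current is None:
            return
        if self._depth == 0:
            # Closing tag of the property's own element: finish the value.
            self.properties[self._current] = self.properties[self._current].strip()
            self._current = None
        else:
            self._depth -= 1

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data


extractor = ItempropExtractor()
extractor.feed(SAMPLE_HTML)
print(extractor.properties)
```

The same text serves both audiences, so the description cannot drift out of step with a separate metadata record.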
It has long been acknowledged that publisher-created resource descriptions and formal metadata records are not the only useful sources of information about a resource. Often more useful, contextually sensitive and extensive information can be created by users, both incidentally as they use the resource, and through the conscious actions of reviewing, tagging, discussing and recommending resources. The new approaches to gathering and using this information encapsulated in the paradata18 approach may offer solutions to some of the more intractable issues around the description of the educational characteristics of a resource. For example, rather than trying to identify the educational level of a resource, the paradata approach would be to record the courses a resource has been used in, so that it can be recommended to teachers and learners engaged with similar courses. This approach is discussed in more detail in the chapter on Paradata.
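The course-based example can be sketched in a few lines. This is only one possible reading of "similar courses", stated here as an assumption: two courses are treated as similar if they have at least one resource in common. The event layout, the function name, and the course and resource names are all invented for illustration.

```python
from collections import defaultdict


def recommend_for(course, usage_events):
    """Suggest resources used in courses that share a resource with `course`."""
    by_course = defaultdict(set)
    for resource, used_in in usage_events:
        by_course[used_in].add(resource)

    already_used = by_course.get(course, set())
    suggestions = set()
    for other, resources in by_course.items():
        # Assumption: courses are "similar" if they share at least one resource.
        if other != course and resources & already_used:
            suggestions |= resources - already_used
    return sorted(suggestions)


# Made-up usage records of the form (resource, course it was used in).
events = [
    ("soil-video", "Geography 101"),
    ("soil-video", "Environmental Science A"),
    ("erosion-quiz", "Environmental Science A"),
    ("poetry-notes", "English 1"),
]
print(recommend_for("Geography 101", events))
```

No one has had to decide what "educational level" the quiz belongs to; the recommendation follows from recorded use alone.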
Joint Academic Coding System (JACS), http://www.hesa.ac.uk/index.php?option=com_content&task=view&id=158&Itemid=233
Barker, P., (2008), Learning Material Application Profile Scoping Study – final report, http://www.icbl.hw.ac.uk/lmap/lmapscopingreport.pdf