CWRC Ontology Preamble


Abstract

The Ontology of the Canadian Writing Research Collaboratory (cwrc.ca) brings together various linked data materials produced within the Collaboratory related to writers, writing, and culture.

1. Introduction

Although it contains quite general components for activities such as annotation and citation, the focus of the CWRC ontology is on describing and relating aspects of literary studies and literary history, with a strong emphasis on gender and intersectional analysis indebted to its roots in The Orlando Project, a history of women’s writing in the British Isles. It links to a number of standards while attempting to indicate the complexity of the relationship between representation and provenance in the production of linked data, and to convey the situatedness (Haraway, 1988) of the knowledge that it represents.

Some of the materials associated with this ontology are produced natively by activities conducted within the Collaboratory. Others are produced through a process of translation from embedded XML markup. In other words, some are the product of human creation or curation, and others are generated by machine.

2. About this Document

This document is a human-readable companion to the ontology; it cannot document all of the ontology's data structures. The ontology itself should be the primary source for understanding how the ontology works.

The intended audience of this document is the scholar who wishes to understand how the ontology tackles concrete data recording problems and the linked open data practitioner who intends to make use of this ontology.

3. Status of this dynamic ontology

This document and the associated ontology will grow iteratively with modifications made over time as data is progressively translated and further ontological concerns are identified. Continuity is ensured using the OWL ontology annotations for ontological compatibility and for deprecated classes and properties. Deprecated ontology terms remain present but are marked as such.

The ontology is understood to be a living document that makes no claims to completeness. Instances in particular have been derived from particular datasets and will be expanded progressively over time.

We welcome suggestions for new classes, properties, and predicates from those wishing to use the ontology for their own datasets, as well as suggestions related to the complexity of vocabulary associated with existing terms. Please submit suggestions via an issue or a pull request to the CWRC Ontology code repository.

4. Background on the Orlando source data

The Orlando Project embarked in 1995 on a history of women’s writing in the British Isles from the beginnings to the present (Brown, Clements and Grundy, 2007a; Brown, Clements and Grundy, 2007b).

This born-digital collaboration devised a knowledge representation (Brown, Clements et al., 2006) in the form of a bespoke SGML tagset to encode the project’s intellectual priorities and concepts in the text as it was being written. This tagset structures the biocritical, chronological, and bibliographical content of the resulting history of more than 8 million words and 2 million tags. The schema provides the basis of the Canadian Writing Research Collaboratory’s schema for similar content, and provides the foundation of the ontology provided here. Some of the source data is produced via extraction from XML tags embedded in Orlando Project materials and in similarly structured content within the Collaboratory (Simpson and Brown, 2013).

Orlando: Women’s Writing in the British Isles from the Beginnings to the Present (Brown, Clements et al., 2006) is published by Cambridge University Press.

The scholarly introduction and introduction to the Orlando tagset are available here: Introduction to Orlando Tagset.

Contributors to Orlando are listed here: Orlando Project Contributors.

The Orlando Project’s XML schemas and the CWRC Project’s XML schema are available on GitHub.

5. Basic ontological goals

a. Principles

The schema covers entities, classes, and relationships associated with the domains of literature and literary and cultural history as understood from an intersectional feminist perspective. The ontology design responds to the challenges of shifting from semi-structured to structured data (Smith, 2013). Although linked data triples stand on their own formally, many are derived from discursive prose and are best read in an environment that links back to their original context. The CWRC ontology design avoids representing RDF extractions from Orlando data as positivist assertions, and yet produces machine-readable OWL/RDF-compliant graph structures. It allows references to, without endorsing, external ontological vocabularies that are nevertheless part of documenting cultural processes and identities.

b. Competency Questions

Competency questions are meant to provide a sense of the scope of an ontology. They can serve a number of purposes, including giving users a sense of what kind of information they might find in datasets that employ the ontology, and giving the ontology developers criteria against which to measure the success of the ontology. The CWRC ontology represents a wide range of information about writers’ lives, literary careers, and literary works. Moreover, as with other humanities data, this information may be put to a wide range of uses, many of which will not be foreseeable. For instance, the nineteenth-century novels of Susanna Moodie have been searched for evidence of specific weather events by researchers into climate change. This list will therefore not be exhaustive, but it should give some sense of the range of questions we would expect the ontology to be able to address. The fact that the datasets represented by this ontology cannot be comprehensive also needs to be stressed: the unevenness of the archival and published record plus the necessarily selective and variously prioritized ways in which the information has been collected and recorded mean that any sense of statistical significance or representativeness related to the kind of data for which this ontology is designed must be highly qualified and contextualized.

Biography-based questions

  1. What people are known to have attended school in a certain city over a particular period of time?
  2. What British authors attended the same schools as each other?
  3. What writers were taught by or schooled alongside another woman writer?
  4. Who is recorded to have died from a particular cause of death in a particular time period?
  5. What family members is a particular person recorded to have had, and how were they related?
  6. Which queer/lesbian identified authors are recorded as having attended a single-sex institution?

Cultural Formation

  1. What people were identified with a particular race, colour, or nationality?
  2. What women during the Victorian period were associated with multiple nationalities?
  3. What writers had some form of Jewishness in common?
  4. What British writers were associated with both Protestantism and Catholicism in the nineteenth century?
  5. What literary texts engage with a particular religion or denomination?
  6. What is the breakdown by the different genders represented in this dataset of novels published during a particular period?
  7. Which writers are associated with a particular political affiliation?
  8. Are writers more likely to be associated with gender-related causes at particular points in history?

Human Relationships

  1. Does a connection exist between two particular people?
  2. How close is the connection? Is it asserted frequently in the data, as opposed to occurring only once or twice?
  3. What types of connections to other people does a particular person have?
  4. What family ties does this person have?
  5. If two people are not connected, what is the shortest path between them via relationships with other people, or with other entities such as organizations or texts?
  6. What connections exist between a set of people during a specified period in time?
  7. How many people cite a particular author as influential in their own work?
  8. How many writers are related to a particular organization? More specifically, which feminist organizations were supported by two or more generations of writers from a single family?
  9. Who are all the people noted here with whom this author collaborated professionally (editor to writer relationships; author to author; editor to editor, etc.)?
  10. Who had a relative involved in professional publishing spheres?
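Question 5 in the list above (the shortest path between two people via other people, organizations, or texts) can be illustrated with a breadth-first search. The following is a minimal sketch only: it models triples as plain Python tuples rather than querying an RDF store, and every identifier and predicate in it (the `ex:` names, `shortest_path`, `build_adjacency`) is a hypothetical placeholder, not a term from the CWRC ontology or its datasets.

```python
from collections import deque

# Hypothetical triples; identifiers and predicates are placeholders only.
TRIPLES = [
    ("ex:personA", "ex:collaboratedWith", "ex:personB"),
    ("ex:personB", "ex:memberOf", "ex:organizationX"),
    ("ex:personC", "ex:memberOf", "ex:organizationX"),
    ("ex:personD", "ex:reviewed", "ex:textY"),
]


def build_adjacency(triples):
    """Treat every triple as an undirected edge between subject and object."""
    adjacency = {}
    for subject, _predicate, obj in triples:
        adjacency.setdefault(subject, set()).add(obj)
        adjacency.setdefault(obj, set()).add(subject)
    return adjacency


def shortest_path(triples, start, goal):
    """Return the shortest list of nodes from start to goal, or None."""
    adjacency = build_adjacency(triples)
    if start not in adjacency or goal not in adjacency:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sorting keeps the traversal order deterministic.
        for neighbour in sorted(adjacency[node]):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None
```

In this sketch, two people who share no direct relationship are still connected when an intermediate person or organization links them; a result of `None` means the data records no connection at all, which is not the same as no connection having existed.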

Clustering/Networking People

  1. What authors are most interconnected with other authors in terms of their influence?
  2. Can we identify clusters of writers who seem to be operating as a community in terms of having a tight network of friendships, literary relationships, use of the same publishers, reviewing each other’s works, etc.?
  3. Can we identify individuals who were key connections between different groups?
  4. Whose work was influenced by British and/or international writers of colour?
  5. Who was involved in both feminist groups and animal welfare activism?
  6. Who was in touch with non-literary artistic groups?

Texts/Works based questions

  1. What books were important to this author’s education?
  2. What reviews exist for a particular book?
  3. In what languages has a particular work been published?
  4. Is there any acknowledged intertextual relation between X and Y?
  5. In which journals does a particular author’s work appear?
  6. How many intertextual relationships does an author have to female-authored literary works?
  7. Find all the responses to this book that are deemed to be gendered.
  8. Which works are represented as the most translated?
  9. Find particular themes and topics in texts, such as which works of the imagination contain depictions of women’s colleges? Which depict political organizations?
  10. Which authors wrote for the same journal and in the same time period?
  11. Which fictional works allude to a particular type of activism?
  12. Are there references to fictional works in this author’s non-fictional work?
  13. Which European fictional texts are set outside of Europe?
  14. Who destroyed her own works? Whose works were destroyed by others?
  15. Which works seem to have been influenced by certain theorists or philosophers?

Geography based questions

  1. Which texts were or were not published in a particular country?
  2. Which texts were or were not reviewed in a particular country?
  3. In which cities or nations did a particular author reside?
  4. Which cities or nations are depicted or discussed in an author’s work?
  5. In what locations has a particular play been performed over time?
  6. Which works were written during travels?
  7. Which texts were published or otherwise shared in countries outside of Europe? Which texts were reviewed in countries outside of Europe?

Time- (and Event-) Related Questions

  1. What are the most discussed texts of a particular temporal period within this dataset?
  2. Trace the impact of a particular text through time and space.
  3. What is the relative rise or fall of a writer’s reputation over time, in relation to other writers in the period?
  4. What events in this person’s life were related to aspects of social identity such as religion, social class, or political affiliation?
  5. What changes over time are recorded in the frequency of the kinds of relationships that the data describes, across numerous writers? E.g. Does this dataset record greater degrees of intertextuality with male writers or female writers, relatively speaking, at different points in time?
  6. What major social or historical events and developments are reflected in the literary record?
  7. Can we target exploration of the data at particular temporal periods, such as the Victorian period?
  8. Which authors are likely to have known each other, due to overlapping chronologies, locations, and other connections in common?

Complex questions

In many cases the ontology will play a part in investigating a more complex question or as a component of a larger hermeneutical process. For instance:

  1. Let me compare the publication patterns of writers, distinguishing by gender and by the number of children that they had. Looking at it over time, does their rate of literary productivity increase or decrease in relation to the number of children they have?
  2. Show all the elements of both her self-taught and her formal education (books, subjects, instructors) that are also alluded to in a writer’s works.
  3. Trace the impact of developments in writing, such as the emergence of a particular theme or formal feature, to a larger social development.
  4. Test claims about the rise of genres or literary movements and see how they look when inflected by a dataset focused on women’s writing.

c. Anticipated tools and functionalities

Also relevant to the structure of the ontology are the kinds of tools and functionalities that it aims to support. These are:

  1. Searches through SPARQL queries;
  2. Browsing, including faceting according to various criteria based on the ontology, including temporal periods, geographical locations, or the properties of writers;
  3. Linking to our instance data by way of their URIs;
  4. Discovery of significant information about instances through dereferenceable web pages;
  5. Discovery of materials across the web that reference instances or other components of the ontology;
  6. Graph Visualization of the structure of the ontology, including the properties and relationships it contains;
  7. Network Visualization of the relationships between people and other people, and influence and relationship graphs showing connections between people and other entities such as books, indicating the directionality of relationships where appropriate;
  8. Mapping of components of the data associated with geospatial information;
  9. Timelines of components of the data associated with temporal information;
  10. Use of SHACL rules and other logical inferencing tools to check for data errors, omissions, and inconsistencies;
  11. Use of SHACL rules and other inferencing tools to derive new information from the combination of existing data and the ontologies;
  12. Expose the unevenness of datasets by enabling the tracking of sources, provenance, and degrees of certainty in order to provide insight into gaps in the knowledge base;
  13. Expose conflicts, contradictions, and outliers within datasets as a basis for inquiry.

d. Linkages to external ontologies

We employ a number of strategies for linking to other ontologies. Our architecture does not typically import other ontologies wholesale, but relates to large vocabularies in defined ways. We try not to abuse sameAs predicates (Halpin, Hayes et al., 2010).

We adopt external namespaces and associated classes and terms wherever possible when they are in widespread use and their vocabularies are broadly compatible with ours, as in the case of the FOAF and BIBO vocabularies. For some terms, such as those for religious denominations or genres, we are happy to draw on other vocabularies’ terms and definitions in part or in whole, as in the case of terms from the Getty Art and Architecture Thesaurus (Getty Research Institute). Other terms are referenced, but usually at a distance rather than through wholesale import. This is particularly common in relation to cultural forms, which, as explained more fully below, are understood primarily as representational and linked, where multiple related terms exist within the ontology, to terms typed as textual labels. By means of this structure, our vocabulary positions all terms associated with processes of Cultural Form as discursive labels, retaining the ambiguity of terms implicated in the complex social construction of identities within a narrative. Cultural forms may in turn be related to external ontologies in a number of ways. If an external ontology term aligns semantically with ours, then we use OWL- or SKOS-based relationships such as <owl:equivalentClass>, <skos:narrower> or <skos:broader>. If an external term's definition or use is not commensurate with a term in the CWRC ontology but its application in external datasets is such that it will be useful nevertheless to link those terms to ours (for instance for broadening searches using the problematic ISO5218 Codes for the representation of human sexes), then the has functional relation predicate is employed to indicate that the relationship is not specified semantically but may be leveraged for processing.
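The decision described in the paragraph above, between a semantic alignment and a purely functional link, can be sketched as a small function. This is an illustration under stated assumptions rather than part of the ontology: `choose_link_predicate` and its parameters are hypothetical, and `cwrc:hasFunctionalRelation` is a made-up stand-in for the has functional relation predicate, whose actual URI is not given in this document.

```python
# Hypothetical stand-in for the "has functional relation" predicate;
# the real URI is not given in this document.
FUNCTIONAL_RELATION = "cwrc:hasFunctionalRelation"

# Predicates named in the text for semantically aligned terms.
ALIGNED_PREDICATES = {
    "equivalent": "owl:equivalentClass",
    "narrower": "skos:narrower",
    "broader": "skos:broader",
}


def choose_link_predicate(alignment=None, useful_for_processing=False):
    """Pick a linking predicate for an external term.

    alignment is "equivalent", "narrower", or "broader" when the external
    term aligns semantically with a CWRC term, and None when its definition
    or use is not commensurate with ours.
    """
    if alignment is not None:
        # An unknown alignment raises KeyError rather than guessing.
        return ALIGNED_PREDICATES[alignment]
    if useful_for_processing:
        return FUNCTIONAL_RELATION
    return None  # no link is asserted
```

Keeping the functional case separate from the aligned cases mirrors the point of the paragraph: a link that helps broaden a search should not be readable as an endorsement of the external definition.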

At the top level, the CWRC ontology makes use of the following well known ontologies:

  1. The FOAF ontology for the representation of people and organizations.
  2. The BIBO ontology for the representation of bibliographic data.
  3. The TIME ontology for the representation of events and points in time where ISO8601/XML Schema times are not appropriate.
  4. The NIF-CORE ontology is used to contain and manipulate the text of the original Orlando entries.
  5. The Web Open Annotation data model is used to link the original Orlando text to specific Contexts.
  6. The SKOS vocabulary is used to represent taxonomical relationships within certain Cultural Forms and to fully document ontology terms.
  7. Some Dublin Core vocabulary terms are used for well known documentation tags such as <dc:title>.
  8. The W3C Provenance ontology is used to indicate indebtedness, derivation or provenance of term descriptions as well as Cultural Context source annotations.
  9. Linkages are made to the CIDOC-CRM ontology for cultural instances that are in common with CWRC.

Established ontologies and vocabularies are used in the definition of numerous classes and instances. For instance, the religious terms of the Getty Art and Architecture Thesaurus provide suitable definitions for many religions, as does DBpedia for many terms throughout the ontology. Sometimes definitions draw on scholarly print and online sources. Quotation marks around the text of the description indicate wholesale adoption of the source definition. Where the description is not surrounded by quotation marks, the term has been defined by the CWRC team, but links may be provided to external resources such as a scholarly article or closely related DBpedia entry.

In other cases, terms from external ontologies are adopted in CWRC datasets without having been imported into our ontology. What follows is a non-exhaustive list of such vocabularies and the classes for which they are most frequently used:

  1. Geonames terms are often used for locations and for many instances of geographic heritage.
  2. Library of Congress Languages codes are typically used for instances of language.

e. Provenance and contexts

As noted above, some data associated with this ontology has been generated from XML structures (Simpson and Brown, 2013). Provenance is thus particularly important, given that such data was not produced natively in RDF but rather in the form of tags embedded in a discursive context. In such cases, the relevant portions of the text are provided in the form of snippets, which within the dataset become instances of contextual notes or human-readable annotations to which the data set nodes are directly tied.

The wholesale import of entire vocabularies within the CWRC ontology was likely to cause logical and ontological problems. For this reason, we opted not to use the <owl:imports> construct and instead either to link to vocabularies externally or to clone specific sets of terms from selected vocabularies. Similarly, not all vocabularies are well-defined from an ontological standpoint, but drawing from their narrative or some of their properties proved useful. We therefore avoided the use of <owl:sameAs> so as not to bring unintended properties or ontological structures into the CWRC ontology. In other cases, the Provenance ontology property <prov:wasDerivedFrom> is used to indicate that the term was constructed using information from other terms without necessarily being equivalent. Direct linkages to other ontologies are usually made through the use of subclasses or <owl:equivalentClass>.

f. Labels

For labelling, CWRC utilizes two schemes to promote searchability. rdfs:label represents the human-readable nomenclature for a concept, instance, or predicate. This is the terminology used when representing components of the ontology in documentation and diagrams, except where a URI is provided.

As noted above in relation to cultural form, however, when textual label is used to type a class, this is an indication of the representationality or discursivity of that class. cwrc:TextLabels are frequently used for ambivalent, overlapping, and culturally contested terms.

In addition, to support those with knowledge of prior datasets whose strings or terms have been linked for extraction purposes to CWRC terms, the ontology provides additional linguistic context for CWRC ontology terms. Alternative labels, signified by skos:altLabel, indicate terms from source datasets that have been employed to create relationships to this concept. Alt labels typically cannot serve as replacements for rdfs:label. Within the ontology, such alternative labels primarily exist for search and retrieval by allowing ontology terms to be located under a larger number of labels. Although some reflect the idiosyncrasies of the source data, they may be useful for broadening searches.
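The division of labour between rdfs:label and skos:altLabel described above can be sketched as a lookup index: alternative labels widen what a search can find, while only the primary label is used for display. The sketch below is a hedged illustration in plain Python; the term URIs, the label strings, and the function names (`build_label_index`, `find_terms`, `display_label`) are all hypothetical and do not come from the ontology.

```python
# Hypothetical term records; the URIs and label strings are placeholders.
TERMS = {
    "ex:termOne": {
        "label": "example concept",                        # rdfs:label
        "alt_labels": ["EXAMPLE-CONCEPT", "ex. concept"],  # skos:altLabel
    },
    "ex:termTwo": {
        "label": "another concept",
        "alt_labels": ["ex. concept"],  # source strings may overlap across terms
    },
}


def build_label_index(terms):
    """Map each case-folded label or alternative label to the terms using it."""
    index = {}
    for uri, record in terms.items():
        for text in [record["label"], *record["alt_labels"]]:
            index.setdefault(text.casefold(), set()).add(uri)
    return index


def find_terms(index, query):
    """Return the URIs whose label or alternative label matches the query."""
    return sorted(index.get(query.casefold(), set()))


def display_label(terms, uri):
    """Only the primary label is used for display; alt labels never replace it."""
    return terms[uri]["label"]


LABEL_INDEX = build_label_index(TERMS)
```

Because an alternative label inherited from source data can point to more than one term, a lookup returns a list of candidates rather than a single answer.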

g. Cultural diversity

Cultural diversity has been an increasing source of debate beyond and within the digital humanities community. The concentration within the Debates in Digital Humanities series (Gold, 2012; Gold and Klein, 2016) of pieces reflecting the increasing prominence of matters related to race, gender, cultural diversity, and difference is but one marker of the extent to which diversity matters. This ontology seeks to convey an intersectional understanding of identity categories, as instantiated in The Orlando Project’s XML Biography schema.

The Cultural Form portion of the ontology recognizes categorization as endemic to social experience, while incorporating variation in terminology and the contextualization of identity categories. It understands social classification as culturally produced, intersecting, and discursively embedded. We invoke categories as the grounds for cultural investigation rather than fixed classifications, since such categories have never been stable or mutually exclusive (Algee-Hewitt, Porter, and Walser, 2016). For a more detailed explication of cultural formation, see Brown et al. (2017).

6. CWRC ontological structures

Source data from CWRC spans multiple types of data including annotations on source texts, meta-data, granular material such as bibliography, and discursive and analytical content about specific life events and literary phenomena. The CWRC linked open data set represents such information as series of assertions, frequently associated with particular contexts.

While full, integrated traceability has always been a core need of repeatable experiments, this comes at a complexity cost within a linked open data set in that the queries required to retrieve basic information become unwieldy. To this end, the CWRC ontology records information in two different ways: through a series of Contexts that link the information to its associated source text in Orlando or other materials, and through a series of granular properties that simply link individuals to their personal attributes. In this way, both rapid retrieval and deep provenance tracking are enabled.

The major structure used within the ontology to achieve this is Contexts combined with the typing of contexts according to a number of high-level classes such as Cultural Form. Contexts are used to link a fragment of Orlando prose to the individual whom it references, the properties or assertions situated within that context, and the class of experience or activity to which it belongs.
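The dual recording strategy described above, with Contexts for provenance and granular properties for quick retrieval, can be sketched with two small data structures. This is a minimal illustration under stated assumptions: the class and field names (`Triple`, `Context`, `granular_view`, `provenance_for`) and all `ex:` identifiers are hypothetical and are not the ontology's own classes or properties.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Triple:
    """A single granular assertion about an individual."""
    subject: str
    predicate: str
    obj: str


@dataclass
class Context:
    """A typed context tying assertions back to a snippet of source prose."""
    context_type: str  # a high-level class such as Cultural Form
    person: str        # the individual the snippet refers to
    snippet: str       # the relevant fragment of source prose
    source: str        # identifier of the full source text
    assertions: list = field(default_factory=list)


def granular_view(contexts):
    """Flatten contexts into plain triples for rapid retrieval."""
    return [triple for context in contexts for triple in context.assertions]


def provenance_for(contexts, triple):
    """Return (snippet, source) pairs for every context asserting the triple."""
    return [(c.snippet, c.source) for c in contexts if triple in c.assertions]


# Hypothetical example data; none of these identifiers come from the dataset.
example_triple = Triple("ex:personA", "ex:hasReligion", "ex:religionX")
example_context = Context(
    context_type="CulturalForm",
    person="ex:personA",
    snippet="(snippet of source prose)",
    source="ex:entryA",
    assertions=[example_triple],
)
```

A query that only needs the attribute reads the flattened triples; one that needs to know where the claim comes from calls `provenance_for` and pays the extra cost only then.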

a. Contexts

The Context class provides the discursive context for assertions in the ontology. Where the assertions have been generated from a web-accessible source text, Context provides the text, or the relevant snippet of a longer text, from which they have been extracted. Contexts help to ground the data in its source materials, which can provide users with a sense of the nuance and complexity of assertions related to human subjects and cultural phenomena.

Contexts are typed by major semantic categories including, for biographical material, Cultural Form, Birth, Death, Education, Occupation, and Politics, and, for literary content, Production, Reception, and Textual Features. DC subject links connect specific instances of contexts to associated assertions of properties for that individual, and contexts are also linked where possible, using the provenance ontology, to the full text from which they were extracted. The triples are linked to their associated contexts using the Web Annotation Data Model and Dublin Core subject relationships.

The major context classes are as follows:

A cultural form represents an aspect of lived social subjectivities and/or classification of a person through categories such as race or colour, ethnicity, gender, language, sexuality, politics, or religion. Most of the properties associated with specific Cultural Forms may also have the additional modifiers of reported and self-reported, allowing for the qualification of individual statements.

b. Persons, Personas and Roles

The distinction between persons, personas, and roles is an important component of the complexity of human experiences and relationships.

This ontology adopts the broad FOAF definition of a foaf:Person, which can be applied to any entity considered to be a person, including non-humans. We define two subclasses of Person: a NaturalPerson or human being, and a FictionalPerson, since fictional characters are important to literary studies. If a historical person who is a NaturalPerson is fictionalized as a Character [not yet created] in a text, they also become a FictionalPerson. If a text simply alludes or refers to a NaturalPerson, however, they are not also a FictionalPerson.

In some cases, a Person will be associated with a Persona. A person can occupy a Role [not yet created] in relation to a specific event or situation.

The author Michael Field offers an example of the extent to which "personhood is both a complex and a crucial characteristic that ontologies must be designed to capture appropriately" (Brown and Simpson 2013). The persona of Michael Field was produced by the artistic and lived collaboration between Katherine Harris Bradley and Edith Emma Cooper at the turn of the twentieth century. Even though he was not a biological person, Michael Field had an important role in the two women’s careers, their social lives, and their personal relationship. "Michael Field" can neither be assigned to one of the authors over the other, nor can it be considered only a shared pseudonym. Michael Field is associated with two natural persons at the same time. We seek in the CWRC ontology to capture such manifestations of the originality and the plurality of personhood. The ontology thus includes the "persona" class of person to describe entities such as Michael Field.
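The typing rules for persons and personae described above can be sketched as follows. This is an illustration only, assuming a plain dictionary representation: the helper functions and the `embodied_by` key are hypothetical, while the class names and the Michael Field example are taken from the text.

```python
def make_natural_person(name):
    """A human being: both a foaf:Person and a NaturalPerson."""
    return {"name": name, "types": {"foaf:Person", "NaturalPerson"}}


def make_persona(name, embodied_by):
    """A persona is a foaf:Person tied to one or more natural persons."""
    return {
        "name": name,
        "types": {"foaf:Person", "Persona"},
        # "embodied_by" is a hypothetical key, not an ontology property.
        "embodied_by": [person["name"] for person in embodied_by],
    }


def fictionalize(person):
    """A person fictionalized as a character also becomes a FictionalPerson."""
    person["types"].add("FictionalPerson")
    return person


def allude_to(person):
    """A mere allusion or reference in a text leaves the types unchanged."""
    return person


bradley = make_natural_person("Katherine Harris Bradley")
cooper = make_natural_person("Edith Emma Cooper")
michael_field = make_persona("Michael Field", embodied_by=[bradley, cooper])
```

Note that the persona carries its own set of types and is not a NaturalPerson, even though two natural persons stand behind it.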

It might be argued that such personae are simply pen-names or stage names, such as "Currer Bell" for Charlotte Brontë. However, personae are more than alternative signatures. Personae inflect the ways in which artists socially, symbolically, intimately or artistically embody authorship. While a pen-name can be described as a publication strategy related to a specific context, a persona has its own performed personality that goes beyond a signature. A contemporary example is the FASTWÜRMS art collective. The collective operates as more than a creative identity, to the point of holding a single academic position at the University of Guelph.

A particular persona is an original creation, often bearing meaning related to the biographical, historical and sociological context of its creator. A persona in this sense is also not generally associated with mental illness or multiple personality disorders that result from distorted or uncontrolled perceptions of reality. At the heart of a persona is an identity with which others interact and that can be confused with a Natural Person. It is incarnated and developed by a natural person, may have specific properties such as gender or sexuality that differ from those of the natural person with whom it is associated, and may engage in social, literary, artistic or political activities. Although Personae are FOAF Persons, they are distinct from the CWRC Natural Persons who embody them and from Fictional Persons, unless they become fictionalized by themselves or others.

As documented for the recent Persona tag incorporated in the Text Encoding Initiative Guidelines, personae are not Roles either: "A role may be assumed by different people on different occasions, whereas a persona is unique to a particular person, even though it may resemble others. Similarly, when an actor takes on or enacts the role of a historical person, they do not thereby acquire a new persona." (http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ND.html#NDPERSE).

A Role can be adopted by either personae or natural persons, but a persona cannot be adopted by people generally: it is specific to one natural person, or more rarely several natural persons (as in the case of the collaborative Michael Field and the art collective FASTWÜRMS).

Roles are characters or functions performed on specific occasions and in specific situations, which is to say events. Dramatic roles, that is to say a Character in a creative work, are adopted by actors for particular performances. By analogy, social roles are adopted by particular individuals in relation to particular events or occurrences, which may be of brief or long duration. Key roles in relation to events are those of agents, spectators, and commentators. Occupations, jobs, or significant activities are not the same as roles, although they may be related to them, as may familial or social relationships. Roles will be further fleshed out in relation to the event component of the ontology, which is currently under development.

c. Cultural Form

The Cultural Form classes recognize categorization as endemic to social experience, while incorporating variation in terminology and contextualization of identity categories by employing instances at different discursive levels.

Cultural Form sub-classes and instances describe the subject positions of individuals through both Contexts and granular properties. This has its roots in the Orlando arrangement of Cultural Form encodings that points users towards a framework for raising and debating complex matters for cultural investigation rather than invoking reified categories.

The shift from embedded semantic markup to a linked open data approach presented the challenge of making this approach compatible with linkages to other ontologies and data sets outside of the Orlando frame of reference. The move from "strings to links" or "strings to things" was in some sense at odds with the former embrace of the ambiguity of strings such as white, black, English, etc.: white and black can represent race or ethnicity, while English can also be invoked as an ethnicity, nationality, or a national heritage. Orlando marks these strings using its Cultural Forms tagset as specific to, for example, the context of race or ethnicity, mandating a similar association, within the linked data representation, with a specific instance of Cultural Form. Thus, there exist Cultural Form instances that point to the discursive construction of white as a race and white as an ethnicity. Lastly, there also exists a white label that can be instantiated as either race or ethnicity, but not both within the same assertion (although multiple assertions are possible).

This is a departure from previous (non-linked open data) controlled vocabularies, in that the appearance of the term or label (in this case "white") does not indicate the specific cultural formation being invoked; the specific instance does. This also means that linkages to other data sets or vocabularies can be made appropriately, since multiple representations of the same label are present within the CWRC ontology. As a last resort, or for data mining purposes, the term is also available as a concept whose actual Cultural Form is undecided amongst the CWRC-defined options. This allows for linkages to an external ontology, such as can be required by text mining, without endorsing the corresponding definition or interpretation of the term.
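
The arrangement described above, in which one label is shared by several instances, can be illustrated with a minimal Python sketch. The class name, the identifiers such as whiteRace, and the helper functions are hypothetical and chosen only for illustration; they are not the actual CWRC URIs or an actual CWRC API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CulturalFormInstance:
    identifier: str       # hypothetical local name, not an actual CWRC URI
    label: str            # the shared, human-readable label
    form: Optional[str]   # e.g. "race" or "ethnicity"; None means undecided


# Three instances share the label "white" but differ in Cultural Form.
INSTANCES = [
    CulturalFormInstance("whiteRace", "white", "race"),
    CulturalFormInstance("whiteEthnicity", "white", "ethnicity"),
    CulturalFormInstance("whiteLabel", "white", None),
]


def instances_for_label(label: str) -> List[CulturalFormInstance]:
    """Return every instance carrying the label, whatever its Cultural Form."""
    return [inst for inst in INSTANCES if inst.label == label]


def resolve(label: str, form: Optional[str]) -> CulturalFormInstance:
    """Pick the single instance for a label in one context.

    Passing form=None selects the undecided concept, the last-resort option.
    """
    for inst in INSTANCES:
        if inst.label == label and inst.form == form:
            return inst
    raise KeyError((label, form))
```

In this sketch the label alone is ambiguous (three candidates), while each assertion resolves to exactly one instance, which mirrors the point that the instance, not the label, carries the cultural formation.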

i. Granular Properties

Granular properties provide a simple means of indicating cultural categories as presumed, perceived, or otherwise assigned to a person according to cultural conventions, or as self-reported by the person themselves. Some of the properties are associations inherited from forebears.

Most properties take noun forms in keeping with conventions for ontologies, but in some cases idiom makes adjectival forms preferable, even though these terms function as nouns, as in the case of the sexual identity celibate.

d. Built-in Taxonomies

i. Religion

The original Orlando data makes religious reporting a challenge in that the original contexts had no differentiation between religious belief, membership in a religious organization, and absence of any religious belief combined with adherence to values or practices.

We use a taxonomy for enumerating the categories associated with this spectrum. The taxonomy in itself is SKOS-based and represents a loose mixture of the shared beliefs and historical offshoots.

The taxonomy attempts to trace in a subjective way the theological and/or historical lineage of the belief system. Like applying the labels to an individual, this is an interpretive process.

The specific taxonomy is:

ii. Political Affiliation

Political affiliation categories cover a broad range of political parties, more and less organized movements, and various causes. The instances here involve a strong emphasis on British political matters historically of interest to women and a recognition that movements such as feminism are contested and change with time and location. Some affiliations are linked via SKOS relations, but there are other cross-currents among different groups that cannot be captured here as well as they are in the contextualized data itself. As with the other components of this ontology, this vocabulary makes no claims to comprehensiveness, having been derived from the Orlando dataset, and is open to expansion as needed.

The specific taxonomy is:

iii. Genre

The Literary Genre ontology contains a taxonomy of literary genres based on a SKOS approach. This provides limited transitive narrower/broader definitions that can be used to search for relevant creative works using the taxonomy. The taxonomical tree is built on an ad hoc topic-relevance standard that is meant for document retrieval and may not be suited to all applications. One distinctive feature of the ontology is the use of adjectival terms such as 'philosophical' or 'detective' to denote a particular type of text. Such terms can be employed in conjunction with genres that relate more to form, such as 'poem' or 'novel', so as to denote, for instance, 'feminist novel'.
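
As a rough illustration of how broader/narrower links support retrieval, the following Python sketch walks broader links transitively and combines an adjectival term with a form term. The genre hierarchy, the work titles, and the function names are all hypothetical; they are not drawn from the CWRC genre taxonomy.

```python
# Hypothetical broader links (narrower term -> broader term).
BROADER = {
    "detective novel": "novel",
    "feminist novel": "novel",
    "novel": "fiction",
    "sonnet": "poem",
}

# Hypothetical works, each tagged with one or more genre terms.
WORKS = {
    "Work A": {"detective novel"},
    "Work B": {"sonnet"},
    "Work C": {"novel", "philosophical"},
}


def broader_transitive(term):
    """Return the term followed by every ancestor reachable via BROADER."""
    chain = []
    while term is not None and term not in chain:  # guard against cycles
        chain.append(term)
        term = BROADER.get(term)
    return chain


def works_under(genre):
    """Titles tagged with the genre or with any narrower term."""
    return sorted(
        title
        for title, terms in WORKS.items()
        if any(genre in broader_transitive(t) for t in terms)
    )


def works_matching_all(*genres):
    """Titles matching every given genre, e.g. an adjectival and a form term."""
    return sorted(set.intersection(*(set(works_under(g)) for g in genres)))
```

Here works_under("novel") picks up works tagged only with a narrower term, and works_matching_all("philosophical", "novel") shows an adjectival term used in conjunction with a form term.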

The specific taxonomy is:

e. Notes on ChangeSets

Change Sets exist to track changes to instances, terms, and classes within the ontology; they are therefore used both by the authors of the ontology and by users who are making additions or modifications. Change Sets are instances that are linked either to or from a structure in the ontology, through the object property cwrc:affectedEntity or through skos:changeNote. A single Change Set may apply to several entities, so the cwrc:affectedEntity relation may be applied zero or more times. Change Sets also track the user through the cwrc:alteredBy relation, which may link to any cwrc:NaturalPerson. In addition, the instant at which a change was applied to the ontology is recorded through time:inXSDDateTimeStamp, using xsd:dateTimeStamp values. This will allow Change Sets to describe a release: selecting those that fall between two dates shows the major changes made in that interval. Through provenance, Change Sets may link to external resources, allowing extended discussion to be kept out of the ontology. Shorter descriptions are included through the skos:changeNote relation, and a title is given using a standard rdfs:label. Change Sets are to be used any time an issue is completed by ontology developers or an instance is changed by a user. Automated mechanisms will be used to offset some of the work required.

Figure 3 - Change Set Example using both external descriptions and internal descriptions
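
The shape of a Change Set and the date-based selection described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names mirror the properties named in the text, while the ChangeSet class, the changes_between helper, and all sample values (labels, person and entity identifiers, dates) are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class ChangeSet:
    label: str            # rdfs:label (short title)
    change_note: str      # skos:changeNote (short description)
    altered_by: str       # cwrc:alteredBy, pointing at a cwrc:NaturalPerson
    timestamp: datetime   # time:inXSDDateTimeStamp
    # cwrc:affectedEntity may be applied zero or more times.
    affected_entities: List[str] = field(default_factory=list)


def changes_between(change_sets, start, end):
    """Select change sets with start <= timestamp < end, oldest first."""
    selected = [c for c in change_sets if start <= c.timestamp < end]
    return sorted(selected, key=lambda c: c.timestamp)


def utc(year, month, day):
    """Convenience constructor for a timezone-aware instant."""
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical log entries; none of these values come from the ontology.
LOG = [
    ChangeSet("Fix label case", "Lowercased two labels.", "person:editor2",
              utc(2018, 3, 2), ["ex:termA", "ex:termB"]),
    ChangeSet("Add genre term", "Added an example genre.", "person:editor1",
              utc(2018, 1, 15), ["ex:genreTerm"]),
    ChangeSet("Documentation note", "No entity affected.", "person:editor1",
              utc(2018, 6, 20)),
]

# Change sets that would make up a hypothetical first-quarter release.
RELEASE = changes_between(LOG, utc(2018, 1, 1), utc(2018, 4, 1))
```

Using a half-open interval keeps adjacent releases from counting the same change twice; that choice is a design assumption of this sketch rather than something the ontology prescribes.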

f. Has Functional Relation

The Functional Relation predicate indicates that two terms may be treated as related for functions such as querying and retrieval, while declining to assert a semantic relationship between them. It is designed to bring together incommensurate terms for processing purposes while excluding them from semantic operations. This distinguishes it from, for instance, the skos:semanticRelation property and the skos:closeMatch predicate, which serve a similar purpose but assert semantic proximity.

One of the purposes for this relation is to facilitate comparisons and relationships to other ontologies and vocabularies with which users are more familiar. Use of this relationship does not assert that the two terms are not related semantically, but rather that the current semantic relationships available within OWL, SKOS, and other ontologies used by this ontology are not sufficiently nuanced to allow for a semantic relationship to be specified in a way that can be processed appropriately by other tools (such as inference engines).
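
One way to picture the distinction is a retrieval routine that follows functional relations and an inference routine that ignores them. The Python sketch below is a simplification under several assumptions: the term identifiers are invented, both relations are treated as symmetric for convenience, and neither function corresponds to an actual CWRC or SKOS tool.

```python
# Hypothetical term pairs; none are taken from the CWRC ontology.
FUNCTIONAL = {("cwrc:termA", "ext:termX")}   # functional relation only
SEMANTIC = {("cwrc:termA", "cwrc:termB")}    # stands in for e.g. skos:closeMatch


def related(term, pairs):
    """Terms paired with the given term (pairs are treated as symmetric here)."""
    found = set()
    for left, right in pairs:
        if left == term:
            found.add(right)
        elif right == term:
            found.add(left)
    return found


def expand_for_retrieval(term):
    """Querying and retrieval may follow both kinds of link."""
    return {term} | related(term, SEMANTIC) | related(term, FUNCTIONAL)


def expand_for_inference(term):
    """Semantic operations follow only the semantically asserted links."""
    return {term} | related(term, SEMANTIC)
```

A search for cwrc:termA would therefore also surface ext:termX, while a reasoner working from expand_for_inference would never treat the two as semantically equivalent or close.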

7. CWRC Ontology Design Rules

Beyond the formalism of the OWL 2 Web Ontology Language, the CWRC ontology adheres to the following design rules and styles:

  • The contents of rdfs:label tags are always in lowercase, with the following exception:
    • Labels for religions, political affiliations and groups of people derived from a proper name will begin with an uppercase letter.
  • Whenever possible, the original Orlando XML tag equivalent is contained within the rdf:value tag of any term within the ontology.
  • Whenever referencing a geographical location, use the most precise item within the database.
  • Definitions in French, English (and other serendipitously available languages) are never word-for-word translations; each is a definition in its own right.
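
The label-casing rule above lends itself to a simple automated check. The following Python sketch is one possible reading of the rule; the function name and the derived_from_proper_name flag are hypothetical, and deciding whether a label derives from a proper name is left to the caller.

```python
def label_follows_case_rule(label: str, derived_from_proper_name: bool = False) -> bool:
    """Check a label against the casing rule.

    Labels are expected to be entirely lowercase, unless they name a religion,
    political affiliation, or group of people derived from a proper name, in
    which case they are expected to begin with an uppercase letter.
    """
    if derived_from_proper_name:
        return label[:1].isupper()
    return label == label.lower()
```

For example, a label such as "cultural form" passes with the default flag, while "Anglican" passes only when flagged as derived from a proper name.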

8. Notes on SKOS and OWL

SKOS (Simple Knowledge Organization System) enjoys widespread popularity in the semantic web community as it provides simple terms for taxonomies without requiring reasoner support. Whenever appropriate, SKOS terms are inserted within this ontology to link terms to each other. However, since these terms are not backed by ontological reasoning, their scalability is limited: each additional layer of terms within a taxonomy requires another database query.

Some of the constructs within the CWRC ontology are deep and require reasoning support. OWL is the preferred means of using this ontology, though the usage of the terms, SKOS-style, is possible.

9. Conclusion and Future Work

This is a draft ontology that is very much in progress. It will continue to be developed, expanded, and revised as we discover the implications of how we have structured the ontology through using it to extract and explore our data, as fresh data and use cases necessitate expansion or refinement, and as new needs, understandings, and debates arise.

10. Version History

  • 0.99 - Initial public release.

  • 0.99.2 - Periodic release with updated logos, genres, documentation, and proper masthead data.

  • 0.99.6 - Periodic release with updated styling, competency questions, and documentation regarding events and changesets.

  • 0.99.75 - Periodic release.

  • 0.99.80 - Periodic release with addition of occupations, educational award types, and education credentials.

11. Bibliography

Adrienne Rich. Blood, Bread And Poetry: The Location Of The Poet. The Massachusetts Review, 1983.
Angel R. Oquendo. “Re-Imagining The Latino/a Race”. The Latino/a Condition, edited by Richard Delgado and Jean Stefancic, 1998.
Ashcroft, B., et al. Key Concepts In Post-Colonial Studies. Literary Criticism, 1998.
Atkin, A. “Peirce's Theory Of Signs”. The Stanford Encyclopedia Of Philosophy, 2013.
Bailey, M. Z. “All The Digital Humanists Are White, All The Nerds Are Men, But Some Of Us Are Brave”. Journal Of Digital Humanities, Mar. 2012.
Beauvoir, S. de, et al. The Second Sex. Vintage, no date.
Bergman, S. B. Butch Is A Noun. Suspect Thoughts Press, no date.
Bornstein, K. My Gender Workbook: How To Become A Real Man, A Real Woman, The Real You, Or Something Else Entirely. Routledge, no date.
Bornstein, K., and S. B. Bergman. Gender Outlaws: The Next Generation. Seal P, no date.
Brown, R. M. Rubyfruit Jungle. Daughters Inc, no date.
Brown, S., and J. Simpson. “The Curious Identity Of Michael Field And Its Implications For Humanities Research With The Semantic Web”. 2013 IEEE International Conference On Big Data, 2013.
Brown, S., et al. Orlando: Women's Writing In The British Isles From The Beginnings To The Present. Cambridge University Press, no date.
Brown, S., et al. “An Introduction To The Orlando Project”. Tulsa Studies In Women’s Literature, 2007.
Brown, S., et al. “Cultural (Re-)Formations: Structuring A Linked Data Ontology For Intersectional Identities”. The Proceedings Of The Digital Humanities Conference, 2017.
Brown, S., et al. “The Story Of The Orlando Project: Personal Reflections”. Tulsa Studies In Women’s Literature, 2007.
Butler, J. Gender Trouble. Routledge, no date.
Carpenter, E. The Intermediate Sex: A Study Of Some Transitional Types Of Men And Women. Swan Sonnenschein and Company, no date.
Coyote, I. E., and Z. Sharman. Persistence: All Ways Butch And Femme. Arsenal Pulp Press, no date.
Crenshaw, K. “Demarginalizing The Intersection Of Race And Sex: A Black Feminist Critique Of Antidiscrimination Doctrine, Feminist Theory And Antiracist Politics”. University Of Chicago Legal Forum, 1989.
Crook, G. T. The Complete Newgate Calendar. Edited by Mary Hamilton, vol. 2, Navarre Society, no date.
DBpedia. DBpedia. 2018.
DBpedia. DBpedia. DBpedia, 2017.
DBpedia. DBpedia. no date.
De Lauretis, T. “Differences: A Journal Of Feminist Cultural Studies”. Duke University Press, edited by Elizabeth Weed and Ellen Rooney, 1991.
Dean-Hall, A., and R. H. Warren. “Sex, Privacy And Ontologies”. Workshop On Search And Exploration Of X-Rated Information (SEXI 2013), Feb. 2013.
Ellis, H., and J. A. Symonds. Studies In The Psychology Of Sex. Wilson & Macmillan, no date.
Encyclopædia Britannica. Encyclopædia Britannica. 1911.
Faderman, L. Surpassing The Love Of Men: Romantic Friendship And Love Between Women From The Renaissance To The Present. Morrow, no date.
Fuss, D. Essentially Speaking: Feminism, Nature And Difference. Routledge, no date.
Getty Art And Architecture Thesaurus. Getty Art And Architecture Thesaurus. The J. Paul Getty Trust, 2017.
Gillian Einstein. “Situated Neuroscience: Exploring Biologies Of Diversity”. Neurofeminism, edited by Robyn Bluhm et al., no date.
Grant D. Campbell, and Scott Cowan. The Paradox Of Privacy: Revisiting A Core Library Value In An Age Of Big Data And Linked Data. FIMS Publications, 2016.
Grier, B., and C. Reid. Lesbian Lives: Biographies Of Women From The Ladder. Diana Press, no date.
Grimm, S., et al. “Knowledge Representation And Ontologies”. Semantic Web Services: Concepts, Technologies, And Applications, Jan. 2007.
Grosz, E. A. Volatile Bodies: Toward A Corporeal Feminism. Indiana University Press, no date.
Halberstam, J. Female Masculinity. Duke University Press, no date.
Halpin, H., et al. “When owl:sameAs Isn’t The Same: An Analysis Of Identity In Linked Data”. International Semantic Web Conference, 2010.
Haraway, D. “Situated Knowledges: The Science Question In Feminism And The Privilege Of Partial Perspective”. Feminist Studies, 1988.
Hawley, J. C. Encyclopedia Of Postcolonial Studies. Greenwood Publishing Group, no date.
James Smith. “Working With The Semantic Web”. Doing Digital Humanities: Practice, Training, Research, edited by Constance Crompton et al., no date.
James, H. The Bostonians. 1921 ed., Macmillan, no date.
Jennifer Drew. “Frigidity”. Sex And Society, Volume 1, 2010.
Jewett, S. O. Sarah Orne Jewett Manuscript Collection. Houghton Library, Harvard University, no date.
Johnston, J. Lesbian Nation: The Feminist Solution. Simon and Schuster, no date.
Krafft-Ebing, R. Psychopathia Sexualis, With Especial Reference To Contrary Sexual Instinct: A Medico-Legal Study. Translated by Charles Gilbert Chaddock, 7th ed., F. A. Davis, no date.
Ladies Of Llangollen: Letters And Journals Of Lady Eleanor Butler (1739-1829) And Sarah Ponsonby (1755-1831) From The National Library Of Wales. Adam Matthew Publications, no date.
Lauren F. Klein. Debates In The Digital Humanities 2016. no date.
Lerman, J. “Big Data And Its Exclusions”. Stanford Law Review, Sept. 2013.
Martin, D., and P. Lyon. Lesbian/Woman. Glide P, no date.
Matthew K. Gold. Debates In The Digital Humanities 2012. Univ Of Minnesota Press, no date.
Mayor, E. The Ladies Of Llangollen: A Study In Romantic Friendship. Joseph, no date.
McPherson, T. “Why Are The Digital Humanities So White? Or Thinking The Histories Of Race And Computation”. Debates In The Digital Humanities, 2012.
Monique Wittig. “The Straight Mind”. Feminist Issues, 1980.
Munt, S. “Sisters In Exile: The Lesbian Nation”. New Frontiers Of Space, Bodies And Gender, 1998.
Hayles, N. K. How We Think: Digital Media And Contemporary Technogenesis. The University of Chicago Press, 2012.
Nakamura, L. Cybertypes: Race, Ethnicity, And Identity On The Internet. Routledge, no date.
Nestle, J. The Persistent Desire: A Femme-Butch Reader. Alyson Publications, no date.
Ignatiev, N. How The Irish Became White. Routledge, 2008.
Petersen, T. Art & Architecture Thesaurus. Oxford University Press, no date.
Radclyffe, H. The Well Of Loneliness. Jonathan Cape, no date.
Representing Race And Ethnicity In American Fiction, 1789-1964. Oct. 2016.
Richards, D. Lesbian Lists: A Look At Lesbian Culture, History, And Personalities. Alyson Publications, no date.
Ross, B. The House That Jill Built: A Lesbian Nation In Formation. University of Toronto Press, no date.
Sajnani, D. “Rachel/Racial Theory: Reverse Passing In The Curious Case Of Rachel Dolezal”. Transition Magazine, Hutchins Center, 2015.
Scott, S. A Description Of Millenium Hall. Edited by Gary Kelly, 1995 ed., Broadview Press, no date.
Simone de Beauvoir. The Second Sex. Vintage Books, 1973.
Simpson, J., and S. Brown. From XML To RDF In The Orlando Project. Sept. 2013.
St. Pierre, E. “The Posts Continue: Becoming”. International Journal Of Qualitative Studies In Education, Mar. 2013.
Stuart Hall, and Paul du Gay. Questions Of Cultural Identity. SAGE Publications Ltd., 1996.
Sycamore, M. B. Nobody Passes: Rejecting The Rules Of Gender And Conformity. Seal P, no date.
The Woman-Identified Woman. Radicalesbians, no date.
Treviranus, J. “The Value Of The Statistically Insignificant”. Educause, Jan. 2014.
Woolf, V. A Room Of One's Own. The Hogarth Press, no date.
Woolf, V. Three Guineas. Hogarth Press, no date.
Wright, E. Feminism And Psychoanalysis: A Critical Dictionary. Blackwell, no date.
“Sorting Things In: Feminist Knowledge Representation And Changing Modes Of Scholarly Production”. Women's Studies International Forum, May 2006.
“Trans Student Educational Resources”. TSER, 2017.