LCPO Data Proofing and Entry

Note: this site is in active development. Site appearance will change significantly before true go-live, but the core data structure for storing Poet data and other content is relatively set. Because Drupal and other CMSes separate content (data), Views (pages and visualizations), and themes (appearance), changing the appearance or adding / deleting Views won’t affect the underlying data. The reverse does not hold: changing underlying data automatically updates Views, so if you add a new poet with a birthplace, he or she will immediately appear on the Poet Birthplaces Map.

Authenticated users have an admin toolbar:

The only links you’ll use from here are My Workbench and Content (which is also accessible under My Workbench).

My Workbench is provided by the Workbench module, which helps govern workflows and moderate content publication. It shows content which you’ve edited under “My Drafts”, and has links to create content as well. These are the same links as under “Content.”

Taxonomies (Drupal’s term for controlled vocabularies) are under Structure. These help manage input for fields. Examples include Locations (because we’re using historical locations and want them standardized to some degree), Religions, Occupations, Industries, Affiliations, etc. Data entry users / contributors can add or delete terms and rearrange term hierarchies, but not add or delete entire vocabularies. You will rarely need to add to a taxonomy, and shouldn’t ever need to delete a term. It can still be useful to see the list of all available terms here, especially when using one of the autocomplete widgets, which don’t always list the terms on the page itself.

Most of the time you won’t be adding content – you’ll be editing / modifying it. I think that the easiest way to do this is from the “All Poets Table” View (main menu).

This shows all the poets in the database in a spreadsheet-like view. There is a pager on the bottom (see below) as the page is very large; I’m still experimenting with the best number of poets to display at once – right now it’s 100.

There are facets available to filter poets by different field values; these are located in the "Filters" dropdown above the table. Not all the facets have labels yet, as the Exposed Filters Fieldgroup module is a bit buggy, but you can apply facets to filter and search for poets. You can also sort poets in ascending or descending order by name, ID, priority, etc.

You can’t edit Poets directly from “All Poets Table.” However, you can at least see all Poets here to find the Poet you’d like to edit. Clicking on the Poet’s name lets you View them; you can edit from that View, or you can click directly on the edit link as well.

The Poet “Edit Draft” view is where you’ll likely spend most of your time. Fields for the Poet entity type take up the rest of this document. The general process for editing a poet is just to pull up the Superlist and the Drupal Poet entity side by side and skim through both at the same time. Most poet entries are very brief – just four or five lines. You’re just checking that any quantifiable or describable information from the Superlist also appears in a field in the Drupal Poet entity. Eventually, there will be a “Narrative Bio” section where a bio close to the Superlist entry will appear, but the style of the Superlist will need to be changed to something closer to ODNB and this will take significant rewriting, so “Narrative Bios” have been backlogged for now.

Once you're 

If you have any questions, comments, or suggestions, contact Cole. There are absolutely no stupid questions when it comes to this – I’ve been developing the backend more than anything, so suggestions on how to make the data modification side more user-friendly, or suggestions on how to display information better, etc, are all welcome.

Why do all of this work? Besides providing a better way to search for poets, quantifying and fielding all of this information makes it possible to make data visualizations like the “Poet Birthplaces” map. This is relatively easy to do once the data is in Drupal, and the possibilities are only limited by the data captured, development skills, and imagination. Capturing the right data in the right way is the hardest part.

POET FIELDS

  • Name [text]. Alias for Drupal node title. Full name of the poet; should be unique. Uses bio dates or other distinguishing information if there are multiple poets with the same full given name. Shouldn’t need to modify.
  • PoetId [int]. GUID, mostly used for importing. Do not modify.
  • Contributor [entity reference to User]. References users who have helped with this poet entry. In the Superlist, usually preceded by “inf.”; sometimes appears in other places.

NAMES

Generally filled out well, except for images.

  • Image [content link]. We don’t have any of these yet. You can upload images here – make sure to give them a description in alt text and a title. Also include a link to the source, the name of the source, creation year (if known), and rights (if known).
  • Prefix or Title [text]. Dr., Mrs., etc
  • First Name [text]. Single first name
  • Middle Name(s) [text]. All other names
  • Last Name(s) [text]
  • Suffix [text]. Jr, III, etc
  • Maiden Name [text]. Often “nee” in Superlist
  • Married Name [text]
  • Pseudonyms [text]. Any name used or referenced other than bardic names (Welsh) or alternate spellings / initials
  • Bardic Name [text]. Welsh phenomenon; usually in Welsh, sometimes with an English translation (if so, include it)
  • Other Name [text]. Abbreviations or alternate name spellings
  • Gender [list]. Male or Female for all poets so far; Other / Unknown is available just in case
  • Sexual Orientation [list]. Almost always leave as none so it won’t show up at all (don’t change to heterosexual). I think Anne Bannerman is the only potentially gay poet
  • Sexual Orientation Description [text]

DATES

Baptism, birth, and death are all generally filled out well. Active Decades: don't bother filling this in. We're probably not going to use it, as it's really just a proxy for publication data (eg if a poet published works in 1756, 1759, and 1768, "Active Decades" would be 1750s; 1760s), and we can get this information directly from the publications instead. Cause of death: spotty. I filled it in when I found it, but didn’t go over it specifically.
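
To illustrate why Active Decades is only a proxy for publication data, here is a minimal Python sketch that derives the decades from a list of publication years. The function names are hypothetical and nothing here is part of the Drupal site; it only reproduces the 1756 / 1759 / 1768 example above.

```python
def active_decades(publication_years):
    """Return the sorted, de-duplicated decade start years covering the given years."""
    # 1756 -> 1750, 1768 -> 1760; the set removes repeated decades.
    return sorted({year - year % 10 for year in publication_years})


def format_decades(decades):
    """Format decade start years the way this guide writes them, eg '1750s; 1760s'."""
    return "; ".join(f"{decade}s" for decade in decades)


if __name__ == "__main__":
    years = [1756, 1759, 1768]  # publication years from the example in the text
    print(format_decades(active_decades(years)))  # prints "1750s; 1760s"
```

Since the decades can be computed this way from the Publication data, entering them by hand would just duplicate information.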

  • Baptism Year [YYYY date]. Takes a year in YYYY format
  • Birth Year [YYYY date]. Takes a year only, in YYYY format
  • Birth Year Approximated [Boolean]. Yes/no checkbox for whether the birth year is approximated (circa). Default no, check if yes.
  • Death Year [YYYY date]. Takes a year only, YYYY format
  • Death Year Approximated [Boolean]. Yes/no checkbox for whether the death year is approximated (circa). Default no, check if yes.
  • Flourished Years [YYYY date range]. If there is just a single date, fill in; if there is a range, check the “Show End Date” box and fill in both dates
  • Flourished Years Approximated [Boolean]. Yes/no checkbox for whether flourished year range (either side) is approximated (circa). Default no, check if yes.
  • Flourished Dates Description [text]. Used to capture original description of flourished years (eg 1850s-1860s, which has to become 1850-1869 for the Date Range datatype)
  • Active Years [YYYY Date Range, multiple]. Backup field for Active Decades if Partial Date didn’t work out. No need to put anything here.
  • Active Decades [Partial Date]. Use the decade picker; “add another” if there are multiple active decades. If the poet published a work during that decade, he or she is considered “active”
  • Cause of death [taxonomy reference Cause of Death]. Use the autocomplete to fill in cause of death, if known. If there is a reason not in the Cause of Death taxonomy, add it to the taxonomy, then return to the poet page and reference it.
  • Death Description [text]
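
The Flourished Dates Description bullet above notes that a description like 1850s-1860s has to become 1850-1869 for the Date Range datatype. A minimal Python sketch of that conversion follows. The function name and the parsing rule (a four-digit year followed by "s" is a decade) are my assumptions for illustration; in practice you type the two years into the form by hand.

```python
import re


def flourished_range(description):
    """Convert a description such as '1850s-1860s' into a (start_year, end_year) pair.

    Assumption: a four-digit year followed by 's' means a whole decade,
    so a trailing decade ends nine years after its start year.
    """
    tokens = re.findall(r"(\d{4})(s?)", description)
    if not tokens:
        raise ValueError(f"no four-digit year found in {description!r}")
    first_year, _ = tokens[0]
    last_year, last_is_decade = tokens[-1]
    start = int(first_year)
    end = int(last_year) + (9 if last_is_decade else 0)
    return start, end


if __name__ == "__main__":
    print(flourished_range("1850s-1860s"))  # prints (1850, 1869)
    print(flourished_range("1842"))         # single year: prints (1842, 1842)
```

The original wording still belongs in Flourished Dates Description, since the conversion loses the fact that the source gave decades rather than exact years.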

LOCATIONS

Birthplace and Other Locations: generally filled out well. Nationality: spotty on the newest poets (High priority). Emigration, Emigration Description, and Transatlantic: should be pretty complete, as I spent time on these.

  • Birthplace [taxonomy reference Locations]. Poet’s birthplace (eg “born at” or “of”). Multiple-select autocomplete for a Location, which is already geocoded. If there is a location which does not exist, add it with its coordinates to the Location taxonomy, then return and reference it. This shouldn’t happen often, if at all.
  • Other Locations [taxonomy reference Locations]. Any other geographic places mentioned in a poet entry. Some overlap with Emigration field.
  • Nationality [taxonomy reference Nationality]. Can have multiple. Decide by looking at poet birthplace and other permanent(ish) locations.
  • Emigration [taxonomy reference Locations]. Locations the poet emigrated to.
  • Emigration Description [long text]. Look for occupations like sailor / seaman / soldier, when or why they moved, and whether they returned.
  • Transatlantic [Boolean]. If the poet emigrated across the Atlantic, check yes.

EDUCATION

Not added at all yet – new fields and category

  • Formal Education [tax ref Formal Education Levels]. Unfortunately pretty subjective. Looking just at quality / length of formal schooling, not self-teaching. Examples?
  • Education Types [taxonomy reference Education]. Taxonomy to describe education. Can select multiples. If there is a type of education which isn’t well described here, feel free to add to Education
  • Formal Education Start / End Ages [int]. Ages rather than years. Often just an end age rather than a range.
  • Education Description [text]. Description. Put things like “only 3 days of formal schooling” or “Sunday school educated but otherwise self-taught” here.

CLASS AND LABOR

Occupation, Industry: pretty complete except for High priority poets. LC Status: complete for all poets. Began working at age: not filled in for any poets. Disability and Illness, Orphaned or Widowed, Social Relief, Imprisonment: potentially spotty – searched for all major terms but never made individual full run-throughs.

  • Occupation [tax ref Occupations]. Use the existing occupations if at all possible, as these are the result of merging other very similar terms. If necessary add to taxonomy.
  • Industry [tax ref Industry]. Which Industry/ies the poet worked in. Based on their occupation.
  • Laboring-Class Status is Certain [Boolean]. Uncheck if the poet is not certainly working-class. In Superlist, question mark precedes name (eg “(?) Bradley, Dudley”).
  • Began working at age [int]. Trying to see if there was a pattern of child workers – numerous 6 or 7 year old factory laborers. Include if mentioned in Superlist.
  • Disability and Illness [tax ref Disability and Illness]. If the poet suffered from any illnesses, disabilities, or accidents. Some overlap with Cause of Death.
  • Disability and Illness Description [long text]. Describe if there is additional information beyond what is captured by the taxonomy term ref
  • Orphaned or Widowed [list]. Orphan = both parents lost; half-orphan = 1 parent lost. Can choose 2 (eg orphaned and widowed).
  • Orphaned or Widowed Description [long text]. If there is additional information about the parents / husband / wife, or how that person’s death directly impacted the poet’s class status
  • Social Relief [tax ref Social Relief]. Types of social relief. No specific location in Superlist. Often Civil List pension or workhouse inmates. Social Relief Worker is a poet who worked at jails or helped run workhouses but was not an inmate. Asylums are generally mental health institutes; there were a few poets who lived in blind asylums, which should be “Other Housing.”
  • Imprisonment [long text]. A description of any imprisonment, including reasons for, deportation to Australia, punishment type and length, etc.

CONNECTIONS

Religion: spotty in both the database and the Superlist. Affiliations: very organic group; coverage and taxonomy both incomplete. Use description field liberally. Relatives: not yet imported, but pretty complete. Relationships: not yet added; spotty and experimental. Do not enter for now as I’m still tinkering with how I want to capture these.

  • Religion [tax ref Religions]. Usually only mentioned if the poet is a religious figure, writes religious verse, or is a non-mainstream denomination. If unsure, leave blank. If adding terms to taxonomy, make sure they sit at the right place in the hierarchy (eg under Christian -> Protestant)
  • Affiliations [tax ref Affiliations]. Describes political groups / movements poets belonged to or supported, literary groups / figures they belonged to or emulated, and geographic clusterings. Can be very specific or very general (Blue Stockings Group vs generic “radical”). Other examples include Jacobins / Chartists / temperance movement; School of Duck; Manchester “Sun Inn” Group. Geographic subsection has obvious overlap with Birthplace and Other Locations fields, but still useful. Try to use existing taxonomy, but add to it if necessary – make sure it’s in the right area of the hierarchy.
  • Affiliations Description [long text]. Describe level of commitment to movements, specifics, etc.
  • Relatives [Field Collection]. Used to describe a blood or marriage relationship. Each “Relative” reference is composed of multiple parts:
    • Relative Type [Relatives tax ref]. Describes the tie – eg father, brother-in-law, wife. From the perspective of the poet whose page you are editing – so there are “is child of” AND “is parent of” Relative Types, and they need to be applied in the correct direction. Needs to be added to both poet pages, unfortunately. Example: if you’re editing Andrew Barnard, the relative tie looks like this to tie Andrew to his father.
    • Relative Figure [Poet / Non-Poet figure entity reference]. References the other person in the relative relationship.
    • Relationship Description [long text]. Optional; if they worked together, knew each other, had a close or estranged relationship.
  • Relationships [Field Collection]. Used to describe networks of influence, patronage, collaboration, letter writing, friendship, assistance, etc. Each Relationship is composed of multiple parts:
    • Relationship Type [Relationship tax ref]. Describes the relationship type. Like Relative Type, these are directional. Unlike Relative Type, there can be multiple Relationship Types between a pair of figures (you can only be related to your mother one way, unless you’re Oedipus, but you can both be taught by someone and collaborate with them), so this is a multivalue field. Example: if you’re editing Thomas Pringle and want to show that STC read his work and that Pringle was influenced by STC, add both Relationship Types to the one Relationship with STC. Relationships are hard to quantify or pin down, but showing that they existed is the most important thing.
    • Related Figure [Poet or Non-Poet Figure entity ref]. References the other half of the relationship. All relationships are directional. They don’t have to be mirrored.
    • Relationship Description [long text]. Describe how the two figures were related.
    • Relationship Strength [int, 0-5]. Numerical rating of relationship strength; highly subjective. Experimental. Would be used for weighting edges or nodes on a network graph. Leave for now.
    • Document Count [int]. Count of documents exchanged between two figures. Would be used to supplement the Relationship Strength field or perhaps on its own in a graph depicting letter writing.
    • Document Count Approximated [Boolean]. Whether the document count is approximate.
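
Because Relative Types are directional and have to be entered on both poet pages, it can help to think of each tie as having a mirror image. The Python sketch below is only a way to picture that. The class and function names are hypothetical, the inverse table covers just the two Relative Types named above, and the father's name is a placeholder because this guide does not give it. Nothing here reflects how Drupal stores the Field Collection.

```python
from dataclasses import dataclass

# Only the two Relative Types named in this guide are listed;
# the real Relatives taxonomy has more terms.
INVERSE_RELATIVE_TYPE = {
    "is child of": "is parent of",
    "is parent of": "is child of",
}


@dataclass(frozen=True)
class RelativeTie:
    poet: str             # the poet whose page you are editing
    relative_type: str    # written from that poet's perspective
    relative_figure: str  # the other person in the tie


def mirrored(tie):
    """Return the tie as it should be entered on the other person's page."""
    return RelativeTie(
        poet=tie.relative_figure,
        relative_type=INVERSE_RELATIVE_TYPE[tie.relative_type],
        relative_figure=tie.poet,
    )


if __name__ == "__main__":
    # "Father of Andrew Barnard" is a placeholder, not a name from the Superlist.
    tie = RelativeTie("Andrew Barnard", "is child of", "Father of Andrew Barnard")
    print(tie)
    print(mirrored(tie))
```

Relationships, by contrast, don’t have to be mirrored, so no such inverse table would apply to them.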

PUBLICATIONS

Not yet added; I’m finishing these up soon. Will be pretty complete for low or medium priority poets, spotty or not at all for high / critical priority poets. Dialect Usage and Language are not captured at all yet.

  • Poetry Collections [Publication entity ref]. References a Publication. Can reference a periodical or non-poetic publication because of how the Publication content type is structured; don’t do this. Reference those in their respective field. All data about the Publication (dates, publisher, locations, editions, volumes, digital editions, etc) is stored in the Publication entity.
  • Periodical Publications, Non-Poetic Publications [Publication entity ref]. References a Publication. As above – make sure you reference the right publication in the right field.
  • Referenced Poem Titles [text]. If the Superlist mentions a single poem that isn’t a standalone publication, add the poem’s title here; add format, year, etc if relevant.
  • Manuscript Information [long text, multivalue]. If the poet has known holograph manuscripts or letters, enter that information here; provide links, physical locations, etc if known. If there are multiple collections, add as multiple values in the field.
  • Dialect [tax ref Dialects]. If the poet wrote in dialect in general. This has overlap with the Publications dialect field, which captures whether or not a specific work used dialect. Taxonomy in flux; mostly taken from the Wikipedia page of British dialects. Scots is included as both a language and a dialect due to the ongoing debate over its status. Try to pick a specific, terminal term (eg Cumbrian instead of Northern). Find by looking at birthplace, publication titles, or in random places throughout Superlist entries; if unable, “Yes [undefined]” is a fallback for a poet that is known to have written in dialect – use sparingly.
  • Dialect Usage Description [long text].
  • Language [tax ref Languages]. If the poet wrote in multiple languages.
  • Publication Notes [long text]. Anything about publications, interactions with publishers / publication process not captured above or in individual Publication content types.

REFERENCES AND SOURCES

Noted somewhat well in the spreadsheet, but they probably can’t be imported into Drupal, so they’ll have to be added manually. Not started on this yet.

  • References and Sources [Field Collection]. Contains multiple different types of reference.
    • Biblio Field Collection [Field Collection]. A little clunky, but to associate pages with a Biblio reference it has to be nested in its own Field Collection as the pages aren’t associated with the Biblio entity for all references.
      • Major References [Biblio entity ref]. Major references (see Superlist, “Major Sources and Abbreviations”) have been added as Biblio content types. I haven’t added the abbreviations yet. Reference a major source here.
      • Biblio Page Numbers [text]. Can take multiple values, if there are discrete ranges in a Biblio source (eg p 8-21, 45, 92-94).
    • Other Digital References [Link]. For linking to a digital reference such as a website. Takes a Title (eg Oxford Dictionary of National Biography) and a URL. Strip the ezproxy portion out of the middle of the URL you accessed so that users from other institutions can use the link – for the Robert Tannahill article, http://www.oxforddnb.com.ezproxy.proxy.library.oregonstate.edu/view/arti... becomes http://www.oxforddnb.com/view/article/26960. ODNB, ESTC, etc shouldn’t be added as Biblio references as these wouldn’t point to individual entries, and each poet points to a different link. If the source may be reused and linked to as a reference in another Poet, consider adding this as a major source in Biblio instead.
    • Other Analog References [text]. For analog sources only. MLA format. If the source may be reused and linked to as a reference in another Poet, consider adding this as a major source in Biblio instead.
    • Manuscript Information [long text]. If a manuscript was used as a source for information about the poet (not just listed as a publication or work), cite it here.
    • Reference Description [long text]. Catchall to describe references; don’t add references themselves here.
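
The ezproxy cleanup described under Other Digital References can be pictured with a short Python sketch. The function name is hypothetical. The proxy suffix is taken from the Oregon State example above, and the full proxied URL in the demo is my assumption, rebuilt from the clean URL given in the text. Other institutions use different suffixes, so treat this as an illustration rather than a general tool.

```python
from urllib.parse import urlsplit, urlunsplit

# Suffix taken from the example in this guide; other libraries will differ.
PROXY_SUFFIX = ".ezproxy.proxy.library.oregonstate.edu"


def strip_proxy(url, proxy_suffix=PROXY_SUFFIX):
    """Remove a library proxy suffix from the host part of a URL, if present."""
    parts = urlsplit(url)
    host = parts.netloc
    if host.endswith(proxy_suffix):
        host = host[: -len(proxy_suffix)]
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Assumed full form of the proxied link for the Robert Tannahill article.
    proxied = (
        "http://www.oxforddnb.com.ezproxy.proxy.library.oregonstate.edu"
        "/view/article/26960"
    )
    print(strip_proxy(proxied))  # prints http://www.oxforddnb.com/view/article/26960
```

A URL that has no proxy suffix passes through unchanged, so it is safe to run on links that are already clean.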

OTHER

  • Internal Comments [long text]. Use to capture anything about the data entry process, describe further research needed, etc. Don’t add info about the poet here, as this field won’t be shown to anyone but admins, contributors, and data entry users.
  • Priority [list]. Add priority status here – “Low” through “Critical.” If a poet is completely done (as in, all fields completely checked and everything mirrors the Superlist), but could probably benefit from additional research: “Mirrors Superlist – Additional Research Necessary.” If digital editions, images, ODNB links, etc have all been searched, “No Research Necessary.” This field won’t be shown to end users.
  • Revision Log Message [Workbench Moderation field]. Helps track changes made to poets by tracking revisions made by users. Be succinct – not much needed here. This is a sort of versioning control system; each time changes are made to an entity, a snapshot of that entity's state 
  • Moderation State [Workbench Moderation field – list]. The workflow is Newly Added -> Work in Progress -> Ready for Review -> Published, and a poet has to move through it in that order. If you’ve added any info at all, change to Work in Progress (WIP); if the poet has either of the “Mirrors” statuses, it can probably be moved to Ready for Review and then Published. An admin has to approve nodes before they’re Published.
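
The Moderation State workflow can be read as a simple forward-only sequence. The Python sketch below models just what this guide states: four states in order, with the last step needing an admin. The function names and the is_admin flag are hypothetical, and the actual Workbench Moderation configuration on the site may allow other transitions (such as sending a poet back a step) that are not modeled here.

```python
WORKFLOW = ["Newly Added", "Work in Progress", "Ready for Review", "Published"]


def next_state(current):
    """Return the state that follows `current`, or None if already Published."""
    index = WORKFLOW.index(current)  # raises ValueError for an unknown state
    if index == len(WORKFLOW) - 1:
        return None
    return WORKFLOW[index + 1]


def can_transition(current, target, is_admin=False):
    """Allow only the next step in the workflow; publishing also needs an admin."""
    if next_state(current) != target:
        return False
    if target == "Published" and not is_admin:
        return False
    return True


if __name__ == "__main__":
    print(can_transition("Newly Added", "Work in Progress"))                # True
    print(can_transition("Newly Added", "Published"))                       # False: skips steps
    print(can_transition("Ready for Review", "Published"))                  # False: not an admin
    print(can_transition("Ready for Review", "Published", is_admin=True))   # True
```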

ADDITIONAL INFORMATION

Entity relationship diagram