Assignment 4: Building a Web Archive

Context:

The Building a Web Archive assignment serves as the final project for this course. It gives you an opportunity to draw on the readings, discussions, and assignments from previous modules and to build a sample web archive based on your Assignment 3: Web Archive Collection Plan.

Assignment:

For this assignment you will use the Web Archive Collection Plan that you submitted in Assignment 3 and begin to build that web archive.

We will use the free Conifer tool (previously called Webrecorder.io), located at https://conifer.rhizome.org/.

In Module 13 you signed up for a free account, created a public collection, and uploaded a link for that collection to the class discussion board. You will be adding seed URLs to that collection and making use of the tools and functionality in Conifer to conduct your crawls of those seeds.

If you have questions about using the Conifer tool, I suggest you begin with the Conifer Help guides at https://guide.conifer.rhizome.org/ or look at some of the YouTube videos that discuss the operation of the service. Overall, you should already be familiar with it from the class Web Archive Exercises.

For your final project you will need to expand your initial seed list to a total of at least 15 seeds.

For each of the seeds you will document at least the following pieces of information.

| Metadata Field | Description of Field |
| --- | --- |
| Seed URL | The seed URL you will be collecting. |
| Pre-Crawl Review | Problems that might exist for a crawler. |
| Title of Seed | The title of the seed/document/website. |
| Description of Seed URL | Textual description of the seed URL. |
| Creator/Author/Publisher | Who is responsible for the creation of this seed URL. |
| Reason for Inclusion in Collection | Why you included this seed in your collection. |
| Post-Crawl Review | Any limitations you notice in your crawled content after crawling. |
| Crawled Seed URL | Link directly to the seed URL in your collection in Conifer. |

You are welcome to include other metadata fields that are appropriate for the type of collection you are creating. These could include country, language, branch of government, Olympian name, political party, or anything else that would be helpful for a user trying to use your collection. The pre- and post-crawl reviews are opportunities to communicate some of the crawling challenges you see based on your experience in this course. They are also a way of describing any quality issues that you identify in your collection after you have completed your crawls.

You are free to present the metadata and fields in any format that works for you; just make sure it is clear what the fields are and which seed URL they belong to.
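If you prefer to keep your seed metadata in a structured form before writing it up, the following is a minimal sketch of one way to do that in Python. It is not a required format. The class name `SeedRecord`, the attribute names, and the helper functions are hypothetical names chosen for this illustration; they simply mirror the required fields listed above and the 15-seed minimum.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SeedRecord:
    """One seed and its required metadata fields (names are illustrative)."""
    seed_url: str
    pre_crawl_review: str
    title: str
    description: str
    creator: str
    reason_for_inclusion: str
    post_crawl_review: str
    crawled_seed_url: str
    # Optional collection-specific fields, e.g. {"language": "English"}.
    extra: dict = field(default_factory=dict)


def missing_fields(record):
    """Return the names of required fields that are empty or only whitespace."""
    return [
        f.name
        for f in fields(record)
        if f.name != "extra" and not str(getattr(record, f.name)).strip()
    ]


def check_seed_list(records, minimum=15):
    """Return a list of problems found; the list is empty if none are found."""
    problems = []
    if len(records) < minimum:
        problems.append(
            f"{len(records)} seed(s) listed; at least {minimum} required"
        )
    for number, record in enumerate(records, start=1):
        gaps = missing_fields(record)
        if gaps:
            problems.append(f"seed {number} is missing: {', '.join(gaps)}")
    return problems
```

Calling `check_seed_list(my_seeds)` on your own list of records before you write up the Seed List section gives a quick reminder of any seed that still lacks, say, a post-crawl review. The check only confirms that each field has some text in it; it says nothing about the quality of what you wrote.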

When you are crawling your seeds, keep two things in mind. First, you have a limited amount of space (5GB) for this work, so avoid capturing large amounts of video, for example. Second, capture each seed at a level that fits within the scope of your collection. You may not need to capture the site in its entirety; just make sure you discuss what you did and didn’t crawl in your Pre-Crawl and Post-Crawl Review sections.

Organization and Content:

The assignment should start with the title “Assignment 4: Building a Web Archive” centered at the top of the first page.

The following sections should be present as headings.

Collection Overview

This section outlines the collection you are building, including links to the public collection page in the Conifer service. When writing this overview, take into account the seeds you have added beyond those in your Collection Plan so that it covers the full collection. In this section you should also speak to the crawl modality of the web archive you are creating (domain, website, topical, event, document).

Seed List

This section will contain all 15 (or more, if you choose to include them) seed URLs and their associated metadata fields (as listed in the Assignment section above). You may format these in whatever way you feel best conveys the information clearly.

References

Any references cited within the document. Use APA standards for citations. Use an online source (Purdue OWL) for specifics about APA.

Layout Specifics:

Use an 11 pt font, double spacing, and 1-inch margins throughout the document. Include page numbers in the bottom right side of the footer.

The title for the assignment should be centered horizontally at the top of the first page (just like this document). Section headings should be bold and slightly larger than the 11 pt font used in the rest of the document. I suggest using the Headings available in most word processing tools.

Feel free to include screenshots as needed to provide examples or highlight points.

Put your last name in the upper right margin. Include pagination in the bottom margin.

Name the document Assignment4_lastname.docx, Assignment4_lastname.doc, or Assignment4_lastname.odf, depending on which tool you use. Submit it to the Major Assignment: Building a Web Archive assignment in Canvas.
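If you want to double-check the file name before submitting, a short sketch like the one below can help. The function name `is_valid_filename` is hypothetical, and the pattern assumes that a last name contains only letters, hyphens, or apostrophes; adjust it if your name uses other characters.

```python
import re

# Assumption: last names contain only letters, hyphens, or apostrophes.
FILENAME_PATTERN = re.compile(r"Assignment4_[A-Za-z'\-]+\.(docx|doc|odf)")


def is_valid_filename(name):
    """Return True if name follows the Assignment4_lastname.<ext> convention."""
    return FILENAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_filename("Assignment4_Smith.docx")` is true, while a name with a different extension or a missing last name is rejected.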

Grading Rubric

Design (15 points)

  • Does the document follow the specific instructions for the assignment?
  • Does the document contain a title and section headings?
  • Does the document contain the correct information in the header and footer?
  • Does the document use appropriate margins, line spacing, and font size?
  • Is the document’s length appropriate based on the instructions?

Content (65 points)

  • Does the document provide an overview of the collection?
  • Does the document identify the collection crawl modality?
  • Does the document have at least fifteen seed URLs?
  • Does the document include the required metadata fields with each seed URL?
  • Does the archive content reflect the seed list and the pre- and post-crawl reviews?

Linking and Citations (10 points)

  • Does the document have at least fifteen seed URLs and appropriate metadata?
  • Does the document include necessary citations?
  • Does the document include properly formatted citations?

Delivery (10 points)

  • Was the document submitted to the correct assignment module on Canvas?
  • Was the document submitted on time?
  • Was the document submitted in the correct file format (.doc, .docx, .odf)?
  • Was the document submitted with the correct file name?