Instructions for NHSC Editors

From GrassrootWiki
Revision as of 23:38, 13 February 2006 by Jere Krischel (talk | contribs) (Copyedit text to standard format)

In order to complete copy editing of the NHSC Report, Volume I, please follow this procedure:

Download the raw OCR text

It exists as two PDF files:

    • Cover to xvi (cover of the book, inside flap, and pages i-xvi)
    • p1-497 (only pages 1-497 have recognizable text; after 497, each PDF page holds 4 book pages reduced in size, which are often unreadable)

Go to the NHSC Volume I Full Page List

Pick a given numbered page and click on the link

For example, page 1 is here.

If completed, the link will show you a page with navigation links at the top and bottom (Previous Page and Next Page), and a link that says "Text Only".

If not completed, the link will show you a page with navigation links at the top and bottom (Previous Page and Next Page), and two red links, one that says "Text Only" and the other that says "Template:Nhsc-v1-<page number>".
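As a rough illustration of the naming pattern above, the sketch below builds the template title for a given page. The function name `template_title` is hypothetical, and it assumes the page number is written as a plain decimal with no zero padding and that numbered pages run from 1 to 497 (the range of the second PDF); check an existing page if in doubt.

```python
# Hypothetical helper: builds the title that appears in the red link
# for a numbered page, following "Template:Nhsc-v1-<page number>".
# Assumption: the page number is a plain decimal with no zero padding.

FIRST_PAGE = 1
LAST_PAGE = 497  # assumed upper bound, taken from the p1-497 PDF


def template_title(page_number: int) -> str:
    """Return the template title for one numbered page."""
    if not FIRST_PAGE <= page_number <= LAST_PAGE:
        raise ValueError(f"page {page_number} is outside {FIRST_PAGE}-{LAST_PAGE}")
    return f"Template:Nhsc-v1-{page_number}"


if __name__ == "__main__":
    print(template_title(1))
```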

Click on the "Text Only" or "Template:Nhsc-v1-<page number>" link

If not yet completed, you will immediately be brought to an edit screen.

If already completed, you will see the text for that page only, and will need to click the "edit" button to make changes.

Copy raw text from PDF

Open the downloaded PDF with Adobe Acrobat, and navigate to the page you are editing. Using the Select Tool (Tools...Basic...Select), highlight and copy the text directly from the PDF, and paste it into the edit window. Most pages have just two columns of text; make sure you copy all of it.

Click on the "Save Page" button to save the raw text. This gives us a baseline for our edits, so don't clean up the text before you save.
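To see why an untouched baseline is useful, here is a minimal sketch that compares raw text against a copyedited version on your own computer, using Python's standard `difflib` module. The function name `show_changes` and the sample strings are made up for illustration; the wiki's own page history serves the same purpose once both versions are saved.

```python
import difflib


def show_changes(raw_text: str, edited_text: str) -> str:
    """Return a unified diff between the raw OCR text and the copyedited text."""
    diff = difflib.unified_diff(
        raw_text.splitlines(),
        edited_text.splitlines(),
        fromfile="raw OCR",
        tofile="copyedited",
        lineterm="",  # input lines carry no newlines, so headers should not either
    )
    return "\n".join(diff)


if __name__ == "__main__":
    # Made-up sample text, not taken from the report.
    raw = "Teh committee met\nin Honolulu."
    edited = "The committee met\nin Honolulu."
    print(show_changes(raw, edited))
```

Lines starting with `-` come from the raw text and lines starting with `+` come from the copyedited text, so every correction can be traced back to the baseline.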

Copyedit text to standard format

After saving the raw text, click the edit button again and start fixing the following:

  • spelling errors (note: some pages have entire columns of letters missing, so you will sometimes need to reconstruct words from context)
  • formatting (I'll put more notes on guidelines for this later)

Some pages have tables that require Jere Krischel's personal attention. Leave those uncorrected for him.

There will also be some pages with charts and graphs that we won't bother copyediting. If you have a question as to whether a specific chart or graph should be included/excluded, please ask Jere.

Also validate your copyedits against the actual images of the pages. To find them, navigate to the NHSC Volume I Full Page List, click on a page link, then click on the image thumbnail on that page. You can download the image to your computer in either a low resolution format or a very high resolution format. For particularly ambiguous problems, the high resolution format is probably the best bet.