Instructions for NHSC Editors
Revision as of 23:38, 13 February 2006
To copyedit the NHSC Report, Volume I, please follow this procedure:
Download the raw OCR text
It exists as two PDF files:
- Cover to xvi (cover of the book, inside flap, and pages i-xvi)
- p1-497 (only pages 1-497 have recognizable text; after page 497, each page holds four reduced-size pages that are often unreadable)
Go to the NHSC Volume I Full Page List
Pick a given numbered page and click on the link
For example, page 1 is here.
If completed, the link will show you a page with navigation links at the top and bottom (Previous Page and Next Page), and a link that says "Text Only".
If not completed, the link will show you a page with navigation links at the top and bottom (Previous Page and Next Page), and two red links, one that says "Text Only" and the other that says "Template:Nhsc-v1-<page number>".
Click on the "Text Only" or "Template:Nhsc-v1-<page number>" link
If not yet completed, you will immediately be brought to an edit screen.
If already completed, you will see the text for that page only, and will need to click the "edit" button to make changes.
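As a rough illustration of the template naming pattern mentioned above, the title for a given page can be built from its page number. This is only a sketch: the function name `template_title` is hypothetical, and it assumes the page number is substituted verbatim, with no zero padding, which these instructions do not confirm.

```python
def template_title(page_number):
    """Build the per-page template title from a page number.

    Assumption: the page number is inserted verbatim (no zero padding),
    following the "Template:Nhsc-v1-<page number>" pattern.
    """
    return f"Template:Nhsc-v1-{page_number}"


# Print the template title for page 1.
print(template_title(1))
```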
Copy raw text from PDF
Open the downloaded PDF with Adobe Acrobat, and navigate to the page you are editing. Using the Select Tool (Tools...Basic...Select), highlight and copy the text directly from the PDF, and paste it into the edit window. Pages typically have two columns of text; make sure you copy both.
Click on the "Save Page" button to save the raw text. We want the unedited text saved as a baseline for our edits, so don't clean it up before you save.
Copyedit text to standard format
After saving the raw text, click the edit button again and start fixing the following:
- spelling errors (note that some pages have entire columns of letters missing, so sometimes you need to reconstruct the text from context)
- formatting (I'll put more notes on guidelines for this later)
Some pages have tables that need my (Jere Krischel's) personal attention. Just leave those uncorrected for me.
There will also be some pages with charts and graphs that we won't bother copyediting. If you have a question as to whether a specific chart or graph should be included/excluded, please ask Jere.
Also note that you should validate your copyedits against the actual images of the pages. They can be found by navigating to the NHSC Volume I Full Page List, clicking on a page link, then clicking on the image thumbnail on that page. If you want, you can download the image to your computer in either a low resolution format or a very high resolution format. For particularly ambiguous problems, the high resolution format is probably the best bet.