[R] Parsing XML?

Spencer Graves @pencer@gr@ve@ @end|ng |rom e||ect|vede|en@e@org
Thu Jul 28 12:52:55 CEST 2022


Hi, Richard et al.:


On 7/28/22 1:50 AM, Richard O'Keefe wrote:
> What do you mean by "a list that I can understand"?
> A quick tally of the number of XML elements by identifier:
> 1 echoedSearchRetrieveRequest
> 1 frbrGrouping
> 1 maximumRecords
> 1 nextRecordPosition
> 1 numberOfRecords
> 1 query
> 1 records
> 1 resultSetIdleTime
> 1 searchRetrieveResponse
> 1 servicelevel
> 1 sortKeys
> 1 startRecord
> 1 wskey
> 2 version
> 50 leader
> 50 recordData
> 51 recordPacking
> 51 recordSchema
> 100 record
> 105 controlfield
> 923 datafield
> 1900 subfield


	  How did you get that?
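
A minimal sketch of one way a tally like this could be produced with xml2, run on a cut-down copy of the sample record quoted further down. The inline xml_txt and the names all_nodes and tally are illustrative assumptions; this is not necessarily how Richard produced his counts.

```r
library(xml2)

# Cut-down stand-in for the real file (structure copied from the sample
# record in this thread; most fields omitted for brevity).
xml_txt <- '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
  <version>1.1</version>
  <numberOfRecords>2250</numberOfRecords>
  <records>
    <record>
      <recordSchema>info:srw/schema/1/marcxml</recordSchema>
      <recordPacking>xml</recordPacking>
      <recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <leader>00000nas a22000007i 4500</leader>
          <controlfield tag="001">1030438981</controlfield>
          <datafield ind1="0" ind2="0" tag="245">
            <subfield code="a">Selma sun.</subfield>
          </datafield>
        </record>
      </recordData>
    </record>
  </records>
</searchRetrieveResponse>'

doc <- read_xml(xml_txt)

# "//*" selects every element in the document, whatever its namespace.
all_nodes <- xml_find_all(doc, "//*")

# Count how often each element name occurs, smallest counts first.
tally <- sort(table(xml_name(all_nodes)))
print(tally)
```

Replacing xml_txt with the URL of the real file in read_xml() should give counts for the whole file.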


	  Please forgive me for being so dense.  I've done several web searches 
and tried to work several tutorials, etc., without so far seeing what I 
might do that could be informative.


	  Even this list of "XML elements by identifiers" STILL does not 
include things like the name of the newspaper and publisher plus start 
and end dates.  I believe these fields are there, but I can't see how to 
parse them.  I earlier parsed a JSON version of essentially the same 
dataset.  However, the JSON version seemed not to distinguish between 
newspapers that were still publishing and those for which the end date 
was unknown.  My contact at the Library of Congress then suggested I 
parse the XML version.
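
A hedged sketch of how those fields might be pulled out with xml2. It assumes the usual MARC 21 layout: field 245 $a holds the title, 264 $b the publisher (older records may use 260 instead), and control field 008 carries the publication status in position 06 ("c" currently published, "d" ceased, "u" status unknown) and the start and end years in positions 07-10 and 11-14. The two records are abbreviated from the data in this thread (the 008 values are truncated to their first 20 characters), and the names recs, first_text, papers and status_codes are illustrative.

```r
library(xml2)

# Two abbreviated records from this thread: "Selma sun" (still published)
# and "Black & white" (ceased in 2013). Most fields are omitted.
xml_txt <- '<searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/">
<records>
<record><recordData>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="008">180404c20159999aluwr</controlfield>
  <datafield ind1="0" ind2="0" tag="245">
    <subfield code="a">Selma sun.</subfield>
  </datafield>
  <datafield ind1=" " ind2="1" tag="264">
    <subfield code="a">Selma, AL :</subfield>
    <subfield code="b">North Shore Press, LLC</subfield>
    <subfield code="c">2016-</subfield>
  </datafield>
</record>
</recordData></record>
<record><recordData>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="008">920716d19922013alumr</controlfield>
  <datafield tag="245">
    <subfield code="a">Black &amp; white.</subfield>
  </datafield>
</record>
</recordData></record>
</records>
</searchRetrieveResponse>'

doc <- read_xml(xml_txt)

# One MARC record per newspaper: the <record> directly under <recordData>.
# local-name() sidesteps the two default namespaces in the file.
recs <- xml_find_all(
  doc, "//*[local-name()='recordData']/*[local-name()='record']")

# Text of the first match per record; NA where a record lacks the field.
first_text <- function(xpath) xml_text(xml_find_first(recs, xpath))

f008 <- first_text("./*[local-name()='controlfield'][@tag='008']")

papers <- data.frame(
  title     = first_text("./*[local-name()='datafield'][@tag='245']/*[local-name()='subfield'][@code='a']"),
  publisher = first_text("./*[local-name()='datafield'][@tag='264']/*[local-name()='subfield'][@code='b']"),
  date_type = substr(f008, 7, 7),    # MARC position 06 (R counts from 1)
  start     = substr(f008, 8, 11),   # MARC positions 07-10
  end       = substr(f008, 12, 15),  # MARC positions 11-14
  stringsAsFactors = FALSE
)

# Translate the status code; "9999" as end year goes with status "c".
status_codes <- c(c = "currently published", d = "ceased",
                  u = "status unknown")
papers$status <- unname(status_codes[papers$date_type])
print(papers)
```

If the assumption about field 008 holds, the date_type column is what separates newspapers that are still publishing from those whose end date is unknown.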


	  Thanks,
	  Spencer

> 
> What of this information do you actually want?
> The elements of the list should be what?
> 
> 
> On Thu, 28 Jul 2022 at 08:52, Spencer Graves
> <spencer.graves using effectivedefense.org> wrote:
> 
>     Hello, All:
> 
> 
>                What would you suggest I do to parse the following XML
>     file into a list that I can understand:
> 
> 
>     XMLfile <-
>     "https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/ndnp_Alabama_all-yrs_e_0001_0050.xml"
> 
> 
> 
> 
>                This is the first of 6666 XML files containing the "U.S.
>     Newspaper Directory" maintained by the US Library of Congress, discussed
>     in the thread below.  I've tried various things using the XML and xml2
>     packages.
> 
> 
>     XMLdata <- xml2::read_xml(XMLfile)
>     str(XMLdata)
>     XMLdat <- XML::xmlParse(as.character(XMLdata), asText = TRUE)
>     str(XMLdat)
>     XMLtxt <- xml2::xml_text(XMLdata)
>     nchar(XMLtxt)
>     #[1] 29415
> 
> 
>                Someplace there's a schema for this.  I don't know if it's
>     embedded in this XML file or in a separate file.  If it's in a separate
>     file, how could I describe it to my contacts with the Library of
>     Congress so they would understand what I needed and could help me get it?
> 
> 
>                Thanks,
>                Spencer Graves
> 
> 
>     p.s.  All 29415 characters in XMLtxt appear in the thread below.
> 
> 
>     -------- Forwarded Message --------
>     Subject:        [Newspapers and Current Periodicals] How can I get
>     counts of
>     the numbers of newspapers by year in the US, and preferably also
>     elsewhere? A search of "U.S. Newspaper Directory,
>     Date:   Wed, 27 Jul 2022 14:59:03 +0000
>     From:   Kerry Huller <serials using ask.loc.gov>
>     To:     Spencer Graves <spencer.graves using effectivedefense.org>
>     CC: twes using loc.gov
> 
> 
> 
>     --# Type your reply above this line #--
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 27 2022, 10:59am via System
> 
>     Hello Spencer,
> 
>     So, when I view the xml, I'm actually looking at it in XML editor
>     software, so I can view the tags and it's structured neatly. I've
>     copied and pasted the text from the beginning of the file and the first
>     newspaper title below from my XML editor:
> 
>     <?xml version="1.0" encoding="UTF-8" standalone="no"?>
>     <?xml-stylesheet type='text/xsl'
>     href='/webservices/catalog/xsl/searchRetrieveResponse.xsl'?>
> 
>     <searchRetrieveResponse xmlns="http://www.loc.gov/zing/srw/"
>     xmlns:oclcterms="http://purl.org/oclc/terms/"
>     xmlns:dc="http://purl.org/dc/elements/1.1/"
>     xmlns:diag="http://www.loc.gov/zing/srw/diagnostic/"
>     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
>     <version>1.1</version>
>     <numberOfRecords>2250</numberOfRecords>
>     <records>
>     <record>
>     <recordSchema>info:srw/schema/1/marcxml</recordSchema>
>     <recordPacking>xml</recordPacking>
>     <recordData>
>     <record xmlns="http://www.loc.gov/MARC21/slim">
>            <leader>00000nas a22000007i 4500</leader>
>            <controlfield tag="001">1030438981</controlfield>
>            <controlfield tag="008">180404c20159999aluwr n       0   a0eng
>        </controlfield>
>            <datafield ind1=" " ind2=" " tag="010">
>              <subfield code="a">  2018200464</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="040">
>              <subfield code="a">DLC</subfield>
>              <subfield code="e">rda</subfield>
>              <subfield code="c">DLC</subfield>
>              <subfield code="b">eng</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="012">
>              <subfield code="m">1</subfield>
>            </datafield>
>            <datafield ind1="0" ind2=" " tag="022">
>              <subfield code="a">2577-5316</subfield>
>              <subfield code="2">1</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="032">
>              <subfield code="a">021110</subfield>
>              <subfield code="b">USPS</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="037">
>              <subfield code="b">711 Alabama Avenue, Selma, AL
>     36701</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="042">
>              <subfield code="a">nsdp</subfield>
>              <subfield code="a">pcc</subfield>
>            </datafield>
>            <datafield ind1="1" ind2="0" tag="050">
>              <subfield code="a">ISSN RECORD</subfield>
>            </datafield>
>            <datafield ind1="1" ind2="0" tag="082">
>              <subfield code="a">071</subfield>
>              <subfield code="2">15</subfield>
>            </datafield>
>            <datafield ind1=" " ind2="0" tag="222">
>              <subfield code="a">Selma sun</subfield>
>            </datafield>
>            <datafield ind1="0" ind2="0" tag="245">
>              <subfield code="a">Selma sun.</subfield>
>            </datafield>
>            <datafield ind1=" " ind2="1" tag="264">
>              <subfield code="a">Selma, AL :</subfield>
>              <subfield code="b">North Shore Press, LLC</subfield>
>              <subfield code="c">2016-</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="310">
>              <subfield code="a">Weekly</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="336">
>              <subfield code="a">text</subfield>
>              <subfield code="b">txt</subfield>
>              <subfield code="2">rdacontent</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="337">
>              <subfield code="a">unmediated</subfield>
>              <subfield code="b">n</subfield>
>              <subfield code="2">rdamedia</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="338">
>              <subfield code="a">volume</subfield>
>              <subfield code="b">nc</subfield>
>              <subfield code="2">rdacarrier</subfield>
>            </datafield>
>            <datafield ind1="1" ind2=" " tag="362">
>              <subfield code="a">Began in 2015.</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="588">
>              <subfield code="a">Description based on: Volume 2, Issue 40
>     (October 5, 2017) (surrogate); title from caption.</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="588">
>              <subfield code="a">Latest issue consulted: Volume 2, Issue 40
>     (October 5, 2017).</subfield>
>            </datafield>
>            <datafield ind1=" " ind2=" " tag="752">
>              <subfield code="a">United States</subfield>
>              <subfield code="b">Alabama</subfield>
>              <subfield code="c">Dallas</subfield>
>              <subfield code="d">Selma.</subfield>
>            </datafield>
>          </record>
>     </recordData>
>     </record>
> 
>     When I view the records in the XML editor, these 2 lines below do begin
>     each of the records for each individual title, but of course this is
>     including the xml tags:
> 
>     <recordSchema>info:srw/schema/1/marcxml</recordSchema>
>     <recordPacking>xml</recordPacking>
> 
>     Hopefully this helps you decide where to break or parse each record.
> 
>     On another note, I just noticed as well that at the top of this first
>     file it lists the total number of records for the Alabama grouping -
>     2250. This also appeared to be the case for the Alaska records when I
>     took a look at the first one for that state. I imagine that should be
>     consistent throughout each "grouping" of records.
> 
>     Let me know if you have follow-up questions!
> 
>     Best wishes,
> 
>     Kerry Huller
>     Newspaper & Current Periodical Reading Room
>     Serial & Government Publications Division
>     Library of Congress
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 27 2022, 10:21am via Email
> 
>     Hi, Kerry:
> 
> 
>     Thanks. I understand the chunking in files of at most 50. I've read
>     the first file "ndnp_Alabama_all-yrs_e_0001_0050.xml" into a string of
>     29415 characters, copied below. Might you have any suggestions on the
>     next step in parsing this? Staring at it now, it looks like splitting on
>     "info:srw/schema/1/marcxmlxml" might convert the 29415 characters into
>     shorter chunks, each of which could then be parsed further.
> 
> 
>     This is not as bad as reading ancient Egyptian hieroglyphics without
>     the Rosetta Stone, but I wondered if you might have something that could
>     make this work easier and more reliable? I guess I could compare with
>     what I already read as JSON ;-)
> 
> 
>     Thanks,
>     Spencer Graves
> 
> 
>     "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i
>     45001030438981180404c20159999aluwr n 0 a0eng
>     2018200464DLCrdaDLCeng12577-53161021110USPS711 Alabama Avenue, Selma, AL
>     36701nsdppccISSN RECORD07115Selma sunSelma sun.Selma, AL :North Shore
>     Press,
>     LLC2016-WeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan
>     in
>     2015.Description based on: Volume 2, Issue 40 (October 5, 2017)
>     (surrogate); title from caption.Latest issue consulted: Volume 2, Issue
>     40 (October 5, 2017).United
>     StatesAlabamaDallasSelma.info:srw/schema/1/marcxmlxml00000cas a22000007a
>     4500502150053100127c20109999aluwr n 0 a0eng
>     2010200019DLCengDLCDLCOCLCQ112153-18111750USPSB & C Publishing, LLC,
>     3514 Martin St. S. Ste 104, Cropwell, AL 35054pccnsdpISSN RECORDSt.
>     Clair County news (Cropwell, Ala.)St. Clair County news(Cropwell,
>     Ala.)St. Clair County news.Cropwell, AL :B & C Pub.WeeklyBegan in
>     2010.Description based on: Nov. 4, 2010 (surrogate); title from
>     caption.info:srw/schema/1/marcxmlxml00000cas a22000007a
>     4500426491872090720c20099999alumr n 0 a0eng
>     2009203372DLCengDLCOCLCQ12150-346X2150-346X1AU@000044489617NZ116076352Devon
>     Applewhite/Applewhite Publishing Co., 1910 Honeysuckle Rd., #N183,
>     Dothan, AL 36305mscnsdpISSN RECORD30514Triangle tribune(Dothan,
>     Ala.)Triangle tribune.Dothan, AL :Applewhite Pub. CoMonthlyBegan with
>     vol. 1, issue 1 (May 2009).\"Connecting the Tri-State African -American
>     Community.\"Description based on: Vol. 1, issue 1 (May 2009); title from
>     masthead.Applewhite, Devon.United StatesAlabama.United
>     StatesGeorgia.United StatesFlorida.info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 4500289017315081219c20089999aluwr n | a0eng c
>     2008213218NSDengNSDOCLCQDLCOCLCQ111945-93191945-93191005270USPSSpringhill
>     Publications,
>     LLC, P.O. Box 186, Greenville, AL 36037nsdppccISSN RECORD07014Greenville
>     standardThe Greenville standard.Greenville, AL :Springhill
>     PublicationsWeeklytexttxtrdacontentunmediatednrdamediaBegan with vol. 1,
>     issue 1 (Sept. 3, 2008)Description based on surrogate of: Vol. 1, no. 15
>     (Dec. 18, 2008); title from masthead (publisher's Web site, viewed Dec.
>     19, 2008).Latest issue consulted: Vol. 1, no. 99 (July 27, 2011)
>     (surrogate).info:srw/schema/1/marcxmlxml00000cas a22000007a
>     4500123539969070426c20079999aluwr ne 0 a0eng c
>     2007212138NSDengNSDNSDOCLCQ101936-95571936-95571The Western Tribune,
>     1530 Third Ave. N., Bessemer, AL 35020mscnsdpISSN RECORDWestern tribune
>     (Bessemer, Ala.)The Western tribune(Bessemer, Ala.)The Western
>     tribune.Bessemer, Ala. :D-Med, Inc.v.WeeklyBegan in 2007.Description
>     based on: May 23, 2007 (surrogate); title from
>     caption.AU@000041575341info:srw/schema/1/marcxmlxml00000cas a22000007a
>     4500226300653080425c20079999aluwr ne | a0eng
>     2008212112NSDengNSDNSDOCLCQ11942-20751942-20751nsdppccISSN RECORDThe
>     corridor messengerThe corridor messenger.Carbon Hill, AL :Corridor
>     Messenger, Inc.WeeklyBegan with vol. 1, issue (10.03.2007).Description
>     based on: 1st issue.United StatesAlabamaWalkerCarbon
>     Hill.http://www.corridormessenger.cominfo:srw/schema/1/marcxmlxml00000cas
>     a22000007a
>     450077560432070109c20069999aluwr ne 0 a0eng c
>     2007213400NSDengNSDOCLCQAUBRNOCLCOOCLCFa01935-37901935-37901AU@000041190283The
> 
> 
>     Auburn Villager, P.O. Box 1633, Auburn, AL 36831-1633pccnsdpISSN
>     RECORDThe Auburn villagerThe Auburn villager.Auburn, AL :Auburn
>     Villagerv.WeeklyBegan in 2006.Description based on: Vol. 1, no. 4 (July
>     20, 2006) (surrogate); title from caption.Auburn (Ala.)Newspapers.Lee
>     County (Ala.)Newspapers.AlabamaAuburn.fast(OCoLC)fst01209634AlabamaLee
>     County.fast(OCoLC)fst01211930Newspapers.fast(OCoLC)fst01423814United
>     StatesAlabamaLeeAuburn.info:srw/schema/1/marcxmlxml00000cas a2200000Ii
>     4500872286785m o d s cr mn|---a||||140311c20069999alucr n o b
>     s0 a0eng cABCengrdaABCABCOCLCFLD59.13University of Alabama at
>     Birmingham.The eReporter.[Birmingham, Alabama] :The University of
>     Alabama at Birmingham,[2006]-[Birmingham, Alabama] :Offices of Public
>     Relations & Marketing and Information Technology1 online resource2
>     issues weeklytexttxtrdacontentcomputercrdamediaonline
>     resourcecrrdacarrierSeptember 19, 2006-\"The eReporter is an official
>     communication of The University of Alabama at Birmingham, companion to
>     the UAB Reporter and recommended alternative to mass e-mails.\"Issues
>     for <March 11, 2014- published and distributed via e-mail subscription
>     on Tuesdays and Fridays.Description based on: September 19, 2006; title
>     from title screen (viewed March 12, 2014).University of Alabama at
>     BirminghamPeriodicals.Periodicals.fast(OCoLC)fst01411641University of
>     Alabama at Birmingham.fast(OCoLC)fst00645114University of Alabama at
>     Birmingham.Office of Public Relations and Marketing.University of
>     Alabama at Birmingham.Information Technology.2006-2012, companion
>     to:University of Alabama at Birmingham.UAB
>     reporter.(OCoLC)32435748Archived
>     issueshttp://hatteras.dpo.uab.edu/cgi-bin/ereporter.cgiinfo:srw/schema/1/marcxmlxml00000cas
> 
> 
>     a22000007a 4500166387050070829c20059999aluwr ne | a0eng c
>     2007215501NSDengNSDOCLCQ11939-68991939-68991The Wilkie Clark Memorial
>     Foundation, P.O. Box 514, Roanoke, AL 36274$30.00nsdpmscISSN
>     RECORD305.89614People's voice (Roanoke, Ala.)The people's voice(Roanoke,
>     Ala.)The people's voice.Roanoke, AL :Wilkie Clark Memorial
>     Foundationv.WeeklyBegan with vol. 1, no. 1 in 2005.Description based on:
>     Vol. 2, no. 20 (Apr. 20, 2007); title from caption.Wilkie Clark Memorial
>     Foundation.United
>     StatesAlabamaRandolphRoanoke.AU@000042141390info:srw/schema/1/marcxmlxml00000nas
> 
> 
>     a22000007i 45001124677787191021c20uu9999aluwr ne | a0eng
>     2019202521DLCengrdaDLC12689-3258122730USPSNorth Jackson Press, 42950 Hwy
>     72, Suite 406, Stevenson, AL 35772nsdppccISSN RECORD071.323North Jackson
>     pressNorth Jackson press.Stevenson, AL :Caney Creek Publications
>     LLCWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierDescription
>     based on surrogate of: Volume 1, number 36 (October 11, 2019); title
>     from masthead.Latest issue consulted: Volume 1, number 36 (October 11,
>     2019) (Surrogate).United
>     StatesAlabamaJacksonStevensoninfo:srw/schema/1/marcxmlxml00000cas
>     a2200000 a 4500226315099080428d19981998aluwr ne | 0eng c
>     2008233691GUAengGUAOCLCQOCLCFOCLCO39911644pccn-us-gaThe Dekalb
>     news.Birmingham, Ala. :Community newspaper holdings Inc.v.WeeklyBegan
>     with 1st year, no. 1 (Apr. 1, 1998); ceased with 1st year, no. 31 (Oct.
>     28, 1998).Final issue consulted.Description based on first issue; title
>     from caption.Decatur (Ga.)Newspapers.DeKalb County
>     (Ga.)Newspapers.Newspapers.fast(OCoLC)fst01423814GeorgiaDecatur.fast(OCoLC)fst01226234GeorgiaDeKalb
> 
> 
>     County.fast(OCoLC)fst01215288United
>     StatesGeorgiaDeKalbDecatur.Decatur-DeKalb news/era(DLC)sn
>     89053661(OCoLC)19946163info:srw/schema/1/marcxmlxml00000cas a2200000 i
>     450050263311m o d cr cn|||||||||020730c19979999alu x neo
>     0 a0eng c
>     2015238492AMHengrdapnAMHOCLCQOCLCFOCLCOIULOCLHTMOCLCQCOODLC66460694810970435082687-93791AU@000050711528OCLCS45109pccnsdpn-us---AP2.B5707023Birmingham
> 
> 
>     weekly (Online)Birmingham weekly(Online)Birmingham weekly.Birmingham, AL
>     :Birmingham Weekly1 online resourceIrregular,Feb. 16-28,
>     2012-Weekly,Sept. 4-11, 1997-Feb. 9-16,
>     2012texttxtrdacontentcomputercrdamediaonline resourcecrrdacarrierBegan
>     with vol. 1, issue 1 (Sept. 4-11, 1997).\"City news, views &
>     entertainment\"--Cover.Numbering dropped in Mar. 2012.Also issued in
>     print.Description based on: Publication information from ProQuest; title
>     from web page (viewed June 18, 2015).Latest issue consulted: Aug. 15-20,
>     2012.Birmingham (Ala.)Newspapers.Internet resources.Electronic
>     journals.AlabamaBirmingham.fast(OCoLC)fst01204958Newspapers.fast(OCoLC)fst01423814United
> 
> 
>     StatesAlabamaBirmingham.Print version:Birmingham
>     Weekly(OCoLC)39271050http://apw.softlineweb.com/http://WC2VB5MT8E.search.serialssolutions.com/?sid=sersol&SS_jc=JC_000051895&title=Birmingham+Weeklyinfo:srw/schema/1/marcxmlxml00000cas
> 
> 
>     a22000007a 450031471314941116d19941995aluwr ne 0 a0eng csn
>     94003083
>     NSDengNSDANEOCLCQOCLCFOCLCOOCLCQ11079-65411079-65411nsdppccn-us-akSoutheast
>     shopperSoutheast shopper.Juneau, Alaska :Kemper
>     Communications,1994-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol.
> 
> 
>     1, no. 1 (Nov. 16, 1994)-Ceased in Feb. 1995.Juneau
>     (Alaska)Newspapers.AlaskaJuneau.fast(OCoLC)fst01213587Newspapers.fast(OCoLC)fst01423814United
> 
> 
>     StatesAlaskaJuneau.AU@000011356572info:srw/schema/1/marcxmlxml00000cas
>     a22000008a 450027910515930413c19949999alumr n 0 a0eng dsn
>     93002581 NSDengNSDOCLCQ11069-06621Birmingham Tribune, 216 Ave. T. Pratt
>     City, Birmingham, AL 35214nsdpBirmingham tribuneBirmingham
>     tribune.Birmingham, Ala. :Kervin
>     Fondren9501volumesMonthlytexttxtrdacontentunmediatednrdamediavolumencrdacarrierPREPUB:
> 
> 
>     publication expected Jan.
>     1995AU@000025863987info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450026199931920716d19922013alumr ne 0 a0eng csn 92003357
>     NSDengNSDOCLOCLCQDLC011064-01341064-01341Black & White, POB 13215,
>     Birmingham, AL 35202-3215nsdppccBlack & white (Birmingham, Ala.)Black &
>     white(Birmingham, Ala.)Black & white.Black and whiteBirmingham, Ala.
>     :Black & White, Inc.v.Biweekly,Oct. 2, 1997-Monthly,May 1, 1992-Sept.
>     1997Began in May 1992; ceased with Jan. 10, 2013.\"Birmingham's New City
>     paper.\"Description based on: June 1992.Latest issue consulted: No. 67
>     (Oct. 16, 1997) (surrogate).info:srw/schema/1/marcxmlxml00000cas
>     a2200000 a 450032145723950314d19901999alumr ne 0 a0eng csn
>     95068755
>     MGNengMGNNSDCLUOCLCQOCLCFOCLCOOCLCA971211082-34841082-34841AU@000011579542nsdppccn-us-alF335.J5S68The
> 
> 
>     Southern shofarThe Southern shofar.Birmingham, AL :L. Brook,-[1999]v.
>     :ill. ;35 cm.MonthlyBegan in 1990.-v. 9, issue 9 (Aug./Sept. 1999).\"The
>     monthly newspaper of Alabama's Jewish community.\"Some issues also
>     available on the Internet via the World Wide Web.Description based on:
>     Vol. 3, issue 11 (Oct. 1993).Jewish newspapersAlabama.Jewish
>     newspapers.fast(OCoLC)fst00982872Alabama.fast(OCoLC)fst01204694United
>     StatesAlabamaJeffersonBirmingham.Deep South Jewish voice(DLC)sn
>     99018499(OCoLC)42431704CLUhttp://bibpurl.oclc.org/web/719http://www.bham.net/shofar/info:srw/schema/1/marcxmlxml00000cas
> 
> 
>     a22000007a 450021265141900326c19909999aluwr ne 0 a0eng csn
>     90099004 AARengAARCPNNSDOCLCQ11050-08981050-08981005022USPSE.O.N., Inc.,
>     Main St., Eclectic, AL 36024pccnsdpISSN RECORDThe Eclectic observerThe
>     Eclectic observer.Eclectic, Ala. :E.O.N., Inc.,1990-v.WeeklyVol. 1, no.
>     1 (Feb. 22, 1990)-Published by: Price Publications, Inc., <2006->Latest
>     issue consulted: Vol. 17, no. 1 (Jan. 5, 2006).United
>     StatesAlabamaElmoreEclectic.AU@000040212446info:srw/schema/1/marcxmlxml00000cas
> 
> 
>     a22000007a 450021214781900314c19909999aluir ne 0 a0eng csn
>     90002457 AAAengAAANSDOCLCQ111050-20841050-20841931180USPSClanton
>     Newspapers, 1109 Seventh St., N., PO Box 1379, Clanton, AL
>     35045nsdppccn-us-alThe Clanton advertiserThe Clanton
>     advertiser.AdvertiserClanton, Ala. :Clanton Newspapersv. :ill. ;58
>     cm.Three no. a week,<May 13, 1992->Semiweekly,<Apr. 4, 1990->Began in
>     Jan. 1990.Description based on: Vol. 19, no. 27 (Wed., Apr. 4,
>     1990).Latest issue consulted: Vol. 22, no. 58 (May 13, 1992).United
>     StatesAlabamaChiltonClanton.Independent advertiser (Clanton,
>     Ala.)(OCoLC)21214732AU@000025908452info:srw/schema/1/marcxmlxml00000cas
>     a2200000 a 450021214814900314c19909999aluwr ne 0 a0eng dsn
>     90099009 AAAengAAACPNNSDOCLCQ11056-32881056-32881505740USPSThe Blount
>     Countian, 3rd St. at Washington Ave., PO Box 310, Oneonta, AL
>     35121mscnsdpn-us-alThe Blount countianThe Blount countian.Oneonta, Ala.
>     :Southern Democrat, Inc.,1990-v. :ill.WeeklyVol. 1, no. 1 (Jan. 3,
>     1990)-Editor: Molly Howard Ryan, 1990-Latest issue consulted: Vol. 1,
>     no. 36 (Sept. 5, 1990).Ryan, Molly Howard.United
>     StatesAlabamaBlountOneonta.Southern Democrat(DLC)sn
>     85044741(OCoLC)12038577AU@000025884049info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 450022413044900920c19909999aluwr ne 0 a0eng dsn
>     90099011
>     AARengAARCPNNSDNSTOCLCQ92081707011191053-91231053-91231314240USPSmscnsdpThe
>     Clay times-journalThe Clay times-journal.Lineville, Ala. :C.L.
>     Proctor,1990-v.WeeklyVol. 1, no. 1 (Sept. 6, 1990)-United
>     StatesAlabamaClayLineville.Ashland progress(DLC)sn 85044701Lineville
>     tribune(DLC)sn 85044702AUinfo:srw/schema/1/marcxmlxml00000cas a22000007a
>     450021265218900326c19909999aluwr ne 0 0eng dsn 90099005
>     AARengAARCPNOCLCQmscTrussville news-journal.Trussville, Ala. :Mike
>     Mitchell,1990-v.BimonthlyVol. 1, no. 1 (Feb. 20, 1990)-United
>     StatesAlabamaJeffersonTrussville.info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 450022301035900831c19909999aluwr ne 0 0eng dsn
>     90099010 AARengAARCPNOCLCQmscWeaver tribune.Oxford, Ala. :Cheaha
>     Pub.,1990-v.WeeklyVol. 1, no. 1 (July 19, 1990)-United
>     StatesAlabamaCalhounWeaver.United
>     StatesAlabamaCalhounOxford.info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 450015155895870205c19879999aludr ne 0 a0eng csn
>     87050045
>     AAAengAAACPNNSDDLCCPNNSDDLCCPNDLCOCLDLCOCLCQOCLCFOCLCQ19261126829944596670892-44570892-44571AU@000020456714360980USPSThe
> 
> 
>     Advertiser, P.O. Box 1000, Montgomery, AL
>     36192pccnsdpn-us-alNewspaperMontgomery advertiser (Montgomery, Ala. :
>     1987)The Montgomery advertiser(1987)The Montgomery advertiser.Montgomery
>     advertiser & the Alabama journalSunday Montgomery advertiserMontgomery,
>     Ala. :Advertiser Co.,1987-volumes
>     :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier160th
> 
> 
>     year, no. 1 (Jan. 2, 1987)-On Saturdays, Sundays and holidays a combined
>     edition is published with the Alabama journal, and called: Montgomery
>     advertiser and the Alabama journal, Jan. 3, 1987, and: Alabama journal
>     and Montgomery advertiser, Jan. 4, 1987-Feb. 25, 1990.Issues for Sunday
>     called: Sunday Montgomery advertiser, Mar. 4, 1990-Issues for Saturday,
>     Sunday and holidays have their own numbering, Jan. 3, 1987-Feb. 25,
>     1990.Montgomery
>     (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United
> 
> 
>     StatesAlabamaMontgomeryMontgomery.Advertiser (Montgomery,
>     Ala.)0745-3221(DLC)sn 82008412(OCoLC)9049482Alabama journal (Montgomery,
>     Ala. : 1940)0745-323X(DLC)sn
>     87062018(OCoLC)2666111info:srw/schema/1/marcxmlxml00000cas a2200000 a
>     450016942287871105c19879999aludn ne 0 a0eng dsn 88050149
>     AAAengAAACPNNSDOCLCQy1044-00701044-0070746--32780746-32781565580USPSTroy
>     Publications, Inc., 113 North Market St., Troy, AL 36081mscnsdpMessenger
>     (Troy, Ala.)The Messenger(Troy, Ala.)The Messenger.Troy, Ala. :Troy
>     Pub.,1987-v.Daily (Sunday, Tuesday, Thursday and Friday)Vol. 121, no.
>     166 (July 1, 1987)-Sunday, Apr. 2, 1989 misprinted as v. 113.Latest
>     issue consulted: Vol. 113 [sic 123], no. 96 (Sunday, Apr. 2,
>     1989).United StatesAlabamaPikeTroy.Troy messenger0746-3278(DLC)sn
>     83009935(OCoLC)9921908info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450017799786880415c19879999aluir ne 0 a0eng dsn 88050086
>     AARengAARCPNNSDOCLCQ1p1044-03801044-03800745-75961441520USPSThe
>     Prattville Progress, 152 W. 3rd St., Prattville, AL
>     36067mscnsdpPrattville progress (Prattville, Ala. : 1987)The Prattville
>     progress(Prattville, Ala.)The Prattville progress.Prattville, Ala.
>     :James C. Seymour,1987-v.Three times a weekVol. 102, no. 8 (Jan. 20,
>     1987)-Latest issue consulted: Vol. 105, no. 153 (Wednesday, Dec. 26,
>     1990).United StatesAlabamaAutaugaPrattville.Progress (Prattville,
>     Ala.)0745-7596(DLC)sn
>     83007623(OCoLC)9428489info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450015344667870319c19869999aluwr ne 0 a0eng dsn 87000284
>     NSDengNSDCPNOCLCQy0893-07670893-07671431800USPSPickens County Herald,
>     P.O. Drawer E, Carrollton, AL 35447nsdpPickens County heraldPickens
>     County herald.Pickens County herald and west AlabamianCarrollton, Ala.
>     :Pickens Newspapers, Inc.,1986-WeeklyVol. 138, no. 40 (Oct. 2,
>     1986)-United StatesAlabamaPickensCarrollton.Pickens County herald and
>     west Alabamian0746-0473(DLC)sn
>     83008141AU@000040635809info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450018917586881217c19869999aluwr ne 0 0eng dsn 88050225
>     CPNengCPNOCLCQmscThe Oxford sun/times.Oxford, Ala.
>     :[s.n.],1986-v.WeeklyVol. 1, no. 1 (Jan. 16, 1986)-Editor: Andy
>     Goggans.Numbering is irregular.United StatesAlabamaCalhounOxford.Oxford
>     sun (Oxford, Ala.)(DLC)sn
>     85045023AU@000025803813info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450013991168860731c19869999aluwr ne 0 0eng dsn 86050322
>     CPNengCPNOCLCQmscIndependent (Brewton, Ala.)The Independent.Brewton,
>     Ala. :Jim Thornton,1986-v. :ill. ;58 cm.WeeklyVol. 1, no. 1 (June 19,
>     1986)-United
>     StatesAlabamaEscambiaBrewton.info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 450018957493881231c19859999aluwr ne 0 0eng dsn
>     88050247 CPNengCPNOCLCQmscPiedmont journal-independent (Piedmont,
>     Ala.)The Piedmont journal-independent.Journal independentPiedmont, Ala.
>     :Lane Weatherbee,1985-v.WeeklyVol. 4, no. 52 (Dec. 24, 1985)-Sometimes
>     published as: Journal independent.United
>     StatesAlabamaCalhounPiedmont.Journal-independent(DLC)sn
>     85045014info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450012715821851024d19841985aluwr ne 0 a0eng dsn 85045014
>     CPNengCPNNSDCPNOCLCQmscThe Journal-independent.Piedmont, Ala.
>     :Journal-Independent, Inc.,1984-1985.volumes :illustrations ;58
>     cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 3,
>     no. 27 (July 3, 1984)- v. 4, no. 51 (Dec. 18, 1985).Carries the same
>     vol. numbering as the Piedmont journal-independent.United
>     StatesAlabamaCalhounPiedmont.Piedmont
>     journal-independent0890-6017(DLC)sn 85045013Piedmont journal-independent
>     (Piedmont, Ala.)(DLC)sn 88050247info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 450012691448851018c19839999aludr ne 0 0eng dsn
>     85045007 CPNengCPNOCLCQmscTimesDaily.Times dailyFlorence, Ala. :T.S.P.
>     Newspapers, Inc.,1983-volumes :illustrations ;58
>     cmDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 114,
>     no. 226 (Aug. 14, 1983)-United StatesAlabamaLauderdaleFlorence.Florence
>     times + tri-cities daily(DLC)sn
>     85044995info:srw/schema/1/marcxmlxml00000cas a22000007a
>     45009428489830420d19831987aluir ne 0 a0eng dsn 83007623
>     NSDengNSDCPNNSDNSTOCLCQ89090d0745-75960745-75961The Progress, 152 W. 3rd
>     St., Prattville, AL 36067nsdpmscProgress (Prattville, Ala.)The
>     Progress(Prattville, Ala.)The Progress.Prattville, Ala. :The Prattville
>     Progress,1983-1987.volumes :illustrations ;58 cmThree times a
>     weektexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 98, no.
>     32 (Mar. 17, 1983)-v. 102, no. 7 (Jan. 17, 1987).United
>     StatesAlabamaAutaugaPrattville.Prattville progress(DLC)sn
>     85044740Prattville progress (Prattville, Ala.)1044-0380(DLC)sn
>     88050086(OCoLC)12254317AAPinfo:srw/schema/1/marcxmlxml00000cas a2200000
>     a 45009867255830831c19839999aludr ne 0 a0eng dsn 84008052
>     AAAengAAANSDOCLOCLCQX0743-15110743-15111617760USPST.S.P. Newspapers,
>     Inc., 219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Shoals
>     edition)TimesDaily(Shoals ed.)TimesDaily.Times dailyShoals ed.Florence,
>     Ala. :T.S.P. Newspapersvolumes
>     :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan
> 
> 
>     with: Vol. 114, no. 226 (Aug. 14,
>     1983).\"Florence/Sheffield/Tuscumbia/Muscle Shoals.\"Shoals ed. and
>     Regional ed. combined on Sundays.Description based on: Vol. 114, no. 346
>     (Monday, Dec. 12, 1983).United
>     StatesAlabamaLauderdaleFlorence.TimesDaily (Regional
>     edition)0743-152XTimes Tri-cities dailyUnknownDec. 12,
>     1983info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450010536023840319c19839999aludr ne 0 a0eng dsn 84008051
>     NSDengNSDOCLCQ1x0743-152X0743-152X1617760USPST.S.P. Newspapers, Inc.,
>     219 W. Tennessee St., Florence, AL 35630nsdpTimesDaily (Regional
>     edition)TimesDaily(Regional ed.)TimesDaily.Times dailyRegional
>     ed.Florence, Ala. :T.S.P.
>     NewspapersDailytexttxtrdacontentunmediatednrdamediaBegan with: Vol. 114,
>     no. 226 (Aug. 14, 1983).Shoals ed. and Regional ed. combined on
>     Sundays.Description based on: Vol. 114, no. 346 (Monday, Dec. 12,
>     1983).United StatesAlabamaLauderdaleFlorence.TimesDaily (Shoals
>     edition)0743-1511Times Tri-cities dailyDec. 12,
>     1983AU using 000025818125info:srw/schema/1/marcxmlxml00000cas a22000007a
>     45009049482821213d19821987aludn ne 0 a0eng csn 82008412
>     AAAengAAANSDNPWCPNDLCCPNNSDDLCNSDDLCCPNNVFDLCOCLCQCRLOCLCFOCLCQ1d0745-32210745-32211nsdppccn-us-alNewspaperAdvertiser
> 
> 
>     (Montgomery, Ala.)The Advertiser(Montgomery, Ala.)The advertiser.Alabama
>     journal and advertiserMontgomery, Ala. :Advertiser Co.,1982-1987.volumes
>     :illustrationsDailytexttxtrdacontentunmediatednrdamediavolumencrdacarrier155th
> 
> 
>     year, no. 232 (Nov. 22, 1982)- ; -v. 14-3, Jan. 1, 1987.On Saturdays,
>     Sundays and holidays published as: The Alabama journal and advertiser,
>     Nov. 27, 1982-Jan. 1, 1987.Saturday, Sunday and holiday issues have
>     their own numbering.Montgomery
>     (Ala.)Newspapers.AlabamaMontgomery.fast(OCoLC)fst01202689Newspapers.fast(OCoLC)fst01423814United
> 
> 
>     StatesAlabamaMontgomeryMontgomery.Montgomery advertiser (Montgomery,
>     Ala. : Daily)(DLC)sn 84020645(OCoLC)2685433Montgomery advertiser
>     (Montgomery, Ala. : 1987)0892-4457(DLC)sn
>     87050045(OCoLC)15155895AU using 000020281746info:srw/schema/1/marcxmlxml00000cas
>     a2200000 a 45009237931830218c19829999aluwr ne 0 0eng dsn
>     86050139 AAAengAAACPNOCLOCLCQmscThe Randolph leader.Roanoke, Ala. :David
>     S. Stevenson,1982-volumes :illustrations ;58
>     cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 91,
>     no. 1 (Oct. 6, 1982)-United StatesAlabamaRandolphRoanoke.Roanoke
>     leader(DLC)sn 86050137Randolph press(DLC)sn
>     86050138info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450012715815851024d19821984aluwr ne 0 a0eng dsn 85045013
>     CPNengCPNNSDCPNOCLCQ110890-60170890-60171432080USPSThe Piedmont
>     Journal-Independent, 115 N. Center Ave., Piedmont, AL 36272mscnsdpThe
>     Piedmont journal-independentThe Piedmont journal-independent.Piedmont,
>     Ala. :Piedmont Journal-Independent, Inc.,1982-1984.volumes
>     :illustrations ;58
>     cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 1,
>     no. 1 (Mar. 31, 1982)-v. 3, no. 26 (June 27, 1984).Latest issue
>     consulted: Vol. 5, no. 31 (August 20, 1986).United
>     StatesAlabamaCalhounPiedmont.Piedmont journal(DLC)sn
>     85045012Journal-independent(DLC)sn
>     85045014(OCoLC)12715821AU using 000045312916info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 45009183905830202c19829999aluwr n 0 a0eng dsn
>     85044580 AAAengAAACPNNSDOCLOCLCQ11098-58671098-58671016409USPSNo. 4,
>     Rucker Plaza, Enterprise, AL 36331P.O. Box 1536, Enterprise, AL
>     36331mscnsdpSoutheast sun (Enterprise, Ala.)The southeast
>     sun(Enterprise, Ala.)The Southeast sun.Enterprise, Ala. :QST
>     Publicationsvolumes :illustrations ;58
>     cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan in
>     1982.Description based on: Vol. 1, no. 25 (Oct. 21, 1982).Latest issue
>     consulted: Vol. 16, no. 43 (Mar. 4, 1998).United
>     StatesAlabamaCoffeeEnterprise.AU using 000025827687info:srw/schema/1/marcxmlxml00000cas
> 
> 
>     a22000007a 450010487314840305c19819999aluwr ne 0 a0eng dsn
>     85044906
>     AAAengAAACPNNSDNSTCPNOCLOCLCQOCLCFOCLCOOCLCAOCLCQ900410885-16620885-16621749310USPSThe
> 
> 
>     New Times, 1618 1/2 St. Stephens Rd., Mobile, AL 36603mscnsdpn-us-alNew
>     times (Mobile, Ala.)The New times(Mobile, Ala.)The new times.Mobile,
>     Ala. :New Times Groupvolumes
>     :illustrationsWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierBegan
> 
> 
>     in 1981.Vol. 3, no. 49 (Dec. 15-21, 1983) and vol. 3, no. 50 (Dec.
>     22-28, 1983) are both called vol. 3, no. 49 (Dec. 15-21,
>     1983).Description based on: Vol. 2, no. 3 (Jan. 28-Feb. 3, 1982).African
>     AmericansAlabamaNewspapers.African
>     Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694Newspapers.fast(OCoLC)fst01423814United
> 
> 
>     StatesAlabamaMobileMobile.AAPUnknownAug. 15,
>     1985AU using 000024686659info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450018922463881219d19811983alucr ne 0 0eng dsn 88050233
>     AARengAARCPNNSDOCLCQmscThe Sylacauga daily advance.Advance/Sylacauga
>     dailySylacauga advanceSunday advanceAdvanceSylacauga, Ala. :Mrs. W.A.
>     Moody,1981-1893.v.Semiweekly,<Nov. 24, 1982-Feb. 13, 1983>Daily (except
>     Mon., Tues. & Sat.),<May 26, 1982-Nov. 21, 1982>Daily (except Sat. &
>     Mon.),<Jan. 1, 1981-May 23, 1982>74th Year, no. 123 (Jan. 1, 1981)-76th
>     year, no. 83 (Feb. 13, 1983).Days of publication vary.Published as: The
>     Advance/Sylacauga daily, <Aug. 28, 1981-May 23, 1982>.Published as:
>     Sylacauga advance, <Nov. 24, 1982-Feb. 13, 1983>.On Sunday, published
>     as: Sunday advance.United StatesAlabamaTalladegaSylacauga.Childersburg
>     star(DLC)sn 88050232Coosa press(DLC)sn 86050293Daily
>     home1059-6461(DLC)sn 88050234info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 450021026715cr un|||||||||900209c19809999aluwr ne 0
>     0eng dsn 90099002
>     AARengAARCPNCUSOCLOCLCQTJCOCLCQOCLCFOCLCOOCLCA926143844AU using 000020585756mscn-us-alSpeakin'
> 
> 
>     out news.Speaking out newsDecatur, Ala. :Minority Network,
>     Inc.v.WeeklyBegan in 1980.Published in Huntsville, Ala., <1987>-Also
>     issued by subscription via the World Wide Web.Description based on: Vol.
>     7, no. 8 (Jan. 7-13, 1987).African AmericansAlabamaNewspapers.African
>     American
>     newspapersAlabama.AlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African
> 
> 
>     American newspapers.fast(OCoLC)fst00799278African
>     Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United
>     StatesAlabamaMorganDecatur.United
>     StatesAlabamaMadisonHuntsville.Speakin' out weekly news(DLC)sn
>     88050097http://www.softlineweb.com/softlineweb/ethnic.htminfo:srw/schema/1/marcxmlxml00000cas
> 
> 
>     a22000007a 450014996511861219c19809999aluwr ne 0 a0eng csn
>     86050472
>     AARengAARCPNNSDOCLCQ11080-15021080-15021328110USPSnsdppccWest-Alabama
>     gazetteWest-Alabama gazette.GazetteMillport, Ala. :Millport Pub.
>     Co.,1980-volumesWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrier4th
> 
> 
>     year, no. 32 (Jan. 3, 1980)-United StatesAlabamaLamarMillport.Gazette
>     (Millport, Ala.)(DLC)sn 86050471info:srw/schema/1/marcxmlxml00000cas
>     a2200000 a 450011828156850320c19809999aluwr ne 0 0eng dsn
>     86050314 AAAengAAACPNOCLOCLCQmscThe Hartford news-herald.Hartford, Ala.
>     :Geneva Publications,1980-volumes :illustrations ;57-59
>     cmWeeklytexttxtrdacontentunmediatednrdamediavolumencrdacarrierVol. 80,
>     no. 20 (Feb. 14, 1980)-United StatesAlabamaGenevaHartford.News-herald
>     (Hartford, Ala.)(DLC)sn 86050313info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 450017857788880427d198u198ualusr ne 0 0eng dsn
>     88050097 AARengAARCPNOCLOCLCQOCLCFOCLCOOCLCAmscn-us-alSpeakin' out
>     weekly news.Decatur, Ala. :Smothers PublicationsPublished every first
>     and third Wed. of each monthDescription based on: Vol. 3, no. 13 (May
>     4-17, 1983).African
>     AmericansAlabamaNewspapers.Newspapers.fast(OCoLC)fst01423814African
>     Americans.fast(OCoLC)fst00799558Alabama.fast(OCoLC)fst01204694United
>     StatesAlabamaMorganDecatur.Weekly news (Huntsville, Ala.)(DLC)sn
>     87050012Speakin' out news(DLC)sn
>     90099002info:srw/schema/1/marcxmlxml00000cas a2200000 a
>     450017807936880418c198u9999aluwr ne 0 a0eng dsn 90099001
>     AAAengAAACPNOCLOCLCQThe Daleville Sun-Courier, 310 Daleville Ave.,
>     Daleville, AL 36322mscn-us-alDaleville sun-courier.Daleville, Ala. :QST
>     Publicationsv. :ill. ;58 cm.WeeklyDescription based on: Vol. 2, no. 28
>     (Wed., Feb. 17, 1988).United
>     StatesAlabamaDaleDaleville.AU using 000020585749info:srw/schema/1/marcxmlxml00000cas
> 
> 
>     a22000007a 450015580838870423c198u9999aluwr ne 0 0eng dsn
>     87050128 AARengAARCPNOCLCQmscGreene County independent.Eutaw, Ala.
>     :Greene County Independent, Inc.v.WeeklyDescription based on: Vol. 2,
>     no. 10 (Mar. 12, 1987).United
>     StatesAlabamaGreeneEutaw.info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450010125135831114d198u198ualucr ne 0 a0eng dsn 83003221
>     NSDengNSDOCLCQ0d0746-55210746-55211Auburn Bulletin & Lee County Eagle,
>     PO Box 2111, Auburn, Ala. 36830nsdpThe Auburn bulletin & the Lee County
>     eagleThe Auburn bulletin & the Lee County eagle.Lee County eagleAuburn
>     bulletin and the Lee County eagleAuburn, Ala. :[publisher not
>     identified]Semiweekly,<Sept. 5,
>     1984->WeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
>     Oct. 19, 1983.United StatesAlabamaLeeAuburn.Auburn bulletin(DLC)sn
>     89050006Eagle (Auburn, Ala.)(OCoLC)18435663Sept. 5,
>     1984info:srw/schema/1/marcxmlxml00000cas a22000007a
>     450018370324880818c198u9999aluwr ne 0 0eng dsn 88050147
>     CPNengCPNOCLCQmscTri-city times (Geraldine, Ala.)The Tri-City
>     times.Geraldine, Ala. :Wanda Nelsonv.WeeklyDescription based on: Vol. 2,
>     no. 24 (Jan. 6, 1982).United
>     StatesAlabamaDeKalbGeraldine.info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 450010199338831208c198u9999aluwr ne 0 a0eng dsn
>     83005367 NSDengNSDCPNOCLCQ10746-62770746-62771707590USPSSpringville Pub.
>     Co., 539 Main St., Springville, AL 35146nsdpThe St. Clair clarionThe St.
>     Clair clarion.Saint Clair clarionSpringville, AL :Gary L.
>     ShultsWeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
>     Vol. 2, no. 1 (Jan. 5, 1982).United StatesAlabamaSt.
>     ClairSpringville.AU using 000025783743info:srw/schema/1/marcxmlxml00000cas
>     a22000007a 450013787251860627c198u9999aluwr ne 0 a0eng dsn
>     86001923 NSDengNSDCPNOCLCQ10889-00800889-00801The Westerner Star, P.O.
>     Box 2060, Bessemer, AL 35021nsdpWestern star (Bessemer, Ala.)The Western
>     star(Bessemer, Ala.)The western star.Bessemer, Ala. :Hal
>     HodgensWeeklytexttxtrdacontentunmediatednrdamediaDescription based on:
>     Vol. 3, no. 15 (Wednesday, June 11, 1986).United
>     StatesAlabamaJeffersonBessemer.Bessemer advertiser(DLC)sn
>     87050117AU using 000025805174511.1srw.pc any \"y\" and srw.mt any
>     \"newspaper\" and srw.cp exact
>     \"Alabama\"50info:srw/schema/1/marcxmlxml1Date,,0mq1lME887FoIbjulKUV6bx9ImwWQNCv9GqZzGS92IKS31lEbcpRJBNHgcE1l29tFaHP9CHe0Yexk1uWQofffull"
> 
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 27 2022, 09:22am via System
> 
>     Hello Spencer,
> 
>     Thank you for reaching out about the bulk xml files for the US
>     Newspaper Directory.
> 
>     We don't have documentation specific to these bulk xml files, but upon
>     further inspection I can say that each of those files doesn't
>     necessarily contain info for 50 newspaper titles. The structure of the
>     titles for California and New York, for instance, is different from,
>     say, Alabama.
> 
>     If you look at California for example, the file naming structure
>     indicates the year the title started, and then the number of titles
>     included in that xml file. So for instance, the files below include
>     info for newspapers that started in 2000, 2001, and 2002 respectively.
>     And there is info for 30 titles in the xml file from 2000, and 14 in
>     the file for 2001, and so on.
> 
>         * ndnp_California_2000_e_0001_0030.xml
>         * ndnp_California_2001_e_0001_0014.xml
>         * ndnp_California_2002_e_0001_0012.xml
> 
>     If there are more than 50 titles for a given year, say for California
>     starting in 1880, then the next 50 titles will roll into the next xml
>     file, and so on. And the last xml file for that year may not include 50
>     titles.
> 
>     Many of the states seem to group all the years together, so each xml
>     file contains 50 titles, until possibly the last one for a given state,
>     which may contain fewer.
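
A minimal sketch in base R of how the naming convention described above might be unpacked. The helper parse_bulk_name() is hypothetical, and it assumes the last two numbers in each name are the first and last record positions, so that a file holds last - first + 1 titles. That matches the three California examples but is not documented anywhere in this thread.

```r
# Hypothetical helper (not part of any package): split bulk file names of the
# form ndnp_<state>_<years>_e_<first>_<last>.xml into their parts.
parse_bulk_name <- function(fn) {
  pat <- "^ndnp_(.+)_([^_]+)_e_([0-9]+)_([0-9]+)\\.xml$"
  m <- regmatches(fn, regexec(pat, fn))
  # One row per file name; names that do not match the pattern give NA.
  parts <- t(vapply(m, function(x) {
    if (length(x) == 5) x[2:5] else rep(NA_character_, 4)
  }, character(4)))
  first <- as.integer(parts[, 3])
  last <- as.integer(parts[, 4])
  data.frame(file = fn, state = parts[, 1], years = parts[, 2],
             first = first, last = last,
             # Assumption: <first> and <last> are record positions, so the
             # file holds last - first + 1 titles.
             n_titles = last - first + 1L,
             stringsAsFactors = FALSE)
}

files <- c("ndnp_Alabama_all-yrs_e_0001_0050.xml",
           "ndnp_California_2000_e_0001_0030.xml",
           "ndnp_California_2001_e_0001_0014.xml",
           "ndnp_California_2002_e_0001_0012.xml")
info <- parse_bulk_name(files)
print(info[, c("state", "years", "n_titles")])
```

Summing n_titles over all file names would give an estimate of the number of records without assuming 50 per file.
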
> 
>     I hope this information helps explain the total number of records and
>     structure a bit better. Let me know if you have any further questions.
> 
>     Best wishes,
> 
>     Kerry Huller
>     Newspaper & Current Periodical Reading Room
>     Serial & Government Publications Division
>     Library of Congress
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 25 2022, 02:22pm via Email
> 
>     Hi, Kerry:
> 
> 
>     Might there be documentation on the XML files you mentioned?
> 
> 
>     I've successfully read
>     'https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/',
>     extracted the names of 6666 XML files, and read the first one,
>     "ndnp_Alabama_all-yrs_e_0001_0050.xml". It contains 29415 characters,
>     beginning, "1.12250info:srw/schema/1/marcxmlxml00000nas a22000007i
>     45001030438981180404c20159999aluwr n 0 a0eng ". With a bit
>     more effort, I will likely be able to parse all 6666 of these. The
>     names suggest that each contains information on 50 newspapers, totaling
>     333,300. The main page
>     "https://chroniclingamerica.loc.gov/search/titles/" says there are only
>     157,521 "Titles currently listed". This suggests that these XML files
>     include placeholders for a little more than double the number of
>     entries currently in
>     "https://chroniclingamerica.loc.gov/search/titles/".
> 
> 
>     Thanks for this.
> 
> 
>     Progress.
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 07 2022, 08:55am via System
> 
>     Hi Spencer,
> 
>     I thought of one more option after I emailed you yesterday that I
>     wanted to make you aware of.
> 
>     I had explained the other day how we pull the records from OCLC into
>     our U.S. Newspaper Directory. You can also access all of the raw MARC
>     records found in the directory in xml format from here if you choose:
>     https://chroniclingamerica.loc.gov/data/bib/worldcat_titles/bulk5/
>     These will provide you all of the data from the record fields in MARC
>     format, so you'd get all the data you see here for example:
>     https://chroniclingamerica.loc.gov/lccn/sn98059792/marc/ but in xml.
>     I don't know if this might be more data and info than you want to work
>     with, but wanted to make sure you were aware of this option as well.
> 
>     Best wishes,
> 
>     Kerry Huller
>     Newspaper & Current Periodical Reading Room
>     Serial & Government Publications Division
>     Library of Congress
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 06 2022, 10:55am via System
> 
>     Hi Spencer,
> 
>     Thanks for reaching out again. I have been looking at the json view a
>     bit closer this morning and your example of "9999."
> 
>     After talking with a colleague this morning and looking at various
>     examples, I see there is some variation in how the titles with
>     either an unknown starting/ending date or currently published titles
>     are being handled - depending on the view.
> 
>     As an example, I completed a search in the directory for Alaska and the
>     city of Anchorage. There are 80 results, and on the first page of
>     results you'll see # 4. Fort Richardson news, which was published from
>     1952-19??. The csv view of this state/city search result will show the
>     ending date of 19??. But if I append &format=json to this search
>     result, this specific title will show an ending date of 1999. After
>     talking with a colleague this morning, I discovered an integer had to
>     be used in these cases where dates were "?" so that the search based on
>     year range would work. Similarly, if you look at # 12 Alaska digest,
>     which was published 1994-current, the "current" becomes "9999" in the
>     json view. So, the records you are seeing with "9999" would most likely
>     be titles with an ending date of "current."
> 
>     However, there is an issue with the unknown dates, like "1999" being
>     used for "19??" in the example above. The "9" does not get inserted in
>     place of "?" when you are looking at the title/LCCN view of a specific
>     newspaper. So for instance, if you view the #4 title: Fort Richardson
>     news at this url:
>     https://chroniclingamerica.loc.gov/lccn/sn98059792/
>     but append .json to the end of the url, after the LCCN, like this:
>     https://chroniclingamerica.loc.gov/lccn/sn98059792.json
>     you'll see that the end_year is "19??." Viewing the title/LCCN json
>     view for titles that are currently published will also show the
>     end_year as "current." The Alaska digest example from above can be
>     viewed here:
>     https://chroniclingamerica.loc.gov/lccn/sn97060056.json
> 
>     I wasn't aware of the difference between the directory search json view
>     and the title/LCCN view. But I think it would be possible to grab
>     the data from the title/LCCN json url through an additional script
>     potentially. The json url is included in the view under the "url" field.
> 
>     Of course, there are unknowns with publishing dates, but better to know
>     where the question marks are, and what titles are considered to be
>     current.
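
A rough sketch in base R of how the end_year strings from the title/LCCN json view might be sorted into "known", "uncertain" and "current". The function classify_end_year() is hypothetical. Treating "9999" as "current" follows the "most likely" reading given above and is an assumption, not a documented rule.

```r
# Hypothetical helper: label end_year strings as they appear in the
# title/LCCN json view, e.g. "1987", "19??", "current".
classify_end_year <- function(end_year) {
  end_year <- as.character(end_year)
  status <- ifelse(end_year %in% c("current", "9999"), "current",
             ifelse(grepl("^[0-9]{4}$", end_year), "known", "uncertain"))
  # Keep a numeric year only where the end date is fully known.
  year <- ifelse(status == "known",
                 suppressWarnings(as.integer(end_year)), NA_integer_)
  data.frame(end_year = end_year, status = status, year = year,
             stringsAsFactors = FALSE)
}

res <- classify_end_year(c("1987", "19??", "current", "9999"))
print(res)
```

Anything that is neither four digits nor "current" (for example "19??") is labelled "uncertain", so the question marks stay visible instead of being replaced by 9s.
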
> 
>     I hope this clarifies the data a bit more - let me know if any of it
>     needs more clarification though. And let me know if you have follow-up
>     questions.
> 
>     Thank you,
> 
>     Kerry Huller
>     Newspaper & Current Periodical Reading Room
>     Serial & Government Publications Division
>     Library of Congress
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 05 2022, 04:42pm via Email
> 
>     Hi, Kerry:
> 
> 
>     What would you suggest I do to get a count of the numbers of
>     newspapers and publishers operating by year from, say, 1790 to 2021?
> 
> 
>     I just determined that 20630 (13 percent) of the 157520 records in
>     the US Newspaper database I downloaded a week ago have end_year = 9999.
>     I don't think it's feasible to assume that all or even most of those
>     are still publishing.
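
A minimal sketch in base R of the count being asked about, assuming integer start_year and end_year vectors like those in the search json view. The function count_active() and the toy values are hypothetical. Note that any record with end_year = 9999 is counted as active in every year after it started, which is exactly why those records are a concern.

```r
# Hypothetical helper: number of titles with start_year <= y <= end_year
# for each year y in 'years'.
count_active <- function(start_year, end_year, years) {
  vapply(years, function(y) {
    sum(start_year <= y & end_year >= y, na.rm = TRUE)
  }, integer(1))
}

# Toy data, not taken from the directory.
start_year <- c(1800L, 1850L, 1900L)
end_year <- c(1860L, 1855L, 9999L)
active <- count_active(start_year, end_year, c(1790L, 1852L, 1950L))
print(active)
```
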
> 
> 
>     Might there be some other database that might have this kind of
>     information?
> 
> 
>     I ask, because Robert McChesney (2004) The Problem of the Media
>     (Monthly Review Pr., esp. pp. 34-35) suggests that in the first half of
>     the nineteenth century, the US had more newspapers and newspaper
>     publishers per capita than any other place or time. He suggests that
>     that diversity of newspapers helped encourage literacy and limit
>     political corruption, both of which helped propel the young US to its
>     current dominance of the international political economy. I'm hoping to
>     get some data to evaluate this claim. Sadly, it looks like there is too
>     much missing and questionable data in this dataset for me to use this
>     without a fairly substantive data cleaning effort.
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 05 2022, 09:05am via System
> 
>     Hello Spencer,
> 
>     Thank you for reaching out about your additional questions.
> 
>     I was looking at the records you mention above, and yes, you are
>     correct - those 9 records with the date inconsistencies and the one
>     record for The New Mexican mining news
>     <https://chroniclingamerica.loc.gov/lccn/sn93061507/> containing
>     "Santa Fe.\" have typos in them. Thanks for spotting these - it may be
>     possible to have the cataloger in our division correct those typos. I
>     will look into this further.
> 
>     The U.S. Newspaper Directory doesn't have a connection with
>     Wikimedia or Wikipedia. The Library of Congress periodically pulls the
>     records for the Directory from OCLC Worldcat
>     <https://www.oclc.org/en/worldcat.html>. And those newspaper records
>     in OCLC Worldcat have been created by catalogers at various
>     institutions around the U.S. over the span of several years. So,
>     occasionally, you will find a typo in the records. Corrections can be
>     made by OCLC and library staff at the various institutions. Every time
>     we complete a new pull on the OCLC records, any corrected records will
>     then populate our Directory.
> 
>     Regarding your question on the New-York weekly journal - yes, that is
>     also correct that it has two records. There is actually a record for
>     each format of the newspaper, so this record is for the microfilm
>     format <https://chroniclingamerica.loc.gov/lccn/2009252748/> and this
>     one is for the original print format
>     <https://chroniclingamerica.loc.gov/lccn/sn83030211/>. You can see in
>     the heading for the microfilm record where it says [microfilm reel] and
>     the print version shows [volume]. You are likely to see this for other
>     titles as well because each format has been cataloged with its own
>     LCCN. You are also likely to see additional records with [online
>     resource] identified as the format as more and more titles are
>     available as ePrints or online.
> 
>     I hope this helps answer your additional questions a bit more. Please
>     reach out if you have any other questions.
> 
>     Thank you,
> 
>     Kerry Huller
>     Newspaper & Current Periodical Reading Room
>     Serial & Government Publications Division
>     Library of Congress
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 04 2022, 01:47pm via Email
> 
>     Hi, Kerry:
> 
> 
>     At the risk of bombing your inbox with more emails than you want,
>     what is your relationship with Wikipedia and other Wikimedia Foundation
>     projects like Wikidata?
> 
> 
>     I ask, because I've logged over 20,000 edits in Wikimedia Foundation
>     projects since 2010, and I would happily try to answer questions about
>     Wikidata and other Wikimedia Foundation projects. I have NOT organized
>     an edit-a-thon, but I've made presentations at conferences with people
>     who have, and I would happily try to help organize such if you could
>     find a group of people who want to work to improve this US Newspaper
>     database. I think it would be good to establish links between this US
>     Newspaper database and Wikidata, with appropriate procedures so changes
>     to one could be evaluated for acceptance into the other.
> 
> 
>     FYI, John Peter Zenger's famous "New-York weekly journal" (1733-1751)
>     appears TWICE in your database with lccn = 2009252748 and sn83030211 and
>     ONCE in Wikidata WITHOUT an lccn, even though many other Wikidata items
>     have an lccn. See:
> 
> 
>     https://www.wikidata.org/wiki/Q23091960
> 
> 
>     There's a "WikiProject Newspapers" on Wikipedia and a companion
>     "WikiProject Periodicals" on Wikidata:
> 
> 
>     https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Newspapers/Wikidata
> 
> 
>     https://www.wikidata.org/wiki/Wikidata:WikiProject_Periodicals
> 
> 
>     I've tried to connect with others on those projects, so far with only
>     limited success. However, you may know that almost anyone can change
>     almost anything on Wikipedia and other Wikimedia Foundation projects.
>     What stays tends to be written from a neutral point of view citing
>     credible sources. They have problems with vandals, but the problems are
>     usually easily controlled. This makes Wikipedia and Wikidata very
>     useful platforms for cleaning up databases like your US Newspaper
>     dataset.
> 
> 
>     Spencer Graves
> 
> 
>     ##########
> 
> 
>     Hello, Kerry:
> 
> 
>     In addition to the invalid JSON, discussed below [NOTE: The "below"
>     contains a slight addition to the report I sent last Friday.], I
>     found 9 (NINE!) cases where start_year was AFTER end_year. These have
>     lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926"
>     "sn99065409" "sn89065002" "sn98069857" "sn91059179"
> 
> 
>     See:
> 
> 
>     https://chroniclingamerica.loc.gov/lccn/sn86071531/
>     https://chroniclingamerica.loc.gov/lccn/sn95069213/
>     https://chroniclingamerica.loc.gov/lccn/sn90059096/
>     https://chroniclingamerica.loc.gov/lccn/sn86058451/
>     https://chroniclingamerica.loc.gov/lccn/sn90060926/
>     https://chroniclingamerica.loc.gov/lccn/sn99065409/
>     https://chroniclingamerica.loc.gov/lccn/sn89065002/
>     https://chroniclingamerica.loc.gov/lccn/sn98069857/
>     https://chroniclingamerica.loc.gov/lccn/sn91059179/
> 
> 
>     These all have obvious coding errors that can be easily fixed. The
>     data may not be completely accurate after the fix, but at least they are
>     not obviously wrong ;-)
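
A short sketch in base R of the kind of check that flags such records, assuming a data frame with integer start_year and end_year columns. The column names mirror the json fields mentioned above. The toy rows and their lccn values are placeholders, not real records.

```r
# Toy data; the lccn values are placeholders, not real records.
titles <- data.frame(lccn = c("a", "b", "c"),
                     start_year = c(1850L, 1901L, 1975L),
                     end_year = c(1860L, 1891L, 9999L),
                     stringsAsFactors = FALSE)

# Flag rows where the title appears to end before it starts.
reversed <- with(titles, !is.na(start_year) & !is.na(end_year) &
                   start_year > end_year)
print(titles[reversed, ])
```
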
> 
> 
>     ##################
> 
>     I got invalid JSON from:
> 
> 
>     https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
> 
> 
>     After some experimentation, I was able to replicate the problem with
>     a request for rows=10:
> 
> 
>     https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
> 
> 
>     Duncan Temple Lang <dtemplelang using ucdavis.edu>, Professor of
>     Statistics and Associate Dean for Graduate Programs at the University
>     of California - Davis, confirmed that it was a JSON error using:
> 
> 
>     https://codebeautify.org/jsonvalidator
> 
> 
>     He is part of the core team developing R, the free, open-source
>     programming language. He said that starting at offsets 161070 and
>     161502 in the character string you get from [the R code RCurl::getURL()]
>     we have:
> 
> 
>     Santa Fe.\"
> 
> 
>     and these are in an entry such as
> 
> 
>     "city": ["Santa Fe.\"]
> 
> 
>     So the final " is escaped and therefore there is no closing " for the
>     string. The parser continues to consume characters looking for the end
>     of that string.
> 
> 
>     If one "repairs" the text from getURL() with
> 
> 
>     ftxt <- gsub('Santa Fe\\.\\\\"', 'Santa Fe."', txt)
> 
> 
>     then the rest of my code worked fine.
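
A self-contained sketch in base R of the same repair on a small made-up string, so the escaping can be checked without downloading anything. The value of txt is an illustrative stand-in for what getURL() returns. This version uses fixed = TRUE, so the pattern is matched literally rather than as a regular expression.

```r
# Made-up fragment with the same defect: the closing quote of the city
# string is preceded by a backslash, so a JSON parser sees it as escaped.
txt <- '{"city": ["Santa Fe.\\"], "state": ["New Mexico"]}'

# Where does the defect start? gregexpr() returns -1 if it is absent.
hits <- gregexpr('Santa Fe.\\"', txt, fixed = TRUE)[[1]]
print(hits)

# Drop the stray backslash; with fixed = TRUE no regex escaping is needed.
ftxt <- gsub('Santa Fe.\\"', 'Santa Fe."', txt, fixed = TRUE)
cat(ftxt, "\n")
```
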
> 
> 
>     You may wish to do something to implement other checks for valid JSON
>     and repair this problem. I've scanned all the 157520 records that were
>     in that database a couple of days ago, and this is the only JSON error
>     identified by the code I used.
> 
> 
>     NOTE: I was NOT able to replicate this error when downloading records
>     one at a time. That suggests a problem NOT in the database itself but
>     in the download algorithm. ???
> 
> 
>     Thank you for your help. I will almost certainly have other
>     questions ;-)
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 03 2022, 10:39pm via Email
> 
>     Hello, Kerry:
> 
> 
>     In addition to the invalid JSON, discussed below [NOTE: The "below"
>     contains a slight addition to the report I sent last Friday.], I
>     found 9 (NINE!) cases where start_year was AFTER end_year. These have
>     lccn = "sn86071531" "sn95069213" "sn90059096" "sn86058451" "sn90060926"
>     "sn99065409" "sn89065002" "sn98069857" "sn91059179"
> 
> 
>     See:
> 
> 
>     https://chroniclingamerica.loc.gov/lccn/sn86071531/
>     https://chroniclingamerica.loc.gov/lccn/sn95069213/
>     https://chroniclingamerica.loc.gov/lccn/sn90059096/
>     https://chroniclingamerica.loc.gov/lccn/sn86058451/
>     https://chroniclingamerica.loc.gov/lccn/sn90060926/
>     https://chroniclingamerica.loc.gov/lccn/sn99065409/
>     https://chroniclingamerica.loc.gov/lccn/sn89065002/
>     https://chroniclingamerica.loc.gov/lccn/sn98069857/
>     https://chroniclingamerica.loc.gov/lccn/sn91059179/
> 
> 
>     These all have obvious coding errors that can be easily fixed. The
>     data may not be completely accurate after the fix, but at least they are
>     not obviously wrong ;-)
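The start_year / end_year check described above can be sketched in a few lines of base R. This is only an illustration, not the code actually used: the data frame `titles`, all of its year values, and the placeholder lccn "sn00000000" are assumptions made up for the example. Only the two real lccn values and the URL pattern come from the message.

```r
# Hypothetical rows: the first two lccn values appear in the message above,
# but every year below (and the third lccn) is invented for illustration.
titles <- data.frame(
  lccn       = c("sn86071531", "sn95069213", "sn00000000"),
  start_year = c(1905L, 1921L, 1880L),
  end_year   = c(1900L, 1912L, 1899L),
  stringsAsFactors = FALSE
)

# Flag records whose start year is later than their end year.
# which() drops NA comparisons, so rows with a missing year are not flagged.
bad <- titles[which(titles$start_year > titles$end_year), ]

# Build the record URLs in the same form as the links listed above.
bad$url <- paste0("https://chroniclingamerica.loc.gov/lccn/", bad$lccn, "/")
print(bad)
```

If the years arrive as character strings, they would need converting (for example with as.integer()) before the comparison.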
> 
> 
>     ##################
> 
>     I got invalid JSON from:
> 
> 
>     https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
>     <https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json>
> 
> 
>     After some experimentation, I was able to replicate the problem with
>     a request for rows=10:
> 
> 
>     https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
>     <https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json>
> 
> 
>     Duncan Temple Lang <dtemplelang using ucdavis.edu
>     <mailto:dtemplelang using ucdavis.edu>>, Professor of Statistics
>     and Associate Dean for Graduate Programs at the University of California
>     - Davis, confirmed that it was a JSON error using:
> 
> 
>     https://codebeautify.org/jsonvalidator
>     <https://codebeautify.org/jsonvalidator>
> 
> 
>     He is part of the core team developing R, the free, open-source
>     programming language. He said that, starting at offsets 161070 and
>     161502 in the character string you get from [the R code RCurl::getURL()],
>     we have:
> 
> 
>     Santa Fe.\"
> 
> 
>     and these are in an entry such as
> 
> 
>     "city": ["Santa Fe.\"]
> 
> 
>     So the final " is escaped and therefore there is no closing " for the
>     string. The parser continues to consume characters looking for the end
>     of that string.
> 
> 
>     If one "repairs" the text from getURL() with
> 
> 
>     ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)
> 
> 
>     then the rest of my code worked fine.
> 
> 
>     You may wish to do something to implement other checks for valid JSON
>     and repair this problem. I've scanned all the 157520 records that were
>     in that database a couple of days ago, and this is the only JSON error
>     identified by the code I used.
> 
> 
>     NOTE: I was NOT able to replicate this error when downloading records
>     one at a time. That suggests a problem NOT in the database itself but
>     in the download algorithm. ???
> 
> 
>     Thank you for your help. I will almost certainly have other
>     questions ;-)
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jul 01 2022, 11:46am via Email
> 
>     Hello, Kelly:
> 
> 
>     I got invalid JSON from:
> 
> 
>     https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json
>     <https://chroniclingamerica.loc.gov/search/titles/results/?rows=500&page=103&format=json>
> 
> 
>     After some experimentation, I was able to replicate the problem with
>     a request for rows=10:
> 
> 
>     https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json
>     <https://chroniclingamerica.loc.gov/search/titles/results/?rows=10&page=5117&format=json>
> 
> 
>     Duncan Temple Lang <dtemplelang using ucdavis.edu
>     <mailto:dtemplelang using ucdavis.edu>>, Professor of Statistics
>     and Associate Dean for Graduate Programs at the University of California
>     - Davis, confirmed that it was a JSON error using:
> 
> 
>     https://codebeautify.org/jsonvalidator
>     <https://codebeautify.org/jsonvalidator>
> 
> 
>     He is part of the core team developing R, the free, open-source
>     programming language. He said that, starting at offsets 161070 and
>     161502 in the character string you get from [the R code RCurl::getURL()],
>     we have:
> 
> 
>     Santa Fe.\"
> 
> 
>     and these are in an entry such as
> 
> 
>     "city": ["Santa Fe.\"]
> 
> 
>     So the final " is escaped and therefore there is no closing " for the
>     string. The parser continues to consume characters looking for the end
>     of that string.
> 
> 
>     If one "repairs" the text from getURL() with
> 
> 
>     ftxt= gsub('Santa Fe.\\\\"', 'Santa Fe."', txt)
> 
> 
>     then the rest of my code worked fine.
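For readers who want to try the repair on a small case, here is a minimal base-R sketch. It assumes a short stand-in string rather than the real download, and it uses fixed = TRUE (literal matching) instead of the regular-expression form quoted above, so it is a variation on that fix rather than the original code. The names `txt`, `ftxt`, and `has_escaped_close` are made up for the example.

```r
# Stand-in for the text returned by RCurl::getURL(); the real response is
# far longer. In an R single-quoted literal, '\\' is one backslash.
txt <- '{"city": ["Santa Fe.\\"]}'

# Crude check (an assumption, not a JSON validator): a backslash-escaped
# quote sitting directly before a closing bracket is suspicious.
has_escaped_close <- function(x) grepl('\\"]', x, fixed = TRUE)

# Literal replacement: drop the stray backslash before the closing quote.
ftxt <- gsub('Santa Fe.\\"', 'Santa Fe."', txt, fixed = TRUE)

cat(txt, "\n")
cat(ftxt, "\n")
print(c(before = has_escaped_close(txt), after = has_escaped_close(ftxt)))
```

With fixed = TRUE the pattern is taken literally, so the "." after "Santa Fe" matches only a period and the backslash needs no regular-expression escaping.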
> 
> 
>     You may wish to do something to implement other checks for valid JSON
>     and repair this problem. I've scanned all the 157520 records that were
>     in that database a couple of days ago, and this is the only JSON error
>     identified by the code I used.
> 
> 
>     Thank you for your help. I will almost certainly have other
>     questions ;-)
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jun 28 2022, 02:20pm via System
> 
>     Hello Spencer,
> 
>     Thank you for sending along your follow-up questions.
> 
>     I'm glad to hear the json view will work for you. It was recommended to
>     me that you limit your requests to 500 rows at a time. And a developer
>     here at LC suggests the following regarding rate limiting:
> 
>     “To avoid being blocked by the server, the current rate-limiting rules
>     restrict un-cached requests to URLs starting with
>     <https://chroniclingamerica.loc.gov/search/> to 120 requests every 10
>     minutes from a single IP address.”
> 
>     So, I think if you limited each of your requests to 500 rows at a time
>     with the proper pauses, then you should be able to access what you need.
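The "proper pauses" follow from the quoted limit: 120 requests per 10 minutes works out to one request every 5 seconds. The sketch below shows one way to pace requests in base R. The function `fetch_paced` and its arguments are hypothetical names for this example, and the demonstration uses a stand-in fetcher so that nothing is actually downloaded.

```r
max_requests   <- 120       # from the quoted rate-limiting rule
window_seconds <- 10 * 60   # 10 minutes, in seconds
min_pause      <- window_seconds / max_requests
print(min_pause)            # seconds to wait between requests

# Call fetch() on each URL in turn, sleeping between calls.
# In real use fetch could be RCurl::getURL; that is not run here.
fetch_paced <- function(urls, fetch, pause = min_pause) {
  out <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    out[[i]] <- fetch(urls[i])
    if (i < length(urls)) Sys.sleep(pause)  # no sleep after the last one
  }
  out
}

# Demonstration with a stand-in fetcher and no pause.
demo <- fetch_paced(c("a", "b"), fetch = toupper, pause = 0)
print(demo)
```

A slightly longer pause than the minimum leaves some margin if other requests from the same IP address count against the limit.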
> 
>     As for the csv view, I checked on this as well, and was informed that
>     the csv view was not implemented for all url formats. The csv view was
>     only implemented for this view:
>     <https://chroniclingamerica.loc.gov/newspapers/> and urls resulting from
>     US Directory search results - e.g., if you wanted to narrow down your
>     search results by state, city, date range, etc. found at this link:
>     <https://chroniclingamerica.loc.gov/search/titles/>. So, if you wanted a
>     csv and limited your search by state (for example:
>     <https://chroniclingamerica.loc.gov/search/titles/results/?state=Alaska&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv>
>     ), you could append &format=csv to the search result url and get the csv
>     to automatically download. But, if your search results ended up being
>     over a couple thousand titles, then the system would probably time out.
> 
>     I hope this info helps! Let me know if you have any other questions.
> 
>     Best wishes,
> 
>     Kerry Huller
>     Newspaper & Current Periodical Reading Room
>     Serial & Government Publications Division
>     Library of Congress
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jun 27 2022, 04:15pm via Email
> 
>     Hello, Kerry:
> 
> 
>     Thanks for the reply. Can you please give me some further guidance
>     on two things "so that the system is not overwhelmed"?
> 
> 
>     1. The max size in a small batch?
> 
> 
>     2. Any limit on the number of small batches in a second or minute?
> 
> 
>     I've found that I can download small batches under program control
>     with "RCurl::getURL" in R (programming language), e.g.:
> 
> 
>     https://chroniclingamerica.loc.gov/search/titles/results/?rows=20&page=2&format=json
>     <https://chroniclingamerica.loc.gov/search/titles/results/?rows=20&page=2&format=json>
> 
> 
>     With this, I can control the batch size with "rows=20" vs. "rows=50"
>     vs., e.g., "rows=1000". A naive search says there are 157520 "results".
>     With "rows=1000", this would require 158 calls. With "rows=20", it
>     would require 7876 calls. Before I start, I need to decide which fields
>     I want; I don't need them all.
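The call counts above are just the result count divided by the batch size, rounded up. A small base-R sketch of that arithmetic, plus a helper that builds the page URLs in the form shown earlier in this message, might look like this. The names `n_results`, `calls`, `base_url`, and `page_url` are hypothetical, and the 500-row case is included only because that size is suggested elsewhere in this thread.

```r
n_results <- 157520            # result count reported by the naive search
rows      <- c(20, 500, 1000)  # candidate batch sizes

# Number of requests needed for each batch size, rounding up.
calls <- ceiling(n_results / rows)
names(calls) <- paste0("rows=", rows)
print(calls)

# Build the URL for a given page and batch size, following the
# ?rows=...&page=...&format=json pattern used in the example above.
base_url <- "https://chroniclingamerica.loc.gov/search/titles/results/"
page_url <- function(page, rows) {
  sprintf("%s?rows=%d&page=%d&format=json",
          base_url, as.integer(rows), as.integer(page))
}

print(page_url(2, 20))
```

Because sprintf() is vectorized, page_url(1:3, 500) returns the first three page URLs in one call.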
> 
> 
>     Thanks,
>     Spencer Graves
> 
> 
>     p.s. I tried appending "&format=csv" and got "Error 504 Ray ID:
>     7220896da85e86e7 • 2022-06-27 19:19:53 UTC Gateway time-out". I used:
> 
> 
>     https://chroniclingamerica.loc.gov/search/titles/results/?state=&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv
>     <https://chroniclingamerica.loc.gov/search/titles/results/?state=&county=&city=&year1=1690&year2=2022&terms=&frequency=&language=&ethnicity=&labor=&material_type=&lccn=&rows=20&format=csv>
> 
> 
>     I can get what I want using json so do not need csv. However, I
>     thought you might want to know that I was unable to get csv to work.
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jun 27 2022, 10:54am via System
> 
>     Hello Spencer,
> 
>     Thank you for contacting the Library of Congress about searching the US
>     Newspaper Directory. I wanted to follow up with you regarding your
>     request to output the data in a machine readable format.
> 
>     It looks like you were provided the link to the API documentation for
>     the website: About the Site and API
>     <https://chroniclingamerica.loc.gov/about/api/>. Scroll down to the
>     section with the heading, Searching the directory and newspaper pages
>     using OpenSearch. This section describes the search functionality and
>     structure for the US Newspaper Directory in more detail. It is possible
>     to return your directory searches in json format by appending
>     &format=json to the end of the url. It is also possible to return search
>     results in csv format by appending &format=csv to the end of the url,
>     but I would strongly suggest that you do this in small batches by
>     putting limits on your search so that the system is not overwhelmed.
> 
>     So, from the search page for the US Newspaper Directory
>     <https://chroniclingamerica.loc.gov/search/titles/> you could
>     potentially limit your search based on state and city, or date range,
>     and/or even frequency. Then once you've completed the search, you can
>     add &format=csv to the end of the url to automatically download a csv of
>     those records. The resulting csv will contain several fields/headers:
>     lccn, title, place of publication, start year, end year, publisher,
>     edition, frequency, subject, state, city, country, language, oclc
>     number, and holding type. I think these fields include the information
>     you were looking for. But, again, I would like to stress that you put
>     limits on your search before creating the csv so as not to overwhelm the
>     system.
> 
>     Please let me know if you have any other additional questions.
> 
>     Best wishes,
> 
>     Kerry Huller
>     Newspaper & Current Periodical Reading Room
>     Serial & Government Publications Division
>     Library of Congress
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jun 23 2022, 01:55pm via System
> 
>     Mr. Graves,
> 
>     I'm going to transfer your request to a member of our digital collections
>     team who may be of more assistance to you than me.
> 
>     Mike
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jun 23 2022, 01:51pm via Email
> 
>     Dear Mr. Queen:
> 
> 
>     Thanks for the reply. I'm still confused. I downloaded and
>     installed Docker Desktop and "docker-compose.yml" and ran their "Getting
>     Started" Tutorial, but I don't see what to do next.
> 
> 
>     I repeat: I'd like to analyze "U.S. Newspaper Directory,
>     1690-Present" (https://chroniclingamerica.loc.gov/search/titles/
>     <https://chroniclingamerica.loc.gov/search/titles/>), which
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jun 22 2022, 07:15pm via System
> 
>     Mr. Graves,
> 
>     Programmatic access to the data for Chronicling America
>     <https://chroniclingamerica.loc.gov/> and possibly the U.S. Newspaper
>     Directory <https://chroniclingamerica.loc.gov/search/titles/> can be
>     found on the About the Site and API
>     <https://chroniclingamerica.loc.gov/about/api/> page in various formats.
>     Also, please note that Chronicling America contains newspapers published
>     from 1777-1963, but does not include every U.S. newspaper published in
>     that time period.
> 
>     Please let me know if I can be of further assistance.
> 
> 
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jun 22 2022, 06:14pm via Email
> 
>     Dear Mr. Queen:
> 
> 
>     Can we simplify this to just giving me the data behind "U.S.
>     Newspaper Directory, 1690-Present"
>     (https://chroniclingamerica.loc.gov/search/titles/
>     <https://chroniclingamerica.loc.gov/search/titles/>) in a machine
>     readable format, e.g., csv or xlsx or a MySQL database?
> 
> 
>     As I mentioned in my original email, a naive search of that without
>     restrictions returned 157520 titles in 7876 pages with up to 20 titles
>     per page giving date ranges in at least some cases. I could probably
>     write software to scrape those 7876 pages from your web site and combine
>     them into a data file.
> 
> 
>     I have a PhD in statistics and have been using the R programming
>     language and similar software for decades. This includes publishing
>     tutorials on how to analyze data like this on Wikiversity.[1] I'd like
>     to do something similar with this. I could help make your data more
>     useful to others and discuss with you how we might prioritize
>     improvements like accessing the other sources you mentioned.
> 
> 
>     Thanks very much for your reply.
> 
> 
>     Sincerely,
>     Spencer Graves, PhD
>     Founder, EffectiveDefense.org
>     4550 Warwick Blvd 508
>     Kansas City, MO 64111
>     m: 408-655-4567
> 
> 
>     [1] e.g.:
> 
> 
>     https://en.wikiversity.org/wiki/US_Gross_Domestic_Product_(GDP)_per_capita
>     <https://en.wikiversity.org/wiki/US_Gross_Domestic_Product_(GDP)_per_capita>
>     ------------------------------------------------------------------------
> 
>     Newspapers and Current Periodicals Reference Librarian
> 
>     Jun 22 2022, 05:27pm via System
> 
>     Mr. Graves
> 
>     Your request is a little more complex than it first appears and requires
>     extensive research. A variety of resources should be consulted to
>     determine the circulation statistics of newspapers published prior to
>     1851. You will need to check newspaper union lists and newspaper
>     histories. Union lists present lists of newspapers in geographic
>     arrangement according to place of publication, and specify which
>     libraries or other institutions hold collections of those newspapers and
>     the dates of their holdings. These can also be useful for tracking title
>     changes throughout a newspaper's history. Newspaper histories like
>     American Journalism: A History: 1690-1960
>     <https://lccn.loc.gov/62007157> (Mott), The Penny Press
>     <https://lccn.loc.gov/2004043078> (Thompson), and The Press and America
>     <https://lccn.loc.gov/99044295> (Emery et al.) may not include
>     circulation statistics, but they do document the diversity and progress
>     of newspaper publishing, including notable newspapers of the era.
>     Newspaper histories also cover the history of the printers and printing
>     of newspapers in a state, county, or region more generally, and provide
>     more condensed histories of the editors, journalists, and evolution of
>     the newspapers in a specific area. Newspaper histories and union lists
>     should be available at most large public or university libraries. More
>     information about union lists, newspaper histories, and researching
>     newspapers in general can be found in the U.S. Newspaper Collections at
>     the Library of Congress
>     <https://guides.loc.gov/united-states-newspapers/introduction> research
>     guide (see Reference Sources).
> 
>     Please let me know if I can be of further assistance.
> 
>     ------------------------------------------------------------------------
> 
>     Original Question
> 
>     Jun 20 2022, 02:34pm via System
> 
>     How can I get counts of the numbers of newspapers by year in the US, and
>     preferably also elsewhere?
> 
>     A search of "U.S. Newspaper Directory, 1690-Present"
>     (https://chroniclingamerica.loc.gov/search/titles/) returned 157520
>     titles in 7876 pages with up to 20 titles per page giving date ranges to
>     the extent that it's known. If I can get a data file (e.g., csv or xls),
>     I can summarize. I could also use data on circulation and frequency and
>     especially parent company for multiple newspapers published by the same
>     company, to the extent that such is available.
> 
>     I'm interested in this, because McChesney quoted Tocqueville in
>     suggesting that the US had more newspapers per person (or per million
>     population) prior to 1851 than at any other time or place in history.
>     I'd like to evaluate that claim with data to the extent that I can. See
>     "https://en.wikiversity.org/wiki/Social_construction_of_crime_and_what_we_can_do_about_it#Newspapers_1790_-_present
>     <https://en.wikiversity.org/wiki/Social_construction_of_crime_and_what_we_can_do_about_it#Newspapers_1790_-_present>".
> 
> 
> 
>     Thanks, Spencer Graves, PhD
>     m: 408-655-4567
> 
>     ------------------------------------------------------------------------
> 
>     Thank you for using Newspapers & Current Periodicals Ask a Librarian
>     Service!
> 
> 
>     This email is sent from Ask a Librarian in relationship to ticket
>     #9625195.
> 
>     Read our privacy policy. <https://springshare.com/privacy.html
>     <https://springshare.com/privacy.html>>
> 
>     ______________________________________________
>     R-help using r-project.org <mailto:R-help using r-project.org> mailing list --
>     To UNSUBSCRIBE and more, see
>     https://stat.ethz.ch/mailman/listinfo/r-help
>     <https://stat.ethz.ch/mailman/listinfo/r-help>
>     PLEASE do read the posting guide
>     http://www.R-project.org/posting-guide.html
>     <http://www.R-project.org/posting-guide.html>
>     and provide commented, minimal, self-contained, reproducible code.
>


