This web page documents and summarizes geographic entities used in the 2007 edition of the MABLE database, accessible via the
geocorr2k, Version 1.3 web application.
The Missouri State Census Data Center and
OSEDA maintain a library of geographic code modules in the form of SAS(r) format codes. These modules have special application for
SAS software users since they allow codes to be readily converted to their
corresponding names. Sometimes format modules are used not to provide names, but rather to link codes to other entities as a kind of table lookup. Note that although these modules are technically "code" you do
not have to be a programmer, or know any SAS, to use these as codebook files to look up a geographic code.
In the text below, when a SAS format module is available for a geocode, we provide a link to it at the end of the entry for that code.
Most of these geographic codes consist of numeric digits, but they have no numeric significance. They are stored in the MABLE database as character strings rather than binary numeric fields. In reports they display with leading zeroes ("01" as the code for Alabama, for example, rather than just "1"), and these leading zeroes are also written to output csv files by the geocorr application. When the latter are imported into Excel, however, the import routine converts them to numeric values and the leading zeroes disappear.
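The leading-zero behavior can be sketched in a few lines of Python using only the standard library. This is an illustration, not part of geocorr: the column names and sample values below are made up, and the helper names are hypothetical. Reading every field as a character string preserves the zeroes, and padding to the known width restores them if a spreadsheet has already stripped them.

```python
import csv
import io

# Hypothetical excerpt of a geocorr-style csv file. The column names
# and the "pop" values are illustrative only.
SAMPLE_CSV = """state,county,pop
01,01001,100
24,24510,200
"""

def read_codes(text):
    """Read csv rows, keeping every field as a character string."""
    return list(csv.DictReader(io.StringIO(text)))

def restore_leading_zeroes(code, width):
    """Re-pad a code whose leading zeroes were stripped (e.g. by Excel)."""
    return str(code).zfill(width)

rows = read_codes(SAMPLE_CSV)
print(rows[0]["state"])                  # the string "01", not the number 1
print(restore_leading_zeroes(1, 2))      # state code padded to 2 characters
print(restore_leading_zeroes(1001, 5))   # county code padded to 5 characters
```

The csv module never converts fields to numbers on its own, so the codes survive as written; the padding helper is only needed for files that have already passed through a tool that made them numeric.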
The FIPS county codes are 3-digit numbers assigned within states. They
generally are odd numbers assigned in alphabetical order. Exceptions are
independent cities (i.e. cities like Baltimore and St. Louis that are not
in any county and serve as county equivalents), which are usually assigned
codes over 500 (such as '510', for some reason). On output files and
listings we usually combine the FIPS state and county codes. Thus the
value of the County variable for Autauga County, Alabama is 01001 and for
Baltimore City, Maryland is 24510. In some states (such as Louisiana and
Alaska) the primary substate legal entities are not called counties but for
the sake of this application they are "county equivalents" and act
exactly the same as counties. Counties appearing here are those defined
at the time of the 2000 census. There have been changes to county definitions since
then in Colorado and Virginia. See the Census Bureau web page describing these changes. There are approximately 3141 counties
in the U.S. See the COUNTY02 entry, next.
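As a minimal sketch of how the combined County value is formed, the following Python functions concatenate a 2-digit state code with a 3-digit county code and split the result back apart. The function names are hypothetical and are not part of geocorr or MABLE; the two sample values are the ones cited above.

```python
def county_geocode(state_fips, county_fips):
    """Concatenate a 2-digit state code and a 3-digit county code."""
    state = str(state_fips).zfill(2)
    county = str(county_fips).zfill(3)
    if not (state.isdigit() and county.isdigit()):
        raise ValueError("FIPS codes must consist of digits")
    if len(state) != 2 or len(county) != 3:
        raise ValueError("expected a 2-digit state and a 3-digit county code")
    return state + county

def split_county_geocode(code):
    """Split a 5-character county code back into (state, county)."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError("expected a 5-digit character string")
    return code[:2], code[2:]

print(county_geocode("01", "001"))    # Autauga County, Alabama
print(county_geocode("24", "510"))    # Baltimore City, Maryland
print(split_county_geocode("24510"))  # state and county parts
```

Because the county code is unique only within a state, the 5-character combined string is what identifies a county nationally.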
Format tables:
Scousub.sas
and
Smcdcnvt.sas
The Smcdcnvt format
module shows the relationship between the Census Bureau codes used for
these entities and the FIPS codes (as used in MABLE). The FIPS codes
are 5 digits and are unique within state, while the Census Bureau codes
are only 3 digits and are unique within county. The Census Bureau codes are no longer used for current data and we
are not even sure whether such codes are assigned for new county subdivisions. They are of interest only when you need
to link to earlier data that used these codes.
On output files and listings generated by geocorr this variable goes
by the name cousubfp ("COUnty SUBdivision FiPs").
Format tables:
Sfplace.sas
and
Splccnvt.sas
(The Splccnvt format module shows the relationship between the Census
Bureau codes used for these entities and the FIPS codes (as used in
MABLE). The FIPS codes are 5 digits and are unique within state, while
the Census Bureau codes are 4 digits and are also unique within state.)
On output files and listings generated by geocorr this variable goes
by the name placefp.
Only "residential" ZIP codes - those containing household addresses -
are included on this file. There are no business or Post Office Box-only
ZIPs, etc. The latter account for about a fourth of all ZIP codes in the
U.S.
Another problem is that ZIP codes are not really spatial entities -- they
are simply lists of addresses, organized to facilitate mail delivery.
While they often do form areas that can be viewed as geographic areas,
that is not what they really are. This can create problems when you try
to relate them to a spatial entity such as a census block. Think of a
classic census block formed by the intersection of 1st St., Elm Ave, 2nd
St. and Pine Ave. If 1st St is the northern border of the block then
folks living on the south side of 1st St. between Elm and Pine are in our
block (let's call it "101"). But people living across the street, on the
north side of 1st St., are living in a different block, say "102". But the
U.S. Postal Service would never (well, hardly ever) have a ZIP boundary go
down the middle of a street. If this were an area where the ZIP changed
it would almost certainly divide along (vague and invisible) "back-lot
lines". For example, the folks living on both sides of 1st St. in our
example might live in ZIP 12345, while the folks living on 2nd St. might
live in 12346. Thus you have households in the same census block, but in
different ZIP codes. Hence, the fundamental concept of census block as
the atomic unit is violated. Of course, this only happens in a certain
percentage of blocks, and in many cases the ZIP boundaries are on
commercial streets where not many people live and you can assign most of
the population in the boundary blocks to the right ZIP. The Bureau's
definitive web page (see link above) describes how these issues were dealt with
when defining ZCTAs.
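The block-versus-ZIP mismatch can be made concrete with a small Python sketch. The household records below are invented to mirror the 1st St. and 2nd St. example, and assigning each block to the ZIP code held by most of its households is an assumption made for this illustration; it should not be read as the Bureau's actual procedure, which is described on the Bureau's page.

```python
from collections import Counter, defaultdict

# Hypothetical household records: (census block, ZIP code).
households = [
    ("101", "12345"),  # south side of 1st St.
    ("101", "12345"),  # south side of 1st St.
    ("101", "12346"),  # 2nd St., same block but a different ZIP
    ("102", "12345"),  # north side of 1st St., different block
]

def zips_by_block(records):
    """Count households per ZIP code within each census block."""
    counts = defaultdict(Counter)
    for block, zip_code in records:
        counts[block][zip_code] += 1
    return counts

def split_blocks(counts):
    """Blocks whose households fall in more than one ZIP code."""
    return sorted(block for block, c in counts.items() if len(c) > 1)

def majority_zip(counts):
    """Assumed rule for this sketch: give each block its most common ZIP."""
    return {block: c.most_common(1)[0][0] for block, c in counts.items()}

counts = zips_by_block(households)
print(split_blocks(counts))   # blocks that straddle a ZIP boundary
print(majority_zip(counts))   # one ZIP per block under the assumed rule
```

Under a rule like this, the household on 2nd St. ends up attributed to a ZIP area it does not receive mail in, which is the kind of approximation the paragraph above is warning about.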
On output files this geocode is stored as ZCTA5. The variable Zipname is also included (unless otherwise
requested). This name is based on an old file we obtained from the USPS back in the 1990s with post office
names, slightly supplemented. In a few cases there will not be a name available, in which case the code value also appears as the Zipname value.
Pseudo-ZCTA codes are those that end with XX or HH (e.g. 594XX and 594HH in Montana). The 594XX ZCTA is the set of blocks within the 594 3-digit ZIP area that the USPS had not assigned to any ZIP code. (Not every location in the country has a ZIP code.) The HH pseudo-ZIP areas are the combination of all water blocks within the 3-digit ZIP area. These usually (but not always) have zero population.
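A minimal Python sketch for telling the pseudo-ZCTA codes apart from ordinary ones, based only on the XX and HH suffixes described above. The function names and the category labels are hypothetical, chosen for this illustration.

```python
def classify_zcta(zcta5):
    """Return 'unassigned', 'water', or 'standard' for a 5-character ZCTA code."""
    code = zcta5.upper()
    if len(code) != 5:
        raise ValueError("ZCTA codes are 5 characters long")
    if code.endswith("XX"):
        return "unassigned"  # blocks the USPS had not assigned to any ZIP code
    if code.endswith("HH"):
        return "water"       # water blocks within the 3-digit ZIP area
    return "standard"

def zip3_area(zcta5):
    """The 3-digit ZIP area that a ZCTA or pseudo-ZCTA belongs to."""
    return zcta5[:3]

for code in ("594XX", "594HH", "12345"):
    print(code, classify_zcta(code), zip3_area(code))
```

Since the pseudo-codes contain letters, this is one more reason to keep ZCTA5 as a character string rather than converting it to a number.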
Format table:
Szipnmus.sas
This format code was derived from a file from the U.S. Postal Service.
It's a combination of Post Office and local geographic names. It's the
source for the ZIPNAME fields that will be added to your geocorr outputs
if you specify that you want names to go with your geocodes and you
also select ZIP as one of your geocodes.
This geocode has two values: "U" means urban and "R" means rural.
On geocorr output files this field is called ur and is one
character long. There is no name field associated with it.
PUMA codes are 5 digits (characters) long. Most end with "00".
Generally, when the last two digits are not 0's, the code represents a subarea of a county
that has been split into subareas. Thus, for example, the PUMA codes for
the City of St. Louis are '01801', '01802' and '01803'.
On all geocorr output files this field will be called puma5
and will be 5 characters wide with leading and trailing 0's. There will be
no names associated with them.
There are also codes called "Super PUMAs" (aka "1% PUMAs") that were used on the 2000 PUMS 1% sample files.
These are not currently kept on the MABLE database. We omitted them to avoid some confusion. But if user demand indicates a need for them
they can be put back. The 5% PUMAs nest within the 1% PUMAs.
On geocorr output files this field will be called msacmsa
and will be 4 characters wide. A value of '9999' is used to indicate an
area that is not within a metro area.
Format table:
Smetro.sas
This format code handles MSA, CMSA and PMSA codes and returns the names
of the areas. (Note that these 3 kinds of codes do not overlap, i.e. if
there is an MSA with code 1234 then there will never be a PMSA or CMSA
with that code.) The format code will return a "(P)" at the end of the
metro name to indicate a Primary MSA.
On geocorr output files this field will be called pmsa and
will be 4 characters wide. It will have a value of '9999' to indicate
not applicable.
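Because '9999' is a placeholder rather than a real metro code, it helps to screen it out before grouping or joining on the msacmsa or pmsa fields. The Python sketch below shows one way to do that; the rows, the "area" labels, and the code 1234 are all made up for illustration, and the helper name is hypothetical.

```python
NOT_APPLICABLE = "9999"  # value used for "not in a metro area" / "not applicable"

# Hypothetical output rows; 1234 is a made-up metro code.
rows = [
    {"area": "A", "msacmsa": "1234", "pmsa": "9999"},
    {"area": "B", "msacmsa": "9999", "pmsa": "9999"},
]

def metro_code_or_none(value):
    """Return the code, or None when it is the '9999' placeholder."""
    return None if value == NOT_APPLICABLE else value

in_metro = [r["area"] for r in rows if metro_code_or_none(r["msacmsa"]) is not None]
with_pmsa = [r["area"] for r in rows if metro_code_or_none(r["pmsa"]) is not None]

print(in_metro)    # areas that fall inside some metro area
print(with_pmsa)   # areas that fall inside a PMSA
```

The comparison is done on character strings, consistent with how these codes are stored.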