Uexplore is a web application that provides access to the Missouri Census Data Center's public data archive, featuring U.S. Census and other public data. The interface resembles the Windows Explorer tool on the desktop, allowing you to navigate through the directories that comprise the archive and access files or subdirectories for further processing.
Files in the archive can be either database files (data tables in a special format) or non-database files (mostly metadata, special navigational index pages, or small custom extracts) in a variety of common formats. You can view and/or download most non-database files (in plain text, csv, xls, pdf and html formats, for example) directly in your browser. However, when you select a database file, a special interactive web application, uex2dex, is invoked. The application generates a custom data extraction form-fillout page, which displays in your browser. The section headers on the form are hyperlinks to the Dexter online documentation, which provides instructions and tips for making Dexter do what you want. As you can probably guess, the program that is invoked when you fill out this form and click an Extract Data button is called Dexter.
The Dexter Quick Start Guide, which is linked to from the top of the Dexter input form, provides a quick overview of the basics of the application, geared primarily to first-time and/or very casual users. It also contains a link to the Dexter Quick Start Video page, with links to three show-and-tell video modules covering the basics (the first of which deals more with Uexplore than with Dexter). To see what Dexter looks like, we recommend starting at the Uexplore home page and navigating your way to a dataset. But if you just can't wait, here's a link to access one of the datasets in our georef (geographic reference) data directory - cbsacos.sas7bdat. When the query form appears in your browser, follow the link near the top of the page to detailed metadata; this will give you a much better chance of understanding the dataset from which you are now ready to extract something.
The Dexter data extraction module (written in SAS ©, in case you care about such things - most users do not, and do not need to) lets you view and/or download (in several formats, discussed below) all or selected observations (records/rows) and all or selected variables (fields/columns) from the chosen database file. The archive has been designed with the extract capabilities of Dexter in mind, so that we do not need hundreds or thousands of separate files in order to give users great flexibility in accessing just the data they need. For example, if we have a collection of data for all cities, states and counties in the United States, we do not need to create separate datasets for each kind of geography or one for each state. We can have a single large dataset that has variables whose values indicate what is being summarized: one such variable indicates the type of geographic entity (city, state or county) and another identifies the state. With Dexter it is quite easy to extract just the subset you need, e.g. just data for counties in California, or just state level data for the entire country.
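The single-dataset idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how Dexter itself is implemented (Dexter is a SAS program driven by a web form). The variable names sumlev, state, geoname and totpop, the place names, and the population values are all made up for the example; actual datasets in the archive may use different names and codes, so check the detailed metadata for the dataset you are using.

```python
import csv
import io

# Made-up miniature of one "stacked" dataset: each row carries a variable
# saying what kind of geography it summarizes (sumlev) and which state it
# belongs to (state). All names and numbers here are invented.
SAMPLE = """sumlev,state,geoname,totpop
040,06,California,100
050,06,Alpha County,60
160,06,Beta city,25
040,29,Missouri,80
050,29,Gamma County,30
"""


def extract(rows, keep_vars, **filters):
    """Keep rows where every filter variable equals the requested value,
    and keep only the requested variables (columns) from those rows."""
    selected = []
    for row in rows:
        if all(row[var] == value for var, value in filters.items()):
            selected.append({var: row[var] for var in keep_vars})
    return selected


rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Just the counties in California (assumed codes: 050 = county, 06 = California).
ca_counties = extract(rows, ["geoname", "totpop"], sumlev="050", state="06")

# Just state level data for every state in the file (assumed code: 040 = state).
state_level = extract(rows, ["geoname", "totpop"], sumlev="040")

print(ca_counties)
print(state_level)
```

The two calls mirror the two examples in the text: one filter on both geography type and state, and one filter on geography type alone.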
Data can be extracted into any of the following common formats:
- ASCII delimited file (comma or tab) - readily converted to Excel (xls) and other spreadsheet formats
- plain text report ("Proc Print")
- Adobe Acrobat ("pdf")
- dBase file (dbf)
- SAS dataset (currently in V7/V8 for Windows/Unix format - sas7bdat).
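As a small illustration of why the delimited formats are easy to work with, here is a sketch that reads a comma-delimited extract with Python's standard csv module and rewrites it as tab-delimited. The function name convert_delimiter and the sample contents are hypothetical; a real extract will have whatever variables you selected on the form.

```python
import csv
import io


def convert_delimiter(text, src=",", dst="\t"):
    """Re-emit delimited text using a different field delimiter.

    The csv module handles quoted fields, so a value that contains a
    comma survives the round trip intact.
    """
    reader = csv.reader(io.StringIO(text), delimiter=src)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=dst, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Invented two-line extract; the quoted field contains a comma.
comma_extract = 'geoname,totpop\n"Alpha County, CA",60\n'
tab_extract = convert_delimiter(comma_extract)
print(tab_extract)
```

Using a real CSV parser rather than splitting on commas matters here because geographic names often contain commas themselves.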
Degree of Difficulty
How hard is it going to be for me to find and extract anything? This is the question that is uppermost in the minds of most people who are coming to this application for the first time.
The answer is that it depends on quite a few things. If you are a one-time user who just wants to extract a few numbers from the Census, then this is probably not the tool for you - at least not without assistance. This tool is aimed at a fairly sophisticated and probably repeat user, one who is willing to do a little extra reading and entering of codes, and who is willing to put up with a bit of a learning curve in return for a powerful and fast access tool. There are certainly other applications which are more user-friendly, but sometimes at the expense of limiting what you can get.
There are two basic kinds of "difficulty" that you may experience when using this system:
- Difficulty with using the Dexter application, i.e. in understanding what a filter is and how to fill out the form to define one
- Understanding the details regarding the datasets. You are not likely to have much luck extracting "just the data [you] want" when you don't have a clue about the source data.
We think that the first kind of difficulty can be overcome with a reasonable amount of effort. It would be comparable to learning how to drive with a manual transmission; it seems hard at first, but once you get the hang of it, it almost seems easier than the automatic. And you can go a little faster. (However, we understand that the large majority of drivers refuse to deal with it because it's too hard.)
The second kind of difficulty - understanding the data - is not so simple. There are many different kinds of data, and users arrive with widely varying backgrounds regarding the data. If the specific data source being accessed is very complex (such as a decennial census summary file or a collection of Base Tables from the American Community Survey) and the user has never worked with that sort of data, then there is a whole separate learning curve regarding what you can get and where it may be stored. The level of assistance that is available within Uexplore/Dexter varies with data collections. Some are much better documented than others, and many presume that the user comes in with some basic knowledge.
Users should keep in mind, however, that this is not a purely self-service web site / application. When you are overwhelmed by the complexity of the data, and all you want is the latest estimates of persons below the age of 18 for 5 large cities in Missouri, remember that there are always links at the bottom of the page where you can contact someone with a question or comment. Use these and you may be pointed to the exact dataset you need to be extracting from and told exactly which variables and/or tables need to be selected to get what you want. Or, as frequently happens, you may be informed that what you want is not something you can get - at least not from this archive. You may be pointed to an alternate source.
A good predictor of the degree of difficulty you are likely to have using this system is the extent to which you are already familiar with (or willing to become familiar with) the way we have organized the archive into categories (subdirectories) called "filetypes". These categories have mnemonic names such as sf32000x (standard extract based on the 2000 Census Summary File 3 data) or beareis (Regional Economic Information System data from the Bureau of Economic Analysis). Such product-based categories and naming conventions may provide the comfort of a familiar acronym to users who already know the products, while to others they may seem like pure technobabble. In the ideal modern data archive such categories would be invisible to (at least some) users. But this archive - for now, at least - is more geared towards researchers who have come to accept this somewhat pre-web way of organizing the data. The best way to start becoming familiar with these categories is to spend time carefully perusing the Uexplore/Dexter home page, where all these filetypes are listed with brief descriptions. There is, of course, much more to be learned by simply selecting various filetypes and exploring the directories.
Detailed Metadata and Datasets.html Files
The biggest improvement we have made to the Uexplore/Dexter system over the years (see the Brief History, below) is not in any of the major program pieces, but rather in the development of a systematic way of providing access to more detailed and reliable metadata, i.e. to data that describe the datasets. This feature is implemented primarily in the form of special files (web pages) named Datasets.html which have been added to many of our key data directories. When you "uexplore" a directory with one of these files in it, you should heed the bolded description of the file displayed at the top of the page:
Use this custom data directory page to access the database files (only) with greatly enhanced descriptions and metadata.
The two major benefits of using the Datasets.html page as your guide are:
- the datasets are presented in a more logical order (not alphabetical by filename as with the Uexplore directory page), and
- it provides links to special Details pages for many of the datasets. (You can also reach these same detailed metadata pages, when they exist, by clicking on the Detailed Metadata link at the top of the Dexter query form.)
Returning to the subject of degree of difficulty, we can measure it in terms of the probability that you will be able to find and extract the information you want without having to contact somebody at OSEDA or the Missouri Census Data Center. That probability is primarily a function of how much you already know about the data, combined with the quality of the metadata that we have created for it and your willingness to actually read what we have written. This is pretty much the same as saying it depends on whether or not there is a Datasets.html file in the data directory and, if so, whether or not most of the datasets in the directory actually have entries on that page with a link to a customized Details page. (This is not entirely true, but it's a useful oversimplification.)
Brief History
(In case you care. Feel free to ignore this section.)
When the original version of this application was written back in 1996, it was intended primarily for use by the staff of the Missouri Census Data Center and a handful of power users of their data. The pages were all black on white with only the minimum required hyperlinks. Files were all presented in alphabetical order (even the master directory of filetypes - there was no "Archive Directory" page) and you pretty well had to know what you were looking for. These were the DOS ages. You needed to know the codes if you were expecting to extract census tract data for Lincoln County. We've come a pretty long way since then.
In 2003 the Dexter extraction tool replaced the original set of 3 modules (with 3 corresponding form-fillout screen pages) that we had used for doing extracts. Dexter is much easier to use. It is also considerably faster, since it runs on a new server machine which is about 7 times faster than the one we used to run on.
To Learn More
We have a series of online tutorials and help files to assist users in working with the data archive through the Uexplore and Dexter web tools. See our Uexplore / Dexter Tutorials web page.