US Surname Distribution Analysis
The Aggregated Distribution File


When you upload a Census National Index Extract file, you will be given the opportunity to download the Aggregated Distribution file which is created from the National Index Extract file.

Alternatively, if you are a Windows user, you can download a program that will convert a National Index Extract file into an Aggregated Distribution File on your own PC.

The Aggregated Distribution file can be uploaded instead of the National Index Extract file if you wish to create distribution maps in a subsequent session.  Because the Aggregated Distribution file is considerably smaller than the National Index Extract file, it is much quicker to upload.

The format of the Aggregated Distribution File downloaded from this facility when you upload a National Index File is 'Tab delimited text', which uses TAB characters to separate the three fields in each record, as follows:-

WICKES (ROBERTSON)<TAB>VA<TAB>1
WICKIR<TAB>MI<TAB>6
WICKS<TAB>CT<TAB>17
WICKS<TAB>NY<TAB>381
WICKS<TAB>OH<TAB>3
WICKS<TAB>PA<TAB>2
WICKS<TAB>SC<TAB>44
WICKS<TAB>SD<TAB>12
WICKS<TAB>VA<TAB>1

Note that the TAB character is non-printing, and is represented by <TAB> in the above example. The 'Tab delimited text' format is convenient for loading into a spreadsheet for editing or further analysis.

The three fields in each record are the Surname, the abbreviation of the State, and the number associated with the Surname in that State.  When the Aggregated Distribution File is downloaded from this site, or created with the usdistag program which can be downloaded from this site, this number is the count of individuals with the Surname in the State.  However, it could be some other figure, such as a population density (e.g. 0.000013), provided that you calculate that figure and create an Aggregated Distribution File of your own.
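The counting step described above can be sketched in Python. This is only an illustration, not the code used by this site or by usdistag: the list `individuals` is a hypothetical stand-in for per-person data, because the layout of the National Index Extract file is not described on this page.

```python
from collections import Counter

# Hypothetical input: one (surname, state) pair per individual.
individuals = [
    ("WICKS", "NY"),
    ("WICKS", "NY"),
    ("WICKS", "OH"),
    ("WICKIR", "MI"),
]

def aggregate(pairs):
    """Count individuals per (surname, state) and return tab-delimited records."""
    counts = Counter(pairs)
    return [
        f"{surname}\t{state}\t{count}"
        for (surname, state), count in sorted(counts.items())
    ]

records = aggregate(individuals)
```

Each string in `records` is one line of an Aggregated Distribution File, with the three fields separated by real TAB characters.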

In addition to the data, the file can also contain an optional title, which identifies the data in this set. The title appears on the first line of the file, thus:-

Title=WYKES - 1880 Census Place
WICKES (ROBERTSON)<TAB>VA<TAB>1
WICKIR<TAB>MI<TAB>6
WICKS<TAB>CT<TAB>17
WICKS<TAB>NY<TAB>381
WICKS<TAB>OH<TAB>3
WICKS<TAB>PA<TAB>2
WICKS<TAB>SC<TAB>44
WICKS<TAB>SD<TAB>12
WICKS<TAB>VA<TAB>1

The title must appear on the first line of the file and it must be preceded by 'Title=' which identifies the record as the title record.
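If you process the file yourself, the title rule can be handled as in the following Python sketch. The function name `split_title` is made up for this example, and the case-sensitive match on 'Title=' is an assumption; this page does not say how strictly the site checks the prefix.

```python
def split_title(text):
    """Return (title, data_lines); title is None when no title record is present."""
    lines = text.splitlines()
    # Assumption: the prefix is matched exactly as 'Title=' on the first line only.
    if lines and lines[0].startswith("Title="):
        return lines[0][len("Title="):].strip(), lines[1:]
    return None, lines

title, rows = split_title("Title=WYKES - 1880 Census Place\nWICKS\tNY\t381\n")
```

A file without a title record is returned unchanged, with `None` as the title.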

Creating Your Own Aggregated Distribution File

If you want to create distribution maps from a source of data other than the LDS 1880 US Census CDs, you can do so, but you will need to create your own Aggregated Distribution file.

As noted above, the file is a plain text file.  You may use either TAB characters or commas (,) to separate the three fields in each record, as follows:-

WICKES (ROBERTSON)<TAB>VA<TAB>1
WICKIR<TAB>MI<TAB>6
WICKS<TAB>CT<TAB>17
WICKS<TAB>NY<TAB>381
WICKS<TAB>OH<TAB>3
WICKS<TAB>PA<TAB>2
WICKS<TAB>SC<TAB>44
WICKS<TAB>SD<TAB>12
WICKS<TAB>VA<TAB>1

or 

WICKES (ROBERTSON),VA,1
WICKIR,MI,6
WICKS,CT,17
WICKS,NY,381
WICKS,OH,3
WICKS,PA,2
WICKS,SC,44
WICKS,SD,12
WICKS,VA,1

The three fields in each record are:

  • Surname 
  • State Abbreviation
  • The number associated with the surname in the state. 

The number can be a count of the number of people in the state with the surname, or any other meaningful measurement, such as density or frequency of the surname.
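To check a file of your own before uploading it, a record in either format can be read with a few lines of Python. This is a minimal sketch under stated assumptions: `parse_record` is a hypothetical helper, not part of this site, and it assumes that surnames contain no commas or TAB characters.

```python
def parse_record(line):
    """Return (surname, state, number) from a TAB- or comma-separated record."""
    # Prefer TAB when present; otherwise fall back to commas.
    delimiter = "\t" if "\t" in line else ","
    fields = [field.strip() for field in line.strip("\r\n").split(delimiter)]
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    surname, state, number = fields
    # float() accepts both counts (381) and densities (0.000013).
    return surname, state, float(number)
```

For example, `parse_record("WICKS,NY,381")` and `parse_record("WICKS\tNY\t381")` both return the same three values.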

If your data are based on pre-1889 sources, you will probably have data for Dakota Territory rather than for the post-1889 states of North Dakota and South Dakota. In this case, use the code DT for Dakota Territory instead of the codes ND and SD. Use either DT or the state codes ND and SD, never both in the same file. If you include DT together with either ND or SD, the data for North Dakota and South Dakota will not be plotted and will appear as Other in the Legend.
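The Dakota rule is easy to check before uploading. The Python sketch below uses a hypothetical helper, `dakota_conflict`, and assumes that the state codes are written in upper case as in the examples on this page.

```python
def dakota_conflict(states):
    """Return True if DT is mixed with ND or SD in the same data set."""
    codes = {code.strip() for code in states}
    return "DT" in codes and bool(codes & {"ND", "SD"})
```

Pass it the state codes from every record in your file; a result of `True` means the file mixes DT with ND or SD and should be corrected.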

The easiest way to create your own Aggregated Distribution file is to use a spreadsheet.  Your worksheet should contain 3 columns, corresponding to the 3 fields in each aggregated record.  If you are entering a title, place it in cell A1, following the rules specified above.  When you have entered all your data, save the file as a Text (Tab Delimited) file or as a CSV file. This file can then be uploaded for analysis, using the form on the Aggregated Distribution File Upload page.  Some spreadsheets enclose fields in double quotes (").  That's OK: the upload process strips any double quotes from the file being uploaded.
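If your data are already held in a program rather than a spreadsheet, the same file can be written directly. The following Python sketch uses the standard csv module; the function name `write_aggregated` is an assumption made for this example and is not something provided by this site.

```python
import csv
import io

def write_aggregated(rows, title=None, out=None):
    """Write (surname, state, number) rows as tab-delimited text."""
    out = out if out is not None else io.StringIO()
    if title:
        # The optional title record must be the first line of the file.
        out.write(f"Title={title}\n")
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return out

buffer = write_aggregated(
    [("WICKS", "NY", 381), ("WICKS", "OH", 3)],
    title="WYKES - 1880 Census Place",
)
text = buffer.getvalue()
```

To produce a file on disk instead, open one with `open(path, "w", newline="")` and pass it as `out`; the path is whatever name you choose for your own file.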

The table below shows the state abbreviations and the full names of the States.

Code  State                         Code  State
AK    Alaska                        MT    Montana
AL    Alabama                       NC    North Carolina
AR    Arkansas                      ND    North Dakota (post-1889)
AZ    Arizona                       NE    Nebraska
CA    California                    NH    New Hampshire
CO    Colorado                      NJ    New Jersey
CT    Connecticut                   NM    New Mexico
DC    District of Columbia          NV    Nevada
DE    Delaware                      NY    New York
DT    Dakota Territory (pre-1889)   NYC   New York City
FL    Florida                       OH    Ohio
GA    Georgia                       OK    Oklahoma
HI    Hawaii                        OR    Oregon
IA    Iowa                          PA    Pennsylvania
ID    Idaho                         RI    Rhode Island
IL    Illinois                      SC    South Carolina
IN    Indiana                       SD    South Dakota (post-1889)
KS    Kansas                        TN    Tennessee
KY    Kentucky                      TX    Texas
LA    Louisiana                     UT    Utah
MA    Massachusetts                 VA    Virginia
MD    Maryland                      VT    Vermont
ME    Maine                         WA    Washington
MI    Michigan                      WI    Wisconsin
MN    Minnesota                     WV    West Virginia
MO    Missouri                      WY    Wyoming
MS    Mississippi
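
The codes in the table can also be used to catch typing mistakes in a file you have built yourself. The sketch below is illustrative only: `unknown_codes` is a hypothetical helper, and the case-sensitive comparison is an assumption, since this page does not say whether lower-case codes are accepted.

```python
# The 53 codes listed in the table above.
VALID_CODES = set("""
AK AL AR AZ CA CO CT DC DE DT FL GA HI IA ID IL IN KS KY LA MA MD ME MI MN MO MS
MT NC ND NE NH NJ NM NV NY NYC OH OK OR PA RI SC SD TN TX UT VA VT WA WI WV WY
""".split())

def unknown_codes(states):
    """Return the sorted codes in `states` that are not in the table."""
    return sorted({code.strip() for code in states} - VALID_CODES)
```

An empty list means every code in your data appears in the table.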
