Data Standards and Data Dictionary

The NAACCR Data Standards and Data Dictionary is a reference for ensuring uniform data collection. It is intended for hospital and central cancer registries, cancer registry software vendors, public health researchers, and data scientists and analysts.

Goal of the Data Dictionary

The goal is to define data collection standards for use by central cancer registries, hospital-based cancer registries, and other groups in North America to abstract cancer and other reportable diagnoses on or after January 1 of the implementation year.

Objectives of the standardization effort are to:

  1. Provide a comprehensive reference to ensure uniform data collection, and
  2. Facilitate the collection of comparable data among the standard-setting agencies (CoC, CDC, NCI, the Canadian Council of Cancer Registries (CCCR), and NAACCR).

These data standards are used by new and existing facility-based and central cancer registries to ensure that their program's standard definitions and codes are consistent with those used by regional and national databases. Other potential users include registry software providers and those using registry data, especially if they are combining data from multiple sources or exchanging data.

Scope

The data dictionary is limited to standards regarding data, rather than procedures. More specifically, it focuses on data standards that NAACCR considers important to establish. These include:

  • Reportability

    Reportability defines the rules for inclusion of specific types of tumors in the registry (see Standards for Tumor Inclusion and Reportability).

  • Data Items or Elements to Be Included

    Some data items are required or recommended by particular standard setters while others are optional or are retained because they were abstracted in the past. The Required Status Table specifies the required status for each data item.

  • Standardized Item Numbers and Item Names

    For ease and consistency of reference, all data items are assigned both numbers and names (e.g., the item “Sex” is assigned the item number 220). The item number is intended to be permanent and will not change. Assignment of permanent numbers is necessary because standard-setting organizations have changed item names over time or have applied similar names to items with different definitions. Item numbers allow the required precision of reference. When data items are deleted, the item numbers are retired and will never be reused for a different data item. Some numbers were intentionally skipped to allow the insertion of related items in the future.

    Where possible, the NAACCR item name is the same as that used by the standard setter. However, the following constraints are placed on the names:

    • Length

      Beginning with Version 18 of the Data Standards and Data Dictionary, data item names are limited to 50 characters. Previously, names were limited to 25 characters because that had been the maximum length for item names in the EDITS software system. Standardized abbreviations, punctuation, and spacing have been used (e.g., the word “first” is entered “1st,” “treatment” is “RX,” and so on). Other limitations will be imposed as needed. As a result, item names will be identical in this data dictionary version and the NAACCR Metafile of standard edits.

    • Consistency

      Consistency was a goal in formatting names and in using special characters. The character sequence “--” is used to distinguish among item names built on the same stem name.

      Example: “Sequence Number--Hospital” and “Sequence Number--Central” are the names of two differently defined sequence numbers.

    • Interrelated Items, Fields, and Subfields

      To make the relationship among items more apparent, a constant term was consistently added to the stem of the name.

      Example: Names of treatment fields related to radiation therapy begin with “Rad,” so that in a list of item names they will appear together:
      Rad--No of Treatment Vol
      Rad--Elapsed RX Days
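
The naming constraints above (the 50-character limit, the “--” separator, and shared stems for related items) can be illustrated with a short Python sketch. It is not part of the NAACCR standard; the helper names `check_name_length`, `split_stem`, and `group_by_stem` are hypothetical, and the sample names are the ones quoted in this section.

```python
from collections import defaultdict

MAX_NAME_LENGTH = 50    # limit stated above for Version 18 and later
STEM_SEPARATOR = "--"   # separator between a stem and its qualifier


def check_name_length(name, limit=MAX_NAME_LENGTH):
    """Return True if the item name fits within the length limit."""
    return len(name) <= limit


def split_stem(name):
    """Split an item name into (stem, qualifier) at the first '--'.

    Names without a separator are returned as (name, "").
    """
    stem, sep, qualifier = name.partition(STEM_SEPARATOR)
    return (stem, qualifier) if sep else (name, "")


def group_by_stem(names):
    """Group item names by stem so related items are listed together."""
    groups = defaultdict(list)
    for name in sorted(names):
        stem, _ = split_stem(name)
        groups[stem].append(name)
    return dict(groups)


# Item names taken from the examples in this section.
names = [
    "Rad--No of Treatment Vol",
    "Sequence Number--Hospital",
    "Rad--Elapsed RX Days",
    "Sequence Number--Central",
    "Sex",
]

groups = group_by_stem(names)
for stem, members in groups.items():
    print(stem, "->", members)
```

Sorting before grouping mirrors the point made above: names that share a stem such as “Rad” end up next to each other in any alphabetical list.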

  • Record Layout/Data Exchange

    Starting with Version 21, the record layout / data exchange is collected and maintained in an XML format; the former fixed-width positions of data items are no longer defined. For the latest information on the XML data exchange format, visit NAACCR XML Data Exchange Standard.
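
One practical consequence of moving from fixed-width positions to XML is that items are looked up by identifier rather than by column offset. The Python sketch below illustrates that idea only. The element names (`NaaccrData`, `Patient`, `Tumor`, `Item`), the `naaccrId` attribute, the identifiers, and the values are simplified assumptions for illustration, and XML namespaces are omitted; the NAACCR XML Data Exchange Standard is the authority on the actual format.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample document. The structure and identifiers
# are assumptions for illustration, not copied from the standard.
SAMPLE = """
<NaaccrData>
  <Patient>
    <Item naaccrId="sex">2</Item>
    <Tumor>
      <Item naaccrId="sequenceNumberCentral">00</Item>
    </Tumor>
  </Patient>
</NaaccrData>
""".strip()


def items_of(element):
    """Return {identifier: value} for Item elements directly under element."""
    return {
        item.get("naaccrId"): (item.text or "").strip()
        for item in element.findall("Item")
    }


root = ET.fromstring(SAMPLE)

# Flatten to one record per tumor, carrying the patient-level items along.
records = []
for patient in root.findall("Patient"):
    patient_items = items_of(patient)
    for tumor in patient.findall("Tumor"):
        records.append({**patient_items, **items_of(tumor)})

print(records)
```

Because values are keyed by identifier, adding or reordering items in the file does not shift the meaning of any other item, which was a risk with fixed column positions.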

  • Codes and Coding Rules

    Each data item has either the codes with definitions or a reference to the codes and coding instructions. The Source of Standard is the standard-setting agency that defines the codes, coding instructions, and rules for a given tumor. For example, when the source of standard for a data item is SEER, the Codes section refers to the SEER Coding Manual for the codes and coding instructions.