The NAACCR Data Standards and Data Dictionary is a reference for ensuring uniform data collection. It is intended for hospital and central cancer registries, cancer registry software vendors, public health researchers, and data scientists and analysts.
Goal of the Data Dictionary
The goal is to define data collection standards for use by central cancer registries, hospital-based cancer registries, and other groups in North America to abstract cancer and other reportable diagnoses on or after January 1 of the implementation year.
Objectives of the standardization effort are to:
- Provide a comprehensive reference to ensure uniform data collection, and
- Facilitate the collection of comparable data among the standard-setting agencies (CoC, CDC, NCI, the Canadian Council of Cancer Registries (CCCR), and NAACCR).
These data standards are used by new and existing facility-based and central cancer registries to ensure that their program's standard definitions and codes are consistent with those used by regional and national databases. Other potential users include registry software providers and those using registry data, especially if they are combining data from multiple sources or exchanging data.
Scope
The data dictionary is limited to standards regarding data, rather than procedures. More specifically, it focuses on data standards that NAACCR considers important to establish. These include:
- Reportability
Reportability defines the rules for inclusion of specific types of tumors in the registry (see Standards for Tumor Inclusion and Reportability).
- Data Items or Elements to Be Included
Some data items are required or recommended by particular standard setters while others are optional or are retained because they were abstracted in the past. The Required Status Table specifies the required status for each data item.
- Standardized Item Numbers and Item Names
For ease and consistency of reference, all data items are assigned both numbers and names (e.g., the item “Sex” is assigned the item number 220). The item number is intended to be permanent and will not change. Assignment of permanent numbers is necessary because standard-setting organizations have changed item names over time or have applied similar names to items with different definitions. Item numbers allow the required precision of reference. When data items are deleted, the item numbers are retired and will never be reused for a different data item. Some numbers were intentionally skipped to allow the insertion of related items in the future.
Where possible, the NAACCR item name is the same as that used by the standard setter. However, the following constraints are placed on the names:
- Length
Beginning with Version 18 of the Data Standards and Data Dictionary, the data item name length is limited to 50 characters. Previously, it was limited to 25 characters because that was the maximum length for item names in the EDITS software system. Standardized abbreviations, punctuation, and spacing have been used (e.g., the word “first” is entered “1st,” “treatment” is “RX,” and so on). Other limitations will be imposed as needed. Thus, item names will be identical in this data dictionary version and the NAACCR Metafile of standard edits.
- Consistency
Consistency was a goal in formatting names and in using special characters. The character sequence “--” is used to distinguish among item names built on the same stem name.
Example: “Sequence Number--Hospital” and “Sequence Number--Central” are the names of two differently defined sequence number items.
- Interrelated Items, Fields, and Subfields
To make the relationship among items more apparent, a constant term was consistently added to the stem of the name.
Example: Names of treatment fields related to radiation therapy begin with “Rad,” so that in a list of item names they will appear together:
Rad--No of Treatment Vol
Rad--Elapsed RX Days
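The numbering and naming rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ItemRegistry`, `group_by_stem`, and the constants are hypothetical names invented here, not part of any NAACCR software. The sketch shows permanent item numbers that are never reused after retirement, the 50-character name limit, and how the “--” separator lets names that share a stem be grouped.

```python
from collections import defaultdict

MAX_NAME_LENGTH = 50   # name-length limit described above for Version 18 onward
STEM_SEPARATOR = "--"  # separates a shared stem from the rest of the name


def group_by_stem(names):
    """Group item names by the text before the first "--" separator."""
    groups = defaultdict(list)
    for name in names:
        stem = name.split(STEM_SEPARATOR, 1)[0]
        groups[stem].append(name)
    return dict(groups)


class ItemRegistry:
    """Hypothetical registry (not a NAACCR API) keyed by permanent item number."""

    def __init__(self):
        self._names = {}       # active item number -> item name
        self._retired = set()  # numbers of deleted items; never reused

    def add(self, number, name):
        # A number that is active or retired can never be assigned again.
        if number in self._names or number in self._retired:
            raise ValueError(f"item number {number} is assigned or retired")
        if len(name) > MAX_NAME_LENGTH:
            raise ValueError(f"name longer than {MAX_NAME_LENGTH} characters: {name!r}")
        self._names[number] = name

    def retire(self, number):
        # Deleting an item retires its number permanently.
        del self._names[number]
        self._retired.add(number)

    def name_of(self, number):
        return self._names[number]


registry = ItemRegistry()
registry.add(220, "Sex")  # the number/name pair given in the text
print(registry.name_of(220))

# Item names taken from the examples in the text.
names = [
    "Rad--No of Treatment Vol",
    "Rad--Elapsed RX Days",
    "Sequence Number--Hospital",
    "Sequence Number--Central",
    "Sex",
]
print(group_by_stem(names))
```

Keying the lookup on the item number rather than the name mirrors the reasoning in the text: names have changed over time, while numbers are meant to stay fixed.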
- Record Layout/Data Exchange
Starting with Version 21, the record layout and data exchange format is maintained in XML, and the former fixed-width positions of data items are no longer defined. For the latest information on the XML data exchange format, see the NAACCR XML Data Exchange Standard.
- Codes and Coding Rules
Each data item has either codes with definitions or a reference to the codes and coding instructions. The Source of Standard is the standard-setting agency that defines the codes, coding instructions, and rules for a given tumor. For example, when the Source of Standard for a data item is SEER, the Codes section refers to the SEER Coding Manual for the codes and coding instructions.
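To illustrate what moving from fixed-width positions to XML means in practice, the following Python sketch reads items by identifier rather than by column offset. The sample document is a simplified assumption made for illustration: the element names, the `naaccrId` attribute, and the values are not taken from this section, and namespaces and header attributes are omitted. The NAACCR XML Data Exchange Standard remains the authority on the actual schema. `read_items` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Simplified, assumed sample; not a complete or validated exchange file.
SAMPLE = """\
<NaaccrData>
  <Patient>
    <Item naaccrId="sex">2</Item>
    <Tumor>
      <Item naaccrId="primarySite">C509</Item>
    </Tumor>
    <Tumor>
      <Item naaccrId="primarySite">C341</Item>
    </Tumor>
  </Patient>
</NaaccrData>
"""


def items_of(element):
    """Return {identifier: value} for the Item children of one element."""
    return {item.get("naaccrId"): (item.text or "") for item in element.findall("Item")}


def read_items(xml_text):
    """Return one dict per tumor, combining patient-level and tumor-level items."""
    root = ET.fromstring(xml_text)
    records = []
    for patient in root.iter("Patient"):
        patient_items = items_of(patient)
        for tumor in patient.findall("Tumor"):
            record = dict(patient_items)   # copy the patient-level values
            record.update(items_of(tumor))  # add the tumor-level values
            records.append(record)
    return records


for record in read_items(SAMPLE):
    print(record)
```

Because each value is looked up by identifier, adding or removing an item does not shift the position of any other item, which is the practical difference from a fixed-width layout.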