Data Standards and Data Dictionary

The earliest standard setters were the Commission on Cancer (CoC) and the Surveillance, Epidemiology, and End Results (SEER) Program. The End Results Group, predecessor of SEER, published coding rules and guidelines as early as the 1950s; CoC published its first data collection manual, the Supplement on the Tumor Registry, in conjunction with its Cancer Program Manual 1981. At that time, hospital-based cancer registries often used CoC's recommended codes and coding rules, and SEER central registries used those of the SEER Program. The two systems were not always in agreement. As a result, CoC and SEER began working together in the early 1980s to make the codes and definitions in their manuals consistent.

In the late 1980s, increased efforts to pool data collected by different cancer registries drew attention to problems resulting from insufficient data standardization. The lack of standardization took many forms. Data items used by different registries or software systems varied in their definitions and codes, even when they had the same name and were intended to represent the same information. Blanks, dashes, and defined codes were all used to indicate “unknown” data. Other substantial discrepancies were less easy to detect and correct. Hospitals and software providers faced conflicting standards and requirements when they were both reporting to a central registry and maintaining a database consistent with CoC standards.
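The "unknown" problem described above can be illustrated with a minimal sketch. It assumes a pooled data set in which some registries left a field blank, some used dashes, and some used a defined code. The marker values, the target code "9", and the function names are all hypothetical and are not taken from any published standard.

```python
# Hypothetical ways of recording "unknown"; illustrative only,
# not drawn from any CoC, SEER, or NAACCR manual.
UNKNOWN_MARKERS = {"", "-", "--", "unk", "unknown"}
STANDARD_UNKNOWN = "9"  # assumed single target code for "unknown"


def normalize_unknown(value):
    """Map any of the assumed 'unknown' markers to one standard code."""
    text = "" if value is None else str(value).strip()
    if text.lower() in UNKNOWN_MARKERS:
        return STANDARD_UNKNOWN
    return text


def pool_records(sources, field):
    """Combine records from several registries, harmonizing one field."""
    pooled = []
    for records in sources:
        for record in records:
            cleaned = dict(record)  # copy so the source data is untouched
            cleaned[field] = normalize_unknown(record.get(field))
            pooled.append(cleaned)
    return pooled


# Two hypothetical registries that record an unknown grade differently.
registry_a = [{"id": 1, "grade": ""}, {"id": 2, "grade": "2"}]
registry_b = [{"id": 3, "grade": "-"}, {"id": 4, "grade": "3"}]
pooled = pool_records([registry_a, registry_b], "grade")
```

A mapping like this only handles the discrepancies that are easy to detect. Items that share a name but differ in definition cannot be reconciled by recoding alone, which is part of why agreed standards were needed.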

This lack of standardization carried a substantial cost and limited wider use of valuable data. Three groups especially felt the impact: state registries receiving data from hospital registries, the NAACCR Data Evaluation and Publication Committee, and CoC's National Cancer Data Base (NCDB).

In Canada, cancer registries at the provincial and territorial level joined with Statistics Canada, a national agency, to form the Canadian Council of Cancer Registries. This process started in 1986 and led to the development of common national standards for the Canadian Cancer Registry, which were implemented with a reference date of January 1, 1992. The Data Quality Management Committee, which reports to the Council, is responsible for recommending national standards, reviewing and monitoring data quality, and resolving any inconsistencies in procedures, coding, or other activities affecting data comparability.

By 1988, the CoC and SEER collaboration resulted in the publication of both CoC's Data Acquisition Manual and the SEER Program Code Manual, with data items and codes in substantial agreement. Having more congruent data sets allowed for easier data sharing and data comparisons, especially with the advent of personal computers that were sufficiently powerful to analyze large amounts of cancer data. This achievement helped set precedents for cooperation in data management and maintaining congruence whenever possible.

During the same period, the California Cancer Registry was developing a statewide automated system that allowed facilities to report electronically to the state registry system. One region in California was a SEER registry at that time, and many hospitals maintained CoC-accredited programs. To facilitate implementation of standards within its program, the California Cancer Registry requested that SEER and CoC establish a formal committee to pursue data standardization and requested membership on this committee.

The function of that committee was transferred to NAACCR's Uniform Data Standards Committee, established in 1987. Membership was expanded to include all of the major standard-setting organizations, representation from registry software vendors, and central registries. This group has made enormous progress toward standardization and provides a national forum for discussing data issues and reaching consensus on data standards. A major success occurred when all participating groups agreed to implement the Second Edition of ICD-O simultaneously for tumors diagnosed in 1992 and later. In 1993, NAACCR convened a multidisciplinary conference to address the issue of collecting data on pre-invasive cervical neoplasia, resulting in specific recommendations for member registries to cease collection of cervical carcinoma in situ.

CDC added another strong voice for standardization. CDC requires that the registries funded by NPCR use standard data items and codes. CDC provides quality control activities for participants in NPCR and has facilitated the setting of standards and encouraged their adoption.

Many NAACCR sponsoring organizations, including NCI SEER, CDC, and CoC, recognized increasing standardization as an essential step in decreasing the costs associated with data collection, making more efficient use of the increasingly limited human resources needed for data collection, management, and analysis, and obtaining more useful data that could be compared across registries and geographic areas.

Despite the progress made toward standardization and the agreement that standardization is desirable, implementation is not uniform. For example, SEER and CoC will continue to publish separate coding manuals on different schedules.

NAACCR hopes that documenting existing standards, recommending standards where they do not yet exist, and publishing the results in a concise and authoritative form will enable registries and software providers to move forward in achieving comparable data that can be widely used.