AlzRisk Glossary

**Age at Start of Follow-Up**

A study participant’s age at the start of follow-up is his or her age when first assessed by the investigators (see “Average Follow-up Time”). Sometimes the start of follow-up coincides with the establishment of the cohort and enrollment of participants. Often, however, follow-up begins when the investigators first assess the participant for AD or for the risk factor of interest (or both), and this could occur well after the cohort has been in place. For example, suppose the Acme Cohort was established in 1985, when a given participant was 67 years old. In 1995, the investigators assessed participants’ current diabetes status and followed their development of AD from that time forward. Thus, for a study of diabetes and incident AD, this participant’s age at the start of follow-up was 77.

**Average Follow-up Time**

The follow-up time for any given participant begins when he or she is first assessed by the investigators. Sometimes the start of follow-up coincides with the establishment of the cohort and enrollment of participants. Often, however, follow-up begins when the investigators first assess the participant for the risk factor of interest, and this could occur well after the cohort has been in place. For example, suppose the Acme Cohort was established in 1985. In 1995, the investigators assessed participants’ current overweight/obesity status and followed their development of AD from that time forward. Thus, for a study of body mass and incident AD, follow-up began in 1995. In other settings, participants’ risk factor status has been known for some time, but follow-up for the study of the risk factor and AD begins when the investigators first evaluate participants for AD.

In studies that estimate incidence rates, incidence rate ratios (IRRs) or hazard ratios (HRs), follow-up for any given participant generally ends when the participant develops AD, when the participant dies, when the participant is lost (e.g., moves out of the study area, cannot be located, or is otherwise unable to participate), or when the investigators stop follow-up (administrative censoring).

The unit of follow-up in such a study is the person-year. A study’s total person-years is the sum of all at-risk years of follow-up across all participants. In AlzRisk, we occasionally derive average follow-up time from the reported number of person-years and the number of participants in the study (person-years/persons).
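As a minimal sketch of the person-years/persons calculation described above, the following Python function divides total person-years by the number of participants. The function name and the figures are hypothetical and are not taken from any study in AlzRisk.

```python
def average_follow_up_years(person_years: float, participants: int) -> float:
    """Average follow-up time = total person-years / number of participants."""
    if participants <= 0:
        raise ValueError("participants must be a positive integer")
    return person_years / participants


# Hypothetical study: 12,500 person-years accrued among 2,500 participants.
print(average_follow_up_years(12_500, 2_500))  # 5.0 years per participant
```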

**Covariates and Analysis Type**

*Covariates*

In evaluating the relation of an exposure to AD, investigators account for factors that might explain the observed association. In particular, we are concerned that factors which affect both the exposure and AD risk could induce misleading results. To take a common example, for most exposures, we would be reluctant to interpret at face value an association that was not adjusted for age, since many exposures vary by age, and advancing age is a strong predictor of AD (i.e., age confounds many exposure-AD associations). A study might “adjust” for a covariate in one or more ways. A common approach is to include the covariate as an independent variable in the regression model of the exposure and AD. An approach used in many older studies is to conduct analyses within strata (levels) of the covariate and then compute a weighted average of the stratum-specific results. In some studies that use Cox proportional hazards regression, investigators use age, instead of time in study, as the analytical time metric, which amounts to very fine stratification by age. Studies that restrict their enrollment to persons of the same sex (e.g., only men, as in the Honolulu-Asia Aging Study) inherently account for sex. In the “Covariates and Analysis” section of the Risk Factor Overview table, we display covariates that were included as variables in a study’s regression model or that were used as the basis for performing stratified analyses. We do not include factors that were the basis for restricting inclusion in the study.

It is common for studies to report results that are adjusted for different sets of covariates. For example, a study of type 2 diabetes and AD might report a relative risk that is adjusted for age, sex, and education, and then another relative risk that is further adjusted for body mass index, hypertension, physical activity, and smoking. We record all results in AlzRisk but generally report the most fully adjusted results in the tables. The limitation of the approach is that some of these analyses may be “overadjusted,” accounting for factors that are potential intermediates between the exposure and AD and thus obtaining attenuated (i.e., dampened) measures of association.

*Analysis Type*

**Cox proportional hazards regression (Cox PH)**

Investigators use Cox PH regression models to evaluate the association of an exposure with cases of AD that develop over the course of observation (i.e., incident cases). Information on the timing of each diagnosis is integral to this type of analysis. Cox PH regression produces the hazard ratio (HR) as its measure of association. Also see Incidence study reporting HRs.

**Logistic regression (Logreg)**

Several different study designs make use of logistic regression models. In a cross-sectional setting, in which investigators examine “prevalent” cases of AD (cases of AD that exist when the exposure is assessed), logistic regression models estimate the prevalence odds ratio (OR) as a measure of association (also see Prevalence study reporting ORs). In the particular prospective setting in which cases of AD develop over the course of a well-defined period—but whose timing is not precisely known—cumulative risk of AD can be assessed. In particular, logistic models estimate the cumulative odds ratio as a measure of association (also see Cumulative incidence study reporting ORs). Because “odds” are somewhat less intuitive than “risk,” some investigators report prevalence ratios (or cumulative incidence ratios), which they derive from the results of their logistic regression models.

**Dx Assessment (Diagnostic Assessment)**

**Alzheimer's disease**

The diagnosis of Alzheimer's disease is typically made by NINCDS-ADRDA criteria (sometimes referred to as the McKhann criteria: McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan M. Clinical diagnosis of Alzheimer's disease: report of the NINCDS-ADRDA Work Group. Neurology 1984;34:939-944), or DSM-III, III-R, or IV criteria (American Psychiatric Association, American Psychiatric Press, 1980, 1987, 1997). These criteria are fundamentally similar, and any subtle differences are unlikely to affect the research conclusions. All require dementia (see “Dementia”) with an insidious onset and progressive course, and the absence of other conditions that might better account for the dementia (e.g., stroke, endocrine deficiency, infection). The diagnosis can be made by a clinician based on a thorough examination and work-up, and, when performed in a research setting, is 80-90% predictive of a pathological diagnosis of AD at autopsy.

**Dementia**

The diagnosis of dementia is most commonly made by DSM-III, III-R, or IV criteria, which differ only slightly. All require a decline in memory and at least one other cognitive domain, and a significant impairment in function in everyday life. The impairment must be acquired (to distinguish it from mental retardation). The prevalence of dementia rises steeply with age. By far the most common dementia is Alzheimer's disease, often mixed to a greater or lesser extent with vascular factors. A slowly progressive course makes the diagnosis of AD more likely, and the term Progressive Dementing Disorder is sometimes used to refer to an AD-like dementia for which a full work-up is unavailable.

**Effect Size**

The term “effect size” is shorthand for the more accurate “size of the measure of association” and is not intended to imply causality.

To describe the observed association between an exposure (a risk factor) and AD, studies in AlzRisk estimate a measure of association. Most studies in the AlzRisk database report multiplicative measures of association (i.e., ratios). Generically speaking, these measures are the risk of AD in those with the exposure divided by the risk of AD in those without the exposure. The “effect size” refers to the magnitude of this ratio. If the risk of AD is the same in the exposed and non-exposed, the relative risk is 1.0. As the risk among the exposed grows larger relative to that in the non-exposed, the relative risk becomes progressively larger than 1.0, with no theoretical maximum. Conversely, if the risk of AD among the exposed is smaller than among the non-exposed, the relative risk is smaller than 1.0, and as the risk among the exposed diminishes relative to that in the non-exposed, the relative risk becomes progressively smaller than 1.0, approaching but never reaching zero. Suppose a report examines AD risk in a group of “exposed” and “non-exposed” persons and estimates a relative risk of 1.50 for the exposed group. Then we say that the risk of AD in the exposed is 50 percent greater than that in the non-exposed (or 1.5 times as high). Likewise, for a relative risk of 0.75, we say that the risk of AD in the exposed is 25 percent smaller than among the non-exposed (or 0.75 times the risk).
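The arithmetic in the examples above can be sketched in Python as follows. The helper name `percent_difference` is an illustrative assumption, not an AlzRisk function; it simply restates a relative risk as a percent difference from the reference group.

```python
def percent_difference(relative_risk: float) -> float:
    """Express a relative risk as a percent difference from the reference group.

    Positive values mean higher risk in the exposed; negative values mean lower risk.
    """
    if relative_risk <= 0:
        raise ValueError("a relative risk must be greater than zero")
    return (relative_risk - 1.0) * 100.0


print(percent_difference(1.50))  # 50.0  -> risk is 50 percent greater in the exposed
print(percent_difference(0.75))  # -25.0 -> risk is 25 percent smaller in the exposed
print(percent_difference(1.00))  # 0.0   -> no difference in risk
```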

Understanding the “reference group” (or reference level) and the units of the exposure is critical to the interpretation of the effect size. The “reference group” (or reference level) is the exposure group (or level) to which all other exposure groups are compared. In the examples above, the “reference group” comprised non-exposed persons. In our table on diabetes mellitus, we compare the AD risk among those having diabetes with the AD risk among those who do not have diabetes (the reference group). The relative risk for the reference group will always be 1.0, since this is the division of the reference group’s risk by itself. For exposures evaluated along a continuum (e.g., systolic blood pressure), the “reference group” is not a well-defined exposure state. Instead, we interpret the effect size in terms of a relative risk per unit increment in the exposure. For example, an investigation might report a rate ratio of 1.10 per 5 mm Hg in systolic blood pressure (i.e., a 10 percent increase in rate per 5 mm Hg). To optimize the comparability of results in a given table on a Risk Factor Overview page, we strive to present results so that the reference group is as similar as possible across the studies. Similarly, for tables devoted to continuously evaluated exposures, we try to present effect sizes that reflect the same increment in exposure. Occasionally, these efforts require us to mathematically transform an investigation’s reported findings, and we indicate where we have done this.
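One common way to re-express a per-increment ratio at a different increment is sketched below. It assumes that the ratio was estimated from a model that is linear in the exposure on the logarithmic scale, so that the ratio per k units equals the per-unit ratio raised to the power k. This is an illustration under that assumption, not necessarily the transformation applied to any particular AlzRisk entry, and the function name is hypothetical.

```python
def rescale_ratio(ratio: float, reported_increment: float, target_increment: float) -> float:
    """Re-express a ratio reported per `reported_increment` units of exposure
    as a ratio per `target_increment` units, assuming a log-linear relation."""
    if ratio <= 0 or reported_increment <= 0:
        raise ValueError("ratio and reported_increment must be positive")
    return ratio ** (target_increment / reported_increment)


# A rate ratio of 1.10 per 5 mm Hg, re-expressed per 10 mm Hg of systolic blood pressure.
print(round(rescale_ratio(1.10, 5, 10), 2))  # 1.21
```

The same exponent would be applied to each confidence limit, since the limits are on the same multiplicative scale as the estimate.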

**Ethnicity**

We use this term broadly to describe the ethnic and racial characteristics of the study population. Studies generally assess these traits in their participants via self-identification. Occasionally, a paper does not report ethnicity data for its study population, and in such cases we consult other published reports on the study population to obtain approximations of its ethnic and racial composition. We indicate in the detail links when we have obtained this information from other sources.

**Exposure**

In AlzRisk, we use "exposure" and "risk factor" interchangeably to refer to factors that putatively predict AD risk. All exposures in AlzRisk are presumed to be non-genetic unless stated otherwise. However, we appreciate that some exposures may be under genetic influence.

**Exposure Distribution**

The exposure distribution indicates the fractions of the study population at specific levels of an exposure (i.e., risk factor) of interest. For exposures that are assessed as continuous entities, such as systolic blood pressure, we describe the distribution in terms of means, standard deviations, and ranges. Information about the exposure distribution in an investigation report indicates the investigation’s capacity for evaluating the association between the exposure and AD. This capacity will be limited if the exposure is rare in the study population, unless the study population is very large. Using diabetes mellitus as an example, with all other factors being equal, a study in which 20 percent of participants have diabetes will have more capacity to detect an association with AD than a study in which 5 percent of participants have diabetes.
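To illustrate why a rarer exposure limits a study's capacity, the sketch below compares the approximate standard error of the log risk ratio in two hypothetical cohorts that differ only in the prevalence of diabetes. The cohort size, baseline risk, and risk ratio are invented for illustration, and the calculation uses expected cell counts with the usual large-sample formula for the standard error of a log risk ratio.

```python
import math


def se_log_risk_ratio(n_total: float, exposure_prevalence: float,
                      baseline_risk: float, risk_ratio: float) -> float:
    """Approximate standard error of log(risk ratio) from expected counts."""
    n_exposed = n_total * exposure_prevalence
    n_unexposed = n_total - n_exposed
    cases_exposed = n_exposed * baseline_risk * risk_ratio
    cases_unexposed = n_unexposed * baseline_risk
    return math.sqrt(1 / cases_exposed - 1 / n_exposed
                     + 1 / cases_unexposed - 1 / n_unexposed)


# Hypothetical cohorts of 1,000 people, 10 percent baseline AD risk, true risk ratio 1.5.
se_20 = se_log_risk_ratio(1000, 0.20, 0.10, 1.5)  # 20 percent have diabetes
se_05 = se_log_risk_ratio(1000, 0.05, 0.10, 1.5)  # 5 percent have diabetes
print(round(se_20, 2), round(se_05, 2))  # 0.2 0.35
```

The smaller standard error in the first cohort corresponds to a narrower confidence interval for the same underlying association.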

**Measure of Association**

To describe the observed association between an exposure (a risk factor) and AD, studies in AlzRisk use their data to estimate a "measure of association." Most studies in the AlzRisk database report multiplicative measures of association (i.e., ratios). Generically speaking, a multiplicative measure of association expresses the risk of AD in those with the exposure divided by the risk of AD in those without the exposure. Common measures of association include:

**The hazard ratio (HR):**

The HR is the rate of AD among “exposed” persons (cases per at-risk person-time) divided by the rate of AD among “non-exposed” persons. The most commonly used analysis for estimating HRs is Cox proportional hazards regression.

**The incidence rate ratio (IRR):**

The IRR is the rate of AD among “exposed” persons (cases per at-risk person-time) divided by the rate of AD among “non-exposed” persons. Conceptually similar to the HR, estimation of the IRR involves more restrictive assumptions about how the rate of AD changes over time. A commonly used analysis for estimating IRRs is Poisson regression.

**The odds ratio (OR):**

The odds of AD are the probability of having AD divided by the probability of not having AD, so that odds range from zero to infinity. The OR is the odds of AD among “exposed” persons divided by the odds of AD among “non-exposed” persons. ORs may be used in prevalence studies, where the OR is the prevalence OR. ORs are also commonly used to compare the incident cases that have accrued over a defined interval, as in a cumulative incidence study, where the OR is the cumulative OR. Logistic regression analyses estimate ORs.

**The standardized morbidity ratio (SMR):**

The SMR compares the AD occurrence in an “exposed” population with that of a reference (or standard) population. The standard population is assumed to be “non-exposed” or at least to have an exposure distribution that varies dramatically from that of the “exposed” population. SMR computations often adjust for age and sex by using data that are stratified on these variables (for age, usually five- to 10-year age groups). An SMR can be computed for AD prevalence, AD risk (cumulative incidence), or AD rates.
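The definitions above can be made concrete with a small sketch that computes crude (unadjusted) ratios from counts. All counts and function names here are hypothetical; published studies generally estimate these measures from regression models that adjust for covariates, which this sketch does not do.

```python
def risk_ratio(cases_exp: int, n_exp: int, cases_unexp: int, n_unexp: int) -> float:
    """Cumulative incidence (risk) ratio: risk in exposed / risk in non-exposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def odds_ratio(cases_exp: int, n_exp: int, cases_unexp: int, n_unexp: int) -> float:
    """Odds ratio: odds of AD in exposed / odds of AD in non-exposed."""
    odds_exp = cases_exp / (n_exp - cases_exp)
    odds_unexp = cases_unexp / (n_unexp - cases_unexp)
    return odds_exp / odds_unexp


def rate_ratio(cases_exp: int, py_exp: float, cases_unexp: int, py_unexp: float) -> float:
    """Incidence rate ratio: cases per person-year in exposed / in non-exposed."""
    return (cases_exp / py_exp) / (cases_unexp / py_unexp)


# Hypothetical cohort: 30 AD cases among 200 exposed persons (900 person-years),
# 80 AD cases among 800 non-exposed persons (3,800 person-years).
print(round(risk_ratio(30, 200, 80, 800), 2))   # 1.5
print(round(odds_ratio(30, 200, 80, 800), 2))   # 1.59
print(round(rate_ratio(30, 900, 80, 3800), 2))  # 1.58
```

In this example the odds ratio is somewhat farther from 1.0 than the risk ratio, which is typical when the outcome is not rare.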

**Ninety-five Percent Confidence Interval (95 Percent CI)**

The 95 percent confidence interval (95 percent CI) indicates the statistical precision of the estimated measure of association. The narrower the interval, the more precise the estimate. Generally, a larger number of cases makes for a more precise estimate. The 95 percent CI always contains the estimated measure of association. Moreover, the estimated measure of association falls roughly in the center of the 95 percent CI (for multiplicative measures, whose intervals are usually symmetric on the logarithmic scale, the estimate lies somewhat closer to the lower limit). For multiplicative measures of association (e.g., odds ratios, prevalence ratios, cumulative incidence ratios (also called risk ratios), incidence rate ratios, hazard ratios, and standardized morbidity ratios), a 95 percent CI that does not contain 1.0 designates a “statistically significant” finding. The 95 percent CI is consistent with the p-value for the measure of association; specifically, with some exceptions, when the 95 percent CI does not contain 1.0, the p-value is less than 0.05. The advantage of the 95 percent CI over the p-value is that it identifies a range of results that are consistent with the study data.

The 95 percent CI is often interpreted as a measure of “confidence” in the results, but its statistical meaning is far more exacting. Statistically speaking, the 95 percent CI is defined as follows. Suppose that we repeatedly sampled the study population from the larger real (or theoretical) population of all eligible persons, and for each of these samples, we used the same number of participants and, using identical methods for all samples, computed the measure of association between the risk factor and AD and its 95 percent CI. Then 95 percent of these confidence intervals would contain the true level of association (assuming appropriate statistical models and no other biases).
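As a hedged illustration of how a 95 percent CI for a ratio measure is commonly constructed, the sketch below builds a Wald-type interval on the logarithmic scale and then exponentiates the limits. The hazard ratio and standard error are hypothetical, and individual studies may use other interval methods.

```python
import math


def ratio_confidence_interval(estimate: float, se_log: float, z: float = 1.96):
    """Wald-type CI for a ratio measure, computed on the log scale.

    `se_log` is the standard error of log(estimate); z = 1.96 gives a 95 percent CI.
    """
    log_estimate = math.log(estimate)
    lower = math.exp(log_estimate - z * se_log)
    upper = math.exp(log_estimate + z * se_log)
    return lower, upper


# Hypothetical hazard ratio of 1.25 with a standard error of 0.10 on the log scale.
lower, upper = ratio_confidence_interval(1.25, 0.10)
print(round(lower, 2), round(upper, 2))  # 1.03 1.52
```

Because the interval is symmetric on the log scale, the estimate is the geometric mean of the two limits and sits slightly closer to the lower limit than to the upper one.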

**P-value**

The p-value is a probability and therefore ranges between 0 and 1. The probability it expresses is contingent on the assumption that the risk factor is not truly related to AD (the “null” condition). The p-value is the probability, under this “null” condition, that the observed measure of association, or a more extreme one, could have emerged from the data. Therefore, a very small p-value indicates that the observed results would be rare given that no true association exists (and assuming the study design and analysis are free of other biases). Suppose, for example, that a paper on diabetes reports a hazard ratio (HR) of 1.25 with a corresponding p-value of 0.01. This means that, in the condition in which there is no true association between diabetes and AD, the probability that we could have observed an HR of 1.25 (or greater) in these data is 0.01 (or 1 out of 100), indicating that such a finding would be rare under “null” conditions. Generally, a larger number of cases and a larger effect size make for a smaller p-value.

The p-values reported in AlzRisk are “two-sided,” the more conservative form and the convention in the modern epidemiologic literature. This means that the p-value reflects probabilities on both sides of the null point (that is, both beneficial and adverse). Using the example above, a p-value of 0.01 is the probability, under no association, that an HR at least as great as 1.25 or at least as small as 0.8 (1/1.25) could have been observed.

The p-value is consistent with the 95 percent confidence interval for a given estimated measure of association. Specifically, with some exceptions, when the p-value <0.05, the 95 percent CI does not contain 1.0.

Importantly, the p-value is *not* the probability that the results are true (or false).
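The two-sided calculation described in this entry can be sketched as follows, assuming a Wald-type test in which the log of the ratio divided by its standard error is approximately standard normal under the null condition. The standard error used here is hypothetical, chosen so that an HR of 1.25 gives a p-value of about 0.01.

```python
import math


def two_sided_p_value(ratio: float, se_log: float) -> float:
    """Two-sided Wald p-value for a ratio measure against the null value of 1.0."""
    z = math.log(ratio) / se_log
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


se_log = 0.0866  # hypothetical standard error of log(HR)
print(round(two_sided_p_value(1.25, se_log), 3))  # 0.01
print(round(two_sided_p_value(0.80, se_log), 3))  # 0.01 (0.8 = 1/1.25 is equally extreme)
```

An HR of 1.25 and an HR of 0.8 are equally far from 1.0 on the log scale, which is why they yield the same two-sided p-value.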

**Study Type**

We distinguish studies on the basis of three characteristics.

1. *Whether the study investigators randomly assigned the exposure to participants (i.e., a randomized controlled trial [RCT]) or the investigators observed the exposure as it was adopted by or occurred naturally among study participants without intervention from the investigators (i.e., an observational study).* Results from observational studies make up nearly all of the AlzRisk data. Thus, we indicate only where results have come from an RCT; all other results are from observational studies. For example, "RCT, cumulative incidence study reporting odds ratios" (see study types below).

2. The study's *use of the rate at which new AD cases develop over time (i.e., incidence rate) or the percentage of participants who develop AD over a fixed period (i.e., cumulative incidence fraction)*.

3. The *measure of association* the study uses to estimate the relation of the risk factor to AD.

Common study types on AlzRisk:

**Cumulative incidence study reporting odds ratios (ORs):**

This type of study evaluates cases of AD that develop over the course of a well-defined period among a population of people who do not have AD at the start of observation. The measure of association is the cumulative odds ratio.

**Incidence study reporting hazard ratios (HRs):**

This type of study evaluates cases of AD that develop among a population of people who do not initially have the condition, over a given observation period, with attention to the timing of the diagnosis. The measure of association is the hazard ratio (HR).

**Incidence study reporting standardized morbidity ratios (SMRs):**

This type of study evaluates the rate of AD within a study population of interest (the "exposed" population) and compares this rate with the rate of AD in the general ("non-exposed") population. The measure of association is the standardized morbidity ratio (SMR).