Clinical Epidemiology, Volume 10
Attributing diseases to multiple pathways: a causal-pie modeling approach
Received 18 December 2017
Accepted for publication 8 March 2018
Published 27 April 2018 Volume 2018:10 Pages 489–497
DOI https://doi.org/10.2147/CLEP.S160205
Checked for plagiarism Yes
Review by Single anonymous peer review
Peer reviewer comments 3
Editor who approved publication: Professor Irene Petersen
Christine Chen,^{1} Wen-Chung Lee^{1,2}
^{1}Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan; ^{2}Research Center for Genes, Environment and Human Health, College of Public Health, National Taiwan University, Taipei, Taiwan
Abstract: Characterizing the relations between exposures and diseases is the central tenet of epidemiology. Researchers may want to evaluate exposure-disease causation by assessing whether the disease under concern is induced by the various exposures – the so-called “attribution”. In this paper, the authors propose a method to attribute diseases to multiple pathways based on the causal-pie model. The method can also be used to evaluate the potential impact of an intervention strategy and to allocate responsibility in tort-law liability issues.
Keywords: epidemiological methods, attribution, disease pathways, causal-pie model
Introduction
Characterizing the relations between exposures and diseases is the central tenet of epidemiology. Epidemiologists may be interested in knowing the influence of a single exposure on a disease (using effect measures such as risk difference, risk ratio, and odds ratio) or the total influence of multiple exposures on the disease. They may also be interested in knowing any possible interaction between exposures. Through epidemiological studies, the complex relations between multiple exposures and a disease can be clarified.^{1}
Attention has also been given to the “processes”, “pathways”, or “mechanisms” themselves, through which an exposure brings about the disease. For example, one may want to know whether the causal relationship between an exposure and a disease is mediated by a specific “mediator”. If so, the influence of the exposure on the disease can be decomposed: the “indirect effect” is the effect mediated by the mediator, and the “direct effect” is the one not mediated by it. A statistical method that can decompose the exposure effect is structural equation modeling (SEM).^{2–4} Effect decomposition in SEM is straightforward; the effect pertaining to a specific pathway is simply the product of the path coefficients of the traveled paths. For the indirect effect, we sum up the effects of those pathways that pass through the mediator, and for the direct effect, those that do not pass through it. However, this only works for a continuous mediator and a continuous disease. The methods proposed by Robins and Greenland,^{5} Pearl,^{6} and VanderWeele^{7} are more general. These methods can accommodate any variable type and can cope with exposure–mediator interactions and nonlinear relations between variables.
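The product rule for SEM effect decomposition can be illustrated with a minimal sketch. The three path coefficients below are hypothetical values chosen only for illustration (they do not come from this paper), and the model assumed is the simplest linear one with a single exposure, a single continuous mediator, and a continuous outcome.

```python
# Hypothetical path coefficients for a linear SEM (illustrative values only).
coef_e_to_m = 0.5   # exposure -> mediator
coef_m_to_d = 0.4   # mediator -> disease
coef_e_to_d = 0.3   # exposure -> disease, not passing through the mediator

# Effect of a pathway = product of the path coefficients along it.
indirect_effect = coef_e_to_m * coef_m_to_d   # pathway E -> M -> D
direct_effect = coef_e_to_d                   # pathway E -> D
total_effect = direct_effect + indirect_effect

print(f"indirect={indirect_effect:.2f}, direct={direct_effect:.2f}, total={total_effect:.2f}")
```

With more than one mediated pathway, the indirect effect would be the sum of such products over all pathways that pass through the mediator.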
The aforementioned methods evaluate exposure–disease causation by going from exposures to a disease. Sometimes we may be interested in backward induction by assessing whether the disease under concern is induced by the various exposures – the so-called “attribution”. (Note that the attribution here is based on epidemiologic data,^{8–14} and should not be confused with attribution in social psychology where the human perception of causations is in focus.)^{15} For example, when planning intervention strategies, policymakers may want to compare the effectiveness of various intervention programs directed at removing different exposures in the population. In this case, we need to know the proportion of disease that was induced by each exposure. As another example, in some tort litigation, the court is concerned about the contribution of a specific exposure to the disease occurrence of the plaintiff. If probabilistic apportionment of causal responsibility^{16,17} is adopted, the court needs to know the probability that the occurrence of the disease was induced by this exposure. In situations like these, we can use indices such as the attributable fraction^{8–12} and the causal-pie weight^{13,14} for attribution. When there are multiple exposures, a summation of the attributable fractions for all exposures may exceed 100%. Clearly, this makes no sense, and the index needs some rectifications.^{18–23} When there are multiple exposures, one can compute a panel of causal-pie weights (summing up to 100%) for the individual effects of each and every exposure as well as the interactive effects between them. However, neither the attributable fraction nor the causal-pie weight takes disease pathways into account.
In this paper, we propose a method to attribute diseases to multiple pathways based on the causal-pie model.^{1,24} The method can also be used to evaluate the potential impact of an intervention strategy and to allocate responsibility in tort-law liability issues.
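To see why attributable fractions for several exposures can add up to more than 100%, consider a minimal sketch with two hypothetical exposures, A and B, that cause the disease only when both are present (pure interaction). The risks below are invented for illustration, and the attributable fraction is computed among the doubly exposed.

```python
def attributable_fraction(risk_with, risk_without):
    """Fraction of the risk that disappears when one exposure is removed."""
    return (risk_with - risk_without) / risk_with

# Hypothetical risks indexed by (A, B): disease arises only when both are present.
risk = {(1, 1): 0.10, (1, 0): 0.0, (0, 1): 0.0, (0, 0): 0.0}

af_a = attributable_fraction(risk[(1, 1)], risk[(0, 1)])  # remove A, keep B
af_b = attributable_fraction(risk[(1, 1)], risk[(1, 0)])  # remove B, keep A
af_sum = af_a + af_b

print(f"AF(A)={af_a:.0%}, AF(B)={af_b:.0%}, sum={af_sum:.0%}")
```

Removing either exposure prevents every case, so each exposure is credited with all of the disease and the two fractions together count the same cases twice.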
Methods
Relations between an exposure, a mediator, and a disease
A “directed acyclic graph” (DAG)^{1,25,26} is used to depict the causal relations between an exposure (E) and a disease (D), which can be mediated by a mediator (M) (Figure 1). Causality (also referred to as causation, or cause and effect) is a process (arrows in Figure 1) that connects one set of variables (the “causes” or “risk factors”) with another set of variables (the “effects” or “outcomes”), where the first is partly responsible for the second, and the second is partly dependent on the first. An effect (outcome) can, in turn, be a cause (risk factor) for many other effects (outcomes). Note that a DAG depicts a simplified biology, ignoring any feedback loop where an effect can feed back to the same cause that leads to the very effect in the first place.
Figure 1 The two paths for the M-stage and four paths for the D-stage. Abbreviations: D, disease; E, exposure; M, mediator.
We consider the exposure, the disease, and the mediator as dichotomous variables. We call the occurrence of the mediator the M-stage (paths M1 and M2) and the occurrence of the disease the D-stage (paths D1, D2, D3, and D4) (Figure 1). Note that to indicate “interaction”, we allow two DAG arrows to meet and merge before pointing at the same variable, such as the D4 path in Figure 1.
A causal-pie model for mediator and disease
We follow the causal-pie framework for mediator and disease proposed by Hafeman.^{27} We invoke the “sufficient-cause positive monotonicity assumption” at the individual level, that is, the effects of the exposure on the mediator and on the disease as well as the effect of the mediator on the disease, if any, can only be harmful and cannot be preventive for any individual.^{1,13,14,27,28} In Figure 2, under the assumption, there is a total of six classes of causal pies – two for the M-stage and four for the D-stage. The two causal-pie classes for the M-stage are: 1) a causal-pie class not containing E as its component and 2) a causal-pie class containing E as its component (1 and 2 correspond to paths M1 and M2 in Figure 1, respectively). The four causal-pie classes for the D-stage are: 3) a causal-pie class containing neither E nor M as its component, 4) a causal-pie class containing E but not M as its component, 5) a causal-pie class containing M but not E as its component, and 6) a causal-pie class containing both E and M as its components (3, 4, 5, and 6 correspond to paths D1, D2, D3, and D4 in Figure 1, respectively).
Figure 2 The six causal-pie classes for the M-stage and the D-stage. Notes: (A) M-stage; (B) D-stage. Abbreviations: D, disease; E, exposure; M, mediator; U, unknown components.
Aside from the exposure and the mediator, each causal-pie class contains a distinct constellation of unknown components. We denote these by U – the U_{M1}, U_{M2}, U_{D1}, U_{D2}, U_{D3}, and U_{D4}, respectively, in Figure 2. When all components in a causal pie appear, the causal pie is completed, and the corresponding mediator or disease is meant to occur. The arrivals of the unknown components (U) are random events. When the U of a particular causal pie arrives and the other component(s) (E, M, or both), if any, in the causal pie all exist, the causal pie is completed, and as mentioned previously, the corresponding mediator or disease occurs. Otherwise, the U departs, and the completion of this causal pie is contingent on the same U arriving again.
Disease pathways
An individual can follow the paths depicted in Figure 1 to become diseased. A total of six distinct disease pathways can thus be identified (Figure 3):
 Pathway I: The exposure causes the disease directly (D2).
 Pathway II: The exposure causes the mediator, which in turn causes the disease (M2D3).
 Pathway III: The exposure causes the mediator, and then both interact to cause the disease (M2D4).
 Pathway IV: The exposure and an exogenous mediator interact to cause the disease (M1D4).
 Pathway V: An exogenous mediator causes the disease directly (M1D3).
 Pathway VI: Neither the exposure nor the mediator causes the disease (D1).
Figure 3 The six disease pathways. Notes: (A) Pathway I; (B) Pathway II; (C) Pathway III; (D) Pathway IV; (E) Pathway V; (F) Pathway VI. Abbreviations: D, disease; E, exposure; M, mediator. 
Note that these pathways are examined at the individual level rather than at the population level, that is, the same person is followed up throughout for the path(s) he/she had actually taken to eventually become diseased. Therefore, we can distinguish the actual “operation” from the mere “presence” of a mediator.^{29,30} Pathways IV and V (Figure 3) only imply the presence of an “M” in the disease pathway. On the other hand, Pathways II and III (Figure 3) not only imply the presence of the M but also its operation (a person goes from E to M, and then to D), and the M now can truly be called a “mediator”.
Estimation of the causal-pie parameters
We assume that in the follow-up period, the arrival rates of U in the six classes of causal pies, denoted by λ_{M1}, λ_{M2}, λ_{D1}, λ_{D2}, λ_{D3}, and λ_{D4}, respectively, are constant. We also invoke the “no redundancy assumption”,^{28,31,32} that is, for each and every subject in the population, at most one U can arrive in a sufficiently short time interval.
One can conduct a cohort study to estimate the aforementioned six causal-pie parameters – λ_{M1}, λ_{M2}, λ_{D1}, λ_{D2}, λ_{D3}, and λ_{D4}. Suppose that there are n exposed subjects and m unexposed subjects in the cohort. At the start of the follow-up (t=0), all the subjects are mediator- and disease-free. During the follow-up period (from t=0 to t=T), for subjects who contracted the disease, the researcher records their mediator status at the moments they contracted the disease. For subjects who did not contract the disease during the follow-up period, the researcher records their mediator status at the end of the follow-up (t=T). A tally of subjects at the end of the follow-up is shown in Table 1. This dataset has a total of 6 degrees of freedom (2^{2} − 1 = 3 for the exposed subjects and 2^{2} − 1 = 3 for the unexposed), which is equal to the number of the unknown parameters. Therefore, λ_{M1}, λ_{M2}, λ_{D1}, λ_{D2}, λ_{D3}, and λ_{D4} are just identifiable. See Supplementary materials for details of the estimation procedure.
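The estimation step can be sketched as follows. This is not the authors' R code from the Supplementary materials; it is one possible reading of the constant-rate model, in which each exposure arm is treated as a three-state process (free of both, mediator only, diseased) and the rates are chosen so that the model reproduces the observed cell proportions exactly. The function names, the bisection helper, and the assumed order of the four cells in each arm (no mediator and no disease; mediator without disease; disease without mediator; disease with mediator) are assumptions of this sketch. The counts used in the usage line are those quoted in the Example section.

```python
import math

def solve_decreasing(f, target, lo=0.0, hi=50.0, tol=1e-12):
    """Bisection for f(x) = target, where f is decreasing on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def fit_arm(c1, c2, c3, c4, T=1.0):
    """Fit one exposure arm (assumed cell order: c1 = no M, no D; c2 = M, no D;
    c3 = D without M; c4 = D with M).
    Returns (rate of acquiring M, rate of D before M, rate of D after M)."""
    n = c1 + c2 + c3 + c4
    exit_rate = -math.log(c1 / n) / T                  # total rate of leaving the start state
    rate_d_before_m = exit_rate * c3 / (c2 + c3 + c4)  # share of leavers diseased first
    rate_m = exit_rate - rate_d_before_m

    def prob_m_no_d(b):
        # P(mediator acquired and still disease-free at T) when D-after-M rate is b.
        if abs(exit_rate - b) < 1e-12:
            return rate_m * T * math.exp(-b * T)
        return rate_m * (math.exp(-b * T) - math.exp(-exit_rate * T)) / (exit_rate - b)

    rate_d_after_m = solve_decreasing(prob_m_no_d, c2 / n)
    return rate_m, rate_d_before_m, rate_d_after_m

def fit_causal_pie(unexposed, exposed, T=1.0):
    """Map the arm-level rates onto the six causal-pie arrival rates."""
    m0, d_before0, d_after0 = fit_arm(*unexposed, T=T)
    m1, d_before1, d_after1 = fit_arm(*exposed, T=T)
    lam = {
        "M1": m0,
        "M2": m1 - m0,
        "D1": d_before0,
        "D2": d_before1 - d_before0,
        "D3": d_after0 - d_before0,
    }
    lam["D4"] = d_after1 - lam["D1"] - lam["D2"] - lam["D3"]
    return lam

# Cell counts quoted in the Example section (m_1..m_4 unexposed, n_1..n_4 exposed).
rates = fit_causal_pie((9900, 490, 100, 10), (4850, 800, 150, 200))
for name, value in rates.items():
    print(name, round(value, 4))
```

Because the model is just identified, matching the six free cell proportions is enough; with sparse or zero cells this direct inversion would break down and a likelihood-based fit would be more appropriate.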
Attribution, a backward induction process
As pointed out earlier, attribution is a backward induction process, assessing whether the outcome under concern is induced by some variables. Thus, we reverse the direction of the usual DAG arrows in Figure 1 to become the “attribution arrows” (Figures 4 and 5). When an attribution arrow points at a variable (exposure or mediator), it means that the indicated variable is one cause of the outcome (disease or mediator, depending on the point from which the arrow originates). When an attribution arrow points at the exposure and the mediator simultaneously, it means that the exposure and the mediator interact to cause the disease. When an attribution arrow points at nothing, it means that neither the exposure nor the mediator is a cause of the disease (or the mediator).
Figure 4 Formulae for attribution. Abbreviations: D, disease; E, exposure; M, mediator. 
Given the six causal-pie parameters, we can compute the probability for any path (Figure 4). Consider the M-stage first (begin with the “M” in Figure 4 and follow the attribution arrows): an unexposed subject who acquires the mediator during the follow-up can only acquire it through path M1 (probability=1) but not path M2 (probability=0). An exposed subject who acquires the mediator can acquire it either through path M1 or M2, but not both (because of the no redundancy assumption). By Bayes’ theorem (Supplementary materials), the probabilities are λ_{M1}/(λ_{M1}+λ_{M2}) (path M1) and λ_{M2}/(λ_{M1}+λ_{M2}) (path M2), respectively. Next, consider the D-stage (begin with the “D” in Figure 4 and follow the attribution arrows) and also apply Bayes’ theorem. An unexposed subject who acquires the disease but not the mediator during the follow-up can do so only through path D1 (probability=1). An unexposed subject who acquires the disease and the mediator can take either path D1 or D3 (with probabilities λ_{D1}/(λ_{D1}+λ_{D3}) and λ_{D3}/(λ_{D1}+λ_{D3}), respectively). An exposed subject who acquires the disease but not the mediator can take either path D1 or D2 (with probabilities λ_{D1}/(λ_{D1}+λ_{D2}) and λ_{D2}/(λ_{D1}+λ_{D2}), respectively). An exposed subject who acquires the disease and the mediator can take either path D1, D2, D3, or D4 (with probabilities λ_{D1}/(λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4}), λ_{D2}/(λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4}), λ_{D3}/(λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4}), and λ_{D4}/(λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4}), respectively).
Now we can compute the probability for any pathway. First, we note that under the no redundancy assumption, no one can acquire the disease and the mediator at the same time. A subject who acquires both the disease and the mediator during the follow-up must acquire the mediator before acquiring the disease. To calculate the probability for a pathway that straddles an M-stage path and a D-stage path, we simply multiply the two corresponding probabilities for the two paths. Following this multiplication rule, we can attribute the disease to multiple pathways probabilistically for a diseased subject with known exposure and mediator status. For a subject with unknown exposure and/or mediator status or for all the diseased subjects in the population, we can use the cell counts inside the box in Table 1 as the weights (shown underneath each panel in Figure 4) for attribution.
Next, we discuss attribution from three different perspectives: 1) attributing diseases to multiple pathways, 2) evaluating the potential impact of an intervention strategy, and 3) allocating responsibility in tort-law liability issues.
Attributing diseases to multiple pathways
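We can attribute the diseases in the population to the aforementioned six pathways. Let m_{3} and m_{4} denote the numbers of unexposed diseased subjects without and with the mediator, respectively, and n_{3} and n_{4} the corresponding numbers of exposed diseased subjects (the cell counts inside the box in Table 1). The population attributable fractions (PAF), which take into account all the diseased subjects in the population, are:
PAF_{I} = [n_{3} λ_{D2}/(λ_{D1}+λ_{D2}) + n_{4} λ_{D2}/(λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4})] / (m_{3}+m_{4}+n_{3}+n_{4}),
PAF_{II} = [n_{4} λ_{M2}/(λ_{M1}+λ_{M2}) × λ_{D3}/(λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4})] / (m_{3}+m_{4}+n_{3}+n_{4}),
PAF_{III} = [n_{4} λ_{M2}/(λ_{M1}+λ_{M2}) × λ_{D4}/(λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4})] / (m_{3}+m_{4}+n_{3}+n_{4}),
PAF_{IV} = [n_{4} λ_{M1}/(λ_{M1}+λ_{M2}) × λ_{D4}/(λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4})] / (m_{3}+m_{4}+n_{3}+n_{4}),
PAF_{V} = [m_{4} λ_{D3}/(λ_{D1}+λ_{D3}) + n_{4} λ_{M1}/(λ_{M1}+λ_{M2}) × λ_{D3}/(λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4})] / (m_{3}+m_{4}+n_{3}+n_{4}),
and
PAF_{VI} = [m_{3} + m_{4} λ_{D1}/(λ_{D1}+λ_{D3}) + n_{3} λ_{D1}/(λ_{D1}+λ_{D2}) + n_{4} λ_{D1}/(λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4})] / (m_{3}+m_{4}+n_{3}+n_{4}),
for Pathways I, II, …, VI, respectively. It is worth noting that the six PAFs sum to one.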
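The weighting of path probabilities by the diseased cell counts can be sketched in a few lines. The function below multiplies M-stage and D-stage path probabilities for each pathway and divides the expected case counts by the total number of diseased subjects. The rates in the usage example are illustrative values (roughly what a constant-rate fit with T=1 gives for the example cohort under this sketch's assumptions), not figures reported in the paper, and the argument names m3, m4, n3, n4 are assumed labels for the diseased cells of Table 1.

```python
def pathway_pafs(lam, m3, m4, n3, n4):
    """Population attributable fractions for Pathways I..VI.

    lam: dict of arrival rates with keys M1, M2, D1, D2, D3, D4.
    m3, m4: unexposed diseased subjects without / with the mediator.
    n3, n4: exposed diseased subjects without / with the mediator.
    """
    d_all = lam["D1"] + lam["D2"] + lam["D3"] + lam["D4"]
    p_m1 = lam["M1"] / (lam["M1"] + lam["M2"])  # exposed subject got M via path M1
    p_m2 = 1.0 - p_m1                           # ... via path M2
    cases = {
        "I": n3 * lam["D2"] / (lam["D1"] + lam["D2"]) + n4 * lam["D2"] / d_all,
        "II": n4 * p_m2 * lam["D3"] / d_all,
        "III": n4 * p_m2 * lam["D4"] / d_all,
        "IV": n4 * p_m1 * lam["D4"] / d_all,
        "V": m4 * lam["D3"] / (lam["D1"] + lam["D3"]) + n4 * p_m1 * lam["D3"] / d_all,
        "VI": (m3
               + m4 * lam["D1"] / (lam["D1"] + lam["D3"])
               + n3 * lam["D1"] / (lam["D1"] + lam["D2"])
               + n4 * lam["D1"] / d_all),
    }
    total = m3 + m4 + n3 + n4
    return {name: count / total for name, count in cases.items()}

# Illustrative rates (an assumption of this sketch, rounded to four decimals).
example_rates = {"M1": 0.0490, "M2": 0.1360, "D1": 0.0098,
                 "D2": 0.0179, "D3": 0.0303, "D4": 0.3890}
pafs = pathway_pafs(example_rates, m3=100, m4=10, n3=150, n4=200)
for name, value in pafs.items():
    print(f"PAF_{name} = {value:.1%}")
```

Within each exposure-by-mediator panel the pathway probabilities add up to one, which is why the six fractions always sum to one regardless of the rates supplied.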
Evaluating the potential impact of an intervention strategy
We now consider the impact of a specific intervention. We note that if an intervention can block a segment of a pathway (for example, either path M2 or D3, but not necessarily both, of Pathway II), the whole pathway is blocked. To calculate the impact fraction for an intervention, we sum the PAFs for those pathways that are blocked by this intervention.
The impact fractions for three interventions are as follows: 1) A complete removal of the exposure from the population: this would block paths M2, D2, and D4 and therefore Pathways I, II, III, and IV. The impact fraction for this intervention is PAF_{I} + PAF_{II} + PAF_{III} + PAF_{IV}. 2) A complete obstruction of the exposure effect on the mediator: this would block path M2 and therefore Pathways II and III. The impact fraction for this intervention is PAF_{II} + PAF_{III}. 3) A complete obstruction of the mediator effect on the disease: this would block paths D3 and D4 and therefore Pathways II, III, IV, and V. The impact fraction for this intervention is PAF_{II} + PAF_{III} + PAF_{IV} + PAF_{V}.
Allocating responsibility in tort-law liability issues
As pointed out earlier, if probabilistic apportionment of causal responsibility is adopted for tort-law liability issues,^{16,17} the court needs to know the probability that the occurrence of the disease was induced by the particular exposure. We can follow the attribution arrow(s) of a pathway and examine whether the arrow points at the exposure to decide whether the exposure is involved in the pathway. If the attribution arrow of the disease points at the exposure and the mediator simultaneously, the probability that the exposure is involved is taken to be 0.5 (since there is no further information about which path is more likely to be actually taken). But when the attribution arrow of the mediator points again at the exposure, it is then known for certain that the exposure is involved somewhere in the causal chain. Using these rules, the full attributable fractions for Pathways I, II, and III, a half of the attributable fraction for Pathway IV, and none for Pathways V and VI are allocated to the exposure, respectively.
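The rule that blocking any one segment blocks the whole pathway lends itself to a small lookup. In the sketch below, the mapping from pathways to their constituent paths follows the six pathways listed earlier; the function name and the dictionary layout are choices of this sketch, and the PAF values are the rounded figures reported in the Example section.

```python
# Paths (segments) that make up each pathway, as listed under "Disease pathways".
PATHWAY_SEGMENTS = {
    "I": {"D2"},
    "II": {"M2", "D3"},
    "III": {"M2", "D4"},
    "IV": {"M1", "D4"},
    "V": {"M1", "D3"},
    "VI": {"D1"},
}

def impact_fraction(paf, blocked_paths):
    """Sum the PAFs of every pathway that uses at least one blocked path."""
    blocked = set(blocked_paths)
    return sum(paf[name] for name, segments in PATHWAY_SEGMENTS.items()
               if segments & blocked)

# Rounded PAFs from the Example section.
paf = {"I": 0.228, "II": 0.022, "III": 0.278, "IV": 0.100, "V": 0.024, "VI": 0.347}

remove_exposure = impact_fraction(paf, {"M2", "D2", "D4"})
block_exposure_to_mediator = impact_fraction(paf, {"M2"})
block_mediator_to_disease = impact_fraction(paf, {"D3", "D4"})

print(f"{remove_exposure:.1%} {block_exposure_to_mediator:.1%} {block_mediator_to_disease:.1%}")
```

Other interventions can be explored by passing a different set of blocked paths, for example {"D4"} for an intervention that only removes the exposure-mediator interaction.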
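To be precise, for an exposed subject who contracts the disease, the contribution of the exposure to his/her disease – the “attributable fraction among the exposed” (AFE) – is as follows: 1) if the subject does not acquire the mediator during the follow-up (the “E=1, M=0” panel in Figure 4),
AFE^{M=0} = λ_{D2}/(λ_{D1}+λ_{D2}),
2) if the subject acquires the mediator during the follow-up (the “E=1, M=1” panel in Figure 4),
AFE^{M=1} = [λ_{D2} + (λ_{D3}+λ_{D4}) × λ_{M2}/(λ_{M1}+λ_{M2}) + 0.5 × λ_{D4} × λ_{M1}/(λ_{M1}+λ_{M2})] / (λ_{D1}+λ_{D2}+λ_{D3}+λ_{D4}),
and 3) if the mediator status of the subject is unknown,
AFE = (n_{3} × AFE^{M=0} + n_{4} × AFE^{M=1}) / (n_{3}+n_{4}), where n_{3} and n_{4} are the numbers of exposed diseased subjects without and with the mediator, respectively (Table 1).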
Example
We use Richiardi et al’s^{33} cohort data (m_{1}=9900, m_{2}=490, m_{3}=100, m_{4}=10, n_{1}=4850, n_{2}=800, n_{3}=150, and n_{4}=200, using the notations in Table 1) as an example. For this dataset, using Robins and Greenland’s^{5} and Pearl’s^{6} methods, we can decompose the total effect of the exposure on the disease (0.048) into direct effect (0.028) and indirect effect (0.020). Using VanderWeele’s^{7} method, we can further decompose the total effect into four components: controlled direct effect (0.02), reference interaction (0.008), mediated interaction (0.019), and pure indirect effect (0.001). However, we cannot accomplish attribution using these previous methods.
We use the present method to analyze the data (R code in Supplementary materials), which yields estimates of the six causal-pie parameters λ_{M1}, λ_{M2}, λ_{D1}, λ_{D2}, λ_{D3}, and λ_{D4}. Richiardi et al^{33} did not mention the duration of the follow-up in their paper; we therefore assume T=1. Supplementary materials show that assuming a different T changes all six estimates by a constant proportion, so the estimates of the attributable fractions remain the same. Figure 5 presents the path probabilities.
Figure 5 Disease attributions for the example cohort. Abbreviations: D, disease; E, exposure; M, mediator. 
The PAFs for the six pathways are as follows: PAF_{I}=22.8%, PAF_{II}=2.2%, PAF_{III}=27.8%, PAF_{IV}=10.0%, PAF_{V}=2.4%, and PAF_{VI}=34.7%, respectively. The total sum of the six PAFs is 22.8%+2.2%+27.8%+10.0%+2.4%+34.7%=100.0%.
The impact fraction for a complete removal of the exposure from the population is 62.8%, for a complete obstruction of the exposure effect on the mediator is 30.0%, and for a complete obstruction of the mediator effect on the disease is 42.4%.
For an exposed subject who contracts the disease, if the subject does not acquire the mediator during the follow-up, AFE^{M=0}=64.7%; if the subject acquires the mediator during the follow-up, AFE^{M=1}=84.5%; and if the mediator status of the subject is unknown, AFE=76.0%.
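The three AFE values can be approximately reproduced with a short sketch that applies the allocation rules stated earlier (full credit for Pathways I, II, and III, half credit for Pathway IV, none for V and VI). The rates below are illustrative values, roughly what a constant-rate fit with T=1 gives for this cohort under the sketch's assumptions; they are not figures reported in the paper, and the function name is hypothetical.

```python
def afe_values(lam, n3, n4):
    """Attributable fraction among the exposed, by mediator status.

    lam: dict of arrival rates with keys M1, M2, D1, D2, D3, D4.
    n3, n4: exposed diseased subjects without / with the mediator.
    Returns (AFE for M=0, AFE for M=1, AFE with mediator status unknown).
    """
    d_all = lam["D1"] + lam["D2"] + lam["D3"] + lam["D4"]
    p_m1 = lam["M1"] / (lam["M1"] + lam["M2"])
    p_m2 = 1.0 - p_m1
    afe_m0 = lam["D2"] / (lam["D1"] + lam["D2"])
    afe_m1 = (lam["D2"]                                # Pathway I, full credit
              + p_m2 * (lam["D3"] + lam["D4"])         # Pathways II and III, full credit
              + 0.5 * p_m1 * lam["D4"]) / d_all        # Pathway IV, half credit
    afe_unknown = (n3 * afe_m0 + n4 * afe_m1) / (n3 + n4)
    return afe_m0, afe_m1, afe_unknown

# Illustrative rates (an assumption of this sketch, rounded to four decimals).
example_rates = {"M1": 0.0490, "M2": 0.1360, "D1": 0.0098,
                 "D2": 0.0179, "D3": 0.0303, "D4": 0.3890}
afe_m0, afe_m1, afe_unknown = afe_values(example_rates, n3=150, n4=200)
print(f"{afe_m0:.1%} {afe_m1:.1%} {afe_unknown:.1%}")
```

Because the rates are rounded, the printed values may differ from the reported ones in the last digit.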
Discussion
In this paper, we invoke three assumptions for the causal-pie model. The first assumption is the monotonicity assumption.^{1,13,14,27,28} Without this assumption, the number of the causal-pie classes (12; 3 for the M-stage and 9 for the D-stage) will be larger than the degrees of freedom of the data (6), which makes the causal-pie parameters nonidentifiable. Researchers who use the present method should have prior knowledge that the effects of the exposure on the mediator and on the disease and the effect of the mediator on the disease are “monotonic”. To be precise, neither the “no exposure” nor the “no mediator” can be a component of any causal pie. Second, we assume that the arrival rates of the U’s are constant in the follow-up period. When the follow-up time is not too long (for example, less than 5 years), the assumption is reasonable or approximately so. The third assumption is the no redundancy assumption.^{28,31,32} This is a Poisson-like assumption, which is weaker than the assumption of independent competing causes.^{13,14,34} Even though two causal-pie classes have overlapping components, the assumption still holds if the overlapping components are not the last ones to arrive. In addition, the assumption only specifies at most one arrival event of the U’s in an infinitesimally short time interval. Non-rarity of the mediator or the disease for the entire follow-up period by itself does not necessarily imply the violation of the no redundancy assumption.
Controlling for confounding is essential in observational studies. One can stratify the data by the confounders and compute the attributable fractions for each and every stratum. One then uses the count of the diseased subjects in each stratum as the weight to pool the results. This will yield “adjusted” attributable fractions. The present method can also be extended to accommodate other variable types or more general situations. If the exposure or the mediator is multilevel (a continuous variable can be categorized into a multilevel one for an approximation; but caution should be exercised as this may create bias)^{35} – for example, the exposure has a total of k_{1} levels and the mediator, a total of k_{2} levels – under the monotonicity assumption there will be a total of k_{1} × k_{2} causal-pie classes for the D-stage. Furthermore, if the disease has a total of k_{3} subtypes, each with a total of k_{1} × k_{2} causal-pie classes, then there will be a total of k_{1} × k_{2} × k_{3} causal-pie classes. In addition, if there are multiple exposures or multiple mediators (an exposure-induced mediator-disease confounder^{36–38} can be viewed as another mediator; Figure 6), the total number of causal-pie classes will be even larger. It seems rather complex. But if one can conduct a large-scale cohort study and use appropriate statistical models, such as a multistate model,^{39–42} the many causal-pie parameters (or the state transition rates, using the terminology of a multistate model) can be amenable to estimation. Then, one simply follows the present method for attribution.
Figure 6 An exposure-induced mediator-disease confounder as another mediator. Abbreviations: C, confounder; D, disease; E, exposure; M, mediator.
Last but not least, the causal-pie model by itself deserves careful scrutiny. Like the DAG, a causal-pie model depicts an overly simplified biology. But unfortunately, a direct biological modeling of exposure-disease relations considering all physical or chemical reactions among exposures, their metabolites, or their reaction products within individuals is seldom feasible. Previously, Siemiatycki and Thomas^{43} and Thompson^{44} held a pessimistic view that there is a limit of biological inference from epidemiologic data, since a number of very dissimilar mechanisms or models for disease development can often fit the same data equally well. Recently, an emerging interdisciplinary science, the molecular pathological epidemiology (MPE), has come into focus.^{45–47} MPE uses molecular pathology tools to dissect disease pathways and mechanisms at molecular, individual, and population levels. Casting the causal-pie model in the MPE framework is a promising future research direction.
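The pooling step for confounder adjustment can be sketched as a weighted average. The two strata, their diseased counts, and their stratum-specific PAFs below are hypothetical numbers used only to show the weighting; the function name is likewise an assumption of this sketch.

```python
def pooled_pafs(strata):
    """Pool stratum-specific PAFs, weighting by the number of diseased subjects.

    strata: list of (n_diseased, {pathway: PAF}) pairs, one per confounder stratum.
    """
    total_diseased = sum(n for n, _ in strata)
    pathways = strata[0][1].keys()
    return {p: sum(n * paf[p] for n, paf in strata) / total_diseased
            for p in pathways}

# Hypothetical strata (for example, two age groups), with only two pathways shown.
strata = [
    (120, {"I": 0.30, "VI": 0.70}),
    (80, {"I": 0.10, "VI": 0.90}),
]
adjusted = pooled_pafs(strata)
print({p: round(v, 3) for p, v in adjusted.items()})
```

Since every stratum's PAFs sum to one and the weights are the diseased counts, the adjusted PAFs also sum to one.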
Acknowledgments
This paper is partly supported by grants from the Ministry of Science and Technology, Taiwan (MOST 105-2314-B-002-049-MY3 and MOST 104-2314-B-002-118-MY3). No additional external funding was received for this study. The funder had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Disclosure
The authors report no conflicts of interest in this work.
References
Rothman KJ, Greenland S, Lash TL, editors. Modern Epidemiology. 3rd ed. Philadelphia, PA: Lippincott Williams & Wilkins; 2008.  
Bollen KA. Structural Equations with Latent Variables. New York, NY: John Wiley & Sons; 1989.  
Kaplan D. Structural Equation Modeling: Foundations and Extensions. 2nd ed. Thousand Oaks, CA: SAGE; 2009.  
Kline RB. Principles and Practice of Structural Equation Modeling. 4th ed. New York, NY: Guilford; 2015.  
Robins JM, Greenland S. Identifiability and exchangeability for direct and indirect effects. Epidemiology. 1992;3(2):143–155.  
Pearl J. Direct and Indirect Effects: Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, 2001. San Francisco, CA: Morgan Kaufmann; August 2–5, 2001; Seattle, Washington.
VanderWeele TJ. A unification of mediation and interaction: a 4-way decomposition. Epidemiology. 2014;25(5):749–761.
Cole P, MacMahon B. Attributable risk percent in case-control studies. Brit J Prev Soc Med. 1971;25(4):242–244.
Miettinen OS. Proportion of disease caused or prevented by a given exposure, trait or intervention. Am J Epidemiol. 1974;99(5):325–332.  
Walter SD. The estimation and interpretation of attributable risk in health research. Biometrics. 1976;32(4):829–849.  
Bruzzi P, Green SB, Byar DP, Brinton LA, Schairer C. Estimating the population attributable risk for multiple risk factors using case-control data. Am J Epidemiol. 1985;122(5):904–914.
Benichou J. A review of adjusted estimators of attributable risk. Stat Methods Med Res. 2001;10(3):195–216.  
Liao SF, Lee WC. Weighing the causal pies in case-control studies. Ann Epidemiol. 2010;20(7):568–573.
Lee WC. Completion potentials of sufficient component causes. Epidemiology. 2012;23(3):446–453.  
Kelley HH, Michela JL. Attribution theory and research. Ann Rev Psychol. 1980;31:457–501.  
Wright RW. Causation in tort law. California Law Review. 1985;73:1735–1828.  
Robinson GO. Probabilistic causation and compensation for tortious risk. J Legal Stud. 1985;14:779–798.  
Eide GE, Gefeller O. Sequential and average attributable fractions as aids in the selection of preventive strategies. J Clin Epidemiol. 1995;48(5):645–655.  
Land M, Gefeller O. A game-theoretic approach to partitioning attributable risks in epidemiology. Biom J. 1997;39:777–792.
Land M, Vogel C, Gefeller O. Partitioning methods for multifactorial risk attribution. Stat Methods Med Res. 2001;10(3):217–230.  
McElduff P, Attia J, Ewald B, Cockburn J, Heller R. Estimating the contribution of individual risk factors to disease in a person with more than one risk factor. J Clin Epidemiol. 2002;55(6):588–592.  
Llorca J, Delgado-Rodríguez M. A new way to estimate the contribution of a risk factor in populations avoided nonadditivity. J Clin Epidemiol. 2004;57(5):479–483.
Rabe C, Lehnert-Batar A, Gefeller O. Generalized approaches to partitioning the attributable risk of interacting risk factors can remedy existing pitfalls. J Clin Epidemiol. 2007;60(5):461–468.
Rothman KJ. Causes. Am J Epidemiol. 1976;104(6):587–592.  
Greenland S, Pearl J, Robins JM. Causal diagrams for epidemiologic research. Epidemiology. 1999;10(1):37–48.  
Pearl J. Causality: models, reasoning, and inference. 2nd ed. New York, NY: Cambridge University Press; 2009.  
Hafeman DM. A sufficient cause based approach to the assessment of mediation. Eur J Epidemiol. 2008;23(11):711–721.  
Suzuki E, Yamamoto E, Tsuda T. On the relations between excess fraction, attributable fraction, and etiologic fraction. Am J Epidemiol. 2012;175(6):567–575.  
VanderWeele TJ. Mediation and mechanism. Eur J Epidemiol. 2009;24(5):217–224.  
Suzuki E, Yamamoto E, Tsuda T. Identification of operating mediation and mechanism in the sufficient-component cause framework. Eur J Epidemiol. 2011;26(5):347–357.
Gatto NM, Campbell UB. Redundant causation from a sufficient cause perspective. Epidemiol Perspect Innov. 2010;7:5.  
Lee WC. Assessing causal mechanistic interactions: a peril ratio index of synergy based on multiplicativity. PLoS One. 2013;8(6):e67424.  
Richiardi L, Bellocco R, Zugna D. Mediation analysis in epidemiology: methods, interpretation and bias. Int J Epidemiol. 2013;42(5):1511–1519.  
Robins JM, Greenland S. Estimability and estimation of excess and etiologic fractions. Stat Med. 1989;8(7):845–859.  
VanderWeele TJ, Chen Y, Ahsan H. Inference for causal interactions for continuous exposures under dichotomization. Biometrics. 2011;67(4):1414–1421.  
Vansteelandt S, VanderWeele TJ. Natural direct and indirect effects on the exposed: effect decomposition under weaker assumptions. Biometrics. 2012;68(4):1019–1027.  
Tchetgen Tchetgen EJ, Vanderweele TJ. Identification of natural direct effects when a confounder of the mediator is directly affected by exposure. Epidemiology. 2014;25(2):282–291.  
Vanderweele TJ, Vansteelandt S, Robins JM. Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology. 2014;25(2):300–306.
Kalbfleisch J, Lawless JF. The analysis of panel data under a Markov assumption. J Am Stat Assoc. 1985;80(392):863–871.  
Kay R. A Markov model for analysing cancer markers and disease states in survival studies. Biometrics. 1986;42(4):855–865.  
Jackson CH. Multi-state models for panel data: the msm package for R. J Stat Softw. 2011;38(8):1–28.
Welton NJ, Ades AE. Estimation of Markov chain transition probabilities and rates from fully and partially observed data: uncertainty propagation, evidence synthesis, and model calibration. Med Decis Making. 2005;25(6):633–645.  
Siemiatycki J, Thomas DC. Biological models and statistical interactions: an example from multistage carcinogenesis. Int J Epidemiol. 1981;10(4):382–387.  
Thompson WD. Effect modification and the limits of biological inference from epidemiologic data. J Clin Epidemiol. 1991;44(3):221–232.  
Ogino S, Chan AT, Fuchs CS, Giovannucci E. Molecular pathological epidemiology of colorectal neoplasia: an emerging transdisciplinary and interdisciplinary field. Gut. 2011;60(3):397–411.  
Ogino S, Nishihara R, VanderWeele TJ, et al. The role of molecular pathological epidemiology in the study of neoplastic and nonneoplastic diseases in the era of precision medicine. Epidemiology. 2016;27(4):602–611.  
Richiardi L, Barone-Adesi F, Pearce N. Cancer subtypes in aetiological research. Eur J Epidemiol. 2017;32(5):353–361.
This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.